Link: https://arxiv.org/abs/1902.00751 — Parameter-Efficient Transfer Learning for NLP
Abstract (excerpt): Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, the paper proposes transfer with adapter modules.
Brief summary and outline
Background: Fine-tuning an LLM gives excellent performance, but every downstream task then needs its own full copy of the model. Adapters are proposed as a parameter-efficient alternative (a minimal sketch follows below).
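A minimal sketch of the bottleneck adapter idea, assuming PyTorch; the class and dimension names (Adapter, hidden_dim, bottleneck_dim) are illustrative, not taken from the paper's code. Each adapter is a small down-projection / nonlinearity / up-projection block with a residual connection; during transfer only the adapters (plus layer norms and the task head) are trained while the pre-trained transformer weights stay frozen.

```python
# Sketch of a bottleneck adapter, assuming PyTorch; names are illustrative.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)  # project down to a small bottleneck
        self.up = nn.Linear(bottleneck_dim, hidden_dim)    # project back up to the hidden size
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection: the adapter starts near-identity and
        # only learns a small task-specific correction.
        return x + self.up(self.act(self.down(x)))
```

With a bottleneck of 64 on a 768-dimensional hidden state, one adapter adds roughly 2 × 768 × 64 ≈ 0.1M parameters, a small fraction of a fully fine-tuned copy of the model.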
Link: https://arxiv.org/abs/1909.11942 — ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Abstract (excerpt): Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations and longer training times.
Brief summary and outline
Background: Bigger pre-trained models generally do better on downstream tasks, but GPU/TPU memory limits and training time make further scaling hard; ALBERT's parameter-reduction ideas are sketched below.
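A minimal sketch of ALBERT's two parameter-reduction ideas, factorized embedding parameterization and cross-layer parameter sharing, assuming PyTorch; the class name AlbertStyleEncoder and all sizes are illustrative defaults, not the paper's configuration.

```python
# Sketch of ALBERT-style parameter reduction, assuming PyTorch; sizes are illustrative.
import torch
import torch.nn as nn

class AlbertStyleEncoder(nn.Module):
    def __init__(self, vocab_size=30000, embed_dim=128, hidden_dim=768,
                 num_layers=12, num_heads=12):
        super().__init__()
        # Factorized embedding: a small V x E token embedding followed by an
        # E x H projection, instead of a single large V x H embedding matrix.
        self.tok_embed = nn.Embedding(vocab_size, embed_dim)
        self.embed_proj = nn.Linear(embed_dim, hidden_dim)
        # Cross-layer parameter sharing: one transformer layer reused at every depth.
        self.shared_layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=num_heads, batch_first=True)
        self.num_layers = num_layers

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        h = self.embed_proj(self.tok_embed(token_ids))
        for _ in range(self.num_layers):  # same weights applied num_layers times
            h = self.shared_layer(h)
        return h
```

Sharing one layer across all depths keeps the encoder's parameter count from growing with the number of layers, and the V×E + E×H factorization is much smaller than a V×H table when E ≪ H.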
Link: https://openreview.net/forum?id=SyxS0T4tvS — RoBERTa: A Robustly Optimized BERT Pretraining Approach
Abstract: We evaluate a number of design decisions when pretraining BERT models and propose an improved recipe that achieves state-of-the-art results on many natural language understanding tasks.
Brief summary and outline
Background: Pretraining these models takes so many resources that it is hard to measure exactly which training decisions actually improve performance; the paper argues the BERT model was undertrained (one recipe change, dynamic masking, is sketched below).
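One concrete piece of the RoBERTa recipe is dynamic masking: the MLM mask is re-sampled every time a batch is drawn, instead of being fixed once during preprocessing. A minimal sketch assuming PyTorch; mask_token_id, vocab_size, and the 80/10/10 replacement split follow the standard BERT-style setup and are placeholders here.

```python
# Sketch of dynamic masking for MLM, assuming PyTorch; ids and sizes are placeholders.
import torch

def dynamic_mask(token_ids: torch.Tensor, mask_token_id: int = 103,
                 vocab_size: int = 30000, mlm_prob: float = 0.15):
    """Resample the MLM mask each time a batch is drawn, rather than fixing it
    once during preprocessing."""
    labels = token_ids.clone()
    selected = torch.rand_like(token_ids, dtype=torch.float) < mlm_prob
    labels[~selected] = -100  # ignore_index default of nn.CrossEntropyLoss

    inputs = token_ids.clone()
    rand = torch.rand_like(token_ids, dtype=torch.float)
    # 80% of selected tokens -> [MASK], 10% -> random token, 10% -> unchanged
    inputs[selected & (rand < 0.8)] = mask_token_id
    random_tokens = torch.randint_like(token_ids, vocab_size)
    replace = selected & (rand >= 0.8) & (rand < 0.9)
    inputs[replace] = random_tokens[replace]
    return inputs, labels
```

Called inside the training loop (e.g. inputs, labels = dynamic_mask(batch_token_ids)), the same sentence gets a different mask on every epoch.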