BERTweet: A pre-trained language model for English Tweets
We present BERTweet, the first public large-scale pre-trained language model for English
Tweets. Our BERTweet, having the same architecture as BERT-base (Devlin et al., 2019), is …
BERT rediscovers the classical NLP pipeline
I Tenney - arXiv preprint arXiv:1905.05950, 2019
Pre-trained text encoders have rapidly advanced the state of the art on many NLP tasks. We
focus on one such model, BERT, and aim to quantify where linguistic information is captured …
Masked language modeling and the distributional hypothesis: Order word matters pre-training for little
A possible explanation for the impressive performance of masked language model (MLM)
pre-training is that such models have learned to represent the syntactic structures prevalent …
Linguistic Knowledge and Transferability of Contextual Representations
NF Liu - arXiv preprint arXiv:1903.08855, 2019
Contextual word representations derived from large-scale neural language models are
successful across a diverse set of NLP tasks, suggesting that they encode useful and …
Evaluating models' local decision boundaries via contrast sets
Standard test sets for supervised learning evaluate in-distribution generalization.
Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these …
What do you learn from context? Probing for sentence structure in contextualized word representations
Contextualized representation models such as ELMo (Peters et al., 2018a) and BERT
(Devlin et al., 2018) have recently achieved state-of-the-art results on a diverse array of …
Intermediate-task transfer learning with pretrained models for natural language understanding: When and why does it work?
While pretrained models such as BERT have shown large gains across natural language
understanding tasks, their performance can be improved by further training the model on a …
A resource-rational model of human processing of recursive linguistic structure
A major goal of psycholinguistic theory is to account for the cognitive constraints limiting the
speed and ease of language comprehension and production. Wide-ranging evidence …
Automatic mining of opinions expressed about APIs in Stack Overflow
With the proliferation of online developer forums, developers share their opinions about the
APIs they use. The plethora of such information can present challenges to the developers to …
When do you need billions of words of pretraining data?
NLP is currently dominated by general-purpose pretrained language models like RoBERTa,
which achieve strong performance on NLU tasks through pretraining on billions of words …