Pre-trained models: Past, present and future

X Han, Z Zhang, N Ding, Y Gu, X Liu, Y Huo, J Qiu… - AI Open, 2021 - Elsevier
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved
great success and become a milestone in the field of artificial intelligence (AI). Owing to …

Data-driven building load prediction and large language models: Comprehensive overview

Y Zhang, D Wang, G Wang, P Xu, Y Zhu - Energy and Buildings, 2024 - Elsevier
Building load forecasting is essential for optimizing the architectural design and managing
energy efficiently, enhancing the performance of Heating, Ventilation, and Air Conditioning …

Not all tokens are what you need for pretraining

Z Lin, Z Gou, Y Gong, X Liu, R Xu… - Advances in …, 2025 - proceedings.neurips.cc
Previous language model pre-training methods have uniformly applied a next-token
prediction loss to all training tokens. Challenging this norm, we posit that "Not all tokens in a …
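For context, a minimal sketch of the standard objective this snippet challenges: a next-token cross-entropy loss applied uniformly to every training token. This is an illustration only (a PyTorch-style setup is assumed; the function name, shapes, and the optional weighting slot are not from the paper), with the non-uniform weights showing where a selective scheme would differ.

```python
# Illustrative sketch: uniform next-token prediction loss vs. an optional
# per-token weighting. Assumed setup, not code from the cited paper.
from typing import Optional
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor,
                    input_ids: torch.Tensor,
                    token_weights: Optional[torch.Tensor] = None) -> torch.Tensor:
    """logits: (batch, seq_len, vocab); input_ids: (batch, seq_len)."""
    # Shift so that position t predicts token t+1.
    shift_logits = logits[:, :-1, :].reshape(-1, logits.size(-1))
    shift_labels = input_ids[:, 1:].reshape(-1)
    per_token = F.cross_entropy(shift_logits, shift_labels, reduction="none")
    if token_weights is None:
        # Uniform treatment: every training token contributes equally.
        return per_token.mean()
    # A selective scheme would instead down-weight or drop some tokens
    # via non-uniform weights (hypothetical weighting for illustration).
    w = token_weights[:, 1:].reshape(-1).float()
    return (per_token * w).sum() / w.sum().clamp(min=1.0)
```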

CLEVE: contrastive pre-training for event extraction

Z Wang, X Wang, X Han, Y Lin, L Hou, Z Liu… - arXiv preprint arXiv …, 2021 - arxiv.org
Event extraction (EE) has considerably benefited from pre-trained language models (PLMs)
by fine-tuning. However, existing pre-training methods have not involved modeling event …

A novel neural network model fusion approach for improving medical named entity recognition in online health expert question-answering services

Z Hu, X Ma - Expert Systems with Applications, 2023 - Elsevier
Because of the frequent occurrence of chronic diseases, the COVID-19 pandemic, etc.,
online health expert question-answering (HQA) services have been unable to cope with the …

Continual knowledge distillation for neural machine translation

Y Zhang, P Li, M Sun, Y Liu - … of the 61st Annual Meeting of the …, 2023 - aclanthology.org
While many parallel corpora are not publicly accessible for data copyright, data privacy and
competitive differentiation reasons, trained translation models are increasingly available on …

“len or index or count, anything but v1”: Predicting variable names in decompilation output with transfer learning

KK Pal, AP Bajaj, P Banerjee, A Dutcher… - 2024 IEEE Symposium …, 2024 - yancomm.net
Binary reverse engineering is an arduous and tedious task performed by skilled and
expensive human analysts. Information about the source code is irrevocably lost in the …

EntityBERT: Entity-centric masking strategy for model pretraining for the clinical domain

C Lin, T Miller, D Dligach, S Bethard, G Savova - 2021 - repository.arizona.edu
Transformer-based neural language models have led to breakthroughs for a variety of
natural language processing (NLP) tasks. However, most models are pretrained on general …

Teaching the pre-trained model to generate simple texts for text simplification

R Sun, W Xu, X Wan - arXiv preprint arXiv:2305.12463, 2023 - arxiv.org
Randomly masking text spans in ordinary texts in the pre-training stage hardly allows
models to acquire the ability to generate simple texts. It can hurt the performance of pre …
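As an illustration of the generic pre-training step this snippet argues is insufficient, here is a minimal sketch of random span masking over ordinary text: contiguous spans are replaced with mask placeholders for the model to reconstruct. The helper name, span-length distribution, and 15% ratio are assumptions for the sketch, not details from the paper.

```python
# Illustrative random span masking; assumed parameters, not the paper's code.
import random

def mask_random_spans(tokens, mask_ratio=0.15, mean_span_len=3, mask_token="<mask>"):
    """Replace roughly `mask_ratio` of the tokens with mask placeholders, span by span."""
    tokens = list(tokens)
    target = max(1, int(len(tokens) * mask_ratio))
    masked, attempts = 0, 0
    while masked < target and attempts < 10 * len(tokens):
        attempts += 1
        span_len = max(1, int(random.expovariate(1.0 / mean_span_len)))
        start = random.randrange(len(tokens))
        for i in range(start, min(start + span_len, len(tokens))):
            if tokens[i] != mask_token:
                tokens[i] = mask_token
                masked += 1
    return tokens

# Example: the corrupted sequence serves as model input; the original tokens
# (or just the masked spans) serve as the reconstruction target.
print(mask_random_spans("the quick brown fox jumps over the lazy dog".split()))
```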

MoCA: Incorporating domain pretraining and cross attention for textbook question answering

F Xu, Q Lin, J Liu, L Zhang, T Zhao, Q Chai, Y Pan… - Pattern Recognition, 2023 - Elsevier
Textbook Question Answering (TQA) is a complex multimodal task to infer answers
given large context descriptions and abundant diagrams. Compared with Visual Question …