Machine learning methods for small data challenges in molecular science
Small data are often used in scientific and engineering research due to the presence of
various constraints, such as time, cost, ethics, privacy, security, and technical limitations in …
Interactive natural language processing
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within
the field of NLP, aimed at addressing limitations in existing frameworks while aligning with …
Making sense of citizens' input through artificial intelligence: a review of methods for computational text analysis to support the evaluation of contributions in public …
Public sector institutions that consult citizens to inform decision-making face the challenge of
evaluating the contributions made by citizens. This evaluation has important democratic …
Large language models as annotators: Enhancing generalization of NLP models at minimal cost
State-of-the-art supervised NLP models achieve high accuracy but are also susceptible to
failures on inputs from low-data regimes, such as domains that are not represented in …
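A minimal sketch of the pattern this entry describes, in its simplest variant: an LLM pseudo-labels text from an underrepresented domain, and a small supervised model is then fit on those labels. `llm_label` is a hypothetical stand-in for a real LLM API call, and the data and heuristic are toy placeholders:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def llm_label(text: str) -> int:
    """Hypothetical stand-in for a call to an instruction-following LLM
    asked to label the text; replace with a real API call in practice."""
    return int("refund" in text or "broken" in text)  # toy heuristic only

# Unlabeled texts from a domain the supervised model has never seen.
unlabeled = ["item arrived broken", "love the colour", "want a refund now",
             "fast shipping", "broken zipper, refund please", "works great"]

# Annotate cheaply with the LLM, then train the small model on its labels.
pseudo_labels = [llm_label(t) for t in unlabeled]
vec = TfidfVectorizer().fit(unlabeled)
clf = LogisticRegression().fit(vec.transform(unlabeled), pseudo_labels)
print(clf.predict(vec.transform(["screen arrived broken"])))
```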
Active instruction tuning: Improving cross-task generalization by training on prompt sensitive tasks
Instruction tuning (IT) achieves impressive zero-shot generalization results by training large
language models (LLMs) on a massive amount of diverse tasks with instructions. However …
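One plausible reading of the title's "prompt sensitive" criterion, sketched with toy stand-ins: score each task by how much the model's output varies across paraphrased instructions, and prioritize the most sensitive tasks for instruction tuning. `generate` is a hypothetical placeholder for LLM generation, not the paper's actual measure:

```python
def generate(instruction: str, x: str) -> str:
    """Hypothetical stand-in for LLM generation; deterministic toy rule."""
    return x.upper() if "upper" in instruction.lower() else x

tasks = {
    "uppercase": ["Convert to uppercase:", "Rewrite in UPPER case:",
                  "Make every letter capital:"],
    "echo": ["Repeat the input:", "Echo the text:", "Say the input back:"],
}
probe = "hello world"

def sensitivity(paraphrases):
    # A task is "prompt sensitive" if paraphrased instructions
    # yield different outputs for the same probe input.
    outputs = {generate(p, probe) for p in paraphrases}
    return len(outputs) - 1  # 0 = fully stable, higher = more sensitive

ranked = sorted(tasks, key=lambda t: sensitivity(tasks[t]), reverse=True)
print("tune first on:", ranked)
```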
Which examples to annotate for in-context learning? Towards effective and efficient selection
Large Language Models (LLMs) can adapt to new tasks via in-context learning (ICL). ICL is
efficient as it does not require any parameter updates to the trained LLM, but only few …
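A minimal sketch of annotation-efficient ICL as the abstract frames it: pick the few annotated examples most similar to the query and place them in the prompt, with no parameter updates. TF-IDF similarity here is an illustrative stand-in for whatever selection criterion the paper proposes:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Tiny annotated pool; in practice these would be human-labeled examples.
pool = [("the plot was gripping", "positive"),
        ("acting felt wooden", "negative"),
        ("a delightful surprise", "positive"),
        ("two hours I will never get back", "negative")]

def build_icl_prompt(query: str, k: int = 2) -> str:
    """Select the k pool examples most similar to the query and format
    them as in-context demonstrations (no parameter updates needed)."""
    texts = [t for t, _ in pool]
    vec = TfidfVectorizer().fit(texts + [query])
    sims = cosine_similarity(vec.transform([query]), vec.transform(texts))[0]
    top = sims.argsort()[::-1][:k]
    demos = "\n".join(f"Review: {pool[i][0]}\nLabel: {pool[i][1]}" for i in top)
    return f"{demos}\nReview: {query}\nLabel:"

print(build_icl_prompt("the acting was a delightful surprise"))
```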
Interactive multi-fidelity learning for cost-effective adaptation of language model with sparse human supervision
Large language models (LLMs) have demonstrated remarkable capabilities in various tasks.
However, their suitability for domain-specific tasks is limited due to their immense scale at …
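A rough sketch of the general multi-fidelity idea suggested by the title (not the paper's algorithm): route confidently handled inputs to a cheap, low-fidelity annotator and escalate uncertain ones to sparse, expensive human supervision. All three functions are hypothetical placeholders:

```python
def llm_label(text: str) -> int:
    """Cheap, low-fidelity annotator (hypothetical LLM call)."""
    return int("good" in text)

def human_label(text: str) -> int:
    """Expensive, high-fidelity annotator (stand-in for a human)."""
    return int("good" in text and "not" not in text)

def confidence(text: str) -> float:
    """Stand-in for model confidence; in practice, e.g. max softmax prob."""
    return 0.55 if "not" in text else 0.95

budget = 1  # number of human queries we can afford
labels = {}
for t in ["good product", "not good at all", "good value"]:
    if confidence(t) < 0.6 and budget > 0:
        labels[t] = human_label(t)   # escalate uncertain cases to the human
        budget -= 1
    else:
        labels[t] = llm_label(t)     # default to the cheap annotator
print(labels)
```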
VideoCoT: A video chain-of-thought dataset with active annotation tool
Y Wang, Y Zeng, J Zheng, X Xing, J Xu, X Xu - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal large language models (MLLMs) are flourishing, but mainly focus on images, paying less attention to videos, especially in sub-fields such as prompt engineering, video chain …
Feedback-efficient online fine-tuning of diffusion models
Diffusion models excel at modeling complex data distributions, including those of images,
proteins, and small molecules. However, in many cases, our goal is to model parts of the …
On the limitations of simulating active learning
Active learning (AL) is a human-and-model-in-the-loop paradigm that iteratively selects
informative unlabeled data for human annotation, aiming to improve over random sampling …
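A minimal, runnable sketch of the pool-based human-and-model-in-the-loop cycle the abstract describes, assuming uncertainty sampling as the query strategy and a synthetic oracle in place of a human annotator:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic pool: two Gaussian blobs standing in for unlabeled data.
X_pool = np.vstack([rng.normal(-1, 1, (500, 2)), rng.normal(1, 1, (500, 2))])
y_oracle = np.array([0] * 500 + [1] * 500)  # hidden labels, revealed on query

# Small seed set containing both classes, rest of the pool unlabeled.
labeled = list(range(5)) + list(range(500, 505))
unlabeled = [i for i in range(len(X_pool)) if i not in set(labeled)]

for round_ in range(5):
    clf = LogisticRegression().fit(X_pool[labeled], y_oracle[labeled])
    # Uncertainty sampling: query the point whose predicted class
    # probability is closest to 0.5 (least confident under the model).
    probs = clf.predict_proba(X_pool[unlabeled])[:, 1]
    query = unlabeled[int(np.argmin(np.abs(probs - 0.5)))]
    labeled.append(query)            # the "human" annotates the queried point
    unlabeled.remove(query)
    print(round_, clf.score(X_pool, y_oracle))
```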