"Do you follow me?": A Survey of Recent Approaches in Dialogue State Tracking
While communicating with a user, a task-oriented dialogue system has to track the user's
needs at each turn according to the conversation history. This process called dialogue state …
Evaluating token-level and passage-level dense retrieval models for math information retrieval
With the recent success of dense retrieval methods based on bi-encoders, studies have
applied this approach to various interesting downstream retrieval tasks with good efficiency …
One blade for one purpose: advancing math information retrieval using hybrid search
Neural retrievers have been shown to be effective for math-aware search. Their ability to
cope with math symbol mismatches, to represent highly contextualized semantics, and to …
On the instability of further pre-training: Does a single sentence matter to BERT?
We observe a remarkable instability in BERT-like models: minimal changes in the internal
representations of BERT, as induced by one-step further pre-training with even a single …
Cross-lingual distillation for domain knowledge transfer with sentence transformers
Recent advancements in Natural Language Processing (NLP) have substantially
enhanced language understanding. However, non-English languages, especially in …
Data augmentation based on large language models for radiological report classification
The International Classification of Diseases (ICD) is fundamental in the field of
healthcare as it provides a standardized framework for the classification and coding of …
Masked Modeling Duo: Towards a Universal Audio Pre-Training Framework
Self-supervised learning (SSL) using masked prediction has made great strides in general-
purpose audio representation. This study proposes Masked Modeling Duo (M2D), an …
From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain
In this study, we delve into the adaptation and effectiveness of Transformer-based, pre-
trained Large Language Models (LLMs) within the biomedical domain, a field that poses …
IndoGovBERT: A Domain-Specific Language Model for Processing Indonesian Government SDG Documents
Achieving the Sustainable Development Goals (SDGs) requires collaboration among
various stakeholders, particularly governments and non-state actors (NSAs). This …
Bag of Lies: Robustness in Continuous Pre-training BERT
This study aims to acquire more insights into the continuous pre-training phase of BERT
regarding entity knowledge, using the COVID-19 pandemic as a case study. Since the …