Unlike “Likely”, “Unlike” is Unlikely: BPE-based Segmentation hurts Morphological Derivations in LLMs
Large Language Models (LLMs) rely on subword vocabularies to process and
generate text. However, because subwords are marked as initial- or intra-word, we find that …
Towards the Machine Translation of Scientific Neologisms
Scientific research continually discovers and invents new concepts, which are then referred
to by new terms, neologisms, or neonyms in this context. As the vast majority of publications …
Can Large Language Models Code Like a Linguist?: A Case Study in Low Resource Sound Law Induction
Historical linguists have long written a kind of incompletely formalized "program" that converts
reconstructed words in an ancestor language into words in one of its attested descendants …
Programming by Examples Meets Historical Linguistics: A Large Language Model Based Approach to Sound Law Induction
Historical linguists have long written "programs" that convert reconstructed words in an
ancestor language into their attested descendants via ordered string rewrite functions …
Robust Privacy Amidst Innovation with Large Language Models Through a Critical Assessment of the Risks
This study examines integrating EHRs and NLP with large language models (LLMs) to
improve healthcare data management and patient care. It focuses on using advanced …