CaLMQA: Exploring culturally specific long-form question answering across 23 languages

S Arora, M Karpinska, HT Chen, I Bhattacharjee… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are used for long-form question answering (LFQA), which
requires them to generate paragraph-length answers to complex questions. While LFQA has …

Slovak dataset for multilingual question answering

D Hládek, J Staš, J Juhár, T Koctúr - IEEE Access, 2023 - ieeexplore.ieee.org
SK-QuAD is the first manually annotated dataset of questions and answers in Slovak. It
consists of more than 91k factual questions and answers from various fields. Each question …

PoQuAD - the Polish question answering dataset - description and analysis

R Tuora, A Zwierzchowska… - Proceedings of the 12th …, 2023 - dl.acm.org
This paper showcases PoQuAD—a SQuAD-like contribution to building Question Answering
tools for Polish. It largely follows the usual Machine Reading Comprehension format, but a …

BenCzechMark: A Czech-centric Multitask and Multimetric Benchmark for Large Language Models with Duel Scoring Mechanism

M Fajcik, M Docekal, J Dolezal, K Ondrej… - arXiv preprint arXiv …, 2024 - arxiv.org
We present BenCzechMark (BCM), the first comprehensive Czech language benchmark
designed for large language models, offering diverse tasks, multiple task formats, and …

Towards a Polish question answering dataset (PoQuAD)

R Tuora, N Zawadzka-Paluektau, C Klamra… - … Conference on Asian …, 2022 - Springer
This paper presents the efforts towards creating PoQuAD, a dataset for training automatic
question answering models in Polish. It justifies why having native data is vital for training …

A Dataset and Strong Baselines for Classification of Czech News Texts

H Kydlíček, J Libovický - International Conference on Text, Speech, and …, 2023 - Springer
Pre-trained models for Czech Natural Language Processing are often evaluated on purely
linguistic tasks (POS tagging, parsing, NER) and relatively simple classification tasks such …

Employing sentence context in Czech answer selection

M Medveď, R Sabol, A Horák - … 2020, Brno, Czech Republic, September 8 …, 2020 - Springer
Question answering (QA) of non-mainstream languages requires specific adaptations of the
current methods tested primarily with very large English resources. In this paper, we present …

Pretraining and Evaluation of Czech ALBERT Language Model

P Zelina - Bachelor thesis, Masaryk University, Faculty of …, 2020 - is.muni.cz
This thesis explores a new language model called ALBERT, released by Google Research
in 2019. The ALBERT architecture has been very successful in English NLP tasks, and this …

Czech Question Answer Selection using Recurrent Neural Networks

R Sabol - 2020 - is.muni.cz
This thesis describes an answer selection module optimized for the Czech language. The
module is being developed for open-domain question answering system Automatic …

Comparing RNN and Transformer Context Representations in the Czech Answer Selection Task

M Medveď, R Sabol, A Horák - 2022 - scitepress.org
Open domain question answering now inevitably builds upon advanced neural models
processing large unstructured textual sources serving as a kind of underlying knowledge …