BenCzechMark: A Czech-centric Multitask and Multimetric Benchmark for Large Language Models with Duel Scoring Mechanism
We present BenCzechMark (BCM), the first comprehensive Czech language benchmark
designed for large language models, offering diverse tasks, multiple task formats, and …
designed for large language models, offering diverse tasks, multiple task formats, and …