FOLIO: Natural language reasoning with first-order logic S Han, H Schoelkopf, Y Zhao, Z Qi, M Riddell, L Benson, L Sun, E Zubova, ... EMNLP 2024, 2022 | 158* | 2022 |
Revisiting the gold standard: Grounding summarization evaluation with robust human evaluation Y Liu, AR Fabbri, P Liu, Y Zhao, L Nan, R Han, S Han, S Joty, CS Wu, ... ACL 2023, 2023 | 106 | 2023 |
MedAgents: Large language models as collaborators for zero-shot medical reasoning X Tang, A Zou, Z Zhang, Z Li, Y Zhao, X Zhang, A Cohan, M Gerstein ACL 2024 Findings, 2023 | 104 | 2023 |
MultiHiertt: Numerical reasoning over multi hierarchical tabular and textual data Y Zhao, Y Li, C Li, R Zhang ACL 2022, 2022 | 92 | 2022 |
Investigating data contamination in modern benchmarks for large language models C Deng, Y Zhao, X Tang, M Gerstein, A Cohan NAACL 2024, 2024 | 87* | 2024 |
Enhancing Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies L Nan, Y Zhao, W Zou, N Ri, J Tae, E Zhang, A Cohan, D Radev EMNLP 2023 Findings, 2023 | 77* | 2023 |
Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios Y Zhao, H Zhang, S Si, L Nan, X Tang, A Cohan EMNLP 2023 Industry Track, 2023 | 43* | 2023 |
Benchmarking generation and evaluation capabilities of large language models for instruction controllable summarization Y Liu, AR Fabbri, J Chen, Y Zhao, S Han, S Joty, P Liu, D Radev, CS Wu, ... NAACL 2024 Findings, 2024 | 39 | 2024 |
Apparel-invariant feature learning for person re-identification Z Yu, Y Zhao, B Hong, Z Jin, J Huang, D Cai, XS Hua IEEE Transactions on Multimedia 24, 4482-4492, 2021 | 39 | 2021 |
Prioritizing safeguarding over autonomy: Risks of LLM agents for science X Tang, Q Jin, K Zhu, T Yuan, Y Zhang, W Zhou, M Qu, Y Zhao, J Tang, ... arXiv preprint arXiv:2402.04247, 2024 | 32 | 2024 |
Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data? X Tang, Y Zong, J Phang, Y Zhao, W Zhou, A Cohan, M Gerstein NAACL 2024, 2024 | 30* | 2024 |
ReasTAP: Injecting table reasoning skills during pre-training via synthetic reasoning examples Y Zhao, L Nan, Z Qi, R Zhang, D Radev EMNLP 2022, 2022 | 29 | 2022 |
DocMath-Eval: Evaluating Numerical Reasoning Capabilities of LLMs in Understanding Long Documents with Tabular Data Y Zhao, Y Long, H Liu, L Nan, L Chen, R Kamoi, Y Liu, X Tang, R Zhang, ... ACL 2024, 2024 | 26* | 2024 |
L2CEval: Evaluating language-to-code generation capabilities of large language models A Ni, P Yin, Y Zhao, M Riddell, T Feng, R Shen, S Yin, Y Liu, S Yavuz, ... TACL 2024, 2024 | 23 | 2024 |
RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations Y Zhao, C Zhao, L Nan, Z Qi, W Zhang, X Tang, B Mi, D Radev ACL 2023, 2023 | 23 | 2023 |
QTSumm: Query-Focused Summarization over Tabular Data Y Zhao, Z Qi, L Nan, B Mi, Y Liu, W Zou, S Han, R Chen, X Tang, Y Xu, ... EMNLP 2023, 2023 | 19* | 2023 |
FinMath: Injecting a tree-structured solver for question answering over financial reports C Li, W Ye, Y Zhao LREC 2022, 2022 | 19 | 2022 |
LoFT: Enhancing Faithfulness and Diversity for Table-to-Text Generation via Logic Form Control Y Zhao, Z Qi, L Nan, LJY Flores, D Radev EACL 2023, 2023 | 18 | 2023 |
Towards Interpretable and Efficient Automatic Reference-Based Summarization Evaluation Y Liu, AR Fabbri, Y Zhao, P Liu, S Joty, CS Wu, C Xiong, D Radev EMNLP 2023, 2023 | 16 | 2023 |
Open-FinLLMs: Open multimodal large language models for financial applications Q Xie, D Li, M Xiao, Z Jiang, R Xiang, X Zhang, Z Chen, Y He, W Han, ... arXiv preprint arXiv:2408.11878, 2024 | 15 | 2024 |