Judgment sieve: Reducing uncertainty in group judgments through interventions targeting ambiguity...

Wikibench: Community-driven data curation for AI evaluation on Wikipedia

TS Kuo, AL Halfaker, Z Cheng, J Kim, MH Wu… - Proceedings of the CHI …, 2024 - dl.acm.org

AI tools are increasingly deployed in community contexts. However, datasets used to
evaluate AI are typically created by developers and annotators outside a given community …

Cited by 16

"Yeah, this graph doesn't show that": Analysis of Online Engagement with Misleading Data Visualizations

M Lisnic, A Lex, M Kogan - Proceedings of the CHI Conference on …, 2024 - dl.acm.org

Attempting to make sense of a phenomenon or crisis, social media users often share data
visualizations and interpretations that can be erroneous or misleading. Prior work has …

Cited by 5

What Constitutes a Faithful Summary? Preserving Author Perspectives in News Summarization

Y Liu, S Feng, X Han, V Balachandran, CY Park… - arXiv preprint arXiv …, 2023 - arxiv.org

In this work, we take a first step towards designing summarization systems that are faithful to
the author's opinions and perspectives. Focusing on a case study of preserving political …

Cited by 1

DISCERN: Designing Decision Support Interfaces to Investigate the Complexities of Workplace Social Decision-Making With Line Managers

P Khadpe, L Le, K Nowak, ST Iqbal, J Suh - Proceedings of the CHI …, 2024 - dl.acm.org

Line managers form the first level of management in organizations, and must make complex
decisions, while maintaining relationships with those impacted by their decisions. Amidst …

Cited by 2

WeAudit: Scaffolding User Auditors and AI Practitioners in Auditing Generative AI

WH Deng, C Wang, HZ Han, JI Hong, K Holstein… - arXiv preprint arXiv …, 2025 - arxiv.org

There has been growing interest from both practitioners and researchers in engaging end
users in AI auditing, to draw upon users' unique knowledge and lived experiences …

A Framework for Evaluating LLMs Under Task Indeterminacy

L Guerdan, H Wallach, S Barocas… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language model (LLM) evaluations often assume there is a single correct response--a
gold label--for each item in the evaluation corpus. However, some tasks can be ambiguous …

PolicyCraft: Supporting Collaborative and Participatory Policy Design through Case-Grounded Deliberation

TS Kuo, QZ Chen, AX Zhang, J Hsieh, H Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org

Community and organizational policies are typically designed in a top-down, centralized
fashion, with limited input from impacted stakeholders. This can result in policies that are …

Connecting the Dots: Evaluating Abstract Reasoning Capabilities of LLMs Using the New York Times Connections Word Game

P Samadarshi, M Mustafa, A Kulkarni… - arXiv preprint arXiv …, 2024 - arxiv.org

The New York Times Connections game has emerged as a popular and challenging pursuit
for word puzzle enthusiasts. We collect 200 Connections games to evaluate the …

Cited by 3

Paper Copilot: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process

J Yang - arXiv preprint arXiv:2502.00874, 2025 - arxiv.org

The rapid growth of submissions to top-tier Artificial Intelligence (AI) and Machine Learning
(ML) conferences has prompted many venues to transition from closed to open review …

Automating Annotation Guideline Improvements using LLMs: A Case Study

A Bibal, N Gerlek, G Muric, E Boschee… - … of Context and …, 2025 - aclanthology.org

Annotating texts can be a tedious task, especially when texts are noisy. At the root of the
issue, guidelines are not always optimized enough to be able to perform the required …
