Interactive and visual prompt engineering for ad-hoc task adaptation with large language models
State-of-the-art neural language models can now be used to solve ad-hoc language tasks
through zero-shot prompting without the need for supervised training. This approach has …
Visual comparison of language model adaptation
Neural language models are widely used; however, their model parameters often need to be
adapted to the specific domains and tasks of an application, which is time- and resource …
Knowledgevis: Interpreting language models by comparing fill-in-the-blank prompts
Recent growth in the popularity of large language models has led to their increased usage
for summarizing, predicting, and generating text, making it vital to help researchers and …
Mediators: Conversational agents explaining NLP model behavior
The human-centric explainable artificial intelligence (HCXAI) community has raised the
need for framing the explanation process as a conversation between human and machine …
LLM Comparator: Visual analytics for side-by-side evaluation of large language models
Automatic side-by-side evaluation has emerged as a promising approach to evaluating the
quality of responses from large language models (LLMs). However, analyzing the results …
XAINES: Explaining AI with narratives
Artificial Intelligence (AI) systems are increasingly pervasive: Internet of Things, in-car
intelligent devices, robots, and virtual assistants, and their large-scale adoption makes it …
LLM Comparator: Interactive Analysis of Side-by-Side Evaluation of Large Language Models
Evaluating large language models (LLMs) presents unique challenges. While automatic
side-by-side evaluation, also known as LLM-as-a-judge, has become a promising solution …
Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation
Human evaluation is increasingly critical for assessing large language models, capturing
linguistic nuances, and reflecting user preferences more accurately than traditional …
Interactive prompt debugging with sequence salience
We present Sequence Salience, a visual tool for interactive prompt debugging with input
salience methods. Sequence Salience builds on widely used salience methods for text …
Visual Analytics for Generative Transformer Models
While transformer-based models have achieved state-of-the-art results in a variety of
classification and generation tasks, their black-box nature makes them challenging for …