AIware 2025
Wed 19 - Thu 20 November 2025
co-located with ASE 2025

Large Language Models (LLMs) excel in tasks such as natural language understanding and text generation. Prompt engineering plays a critical role in leveraging LLMs effectively. However, the black-box nature of LLMs hinders their interpretability and effective prompt engineering. A wide range of model explanation approaches has been developed for deep learning models (e.g., feature attribution-based and attention-based techniques). However, these local explanations are designed for single-output tasks such as classification and regression and cannot be directly applied to LLMs, which generate sequences of tokens. Recent efforts in LLM explanation focus on natural language explanations, but they are prone to hallucinations and inaccuracies. To address this, we introduce PromptExp, a framework for multi-granularity prompt explanation that aggregates token-level insights. PromptExp introduces two token-level explanation approaches: (1) an aggregation-based approach that combines local explanation techniques (e.g., Integrated Gradients), and (2) a perturbation-based approach with novel techniques to evaluate the impact of masking individual tokens. PromptExp supports both white-box and black-box explanation and extends explanations to higher granularity levels (e.g., sentences and components), enabling flexible analysis. We evaluate PromptExp in case studies such as sentiment analysis and show that the perturbation-based approach performs best when semantic similarity is used to assess perturbation impact. Furthermore, we conducted a user study at our industrial partner's company, which confirms PromptExp's accuracy and practical value and demonstrates its potential to enhance LLM interpretability.
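To make the perturbation-based idea concrete, the sketch below is a minimal, hypothetical illustration (not the authors' implementation): each prompt token is masked in turn, the LLM's output is regenerated, and the drop in semantic similarity between the original and perturbed outputs is taken as that token's importance. The model names, the sentence embedder, and word-level masking are illustrative assumptions.

```python
# Hedged sketch of perturbation-based token importance via semantic similarity.
# All model choices below are placeholders, not those used by PromptExp.
from sentence_transformers import SentenceTransformer, util
from transformers import AutoModelForCausalLM, AutoTokenizer

gen_tok = AutoTokenizer.from_pretrained("gpt2")            # placeholder LLM
gen_model = AutoModelForCausalLM.from_pretrained("gpt2")
sim_model = SentenceTransformer("all-MiniLM-L6-v2")        # placeholder embedder

def generate(prompt: str, max_new_tokens: int = 40) -> str:
    """Greedy generation; returns only the newly generated continuation."""
    ids = gen_tok(prompt, return_tensors="pt").input_ids
    out = gen_model.generate(ids, max_new_tokens=max_new_tokens, do_sample=False)
    return gen_tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

def token_importance(prompt: str) -> list[tuple[str, float]]:
    """Score each (word-level) token by the semantic drift its removal causes."""
    tokens = prompt.split()                                 # word-level for simplicity
    base_emb = sim_model.encode(generate(prompt), convert_to_tensor=True)
    scores = []
    for i, tok in enumerate(tokens):
        masked = " ".join(tokens[:i] + tokens[i + 1:])      # mask out one token
        pert_emb = sim_model.encode(generate(masked), convert_to_tensor=True)
        sim = util.cos_sim(base_emb, pert_emb).item()
        scores.append((tok, 1.0 - sim))                     # larger drop => more important
    return scores
```

Higher-granularity explanations (e.g., per sentence or per prompt component) can then be obtained by aggregating, for instance averaging, the token-level scores within each span.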

Thu 20 Nov

Displayed time zone: Seoul

16:00 - 16:50
Evaluation Frameworks, and Quantitative Assessment of LLMs (Part 2)
Main Track / Benchmark & Dataset Track at Grand Hall 1
Chair(s): Zhou Yang University of Alberta, Alberta Machine Intelligence Institute
16:00
8m
Talk
PromptExp: Multi-granularity Prompt Explanation of Large Language Models
Main Track
Ximing Dong Centre for Software Excellence at Huawei Canada, Shaowei Wang University of Manitoba, Dayi Lin Centre for Software Excellence, Huawei Canada, Gopi Krishnan Rajbahadur Centre for Software Excellence, Huawei, Canada, Ahmed E. Hassan Queen’s University
16:08
8m
Talk
Beyond Code Explanations: A Ray of Hope for Cross-Language Vulnerability Repair
Main Track
Kevin Lira North Carolina State University, Baldoino Fonseca Universidade Federal de Alagoas, Wesley K.G. Assunção North Carolina State University, Davy Baía Federal University of Alagoas, Márcio Ribeiro Federal University of Alagoas, Brazil
Pre-print
16:16
8m
Talk
Secure Code Generation at Scale with Reflexion
Benchmark & Dataset Track
Arup Datta University of North Texas, Ahmed Aljohani University of North Texas, Hyunsook Do University of North Texas
Pre-print
16:24
5m
Talk
A Tool for Benchmarking Large Language Models' Robustness in Assessing the Realism of Driving Scenarios
Benchmark & Dataset Track
Jiahui Wu Simula Research Laboratory and University of Oslo, Chengjie Lu Simula Research Laboratory and University of Oslo, Aitor Arrieta Mondragon University, Shaukat Ali Simula Research Laboratory and Oslo Metropolitan University
Pre-print
16:29
21m
Live Q&A
Joint Q&A and Discussion #LLMAssessment
Main Track