AIware 2025
Wed 19 - Thu 20 November 2025
co-located with ASE 2025

AIware 2025 Keynote Speakers

John Lam
Microsoft
Spec Kit in Practice: Executable Specs, On-Demand Checklists, and a Polya Loop
Spec Kit is an open-source toolkit for Spec-Driven Development that treats the spec as the executable center of gravity: it drives planning, tasking, and implementation across today’s AI dev tools. I’ll demo the Specify CLI to go from intent to working code, show how plan-and-task breakdowns keep outcomes on-spec, and discuss what reliably works (and where it doesn’t) when shipping real features with AI in the loop. We’ll also touch on current agent integrations (e.g., GitHub Copilot, Cursor, Claude Code, Gemini CLI) and the concrete developer workflows teams are adopting.
I’ll preview two near-term enhancements we’re building to tighten reliability. First, checklists as first-class constraints: today they live inside spec templates; we’re lifting them out so teams can invoke reusable acceptance criteria on demand, gating generation, execution, and sign-off across runs. Second, a Polya-style loop (understand → plan → execute → review) operationalized end-to-end so humans and agents share one problem-solving rhythm. Together these changes aim to reduce rework, make “what good looks like” explicit and verifiable, and keep fast paths fast without sacrificing correctness.
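To make the loop concrete, here is a minimal, hypothetical Python sketch of a Polya-style cycle with a reusable checklist gating sign-off. Every name in it is an illustrative assumption; none of it is Spec Kit’s actual API.

```python
# Hypothetical sketch of a Polya-style loop with on-demand checklist gating.
# Every name below is illustrative; none of this is Spec Kit's actual API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Checklist:
    """Reusable acceptance criteria, invoked on demand to gate sign-off."""
    name: str
    checks: list[tuple[str, Callable[[str], bool]]]

    def failures(self, artifact: str) -> list[str]:
        return [desc for desc, pred in self.checks if not pred(artifact)]

def polya_loop(spec: str, checklist: Checklist, max_rounds: int = 3) -> str:
    artifact = ""
    for _ in range(max_rounds):
        plan = f"implement: {spec}"           # understand + plan (stubbed)
        artifact = plan.upper()               # execute (stand-in for codegen)
        if not checklist.failures(artifact):  # review: checklist as the gate
            return artifact
    raise RuntimeError(f"'{checklist.name}' unmet after {max_rounds} rounds: "
                       f"{checklist.failures(artifact)}")

api_checklist = Checklist("api-acceptance", [
    ("mentions the endpoint", lambda a: "ENDPOINT" in a),
])
print(polya_loop("add /health endpoint", api_checklist))
```

The point of the shape is that the checklist, not the generation step, decides when the loop may exit.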
Bio: John Lam works on programming tools and platforms at Microsoft in Redmond, WA. His work spans dynamic languages on statically typed runtimes (IronRuby, IronPython, the DLR), multi-language Windows developer platforms built with COM, interactive Python tooling including the Jupyter experience in VS Code, and integrating Python into Excel as a peer to the Excel formula language. He is the creator of Spec Kit, an open-source project advancing Spec-Driven Development from programmer intent to shipped code.
Yiling Lou
UIUC
Automatically Maintaining Agent Systems: How Far Are We?
Large Language Model (LLM)-based agent systems are emerging as a new software paradigm and have been widely adopted across diverse domains. Because agent systems are inevitably prone to bugs and must continually evolve to meet changing external requirements, maintaining them is critical yet requires substantial effort. In this talk, we will present our recent work on maintaining and optimizing agent systems. We will discuss common quality issues that arise during agent maintenance and explore how existing software maintenance techniques perform when applied to agent systems. We will further address the cost challenges of multi-agent systems and introduce our recent budget-aware optimization technique for multi-agent systems.
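As a toy picture of the budget-aware idea (the prices, model tiers, and routing rule below are invented, not the technique from the talk), imagine a router that downgrades an agent call to a cheaper model once the remaining budget can no longer cover strong-model calls for every remaining step:

```python
# Toy budget-aware routing across a multi-agent pipeline. The prices,
# model tiers, and routing rule are invented; this is not the talk's technique.
AGENT_STEPS = ["plan", "code", "review"]
COST = {"strong": 1.00, "cheap": 0.10}   # hypothetical dollars per call

def route(step: str, remaining: float) -> str:
    # Use the strong model only while the budget could still cover
    # strong-model calls for every remaining step; otherwise degrade.
    steps_left = len(AGENT_STEPS) - AGENT_STEPS.index(step)
    return "strong" if remaining >= steps_left * COST["strong"] else "cheap"

budget = 2.5
for step in AGENT_STEPS:
    model = route(step, budget)
    budget -= COST[model]
    print(f"{step}: use {model} model (budget left ${budget:.2f})")
```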
Bio: Dr. Yiling Lou is an incoming Assistant Professor in the Siebel School of Computing and Data Science at the University of Illinois Urbana-Champaign. Her research focuses on Software Engineering and its synergy with Artificial Intelligence. Her work has won multiple ACM SIGSOFT and IEEE TCSE Distinguished Paper Awards. She has served as Program Co-chair for LLM4Code (2024/2025/2026) and AIware (2025).
Chao Peng
ByteDance
Teaching LLMs to Debug: Toward Reasoning- and Tool-Aware Coding Agents
Large language models have shown remarkable ability in code generation and comprehension, yet they still struggle to reason systematically or operate effectively within real development environments. In this talk, I will share our recent efforts to move beyond static prompting toward training LLMs that can think, act, and verify like developers. We begin by exploring how complex software issues can be decomposed into structured subtasks—planning, searching, and verifying—to teach models to perform hierarchical reasoning and adaptive decision-making. Building on this foundation, we extend the learning process to environments where models interact with external tools such as test runners, linters, and code search systems. Through this integration, models not only generate code but also validate and refine their own outputs through iterative feedback. Together, these developments point toward a new generation of agentic LLMs—models that bridge reasoning and execution, capable of autonomously debugging, verifying, and evolving software.
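The generate, verify, refine rhythm described above can be sketched in a few lines of Python. The hard-coded candidate patches stand in for model proposals; in a trained agent, the model would produce each attempt and consume the tool feedback.

```python
# Minimal generate -> verify -> refine loop against a real test run.
# The candidate patches are hard-coded stand-ins for model proposals.
import pathlib, subprocess, sys, tempfile

CANDIDATES = [
    "def add(a, b): return a - b",   # buggy first attempt
    "def add(a, b): return a + b",   # refined after test feedback
]
TEST = "from solution import add\nassert add(2, 3) == 5\nprint('ok')\n"

def run_tests(code: str) -> tuple[bool, str]:
    """Write candidate + test into a temp dir and execute the test file."""
    with tempfile.TemporaryDirectory() as d:
        root = pathlib.Path(d)
        (root / "solution.py").write_text(code)
        (root / "test_solution.py").write_text(TEST)
        proc = subprocess.run([sys.executable, "test_solution.py"],
                              cwd=root, capture_output=True, text=True)
        return proc.returncode == 0, proc.stderr

for attempt, code in enumerate(CANDIDATES, 1):
    ok, log = run_tests(code)
    print(f"attempt {attempt}: {'pass' if ok else 'fail'}")
    if ok:
        break
    # A debugging-trained model would consume `log` to refine its next patch.
```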
Bio: Chao is a Principal Research Scientist at ByteDance (字节跳动), where he leads the Software Engineering Lab, which conducts research on AI agents for software engineering. His research interests lie in software testing, program repair and optimisation, and their synergy with machine learning and compiler techniques. He is also responsible for academic development and university collaboration.
Baptiste Rozière
Mistral AI
Code assistants: from code completion to coding agents
Applications of Large Language Models (LLMs) have evolved rapidly from code completion tools (e.g., GitHub Copilot, 2021) to autonomous code agents capable of complex reasoning and workflow execution. In this talk, we examine the technical advances and remaining challenges in using LLMs for code generation, drawing on recent work from Mistral’s code generation team. We’ll discuss key improvements in model architectures, training strategies, and evaluation frameworks, as well as open questions in the field.
Bio: Baptiste leads the code generation team at Mistral AI. Previously, he was a research scientist at Meta AI in Paris working on Code Generation. He contributed to Llama and led Code Llama. During his PhD at Meta AI and Université Paris Dauphine, Baptiste conducted research on unsupervised translation of programming languages and model pre-training for code.
Dickson Tsai
Anthropic
Claude Code: From Single Agent in Terminal to Multi-Agent Systems
Claude Code, Anthropic's flagship terminal-based coding agent, started as an interactive tool for writing software. But as we iterated on it, we discovered something more fundamental: an extensible infrastructure for building multi-agent applications with ease. Teams at Anthropic are already leveraging Claude Code to build code review systems, issue triaging pipelines, and more -- with greater intellectual depth than ever before. The key to building effective multi-agent systems is understanding the agent loop underpinning both the terminal UI and the SDK. We'll build intuition for this foundational concept, then explore Claude Code's various primitives that extend the loop: custom agents for delegation, skills for providing knowledge for Claude to look up, and hooks for injecting logic at decision points. We'll examine how these primitives enable the real systems mentioned above, and how Claude Code's evolution into the Claude Agent SDK enables attendees to build their own multi-agent development workflows.
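For intuition, here is a generic sketch of such an agent loop with a hook at the tool-use decision point. It mirrors only the shape described in the talk; the function names and the scripted model are our assumptions, not the Claude Agent SDK interface.

```python
# Generic agent loop with a hook at the tool-use decision point.
# The shape mirrors the talk's description, but every name here is an
# assumption; this is not the Claude Agent SDK interface.
from typing import Callable

def agent_loop(task: str,
               model: Callable[[str], dict],
               tools: dict[str, Callable[[str], str]],
               pre_tool_hook: Callable[[str, str], bool] = lambda t, a: True,
               max_turns: int = 8) -> str:
    transcript = task
    for _ in range(max_turns):
        action = model(transcript)            # the model picks the next step
        if action["type"] == "final":
            return action["text"]
        tool, arg = action["tool"], action["arg"]
        if not pre_tool_hook(tool, arg):      # hook: veto or permit tool use
            transcript += f"\n[hook denied {tool}]"
            continue
        transcript += f"\n[{tool}] {tools[tool](arg)}"  # feed the result back
    return "max turns reached"

# Scripted stand-in for the model: read a file, then finish.
script = iter([
    {"type": "tool", "tool": "read", "arg": "README"},
    {"type": "final", "text": "summary: toy repo"},
])
print(agent_loop(
    "summarize the repo",
    model=lambda transcript: next(script),
    tools={"read": lambda path: f"<contents of {path}>"},
    pre_tool_hook=lambda tool, arg: tool != "shell",  # deny shell, allow reads
))
```

The same loop drives both an interactive UI and a headless SDK; hooks, delegated agents, and skills are just different ways of intercepting or enriching its steps.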
Bio: Dickson Tsai is a product engineer at Anthropic, where he works on Claude Code, a terminal-native agentic AI coding assistant. He pioneered "hooks," a feature that allows developers to extend Claude Code's agent loop with custom scripts to provide automated feedback, manage permissions dynamically, and more. His other contributions include the SlashCommand tool for Claude to invoke custom commands, Opus Plan Mode to balance complex task planning and costs, and Claude 4 model integration. Previously, Dickson worked as a pre-training data research engineer at Anthropic and as a growth engineer on Google Search. He has presented on AI coding agents at various Fortune 500 companies.
Don Syme
GitHub Next
Ambient, Safe Agentic Automation on the GitHub Platform
Software collaboration platforms like GitHub provide perfect opportunities for agentic automation. At GitHub Next we have developed GitHub Agentic Workflows, a framework for exploring practical, safe, semi-automated agentic behaviours for repo-centric tasks, focusing on “Continuous AI” scenarios that parallel CI/CD DevOps: Continuous Documentation, Test Improvement, Performance Improvement, Accessibility Improvement, and much more. By safely bringing agentic behaviours directly into the GitHub information space on GitHub Actions, we give developers and enterprise agentic architects exceptional creative latitude to explore how automated coding agents can boost team productivity in line with team goals, opening the door to a new era of semi-automated software engineering.
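As one concrete flavour of Continuous Documentation, the hypothetical sketch below scans a repository for public Python functions that lack docstrings, the kind of finding an agentic workflow could turn into an automated issue or draft pull request. It illustrates the scenario only and says nothing about GitHub Agentic Workflows’ actual format.

```python
# Hypothetical "Continuous Documentation" check: list public Python
# functions that lack docstrings. An agentic workflow could turn such
# findings into an automated issue or draft PR; illustrative only.
import ast, pathlib

def undocumented_functions(repo: pathlib.Path) -> list[str]:
    findings = []
    for path in repo.rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(), filename=str(path))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that do not parse cleanly
        for node in ast.walk(tree):
            if (isinstance(node, ast.FunctionDef)
                    and not node.name.startswith("_")
                    and ast.get_docstring(node) is None):
                findings.append(f"{path}:{node.lineno} {node.name}")
    return findings

if __name__ == "__main__":
    for finding in undocumented_functions(pathlib.Path(".")):
        print("needs docs:", finding)
```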
Bio: Don Syme is a Principal Researcher at GitHub Next specialising in AI-driven collaborative programming systems. He is a co-creator of GitHub Agentic Workflows and Copilot Workspace and the designer of the F# language. In 2015, he was honored with a Silver Medal from the Royal Academy of Engineering.
Taylor Mullen
Google
Gemini CLI: The Terminal Renaissance
The command line is being reborn, transformed by the intersection of AI and developer workflows. In this talk, we'll walk through how Gemini CLI is reshaping developer productivity—from rapid prototyping and code generation to debugging, deployment, and automation. We'll explore how Gemini CLI wrote most of its own code and shifted from a single-command assistant to an extensible platform that elevates terminal-based development. We'll also cover the philosophy behind CLI-first AI tooling and the importance of UX in building the next generation of AI-native products.
Bio: Taylor Mullen is the creator of Gemini CLI and an engineer at Google, building tools for future AI developers. Before Google, he was one of the founders of the modern GitHub Copilot IDE platforms, where he drove strategic vision and served as the tech lead for GitHub Copilot in Visual Studio. He brings a deep background in developer tooling, OSS, and all things generative AI. A self-proclaimed serial side-projectist, Taylor is passionate about using AI to fundamentally improve the developer experience.
Jack Johns
British Telecom
Human To Hero: How Gen AI Can Be The Alfred That Transforms Anyone Into Batman
“A hero can be anyone.” (Bruce Wayne, The Dark Knight Rises, 2012)
Generative AI has transformed the way we develop systems at BT. Every developer has access to cutting-edge coding assistants, the Batarang to their Batman, if you will. But with recent advancements in agentic frameworks making a utility belt of tools even easier to acquire, where should we focus our attention next? In this talk, I will take you through BT's journey with generative AI, starting with coding assistants and finishing with what we believe to be the next step in the gen AI journey. I will introduce you to Alfred, the intelligence agent turned butler, ready to attend to your every need, whether drawing your architectural diagrams or analysing them for quality and guardrail conformance. The journey will explore what is currently possible at the bleeding edge of AI support for industrial software engineering, and we will use the opportunity to identify the next generation of research challenges that, once solved, will allow everyone to be a superhero.
Bio: Jack Johns is a research scientist in BT's Software and AI Lab, investigating the future of software engineering and how the business should evolve over the next 5-10 years to stay ahead of the curve. His work spans various generative AI research and deployment initiatives across the software development lifecycle, with a particular interest in architectural design and knowledge management. In 2019, he was awarded the Christopher Mills Award by the ITP.
Jie Zhang
King's College London
AIware: Beyond Correctness
Large language models (LLMs) are rapidly transforming the practice of software engineering. Code correctness has been the primary focus for evaluating these models, but real-world software development requires much more. In this talk, I will discuss how we could move beyond correctness to address broader dimensions of LLM-generated code. I will use my recent work on efficiency, fairness, and diversity as examples, and discuss the challenges and opportunities that lie ahead in advancing LLMs for code.
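A toy harness shows one way "beyond correctness" can be operationalized: candidates are first checked against a reference oracle and then timed, so a fast-but-wrong solution and a correct-but-slow one are both penalized. The candidates, oracle, and inputs are all invented for illustration.

```python
# Toy harness that scores generated solutions beyond pass/fail:
# correctness against a reference oracle first, then measured runtime.
# The three candidate implementations are invented for illustration.
import timeit

def slow_sum(n):   # correct but O(n)
    total = 0
    for i in range(n + 1):
        total += i
    return total

def fast_sum(n):   # correct and O(1)
    return n * (n + 1) // 2

def wrong_sum(n):  # fast but incorrect
    return n * n // 2

oracle = lambda n: sum(range(n + 1))
for name, fn in [("slow", slow_sum), ("fast", fast_sum), ("wrong", wrong_sum)]:
    correct = all(fn(n) == oracle(n) for n in (0, 1, 7, 100))
    secs = timeit.timeit(lambda: fn(10_000), number=100)
    print(f"{name:5s} correct={correct} time={secs:.4f}s")
```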
Bio: Dr. Jie M. Zhang is a lecturer in computer science at King’s College London. Her main research interests are the trustworthiness of software engineering, AI, and LLMs. She is a steering committee member of the IEEE ICST and ACM AIware conferences. Over the last three years, she has been invited to give over 40 talks at conferences, universities, and IT companies. In recognition of her influence, she was named one of the Top 15 Global Chinese Female Young Scholars in Interdisciplinary AI (2023). Her research has won the FSE 2025 Distinguished Paper Award, the 2024 and 2022 Transactions on Software Engineering Best Paper Awards, and the ICLR 2022 spotlight paper award. She is also the winner of the 2025 ACM SIGSOFT Early Research Award, one of the most prestigious honours for early-career researchers in the software engineering community.
Lingming Zhang
UIUC
Demystifying LLM-based Software Engineering Agents
In recent years, Large Language Models (LLMs) have shown impressive performance across a wide range of downstream applications, including software engineering. In this talk, we will discuss the history and recent trends of software engineering agents and present our work in this promising direction, covering both agent scaffold design (e.g., Agentless) and software-centric LLM post-training (e.g., SWE-RL and Code World Model).
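For readers new to it, Agentless as published replaces an open-ended agent loop with fixed phases: localize the fault, sample candidate repairs, and validate patches against tests. The sketch below paraphrases that three-phase shape with stubbed components; the helper names and toy data are ours, not the tool’s.

```python
# Paraphrase of a fixed localize -> repair -> validate pipeline in the
# spirit of Agentless. The helper names and toy data are ours, not the tool's.
def localize(issue: str, files: dict[str, str]) -> str:
    # Phase 1: rank files by naive keyword overlap with the issue text.
    def score(body: str) -> int:
        return sum(word in body for word in issue.lower().split())
    return max(files, key=lambda f: score(files[f].lower()))

def repair(snippet: str) -> list[str]:
    # Phase 2: sample candidate patches (an LLM would generate these).
    return [snippet.replace("-", "+"), snippet.replace("+", "-")]

def validate(patches: list[str], test) -> str | None:
    # Phase 3: keep the first candidate patch that passes the tests.
    return next((p for p in patches if test(p)), None)

files = {"math_utils.py": "def add(a, b): return a - b",
         "io_utils.py": "def read(p): return open(p).read()"}
issue = "add returns the wrong result"
target = localize(issue, files)
patch = validate(repair(files[target]), test=lambda code: "a + b" in code)
print(target, "->", patch)
```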
Bio: Lingming Zhang is an associate professor at the University of Illinois Urbana-Champaign. His research lies at the intersection of Software Engineering and Machine Learning. His group has pioneered a series of work on LLM-based software testing, analysis, repair, and synthesis (such as TitanFuzz, KNighter, AlphaRepair, and Agentless), and also released multiple open code LLMs (including the recent SWE-RL, PurpCode, and Code World Model), with millions of downloads worldwide. Many of their techniques for training, improving, and applying code LLMs or agents have been widely adopted by leading AI companies, including Meta, Google, OpenAI, and DeepSeek.
Pick Thongtanunam
The University of Melbourne
Unlocking the Potential of LLMs with Care: Can They Really Understand Code Changes?
Large language models (LLMs) have shown remarkable ability in generating text across various domains, including code. However, there is a growing concern about hallucinations and whether LLMs can truly understand tasks and internal representations of code. In this talk, I will review LLMs’ abilities in code-related tasks through our lens of transferable knowledge. In particular, I will argue that although these models can perform well on benchmarks that involve restricted contexts and tasks, they in fact lack a fundamental understanding of software beyond the contexts they have seen. I will also present our empirical findings and measurements that highlight the shortcomings of LLMs in tasks where context and prior knowledge are essential, along with our research agenda and directions for unlocking their full potential in developer-facing tasks.
Bio: Dr. Patanamon Thongtanunam (or Pick) is a Senior Lecturer (equivalent to Associate Professor) at the School of Computing and Information Systems, the University of Melbourne. Her research interests include empirical software engineering, data mining, and data-driven techniques to support software engineering tasks. Her research has received numerous prestigious awards, including an Australian Research Council (ARC) Discovery Early Career Researcher Award (2021 - 2024), a Japan Society for the Promotion of Science Research Fellowship (2016 - 2018), an ACM SIGSOFT Distinguished Paper Award, and an IEEE Computer Society TCSE Distinguished Paper Award, as well as distinguished reviewer awards.
Dates
Plenary

Wed 19 Nov

Displayed time zone: Seoul

09:00 - 18:00
Quiet Room (ASE Catering) at Ida 1
09:00
9h
Other
Quiet Room
ASE Catering

09:00 - 18:00
Prayer Room (ASE Catering) at Ida 2
09:00
9h
Other
Prayer Room
ASE Catering

09:00 - 09:20
MIP Award 2 (ASE MIP Award) at Vista
Chair(s): Lars Grunske Humboldt-Universität zu Berlin
09:00
20m
Talk
Automated Test Input Generation for Android: Are We There Yet?
ASE MIP Award
Shauvik Roy Choudhary Uber Technologies, Inc., Alessandra Gorla IMDEA Software Institute, Alessandro Orso Georgia Institute of Technology, USA
DOI
10:30 - 11:00
10:30
30m
Coffee break
Break
ASE Catering

11:00 - 12:30
AIware Keynotes Session 1 (Keynotes) at Grand Hall 1
Chair(s): Jie M. Zhang King's College London
11:00
20m
Keynote
Claude Code: From Single Agent in Terminal to Multi-Agent Systems
Keynotes
K: Dickson Tsai Anthropic
11:20
20m
Keynote
Code assistants: from code completion to coding agents
Keynotes
K: Baptiste Rozière Mistral AI
Pre-print
11:40
20m
Keynote
Demystifying LLM-based Software Engineering Agents
Keynotes
K: Lingming Zhang University of Illinois at Urbana-Champaign
Pre-print
12:00
30m
Panel
Joint Q&A and Discussion
Keynotes

12:30 - 14:00
12:30
90m
Lunch
Lunch
ASE Catering

14:00 - 15:30
AIware Keynotes Session 2 (Keynotes) at Grand Hall 1
Chair(s): Yiling Lou University of Illinois at Urbana-Champaign
14:00
20m
Keynote
Gemini CLI: The Terminal Renaissance
Keynotes
K: Taylor Mullen Google
Pre-print
14:20
20m
Keynote
Ambient, Safe Agentic Automation on the GitHub Platform
Keynotes
K: Don Syme GitHub Next
Pre-print
14:40
20m
Keynote
Human To Hero: How Gen AI Can Be The Alfred That Transforms Anyone Into Batman
Keynotes
K: Jack Johns BT Group PLC
15:00
30m
Panel
Joint Q&A and Discussion
Keynotes

15:30 - 16:00
15:30
30m
Coffee break
Break
ASE Catering

Thu 20 Nov

Displayed time zone: Seoul

08:30 - 10:00
AIware Keynotes Session 3 (Keynotes) at Grand Hall 1
Chair(s): Gustavo Oliva Centre for Software Excellence, Huawei Canada
08:30
20m
Keynote
Spec Kit in Practice: Executable Specs, On-Demand Checklists, and a Polya Loop
Keynotes
K: John Lam Microsoft
Pre-print
08:50
20m
Keynote
Automatically Maintaining Agent Systems: How Far Are We?
Keynotes
K: Yiling Lou University of Illinois at Urbana-Champaign
09:10
20m
Keynote
Teaching LLMs to Debug: Toward Reasoning- and Tool-Aware Coding Agents
Keynotes
K: Chao Peng ByteDance
09:30
30m
Panel
Joint Q&A and Discussion
Keynotes

10:00 - 10:30
10:00
30m
Coffee break
Break
ASE Catering

12:30 - 14:00
12:30
90m
Lunch
Lunch
ASE Catering

14:00 - 15:00
AIware Keynotes Session 4 (Keynotes) at Grand Hall 1
Chair(s): Ahmed E. Hassan Queen’s University
14:00
20m
Keynote
Unlocking the Potential of LLMs with Care: Can They Really Understand Code Changes?
Keynotes
K: Patanamon Thongtanunam The University of Melbourne
14:20
20m
Keynote
AIware: Beyond Correctness
Keynotes
K: Jie M. Zhang King's College London
14:40
20m
Panel
Joint Q&A and Discussion
Keynotes

15:30 - 16:00
15:30
30m
Coffee break
Break
ASE Catering