Wed 19 Nov (displayed time zone: Seoul)
09:00 - 09:20 | Opening | Main Track at Grand Hall 1 | Chair(s): Yiling Lou (University of Illinois at Urbana-Champaign), Qinghua Lu (Data61, CSIRO), Jie M. Zhang (King's College London)
09:00 - 18:00 | |||
09:00 9hOther | Quiet Room ASE Catering | ||
09:00 - 18:00 | |||
09:00 9hOther | Prayer Room ASE Catering | ||
09:00 - 09:20 | |||
09:00 20m Talk | Automated Test Input Generation for Android: Are We There Yet? (ASE MIP Award) | Shauvik Roy Choudhary (Uber Technologies, Inc.), Alessandra Gorla (IMDEA Software Institute), Alessandro Orso (Georgia Institute of Technology, USA) | DOI
10:30 - 11:00 | |||
10:30 30mCoffee break | Break ASE Catering | ||
12:30 - 14:00 | |||
12:30 90mLunch | Lunch ASE Catering | ||
15:30 - 16:00 | |||
15:30 30mCoffee break | Break ASE Catering | ||
17:35 - 18:20 | Brainstorming Panel 1: Future of SE agents and Workflow Main Track at Grand Hall 1 Chair(s): Ahmed E. Hassan Queen’s University | ||
Thu 20 Nov (displayed time zone: Seoul)
10:00 - 10:30 | |||
10:00 30mCoffee break | Break ASE Catering | ||
10:30 - 11:50 | LLM-Based Software Testing and Quality AssuranceMain Track / Benchmark & Dataset Track at Grand Hall 1 Chair(s): Xiaoning Du Monash University | ||
10:30 8mTalk | Understanding the Characteristics of LLM-Generated Property-Based Tests in Exploring Edge Cases Main Track Hidetake Tanaka Nara Institute of Science and Technology, Haruto Tanaka Nara Institute of Science and Technology, Kazumasa Shimari Nara Institute of Science and Technology, Kenichi Matsumoto Nara Institute of Science and Technology Pre-print | ||
10:38 8mTalk | Understanding LLM-Driven Test Oracle Generation Main Track Adam Bodicoat University of Auckland, Gunel Jahangirova King's College London, Valerio Terragni University of Auckland | ||
10:46 8mTalk | Turning Manual Tasks into Actions: Assessing the Effectiveness of Gemini-generated Selenium Tests Main Track Myron David Peixoto Federal University of Alagoas, Baldoino Fonseca Universidade Federal de Alagoas, Davy Baía Federal University of Alagoas, Kevin Lira North Carolina State University, Márcio Ribeiro Federal University of Alagoas, Brazil, Wesley K.G. Assunção North Carolina State University, Nathalia Nascimento Pennsylvania State University, Paulo Alencar University of Waterloo File Attached | ||
10:54 8m Talk | Software Testing with Large Language Models: An Interview Study with Practitioners | Main Track | Maria Deolinda Cesar, Cleyton Magalhaes (Universidade Federal Rural de Pernambuco), Ronnie de Souza Santos (University of Calgary)
11:02 8m Talk | HPCAgentTester: A Multi-Agent LLM Approach for Enhanced HPC Unit Test Generation | Main Track | Rabimba Karanjai (University of Houston), Lei Xu (Kent State University), Weidong Shi (University of Houston)
11:10 8m Talk | Assertion-Aware Test Code Summarization with Large Language Models | Benchmark & Dataset Track | Anamul Haque Mollah (University of North Texas), Ahmed Aljohani (University of North Texas), Hyunsook Do (University of North Texas) | DOI, Pre-print
11:20 30m Live Q&A | Joint Q&A and Discussion #LLMforTesting | Main Track
11:50 - 12:30 | Future of AIware | Benchmark & Dataset Track / Main Track / ArXiv Track at Grand Hall 1 | Chair(s): Haoye Tian (Aalto University)
11:50 8m Talk | The Future of Generative AI in Software Engineering: A Vision from Industry and Academia in the European GENIUS Project | Main Track | Robin Gröpler (ifak - Institute for Automation and Communication, Magdeburg), Steffen Klepke (Siemens AG), Jack Johns (BT Group PLC), Andreas Dreschinski (Akkodis), Klaus Schmid, Benedikt Dornauer (University of Innsbruck; University of Cologne), Eray Tüzün (Bilkent University), Joost Noppen, Mohammad Reza Mousavi (King's College London), Yongjian Tang (Siemens AG, Germany), Johannes Viehmann (Fraunhofer FOKUS, Germany), Selin Şirin Aslangül, Beum Seuk Lee (BT Group PLC), Adam Ziolkowski (BT), Eric Zie | Pre-print
11:58 5m Talk | Where Do LLMs Still Struggle? An In-Depth Analysis of Code Generation Benchmarks | Benchmark & Dataset Track
12:03 5m Talk | Guidelines for Empirical Studies in Software Engineering involving Large Language Models | ArXiv Track | Sebastian Baltes (Heidelberg University), Florian Angermeir (fortiss GmbH), Chetan Arora (Monash University), Marvin Muñoz Barón (Technical University of Munich), Chunyang Chen (TU Munich), Lukas Böhme (Hasso Plattner Institute, University of Potsdam, Potsdam, Germany), Fabio Calefato (University of Bari), Neil Ernst (University of Victoria), Davide Falessi (University of Rome Tor Vergata, Italy), Brian Fitzgerald (Lero - The Irish Software Research Centre and University of Limerick), Davide Fucci (Blekinge Institute of Technology), Marcos Kalinowski (Pontifical Catholic University of Rio de Janeiro (PUC-Rio)), Stefano Lambiase (Department of Computer Science, Aalborg University, Denmark), Daniel Russo (Department of Computer Science, Aalborg University), Mircea Lungu (IT University, Copenhagen), Lutz Prechelt (Freie Universität Berlin), Paul Ralph (Dalhousie University), Christoph Treude (Singapore Management University), Stefan Wagner (Technical University of Munich) | Pre-print
12:10 20m Live Q&A | Joint Q&A and Discussion #FutureofAIware | Main Track
12:30 - 14:00
12:30 90m Lunch | Lunch | ASE Catering
15:00 - 15:29 | Evaluation Frameworks, and Quantitative Assessment of LLMs (Part 1) | Benchmark & Dataset Track / Main Track at Grand Hall 1 | Chair(s): Zhou Yang (University of Alberta, Alberta Machine Intelligence Institute)
15:00 8m Talk | Automated Extract Method Refactoring with Open-Source LLMs: A Comparative Study | Main Track | Sivajeet Chand (Technical University of Munich), Melih Kilic (Technical University of Munich), Roland Würsching (Technical University of Munich), Sushant Kumar Pandey (University of Groningen, The Netherlands), Alexander Pretschner (TU Munich) | Pre-print
15:08 8m Talk | Benchmarking Web API Integration Code Generation | Benchmark & Dataset Track | Daniel Maninger (TU Darmstadt), Leon Chemnitz (TU Darmstadt), Amir Molzam Sharifloo, Mira Mezini (TU Darmstadt; hessian.AI; National Research Center for Applied Cybersecurity ATHENE) | Pre-print
15:16 8m Talk | From Search to Reasoning: A Five-Level RAG Capability Framework for Enterprise Data | Benchmark & Dataset Track | Gurbinder Gill, Ritvik Gupta (Carnegie Mellon University, USA), Denis Lusson, Anand Chandrashekar, Donald Nguyen (Corvic AI) | Pre-print
15:24 5m Talk | SWE-Sharp-Bench: A Reproducible Benchmark for C# Software Engineering Tasks | Benchmark & Dataset Track | Sanket Mhatre (Microsoft), Yasharth Bajpai (Microsoft), Sumit Gulwani (Microsoft), Emerson Murphy-Hill (Microsoft), Gustavo Soares (Microsoft) | Pre-print
15:30 - 16:00
15:30 30m Coffee break | Break | ASE Catering
16:00 - 16:50 | Evaluation Frameworks, and Quantitative Assessment of LLMs (Part 2) | Main Track / Benchmark & Dataset Track at Grand Hall 1 | Chair(s): Zhou Yang (University of Alberta, Alberta Machine Intelligence Institute)
16:00 8m Talk | PromptExp: Multi-granularity Prompt Explanation of Large Language Models | Main Track | Ximing Dong (Centre for Software Excellence at Huawei Canada), Shaowei Wang (University of Manitoba), Dayi Lin (Centre for Software Excellence, Huawei Canada), Gopi Krishnan Rajbahadur (Centre for Software Excellence, Huawei, Canada), Ahmed E. Hassan (Queen’s University)
16:08 8m Talk | Beyond Code Explanations: A Ray of Hope for Cross-Language Vulnerability Repair | Main Track | Kevin Lira (North Carolina State University), Baldoino Fonseca (Universidade Federal de Alagoas), Wesley K.G. Assunção (North Carolina State University), Davy Baía (Federal University of Alagoas), Márcio Ribeiro (Federal University of Alagoas, Brazil) | Pre-print
16:16 8m Talk | Secure Code Generation at Scale with Reflexion | Benchmark & Dataset Track | Arup Datta (University of North Texas), Ahmed Aljohani (University of North Texas), Hyunsook Do (University of North Texas) | Pre-print
16:24 5m Talk | A Tool for Benchmarking Large Language Models' Robustness in Assessing the Realism of Driving Scenarios | Benchmark & Dataset Track | Jiahui Wu (Simula Research Laboratory and University of Oslo), Chengjie Lu (Simula Research Laboratory and University of Oslo), Aitor Arrieta (Mondragon University), Shaukat Ali (Simula Research Laboratory and Oslo Metropolitan University) | Pre-print
16:29 21m Live Q&A | Joint Q&A and Discussion #LLMAssessment | Main Track
16:50 - 17:35 | Responsible, Ethical, and Legal Dimensions of AIware | Main Track at Grand Hall 1 | Chair(s): Jingzhi Gong
16:50 8m Talk | Neuro-Symbolic Compliance: Integrating LLMs and SMT for Automated Financial Legal Analysis | Main Track | Yung Shen HSIA (National Chengchi University), Fang Yu (National Chengchi University), Jie-Hong Roland Jiang (National Taiwan University) | File Attached
16:58 8m Talk | Are We Aligned? A Preliminary Investigation of the Alignment of Responsible AI Values between LLMs and Human Judgment | Main Track | Asma Yamani (King Fahd University of Petroleum and Minerals), Malak Baslyman (King Fahd University of Petroleum & Minerals), Moataz Ahmed (King Fahd University of Petroleum and Minerals) | File Attached
17:06 5m Talk | A Vision for Value-Aligned AI-Driven Systems | Main Track | Humphrey Obie (Monash University)
17:11 5m Talk | Generative AI and Empirical Software Engineering: A Paradigm Shift | Main Track | Pre-print
17:16 19m Live Q&A | Joint Discussion #ResponsibleAI | Main Track
17:35 - 18:20 | Brainstorming Panel 2: Critical Challenges in AIware & Possible Solutions | Main Track at Grand Hall 1 | Chair(s): Dayi Lin (Centre for Software Excellence, Huawei Canada)
18:20 - 18:40
Call for Papers
“Software for all and by all” is the future of humanity. AIware, i.e., AI-powered software, has the potential to democratize software creation. We must reimagine software and software engineering (SE), enabling individuals of all backgrounds to participate in its creation with higher reliability and quality. Over the past decade, software has evolved from human-driven Codeware to the first generation of AIware, known as Neuralware, developed by AI experts. Foundation Models (FMs, including Large Language Models or LLMs), like GPT, ushered in software’s next generation, Promptware, led by domain and prompt experts. However, this merely scratches the surface of the future of software. We are already witnessing the emergence of the next generation of software, Agentware, in which humans and intelligent agents jointly lead the creation of software. With the advent of brain-like World Models and brain-computer interfaces, we anticipate the arrival of Mindware, representing another generation of software. Agentware and Mindware promise greater autonomy and widespread accessibility, with non-expert individuals, known as Software Makers, offering oversight to autonomous agents.
The software engineering community will need to develop fundamentally new approaches and evolve existing ones so that they are suitable for a world in which software creation is within the reach of Software Makers of all levels of SE expertise, as opposed to solely expert developers. We must recognize a shift in where expertise lies in software creation and start making the needed changes in the type of research that is being conducted, the ways that SE is being taught, and the support that is offered to Software Makers.
The 2nd ACM International Conference on AI-powered Software (AIware 2025, https://conf.researchr.org/home/aiware-2025) will be hosted on November 20th-21st, 2025, in Seoul, South Korea, co-located with ASE’25. AIware 2025 aims to bring different communities together in anticipation of the upcoming changes driven by FMs and to look at them from the perspective of AI-powered software and its evolution. AIware 2025 promotes cross-disciplinary discussions, identifies emerging research challenges, and establishes a new research agenda for the community in the Foundation Model era.
Topics of interest
Topics of interest of the AIware conference include, but are not limited to, the following:
- What would future software look like in the FM era?
- Agents & SE
- How do we integrate legacy software into future AIware?
- Do existing programming models (e.g., object-oriented or functional programming) and SE practices (e.g., test-driven development and agile) remain suitable for developing and maintaining software in the FM era?
- What roles do autonomous agents play in the development and maintenance of software in the FM era?
- How will inner and open source collaboration evolve in the FM era?
- What kind of release engineering practices do we need for FM-powered software applications?
- Is LLMOps comprehensive enough to capture the release engineering needs of AIware in the FM era?
- How do we debug and monitor AIware in the FM era?
- How should we change SE curriculum, training and mentoring in the FM era?
- How do we evolve FMs from the perspective of AIware and its makers in the FM era?
- How do human interactions and perceptions shape the development and implementation of AIware in the FM era?
- How do we measure and improve the trustworthiness of AIware in the FM era?
- What are the implications and effectiveness of foundation models in improving software engineering practices and outcomes?
- How does AIware impact developer productivity?
Types of submissions
AIware 2025 Main Track welcomes submissions from both academia and industry. At least one author of each accepted submission will be required to attend the conference and present the paper. Submissions can include but are not limited to: case studies, vision papers, literature reviews and surveys, position papers, and theoretical and applied research papers.
Page limits:
- Full-length papers (i.e., case studies, theoretical and applied research papers): 6 - 8 pages;
- Short papers (i.e., vision papers, new idea papers, and position papers): 2 - 4 pages;
- Literature reviews and surveys: 14 - 20 pages.
Each category allows an additional 1-2 pages for references. The page limits are strict.
Awards
The best full-length papers accepted in the main track of AIware will be recognized with ACM SIGSOFT Distinguished Paper Awards.
Selected AIware papers will be invited to be revised and extended for consideration in a special issue of the Empirical Software Engineering journal by Springer.
New this year
- OpenReview. For AIware 2025, we are excited to adopt OpenReview as our platform for paper submission and reviewing. This marks the first time a software engineering conference has embraced OpenReview, aligning AIware with the open science practices already well established in leading AI conferences such as NeurIPS, ICLR, ICML, and ACL. Our goal is to foster greater transparency, richer dialogue, and a more inclusive review process by leveraging the features of OpenReview, including open engagement, author responses, and optional public visibility. We believe this shift will promote a more transparent, interactive, and community-driven review process, ultimately advancing the quality and impact of research shared at AIware.
- Author Response. For AIware 2025, we are incorporating an author response period, during which we expect multi-round interactive communication between authors and reviewers.
Submission guidelines
All authors should use the official “ACM Primary Article Template”, which can be obtained from the ACM Proceedings Template page. LaTeX users should use the following LaTeX code at the start of the document, where the review option produces line numbers for easy reference by the reviewers and the anonymous option omits author names:
\documentclass[sigconf,review,anonymous]{acmart}
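For orientation, the snippet below sketches one way a minimal, anonymized submission skeleton built on this document class could look; the title, author block, section text, and references file name are illustrative placeholders and not part of the official template instructions.

% Minimal illustrative skeleton (placeholder content only).
% The review option adds line numbers; the anonymous option hides the author block.
\documentclass[sigconf,review,anonymous]{acmart}

\begin{document}

\title{Your Paper Title}
\author{Anonymous Author(s)}
\affiliation{\institution{Anonymous Institution}\city{Anytown}\country{Anyland}}

% acmart expects the abstract environment before \maketitle.
\begin{abstract}
One-paragraph summary of the contribution.
\end{abstract}

\maketitle

\section{Introduction}
Body text; these lines are numbered while the review option is active.

\bibliographystyle{ACM-Reference-Format}
\bibliography{references} % assumes a placeholder references.bib
\end{document}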
Papers must be submitted electronically via the OpenReview platform through the following submission site: AIware OpenReview Submission Site.
Authors are required to sign up for active OpenReview accounts for submission. (An institutional email is recommended for registration; otherwise it might take a couple of days for OpenReview to manually activate the account.) See the instructions for signing up for an account. After logging in, please visit http://openreview.net/profile, complete your profile, and import your publications. Publications can be automatically imported from DBLP; see the instructions for importing publications.
All submissions must be in PDF. All papers must be written in English.
All submissions are subject to ACM policies including ACM Publications Policies, ACM’s new Publications Policy on Research Involving Human Participants and Subjects, ACM Policy and Procedures on Plagiarism, ACM Policy on Prior Publication and Simultaneous Submissions, and the ACM Policy on Authorship and its accompanying FAQ released April 20, 2023. In particular, authors should pay attention to the following points:
- Generative AI tools and technologies, such as ChatGPT, may not be listed as authors of an ACM published Work. The use of generative AI tools and technologies to create content is permitted but must be fully disclosed in the Work. For example, the authors could include the following statement in the Acknowledgements section of the Work: “ChatGPT was used to generate sections of this Work, including text, tables, graphs, code, data, citations, etc.” If you are uncertain about the need to disclose the use of a particular tool, err on the side of caution, and include a disclosure in the acknowledgements section of the Work.
- If you are using generative AI software tools to edit and improve the quality of your existing text in much the same way you would use a typing assistant like Grammarly to improve spelling, grammar, punctuation, clarity, engagement or to use a basic word processing system to correct spelling or grammar, it is not necessary to disclose such usage of these tools in your Work.
Review and evaluation process
A double-anonymous review process will be employed for submissions to the main track. The submission must not reveal the identity of the authors in any way. Papers that violate the double-anonymous requirement will be desk-rejected. For more details on the double-anonymous process, please refer to ASE’s double-anonymous review process.
All submissions will be desk-checked to make sure that they are within the scope of the conference and have satisfied the submission requirements (e.g., page limits and anonymity).
Three members of the Program Committee will then be assigned to each submission for the review process. The Program Committee members can bid on submissions to review. The Program Committee will discuss the review results virtually and decide on the accepted submissions. The accepted submissions will be published in the ACM Digital Library.
AUTHORS TAKE NOTE: The official publication date is the date the proceedings are made available in the ACM Digital Library. This date may be up to two weeks prior to the first day of the conference. The official publication date affects the deadline for any patent filings related to published work.