AIware 2025
Wed 19 - Thu 20 November 2025
co-located with ASE 2025

This program is tentative and subject to change.

Large Language Models (LLMs) have introduced innovative avenues for automating software testing using prompts. Despite numerous studies on software testing automation, there remains limited understanding of the effectiveness of LLM-generated Selenium tests. In this paper, we investigate the effectiveness of Gemini in producing Selenium tests from manual task specifications and HyperText Markup Language (HTML) code snippets. By effectiveness, we mean whether the generated Selenium tests are executable and functionally accurate (meeting the intended behavior specified in a manual task). To this end, we specify eight manual tasks (covering search, filter, navigation, and form submission) and define 25 actions for each task, using HTML code extracted from 200 web pages. These tasks require interaction with diverse User Interface (UI) components, such as search boxes and checkboxes. The results indicate that 87.5% of the generated Selenium tests are executable and 51.5% of them meet the intended behavior. Manual tasks involving interaction with modals presented the greatest challenges for test generation. While carousels and buttons achieved relatively high success rates, they still accounted for many of the post-correction fixes; these often dynamic or context-dependent components were among those where most errors occurred during test generation.
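To make the two criteria concrete, the following is a minimal sketch (in Python) of what a Selenium test for a search task might look like; the URL and element IDs are hypothetical and not taken from the paper. Such a test counts as executable if it runs to completion without errors, and as functionally accurate only if the final check of the intended behavior succeeds.

    # Minimal sketch of a Selenium test for a "search" manual task.
    # The URL and element IDs are hypothetical, chosen for illustration only.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Chrome()
    try:
        # Executability: the script must run against the page without errors.
        driver.get("https://example.com")               # hypothetical page under test
        box = driver.find_element(By.ID, "search-box")  # locate the search box
        box.send_keys("selenium")                       # enter the search term
        box.submit()                                    # submit the form
        # Functional accuracy: the intended behavior of the manual task holds,
        # i.e., a results container appears after the search is submitted.
        WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.ID, "results"))
        )
    finally:
        driver.quit()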

Turning Manual Tasks into Actions: Assessing the Effectiveness of Gemini-generated Selenium Tests (aiware_myron_final.pdf, 568 KiB)


Thu 20 Nov

Displayed time zone: Seoul

10:30 - 11:50
LLM-Based Software Testing and Quality Assurance (Main Track / Benchmark & Dataset Track) at Grand Hall 1
10:30
8m
Talk
Understanding the Characteristics of LLM-Generated Property-Based Tests in Exploring Edge Cases
Main Track
Hidetake Tanaka Nara Institute of Science and Technology, Haruto Tanaka Nara Institute of Science and Technology, Kazumasa Shimari Nara Institute of Science and Technology, Kenichi Matsumoto Nara Institute of Science and Technology
Pre-print
10:38
8m
Talk
Understanding LLM-Driven Test Oracle Generation
Main Track
Adam Bodicoat University of Auckland, Gunel Jahangirova King's College London, Valerio Terragni University of Auckland
10:46
8m
Talk
Turning Manual Tasks into Actions: Assessing the Effectiveness of Gemini-generated Selenium Tests
Main Track
Myron David Peixoto Federal University of Alagoas, Baldoino Fonseca Federal University of Alagoas, Davy Baía Federal University of Alagoas, Kevin Lira North Carolina State University, Márcio Ribeiro Federal University of Alagoas, Wesley K.G. Assunção North Carolina State University, Nathalia Nascimento Pennsylvania State University, Paulo Alencar University of Waterloo
File Attached
10:54
8m
Talk
Software Testing with Large Language Models: An Interview Study with Practitioners
Main Track
Maria Deolinda CESAR School, Cleyton Magalhaes Universidade Federal Rural de Pernambuco, Ronnie de Souza Santos University of Calgary
11:02
8m
Talk
HPCAgentTester: A Multi-Agent LLM Approach for Enhanced HPC Unit Test Generation
Main Track
Rabimba Karanjai University of Houston, Lei Xu Kent State University, Weidong Shi University of Houston
11:10
8m
Talk
Assertion-Aware Test Code Summarization with Large Language Models
Benchmark & Dataset Track
Anamul Haque Mollah University of North Texas, Ahmed Aljohani University of North Texas, Hyunsook Do University of North Texas
DOI Pre-print
11:20
30m
Live Q&A
Joint Q&A and Discussion #LLMforTesting
Main Track