AIware 2025
Wed 19 - Thu 20 November 2025
co-located with ASE 2025

As Large Language Models (LLMs) increasingly generate code in software development, ensuring the quality of LLM-generated code has become important. Traditional testing approaches using Example-based Testing (EBT) often miss edge cases: defects that occur at boundary values, special input patterns, or extreme conditions. This research investigates the characteristics of LLM-generated Property-based Testing (PBT) compared to EBT for exploring edge cases. We analyze 16 HumanEval problems whose standard solutions failed on extended test cases, generating both PBT and EBT test code using Claude-4-sonnet. Our experimental results reveal that while each method individually achieved a 68.75% bug detection rate, combining both approaches improved detection to 81.25%. The analysis demonstrates complementary characteristics: PBT effectively detects performance issues and edge cases through extensive input space exploration, while EBT effectively detects specific boundary conditions and special patterns. These findings suggest that a hybrid approach leveraging both testing methods can improve the reliability of LLM-generated code, providing guidance for test generation strategies in LLM-based code generation.
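To make the EBT/PBT contrast concrete, here is a minimal Python sketch on a HumanEval-style running-maximum task. The rolling_max function, the hand-picked examples, and the Hypothesis properties are illustrative assumptions for this page, not the tests generated in the study:

# Illustrative sketch only; the task and tests are assumptions, not from the paper.
from hypothesis import given, strategies as st

def rolling_max(numbers: list[int]) -> list[int]:
    """Return the running maximum of a list (a HumanEval-style task)."""
    result: list[int] = []
    current = None
    for n in numbers:
        current = n if current is None else max(current, n)
        result.append(current)
    return result

# EBT style: a handful of hand-picked examples, each checking one concrete case.
def test_rolling_max_examples():
    assert rolling_max([]) == []
    assert rolling_max([1, 2, 3, 2]) == [1, 2, 3, 3]
    assert rolling_max([-5, -1, -3]) == [-5, -1, -1]

# PBT style: properties checked over many generated inputs; Hypothesis explores
# edge cases such as empty lists, duplicates, and extreme integer values.
@given(st.lists(st.integers()))
def test_rolling_max_properties(numbers):
    result = rolling_max(numbers)
    assert len(result) == len(numbers)                          # same length
    assert all(a <= b for a, b in zip(result, result[1:]))      # non-decreasing
    assert all(r >= n for r, n in zip(result, numbers))         # bounds each input

When a property fails, Hypothesis shrinks the input to a minimal counterexample, which is how PBT surfaces the boundary values and extreme conditions the abstract describes, while the EBT cases pin down specific known-tricky inputs.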

Thu 20 Nov

Displayed time zone: Seoul

10:30 - 11:50
LLM-Based Software Testing and Quality Assurance
Main Track / Benchmark & Dataset Track at Grand Hall 1
Chair(s): Xiaoning Du Monash University
10:30
8m
Talk
Understanding the Characteristics of LLM-Generated Property-Based Tests in Exploring Edge Cases
Main Track
Hidetake Tanaka Nara Institute of Science and Technology, Haruto Tanaka Nara Institute of Science and Technology, Kazumasa Shimari Nara Institute of Science and Technology, Kenichi Matsumoto Nara Institute of Science and Technology
Pre-print
10:38
8m
Talk
Understanding LLM-Driven Test Oracle Generation
Main Track
Adam Bodicoat University of Auckland, Gunel Jahangirova King's College London, Valerio Terragni University of Auckland
10:46
8m
Talk
Turning Manual Tasks into Actions: Assessing the Effectiveness of Gemini-generated Selenium Tests
Main Track
Myron David Peixoto Federal University of Alagoas, Baldoino Fonseca Federal University of Alagoas, Davy Baía Federal University of Alagoas, Kevin Lira North Carolina State University, Márcio Ribeiro Federal University of Alagoas, Wesley K.G. Assunção North Carolina State University, Nathalia Nascimento Pennsylvania State University, Paulo Alencar University of Waterloo
File Attached
10:54
8m
Talk
Software Testing with Large Language Models: An Interview Study with Practitioners
Main Track
Maria Deolinda CESAR School, Cleyton Magalhaes Universidade Federal Rural de Pernambuco, Ronnie de Souza Santos University of Calgary
11:02
8m
Talk
HPCAgentTester: A Multi-Agent LLM Approach for Enhanced HPC Unit Test Generation
Main Track
Rabimba Karanjai University of Houston, Lei Xu Kent State University, Weidong Shi University of Houston
11:10
8m
Talk
Assertion-Aware Test Code Summarization with Large Language Models
Benchmark & Dataset Track
Anamul Haque Mollah University of North Texas, Ahmed Aljohani University of North Texas, Hyunsook Do University of North Texas
DOI Pre-print
11:20
30m
Live Q&A
Joint Q&A and Discussion #LLMforTesting
Main Track