AIware 2025
Wed 19 - Thu 20 November 2025
co-located with ASE 2025

Software vulnerabilities pose a significant security concern, given the widespread reliance on software systems. In response, recent research has turned to Large Language Models (LLMs) as a means to automate vulnerability repair. However, most existing studies focus on specific backend programming languages, such as C/C++, Java, or Python, which limits our understanding of how LLMs perform on front-end programming languages, such as JavaScript, TypeScript, and PHP. This study investigates the effectiveness of three state-of-the-art language models, GPT-4.1, Claude Opus 4, and Gemini 2.5 Pro, in repairing vulnerabilities across these front-end programming languages, which are widely used in web development and frequently targeted in real-world exploits. To this end, we curated a dataset comprising 4,900 CVEs and 5,005 associated commits from 2,432 open-source projects spanning JavaScript, TypeScript, and PHP. The results indicate that GPT-4.1 is the most consistently effective model, while Claude Opus 4 often produces the most human-like patches. Our analysis highlights the strengths and limitations of each model, indicating that while LLMs hold promise for automated vulnerability repair, their effectiveness remains uneven across front-end languages.

Thu 20 Nov

Displayed time zone: Seoul

16:00 - 16:50
Evaluation Frameworks and Quantitative Assessment of LLMs (Part 2)
Main Track / Benchmark & Dataset Track at Grand Hall 1
Chair(s): Zhou Yang University of Alberta, Alberta Machine Intelligence Institute
16:00
8m
Talk
PromptExp: Multi-granularity Prompt Explanation of Large Language Models
Main Track
Ximing Dong Centre for Software Excellence, Huawei Canada, Shaowei Wang University of Manitoba, Dayi Lin Centre for Software Excellence, Huawei Canada, Gopi Krishnan Rajbahadur Centre for Software Excellence, Huawei Canada, Ahmed E. Hassan Queen’s University
16:08
8m
Talk
Beyond Code Explanations: A Ray of Hope for Cross-Language Vulnerability Repair
Main Track
Kevin Lira North Carolina State University, Baldoino Fonseca Federal University of Alagoas, Wesley K.G. Assunção North Carolina State University, Davy Baía Federal University of Alagoas, Márcio Ribeiro Federal University of Alagoas, Brazil
Pre-print
16:16
8m
Talk
Secure Code Generation at Scale with Reflexion
Benchmark & Dataset Track
Arup Datta University of North Texas, Ahmed Aljohani University of North Texas, Hyunsook Do University of North Texas
Pre-print
16:24
5m
Talk
A Tool for Benchmarking Large Language Models' Robustness in Assessing the Realism of Driving Scenarios
Benchmark & Dataset Track
Jiahui Wu Simula Research Laboratory and University of Oslo, Chengjie Lu Simula Research Laboratory and University of Oslo, Aitor Arrieta Mondragon University, Shaukat Ali Simula Research Laboratory and Oslo Metropolitan University
Pre-print
16:29
21m
Live Q&A
Joint Q&A and Discussion #LLMAssessment
Main Track