Researchers Accelerate Fault Localization in Software Development

Modern software applications typically involve numerous files and millions of lines of code. Due to this vast volume, identifying and fixing faults, a process known as debugging, can be challenging.

In many software companies, developers still search for faults manually, which consumes a significant portion of their working hours. Research shows that this search can account for between 30% and 90% of total development time.

Birgit Hofer and Thomas Hirsch from the Institute of Software Technology at Graz University of Technology (TU Graz) have developed an approach that combines existing natural language processing techniques with software metrics to significantly speed up the search for faulty code and, consequently, debugging.

Fault Localization Consumes the Most Time

“We began by surveying developers to identify the biggest time drains during debugging. It became clear that fixing bugs is not the primary issue; rather, programmers are mostly hindered by locating faults, that is, by narrowing down the search to the relevant section of the code,” explains Birgit Hofer.

With this insight, the researchers aimed to develop a solution that is scalable to applications with extensive codebases.

While model-based approaches, which convert programs into logical representations (models), are efficient for small programs, they become impractical for larger ones due to exponentially increasing computational demands.

In contrast, the method developed by Birgit Hofer and Thomas Hirsch quantifies certain software properties, such as code readability or complexity, allowing it to handle large codebases efficiently. The computational effort increases only linearly with the size of the code.
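To illustrate why metric-based analysis scales linearly, here is a minimal sketch that computes a few simple properties of a source file in a single pass over its lines. The function name `file_metrics` and the chosen metrics (non-blank line count, average line length, number of branching keywords) are assumptions made for illustration; the article does not specify which metrics the researchers use.

```python
import re

# Hypothetical proxy for complexity: count common branching keywords.
BRANCH_KEYWORDS = re.compile(r"\b(if|for|while|case|catch|except)\b")


def file_metrics(source: str) -> dict:
    """Compute a few illustrative metrics in one pass over the non-blank lines."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    if not lines:
        return {"lines": 0, "avg_line_length": 0.0, "branches": 0}
    branches = sum(len(BRANCH_KEYWORDS.findall(ln)) for ln in lines)
    total_chars = sum(len(ln) for ln in lines)
    return {
        "lines": len(lines),                           # size
        "avg_line_length": total_chars / len(lines),   # rough readability proxy
        "branches": branches,                          # rough complexity proxy
    }
```

Each line is visited a constant number of times, so the work grows in proportion to the amount of code, in contrast to approaches whose cost grows exponentially with program size.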

Comparing Bug Descriptions and Code

The process of fault localization begins with the bug report, where testers or users provide details about the observed issue, including software version, operating system, steps taken before the failure, and other relevant information.

Starting from this bug report, the system combines natural language processing with software metrics to analyze the entire codebase, examining elements such as classes, variable names, files, methods, functions, and their calls.

The system identifies code sections that most closely match the bug report, resulting in a list of five to ten files ranked by their likelihood of containing the issue. Developers also receive information on the most probable type of fault involved, which helps them locate and fix the bug more efficiently.
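The matching step can be sketched with a standard text-retrieval technique. The example below ranks files by TF-IDF cosine similarity between the bug report and the words found in each file, splitting identifiers such as `checkPassword` into separate words. This is a generic stand-in, not the TU Graz system: the function names `tokenize` and `rank_files`, the weighting formula, and the `top_n` parameter are all assumptions for illustration, and the sketch does not predict the fault type.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase words, splitting camelCase and snake_case identifiers."""
    tokens = []
    for word in re.findall(r"[A-Za-z]+", text):
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
        tokens.extend(p.lower() for p in parts)
    return tokens


def rank_files(bug_report: str, files: dict[str, str],
               top_n: int = 10) -> list[tuple[str, float]]:
    """Return up to top_n (path, score) pairs, most similar file first."""
    docs = {path: Counter(tokenize(src)) for path, src in files.items()}
    n_docs = len(docs)

    # Document frequency: in how many files does each term occur?
    df = Counter()
    for counts in docs.values():
        df.update(counts.keys())

    def weights(counts: Counter) -> dict[str, float]:
        # Smoothed TF-IDF: rare terms weigh more than common ones.
        return {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
                for t, c in counts.items()}

    def cosine(a: dict[str, float], b: dict[str, float]) -> float:
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query = weights(Counter(tokenize(bug_report)))
    scored = [(path, cosine(query, weights(counts)))
              for path, counts in docs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]
```

A call such as `rank_files(report_text, {"Login.java": login_source, "Cart.java": cart_source})` returns the files ordered by how closely their vocabulary matches the report, which mirrors the ranked short list described above.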

“Software developers’ time is valuable, yet they often spend a significant portion of it locating and fixing bugs instead of developing new features,” says Birgit Hofer.

“While there are existing methods to address this problem, we have explored how to combine and enhance them to create a foundation for commercial use. The system is functional, but to integrate it into a company, it would need to be tailored to specific needs.”
