International Journal of Advanced Computer Science and Applications (IJACSA), Volume 17, Issue 2, 2026.
Abstract: This study introduces a structured framework for evaluating the security of Java applications generated by large language models (LLMs) and presents the results of applying it to three models: DeepSeek, GPT-4, and Llama 4. The framework integrates Open Web Application Security Project (OWASP)-supported tools, namely SpotBugs with FindSecBugs, OWASP Dependency Check, and OWASP Zed Attack Proxy (ZAP), alongside the NIST Risk Management Framework. These tools and standards were selected because they are publicly available, which allows the process to be replicated and extended without proprietary licensing, and because they align with widely adopted industry benchmarks. The testing methodology for generated Java applications comprises static code analysis, third-party dependency checking, and dynamic attack simulation, and each tool used in the study targets a specific category of critical vulnerabilities. Identified vulnerabilities are then evaluated against NIST risk analysis standards to characterize their threat sources, likelihoods, and impacts, as well as their implications for the overall security risk profile of each application. The effect of prompt design is also explored by comparing a neutral prompt against a security-emphasized prompt incorporating OWASP best practices. Results varied considerably across models: GPT-4 showed reductions of 33.3% and 53.8% in critical and high-severity vulnerabilities, respectively. However, Llama 4 and DeepSeek saw an increase in vulnerabilities from the neutral to the secure prompt. Llama 4 had a general increase of 10-15% across critical, high, and medium-severity vulnerabilities, while DeepSeek saw no change in high-severity vulnerabilities and a 40% increase in low-severity vulnerabilities.
The framework presented provides a structured process for evaluating LLM-generated code against established software development and security standards, while identifying present limitations and possible directions for future work.
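The percentage figures in the abstract compare vulnerability counts under the neutral prompt with counts under the security-emphasized prompt. The sketch below illustrates that arithmetic only. The counts in `example_counts` are hypothetical, chosen so that the output matches the percentages quoted above; the abstract does not report the underlying counts, and the function names are illustrative rather than taken from the paper.

```python
def percent_change(neutral: int, secure: int) -> float:
    """Signed percent change from the neutral-prompt count to the secure-prompt count.

    Negative values mean fewer vulnerabilities under the secure prompt.
    """
    if neutral <= 0:
        raise ValueError("neutral count must be positive to compute a percent change")
    return (secure - neutral) / neutral * 100.0


# HYPOTHETICAL counts (neutral prompt, secure prompt) per severity level.
# They are NOT from the paper; they only reproduce the quoted percentages.
example_counts = {
    "critical": (3, 2),   # one fewer finding out of three
    "high": (13, 6),      # seven fewer findings out of thirteen
    "low": (5, 7),        # two more findings than the original five
}


def summarize(counts: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Return the percent change per severity, rounded to one decimal place."""
    return {
        severity: round(percent_change(neutral, secure), 1)
        for severity, (neutral, secure) in counts.items()
    }


if __name__ == "__main__":
    for severity, change in summarize(example_counts).items():
        direction = "reduction" if change < 0 else "increase"
        print(f"{severity}: {abs(change)}% {direction}")
```

With small absolute counts, a single additional or missing finding shifts the percentage substantially, which is worth keeping in mind when comparing models on relative change alone.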
Dominic Niceforo and Haydar Cukurtepe. “Evaluating the Efficiency of LLM-Generated Software in Resisting Malicious Attacks”. International Journal of Advanced Computer Science and Applications (IJACSA) 17.2 (2026). http://dx.doi.org/10.14569/IJACSA.2026.0170204
@article{Niceforo2026,
title = {Evaluating the Efficiency of LLM-Generated Software in Resisting Malicious Attacks},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2026.0170204},
url = {http://dx.doi.org/10.14569/IJACSA.2026.0170204},
year = {2026},
publisher = {The Science and Information Organization},
volume = {17},
number = {2},
author = {Dominic Niceforo and Haydar Cukurtepe}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.