Exploring the Utilization of Program Semantics in Extreme Code Summarization: An Experimental Study Based on Acceptability Evaluation

Jiuli Li; Yan Liu

doi:10.14569/IJACSA.2023.0141077

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 14, Issue 10, 2023.

Abstract: With the rise of deep learning, neural network architectures adapted from neural machine translation have been widely studied for code summarization, where they learn from the sequential content of code. Given the inherent nature of programming languages, learning a representation of source code from its parsed structural information is another typical way to construct code summarization models. Recent studies show that the overall performance of neural code summarization models can be improved by combining sequential and structural information in a hybrid manner. However, neither kind of information, as fed to the models, captures the semantics of source code snippets explicitly. Is it really a good idea to leave the semantics hidden in the source code and let the neural models capture whatever they can? To observe how program semantics are utilized in automatic code summarization, we conducted an experimental study that analyzes the acceptability of the extreme code summaries generated by neural models. To align the models in the same context and to focus on observing the semantics, we re-implemented the neural models from three selected studies as extreme code summarization solutions. After an intuitive observation and exploration of the summaries generated by models trained on a Java dataset, we identified five acceptability aspects: (1) function name format; (2) function naming style; (3) semantic-level similarity; (4) differences in the hit rate of representative words; and (5) the correlation between extreme code summaries and the function body. Based on the false negative and false positive phenomena in the results, ablation experiments show that using program semantics has a positive effect on generating high-quality summaries with neural models. Our work demonstrates the potential of utilizing program semantics explicitly in code summarization and indicates possible directions.
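
As an illustration only, the sketch below shows one simple way that word-level overlap between a generated function name and its reference name could be measured. It is not the authors' evaluation procedure: the helper names (split_subtokens, subtoken_scores), the subtoken-splitting rule, and the example identifiers are all assumptions introduced here. It treats extreme code summarization as function-name prediction and compares names by their lowercase subtokens rather than by exact string match.

```python
import re
from collections import Counter


def split_subtokens(name):
    """Split a camelCase or snake_case identifier into lowercase subtokens.

    Hypothetical helper: the splitting rule is an assumption, not taken
    from the paper.
    """
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def subtoken_scores(predicted, reference):
    """Return (precision, recall, F1) over the subtokens of two names."""
    pred = split_subtokens(predicted)
    ref = split_subtokens(reference)
    if not pred or not ref:
        return 0.0, 0.0, 0.0
    # Multiset intersection: each reference subtoken can be matched once.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical pair: the names differ as strings but share two of
    # three subtokens, so an exact-match check would reject the prediction.
    predicted, reference = "readFileName", "getFileName"
    print(predicted == reference)
    print(subtoken_scores(predicted, reference))
```

A pair like this scores zero under exact match while sharing most of its subtokens, which is one reason a surface metric alone may under-report summaries that a reader would accept.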

Keywords: Extreme code summarization; program semantics utilization; acceptability analysis of code summary

Jiuli Li and Yan Liu, “Exploring the Utilization of Program Semantics in Extreme Code Summarization: An Experimental Study Based on Acceptability Evaluation,” International Journal of Advanced Computer Science and Applications (IJACSA), 14(10), 2023. http://dx.doi.org/10.14569/IJACSA.2023.0141077

@article{Li2023,
title = {Exploring the Utilization of Program Semantics in Extreme Code Summarization: An Experimental Study Based on Acceptability Evaluation},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2023.0141077},
url = {http://dx.doi.org/10.14569/IJACSA.2023.0141077},
year = {2023},
publisher = {The Science and Information Organization},
volume = {14},
number = {10},
author = {Jiuli Li and Yan Liu}
}

Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.