Program Comprehension

IDEAL: An Open-Source Identifier Name Appraisal Tool

Developers must comprehend the code they will maintain, meaning that the code must be legible and reasonably self-descriptive. Unfortunately, there is still a lack of research and tooling that supports developers in understanding their naming …

An Ensemble Approach for Annotating Source Code Identifiers with Part-of-speech Tags

This paper presents an ensemble part-of-speech tagging approach for source code identifiers. Ensemble tagging is a technique that uses machine learning and the output from multiple part-of-speech taggers to annotate natural language text at a higher …

Using Grammar Patterns to Interpret Test Method Name Evolution

It is good practice to name test methods such that they are comprehensible to developers; they must be written in such a way that their purpose and functionality are clear to those who will maintain them. Unfortunately, there is little automated …

On the Generation, Structure, and Semantics of Grammar Patterns in Source Code Identifiers

Identifiers make up a majority of the text in code. They are one of the most basic mediums through which developers describe the code they create and understand the code that others create. Therefore, understanding the patterns latent in identifier …

Contextualizing Rename Decisions using Refactorings, Commit Messages, and Data Types

Identifier names are the atoms of program comprehension. Weak identifier names decrease developer productivity and degrade the performance of automated approaches that leverage identifier names in source code analysis, threatening many of the …

An Empirical Study of Abbreviations and Expansions in Software Artifacts

Expanding abbreviations is an important text normalization technique, used either to increase developer comprehension or to support the application of natural-language-based tools to source code identifiers. This paper closely …

An Open Dataset of Abbreviations and Expansions

We present a dataset of abbreviations and expansions, derived from a set of five open-source systems, for use by the research and development communities.

Contextualizing Rename Decisions using Refactorings and Commit Messages

Identifier names are the atoms of comprehension; weak identifier names decrease productivity by increasing the chance that developers make mistakes and increasing the time taken to understand chunks of code. Therefore, it is vital to support …

Modeling the Relationship Between Identifier Name and Behavior

This paper presents the features of a model that relates the natural language found in identifiers with program semantics. The model takes advantage of part-of-speech information and static-analysis-based program models to understand how different …

Towards a Model to Appraise and Suggest Identifier Names

Identifiers in the source code of a software system play a vital, if often unrecognized, role in determining the quality of the system. Ambiguous and confusing identifier names lead developers to not only misunderstand the behavior of the code but also …