Impact of stack overflow code snippets on software cohesion: a preliminary study
Developers frequently copy code snippets from publicly-available resources such as Stack Overflow (SO). While this may lead to a ‘quick fix’ for a development problem, little is known about how these copied code snippets affect the code quality of the recipient application, or how the quality of the recipient classes subsequently evolves over the time of the project. This has an impact on whether such code copying should be encouraged, and how classes that receive such code snippets should be monitored during evolution. To investigate this issue, we used instances from the SOTorrent database where Java snippets had been copied from Stack Overflow into GitHub projects. In each case, we measured the quality of the recipient class just prior to the addition of the snippet, immediately after the addition of the snippet, and at a later stage in the project. Our goal was to determine if the addition of the snippet caused quality to improve or deteriorate, and what the long-term implications were for the quality of the recipient class. Code quality was measured using the cohesion metrics Low-level Similarity-based Class Cohesion (LSCC) and Class Cohesion (CC). Over a random sample of 378 classes that received code snippets copied from Stack Overflow to GitHub, we found that in almost 70% of the cases where the copied snippet affected cohesion, the effect was to reduce the cohesion of the recipient class. Furthermore, this deterioration in cohesion tends to persist in the subsequent evolution of the recipient class. In over 70% of cases the recipient class never fully regained the cohesion it lost in receiving the snippet. These results suggest that when copying code snippets from external repositories, more attention should be paid to integrating the code with the recipient class.