Software Engineering for Smart Data Analytics & Smart Data Analytics for Software Engineering
Two rather unrelated meanings of the word “cultivation” turn out to be strongly connected when applied to the domain of code quality:
The existing code in an evolving software system can be seen as the soil in which new code and new functionality are growing. While working this “soil”, developers benefit from unobtrusively presented automatic feedback about the quality of their code. There are tools that verify the correct usage of good code structures (“design patterns”) and other tools that highlight improvement opportunities (“bad smells”).
Because design patterns and bad smells are usually presented and discussed separately, it has gone unnoticed that they partially contradict each other. We will show that even well-chosen design patterns can lead to bad smells. Thus, design quality is relative, which does not mean that it is arbitrary. Knowledge about design quality has to be stated more precisely. We suggest co-evolving the codified quality knowledge together with the code in a process of cultivation. Bad smell definitions can then easily be extended by taking existing design patterns into account.
When the design knowledge is cultivated together with the code, specific
knowledge like typical method names can be incorporated.
A case study explored unjustified “intensive coupling” smells in ArgoUML:
While a previously suggested generic structural criterion identified 13%
unjustified warnings, taking the specific names into account identified 90%.
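To make the idea of a name-aware smell definition concrete, here is a minimal Python sketch. It is not the tooling used in the case study. The thresholds, the exempt name prefixes, and the example method names are all hypothetical, and the meanings given for CINT and CDISP are the commonly used ones, assumed here rather than taken from this page.

```python
from dataclasses import dataclass

# Hypothetical values, chosen only for illustration.
MIN_CINT = 7        # assumed threshold: "many" calls to foreign methods
MAX_CDISP = 0.5     # assumed threshold: calls concentrated in few classes
EXEMPT_PREFIXES = ("test", "setUp", "init", "configure")  # assumed naming conventions


@dataclass
class MethodMetrics:
    name: str
    cint: int     # coupling intensity (assumed: number of distinct foreign methods called)
    cdisp: float  # coupling dispersion (assumed: provider classes divided by CINT)


def is_structural_candidate(m: MethodMetrics) -> bool:
    """Generic part of the smell definition, based on metrics only."""
    return m.cint > MIN_CINT and m.cdisp < MAX_CDISP


def is_justified_by_name(m: MethodMetrics) -> bool:
    """Project-specific part: names that signal configuration or test code."""
    return m.name.startswith(EXEMPT_PREFIXES)


def intensive_coupling_warnings(methods):
    """Report structural candidates that are not justified by their name."""
    return [
        m.name
        for m in methods
        if is_structural_candidate(m) and not is_justified_by_name(m)
    ]


# Made-up methods, not taken from ArgoUML.
methods = [
    MethodMetrics("testSaveProject", cint=12, cdisp=0.25),
    MethodMetrics("initPanels", cint=9, cdisp=0.3),
    MethodMetrics("renderDiagram", cint=10, cdisp=0.2),
    MethodMetrics("getName", cint=1, cdisp=1.0),
]
warnings = intensive_coupling_warnings(methods)
print(warnings)
```

Keeping the name check in a separate predicate means the project-specific knowledge can evolve with the code base without touching the generic metric-based part.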
2018-06-16: The last line of the abstract is based on an embarrassing error in a spreadsheet formula. After correcting this error and taking into account the naming conventions mentioned in the paper, we get the following values:
Smell | Based on CINT, CDISP | Restricted based on MAXNESTING | Restricted based on names | Restricted based on both |
---|---|---|---|---|
Intensive coupling | 447 | 238 (-47%) | 309 (-31%) | 205 (-54%) |
Dispersed coupling | 1152 | 654 (-43%) | 812 (-30%) | 568 (-51%) |
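The relative reductions in parentheses can be recomputed from the raw counts. A minimal Python sketch follows; the dictionary keys and the function name are illustrative, and only the counts come from the table.

```python
# Candidate counts copied from the table above.
counts = {
    "Intensive coupling": {"base": 447, "nesting": 238, "names": 309, "both": 205},
    "Dispersed coupling": {"base": 1152, "nesting": 654, "names": 812, "both": 568},
}


def reduction_percent(base: int, restricted: int) -> int:
    """Relative change against the base count, in percent, rounded to an integer."""
    return round(100 * (restricted - base) / base)


reductions = {
    smell: {
        key: reduction_percent(row["base"], row[key])
        for key in ("nesting", "names", "both")
    }
    for smell, row in counts.items()
}

for smell, row in reductions.items():
    print(smell, row)
```

The rounded results match the percentages given in the table.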
This means that restricting the smell candidates based on names alone reduces the numbers somewhat less than restricting them based on the MAXNESTING metric. Still, neither restricting criterion subsumes the other, so combining them increases the reduction. The naming criterion is motivated by a reason (“decoupled objects need to be brought together in configuration and test methods”), while the nesting criterion relies on expert experience that is not explained further. It might be worth revisiting the code in detail to check whether all naming conventions were captured.
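How far the two criteria overlap can be estimated from the four counts by inclusion-exclusion. The sketch below assumes that each restriction is a pure filter on the same base candidate set and that "restricted based on both" keeps exactly the candidates that pass both filters. The function and key names are illustrative.

```python
def removal_overlap(base: int, after_nesting: int, after_names: int, after_both: int) -> dict:
    """Split the removed candidates by which criterion removed them."""
    removed_by_nesting = base - after_nesting
    removed_by_names = base - after_names
    removed_by_either = base - after_both
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    removed_by_both = removed_by_nesting + removed_by_names - removed_by_either
    return {
        "only_nesting": removed_by_nesting - removed_by_both,
        "only_names": removed_by_names - removed_by_both,
        "both": removed_by_both,
    }


intensive = removal_overlap(447, 238, 309, 205)
dispersed = removal_overlap(1152, 654, 812, 568)
print(intensive)
print(dispersed)
```

Under these assumptions, 33 intensive coupling candidates are removed only by the naming criterion and 104 only by the nesting criterion, which is consistent with neither criterion subsuming the other.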