Software Engineering for Smart Data Analytics & Smart Data Analytics for Software Engineering
Start by creating a project for your analyses:
File → New… → Project
General → Project
Next
Finish
Create a Prolog file in the new project:
File → New… → File
Finish
Don't get scared if you do not know the Prolog programming language. Everything you need to know will be explained in this tutorial. (Later, if you want to learn more, you can continue here.)
For now, all you see is an empty Prolog editor. Here we are going to create our analysis. The goal of the analysis in this tutorial is to find all System.out.println calls in the source code. Enter the following code in the editor:
:- module(sysout_analysis, [ sysout_call/6 ]).

sysout_call(CallId, CallParent, Enclosing, FieldAccess, StaticTypeRef, Argument) :-
In this code we define a Prolog module with the name sysout_analysis and a predicate with the name sysout_call and six arguments. To see how a System.out.println call is represented in JTransformer, please have a look at the JT Tutorial Project. In the package logging there is the class SomeClass with a few sysout calls. If you open a sysout call in the Factbase inspector you will find something like this:
Before you continue you should have a look at the PEF-Documentation to understand what the different parts of the PEFs mean (callT, fieldAccessT, staticTypeRefT, literalT).
You can try to write the analysis yourself or just have a look at the following code example. You can also create a Source Template with the Copy to Clipboard Feature.
sysout_call(CallId, CallParent, Enclosing, FieldAccess, StaticTypeRef, Argument) :-
    % match the structure
    callT(CallId, CallParent, Enclosing, FieldAccess, [Argument], Println, [], _),
    fieldAccessT(FieldAccess, CallId, Enclosing, StaticTypeRef, Out, _),
    staticTypeRefT(StaticTypeRef, FieldAccess, Enclosing, System),
    % match the references to class 'System', field 'out' and method 'println'
    classT(System, _, 'System', _, _),
    fieldT(Out, System, PrintStream, 'out', _),
    methodT(Println, PrintStream, 'println', _, _, _, _, _).
In line 3 the callT fact is matched. We are looking for calls with exactly one Argument (5th argument) and no type arguments (7th argument). The other arguments are either unbound variables or underscores.
In line 4 the field access to the field “out” of the System class is matched. You can see that the Id of the fieldAccessT fact is referenced by the callT (since the field access is the receiver of the call).
In line 5 the static type reference to the class System is matched. This is the last PEF we need for matching the structure of a System.out.println call.
So far we have matched the structure of a call with one argument whose receiver is a field access on a static type reference. But that is only the structure; it does not guarantee that the call is a System.out.println. We also have to specify what the referenced elements should look like. This is done in lines 7 to 9: a class with the name “System” must be referenced, together with a field with the name “out” whose parent is the System class. The method must have the name “println”, and its parent has to be the type of the out field.
Put this code into your Prolog module, save it and call sysout_call(CallId, CallParent, Enclosing, FieldAccess, StaticTypeRef, Argument) on the console. Now you should see some results.
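If you want to experiment with the matching logic outside of JTransformer, the following sketch may help. It is a minimal, self-contained file for a plain Prolog system in which the PEFs are replaced by a few hand-written mock facts. All ids (call1, fa1, system_c, …) and the placeholder atom unused are assumptions made up for this illustration; real factbases use generated ids and the full argument layout described in the PEF documentation.

```prolog
% Hand-written mock facts (NOT produced by JTransformer). All ids are
% hypothetical atoms; 'unused' fills argument positions the analysis ignores.

% A call shaped like System.out.println("hello") inside the method m1.
callT(call1, stmt1, m1, fa1, [lit1], println_m, [], unused).
% A call whose receiver is not a field access, e.g. logger.log("x").
callT(call2, stmt2, m1, ident1, [lit2], log_m, [], unused).

fieldAccessT(fa1, call1, m1, str1, out_f, unused).
staticTypeRefT(str1, fa1, m1, system_c).

classT(system_c, unused, 'System', unused, unused).
fieldT(out_f, system_c, printstream_c, 'out', unused).
methodT(println_m, printstream_c, 'println', unused, unused, unused, unused, unused).
methodT(log_m, logger_c, 'log', unused, unused, unused, unused, unused).

% The analysis from this tutorial, unchanged.
sysout_call(CallId, CallParent, Enclosing, FieldAccess, StaticTypeRef, Argument) :-
    % match the structure
    callT(CallId, CallParent, Enclosing, FieldAccess, [Argument], Println, [], _),
    fieldAccessT(FieldAccess, CallId, Enclosing, StaticTypeRef, Out, _),
    staticTypeRefT(StaticTypeRef, FieldAccess, Enclosing, System),
    % match the references to class 'System', field 'out' and method 'println'
    classT(System, _, 'System', _, _),
    fieldT(Out, System, PrintStream, 'out', _),
    methodT(Println, PrintStream, 'println', _, _, _, _, _).

% Example query:
% ?- sysout_call(CallId, CallParent, Enclosing, FieldAccess, StaticTypeRef, Argument).
% The only solution binds CallId = call1, CallParent = stmt1, Enclosing = m1,
% FieldAccess = fa1, StaticTypeRef = str1, Argument = lit1.
```

The second call (call2) is not reported because there is no fieldAccessT fact for its receiver, so the structural match already fails in the second goal.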
It's not very practical to always check the results on the console. It is possible to add your own analysis to the Control Center. This is explained on the “Adding your analysis to the Control Center” page.
The example so far is enough to quickly find all sysout calls. But there are a few things which can be improved.
In line 7 of our example we find the reference to the class “System” by calling classT(System, _, 'System', _, _). This is not wrong, but it might lead to wrong results, since it only looks for a class with the name System. That does not mean that it is the class we want (which is java.lang.System). One way to make the analysis more robust would be to check the name of the package of the class (which means checking the parent compilation unit of the class and the parent package of the compilation unit). An easier way is to use the predefined predicate fully_qualified_name/2. Here you just enter the fully qualified name and get the id of this specific class, so you cannot get the wrong result even if there are other classes with the name System in your code.
Access via name (can lead to wrong results) | | Access via fqn (no wrong results since fqn is unique) |
---|---|---|
classT(System, _, 'System', _, _) | → | fully_qualified_name(System, 'java.lang.System') |
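The difference between the two lookups can be tried out with a small mock factbase. This is only a sketch: the ids jls_c and app_c, the class myapp.System and the placeholder atom unused are invented for illustration, and fully_qualified_name/2 is defined here as a hand-written stand-in for the predefined JTransformer predicate.

```prolog
% Mock facts; ids are hypothetical. fully_qualified_name/2 below is a
% hand-written stand-in, not the predefined JTransformer implementation.
classT(jls_c, unused, 'System', unused, unused).   % plays the role of java.lang.System
classT(app_c, unused, 'System', unused, unused).   % a project class that is also called System

fully_qualified_name(jls_c, 'java.lang.System').
fully_qualified_name(app_c, 'myapp.System').

% Lookup by simple name: matches every class called 'System'.
system_by_name(Id) :-
    classT(Id, _, 'System', _, _).

% Lookup by fully qualified name: matches exactly one class.
system_by_fqn(Id) :-
    fully_qualified_name(Id, 'java.lang.System').

% ?- findall(Id, system_by_name(Id), Ids).   % Ids = [jls_c, app_c]
% ?- findall(Id, system_by_fqn(Id), Ids).    % Ids = [jls_c]
```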
If you are looking for System.out.println calls, you normally do this because you want to replace them with calls to some logging method. In that case, the logging method itself should be allowed to call System.out.println. In the JT Tutorial example project this means that you don't want a warning marker in the class “MyLogger”. To achieve this, you have to limit the scope of the analysis so that it doesn't find the approved calls.
sysout_call(CallId, CallParent, Enclosing, FieldAccess, StaticTypeRef, Argument) :-
    ...
    % limit the scope
    methodT(Enclosing, ParentClass, _, _, _, _, _, _),
    not(fully_qualified_name(ParentClass, 'logging.MyLogger')).
Enclosing is the enclosing method of the call. We make sure that this method is NOT part of the class MyLogger. Of course there are various ways of limiting the scope. You could also write an analysis that only finds results in a specific class, in a specific package, and so on.
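As a sketch of two such scope variants, the following self-contained file excludes one class or, alternatively, restricts the results to one class. Everything except methodT and the class names from the tutorial project is an assumption: sysout_candidate/2 is a hypothetical stand-in for the structural match, the ids are invented, and fully_qualified_name/2 is mocked with plain facts.

```prolog
% Mock facts (hypothetical ids). sysout_candidate/2 stands in for the
% structural match: it pairs a call id with its enclosing method.
sysout_candidate(call1, work_m).
sysout_candidate(call2, log_m).

methodT(work_m, some_c,   'doWork', unused, unused, unused, unused, unused).
methodT(log_m,  logger_c, 'log',    unused, unused, unused, unused, unused).

% Hand-written stand-in for the predefined JTransformer predicate.
fully_qualified_name(some_c,   'logging.SomeClass').
fully_qualified_name(logger_c, 'logging.MyLogger').

% Variant 1: report calls everywhere except in the class ExcludedFqn.
sysout_outside(CallId, ExcludedFqn) :-
    sysout_candidate(CallId, Enclosing),
    methodT(Enclosing, ParentClass, _, _, _, _, _, _),
    \+ fully_qualified_name(ParentClass, ExcludedFqn).

% Variant 2: report only calls inside the class Fqn.
sysout_inside(CallId, Fqn) :-
    sysout_candidate(CallId, Enclosing),
    methodT(Enclosing, ParentClass, _, _, _, _, _, _),
    fully_qualified_name(ParentClass, Fqn).

% ?- sysout_outside(CallId, 'logging.MyLogger').   % CallId = call1
% ?- sysout_inside(CallId, 'logging.SomeClass').   % CallId = call1
```

Note that the negated goal comes after methodT has bound ParentClass. With an unbound ParentClass the negation would fail as soon as any class with the excluded name exists, and the analysis would report nothing. \+ is the standard operator for the negation written as not/1 in the tutorial code.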
In the “Creating Analyses (As) and Transformations (Ts)” section it was already explained how to use the completion in the console. If you try this with your analysis you will notice that it doesn't recognize the parameter names (and just replaces them with A, B, …). To enable code completion with the correct names for your predicate, you have to add PlDoc comments to your code. The easiest way is to just copy the predicate head into the comment:
%% sysout_call(CallId, CallParent, Enclosing, FieldAccess, StaticTypeRef, Argument)
sysout_call(CallId, CallParent, Enclosing, FieldAccess, StaticTypeRef, Argument) :-
Now, if you use the completion, it will add the correct parameter names.
This is the complete example code with all the improved features described on this page.
:- module(sysout_analysis, [ sysout_call/6 ]).

%% sysout_call(CallId, CallParent, Enclosing, FieldAccess, StaticTypeRef, Argument)
sysout_call(CallId, CallParent, Enclosing, FieldAccess, StaticTypeRef, Argument) :-
    % match the structure
    callT(CallId, CallParent, Enclosing, FieldAccess, [Argument], Println, [], _),
    fieldAccessT(FieldAccess, CallId, Enclosing, StaticTypeRef, Out, _),
    staticTypeRefT(StaticTypeRef, FieldAccess, Enclosing, System),
    % match the references to class 'System', field 'out' and method 'println'
    fully_qualified_name(System, 'java.lang.System'),
    fieldT(Out, System, PrintStream, 'out', _),
    methodT(Println, PrintStream, 'println', _, _, _, _, _),
    % limit the scope
    methodT(Enclosing, ParentClass, _, _, _, _, _, _),
    not(fully_qualified_name(ParentClass, 'logging.MyLogger')).
You can download my_analysis.zip and import it into Eclipse. This file contains the System.out.println detector shown above plus the code that implements its connection to JTransformer, the related transformation and its declaration as a quick fix.