SDA SE Wiki

Software Engineering for Smart Data Analytics & Smart Data Analytics for Software Engineering


Lab: "Semantic Data Web Technologies"

Dr. Günter Kniesel, Lars Reimann, Firas Kassawat


General Information

This lab is part of the Intelligent Systems track of the M.Sc. curriculum.

Semantic Data Web Technologies

The lab will consist of two tracks, each with topics for groups of 2 to 4 students.

Track 1 will be dedicated to the use of semantic data web technologies for representing semantic information about software systems. In particular, we will extract information from the code and the documentation of a widely used machine learning library, scikit-learn, and represent this information as a knowledge graph (KG) formed by RDF triples. The KG will later be used for a variety of tasks that could be topics for master's theses.
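To make the idea of a KG formed by RDF triples more concrete, here is a minimal sketch that writes a few statements about a scikit-learn estimator in N-Triples syntax, using only the Python standard library. The namespace `http://example.org/sklearn-kg#` and the terms `Estimator`, `hasParameter` and `minValue` are hypothetical placeholders chosen for illustration; the actual ontology will be developed in the lab.

```python
# Hypothetical namespace and vocabulary; the lab's real ontology is not defined yet.
EX = "http://example.org/sklearn-kg#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def iri(value):
    """Format an IRI as an N-Triples term."""
    return f"<{value}>"


def typed_literal(value, datatype):
    """Format a typed literal as an N-Triples term."""
    return f'"{value}"^^<{datatype}>'


# Each triple is (subject, predicate, object), already formatted as N-Triples terms.
TRIPLES = [
    (iri(EX + "KMeans"), iri(RDF_TYPE), iri(EX + "Estimator")),
    (iri(EX + "KMeans"), iri(EX + "hasParameter"), iri(EX + "KMeans.n_clusters")),
    (iri(EX + "KMeans.n_clusters"), iri(EX + "minValue"), typed_literal(1, XSD_INTEGER)),
]


def to_ntriples(triples):
    """Serialize triples as one N-Triples statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
```

In practice a dedicated RDF library would probably replace the hand-written serialization; the sketch only illustrates the subject, predicate, object structure.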

The work in this track will encompass creation of a KG that describes the scikit-learn library:

  • Development of the ontology that specifies the semantic structure of the KG (1 person) → The topic requires much conceptual work and much communication with all other lab members but very little coding (validation, consistency checking)
  • Extraction of range boundary constraints from Python documentation using NLP techniques (2-person team) → NLP, coding, testing, evaluation
  • Extraction of dependency constraints from Python documentation using NLP techniques (2-person team) → NLP, coding, testing, evaluation
  • Extraction of side-effect descriptions (change of state) by analysis of Python library code (4-person team) → static analysis, coding, testing, evaluation
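As a rough illustration of what extracting a range boundary constraint from documentation text could look like, the following sketch uses plain regular expressions from the standard library. This is a simplified stand-in for the NLP techniques the topic asks for, not the method used in the lab; the names `Range` and `extract_range` and the two phrasings it recognizes are assumptions made for this example.

```python
import re
from typing import NamedTuple, Optional


class Range(NamedTuple):
    low: float
    high: float
    low_inclusive: bool
    high_inclusive: bool


# A signed integer or decimal number such as "0", "-1" or "0.5".
_NUM = r"[-+]?\d+(?:\.\d+)?"

# Interval notation such as "[0, 1]" or "(0.0, 1.0]".
_INTERVAL = re.compile(rf"([\[\(])\s*({_NUM})\s*,\s*({_NUM})\s*([\]\)])")

# Phrases such as "between 0 and 1".
_BETWEEN = re.compile(rf"between\s+({_NUM})\s+and\s+({_NUM})", re.IGNORECASE)


def extract_range(text: str) -> Optional[Range]:
    """Return the first range found in a documentation sentence, or None."""
    match = _INTERVAL.search(text)
    if match:
        open_bracket, low, high, close_bracket = match.groups()
        return Range(float(low), float(high), open_bracket == "[", close_bracket == "]")
    match = _BETWEEN.search(text)
    if match:
        low, high = match.groups()
        # Assumption: "between a and b" is read as inclusive on both ends.
        return Range(float(low), float(high), True, True)
    return None


if __name__ == "__main__":
    print(extract_range("Should be in the interval (0.0, 1.0]."))
    print(extract_range("The value must be between 0 and 1."))
```

A pattern-based approach like this also matches unrelated text that happens to look like an interval, for example a tuple such as `(3, 3)` describing a shape. That kind of ambiguity is one reason the topic calls for NLP techniques and a proper evaluation.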

Track 2 will be dedicated to “Explainability”. A machine learning algorithm is said to be explainable if humans can understand why its results make sense. Explainability is an active area of research because the non-explainability of results is one of the main factors hampering wider adoption of ML, especially for critical tasks. Some explainability approaches already exist, but only for numerical data. In the lab we will try to generalize some of them to the domain of the semantic web, that is, RDF knowledge graphs. In particular, possible groups could work on

  • Benchmarking of available algorithms
  • Implementing an algorithm on a relational or image dataset
  • Implementing an algorithm on a semantic dataset
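To give an impression of what an explainability approach for numerical data can look like, here is a minimal sketch of permutation feature importance: a feature is considered important if permuting its column makes the model's error grow. This page does not say which approaches will be studied in the lab, so the choice of technique is ours for illustration only. To keep the result reproducible, the sketch permutes each column with a fixed cyclic shift instead of the usual random shuffle; all function names are hypothetical.

```python
def mean_squared_error(model, rows, targets):
    """Average squared difference between model predictions and targets."""
    return sum((model(row) - t) ** 2 for row, t in zip(rows, targets)) / len(rows)


def rotate_column(rows, col):
    """Copy rows so that each row receives the value of the next row in column col (cyclically)."""
    values = [row[col] for row in rows]
    rotated = values[1:] + values[:1]
    return [list(row[:col]) + [v] + list(row[col + 1:]) for row, v in zip(rows, rotated)]


def permutation_importance(model, rows, targets):
    """Increase in error per feature when that feature's column is permuted."""
    baseline = mean_squared_error(model, rows, targets)
    return [
        mean_squared_error(model, rotate_column(rows, col), targets) - baseline
        for col in range(len(rows[0]))
    ]


def toy_model(row):
    """Toy model that uses only the first feature and ignores the second."""
    return 2 * row[0]


ROWS = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
TARGETS = [2.0, 4.0, 6.0, 8.0]

if __name__ == "__main__":
    print(permutation_importance(toy_model, ROWS, TARGETS))  # [12.0, 0.0]
```

The ignored second feature gets an importance of zero, which matches the intuition that it plays no role in the model's results. Carrying this kind of reasoning over to RDF knowledge graphs, where there are no simple numerical columns to permute, is the open question of this track.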

Interested in further details? Then see the registration details for the info meeting below.

Prerequisites

Lab participants must be familiar with

  • the concepts of the semantic web (RDF, OWL)
  • version management (Git)
  • unit testing (e.g. pytest or JUnit)
  • object-oriented programming (you will need good programming skills)

If the number of interested students exceeds the number of available seats, priority will be given to those who have additional knowledge in at least one of the following:

  • Python programming
  • Natural language processing, especially spaCy
  • Static analysis, especially astroid
  • Parsing techniques (regular expressions, context free grammars, recursive descent parsers)

Format

  • Weekly meetings with supervisor (30 minutes per group of two students)
  • Biweekly public status presentations (5 min per person)
  • Final presentation on the last day of the course (20 min)
  • A detailed user manual and code documentation must be delivered by Monday, March 13, 2022 (instead of a final project report).

The final deliverables (code, documentation, manual) will be peer-reviewed.

Registration and Info Meeting

Interested students are required to send an e-mail to all organizers mentioned at the bottom of the page. In this mail you should state

  • your curriculum,
  • matriculation number,
  • semester of study and
  • your background in the areas listed above as prerequisites. Be honest: we will quickly find out if you tried to cheat (see below).

If we think that you have sufficient background, we will send you an invitation to the info meeting. The info meeting will introduce the above-mentioned topics in more detail and explain the requirements for the lab, the infrastructure, the schedule, etc. You will then be able to choose a topic.

We will form groups for each topic based on our assessment of your skills. If in doubt, we will give you a small task that requires the respective skills and assess how well you can solve it.

The info meeting will take place on Friday, 15.10.2021, 9:00 (st) in room 1.047. Depending on the development of the COVID-19 pandemic, the place could change. In the worst case, the info meeting will be online.

Place and Time

The following times are fixed:

  • Plenary Meetings: each Friday at 9:00 (st!), starting 15.10.2021 → always in room 1.047
  • Talk with supervisor: each Friday, time slots assigned per group, starting 22.10.2021 → always in room 1.067
  • Group work: Room 1.037 is reserved each Friday for your use, starting 15.10.2021.

Depending on the development of the COVID-19 pandemic, the places could change. In the worst case, the lab will be partly or fully online.

Mailing List (will be set up after info meeting)

  • mdse course lists iai uni bonn de (fill spaces with “- @ . . - .”)

Organizers

Who                 E-mail                    Tel              Office
Dr. Günter Kniesel  gk cs uni bonn de         (0228) 73-4511   1.066
Lars Reimann        reimann cs uni bonn de    (0228) 73-4…     1.065
Firas Kassawat      kassawat cs uni bonn de   (0228) 73-4…     1.065

(fill spaces with “@ . . - .”)

teaching/labs/sdwt/2021/start.txt · Last modified: 2021/08/14 10:36 by Günter Kniesel

SEWiki, © 2021