Publication Overview

The original papers can be found here:

Publication Specific Documentation

For recently published papers, this page provides a quick entry point to the corresponding modules. All of these modules are also accessible through the standard documentation.

SimE4KG: Explainable Distributed multi-modal Semantic Similarity Estimation for Knowledge Graphs

This release includes the most recent developments of the SimE4KG framework. SimE4KG stands for Explainable Distributed In-Memory multi-modal Semantic Similarity Estimation for Knowledge Graphs.

Overview

In this release we introduce multiple changes to the SANSA Stack to offer the SimE4KG functionality. The content is structured as follows:

  • Release
  • Databricks Notebooks
  • ReadMe of novel Modules
  • Novel Classes
  • Unit Tests
  • Datasets
  • Further Reading

Release

The changes are available in this release here.

SimE4KG Databricks Notebook

To demonstrate the usage of the SimE4KG modules in a hands-on session, we provide multiple Databricks notebooks. These show the full pipeline as well as dedicated parts such as the SmartFeatureExtractor. The notebooks combine explanations, sample code, and the output of the code snippets. With the notebooks you can reproduce the functionality in your browser without installing the framework locally. The notebooks can be found here:

ReadMe

The novel modules of SimE4KG are documented in the SANSA ML ReadMe. For quick access to the high-level SimE4KG Transformer and the SmartFeatureExtractor, use these two links:

Novel Classes

The main novel classes in this release are the DasimTransformer and the SmartFeatureExtractor, together with the corresponding unit tests and the evaluation scripts used to test module performance:

  • DasimTransformer Class Unit Test
  • Smart Feature Extractor Class Unit Test
  • Evaluation classes such as data size scalability, feature availability evaluation, SmartFeatureExtractor evaluation, and many more …

Datasets

As a starting point for experimenting with the developments of this framework, we recommend the Linked Movie Database RDF knowledge graph. This KG describes movies in millions of triples and contains multi-modal features: lists of URIs such as the list of actors, numeric features such as the runtime, and timestamp data such as the release date. For unit tests, we also provide an extract of this data that follows the same schema.
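
To make the notion of multi-modal features concrete, here is a minimal sketch in plain Python of how per-feature similarities between two movies could be combined into one score. It does not use the SANSA or SimE4KG API. The feature names (`actors`, `runtime`, `release_date`), the per-feature measures (Jaccard for URI lists, range-scaled distance for numbers and dates), the scaling ranges, the equal weights, and the sample movies are all assumptions chosen for illustration.

```python
from datetime import date


def jaccard(a, b):
    """Jaccard similarity of two collections of URIs (1.0 if both are empty)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def scaled_similarity(x, y, value_range):
    """Map an absolute difference to [0, 1]; differences >= value_range give 0."""
    return max(0.0, 1.0 - abs(x - y) / value_range)


def movie_similarity(m1, m2, weights=None):
    """Weighted mean of per-feature similarities; also returns the parts."""
    weights = weights or {"actors": 1.0, "runtime": 1.0, "release_date": 1.0}
    parts = {
        "actors": jaccard(m1["actors"], m2["actors"]),
        # Hypothetical scaling: a runtime difference of 120 minutes maps to 0.
        "runtime": scaled_similarity(m1["runtime"], m2["runtime"], 120.0),
        # Hypothetical scaling: release dates 3650 days apart map to 0.
        "release_date": scaled_similarity(
            m1["release_date"].toordinal(), m2["release_date"].toordinal(), 3650.0
        ),
    }
    total_weight = sum(weights[name] for name in parts)
    score = sum(weights[name] * sim for name, sim in parts.items()) / total_weight
    return score, parts


# Two made-up movie descriptions (not taken from the actual dataset).
movie_a = {
    "actors": ["http://example.org/actor/1", "http://example.org/actor/2"],
    "runtime": 100,
    "release_date": date(2000, 1, 1),
}
movie_b = {
    "actors": ["http://example.org/actor/2", "http://example.org/actor/3"],
    "runtime": 130,
    "release_date": date(2001, 1, 1),
}

score, parts = movie_similarity(movie_a, movie_b)
print(round(score, 3), parts)
```

Returning the per-feature values next to the aggregated score keeps the result inspectable: a reader can see which feature contributed how much, which is the kind of transparency an explainable similarity estimate aims for.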

Further Reading

If you are interested in further reading and background information on other related modules, we recommend the following papers:

Other

  • In addition, we provide the full jar of this version below

DistRDF2ML

Release

The changes are available in this release here.

Docs

The documentation with sample code snippets is available in the SANSA ML ReadMe, which includes:

Code to Modules:

This release mainly provides the following modules:

DistSim ICSC Paper Documentation

The documentation in docs is available here. The respective similarity estimation models are in this GitHub directory, and further required utilities are here.

Code to Modules:

DistAD ICSC Paper Documentation

The documentation in docs is available here. The modules are in this GitHub directory.