A Model Context Protocol (MCP) server to access data about training materials.
Project 18: Mining the potential of knowledge graphs for metadata on training
Abstract
Knowledge graphs (KGs) can greatly increase the value of data by revealing hidden relationships and turning them into useful information. A KG is a graph-based representation of data that stores facts as subject-predicate-object triples, typically in a triplestore. The entities and relations are usually described with pre-defined ontologies, which increase interoperability and connect data that would otherwise remain isolated in siloed databases. This structured representation greatly facilitates complex querying and use in deep learning approaches such as generative AI.
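To make the triple model concrete, here is a minimal sketch in plain Python. It is not how the project stores data (the harvest files are TTL), and every identifier in it (`ex:course1`, `ex:topic_rnaseq`, and so on) is made up for illustration. It only shows how a shared object value links two otherwise unrelated subjects.

```python
from typing import Optional

# Each fact is a (subject, predicate, object) triple.
# All identifiers below are hypothetical examples.
Triple = tuple[str, str, str]

TRIPLES: list[Triple] = [
    ("ex:course1", "rdf:type", "schema:Course"),
    ("ex:course1", "schema:name", "Intro to RNA-seq"),
    ("ex:course1", "schema:about", "ex:topic_rnaseq"),
    ("ex:material1", "rdf:type", "schema:LearningResource"),
    ("ex:material1", "schema:about", "ex:topic_rnaseq"),
]


def match(
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> list[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t
        for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Find every resource annotated with the same topic.
related = sorted(t[0] for t in match(p="schema:about", o="ex:topic_rnaseq"))
print(related)  # ['ex:course1', 'ex:material1']
```

A real triplestore answers the same kind of pattern query with SPARQL instead of a list comprehension.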
ELIXIR and its Nodes are making a major effort to make the wealth of open training materials on the computational life sciences reusable, among other things through guidelines and support for annotating training materials with standardized metadata. One major step in standardizing metadata is the use of the Bioschemas training profile, which has become a standard for representing training metadata. Although these resources are standardized and interoperable, there is still a lot of potential to turn them into valuable information by linking training data across various databases.
In this project, we aim to create queryable KGs derived from training metadata in the Bioschemas format available from platforms like TeSS and glittr.org. In a subsequent step, we will investigate the potential of such KGs for several use cases, including construction of custom learning paths, creation of detailed trainer profiles, and connection of training metadata to other databases. These use cases will also shed light on the limits of the currently available metadata, and will help inform future choices about richer metadata and standards.
Leads
Geert van Geest, Harshita Gupta, Vincent Emonet
💬 MCP server
A Model Context Protocol (MCP) server to access and search through the training materials of multiple ELIXIR repositories, such as TeSS and Glittr.
⚡️ Usage
[!IMPORTANT]
Requirement: uv, to easily handle Python scripts and virtual environments.
Use with STDIO transport:
uv run elixir-training-mcp
Deploy as a Streamable HTTP server:
uv run elixir-training-mcp --http
🧰 Available MCP tools
Once the server is running you can call the following tools from your MCP-compatible client:
| Tool | Description |
|---|---|
| search_training_materials | Proxies the live TeSS API and returns raw JSON results. |
| keyword_search | Searches the harvested TTL datasets (TeSS + GTN) by free-text keyword and returns enriched metadata. |
| provider_search | Filters harvested resources by provider name (case-insensitive). |
| location_search | Returns TeSS course instances in a given country (optionally city). |
| date_search | Finds TeSS course instances starting within a provided ISO date range. |
| topic_search | Matches harvested resources by EDAM identifier or topic label. |
| dataset_stats | Summarises dataset diagnostics (resource counts, type distribution, access modes). |
[!NOTE]
The local tools read from data/tess_harvest.ttl and data/gtn_harvest.ttl. Regenerate these files with the harvest scripts if you need fresher data.
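Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` message. The sketch below builds such a message for `keyword_search`. The argument name `query` is an assumption, since the tool's parameter names are not listed here; check the schema that the server reports through `tools/list` before relying on it. In practice your chat client builds and sends these messages for you after the initialization handshake.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects one id per request.
_request_ids = count(1)


def tools_call_request(tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "query" is a hypothetical parameter name used for illustration only.
request = tools_call_request("keyword_search", {"query": "RNA-seq"})
print(request)
```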
🔌 Connect client to MCP server
Follow the instructions of your favorite chat client.
To add a new MCP server to VSCode GitHub Copilot:
- Install the GitHub.copilot extension
- Open the Command Palette (ctrl+shift+p or cmd+shift+p)
- Search for MCP: Add Server...
  - Choose STDIO, and provide the command: uvx elixir-training-mcp
  - Or choose HTTP, and provide the MCP server URL, e.g. http://localhost:8000/mcp
To use it with STDIO transport, your VSCode mcp.json should look like:
{
  "servers": {
    "elixir-training-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["elixir-training-mcp"]
    }
  }
}
[!TIP]
You can use a local folder for development:
{
  "servers": {
    "elixir-training-mcp": {
      "type": "stdio",
      "cwd": "~/dev/ELIXIR-TrP-KG-training-metadata",
      "command": "uv",
      "args": ["run", "elixir-training-mcp"]
    }
  }
}
You can also connect to a running server using Streamable HTTP:
{
  "servers": {
    "elixir-training-mcp-http": {
      "url": "http://localhost:8000/mcp",
      "type": "http"
    }
  }
}
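If you already have an mcp.json with other servers, you may prefer to merge these entries in rather than overwrite the file. The helper below is a minimal sketch of that; `add_server` is a hypothetical name, and it assumes the file is plain JSON without comments. The demo writes to a temporary directory; point it at your own mcp.json (for a VSCode workspace, typically `.vscode/mcp.json`) when using it for real.

```python
import json
import tempfile
from pathlib import Path


def add_server(config_path: Path, name: str, entry: dict) -> dict:
    """Add or replace one entry under "servers", keeping existing entries."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("servers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mcp.json"
    # STDIO entry, as in the configuration above.
    add_server(
        path,
        "elixir-training-mcp",
        {"type": "stdio", "command": "uvx", "args": ["elixir-training-mcp"]},
    )
    # Streamable HTTP entry pointing at a locally running server.
    config = add_server(
        path,
        "elixir-training-mcp-http",
        {"type": "http", "url": "http://localhost:8000/mcp"},
    )
    print(sorted(config["servers"]))
```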
Harvesting
The data/tess_harvest.ttl file (7MB) is included in the repository. You can run the harvest script to fetch the JSON-LD data and rebuild this TTL file, but it takes ~30 min because parsing JSON-LD is expensive:
uv run src/elixir_training_mcp/harvest/harvest_tess.py
Harvest GTN:
uv run src/elixir_training_mcp/harvest/harvest_gtn.py
[!NOTE]
TeSS contains training materials and courses from various providers, such as GTN (Galaxy Training Network) training materials. Metadata about material in TeSS and GTN can be matched on schema:url.
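Matching on `schema:url` amounts to a join on the URL value. The sketch below illustrates the idea with plain dictionaries. The records, the field names, and the trailing-slash normalization are all assumptions for illustration; they are not taken from the harvested data or from the project's code.

```python
def normalize(url: str) -> str:
    """Strip whitespace and a trailing slash so near-identical URLs match."""
    return url.strip().rstrip("/")


def match_on_url(tess_records: list[dict], gtn_records: list[dict]) -> list[tuple[dict, dict]]:
    """Pair each TeSS record with the GTN record that shares its URL."""
    gtn_by_url = {normalize(r["url"]): r for r in gtn_records}
    return [
        (t, gtn_by_url[normalize(t["url"])])
        for t in tess_records
        if normalize(t["url"]) in gtn_by_url
    ]


# Hypothetical records; real ones would come from the two harvest files.
tess = [
    {"name": "Tutorial A (TeSS)", "url": "https://example.org/tutorials/a/"},
    {"name": "Course B (TeSS)", "url": "https://example.org/courses/b"},
]
gtn = [
    {"name": "Tutorial A (GTN)", "url": "https://example.org/tutorials/a"},
]

pairs = match_on_url(tess, gtn)
for tess_record, gtn_record in pairs:
    print(tess_record["name"], "<->", gtn_record["name"])
```

Only the first TeSS record is paired here, because the second has no GTN record with the same URL.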
Deploy a SPARQL endpoint on http://localhost:8000:
uv run rdflib-endpoint serve src/elixir_training_mcp/data/*_harvest.ttl
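Once the endpoint is up, it can be queried over HTTP following the SPARQL protocol (a GET request with a `query` parameter). The sketch below uses only the standard library. It assumes the endpoint answers at the root path of http://localhost:8000 and that the harvested data uses the `http://schema.org/` namespace; adjust the prefix if your data uses `https://schema.org/`. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "http://localhost:8000"

# List a few resources together with their schema:url value.
# The namespace is an assumption; check the prefixes in the TTL files.
QUERY = """
PREFIX schema: <http://schema.org/>
SELECT ?resource ?url WHERE {
  ?resource schema:url ?url .
} LIMIT 5
"""


def build_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL-protocol GET request that asks for JSON results."""
    url = endpoint.rstrip("/") + "/?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run_query(endpoint: str, query: str) -> list[dict]:
    """Send the query and return the result bindings (needs a running endpoint)."""
    with urlopen(build_request(endpoint, query), timeout=30) as response:
        return json.load(response)["results"]["bindings"]


request = build_request(ENDPOINT, QUERY)
print(request.full_url)

# With the endpoint running, fetch and print the rows:
# for row in run_query(ENDPOINT, QUERY):
#     print(row["resource"]["value"], row["url"]["value"])
```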