An MCP server that exposes the corpus of Atatürk's speeches, statements, telegrams and proclamations (1906-1938) for researchers worldwide. Created by bugraayan.com.
Atatürk MCP
Created by Buğra Ayan — bugraayan.com
An open-source bridge for everyone around the world who researches Atatürk.
A Model Context Protocol server that exposes the complete corpus of Mustafa Kemal Atatürk's speeches, statements, telegrams and proclamations (1906-1938) to any MCP-aware AI client (Claude Desktop, Cursor, Cline, Windsurf, …).
Built for researchers, historians, journalists and students anywhere in the world who want to ask LLMs questions like:
- "What did Atatürk say about women's rights in 1923?"
- "Find quotes about education and modernisation."
- "Show me his opening address to the Grand National Assembly on 1 March 1922."
- "Compare the 1927 Nutuk's treatment of the War of Independence with his 1933 10th Year Speech."
Corpus
| Source | Coverage | Format | Speeches |
|---|---|---|---|
| ATAM: Atatürk Araştırma Merkezi "Söylev ve Demeçleri" Cilt I-III (2024 edition) | 1906-1938, the definitive corpus | PDF | 366 |
| Vikikaynak | Individual speeches, telegrams, all TBMM opening addresses (1920-1938) | HTML | 45 |
| Internet Archive — Nutuk (English) | The 1927 Nutuk in English | PDF / DjVu | 20 chapters |
| Total | 1906-1938 | SQLite + FTS5 | 411 speeches + Nutuk |
Atatürk passed away in 1938; his words are in the public domain. Editorial
annotations from the ATAM edition are not redistributed — only the speech bodies
themselves, with a source_ref pointing back to the corresponding ATAM volume and
page number for academic citation.
Quick start
1. Install
From PyPI (recommended for end users):
pip install ataturk-mcp # runtime only
pip install "ataturk-mcp[etl]" # also includes the ETL scripts
From source (for hacking on the ETL):
git clone https://github.com/bugraayan/ataturk-mcp.git
cd ataturk-mcp
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[etl]"
2. Build the database (one-off, ~1 minute)
python scripts/fetch_atam.py # ~10 MB of PDFs from atam.gov.tr
python scripts/fetch_wikisource.py # MediaWiki API, ~45 pages
python scripts/fetch_nutuk_en.py # ~2 MB DjVu text from Internet Archive
python scripts/build_db.py # produces data/speeches.db (~11 MB)
Or, if you only want the core corpus:
python scripts/fetch_atam.py
python scripts/build_db.py --skip-wikisource --skip-nutuk
A pre-built speeches.db is also published on the GitHub Releases page so end
users do not need to run the ETL themselves.
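To sanity-check a built or downloaded speeches.db without any risk of modifying it, the file can be opened read-only from Python. This is a minimal sketch using only the standard library; the helper name list_tables is made up for illustration, and the table names it prints depend on the actual schema, which is not described here.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: str) -> list:
    """Open a SQLite file read-only and return its table names, sorted."""
    # as_uri() builds a file: URI; mode=ro makes SQLite refuse writes and
    # fail (instead of creating an empty file) when the path does not exist.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()


# Example call, assuming the default output path of scripts/build_db.py:
#   print(list_tables("data/speeches.db"))
```

If the call raises sqlite3.OperationalError, the path is most likely wrong or the build step has not been run yet.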
3. Run the MCP server
ataturk-mcp # stdio transport (default, used by all MCP clients)
# or
fastmcp run src/ataturk_mcp/server.py:mcp
# or
python -m ataturk_mcp.server
Connecting to MCP clients
Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or the
equivalent on your platform and add:
{
"mcpServers": {
"ataturk": {
"command": "/absolute/path/to/.venv/bin/ataturk-mcp"
}
}
}
If your database lives somewhere other than the repo, set the path explicitly:
{
"mcpServers": {
"ataturk": {
"command": "/absolute/path/to/.venv/bin/ataturk-mcp",
"env": {
"ATATURK_MCP_DB": "/path/to/speeches.db"
}
}
}
}
Restart Claude Desktop; the hammer icon now exposes the Atatürk tools.
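If you would rather script the config change than edit the JSON by hand, the entry shown above can be merged into an existing config dictionary. The sketch below is an illustration only: add_ataturk_server is a hypothetical helper, not part of the package, and reading and writing the actual config file is left to the caller.

```python
import json
from typing import Optional


def add_ataturk_server(config: dict, command: str, db_path: Optional[str] = None) -> dict:
    """Return a copy of an MCP host config with an "ataturk" server entry added."""
    entry = {"command": command}
    if db_path is not None:
        # Same environment variable as in the JSON example above.
        entry["env"] = {"ATATURK_MCP_DB": db_path}
    # Copy the existing servers so other entries are preserved untouched.
    servers = dict(config.get("mcpServers", {}))
    servers["ataturk"] = entry
    return {**config, "mcpServers": servers}


existing = {"mcpServers": {"other": {"command": "/usr/bin/other-server"}}}
merged = add_ataturk_server(
    existing,
    "/absolute/path/to/.venv/bin/ataturk-mcp",
    db_path="/path/to/speeches.db",
)
print(json.dumps(merged, indent=2))
```

The function returns a new dictionary rather than mutating its input, so a failed write cannot leave a half-edited config in memory.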
Cursor
Edit ~/.cursor/mcp.json (or the project-level .cursor/mcp.json):
{
"mcpServers": {
"ataturk": {
"command": "/absolute/path/to/.venv/bin/ataturk-mcp"
}
}
}
Cline / Continue / any MCP host
Use the same command with whichever JSON configuration format the host expects. The server speaks
stdio by default and follows the MCP 2024-11-05 spec.
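For hosts without built-in MCP support, it can help to see what the stdio transport carries. In the 2024-11-05 spec, each message is a JSON-RPC 2.0 object written as one newline-terminated line. The sketch below only builds the first three messages a client would send; the client name and version are placeholders, and a real client would also read and check the server's responses.

```python
import json

PROTOCOL_VERSION = "2024-11-05"


def frame(message: dict) -> bytes:
    """Serialise one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the only raw newline
    # in the output is the terminator added here.
    return (json.dumps(message, ensure_ascii=False) + "\n").encode("utf-8")


def initialize_request(request_id: int, client_name: str, client_version: str) -> dict:
    """First message of the handshake; the server answers with its capabilities."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }


def initialized_notification() -> dict:
    """Sent after the initialize response; notifications carry no id."""
    return {"jsonrpc": "2.0", "method": "notifications/initialized"}


def list_tools_request(request_id: int) -> dict:
    """Ask the server which tools it exposes."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


for msg in (
    initialize_request(1, "example-client", "0.0.1"),  # placeholder identity
    initialized_notification(),
    list_tools_request(2),
):
    print(frame(msg).decode("utf-8"), end="")
```

In practice an MCP client library handles this framing; the sketch is only meant to make the wire format concrete.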
Tools exposed
| Tool | Purpose |
|---|---|
| search_speeches(query, lang, year_from, year_to, kind, limit) | Full-text BM25 search over the entire corpus, with snippets. Diacritic-insensitive; supports FTS5 operators (AND, OR, NEAR, "phrases", prefix*). |
| get_speech(speech_id, lang) | Return one speech in full, in Turkish or English. |
| list_speeches(year, year_from, year_to, kind, source, limit, offset) | Browse the corpus chronologically. |
| random_speech(lang, kind) | Pick a random speech (useful for daily-quote agents). |
| list_topics() / speeches_by_topic(topic_id, limit) | Topical browsing (when ATAM Konular İndeksi is loaded). |
| nutuk_search(query, lang, limit) | Search within Nutuk (1927). |
| nutuk_chapter(chapter, lang) | Return one Nutuk chapter (1-20). |
| cite(speech_id) | Generate APA / MLA / Chicago citations. |
| corpus_stats() | Summary statistics about the corpus. |
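To get a feel for the query syntax search_speeches accepts, the FTS5 behaviour can be tried on a throwaway in-memory index. This is a sketch under assumptions: the table name speeches_fts, its two columns and the sample rows are invented for the demo (the rows are placeholder text, not corpus text), and the real database schema may look different. It needs a Python build whose SQLite includes FTS5.

```python
import sqlite3


def build_demo_index() -> sqlite3.Connection:
    """Create an in-memory FTS5 table with two placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE speeches_fts USING fts5("
        "title, body, tokenize = 'unicode61 remove_diacritics 2')"
    )
    conn.executemany(
        "INSERT INTO speeches_fts (title, body) VALUES (?, ?)",
        [
            ("Education sample", "Placeholder text about öğretmen and eğitim reform."),
            ("Republic sample", "Placeholder text about cumhuriyet and meclis."),
        ],
    )
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list:
    """Return (title, snippet) pairs, best BM25 match first."""
    sql = (
        # snippet(): column 1 is body; matches are wrapped in [ ].
        "SELECT title, snippet(speeches_fts, 1, '[', ']', '...', 10) "
        "FROM speeches_fts WHERE speeches_fts MATCH ? "
        # bm25() returns lower values for better matches, so ascending order.
        "ORDER BY bm25(speeches_fts) LIMIT ?"
    )
    return conn.execute(sql, (query, limit)).fetchall()


conn = build_demo_index()
for query in ("egitim", "ogret*", '"egitim reform"', "egitim OR meclis"):
    print(query, "->", search(conn, query))
```

The unaccented query egitim finds eğitim because the tokenizer strips diacritics on both the indexed text and the query. FTS5 operators must be written in upper case.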
Resources
| URI | Description |
|---|---|
| ataturk://speech/{speech_id} | Plain-text rendering of a single speech with header. |
| ataturk://nutuk/{chapter}/{lang} | One Nutuk chapter. |
| ataturk://corpus/stats | Statistics as JSON. |
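The three URI templates above decompose cleanly with the standard library, which can be handy when logging or validating resource requests on the client side. The sketch below is illustrative only: parse_resource_uri and the labels it returns are hypothetical, and the server itself presumably relies on FastMCP's own template matching.

```python
from urllib.parse import urlsplit


def parse_resource_uri(uri: str) -> tuple:
    """Split an ataturk:// URI into (resource kind, parameters)."""
    parts = urlsplit(uri)
    if parts.scheme != "ataturk":
        raise ValueError("not an ataturk:// URI: " + uri)
    # The first component after "//" lands in netloc; the rest is the path.
    segments = [s for s in parts.path.split("/") if s]
    if parts.netloc == "speech" and len(segments) == 1:
        return "speech", {"speech_id": segments[0]}
    if parts.netloc == "nutuk" and len(segments) == 2:
        return "nutuk", {"chapter": segments[0], "lang": segments[1]}
    if parts.netloc == "corpus" and segments == ["stats"]:
        return "corpus_stats", {}
    raise ValueError("unrecognised resource URI: " + uri)


for example in (
    "ataturk://speech/42",      # 42 is a made-up id
    "ataturk://nutuk/3/en",
    "ataturk://corpus/stats",
):
    print(example, "->", parse_resource_uri(example))
```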
Prompts
| Prompt | Use |
|---|---|
| analyze_speech(speech_id) | Scholarly analysis template (context, rhetoric, themes, citation). |
| find_quote(theme, n_quotes) | Theme-based quote hunter across the corpus. |
Development
pip install -e ".[etl,dev]"
pytest -q
ruff check .
The test suite uses an in-memory seeded SQLite DB and FastMCP's in-process client, so it runs in under a second and does not require the production DB to be built.
Architecture
ATAM PDFs ─┐
Vikikaynak ─┼─► ETL scripts ─► SQLite (FTS5) ─► FastMCP stdio server ─► AI clients
Nutuk EN ──┘ speeches.db
- ETL (scripts/) is fully decoupled from the server (src/ataturk_mcp/).
- The server opens the DB read-only and is safe to run in parallel from multiple clients.
- Turkish search quality: FTS5 with unicode61 remove_diacritics 2 plus application-level I/İ normalisation in db.normalise_query.
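The I/İ normalisation exists because Turkish has four distinct letters (I, ı, İ, i) that generic case folding does not pair the way Turkish does. The sketch below shows one way such a step could look; fold_turkish_i is a hypothetical stand-in, not the project's db.normalise_query, and it assumes the same folding is applied to the text when it is indexed.

```python
# Map every dotted/dotless variant to a plain ASCII "i".
_TURKISH_I_FOLD = str.maketrans({"İ": "i", "I": "i", "ı": "i"})


def fold_turkish_i(text: str) -> str:
    """Fold Turkish I/İ/ı to "i" and leave every other character untouched."""
    # No lower() call here: FTS5 operators such as AND, OR and NEAR are
    # case-sensitive and must stay upper case. None of them contains an "I",
    # so the translation table cannot damage them.
    return text.translate(_TURKISH_I_FOLD)


for raw in ("İstanbul", "ISPARTA", "kadın AND hak*"):
    print(raw, "->", fold_turkish_i(raw))
```

A plain str.lower() would not be enough: in Python it turns "İ" into "i" followed by a combining dot (two code points) and leaves "ı" unchanged, so neither form would compare equal to an ASCII "i".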
Publishing to PyPI
The project is wired to publish from GitHub Actions (see
.github/workflows/release.yml) when you push a
tag of the form vX.Y.Z. For manual publishing:
pip install build twine
python -m build # produces dist/*.whl and dist/*.tar.gz
twine check dist/*
twine upload dist/* # uploads to https://pypi.org/project/ataturk-mcp/
# or for a dry run on TestPyPI:
twine upload --repository testpypi dist/*
Credits & author
- Author: Buğra Ayan (bugraayan.com)
- Email: hello@bugraayan.com
- Atatürk Araştırma Merkezi (ATAM): for digitising and editing the Söylev ve Demeçleri corpus, the gold-standard source for this project.
- Vikikaynak / Türkçe Wikisource contributors: for transcribing individual addresses and the TBMM opening speeches.
- Internet Archive: for hosting the public-domain English Nutuk.
If you build research or journalism on top of this server, please cite both
Atatürk's words (via the in-tool cite command) and the original sources
(ATAM volume + page, or the Wikisource URL).
License
MIT for the code (© Buğra Ayan / bugraayan.com). The speech texts themselves are in the public domain (Atatürk died in 1938). ATAM, TBMM and Wikisource are credited as the digital sources used to build the corpus; please respect each source's terms when redistributing.