MCP server for ESRU-EMOVI 2023 social mobility survey
emovi-mcp
MCP server for the ESRU-EMOVI 2023 social mobility survey (Mexico).
What is this?
emovi-mcp lets AI assistants (Claude, ChatGPT, etc.) query Mexico's most comprehensive social mobility survey through natural language. It exposes weighted statistical computations, intergenerational transition matrices, and variable exploration as MCP tools.
About ESRU-EMOVI 2023
The ESRU-EMOVI 2023 survey, conducted by the Centro de Estudios Espinosa Yglesias (CEEY), is a nationally representative survey of social mobility in Mexico. It covers 17,843 respondents aged 25-64, with expansion factors representing ~60 million people.
Datasets included:
| Dataset | Description | Rows | Variables |
|---|---|---|---|
| `entrevistado` | Main respondent data | 17,843 | ~296 |
| `hogar` | Household roster | 55,477 | ~56 |
| `ingreso_2017` | Imputed 2017 income (temporal comparison) | 17,665 | ~2 |
| `inclusion_financiera` | Financial inclusion module | 5,976 | ~109 |
Installation
```bash
# Clone the repository
git clone https://github.com/Lalitronico/emovi-mcp.git
cd emovi-mcp

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Linux/Mac
.venv\Scripts\activate  # Windows

# Install in editable mode with dev dependencies
pip install -e ".[dev]"

# Optional: visualization support
pip install -e ".[dev,viz]"
```
Or install directly from PyPI:
```bash
pip install emovi-mcp

# With visualization support (quoted so shells such as zsh do not expand the brackets)
pip install "emovi-mcp[viz]"
```
Prerequisites
- Python >= 3.10
- ESRU-EMOVI 2023 .dta files (obtain from CEEY)
Configuration
Set the EMOVI_DATA_DIR environment variable to point to the directory containing the .dta files:
```
# Linux/Mac
export EMOVI_DATA_DIR="/path/to/Esru Emovi 2023/_extracted/3 BASES DE DATOS/Data"

# Windows (PowerShell)
$env:EMOVI_DATA_DIR = "C:\path\to\Esru Emovi 2023\_extracted\3 BASES DE DATOS\Data"

# Windows (cmd)
set EMOVI_DATA_DIR=C:\path\to\Esru Emovi 2023\_extracted\3 BASES DE DATOS\Data
```
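Before starting the server, it can help to confirm that the variable resolves to a directory that actually contains `.dta` files. The following is a minimal sketch using only the standard library; `find_dta_files` is a hypothetical helper written for this README, not a function of emovi-mcp.

```python
import os
from pathlib import Path


def find_dta_files(env_var: str = "EMOVI_DATA_DIR") -> list[Path]:
    """Return the .dta files in the directory named by env_var (hypothetical helper)."""
    raw = os.environ.get(env_var)
    if not raw:
        raise RuntimeError(f"{env_var} is not set")
    data_dir = Path(raw).expanduser()
    if not data_dir.is_dir():
        raise RuntimeError(f"{env_var} does not point to a directory: {data_dir}")
    # Only files directly inside the directory are listed; subfolders are ignored.
    return sorted(data_dir.glob("*.dta"))
```

Calling `find_dta_files()` in a Python session started from the same shell should list the survey files; an empty list suggests the path points at the wrong folder.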
Usage with Claude Desktop
Add the following to your claude_desktop_config.json:
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`

```json
{
  "mcpServers": {
    "emovi-mcp": {
      "command": "C:/path/to/emovi-mcp/.venv/Scripts/python.exe",
      "args": ["-m", "emovi_mcp"],
      "env": {
        "EMOVI_DATA_DIR": "C:/path/to/3 BASES DE DATOS/Data"
      }
    }
  }
}
```
On macOS/Linux, replace Scripts/python.exe with bin/python.
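If you prefer to generate the entry rather than edit JSON by hand, the sketch below builds the same structure for either platform. `build_config` is a hypothetical helper and the paths are placeholders; it only prints JSON and does not touch your Claude Desktop configuration.

```python
import json
from pathlib import PurePosixPath


def build_config(venv_dir: str, data_dir: str, windows: bool) -> dict:
    """Build an mcpServers entry for emovi-mcp (hypothetical helper)."""
    # The interpreter lives in Scripts/ on Windows and bin/ elsewhere.
    rel = "Scripts/python.exe" if windows else "bin/python"
    python = str(PurePosixPath(venv_dir) / rel)
    return {
        "mcpServers": {
            "emovi-mcp": {
                "command": python,
                "args": ["-m", "emovi_mcp"],
                "env": {"EMOVI_DATA_DIR": data_dir},
            }
        }
    }


config = build_config("/path/to/emovi-mcp/.venv", "/path/to/Data", windows=False)
print(json.dumps(config, indent=2))
```

Forward slashes are used throughout, matching the Windows example above, so the output can be pasted into the config file without escaping backslashes.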
Tools
The server exposes 11 MCP tools:
| Tool | Description |
|---|---|
| `describe_survey` | Survey overview: datasets, sample size, design, dimensions |
| `list_variables` | Browse variables by dataset, section, or search keyword |
| `variable_detail` | Full info for a variable: label, values, section, dataset |
| `tabulate` | Weighted crosstabulation (row x col with expansion factors) |
| `transition_matrix` | Intergenerational mobility matrix with formal indices (Shorrocks, Prais, odds ratios) and optional standard errors via Taylor linearization |
| `weighted_stats` | Descriptive statistics: mean, median, std, quantiles (weighted) |
| `compare_groups` | Compare a variable across groups (mean, median, or distribution) |
| `filter_data` | Extract raw data rows with optional filters (max 100 rows) |
| `financial_inclusion_summary` | Financial inclusion analysis: savings, credit, banking, literacy, discrimination |
| `income_comparison` | Temporal income comparison between 2017 and 2023 with poverty line classification |
| `visualize_mobility` | Generate heatmaps, Sankey diagrams, or bar charts for mobility matrices (requires `[viz]`) |
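To give a sense of what a weighted crosstabulation with expansion factors involves, here is a minimal pure-Python sketch. It is not the server's implementation: `weighted_crosstab` is a hypothetical function, the rows are invented, and the category labels are placeholders rather than the survey's actual codes.

```python
from collections import defaultdict


def weighted_crosstab(rows, row_var, col_var, weight_var="factor"):
    """Sum expansion weights per (row, col) cell and return row percentages."""
    cells = defaultdict(float)
    row_totals = defaultdict(float)
    for r in rows:
        w = r[weight_var]
        cells[(r[row_var], r[col_var])] += w
        row_totals[r[row_var]] += w
    # Each cell is expressed as a share of its row's weighted total.
    return {
        key: 100.0 * total / row_totals[key[0]]
        for key, total in cells.items()
    }


# Invented records; "factor" stands in for the expansion weight.
sample = [
    {"educ": "primary", "sexo": "F", "factor": 300.0},
    {"educ": "primary", "sexo": "M", "factor": 100.0},
    {"educ": "college", "sexo": "F", "factor": 250.0},
    {"educ": "college", "sexo": "M", "factor": 250.0},
]
table = weighted_crosstab(sample, "educ", "sexo")
print(table[("primary", "F")])  # 75.0
```

Each respondent contributes their weight rather than a count of one, so the percentages describe the represented population instead of the sample.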
Example queries
Once connected, you can ask the AI assistant questions like:
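- "¿Cuál es la distribución educativa por sexo?" ("What is the educational distribution by sex?") → Uses `tabulate(row_var="educ", col_var="sexo")`
- "Muéstrame la matriz de movilidad educativa intergeneracional" ("Show me the intergenerational educational mobility matrix") → Uses `transition_matrix(dimension="education")`
- "¿Cuál es el ingreso promedio por región?" ("What is the average income by region?") → Uses `weighted_stats(variable="ingc_pc", by="region_14")`
- "Compara la movilidad educativa entre hombres y mujeres" ("Compare educational mobility between men and women") → Uses `transition_matrix(dimension="education", by="sexo")`
- "¿Qué variables hay sobre educación?" ("What variables are there about education?") → Uses `list_variables(search="educ")`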
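For intuition about what a mobility query returns, the sketch below builds a weighted, row-normalized transition matrix and two summary indices from (origin, destination, weight) triples. It is an illustration under assumptions, not the server's code: the function names are hypothetical, the data are invented, Shorrocks M is taken as (k − trace(P)) / (k − 1), and the "escape probability" is read as 1 − p_ii for each origin class.

```python
def weighted_transition_matrix(pairs, k):
    """P[i][j] = weighted share of origin class i that ends in class j."""
    counts = [[0.0] * k for _ in range(k)]
    for origin, dest, weight in pairs:
        counts[origin][dest] += weight
    matrix = []
    for row in counts:
        total = sum(row)
        # Empty origin classes are left as rows of zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def shorrocks_m(matrix):
    """Shorrocks index: 0 means no mobility, larger values mean more."""
    k = len(matrix)
    trace = sum(matrix[i][i] for i in range(k))
    return (k - trace) / (k - 1)


def escape_probabilities(matrix):
    """Probability of leaving each origin class (assumed reading of the Prais measure)."""
    return [1.0 - matrix[i][i] for i in range(len(matrix))]


# Invented data: two classes, (parent class, child class, expansion weight).
pairs = [(0, 0, 2.0), (0, 1, 2.0), (1, 1, 3.0), (1, 0, 1.0)]
P = weighted_transition_matrix(pairs, k=2)
print(P)                        # [[0.5, 0.5], [0.25, 0.75]]
print(shorrocks_m(P))           # 0.75
print(escape_probabilities(P))  # [0.5, 0.25]
```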
Variable Dictionary
The project includes a pre-built dictionary with 792 variables extracted from the official CEEY documentation and .dta metadata. The dictionary supports searching by name, description, dataset, and section.
To rebuild the dictionary from source data:
```bash
python scripts/build_dictionary.py
```
This requires the Diccionario ESRU EMOVI 2023.xlsx file in the data directory.
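The kind of search the dictionary supports can be pictured with a short sketch. The entry layout here (`name`, `description`, `dataset`, `section` keys) is an assumption made for illustration and may not match the real `dictionary.json`; the descriptions and section names are invented, and `search_variables` is a hypothetical function.

```python
def search_variables(entries, keyword, dataset=None, section=None):
    """Return names of entries whose name or description contains keyword."""
    needle = keyword.lower()
    hits = []
    for entry in entries:
        if dataset is not None and entry["dataset"] != dataset:
            continue
        if section is not None and entry["section"] != section:
            continue
        haystack = f'{entry["name"]} {entry["description"]}'.lower()
        if needle in haystack:
            hits.append(entry["name"])
    return hits


# Invented entries in the assumed layout.
entries = [
    {"name": "educ", "description": "Respondent's schooling",
     "dataset": "entrevistado", "section": "education"},
    {"name": "educp", "description": "Father's schooling",
     "dataset": "entrevistado", "section": "origins"},
    {"name": "sexo", "description": "Sex of respondent",
     "dataset": "entrevistado", "section": "demographics"},
]
print(search_variables(entries, "educ"))  # ['educ', 'educp']
```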
Running Tests
```bash
pytest
```
The 96 tests cover weighted statistics, transition matrices, mobility indices, Taylor-linearized standard errors, financial inclusion, temporal income comparison, visualization, and variable dictionary functionality. They run on synthetic data, so no real microdata is needed.
Technical Notes
- All statistics are weighted using the `factor` expansion variable (or `fac_inc` for the financial inclusion module)
- pyreadstat loads .dta files with `apply_value_formats=False` to avoid crashes from duplicate municipality labels
- `padres_edu` is constructed as `max(educp, educm)` following the CEEY .do file methodology
- Wealth index uses PCA on binary asset indicators (Filmer & Pritchett, 2001), as an alternative to CEEY's MCA approach
- Standard errors use Taylor linearization for ratio estimators under stratified cluster sampling (PSU/strata)
- Mobility indices: Shorrocks M, Prais escape probability, intergenerational Pearson r, corner odds ratios
- STDIO transport: The server communicates via standard input/output, compatible with Claude Desktop and other MCP clients
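The note on standard errors can be made concrete with a small sketch of Taylor linearization for a weighted mean, which is a ratio of two weighted totals. This is a textbook-style illustration, not the project's `survey_variance.py`: it assumes with-replacement sampling of PSUs within strata, applies no finite population correction, and skips strata with a single PSU. The function name and the records are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def linearized_se_of_mean(records):
    """Weighted mean and its linearized SE; records are (stratum, psu, weight, y)."""
    w_sum = sum(w for _, _, w, _ in records)
    ratio = sum(w * y for _, _, w, y in records) / w_sum

    # Linearized score per observation, accumulated into PSU totals by stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in records:
        psu_totals[stratum][psu] += w * (y - ratio) / w_sum

    variance = 0.0
    for totals in psu_totals.values():
        n_h = len(totals)
        if n_h < 2:
            continue  # a single-PSU stratum contributes nothing in this sketch
        mean_h = sum(totals.values()) / n_h
        ss = sum((t - mean_h) ** 2 for t in totals.values())
        variance += n_h / (n_h - 1) * ss
    return ratio, sqrt(variance)


# Invented data: one stratum, two PSUs, one unit-weight respondent in each.
records = [("s1", "a", 1.0, 1.0), ("s1", "b", 1.0, 3.0)]
mean, se = linearized_se_of_mean(records)
print(mean, se)  # 2.0 1.0
```

In this toy case the result matches the simple-random-sampling formula s²/n = 2/2 = 1, which is a quick check on the between-PSU variance term.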
Project Structure
```
emovi-mcp/
├── pyproject.toml
├── README.md
├── LICENSE
├── CONTRIBUTING.md
├── .github/workflows/
│   ├── ci.yml  # CI: pytest on Python 3.10/3.11/3.12
│   └── publish.yml  # Publish to PyPI on release
├── scripts/
│   ├── build_dictionary.py  # One-time dictionary builder
│   └── validate_ceey.py  # Validate against CEEY 2025 reference
├── validation/
│   └── ceey_reference_values.json  # CEEY reference matrices
├── src/emovi_mcp/
│   ├── __init__.py
│   ├── __main__.py  # python -m emovi_mcp
│   ├── main.py  # FastMCP server entry point
│   ├── config.py  # Environment, mappings, constants
│   ├── data_loader.py  # Lazy .dta loader with cache
│   ├── dictionary.py  # Variable dictionary (JSON-based)
│   ├── stats_engine.py  # Transition matrices, descriptives
│   ├── data/
│   │   └── dictionary.json  # 792 variables
│   ├── helpers/
│   │   ├── formatting.py  # Markdown formatters for LLM output
│   │   ├── labels.py  # Value label resolution
│   │   ├── mobility_indices.py  # Shorrocks, Prais, odds ratios
│   │   ├── survey_variance.py  # Taylor linearization for SE/CI
│   │   ├── validation.py  # Column + filter validation
│   │   ├── visualization.py  # Heatmaps, Sankey, bar charts
│   │   └── weights.py  # Weighted mean, median, quantile, freq
│   └── tools/
│       ├── __init__.py  # Tool registration (11 tools)
│       ├── compare.py  # compare_groups
│       ├── describe.py  # describe_survey
│       ├── financial.py  # financial_inclusion_summary
│       ├── mobility.py  # transition_matrix
│       ├── stats.py  # weighted_stats
│       ├── subset.py  # filter_data
│       ├── tabulate.py  # tabulate
│       ├── temporal.py  # income_comparison
│       ├── variables.py  # list_variables, variable_detail
│       └── visualize.py  # visualize_mobility
└── tests/
    ├── conftest.py  # Shared fixtures (synthetic data)
    ├── test_dictionary.py
    ├── test_financial.py
    ├── test_mobility.py
    ├── test_mobility_indices.py
    ├── test_stats_engine.py
    ├── test_survey_variance.py
    ├── test_temporal.py
    ├── test_visualization.py
    └── test_wealth_index.py
```
License
MIT
Acknowledgments
Survey data: Centro de Estudios Espinosa Yglesias (CEEY) — ESRU-EMOVI 2023.