Privacy-preserving verification against the Receita Federal base. Match, don't reveal.
brasil-mcp-match
Match, don't reveal. Privacy-preserving CNPJ verification against the Brazilian Receita Federal base.
brasil-mcp-match is a Model Context Protocol server (and matching REST API) that lets AI agents and KYC pipelines verify what they already know about a Brazilian company — without ever returning the underlying personal/business data.
Instead of "give me the razão social for CNPJ X" (which is the model every existing CNPJ API uses), the question becomes "does CNPJ X have a razão social that matches the string Y?". The answer is a boolean + a confidence hint. The base data never crosses the wire.
That single inversion has big downstream consequences:
- LGPD posture. Operators don't accumulate copies of the RF base; controllers don't redistribute PII they had no basis to share.
- Match accuracy >> exact lookup. Fuzzy matching is built in for razão social (accent-insensitive, token-set, weighted phonetic). The caller doesn't need to know exactly how the RF spells things.
- Auditability. Every call returns a `query_id` and is logged with a hashed API key, hashed input, and response summary only, never the raw RF payload.
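A minimal sketch of what such an audit entry could look like, assuming SHA-256 for the hashes. The field names (`api_key_hash`, `input_hash`, `summary`) and helper functions are hypothetical, chosen for illustration; the package's actual log schema may differ.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


def sha256_hex(value: str) -> str:
    """Hex-encoded SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def build_audit_entry(api_key: str, request_body: dict, response: dict) -> dict:
    """Build a log record holding hashes and a response summary only.

    Field names are hypothetical, not the package's real schema.
    """
    # Canonical JSON so the same input always produces the same hash.
    canonical_input = json.dumps(request_body, sort_keys=True, ensure_ascii=False)
    return {
        "query_id": str(uuid.uuid4()),
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "api_key_hash": sha256_hex(api_key),
        "input_hash": sha256_hex(canonical_input),
        # Keep only the verdict fields; the RF payload is never passed in.
        "summary": {k: response[k] for k in ("match", "hint") if k in response},
    }


entry = build_audit_entry(
    "brasilmcp_yourkeyhere",
    {"cnpj": "33000167000101", "nome": "Petrobras"},
    {"match": True, "confidence": 1.0, "hint": "fuzzy_prefix"},
)
```

Because names and CNPJs are low-entropy inputs, a keyed hash (for example `hmac` with a server-side secret) would resist dictionary attacks better than a bare digest.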
Install
pip install brasil-mcp-match
# or
uv add brasil-mcp-match
License note. This package is AGPL-3.0-or-later. If you're calling our hosted API, you're fine. If you want to self-host commercially without releasing your derived source, contact us for a commercial license.
Quick start
1. Start Postgres + run the ingest
git clone https://github.com/brasil-mcp/match.git
cd match
docker compose up -d postgres
uv run brasil-mcp-match-ingest --release 2026-04
The first ingest takes ~30 minutes (downloads ~5 GB of RF dumps, parses, COPYs into Postgres). Subsequent monthly refreshes are incremental.
2. Start the API
uv run brasil-mcp-match serve 8000
3. Make a call
curl -X POST http://localhost:8000/v1/match/razao-social \
-H "X-Brasil-MCP-Key: brasilmcp_yourkeyhere" \
-H "Content-Type: application/json" \
-d '{"cnpj": "33000167000101", "nome": "Petrobras"}'
{
"query_id": "06d9ef6e-3759-43a4-864f-786e1ad59a6d",
"base_updated_at": "2026-04-01",
"match": true,
"confidence": 1.0,
"hint": "fuzzy_prefix"
}
The response never contains the actual razão social registered at the Receita Federal — just confirmation that what you supplied matches.
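The same call can be made from Python. The sketch below uses only the standard library and mirrors the cURL request above; the helper names are hypothetical and not part of the package.

```python
import json
import urllib.request


def build_match_request(
    base_url: str, api_key: str, cnpj: str, nome: str
) -> urllib.request.Request:
    """Build the POST request for /v1/match/razao-social."""
    body = json.dumps({"cnpj": cnpj, "nome": nome}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/match/razao-social",
        data=body,
        method="POST",
        headers={
            "X-Brasil-MCP-Key": api_key,
            "Content-Type": "application/json",
        },
    )


def match_razao_social(base_url: str, api_key: str, cnpj: str, nome: str) -> dict:
    """Send the request and decode the JSON verdict (match, confidence, hint)."""
    request = build_match_request(base_url, api_key, cnpj, nome)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Usage, with the local API from step 2 running:
# result = match_razao_social(
#     "http://localhost:8000", "brasilmcp_yourkeyhere", "33000167000101", "Petrobras"
# )
# print(result["match"], result["confidence"], result["hint"])
```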
v0.1.0 tools
| Tool | Description | Output |
|---|---|---|
| `match_razao_social` | Verify if a name matches the RF-registered razão social. Fuzzy: accent-insensitive, token-set, phonetic. | `{ match, confidence, hint }` |
| `check_situacao_cadastral` | Return the cadastral status (ativa/suspensa/inapta/baixada/nula). | `{ situacao, since }` |
| `check_porte_empresa` | Return company size enum (MEI/ME/EPP/DEMAIS) and Simples Nacional flag. | `{ porte, is_simples_nacional, is_mei }` |
| `match_uf` | Verify the company's registered UF matches what you supplied. | `{ match }` |
Full input/output schemas + cURL examples: docs/tools.md.
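To give a feel for what "accent-insensitive, token-set" matching means, here is a simplified, standard-library-only sketch. It is a stand-in for illustration, not the package's matcher: phonetic weighting is omitted, and the function names and the 0.8 threshold are assumptions.

```python
import unicodedata


def normalize(text: str) -> str:
    """Uppercase, strip accents, and replace punctuation with spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.upper())
    return " ".join(cleaned.split())


def token_set_score(candidate: str, registered: str) -> float:
    """Fraction of the candidate's tokens that appear in the registered name."""
    candidate_tokens = set(normalize(candidate).split())
    registered_tokens = set(normalize(registered).split())
    if not candidate_tokens:
        return 0.0
    return len(candidate_tokens & registered_tokens) / len(candidate_tokens)


def is_match(candidate: str, registered: str, threshold: float = 0.8) -> bool:
    """Hypothetical decision rule: match when enough tokens overlap."""
    return token_set_score(candidate, registered) >= threshold


# Fictional company name, used only to exercise the functions.
registered_name = "PADARIA SÃO JOÃO LTDA"
print(is_match("padaria sao joao", registered_name))
print(is_match("Padaria Santo Antonio", registered_name))
```

Token order and accents do not affect the score, which is why a caller does not need to reproduce the registered spelling exactly.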
Base coverage — what's in (and what's out)
The Match base intentionally excludes two categories of CNPJs from the ingested Postgres database:
- MEI (Microempreendedor Individual). A MEI is a sole-proprietorship where the legal entity is bound to a single individual's CPF. Treating MEIs the same as regular companies would expose individual-person PII through every match operation. They're excluded by default.
- Companies with non-active cadastral status. Suspended, unfit, inactive, and dissolved companies aren't relevant for KYC, onboarding, or any forward-looking B2B operation. They're noise. Excluded by default.
Consequence: CNPJ_NOT_FOUND now means one of:
- The CNPJ doesn't exist (typo, hallucination, invalid number).
- The CNPJ exists but is a MEI.
- The CNPJ exists but isn't currently active.
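If you need to distinguish these cases, call brasil-mcp-essentials.validate_cnpj first (offline, free) to confirm the CNPJ is structurally valid before calling Match. That rules out typos and malformed numbers; a structurally valid CNPJ that still returns CNPJ_NOT_FOUND is either unregistered, a MEI, or not currently active.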
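For reference, structural validity of a numeric CNPJ comes down to two modulo-11 check digits. The sketch below implements that standard check; it is not the `brasil-mcp-essentials` implementation, and the function names are hypothetical.

```python
import re


def cnpj_check_digit(digits: list) -> int:
    """Modulo-11 check digit over the 12 or 13 leading digits."""
    # Weights cycle 2..9 from right to left, e.g. 5,4,3,2,9,8,7,6,5,4,3,2.
    weights = [(i % 8) + 2 for i in range(len(digits))][::-1]
    remainder = sum(d * w for d, w in zip(digits, weights)) % 11
    return 0 if remainder < 2 else 11 - remainder


def is_structurally_valid_cnpj(cnpj: str) -> bool:
    """Check length and both check digits; says nothing about registration."""
    cleaned = re.sub(r"[./\-\s]", "", cnpj)  # accept 33.000.167/0001-01 style
    if not re.fullmatch(r"[0-9]{14}", cleaned):
        return False
    digits = [int(ch) for ch in cleaned]
    if len(set(digits)) == 1:  # reject repeated-digit sequences such as 000...0
        return False
    first = cnpj_check_digit(digits[:12])
    second = cnpj_check_digit(digits[:12] + [first])
    return digits[12:] == [first, second]


print(is_structurally_valid_cnpj("33000167000101"))
```

A number that passes this check can still be absent from the Match base for any of the three reasons listed above.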
Override (for special use cases)
The exclusion is enforced at ingestion via SQL filters in the loader. Override via environment variables before running brasil-mcp-match-ingest:
# Include MEI in the base
export BRASIL_MCP_MATCH_INCLUDE_MEI=1
# Include non-active CNPJs
export BRASIL_MCP_MATCH_INCLUDE_INATIVAS=1
uv run brasil-mcp-match-ingest --release 2026-04
The future brasil-mcp-compliance (Phase 5) will likely use both overrides to maintain a full historical trail. The default Match deployment exports neither.
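As an illustration of how such overrides could translate into loader filters, here is a minimal sketch. The environment variable names come from this README; the column names (`is_mei`, `situacao_cadastral`), the helper functions, and the rule that only the value `1` enables a flag are assumptions.

```python
import os
from collections.abc import Mapping


def flag_enabled(env: Mapping, name: str) -> bool:
    """Treat a variable as enabled only when set to "1" (assumed convention)."""
    return env.get(name, "") == "1"


def build_ingest_filters(env: Mapping = os.environ) -> list:
    """Return the SQL predicates a loader might apply for the current settings.

    Column names are hypothetical.
    """
    filters = []
    if not flag_enabled(env, "BRASIL_MCP_MATCH_INCLUDE_MEI"):
        filters.append("is_mei = FALSE")
    if not flag_enabled(env, "BRASIL_MCP_MATCH_INCLUDE_INATIVAS"):
        filters.append("situacao_cadastral = 'ativa'")
    return filters


def where_clause(filters: list) -> str:
    """Join predicates into a WHERE clause, or return "" when there are none."""
    return "WHERE " + " AND ".join(filters) if filters else ""


print(where_clause(build_ingest_filters({})))
print(where_clause(build_ingest_filters({"BRASIL_MCP_MATCH_INCLUDE_MEI": "1"})))
```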
MCP client setup (Claude Desktop)
{
"mcpServers": {
"brasil-mcp-match": {
"url": "https://match.brasil-mcp.com/sse",
"transport": "sse",
"headers": {
"X-Brasil-MCP-Key": "brasilmcp_yourkeyhere"
}
}
}
}
Different MCP clients negotiate auth differently. For stdio (single-user dev), use the REST API with the same key. For hosted SSE, the header above is sufficient. See the MCP spec for transport details.
Privacy & LGPD
This is a server that processes data classified by Brazilian law as personal (when a CNPJ is associated with an MEI or natural-person sócio). We take three explicit positions:
- Operational minimization. Every output is a boolean, enum, or short string. The RF payload (razão social, capital social, addresses, sócio names, CPF fragments) never leaves the server. We've structurally enforced this in the codebase + asserted it across a security test suite (323 tests).
- Opt-out (Art. 18 LGPD). Data subjects (titulares) can request removal via `POST /v1/opt-out/{cnpj}` with proof. After 15 business days the CNPJ is blocked from all match/check tools. See `docs/lgpd/` for our LIA template and DPA template.
- Auditability without spying. Every call yields a `query_id`. Callers can retrieve the audit entry for their own calls, never for someone else's (RBAC enforced + tested).
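One way to read "structurally enforced" minimization is that the response type has no field capable of carrying the RF payload. The sketch below illustrates that idea under assumptions: the class names, the `hint` values, and the exact-comparison logic are hypothetical and much simpler than the real matcher.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RegisteredCompany:
    """Server-side record (hypothetical shape); never serialized to callers."""
    cnpj: str
    razao_social: str


@dataclass(frozen=True)
class MatchResult:
    """The only shape that leaves the server: a boolean, a number, a short hint."""
    match: bool
    confidence: float
    hint: str


def match_razao_social(record: RegisteredCompany, supplied_name: str) -> MatchResult:
    """Compare inside the server and return only the verdict."""
    same = record.razao_social.casefold() == supplied_name.casefold()
    return MatchResult(
        match=same,
        confidence=1.0 if same else 0.0,
        hint="exact" if same else "none",
    )


# Fictional record, used only to exercise the function.
record = RegisteredCompany(cnpj="11222333000181", razao_social="EMPRESA EXEMPLO LTDA")
result = match_razao_social(record, "Outra Empresa SA")
print(asdict(result))
```

Even on a failed match, the serialized result contains nothing derived from the registered name.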
If you're an attorney evaluating us for a deployment, start with the LIA template (docs/lgpd/LIA.md). It maps our processing to a legitimate-interest legal basis with explicit safeguards.
Architecture
src/brasil_mcp_match/
  core/
    ingestion/    # downloader, parser, loader, refresh job, manifest
    matching/     # razao_social, situacao, porte, localizacao
    repository/   # CnpjRepo protocol + PostgresCnpjRepo
    auth/         # api_key + quota
    audit/        # append-only log
    lgpd/         # opt-out (Art. 18)
    errors.py     # ErrorCode + ErrorObj
  adapters/
    mcp/          # FastMCP server + 4 tools
    rest/         # FastAPI app + dependencies + routes_match + routes_lgpd
Same pattern as brasil-mcp-essentials: pure-Python core, thin adapters. Postgres + GIN tri-gram indexes for fuzzy. FastAPI + FastMCP SSE.
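The "pure-Python core, thin adapters" split can be sketched with a `typing.Protocol`: the core depends on an interface, and storage backends satisfy it. Only the names `CnpjRepo` and `PostgresCnpjRepo` come from the tree above; the method name, the in-memory implementation, and the comparison logic are assumptions for illustration.

```python
from typing import Optional, Protocol


class CnpjRepo(Protocol):
    """Storage interface the core depends on; the method name is hypothetical."""

    def get_razao_social(self, cnpj: str) -> Optional[str]: ...


class InMemoryCnpjRepo:
    """Test double; a Postgres-backed repo would expose the same method."""

    def __init__(self, rows: dict) -> None:
        self._rows = rows

    def get_razao_social(self, cnpj: str) -> Optional[str]:
        return self._rows.get(cnpj)


def razao_social_matches(repo: CnpjRepo, cnpj: str, nome: str) -> Optional[bool]:
    """Core logic with no I/O of its own; None means the CNPJ is not in the base."""
    registered = repo.get_razao_social(cnpj)
    if registered is None:
        return None
    return registered.casefold() == nome.casefold()


# Fictional row, used only to exercise the functions.
repo = InMemoryCnpjRepo({"11222333000181": "EMPRESA EXEMPLO LTDA"})
print(razao_social_matches(repo, "11222333000181", "Empresa Exemplo Ltda"))
```

Since the core never imports a database driver, the same function can be exercised in unit tests with the in-memory repo and served through either adapter.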
Roadmap
- v0.1.0 (now): 4 match/check tools, REST + MCP SSE adapters, API keys, audit, opt-out, rate limit. 100% test coverage (line + branch).
- v0.2.0: adds `match_cnae`, `check_idade`, `match_socio_cpf` (compares CPF prefix only), `match_municipio`, `match_cep`.
- v0.3.0: OAuth 2.0 (vs API keys), tenant-scoped quotas, OpenTelemetry tracing.
- v0.4.0: Tri-gram fuzzy as a built-in match strategy (vs Python-side rapidfuzz). Async refresh job.
Family
- Phase 1: `brasil-mcp-essentials`. 14 offline utilities (validators, boletos, PIX QR, calendar). MIT.
- Phase 2: this repo. Verification against the RF base. AGPLv3.
- Phase 3: `brasil-mcp-compliance` (future). Due diligence + KYC, commercial.
License
AGPL-3.0-or-later. For commercial self-host without source release obligations, contact us.
Contributing
Issues + PRs welcome. Before opening a PR, run:
uv run ruff check && uv run ruff format --check && uv run pyright src
uv run pytest --cov-fail-under=100 -q
CI runs the same gates across Python 3.11, 3.12, and 3.13.