Name Analysis & Prediction Engine
🌍 Ethnidata: Ethical & Demographic Intelligence
Ethnidata is a specialized library for ethical demographic analysis, name-based ethnic classification, and socioeconomic profiling. It is designed to help researchers and developers understand global diversity while maintaining strict ethical standards and explainability.
🌟 Vision
To provide a transparent and robust framework for demographic intelligence, enabling unbiased analysis and inclusive product development through Explainable AI (XAI).
🚀 Key Features
- 🧬 Advanced Classification: High-accuracy ethnic and regional classification based on global naming patterns.
- 🔍 Explainable AI (XAI): Integral Explainer class that breaks down WHY a classification was made, citing linguistic markers.
- 📊 Demographic Synthesis: Generate privacy-safe synthetic demographic profiles for testing and simulation.
- 📉 Bias Detection: Tools to identify and mitigate representation bias in your datasets.
- 🌍 Global Coverage: Support for over 150 ethnic groups and regional clusters.
📦 Installation
pip install ethnidata
🛠️ Premium Usage
1. Unified Facade Access
The EthniData facade provides a streamlined interface for classification and explainability.
from ethnidata import EthniData
# Initialize the intelligence engine
ed = EthniData()
# 1. Classify a name with explainability
result = ed.classify("Kazuo Ishiguro", explain=True)
print(f"Name: {result.name}")
print(f"Primary Ethnicity: {result.ethnicity}")
print(f"Confidence: {result.confidence:.2f}")
# 2. Access XAI Insights
explanation = result.explanation
print("\n--- XAI Breakdown ---")
for marker in explanation.linguistic_markers:
    print(f"- Marker: {marker.token} | Strength: {marker.weight:.2f} | Origin: {marker.region}")
✅ Verified Output
Name: Kazuo Ishiguro
Primary Ethnicity: Japanese
Confidence: 0.98
--- XAI Breakdown ---
- Marker: Kazuo | Strength: 0.85 | Origin: East Asia (Japan)
- Marker: Ishiguro | Strength: 0.92 | Origin: East Asia (Japan)
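The attributes printed above (name, ethnicity, confidence, and a token, weight, and region per marker) suggest a simple result shape. The sketch below uses stand-in dataclasses, not Ethnidata's own classes, to show one way a caller might treat low-confidence results as undetermined and rank markers by weight. The 0.8 threshold and the names Marker, Result, label_or_none, and top_markers are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Stand-in types mirroring the attributes used in the example above.
# They are NOT Ethnidata's classes; real field names and types may differ.
@dataclass
class Marker:
    token: str
    weight: float
    region: str


@dataclass
class Result:
    name: str
    ethnicity: str
    confidence: float
    markers: List[Marker] = field(default_factory=list)


def label_or_none(result: Result, threshold: float = 0.8) -> Optional[str]:
    """Return the predicted label only when confidence meets the threshold."""
    return result.ethnicity if result.confidence >= threshold else None


def top_markers(result: Result, n: int = 3) -> List[Marker]:
    """Return the n markers with the highest weight, strongest first."""
    return sorted(result.markers, key=lambda m: m.weight, reverse=True)[:n]


if __name__ == "__main__":
    # Values copied from the sample output above.
    r = Result(
        name="Kazuo Ishiguro",
        ethnicity="Japanese",
        confidence=0.98,
        markers=[
            Marker("Kazuo", 0.85, "East Asia (Japan)"),
            Marker("Ishiguro", 0.92, "East Asia (Japan)"),
        ],
    )
    print(label_or_none(r))
    print([m.token for m in top_markers(r)])
```

Returning None below the threshold keeps an uncertain guess from being recorded as fact, which fits the library's stated emphasis on explainability.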
2. Synthetic Profile Generation
Create high-fidelity, privacy-safe demographic data for system testing.
from ethnidata import ProfileGenerator, Region
generator = ProfileGenerator()
# Generate a batch of synthetic profiles for the Mediterranean region
profiles = generator.generate_batch(region=Region.MEDITERRANEAN, count=5)
for profile in profiles:
    print(f"Profile: {profile.name} | Age: {profile.age} | Occupation: {profile.estimated_occupation}")
✅ Verified Output
Profile: Marco Rossi | Age: 34 | Occupation: Software Engineer
Profile: Elena Papadopoulos | Age: 28 | Occupation: Architect
...
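As a rough illustration of the general idea behind synthetic profiles, the sketch below samples each field independently from small pools, so no generated record corresponds to a real person. It is not how ProfileGenerator is implemented; the pools, the SyntheticProfile class, the age range, and the seed parameter are all assumptions chosen for the example.

```python
import random
from dataclasses import dataclass
from typing import List

# Tiny illustrative pools (hypothetical); a real generator would need far more data.
GIVEN_NAMES = ["Marco", "Elena", "Sofia", "Nikos"]
SURNAMES = ["Rossi", "Papadopoulos", "Garcia", "Conti"]
OCCUPATIONS = ["Software Engineer", "Architect", "Teacher", "Nurse"]


@dataclass
class SyntheticProfile:
    name: str
    age: int
    estimated_occupation: str


def generate_batch(count: int, seed: int = 0) -> List[SyntheticProfile]:
    """Build `count` profiles by sampling each field independently."""
    rng = random.Random(seed)  # local RNG so a given seed reproduces the batch
    return [
        SyntheticProfile(
            name=f"{rng.choice(GIVEN_NAMES)} {rng.choice(SURNAMES)}",
            age=rng.randint(18, 80),  # inclusive bounds, assumed adult range
            estimated_occupation=rng.choice(OCCUPATIONS),
        )
        for _ in range(count)
    ]


if __name__ == "__main__":
    for p in generate_batch(3, seed=42):
        print(f"Profile: {p.name} | Age: {p.age} | Occupation: {p.estimated_occupation}")
```

Seeding a local random.Random instance makes test fixtures repeatable without touching the global random state.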
📊 API Reference
EthniData (Facade)
- classify(name: str, explain: bool = False) -> ClassificationResult: The primary entry point for classification.
- batch_classify(names: list, ...) -> List[ClassificationResult]: Process large datasets efficiently.
- get_explainer() -> ExplainabilityEngine: Access the raw XAI engine.
Modules
- ExplainabilityEngine: Linguistic marker analysis and evidence weighing.
- ProfileGenerator: Synthetic data engine with region-specific constraints.
- BiasAnalyzer: Statistical tools for measuring group representation.
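One common way to measure group representation is to compare each group's observed share in a dataset with a reference share. The sketch below shows that calculation in plain Python; it does not use or describe BiasAnalyzer's actual interface, and the function name representation_gaps and its parameters are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def representation_gaps(
    labels: Iterable[str], reference: Dict[str, float]
) -> Dict[str, float]:
    """Return observed share minus reference share for each group.

    Positive values mean a group is over-represented in `labels`;
    negative values mean it is under-represented.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("labels must not be empty")
    # Include groups that appear in only one of the two inputs.
    groups = sorted(set(counts) | set(reference))
    return {g: counts.get(g, 0) / total - reference.get(g, 0.0) for g in groups}


if __name__ == "__main__":
    sample = ["A", "A", "A", "B"]
    expected_shares = {"A": 0.5, "B": 0.5}
    for group, gap in representation_gaps(sample, expected_shares).items():
        print(f"{group}: {gap:+.2f}")
```

A group missing from the dataset shows up with a negative gap equal to its full reference share, which makes total absence easy to spot.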
🛡️ Ethics & Privacy
Ethnidata is built with a Privacy-First approach. It does not store personal data and focuses on aggregate-level linguistic patterns. We strongly recommend using this library only for research and inclusive design purposes.
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
Download files
File details
Details for the file ethnidata-4.3.1.tar.gz.
File metadata
- Download URL: ethnidata-4.3.1.tar.gz
- Upload date:
- Size: 35.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aa2698a74d362f0b8843ea05cd3099eee7b35588ab6fe2bec6b9e7e7fbcf834a |
| MD5 | 3621138b6047d6cd20139847877a0aa2 |
| BLAKE2b-256 | fe60ecb1bcbd2503329a9e6d744e36ff26e29f79cc568a407711f7cae31b41f9 |
File details
Details for the file ethnidata-4.3.1-py3-none-any.whl.
File metadata
- Download URL: ethnidata-4.3.1-py3-none-any.whl
- Upload date:
- Size: 57.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bee8c282e92b530c428268ea64e091dcc66098a393fcf969f5bd505ccad8e435 |
| MD5 | a3ba8c5f9d3b9aa71ba0d56d165f8dfa |
| BLAKE2b-256 | 41c135310ab47b369639e100c752b7ee7f8e2377e2bb68c3814f7ad524772e3b |