Generate realistic fake Persian/Farsi names for testing, drawn from 10,000+ authentic names
Farsi Faker | فارسی فیکر
Generate realistic fake Persian/Farsi names for testing and development
✨ Features
- 🎯 10,000+ Authentic Names - Real Persian names from Iranian datasets
- 👥 Gender-Specific - Separate male and female name generation
- ⚡ High Performance - Optimized pickle-based data storage
- 🔄 Reproducible - Seed support for consistent results
- 🚀 Zero Dependencies - No external packages required for production
- 🔒 Thread-Safe - Safe for concurrent use
- 📝 Fully Typed - Complete type hints for better IDE support
- ✅ Well Tested - Comprehensive test coverage
- 🌍 Unicode Support - Full Persian/Farsi character support
- 🐼 pandas Integration - Optional DataFrame output for data science workflows
📦 Installation
From PyPI (Recommended)
pip install farsi-faker
With pandas support (for DataFrame output)
pip install farsi-faker pandas
From Source
git clone https://github.com/alisadeghiaghili/farsi-faker.git
cd farsi-faker
pip install -e .
Requirements
- Python 3.7+
- No external dependencies for production use
- Optional: pandas for DataFrame output (as_dataframe=True)
🚀 Quick Start
Basic Usage
from farsi_faker import FarsiFaker
faker = FarsiFaker()
# Generate a random person
person = faker.full_name()
print(person)
# {'name': 'علی صادقی عقیلی', 'first_name': 'علی', 'last_name': 'صادقی عقیلی', 'gender': 'male'}
# Generate male name
male = faker.full_name('male')
print(male['name']) # علی صادقی عقیلی
# Generate female name
female = faker.full_name('female')
print(female['name']) # سپیده جلیلی
Generate Multiple Names
# 10 random names as a list (default)
people = faker.generate_names(10)
# 50 male names as a list
men = faker.generate_names(50, 'male')
# 30 female names as a pandas DataFrame
women_df = faker.generate_names(30, 'female', as_dataframe=True)
print(women_df.shape) # (30, 4)
print(list(women_df.columns)) # ['name', 'first_name', 'last_name', 'gender']
print(women_df.head(2))
# name first_name last_name gender
# 0 فاطمه احمدی فاطمه احمدی female
# 1 زینب رضایی زینب رضایی female
Generate Balanced Dataset
# 100 people with 60% male ratio — as a list
dataset = faker.generate_dataset(100, male_ratio=0.6)
print(len(dataset)) # 100
# Same, but as a pandas DataFrame
df = faker.generate_dataset(500, male_ratio=0.5, as_dataframe=True)
print(df.shape) # (500, 4)
print(df['gender'].value_counts())
# male 250
# female 250
# Name: gender, dtype: int64
Reproducible Results
faker1 = FarsiFaker(seed=42)
faker2 = FarsiFaker(seed=42)
assert faker1.full_name() == faker2.full_name() # True
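One common way to get this kind of reproducibility is to give each generator its own random.Random instance instead of seeding the global module state. The sketch below illustrates that pattern with a hypothetical TinyFaker class and a few hard-coded names. It shows the general technique under those assumptions and is not farsi-faker's actual implementation.

```python
import random
from typing import Dict, Optional

# Hypothetical stand-in data; the real package ships a much larger database.
FIRST_NAMES = ["علی", "محمد", "مریم", "فاطمه"]
LAST_NAMES = ["احمدی", "رضایی", "جلیلی"]


class TinyFaker:
    """Illustrative only: a per-instance RNG keeps seeded runs independent."""

    def __init__(self, seed: Optional[int] = None) -> None:
        # A private Random instance means other code calling random.seed()
        # cannot disturb this generator's sequence.
        self._rng = random.Random(seed)

    def full_name(self) -> Dict[str, str]:
        first = self._rng.choice(FIRST_NAMES)
        last = self._rng.choice(LAST_NAMES)
        return {"name": f"{first} {last}", "first_name": first, "last_name": last}


a = TinyFaker(seed=42)
b = TinyFaker(seed=42)
seq_a = [a.full_name() for _ in range(5)]
seq_b = [b.full_name() for _ in range(5)]
print(seq_a == seq_b)  # True: same seed, same sequence
```

With this design two seeded instances stay in lockstep for the whole sequence, not only for the first call.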
Quick One-Off Generation
from farsi_faker import generate_fake_name
person = generate_fake_name('male')
print(person['name']) # علی صادقی عقیلی
📖 Documentation
Class: FarsiFaker
Main class for generating Persian names.
Constructor
FarsiFaker(seed: Optional[int] = None)
Parameters:
- seed (int, optional): Random seed for reproducible results
Example:
faker = FarsiFaker() # random
faker = FarsiFaker(seed=42) # reproducible
male_first_name() -> str
Return a random male first name.
faker.male_first_name() # 'محمد'
female_first_name() -> str
Return a random female first name.
faker.female_first_name() # 'فاطمه'
first_name(gender=None) -> Tuple[str, str]
Return a first name with its normalised gender.
Parameters:
- gender (str, optional): Any supported gender token (see Gender Input Options)
Returns: (name, gender) — gender is always 'male' or 'female'
name, g = faker.first_name('male')
# ('علی', 'male')
name, g = faker.first_name() # random gender
# ('مریم', 'female')
last_name() -> str
Return a random Persian family name.
faker.last_name() # 'احمدی'
full_name(gender=None) -> Dict[str, str]
Return a complete person record.
Returns: dict with keys name, first_name, last_name, gender
person = faker.full_name('female')
# {
# 'name': 'سپیده جلیلی',
# 'first_name': 'سپیده',
# 'last_name': 'جلیلی',
# 'gender': 'female'
# }
assert person['name'] == person['first_name'] + ' ' + person['last_name']
generate_names(count=10, gender=None, as_dataframe=False)
Generate multiple full-name records.
Parameters:
- count (int, default 10): Number of records to generate
- gender (str, optional): Gender applied to all records; random mix when None
- as_dataframe (bool, default False): Return a pandas.DataFrame instead of a list
Returns: List[Dict] or pandas.DataFrame with columns ['name', 'first_name', 'last_name', 'gender']
Raises: ValueError if count ≤ 0; ImportError if as_dataframe=True and pandas is not installed
# List (default)
people = faker.generate_names(5, 'male')
assert len(people) == 5
assert all(p['gender'] == 'male' for p in people)
# DataFrame
df = faker.generate_names(100, as_dataframe=True)
assert df.shape == (100, 4)
assert list(df.columns) == ['name', 'first_name', 'last_name', 'gender']
assert not df.isnull().any().any()
assert (df['name'] == df['first_name'] + ' ' + df['last_name']).all()
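The ImportError behaviour described above is typical of an optional dependency that is imported only when needed. A minimal sketch of that pattern follows. The finalize helper and the COLUMNS constant are hypothetical names used for illustration, and the library's internals may be organised differently.

```python
from typing import Any, Dict, List

COLUMNS = ["name", "first_name", "last_name", "gender"]


def finalize(records: List[Dict[str, str]], as_dataframe: bool = False) -> Any:
    """Return records as a list, or as a DataFrame when requested."""
    if not as_dataframe:
        return records
    try:
        import pandas as pd  # imported lazily so list output needs no pandas
    except ImportError as exc:
        raise ImportError(
            "as_dataframe=True requires pandas; install it with 'pip install pandas'"
        ) from exc
    return pd.DataFrame(records, columns=COLUMNS)


sample = [
    {"name": "سپیده جلیلی", "first_name": "سپیده", "last_name": "جلیلی", "gender": "female"}
]
result = finalize(sample)  # list path: works with or without pandas installed
```

Keeping the import inside the function is what lets the package claim zero required dependencies while still offering DataFrame output.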
generate_dataset(count=100, male_ratio=0.5, as_dataframe=False)
Generate a balanced dataset with a configurable gender ratio.
Parameters:
- count (int, default 100): Total number of records
- male_ratio (float, default 0.5): Fraction of male records in [0.0, 1.0]
- as_dataframe (bool, default False): Return a pandas.DataFrame instead of a list
Returns: Shuffled List[Dict] or pandas.DataFrame
Raises: ValueError if count ≤ 0 or male_ratio outside [0.0, 1.0]; ImportError if pandas missing and as_dataframe=True
# List (default)
dataset = faker.generate_dataset(10, male_ratio=0.6)
assert len(dataset) == 10
assert sum(1 for p in dataset if p['gender'] == 'male') == 6
# DataFrame
df = faker.generate_dataset(100, male_ratio=0.5, as_dataframe=True)
assert df.shape == (100, 4)
assert df['gender'].value_counts().to_dict() == {'male': 50, 'female': 50}
# Edge cases
assert all(p['gender'] == 'female' for p in faker.generate_dataset(5, male_ratio=0.0))
assert all(p['gender'] == 'male' for p in faker.generate_dataset(5, male_ratio=1.0))
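The counts in the assertions above are consistent with splitting count by male_ratio and then shuffling. The sketch below shows one way to do that with a hypothetical gender_plan helper. Rounding to the nearest whole record is an assumption, since the examples here (6 of 10, 50 of 100) do not reveal how fractional results are handled.

```python
import random
from typing import List, Optional


def gender_plan(count: int, male_ratio: float = 0.5,
                seed: Optional[int] = None) -> List[str]:
    """Return a shuffled list of gender labels honouring male_ratio."""
    if count <= 0:
        raise ValueError("count must be positive")
    if not 0.0 <= male_ratio <= 1.0:
        raise ValueError("male_ratio must be within [0.0, 1.0]")
    # Assumption: round to the nearest whole record; the real rule may differ.
    male_count = round(count * male_ratio)
    labels = ["male"] * male_count + ["female"] * (count - male_count)
    random.Random(seed).shuffle(labels)
    return labels


plan = gender_plan(10, male_ratio=0.6, seed=1)
print(plan.count("male"), plan.count("female"))  # 6 4
```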
get_stats() -> Dict[str, int]
Return statistics about the embedded names database.
Returns: dict with keys male_names_count, female_names_count, last_names_count, total_names, possible_combinations
stats = faker.get_stats()
assert stats['possible_combinations'] == \
(stats['male_names_count'] + stats['female_names_count']) * stats['last_names_count']
print(f"Possible combinations: {stats['possible_combinations']:,}")
# Possible combinations: 21,000,000
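The relationship asserted above is plain arithmetic: every first name, of either gender, can be paired with every last name. A worked example with made-up counts (the numbers below are hypothetical and are not the package's real statistics):

```python
def possible_combinations(male: int, female: int, last: int) -> int:
    """Number of distinct first-name/last-name pairs."""
    return (male + female) * last


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
combos = possible_combinations(male=1_000, female=1_500, last=4_000)
print(f"{combos:,}")  # 10,000,000
```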
Function: generate_fake_name(gender=None, seed=None) -> Dict[str, str]
Convenience wrapper for one-off generation. For bulk generation prefer a FarsiFaker instance directly.
from farsi_faker import generate_fake_name
p1 = generate_fake_name('female', seed=99)
p2 = generate_fake_name('female', seed=99)
assert p1 == p2 # reproducible
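The same property explains why an instance is the better choice for bulk work: a fixed seed passed on every call produces the same record every time. The sketch below demonstrates this with a hypothetical one_off function. It assumes the wrapper builds a fresh generator on each call, which the text above does not confirm.

```python
import random
from typing import Dict, Optional

# Hypothetical stand-in data for illustration.
NAMES = {"male": ["علی", "محمد", "رضا"], "female": ["مریم", "فاطمه", "سپیده"]}
LAST_NAMES = ["احمدی", "رضایی", "جلیلی"]


def one_off(gender: str, seed: Optional[int] = None) -> Dict[str, str]:
    """Stand-in for a convenience wrapper: a new RNG on every call."""
    rng = random.Random(seed)
    first, last = rng.choice(NAMES[gender]), rng.choice(LAST_NAMES)
    return {"name": f"{first} {last}", "first_name": first,
            "last_name": last, "gender": gender}


# Same seed on every call: the same record comes back each time.
repeated = [one_off("female", seed=99) for _ in range(3)]
print(len({p["name"] for p in repeated}))  # 1

# One RNG reused across calls advances its state, so records can vary.
rng = random.Random(99)
varied = [rng.choice(NAMES["female"]) for _ in range(3)]
```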
🎨 Examples
Example 1: Django test fixtures
from farsi_faker import FarsiFaker
from myapp.models import User
faker = FarsiFaker(seed=42)
# Assumes myapp's User model defines name, first_name, last_name and gender fields
for person in faker.generate_dataset(100, male_ratio=0.5):
    User.objects.create(**person)
Example 2: Export to CSV
import csv
from farsi_faker import FarsiFaker
faker = FarsiFaker()
with open('people.csv', 'w', encoding='utf-8', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['name', 'first_name', 'last_name', 'gender'])
    writer.writeheader()
    writer.writerows(faker.generate_dataset(1000, male_ratio=0.6))
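If the CSV is meant to be opened in Excel, writing it with the utf-8-sig codec can help, because the byte-order mark lets Excel detect UTF-8 and display Persian text correctly. The sketch below round-trips a couple of hand-written records shaped like the dictionaries above. The records and the temporary path are illustrative only.

```python
import csv
import os
import tempfile

# Hypothetical records with the same keys as the generated dictionaries.
rows = [
    {"name": "سپیده جلیلی", "first_name": "سپیده", "last_name": "جلیلی", "gender": "female"},
    {"name": "علی احمدی", "first_name": "علی", "last_name": "احمدی", "gender": "male"},
]
fields = ["name", "first_name", "last_name", "gender"]

path = os.path.join(tempfile.mkdtemp(), "people.csv")

# utf-8-sig prepends a byte-order mark, which helps Excel detect UTF-8.
with open(path, "w", encoding="utf-8-sig", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)

# Reading with utf-8-sig strips the mark again, so the header stays clean.
with open(path, encoding="utf-8-sig", newline="") as f:
    loaded = list(csv.DictReader(f))

print(loaded == rows)  # True
```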
Example 3: pandas DataFrame for data science
from farsi_faker import FarsiFaker
faker = FarsiFaker(seed=123)
df = faker.generate_dataset(500, male_ratio=0.55, as_dataframe=True)
print(df.shape) # (500, 4)
print(df['gender'].value_counts()) # male 275 / female 225
print(df.groupby('gender')['last_name'].nunique())
Example 4: pytest fixture
import pytest
from farsi_faker import FarsiFaker
@pytest.fixture
def fake_users():
    return FarsiFaker(seed=42).generate_dataset(10, male_ratio=0.5)

def test_user_creation(fake_users):
    assert len(fake_users) == 10
    assert all('name' in u for u in fake_users)
Example 5: Flask mock API
from flask import Flask, jsonify
from farsi_faker import FarsiFaker
app = Flask(__name__)
faker = FarsiFaker()
@app.route('/api/users/random')
def random_user():
    return jsonify(faker.full_name())

@app.route('/api/users/<int:count>')
def multiple_users(count):
    # Clamp to 1..100: generate_names raises ValueError when count <= 0
    return jsonify(faker.generate_names(max(1, min(count, 100))))
🎯 Gender Input Options
| Input | Resolves to |
|---|---|
| 'male', 'm' | 'male' |
| 'مرد', 'پسر', 'مذکر' | 'male' |
| 'female', 'f' | 'female' |
| 'زن', 'دختر', 'مونث' | 'female' |
| None | random |
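A lookup table is a natural way to implement this kind of normalisation. The sketch below uses a hypothetical normalize_gender function built from the tokens listed above. Trimming whitespace, ignoring case, and raising ValueError for unknown tokens are all assumptions, since the table does not specify them.

```python
import random
from typing import Optional

# Tokens taken from the table above.
_GENDER_TOKENS = {
    "male": "male", "m": "male", "مرد": "male", "پسر": "male", "مذکر": "male",
    "female": "female", "f": "female", "زن": "female", "دختر": "female", "مونث": "female",
}


def normalize_gender(gender: Optional[str],
                     rng: Optional[random.Random] = None) -> str:
    """Map any supported token to 'male' or 'female'; None picks at random."""
    if gender is None:
        return (rng or random).choice(["male", "female"])
    try:
        # Assumption: input is trimmed and matched case-insensitively.
        return _GENDER_TOKENS[gender.strip().lower()]
    except KeyError:
        # Assumption: unknown tokens are rejected rather than ignored.
        raise ValueError(f"unsupported gender: {gender!r}") from None


print(normalize_gender("دختر"))  # female
print(normalize_gender("m"))     # male
```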
📊 Database Statistics
from farsi_faker import FarsiFaker
stats = FarsiFaker().get_stats()
print(f"Male names: {stats['male_names_count']:,}")
print(f"Female names: {stats['female_names_count']:,}")
print(f"Last names: {stats['last_names_count']:,}")
print(f"Total names: {stats['total_names']:,}")
print(f"Possible combinations: {stats['possible_combinations']:,}")
🧪 Testing
pip install -e ".[dev]"
pytest tests/ -v
pytest tests/ --cov=farsi_faker --cov-report=html
🛠️ Development
git clone https://github.com/alisadeghiaghili/farsi-faker.git
cd farsi-faker
python -m venv venv && source venv/bin/activate
pip install -e ".[all]"
# quality checks
black farsi_faker/ && isort farsi_faker/ && mypy farsi_faker/
pytest tests/ -v
📁 Project Structure
farsi-faker/
├── farsi_faker/
│ ├── __init__.py
│ ├── faker.py ← core class
│ ├── _version.py
│ └── data/names.pkl
├── tests/test_faker.py
├── scripts/create_pickle.py
├── setup.py
├── pyproject.toml
├── CHANGELOG.md
└── README.md
🤝 Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Add tests for new functionality
- Run tests (pytest tests/)
- Commit (git commit -m 'Add amazing feature')
- Push and open a Pull Request
Code style: Black + isort. Type hints required. Docstrings required.
📄 License
MIT — see LICENSE.
📞 Contact
- Author: Ali Sadeghi Aghili
- GitHub: alisadeghiaghili/farsi-faker
- PyPI: pypi.org/project/farsi-faker
- Issues: github.com/alisadeghiaghili/farsi-faker/issues