AI-powered resume parser with parallel processing capabilities
ResumeParser Pro 🚀
Production-ready AI-powered resume parser with parallel processing capabilities. Extract structured data from resumes in PDF, DOCX, and TXT formats using state-of-the-art language models.
🌟 Features
- 🤖 AI-Powered: Uses advanced language models (GPT, Gemini, Claude, etc.)
- ⚡ Parallel Processing: Process multiple resumes simultaneously
- 📊 Structured Output: Returns clean, validated JSON data
- 🎯 High Accuracy: Extracts 20+ fields with intelligent categorization
- 📈 Production Ready: Robust error handling and logging
- 🔌 Easy Integration: Simple API with just 3 lines of code
🚀 Quick Start
Installation
pip install ai-resume-parser
For full functionality (recommended); the quotes keep shells such as zsh from expanding the brackets:
pip install "ai-resume-parser[full]"
Basic Usage
from resumeparser_pro import ResumeParserPro

# Initialize the parser
parser = ResumeParserPro(
    provider="google_genai",
    model_name="gemini-2.0-flash",
    api_key="your-api-key"
)

# Parse a single resume
result = parser.parse_resume("resume.pdf")
print(f"Name: {result.resume_data.contact_info.full_name}")
print(f"Experience: {result.resume_data.total_experience_months} months")
# Flexible approach (recommended): check for success before reading fields
if result.success:
    name = result.resume_data.contact_info.full_name
    experience = result.resume_data.total_experience_months
    # Quick overview (convenience method)
    print(result.get_summary())
    # Full data export
    resume_dict = result.model_dump()
Batch Processing
# Process multiple resumes in parallel
file_paths = ["resume1.pdf", "resume2.docx", "resume3.pdf"]
results = parser.parse_batch(file_paths)  # Returns a list of parsed resumes

# Get successful results
successful_resumes = parser.get_successful_resumes(results)
print(f"Parsed {len(successful_resumes)} resumes successfully")
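How `parse_batch` distributes work internally is not described here. As a rough illustration of the general pattern, the sketch below fans a list of paths out over a standard-library thread pool. Both `parse_one` and `parse_many` are hypothetical names used only for this example; they are not part of the package.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def parse_one(path: str) -> dict:
    """Hypothetical stand-in for a single-resume parse call (not the library's API)."""
    if not path.lower().endswith((".pdf", ".docx", ".txt")):
        raise ValueError(f"Unsupported file type: {path}")
    return {"file": path, "success": True}


def parse_many(paths: List[str],
               worker: Callable[[str], dict] = parse_one,
               max_workers: int = 4) -> List[dict]:
    """Run `worker` over `paths` in a thread pool, keeping input order.

    A failure in one file is recorded instead of aborting the whole batch.
    """
    def safe(path: str) -> dict:
        try:
            return worker(path)
        except Exception as exc:  # record the error and keep going
            return {"file": path, "success": False, "error": str(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs
        return list(pool.map(safe, paths))


results = parse_many(["resume1.pdf", "resume2.docx", "notes.xyz"])
successful = [r for r in results if r["success"]]
print(f"Parsed {len(successful)} of {len(results)} files")
```

Threads tend to suit this kind of workload because each parse spends most of its time waiting on a network call to the model provider rather than using the CPU.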
📊 Extracted Data
ResumeParser Pro extracts 20+ structured fields:
Contact Information
- Full name, email, phone number
- Location, LinkedIn, GitHub, portfolio
- Other social profiles
Professional Data
- Work experience with integer month durations
- Education with GPA standardization
- Skills categorized by type
- Projects with technologies and outcomes
- Certifications with dates and organizations
Metadata
- Total experience in months
- Industry classification
- Seniority level assessment
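To make "integer month durations" and "total experience in months" concrete, here is a minimal sketch of one way such values could be computed. The `Role` class and its field names are illustrative assumptions, not the package's actual schema, and the package may handle overlapping roles differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Role:
    """Hypothetical work-experience entry; field names are illustrative."""
    title: str
    start: date
    end: date

    @property
    def duration_months(self) -> int:
        # Whole calendar months between start and end, never negative
        months = (self.end.year - self.start.year) * 12 + (self.end.month - self.start.month)
        return max(months, 0)


def total_experience_months(roles: List[Role]) -> int:
    """Sum role durations; overlapping roles are counted twice in this simple version."""
    return sum(role.duration_months for role in roles)


roles = [
    Role("Data Analyst", date(2019, 3, 1), date(2021, 6, 1)),   # 27 months
    Role("ML Engineer", date(2021, 6, 1), date(2024, 1, 1)),    # 31 months
]
print(total_experience_months(roles))  # 58
```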
🎯 Supported AI Providers
Since ai-resume-parser uses LangChain's init_chat_model, it supports all LangChain-compatible providers:
Major Providers:
| Provider | Example Models | Setup |
|---|---|---|
| Google | Gemini 2.0 Flash, Gemini Pro, Gemini 1.5 | provider="google_genai" |
| OpenAI | GPT-4o, GPT-4o-mini, GPT-4 Turbo | provider="openai" |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus | provider="anthropic" |
| Azure OpenAI | GPT-4, GPT-3.5-turbo | provider="azure_openai" |
| AWS Bedrock | Claude, Llama, Titan | provider="bedrock" |
| Cohere | Command, Command-R | provider="cohere" |
| Mistral | Mistral Large, Mixtral | provider="mistralai" |
| Ollama | Local models (Llama, CodeLlama) | provider="ollama" |
| Together | Various open-source models | provider="together" |
Usage Examples:
# Google Gemini
# Requires the integration package: pip install langchain-google-genai
parser = ResumeParserPro(
    provider="google_genai",
    model_name="gemini-2.0-flash",
    api_key="your-google-api-key"
)

# Azure OpenAI
parser = ResumeParserPro(
    provider="azure_openai",
    model_name="gpt-4",
    api_key="your-azure-key"
)

# Local Ollama
parser = ResumeParserPro(
    provider="ollama",
    model_name="llama2:7b",
    api_key=""  # No API key needed for local models
)

# AWS Bedrock
parser = ResumeParserPro(
    provider="bedrock",
    model_name="anthropic.claude-3-sonnet-20240229-v1:0",
    api_key="your-aws-credentials"
)
Full list: See LangChain Model Providers for complete provider support.
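For readers unfamiliar with `init_chat_model`, the sketch below shows roughly how a provider name and model name could be handed to it. The helpers `chat_model_kwargs` and `build_chat_model` are hypothetical and are not the package's internals; whether a given integration accepts an `api_key` argument depends on that integration package, so treat that part as an assumption.

```python
from typing import Any, Dict


def chat_model_kwargs(provider: str, model_name: str,
                      api_key: str = "", temperature: float = 0.1) -> Dict[str, Any]:
    """Build keyword arguments for LangChain's init_chat_model.

    The api_key entry is left out when empty, e.g. for a local Ollama model.
    """
    kwargs: Dict[str, Any] = {
        "model": model_name,
        "model_provider": provider,
        "temperature": temperature,
    }
    if api_key:
        kwargs["api_key"] = api_key
    return kwargs


def build_chat_model(provider: str, model_name: str, api_key: str = ""):
    """Create a chat model; needs langchain plus the provider's integration package."""
    # Imported lazily so chat_model_kwargs works without LangChain installed
    from langchain.chat_models import init_chat_model
    return init_chat_model(**chat_model_kwargs(provider, model_name, api_key))


print(chat_model_kwargs("ollama", "llama2:7b"))
```

Calling `build_chat_model("openai", "gpt-4o-mini", "your-api-key")` would then require `langchain-openai` to be installed, in the same way the Gemini example requires `langchain-google-genai`.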
📈 Performance
- Speed: ~5-10 seconds per resume (depending on the LLM used)
- Parallel Processing: 5-10x faster than sequential processing for batch operations
- Accuracy: 95%+ field extraction accuracy
- File Support: PDF, DOCX, TXT formats
🛠️ Advanced Features
Custom Configuration
parser = ResumeParserPro(
    provider="openai",
    model_name="gpt-4o-mini",
    api_key="your-api-key",
    max_workers=10,   # Number of parallel processing workers
    temperature=0.1   # Lower values give more consistent output
)
Error Handling
# Keep failed files in the results so they can be reported
results = parser.parse_batch(file_paths, include_failed=True)

# Get a processing summary
summary = parser.get_summary(results)
print(f"Success rate: {summary['success_rate']:.1f}%")
print(f"Failed files: {len(summary['failed_files'])}")
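The summary keys used above (`success_rate`, `failed_files`) suggest a simple aggregation over per-file results. The following sketch shows one way such a summary could be derived; `FakeResult`, `summarize`, and the `total` key are hypothetical and only mirror the shape implied by the example, not the package's implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FakeResult:
    """Hypothetical stand-in for one parse result."""
    file_path: str
    success: bool


def summarize(results: List[FakeResult]) -> Dict[str, object]:
    """Compute the success percentage and collect the paths that failed."""
    failed = [r.file_path for r in results if not r.success]
    total = len(results)
    # Guard against an empty batch to avoid dividing by zero
    rate = 100.0 * (total - len(failed)) / total if total else 0.0
    return {"total": total, "success_rate": rate, "failed_files": failed}


summary = summarize([
    FakeResult("resume1.pdf", True),
    FakeResult("resume2.docx", True),
    FakeResult("resume3.pdf", False),
    FakeResult("resume4.txt", True),
])
print(f"Success rate: {summary['success_rate']:.1f}%")  # Success rate: 75.0%
print(f"Failed files: {summary['failed_files']}")       # Failed files: ['resume3.pdf']
```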
📋 Requirements
- Python 3.8+
- API key from supported provider
- Optional: PyMuPDF, python-docx for enhanced file support
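To check whether the optional packages are available before parsing PDF or DOCX files, a small standard-library probe like the one below can help. The `optional_support` helper is an assumption made for this example and is not provided by the package.

```python
from importlib.util import find_spec
from typing import Dict

# Import names differ from the PyPI names: PyMuPDF installs `fitz`
# (newer releases also provide `pymupdf`), python-docx installs `docx`.
OPTIONAL_MODULES = {"PyMuPDF": ("pymupdf", "fitz"), "python-docx": ("docx",)}


def optional_support() -> Dict[str, bool]:
    """Report which optional packages can be imported in this environment."""
    return {
        package: any(find_spec(name) is not None for name in names)
        for package, names in OPTIONAL_MODULES.items()
    }


print(optional_support())
```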
🤝 Contributing
Contributions welcome! Please read our contributing guidelines.
📄 License
MIT License - see LICENSE file for details.
🆘 Support
Built with ❤️ for the recruitment and HR community
Download files
File details
Details for the file ai_resume_parser-1.0.2.tar.gz.
File metadata
- Download URL: ai_resume_parser-1.0.2.tar.gz
- Upload date:
- Size: 12.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7b71c69ee314de263f55264da28399444caba57f8576bde1c6b64d2c1bdcd8c6 |
| MD5 | d3b6f121f8260d4f3a70541254b8f159 |
| BLAKE2b-256 | c9c6cb8bf19bf2ef2832c1c6e9cfaaf102d83ade40ab065019e4a481bc583b68 |
File details
Details for the file ai_resume_parser-1.0.2-py3-none-any.whl.
File metadata
- Download URL: ai_resume_parser-1.0.2-py3-none-any.whl
- Upload date:
- Size: 14.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1007d89dcfbd66d2783b6759526d9cf2337f01cf939c544967cab11d51b46b53 |
| MD5 | 8841d42b8de5003327eb3938acf00a21 |
| BLAKE2b-256 | eece340adfc91a46137eeb1937cbe6a2b459c250eaa6cfdb27fdf31c8042369d |