Smart Recruitment Utility Library — job matching, resume parsing, validation & response formatting
☁️ CloudHire AI Utils
Smart Recruitment Utility Library for the CloudHire Platform
A Python library of recruitment utilities for the CloudHire platform: job matching (weighted, TF-IDF, and Jaccard similarity scoring), resume skill extraction, skill normalisation, resume-to-job matching, job validation, and API response formatting.
🚀 Why This Library is Important
♻️ Reusability
This library is completely independent of the main CloudHire application. It can be installed and used in any Python project that deals with recruitment, job portals, HR systems, or career platforms. Write once, use everywhere.
📈 Scalability
Each module follows the Single Responsibility Principle — it does one thing well. As the platform grows, you can extend individual modules (e.g., add new scoring algorithms to the recommender) without touching unrelated code.
🏗️ Clean Architecture
The library demonstrates separation of concerns — business logic (matching, parsing, validation) lives in the library, while the web application layer (routes, controllers) stays in the main app. This makes the codebase easier to test, debug, and maintain.
🧠 Intelligence
Beyond simple CRUD operations, this library adds AI-inspired scoring logic: TF-IDF scoring, Jaccard similarity, and weighted recommendations.
📦 Standards-Compliant
Published as a proper pip-installable package following Python packaging standards (setup.py, pyproject.toml, semantic versioning). Production-ready with built-in logging across all modules.
📦 Installation
From PyPI
pip install cloudhire-ai-utils
From source (local development)
git clone https://github.com/venkatsai/cloudhire-ai-utils.git
cd cloudhire-ai-utils
pip install -e .
🏗️ Library Structure
cloudhire-ai-utils/
├── cloudhire_ai_utils/
│ ├── __init__.py # Package entry point (exports all classes)
│ ├── recommender.py # 🔥 Job Recommendation Engine (weighted + TF-IDF + similarity)
│ ├── parser.py # 📝 Resume Skill Extractor
│ ├── normalizer.py # 🔄 Skill Normalizer (synonym mapping)
│ ├── matcher.py # 🎯 Resume-to-Job Matcher (high-level pipeline)
│ ├── validator.py # ✅ Job Validation Module
│ └── formatter.py # 📦 API Response Formatter
├── setup.py
├── pyproject.toml
├── LICENSE
├── test_library.py
└── README.md
⚙️ Modules & Usage
1️⃣ Job Recommendation Engine (recommender.py) 🔥
Matches candidate skills against job listings using 3 scoring strategies:
- Weighted scoring — custom skill weights for prioritisation
- TF-IDF scoring — term frequency-inverse document frequency
- Jaccard similarity — set-based similarity coefficient
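The weighted and Jaccard strategies listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the library's implementation: the function names `weighted_score` and `jaccard_similarity`, and the default weight of 1 for skills without an explicit weight, are assumptions.

```python
from typing import Dict, Iterable


def jaccard_similarity(candidate: Iterable[str], required: Iterable[str]) -> float:
    """Return |A & B| / |A | B| for two skill collections (0.0 if both are empty)."""
    a, b = set(candidate), set(required)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_score(candidate: Iterable[str], required: Iterable[str],
                   weights: Dict[str, float], default_weight: float = 1.0) -> float:
    """Sum the weight of every required skill the candidate also has.

    Assumption: skills missing from `weights` count with `default_weight`.
    """
    matched = set(candidate) & set(required)
    return sum(weights.get(skill, default_weight) for skill in matched)


candidate = ["python", "aws"]
backend = ["python", "sql", "aws"]

# 2 shared skills out of 3 distinct skills overall
print(jaccard_similarity(candidate, backend))
# python (3) + aws (2)
print(weighted_score(candidate, backend, {"python": 3, "aws": 2}))
```

Jaccard penalises jobs that list many skills the candidate lacks, while the weighted score only rewards matches, so the two strategies can rank the same jobs differently.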
from cloudhire_ai_utils import JobRecommender
recommender = JobRecommender(skill_weights={"python": 3, "aws": 2})
jobs = [
    {"id": 1, "title": "Backend Developer", "skills": ["python", "sql", "aws"]},
    {"id": 2, "title": "Frontend Developer", "skills": ["react", "css", "javascript"]},
    {"id": 3, "title": "Data Scientist", "skills": ["python", "machine learning", "pandas"]},
]
# Weighted scoring (default)
results = recommender.recommend_jobs(["python", "aws"], jobs)
for job, score in results:
    print(f"  {job['title']} - Score: {score}")
# Jaccard Similarity scoring
sim_results = recommender.recommend_by_similarity(["python", "aws"], jobs)
for job, sim in sim_results:
    print(f"  {job['title']} - Similarity: {sim}")
# TF-IDF scoring
tfidf_results = recommender.recommend_by_tfidf(["python", "aws"], jobs)
for job, score in tfidf_results:
    print(f"  {job['title']} - TF-IDF: {score}")
# Detailed recommendation (includes all scores)
detailed = recommender.recommend_jobs_detailed(["python", "aws"], jobs)
for r in detailed:
    print(f"  {r['job']['title']}")
    print(f"    Match: {r['match_pct']}% | Similarity: {r['similarity']} | TF-IDF: {r['tfidf_score']}")
    print(f"    Matched: {r['matched_skills']}")
    print(f"    Missing: {r['missing_skills']}")
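TF-IDF scoring gives a matched skill more weight when few job listings ask for it. The sketch below shows one way to compute such a score over a list of jobs. It is an illustration under stated assumptions, not the library's code: the helper names, the smoothed IDF formula, and the choice to sum TF x IDF over matched skills are all assumed here.

```python
import math
from collections import Counter
from typing import Dict, List


def inverse_document_frequency(jobs: List[dict]) -> Dict[str, float]:
    """Smoothed IDF per skill: log((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n = len(jobs)
    # df = number of jobs that list the skill at least once
    df = Counter(skill for job in jobs for skill in set(job["skills"]))
    return {skill: math.log((1 + n) / (1 + count)) + 1 for skill, count in df.items()}


def tfidf_score(candidate: List[str], job: dict, idf: Dict[str, float]) -> float:
    """Sum TF * IDF over the job's skills that the candidate also has."""
    skills = job["skills"]
    if not skills:
        return 0.0
    tf = Counter(skills)
    wanted = set(candidate)
    return sum((tf[s] / len(skills)) * idf.get(s, 0.0) for s in tf if s in wanted)


jobs = [
    {"id": 1, "title": "Backend Developer", "skills": ["python", "sql", "aws"]},
    {"id": 2, "title": "Frontend Developer", "skills": ["react", "css", "javascript"]},
    {"id": 3, "title": "Data Scientist", "skills": ["python", "machine learning", "pandas"]},
]

idf = inverse_document_frequency(jobs)
ranked = sorted(jobs, key=lambda j: tfidf_score(["python", "aws"], j, idf), reverse=True)
print([job["title"] for job in ranked])
```

In this data set "aws" appears in one job and "python" in two, so a match on "aws" contributes more than a match on "python".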
2️⃣ Skill Normalizer (normalizer.py) 🔄
Converts skill synonyms and abbreviations to canonical forms. Ensures "js" and "javascript" are treated as the same skill.
from cloudhire_ai_utils import SkillNormalizer
normalizer = SkillNormalizer()
# Single skill normalisation
print(normalizer.normalize("js")) # → "javascript"
print(normalizer.normalize("py")) # → "python"
print(normalizer.normalize("k8s")) # → "kubernetes"
print(normalizer.normalize("React.js")) # → "react"
# Normalise a list
skills = normalizer.normalize_list(["js", "py", "k8s", "node", "ML"])
print(skills) # ['javascript', 'kubernetes', 'machine learning', 'nodejs', 'python']
# Add custom synonyms at runtime
normalizer.add_synonym("RoR", "ruby on rails")
# Get all aliases for a canonical name
print(normalizer.get_synonyms_for("javascript"))
# ['js', 'javascript']
60+ built-in synonym mappings covering programming languages, frameworks, cloud platforms, databases, DevOps tools, and more.
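Synonym mapping of this kind usually comes down to a case-insensitive dictionary lookup. The following sketch shows the idea with a small hypothetical alias table; the `ALIASES` dictionary and the two functions are illustrative and do not reproduce the library's built-in mappings or internals.

```python
from typing import Dict, Iterable, List

# Hypothetical alias table (alias -> canonical name), for illustration only.
ALIASES: Dict[str, str] = {
    "js": "javascript",
    "py": "python",
    "k8s": "kubernetes",
    "react.js": "react",
    "node": "nodejs",
    "ml": "machine learning",
}


def normalize(skill: str, aliases: Dict[str, str] = ALIASES) -> str:
    """Lower-case and trim the skill, then map it to its canonical form if known."""
    key = skill.strip().lower()
    return aliases.get(key, key)


def normalize_list(skills: Iterable[str], aliases: Dict[str, str] = ALIASES) -> List[str]:
    """Normalise every skill, drop duplicates, and return them sorted."""
    return sorted({normalize(s, aliases) for s in skills})


print(normalize("React.js"))
print(normalize_list(["js", "JavaScript", "py", "k8s"]))
```

Using a set before sorting means "js" and "JavaScript" collapse into a single entry, which is what keeps later matching from double-counting a skill.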
3️⃣ Resume Skill Extractor (parser.py) 📝
Extracts technical and soft skills from raw resume text.
from cloudhire_ai_utils import ResumeParser
parser = ResumeParser()
resume_text = """
John Doe — Software Engineer
Experienced in Python, AWS, and React development.
Strong background in SQL databases and Docker containerisation.
Excellent leadership and communication skills.
"""
# Extract skills
skills = parser.extract_skills(resume_text)
print("Skills found:", skills)
# Get full summary
summary = parser.extract_summary(resume_text)
print(f"Total skills: {summary['skill_count']}")
print(f"Word count: {summary['word_count']}")
# Skills with mention count
counts = parser.extract_skills_with_count(resume_text)
print(counts) # {'aws': 1, 'python': 1, 'react': 1, ...}
# Custom skills database
custom_parser = ResumeParser(custom_skills=["flutter", "dart"])
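One common way to implement this kind of extraction is to search the text for each entry of a known-skills list with a boundary-aware regular expression. The sketch below assumes that approach; the `KNOWN_SKILLS` list and the function names are hypothetical, and the library may work differently.

```python
import re
from typing import Dict, Iterable, List

# Hypothetical skills database, for illustration only.
KNOWN_SKILLS = ["python", "aws", "react", "sql", "docker", "leadership", "communication"]


def _pattern(skill: str) -> "re.Pattern[str]":
    # Lookarounds instead of \b so that skills ending in symbols (e.g. "c++") still match.
    return re.compile(r"(?<!\w)" + re.escape(skill) + r"(?!\w)", re.IGNORECASE)


def count_skills(text: str, known: Iterable[str] = KNOWN_SKILLS) -> Dict[str, int]:
    """Return {skill: number of mentions} for every known skill found in the text."""
    counts = {}
    for skill in known:
        hits = len(_pattern(skill).findall(text))
        if hits:
            counts[skill] = hits
    return counts


def extract_skills(text: str, known: Iterable[str] = KNOWN_SKILLS) -> List[str]:
    """Return the sorted list of known skills mentioned at least once."""
    return sorted(count_skills(text, known))


resume_text = """
Experienced in Python, AWS, and React development.
Strong background in SQL databases and Docker containerisation.
Excellent leadership and communication skills.
"""
print(extract_skills(resume_text))
```

The boundary checks matter: without them "sql" would also match inside "MySQL" and "react" inside "reactive".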
4️⃣ Resume-to-Job Matcher (matcher.py) 🎯
High-level pipeline that combines Parser + Normalizer + Recommender in one call.
from cloudhire_ai_utils import ResumeJobMatcher
matcher = ResumeJobMatcher(skill_weights={"python": 3, "aws": 2})
resume = "Experienced in py, JS, AWS, and k8s development..."
jobs = [
    {"id": 1, "title": "Backend Dev", "skills": ["python", "aws", "docker"]},
    {"id": 2, "title": "Frontend Dev", "skills": ["react", "javascript"]},
]
# One-call matching (extracts → normalises → matches)
results = matcher.match(resume, jobs)
for r in results:
    print(f"  {r['job']['title']} - Score: {r['score']} | Match: {r['match_pct']}%")
# Analyse resume without matching
analysis = matcher.analyse_resume(resume)
print(f" Raw skills: {analysis['raw_skills']}")
print(f" Normalised: {analysis['normalised_skills']}")
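The extract, normalise, match pipeline can be pictured as three small functions chained together. The sketch below is a simplified stand-in written for illustration: the alias table, the vocabulary, the token-based extraction, and the definition of `match_pct` as the share of required skills that were matched are all assumptions, not the library's behaviour.

```python
import re
from typing import List

# Hypothetical stand-ins for the library's synonym table and skills database.
ALIASES = {"py": "python", "js": "javascript", "k8s": "kubernetes"}
VOCABULARY = {"python", "javascript", "aws", "kubernetes", "docker", "react"}


def extract(text: str) -> List[str]:
    """Step 1: keep tokens that are known skills or known aliases."""
    tokens = (t.strip(".") for t in re.findall(r"[a-z0-9+#.]+", text.lower()))
    return [t for t in tokens if t in ALIASES or t in VOCABULARY]


def normalise(skills: List[str]) -> List[str]:
    """Step 2: map aliases to canonical names and drop duplicates."""
    return sorted({ALIASES.get(s, s) for s in skills})


def match(resume: str, jobs: List[dict]) -> List[dict]:
    """Step 3: compare the normalised skills with each job's requirements."""
    skills = set(normalise(extract(resume)))
    results = []
    for job in jobs:
        required = set(job["skills"])
        matched = skills & required
        pct = round(100 * len(matched) / len(required), 1) if required else 0.0
        results.append({
            "job": job,
            "match_pct": pct,
            "matched_skills": sorted(matched),
            "missing_skills": sorted(required - matched),
        })
    # Best match first
    return sorted(results, key=lambda r: r["match_pct"], reverse=True)


resume = "Experienced in py, JS, AWS, and k8s development..."
jobs = [
    {"id": 1, "title": "Backend Dev", "skills": ["python", "aws", "docker"]},
    {"id": 2, "title": "Frontend Dev", "skills": ["react", "javascript"]},
]
for r in match(resume, jobs):
    print(r["job"]["title"], r["match_pct"], r["missing_skills"])
```

Normalising before matching is what lets "py" in the resume count towards a job that asks for "python".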
5️⃣ Job Validation Module (validator.py) ✅
Validates job posting data before saving.
from cloudhire_ai_utils import JobValidator
from cloudhire_ai_utils.validator import ValidationError
validator = JobValidator()
# Valid job
validator.validate({
    "title": "Python Developer",
    "skills": ["python", "django"],
    "salary_min": 40000,
    "salary_max": 60000,
    "status": "Active",
}) # Returns True
# Get errors without exception
errors = validator.get_errors({})
# ["'title' is required...", "'skills' is required..."]
# Add custom rule
validator.add_rule(
    lambda j: len(j.get("skills", [])) <= 10,
    "Maximum 10 skills allowed"
)
# Quick boolean check
print(validator.is_valid({"title": "Dev", "skills": ["python"]})) # True
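A validator that accepts custom rules can be modelled as a list of (check, message) pairs evaluated in order. The sketch below illustrates that design under assumptions: `SketchValidator` and `SketchValidationError` are hypothetical names, and the two built-in rules are simplified stand-ins rather than the library's actual checks.

```python
from typing import Callable, List, Tuple

Rule = Tuple[Callable[[dict], bool], str]


class SketchValidationError(Exception):
    """Raised by validate(); carries the individual error messages."""

    def __init__(self, errors: List[str]):
        super().__init__("; ".join(errors))
        self.errors = list(errors)


class SketchValidator:
    def __init__(self) -> None:
        # Each rule is (check, message); check returns True when the job passes.
        self._rules: List[Rule] = [
            (lambda j: bool(j.get("title")), "'title' is required"),
            (lambda j: bool(j.get("skills")), "'skills' is required"),
        ]

    def add_rule(self, check: Callable[[dict], bool], message: str) -> None:
        self._rules.append((check, message))

    def get_errors(self, job: dict) -> List[str]:
        return [message for check, message in self._rules if not check(job)]

    def is_valid(self, job: dict) -> bool:
        return not self.get_errors(job)

    def validate(self, job: dict) -> bool:
        errors = self.get_errors(job)
        if errors:
            raise SketchValidationError(errors)
        return True


validator = SketchValidator()
validator.add_rule(lambda j: len(j.get("skills", [])) <= 10, "Maximum 10 skills allowed")
print(validator.get_errors({"title": "Dev", "skills": ["s"] * 11}))
```

Keeping `get_errors()` as the single source of truth lets `is_valid()` and `validate()` stay one-liners, and new rules never require changes to the class itself.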
6️⃣ API Response Formatter (formatter.py) 📦
Standardises all API responses across the platform.
from cloudhire_ai_utils import ResponseFormatter
# Success / Error
ResponseFormatter.success({"id": 1, "title": "Developer"})
ResponseFormatter.error("Job not found", code=404)
# Pagination
ResponseFormatter.paginated(data=[...], page=1, per_page=10, total=50)
# Shortcuts
ResponseFormatter.created({"id": 5})
ResponseFormatter.deleted(resource_id=5)
ResponseFormatter.not_found("Job")
ResponseFormatter.unauthorized()
ResponseFormatter.validation_error(["Title is required"])
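The exact shape of the returned dictionaries is not shown here, so the following is only a sketch of what a standard response envelope with pagination metadata might look like. Every key name (`success`, `data`, `error`, `pagination`, `total_pages`, `has_next`, `has_prev`) is an assumption made for illustration.

```python
import math
from typing import Any, List


def success(data: Any = None, message: str = "OK") -> dict:
    """Hypothetical success envelope."""
    return {"success": True, "message": message, "data": data}


def error(message: str, code: int = 400) -> dict:
    """Hypothetical error envelope."""
    return {"success": False, "error": {"code": code, "message": message}}


def paginated(data: List[Any], page: int, per_page: int, total: int) -> dict:
    """Success envelope plus pagination metadata derived from the counts."""
    total_pages = math.ceil(total / per_page) if per_page > 0 else 0
    body = success(data)
    body["pagination"] = {
        "page": page,
        "per_page": per_page,
        "total": total,
        "total_pages": total_pages,
        "has_next": page < total_pages,
        "has_prev": page > 1,
    }
    return body


# 50 items at 10 per page gives 5 pages
print(paginated(data=[], page=1, per_page=10, total=50)["pagination"])
print(error("Job not found", code=404))
```

Whatever the real key names are, returning the same envelope from every endpoint lets clients check a single field before reading the payload.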
📊 Logging (Production-Ready)
All modules log through Python's standard logging module, with one logger per module, for production observability.
import logging
# Enable logging to see library activity (the format string matches the sample lines below)
logging.basicConfig(level=logging.INFO, format="%(levelname)s | %(name)s | %(message)s")
# Now all cloudhire_ai_utils operations will log, for example:
# INFO | cloudhire_ai_utils.recommender | Found 3 matching jobs (min_score=1)
# INFO | cloudhire_ai_utils.parser | Extracted 7 skills from text (255 chars)
# WARNING | cloudhire_ai_utils.validator | Validation failed with 2 errors
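Because the logger names in the sample output sit under the `cloudhire_ai_utils` namespace, an application can tune the library's verbosity separately from its own logging. The sketch below assumes that naming and uses hand-written log calls as stand-ins for records the library's modules would emit.

```python
import io
import logging

# Capture output in memory so the effect is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s | %(name)s | %(message)s"))

# Configure only the library's namespace: keep warnings, hide INFO messages.
lib_logger = logging.getLogger("cloudhire_ai_utils")
lib_logger.addHandler(handler)
lib_logger.setLevel(logging.WARNING)
lib_logger.propagate = False  # do not duplicate records through the root logger

# Stand-ins for records the library's modules would emit (assumed logger names).
logging.getLogger("cloudhire_ai_utils.parser").info("Extracted 7 skills from text (255 chars)")
logging.getLogger("cloudhire_ai_utils.validator").warning("Validation failed with 2 errors")

print(stream.getvalue(), end="")
```

Child loggers inherit the effective level of `cloudhire_ai_utils`, so one `setLevel()` call covers every module in the package.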
🔗 Integration with CloudHire Backend
Flask Example
from flask import Flask, request, jsonify
from cloudhire_ai_utils import (
    ResumeJobMatcher, JobValidator, ResponseFormatter, SkillNormalizer
)
app = Flask(__name__)
matcher = ResumeJobMatcher()
validator = JobValidator()
normalizer = SkillNormalizer()
# get_jobs_from_db() and save_to_db() are placeholders for the application's own data layer.
@app.route("/api/recommend", methods=["POST"])
def recommend():
    resume_text = request.json.get("resume", "")
    jobs = get_jobs_from_db()
    results = matcher.match(resume_text, jobs)
    return jsonify(ResponseFormatter.success(results))
@app.route("/api/parse-resume", methods=["POST"])
def parse_resume():
    text = request.json.get("text", "")
    analysis = matcher.analyse_resume(text)
    return jsonify(ResponseFormatter.success(analysis))
@app.route("/api/jobs", methods=["POST"])
def create_job():
    job_data = request.json
    # Normalise skills before saving
    job_data["skills"] = normalizer.normalize_list(job_data.get("skills", []))
    errors = validator.get_errors(job_data)
    if errors:
        return jsonify(ResponseFormatter.validation_error(errors)), 422
    saved = save_to_db(job_data)
    return jsonify(ResponseFormatter.created(saved)), 201
🧠 OOP Concepts Used
| Concept | Where Used |
|---|---|
| Encapsulation | Non-public members marked with a leading underscore (_normalise, _clean_text, _synonyms) |
| Abstraction | Simple public API hides complex logic |
| Inheritance | ValidationError extends Exception |
| Polymorphism | extract_skills() works on any string input |
| Composition | ResumeJobMatcher composes Parser + Normalizer + Recommender |
| Static Methods | ResponseFormatter uses @staticmethod |
| Facade Pattern | ResumeJobMatcher.match() — single entry point |
| Strategy Pattern | Multiple scoring algorithms in JobRecommender |
| Single Responsibility | Each module handles exactly one concern |
| Open/Closed | Custom rules via add_rule(), add_synonym() without modification |
📋 Requirements
- Python 3.8+
- No external dependencies (pure Python!)
🚀 Build & Publish to PyPI
# Install build tools
pip install build twine
# Build the package
python -m build
# Upload to PyPI
twine upload dist/*
# Or upload to TestPyPI first
twine upload --repository testpypi dist/*
After publishing:
pip install cloudhire-ai-utils
📄 License
MIT License — see LICENSE for details.
👤 Author
Venkatsai — CloudHire Platform
File details
Details for the file cloudhire_ai_utils-1.1.0.tar.gz.
File metadata
- Download URL: cloudhire_ai_utils-1.1.0.tar.gz
- Upload date:
- Size: 21.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 15bc9c87dc26d2ed28e4998c44e5debc2594ea0e5387be7af17e2505b341b876 |
| MD5 | d9d957e4b9be73cca963980c6d684515 |
| BLAKE2b-256 | 2cb94cea7172f49471d6741ce9eaf12465030e1b2faa6211ee2ea4eea0584bcf |
File details
Details for the file cloudhire_ai_utils-1.1.0-py3-none-any.whl.
File metadata
- Download URL: cloudhire_ai_utils-1.1.0-py3-none-any.whl
- Upload date:
- Size: 20.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7780939dd06b6b28f15b7030c0f1abf7602bdb97c9280014604f91e08b42958d |
| MD5 | 162943dd55bd17195a2622a4b446a6b7 |
| BLAKE2b-256 | 6a9766195f332cb389f326323039b726ea9619275be9b714ef002f19c98a70e0 |