A command-line CV Analyzer tool that processes PDF resumes and performs analysis.

Writeup - Software Engineering Project - 22916 2025A

Writeup is a command-line interface (CLI) tool that analyzes CVs (curricula vitae) in PDF format. It lets users, particularly HR professionals, evaluate CVs against specific job descriptions. The tool provides a match score, highlights relevant skills, and generates a summary of the candidate's suitability for the role. The analysis can be run on an individual CV or in batch mode on all CVs in a directory. The project uses the Gemini API for text analysis and can write results as JSON or PDF, or display them in the terminal.

Authors

Contributor	GitHub Profile
@dolev146	GitHub
@shimshon21	GitHub
@yakov103	GitHub

Phase 1: Requirements Engineering

llm_link_to_chat_phase_1

Consult with an LLM to define ONE significant and interesting CV analysis feature.

prompt : "I am a software developer developing a terminal Python app that gets a CV (curriculum vitae) PDF file as input and a text prompt from the user, for example, "Does this CV match the iOS developer?" I want to consult with you about what could be a significant and interesting analysis feature I can implement in my software. Document the feature's requirements clearly. Include acceptance criteria. The users of our software will be Human resources trying to find the best job applicant by role, seniority and relevent skills."

Document the feature's requirements clearly.

Feature Overview

Feature Name

Role Matching & Competency Analysis

Description:

The Role Matching & Competency Analysis feature takes a PDF CV and a text prompt describing the job role (e.g., “iOS Developer”, or the requirements from a specific job post). It then analyzes the CV content for skills, experience, and qualifications relevant to that role. The outputs are:

Numerical score between 0 and 10
Years of experience
Summary
Relevant skills
Other skills
Pros
Cons

User Stories

As an HR professional, I want to input a CV and a prompt describing the job position so that I can receive a quick assessment of how well the CV matches the role.
As an HR professional, I want to see a breakdown of relevant skills and experiences extracted from the CV so that I can understand the candidate's strengths and areas where they might fall short.
As an HR professional, I want a confidence or match score that summarizes the overall fit so that I can compare multiple applicants easily.

Functional Requirements

File Parsing & Text Extraction: Accept PDF input via the command line and extract text reliably (up to ~10 pages).
Scoring Mechanism: Generate a 0-10 score based on keyword frequency, critical experience, and years of experience.
Skills: List the relevant skills most likely to be required for the position, and separately list other skills that are less relevant to it.
Pros and Cons: List the candidate's positive and negative points for the position.

Non-Functional Requirements

Performance

The system should provide results for a typical 2–5 page CV within a few seconds. Larger CVs (up to ~10 pages) should be processed within an acceptable time (under 30 seconds, for example).

Reliability

The system should reliably parse PDFs with standard text. If the PDF is scanned or image-based, the system may return an error or a warning unless Optical Character Recognition (OCR) is implemented.

Maintainability

The code should be modular, with separate functions for PDF parsing, text analysis, scoring, and summarization, allowing for easy updates or improvements to each component.

Scalability

The feature should be designed so that new skills or roles can be added easily to the matching algorithm (e.g., by adding or updating a skills dictionary or training an NLP model).
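
As a rough illustration of the "skills dictionary" idea mentioned above, the sketch below splits skills found in a CV into relevant and other skills for a role. The `ROLE_SKILLS` table, its entries, and the `split_skills` helper are hypothetical and are not taken from the project, which delegates the analysis to the Gemini API.

```python
# Hypothetical skills dictionary: role names and skill sets are illustrative only.
ROLE_SKILLS: dict[str, set[str]] = {
    "ios developer": {"swift", "objective-c", "uikit"},
    "data scientist": {"python", "pandas", "scikit-learn"},
}


def split_skills(role: str, cv_skills: list[str]) -> tuple[list[str], list[str]]:
    """Split skills found in a CV into (relevant, other) for the given role."""
    wanted = ROLE_SKILLS.get(role.strip().lower(), set())
    relevant = [s for s in cv_skills if s.lower() in wanted]
    other = [s for s in cv_skills if s.lower() not in wanted]
    return relevant, other


relevant, other = split_skills("iOS Developer", ["Swift", "UIKit", "Kotlin"])
print(relevant)  # ['Swift', 'UIKit']
print(other)     # ['Kotlin']
```

Supporting a new role under this assumption would only require a new entry in the dictionary; an unknown role simply yields no relevant skills.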

Include acceptance criteria

Acceptance Criteria

1. CV Input & Prompt Capture

Given a valid PDF file path and a relevant prompt, when the user runs the software in the terminal, then the software extracts text from the PDF and processes the prompt without error.

2. Analysis & Scoring

Acceptance Criteria 1: Given a valid PDF that includes a role (e.g., “iOS developer”) and a seniority (e.g., Junior), when the user runs the analysis, then the system outputs: a match score between 0 and 10, the years of experience, a short textual summary describing the candidate’s suitability, a list of detected relevant skills (e.g., “Swift,” “Objective-C,” “UIKit”), a list of other skills (e.g., “Java,” “Kotlin,” “Android”), a list of pros, and a list of cons.

Acceptance Criteria 2: Given a valid PDF that lacks any relevant skills for the prompt, when the user runs the analysis, then the system outputs a low match score (e.g., near 0) and indicates that no relevant skills were found.

Invalid CV or Prompt

Acceptance Criteria 3 (not implemented): Given an invalid or non-existent PDF file path, when the user runs the software, then the system provides an error message and does not crash. Given an empty or nonsensical prompt (e.g., “asdfghjk”), when the user runs the software, then the system warns the user that the role could not be identified and proceeds with minimal or no matching.

Performance

Acceptance Criteria 4: Given a CV of ~5 pages, when the user runs the analysis, then the system should return results within a few seconds (e.g., under 20 seconds).
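
The criterion about invalid inputs is marked as not implemented. A minimal sketch of the file-path and empty-prompt checks it describes might look like the following; `validate_inputs` and its messages are hypothetical, and detecting a nonsensical prompt such as “asdfghjk” is out of scope here.

```python
from pathlib import Path


def validate_inputs(pdf_path: str, prompt: str) -> list[str]:
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    path = Path(pdf_path)
    if not path.is_file():
        problems.append(f"File not found: {pdf_path}")
    elif path.suffix.lower() != ".pdf":
        problems.append(f"Not a PDF file: {pdf_path}")
    if not prompt.strip():
        problems.append("Prompt is empty; the role could not be identified.")
    return problems
```

Returning messages instead of raising lets the CLI print every problem at once and exit cleanly rather than crash.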

Document LLM interactions (link). chat

Phase 2: Architecture

Define Command-Line Interface Specification (Inline)

CLI Commands

`writeup evaluate`

Evaluates a single CV PDF against a specific position and seniority level, optionally overriding the default API key. Generates a structured report in either JSON or PDF format.

Usage

writeup evaluate [OPTIONS] FILE

Description

FILE (Positional Argument, Required)
Path to the CV PDF file that you want to evaluate.

Options

--position, -p (TEXT, Required)
The name/title of the position you are evaluating for (e.g., "Software Engineer").
--seniority, -s (TEXT, Required)
The seniority level of the position (e.g., "Mid-Level," "Senior," etc.).
--output, -o (TEXT)
The output file name for the generated report.
If omitted, a default name is used:
- evaluation_report.json if --format json
- evaluation_report.pdf if --format pdf
--format, -f (TEXT, default: json)
The report format. Choose json or pdf.
--api-key, -t (TEXT)
An optional override for the GEMINI_API_KEY environment variable.
Use this if you do not want to rely on the API key from .env.
--help
Displays usage information and exits.

Example

writeup evaluate resume.pdf -p "Data Scientist" -s "Senior" --format pdf

This command evaluates resume.pdf for a Senior Data Scientist role, then creates and saves a PDF report (by default, named evaluation_report.pdf).
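
To make the option table concrete, here is a sketch of the same interface. The project itself is built with Typer; this example deliberately uses the standard library's `argparse` instead so that it runs without extra dependencies, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented `writeup evaluate` options with argparse."""
    parser = argparse.ArgumentParser(prog="writeup")
    sub = parser.add_subparsers(dest="command", required=True)
    ev = sub.add_parser("evaluate", help="Evaluate a single CV PDF")
    ev.add_argument("file", metavar="FILE", help="Path to the CV PDF file")
    ev.add_argument("--position", "-p", required=True, help="Position title")
    ev.add_argument("--seniority", "-s", required=True, help="Seniority level")
    ev.add_argument("--output", "-o", default=None, help="Report file name")
    ev.add_argument("--format", "-f", choices=["json", "pdf"], default="json")
    ev.add_argument("--api-key", "-t", default=None, help="Overrides GEMINI_API_KEY")
    return parser


args = build_parser().parse_args(
    ["evaluate", "resume.pdf", "-p", "Data Scientist", "-s", "Senior", "--format", "pdf"]
)
print(args.file, args.position, args.seniority, args.format)
# resume.pdf Data Scientist Senior pdf
```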

Plan file system interactions, i.e., input/output (inline).

Inputs

The system accepts PDF files as input.
Input files can be specified using a file path or directory path.
When evaluating multiple files in a batch, the system will scan the specified directory for files matching the .pdf extension.
The input file(s) must be accessible and readable by the system.
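
The directory scan for batch mode could be sketched as below. `find_pdfs` is a hypothetical helper, and the choice to look only at the top level of the directory (no recursion) is an assumption, since the text does not say either way.

```python
from pathlib import Path


def find_pdfs(directory: str) -> list[Path]:
    """Return the .pdf files directly inside `directory`, sorted by name."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"Not a directory: {directory}")
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

An empty result corresponds to the "no PDF files found" case, which the caller can report as an error.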

Outputs:

Output files are generated in JSON or PDF format.
The default output file name follows this pattern:
- evaluation_report.json – for single-file JSON reports
- evaluation_report.pdf – for single-file PDF reports
- batch_evaluation_report.json – for batch JSON reports
- batch_evaluation_report.pdf – for batch PDF reports
If an output filename is specified using the --output option, the system will override the default name.
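
The naming rules above fit in one small function. This is a sketch only; `report_filename` and its parameter names are hypothetical.

```python
from typing import Optional


def report_filename(fmt: str = "json", batch: bool = False,
                    output: Optional[str] = None) -> str:
    """Pick the report file name: an explicit --output wins over the defaults."""
    if fmt not in ("json", "pdf"):
        raise ValueError(f"Unsupported format: {fmt!r} (choose json or pdf)")
    if output:
        return output
    prefix = "batch_" if batch else ""
    return f"{prefix}evaluation_report.{fmt}"


print(report_filename("pdf"))               # evaluation_report.pdf
print(report_filename("json", batch=True))  # batch_evaluation_report.json
```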

Your feature may use additional files for input and output.

A .env file or config file could store default API_KEY and other environment variables.

A logs directory could store logs if needed for debugging or usage reporting.
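
The API key lookup implied by the `--api-key` option and the `GEMINI_API_KEY` variable could be sketched as follows. `resolve_api_key` is a hypothetical helper, and loading the `.env` file into the environment is assumed to have happened before it is called.

```python
import os
from typing import Optional


def resolve_api_key(cli_value: Optional[str] = None) -> str:
    """Prefer the --api-key value; otherwise fall back to GEMINI_API_KEY."""
    key = cli_value or os.environ.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("No API key: pass --api-key or set GEMINI_API_KEY.")
    return key
```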

Identify relevant third-party libraries.

Typer

For easy and intuitive CLI creation.

Poetry

Dependency management and project packaging.

Curl

For HTTP requests.

Google-Genai

For PDF analysis.

Define team member responsibilities.

User	Responsibility
@dolev146	Research & communication with the Google API
@shimshon21	Export to PDF, third-party library management & documentation
@yakov103	Infrastructure & data flow

LLM Interactions: chat

Phase 3: Design

🧠 CRC: Classes, Responsibilities, and Collaborations

A structured overview of the main classes, their core responsibilities, and their collaborators in the system.

🧑‍💻 CLI Interface

🔧 Class	📌 Responsibilities	🤝 Collaborations
CLIHandler (cli.py)	- Parse command-line arguments (file path, position, seniority, output format). - Orchestrate the application's flow. - Handle user interaction and terminal output.	- Calls Analyzer - Uses JsonExporter and PDFExporter - Interacts with Feedback

🧰 Core Logic

🔧 Class	📌 Responsibilities	🤝 Collaborations
Analyzer (evaluator.py)	- Evaluate CVs using the Gemini API. - Generate structured feedback (score, skills, pros, cons). - Support batch evaluations.	- Uses Feedback - Interacts with TextPreprocessor - Uses Gemini API

📦 Data Models

🔧 Class	📌 Responsibilities	🤝 Collaborations
Feedback (models.py)	- Represent the structured output from the LLM (e.g., score, summary, skills). - Serve as a data model for reports.	- Used by Analyzer - Passed to JsonExporter and PDFExporter
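
A data model carrying the fields listed in the feature description might look like the sketch below. The field names and the range check are assumptions; the actual `Feedback` class in `models.py` may be defined differently.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Feedback:
    """Structured result of one CV evaluation (illustrative field names)."""
    score: float
    years_of_experience: float
    summary: str
    relevant_skills: list[str] = field(default_factory=list)
    other_skills: list[str] = field(default_factory=list)
    pros: list[str] = field(default_factory=list)
    cons: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The requirements call for a score between 0 and 10.
        if not 0 <= self.score <= 10:
            raise ValueError(f"score must be between 0 and 10, got {self.score}")


fb = Feedback(score=8, years_of_experience=5, summary="Strong iOS background.",
              relevant_skills=["Swift", "UIKit"])
print(asdict(fb)["relevant_skills"])  # ['Swift', 'UIKit']
```

Converting to a plain dictionary with `asdict` keeps the model easy to hand to both exporters.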

📤 Reporting / Exporters

🔧 Class	📌 Responsibilities	🤝 Collaborations
JsonExporter (json_report.py)	- Generate and save evaluation results as JSON. - Support single and batch exports.	- Receives Feedback - Called by CLIHandler
PDFExporter (pdf_report.py)	- Generate a PDF report using LLM output.	- Called by CLIHandler - Receives Feedback
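
For the JSON side, a minimal export sketch follows. `export_json` is a hypothetical function that accepts either one result (a dictionary) or a batch (a list of dictionaries); it is not the project's actual `JsonExporter`.

```python
import json
from pathlib import Path


def export_json(results, output: str = "evaluation_report.json") -> Path:
    """Write one result (dict) or a batch (list of dicts) to a JSON file."""
    path = Path(output)
    # ensure_ascii=False keeps non-ASCII names and text readable in the report.
    path.write_text(json.dumps(results, indent=2, ensure_ascii=False), encoding="utf-8")
    return path
```

Calling `export_json({"score": 8})` writes `evaluation_report.json` in the current directory; passing a list and a different `output` name covers the batch case.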

Phase 4: Coding & Testing

Files table:

Directory	File Name	Description
writeup	cli.py	Read user input and return output in JSON, PDF, or console format
writeup -> core	evaluator.py	Fetch the LLM response for the user's requested position and seniority
	models.py	Data models for the LLM response
writeup -> reports	json_report.py	Export the LLM response to a JSON file
	pdf_report.py	Export the LLM response to a PDF file
writeup -> utils	text_utils.py	UI text utilities for drawing separator lines
writeup -> tests	conftest.py	Test configuration file
	test_batch_evaluate.py	Batch evaluation tests
	test_evaluate.py	Single-CV evaluation tests using multiple mock PDF files

Testing

We use pytest as our testing framework. Below are the details of the tests implemented:

Test Configuration (conftest.py)

gemini_api_key: A fixture that retrieves the GEMINI_API_KEY environment variable. If the key is not provided, it skips tests that require a real API call.
cv_dir: A fixture that provides the path to the cv directory, which is assumed to be located one level above the tests directory.

Batch Evaluate Tests (test_batch_evaluate.py)

These tests are designed to evaluate the functionality of batch processing multiple CV PDFs in a directory.

test_batch_evaluate_success: Tests the successful evaluation of multiple CV PDFs in a directory and checks if the batch report is generated correctly.
test_batch_evaluate_no_files: Tests the scenario where no PDF files are found in the specified directory and ensures the appropriate error message is displayed.
test_batch_evaluate_invalid_api_key: Tests the scenario where an invalid API key is provided and ensures the appropriate error message is displayed.

Evaluate Tests (`test_evaluate.py`)

These tests are designed to evaluate the functionality of processing a single CV PDF.

test_evaluate_success: Tests the successful evaluation of a single CV PDF and checks if the report is generated correctly.
test_evaluate_file_not_found: Tests the scenario where the specified CV PDF file is not found and ensures the appropriate error message is displayed.
test_evaluate_invalid_api_key: Tests the scenario where an invalid API key is provided and ensures the appropriate error message is displayed.
test_evaluate_invalid_format: Tests the scenario where an invalid report format is specified and ensures the appropriate error message is displayed.

Running Tests

To run the tests, use the following command:

poetry run pytest

This command will execute all the tests in the tests directory and provide a summary of the test results.

Phase 5: Documentation

Project Overview

The Writeup project is a CLI tool designed for HR professionals to analyze CVs in PDF format. It evaluates CVs against job descriptions, providing a match score, relevant skills, years of experience, and a summary of the candidate's suitability. The tool supports batch processing, JSON/PDF output, and leverages Gemini APIs for advanced text analysis. It is modular, scalable, and optimized for performance.

Project Structure

.
├── README.md
├── evaluation_report.json
├── poetry.lock
├── pyproject.toml
├── tests
│   ├── conftest.py
│   ├── test_batch_evaluate.py
│   └── test_evaluate.py
└── writeup
    ├── cli.py
    ├── core
    │   ├── evaluator.py
    │   └── models.py
    ├── reports
    │   ├── json_report.py
    │   └── pdf_report.py
    └── utils
        └── text_utils.py

6 directories, 13 files

Demo

Evaluate feature demo:

https://github.com/user-attachments/assets/1fef6e6f-c038-4f56-b554-e2a75cb2e037

Installation

  pipx install writeup-cv-cli

Usage

After installing, you can run the CLI command as follows:

writeup evaluate path/to/cv.pdf -p "<position>" -s "<seniority>" -t <gemini_token>

This command scans and analyzes the provided CV PDF file and outputs an analysis report.

Development

Run Tests:
```
poetry run pytest
```
LLM Interactions: Save all LLM chats in the chats/ directory.

Screenshots

PDF Example: [screenshot]

Analyzer Output: [screenshot]

Analyzer JSON Output: [screenshot]

Release history

0.4.3 (this version): Mar 26, 2025
0.4.2: Mar 22, 2025
0.4.1: Mar 22, 2025
0.4.0: Mar 22, 2025
0.3.0: Mar 15, 2025
0.2.0: Mar 15, 2025

