LinkedIn job scraper and CV analysis toolkit
CVAlchemix
Stop sending the same CV to every job. CVAlchemix scrapes LinkedIn postings and rewrites your CV with AI — tailored, compiled, and PDF-ready in seconds.
✨ Features
- Scrapes LinkedIn job title, company, location, and full job description with Playwright.
- Reuses a persistent browser profile so you can keep a logged-in LinkedIn session.
- Prompts for a Gemini API key and a plain-text base CV, then stores them locally for later runs.
- Includes a built-in `cvalchemix login` command to open LinkedIn and persist an authenticated browser session.
- Rewrites the CV into a structured `CVData` schema using Google Gemini.
- Renders the final CV through a Jinja2 LaTeX template and compiles it to PDF with `tectonic`.
- Saves the intermediate `.tex` file alongside the PDF so you can inspect the generated LaTeX.
- Includes `show-config` and `delete` commands for inspection and cleanup.
🎯 Use Cases
- A job seeker can paste a LinkedIn posting and generate a tailored CV PDF that mirrors the role's keywords and structure.
- A candidate applying to several roles can reuse the same base CV and produce a separate PDF for each company and posting.
- A developer can automate a job-application workflow by combining the scraper, Gemini rewrite step, and PDF renderer in one CLI.
- Someone keeping a persistent LinkedIn session can avoid repeated logins when scraping job pages over time.
- A career coach or reviewer can inspect the generated `.tex` and final PDF to understand how the CV was reshaped for a role.
⚠️ Current Limitations
These are known limitations in the current version. They will be addressed in future releases.
- LinkedIn URL format: Only the following URL pattern is currently supported: `https://www.linkedin.com/jobs/collections/recommended/?currentJobId={currentJobId}`. Other LinkedIn URL shapes such as `/jobs/view/` and search-result pages are not yet supported.
- AI Provider: Currently only Google Gemini is supported as the AI provider.
- External renderer: PDF generation depends on a local `tectonic` binary being available on `PATH`.
- Scraping fragility: The scraper relies on LinkedIn CSS selectors, so LinkedIn DOM changes can break extraction.
- Template scope: Only one bundled LaTeX template, `classic.tex.j2`, is shipped right now.
- No built-in fit score: The repository contains structured analysis models, but there is no user-facing command that emits a standalone job-match score or analysis report yet.
🔩 Prerequisites
- Python >= 3.10
- A Google Gemini API key
- `tectonic` installed and available on your `PATH`
Playwright and its browsers are installed automatically as part of the Python dependencies — no manual step needed.
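If you want to confirm the prerequisites from a script before installing, a small check along these lines may help. It is only a sketch: `check_prerequisites` is a hypothetical name, and it covers the Python version and the `tectonic` binary but not the Gemini API key.

```python
import shutil
import sys


def check_prerequisites(binary: str = "tectonic") -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems: list[str] = []
    # The README requires Python >= 3.10.
    if sys.version_info < (3, 10):
        problems.append("Python 3.10 or newer is required")
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(binary) is None:
        problems.append(f"{binary} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```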
📦 Installation
Option 1 — pip
pip install cvalchemix
Option 2 — pipx (recommended for CLI tools)
pipx install cvalchemix
Option 3 — One-command install (macOS/Linux)
curl -fsSL https://raw.githubusercontent.com/kayesFerdous/CVAlchemix/main/install.sh | bash
If you don't have curl:
wget -qO- https://raw.githubusercontent.com/kayesFerdous/CVAlchemix/main/install.sh | bash
Option 4 — Windows (PowerShell)
irm https://raw.githubusercontent.com/kayesFerdous/CVAlchemix/main/install.ps1 | iex
The installer scripts prefer `pipx` and fall back to `pip --user` when needed. You can override the install source with the `CVALCHEMIX_INSTALL_TARGET` environment variable.
✅ Verify Installation
cvalchemix --help
🚀 Quick Start
1. Configure the app with your Gemini key and a plain-text base CV.
cvalchemix configure
2. Optional but recommended — open a persistent LinkedIn session once so the scraper can reuse it.
cvalchemix login
3. Generate a tailored CV from a LinkedIn job URL.
cvalchemix generate "https://www.linkedin.com/jobs/collections/recommended/?currentJobId=1234567890" -o ./output
The CLI writes the final PDF to ./output/cv/<company>_<timestamp>/cv.pdf and saves the intermediate LaTeX source as cv.tex in the same directory.
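The output layout can be sketched as follows. This is an illustration only: the README specifies the `cv/<company>_<timestamp>/` shape but not the timestamp format or how the company name is sanitised, so the `%Y%m%d_%H%M%S` format, the underscore slug, and the name `build_output_paths` are all assumptions.

```python
import re
from datetime import datetime
from pathlib import Path


def build_output_paths(output_dir: str, company: str, now: datetime) -> tuple[Path, Path]:
    """Return (pdf_path, tex_path) following <output>/cv/<company>_<timestamp>/."""
    # Assumption: non-alphanumeric runs in the company name become underscores.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", company).strip("_") or "company"
    # Assumption: the real timestamp format may differ.
    stamp = now.strftime("%Y%m%d_%H%M%S")
    run_dir = Path(output_dir) / "cv" / f"{slug}_{stamp}"
    return run_dir / "cv.pdf", run_dir / "cv.tex"


if __name__ == "__main__":
    pdf, tex = build_output_paths("./output", "Acme Corp", datetime.now())
    print(pdf)
    print(tex)
```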
📖 Usage
| Command | What it does | Example |
|---|---|---|
| `cvalchemix configure` | Prompts for a Gemini API key and the path to your base CV text file, then saves them in the local config file. | `cvalchemix configure` |
| `cvalchemix login` | Opens LinkedIn in a persistent Playwright browser profile and stores login readiness for later `generate` runs. | `cvalchemix login` |
| `cvalchemix show-config` | Displays the saved configuration and masks the stored API key. | `cvalchemix show-config` |
| `cvalchemix generate <job-url>` | Scrapes a LinkedIn job post, rewrites your CV with Gemini, and renders a PDF. Use `-o` or `--output` to set the destination directory. | `cvalchemix generate "https://www.linkedin.com/jobs/collections/recommended/?currentJobId=1234567890" -o ./output` |
| `cvalchemix delete` | Removes local CVAlchemix data and uninstalls the package by default. Use `--data-only` to keep the CLI installed, and `-y` to skip confirmation. | `cvalchemix delete -y` |
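For applying to several roles in one go, the `generate` command can be driven from a short script. The sketch below uses only the documented `generate <job-url> -o <dir>` invocation; the helper names `build_generate_command` and `generate_all` are hypothetical, and it assumes `cvalchemix` is already installed and configured.

```python
import subprocess
from typing import Callable, Iterable


def build_generate_command(job_url: str, output_dir: str = "./output") -> list[str]:
    """Build the argv list for one documented `cvalchemix generate` call."""
    return ["cvalchemix", "generate", job_url, "-o", output_dir]


def generate_all(
    job_urls: Iterable[str],
    output_dir: str = "./output",
    runner: Callable = subprocess.run,
) -> dict[str, int]:
    """Run generate once per URL and map each URL to its exit code."""
    results: dict[str, int] = {}
    for url in job_urls:
        # check=False so one failing posting does not abort the whole batch.
        completed = runner(build_generate_command(url, output_dir), check=False)
        results[url] = completed.returncode
    return results
```

The `runner` parameter exists so the loop can be exercised without invoking the real CLI; by default it is `subprocess.run`.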
⚙️ Configuration
CVAlchemix uses two layers of configuration:
- CLI config: run `cvalchemix configure` to save your `gemini_api_key` and `base_cv_path` into a local JSON config file under your platform user config directory.
- LinkedIn session: run `cvalchemix login` once to open LinkedIn and save `linkedin_login_configured` for `generate` preflight checks.
- A `.env` file in the working directory is also read if present.
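As a rough picture of what a preflight check over the saved config could look like, consider the sketch below. The key names come from this README, but the file location, the exact JSON layout, and the function names `load_config` and `preflight_problems` are assumptions; the real checks inside CVAlchemix may differ.

```python
import json
from pathlib import Path


def load_config(path: Path) -> dict:
    """Read the saved JSON config; return an empty dict if the file is missing."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def preflight_problems(config: dict) -> list[str]:
    """List what still needs to be set up before running generate."""
    problems: list[str] = []
    if not config.get("gemini_api_key"):
        problems.append("gemini_api_key is not set; run `cvalchemix configure`")
    if not config.get("base_cv_path"):
        problems.append("base_cv_path is not set; run `cvalchemix configure`")
    # Login is optional but recommended, so treat this as a note.
    if not config.get("linkedin_login_configured"):
        problems.append("no saved LinkedIn session; consider `cvalchemix login`")
    return problems
```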
Environment Variables
| Variable | Default | Purpose |
|---|---|---|
| `GOOGLE_API_KEY` | empty | Gemini API key used by the settings layer. |
| `DEFAULT_MODEL` | `gemini-2.5-flash-lite` | Gemini model name used by the LLM wrapper. |
| `PROFILE_DIR` | managed app profile directory | Playwright browser profile location for persistent LinkedIn sessions. |
| `OUTPUT_DIR` | `./output` | Default output directory used by the settings layer. |
| `BROWSER_HEADLESS` | `false` | Launch Playwright in headless mode when set to `true`. |
| `SCRAPE_TIMEOUT_MS` | `30000` | Timeout in ms while waiting for LinkedIn page content to load. |
| `MAX_RETRIES` | `3` | Number of retries defined by the configuration layer. |
| `LOG_LEVEL` | `INFO` | Root logging level. |
A .env.example file with starter values for all of the above is included in the repository.
🗺️ Roadmap
- OpenAI / Anthropic provider support
- Additional CV templates
- Job-fit score command
- Broader LinkedIn URL support
🛠️ Local Development
git clone https://github.com/kayesFerdous/CVAlchemix.git
cd CVAlchemix
./install.sh # macOS/Linux
.\install.ps1 # Windows PowerShell
For a development install with editable mode:
pip install -e ".[dev]"
🤝 Contributing
- Fork the repository and create a focused branch for your change.
- Make the smallest practical change and keep the existing CLI behaviour intact unless the change explicitly requires otherwise.
- Verify the project still starts, configure the CLI, and run a sample `generate` flow if your change touches the pipeline.
- Open a pull request with a clear description of the change, the motivation, and any manual verification you performed.
📄 License
This project is licensed under the MIT License. See LICENSE for details.