Genomix CLI
AI-powered CLI for DNA sequence and genome analysis.
Genomix is an intelligent command-line tool that helps biologists, bioinformaticians, and researchers analyze genomic data through natural language. Ask questions about your VCF, FASTA, or FASTQ files — the AI reads them, queries real databases (NCBI, Ensembl, ClinVar), and explains results in accessible language.
Local-first. Runs with Ollama by default — your genomic data never leaves your machine.
What It Does
❯ Read raw_variants.vcf and give me a clinical summary
⚡ read_file(path='raw_variants.vcf')
↳ ##fileformat=VCFv4.2 ...
1. BRCA1 missense (chr17:43094464): Pathogenic — increased breast/ovarian cancer risk
2. CFTR deletion (chr7:117559593): Pathogenic — cystic fibrosis (homozygous)
3. HBB missense (chr11:5226773): Pathogenic — sickle cell trait (carrier)
4. APOE missense (chr19:44908822): Risk factor — Alzheimer's disease
...
❯ What does this reveal about the person's ancestry?
Based on the variant profile:
- HBB/rs334 (sickle cell trait): high frequency in African/Mediterranean populations
- CFTR deltaF508: most common in Northern European populations
- Combined profile suggests mixed European/African ancestry
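The `read_file` step in the first exchange hands raw VCF text to the model. As a rough illustration of what that text contains, the sketch below parses the fixed CHROM/POS/REF/ALT columns of each data line. It is not Genomix's own parser; the `Variant` class, the `parse_vcf` function, and the two sample records are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_vcf(text: str) -> list[Variant]:
    """Parse the CHROM, POS, REF and ALT columns of VCF data lines."""
    variants = []
    for line in text.splitlines():
        # Lines starting with '#' are meta-information or the column header.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        variants.append(Variant(chrom, int(pos), ref, alt))
    return variants


# Made-up two-record example, not taken from raw_variants.vcf.
SAMPLE = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n"
    "chr2\t200\t.\tCT\tC\t60\tPASS\t.\n"
)

for v in parse_vcf(SAMPLE):
    print(f"{v.chrom}:{v.pos} {v.ref}>{v.alt}")
```

A raw VCF like this carries no gene names or clinical labels, which is why the summary above depends on database lookups.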
Features
- Natural language interface — ask questions about your genomic data in plain English or French
- 18 MCP servers — 5 biotools (samtools, BWA, GATK, BLAST+, FastQC) + 13 databases (see below)
- 20 slash commands: `/qc`, `/align`, `/variant-call`, `/blast`, `/msa`, `/explain`, `/report`, `/structure`, and more
- 21 built-in skills: specialized AI instructions for sequencing, comparative genomics, clinical, oncology, pharmacogenomics, and more
- Protein structure analysis — AlphaFold predictions, AlphaMissense pathogenicity, PDB experimental structures
- Streaming responses — token-by-token display with thinking spinner
- Clinical HTML reports: `/report` generates styled variant reports
- Smart analysis: reads raw VCFs (no annotations needed), identifies genes from coordinates, infers clinical significance
- Ancestry inference — population frequency analysis via gnomAD/1000 Genomes
- 3 AI providers — Ollama/local (default), Claude (Anthropic), OpenAI
- Privacy mode — automatically active with local models, raw sequences never sent to cloud
- MCP management: `/mcp` to view, connect, and manage bioinformatics tool servers
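Identifying genes from coordinates comes down to an interval lookup. The sketch below shows one way to do it with a sorted list and binary search. It is an illustration under stated assumptions, not how Genomix resolves genes (the feature list points to database queries for that); the gene names and coordinates are placeholders, and the helper names `build_index` and `gene_at` are invented here.

```python
from bisect import bisect_right

# Hypothetical gene intervals (chrom, start, end, name). The coordinates are
# placeholders for illustration, not real annotation data.
GENES = [
    ("chr1", 1_000, 5_000, "GENE_A"),
    ("chr1", 8_000, 9_500, "GENE_B"),
    ("chr2", 2_000, 4_000, "GENE_C"),
]


def build_index(genes):
    """Group intervals by chromosome, sorted by start position."""
    index = {}
    for chrom, start, end, name in sorted(genes):
        index.setdefault(chrom, []).append((start, end, name))
    return index


def gene_at(index, chrom, pos):
    """Return the gene whose interval contains pos, or None.

    Assumes intervals on the same chromosome do not overlap.
    """
    intervals = index.get(chrom, [])
    starts = [start for start, _, _ in intervals]
    i = bisect_right(starts, pos) - 1
    if i >= 0:
        start, end, name = intervals[i]
        if start <= pos <= end:
            return name
    return None


index = build_index(GENES)
print(gene_at(index, "chr1", 8_200))  # GENE_B
print(gene_at(index, "chr1", 6_000))  # None
```

Real annotations contain overlapping genes, so a production lookup would need an interval tree or a query against a service such as Ensembl.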
Installation
# Install
pip install genomix-cli
# Check dependencies
genomix setup
# Initialize a project
cd my-analysis/
genomix init
Requirements
- Python 3.11+
- Ollama with a model (e.g., `ollama pull qwen3-coder:30b`)
- Optional: samtools, BWA, GATK, BLAST+ for bioinformatics tools
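To see which of the optional tools are already on your `PATH` before running `genomix setup`, a short standard-library check is enough. This is a sketch, not a reimplementation of `genomix setup`. The executable names are assumptions: BLAST+ ships several programs (the sketch looks for `blastn`), and the others are the usual command names for each tool.

```python
import shutil

# Tool label -> executable name. The executable names are assumptions.
OPTIONAL_TOOLS = {
    "samtools": "samtools",
    "BWA": "bwa",
    "GATK": "gatk",
    "BLAST+": "blastn",
}


def find_tools(tools=OPTIONAL_TOOLS, which=shutil.which):
    """Map each tool label to its resolved path, or None if not on PATH."""
    return {label: which(exe) for label, exe in tools.items()}


if __name__ == "__main__":
    for label, path in find_tools().items():
        print(f"{label:10s} {path or 'not found'}")
```

The `which` parameter exists only so the lookup can be swapped out in tests.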
Quick Start
# Start interactive mode
genomix
# Non-interactive usage
genomix ask "What is the BRCA1 gene?"
genomix ask "Read sample.vcf and summarize the variants"
genomix run /qc data/reads.fastq.gz
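The non-interactive form is convenient to call from scripts. The sketch below wraps `genomix ask` with Python's `subprocess` module. It assumes the answer is written to standard output and that failures produce a non-zero exit code; neither is confirmed here. The helper names `ask_command` and `ask` are invented for this example.

```python
import shlex
import subprocess


def ask_command(question: str) -> list[str]:
    """Build the argv for a non-interactive `genomix ask` call."""
    return ["genomix", "ask", question]


def ask(question: str, timeout: float = 600.0) -> str:
    """Run `genomix ask` and return its standard output as text."""
    result = subprocess.run(
        ask_command(question),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
        timeout=timeout,
    )
    return result.stdout


# Show the equivalent shell command without running anything.
print(shlex.join(ask_command("What is the BRCA1 gene?")))
```

Passing the question as a single list element avoids shell quoting problems with apostrophes or question marks.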
Interactive Session
██████╗ ███████╗███╗ ██╗ ██████╗ ███╗ ███╗██╗██╗ ██╗
...
v0.4.0 — AI-powered genome analysis
┌──────────────────────────────────────────────────────┐
│ Project BRCA Analysis - Cohort 2026 │
│ Organism Homo sapiens │
│ Reference GRCh38 │
│ Provider ollama (qwen3-coder:30b) │
│ Privacy 🔒 Local — data stays on this machine │
│ MCP 18 registered (4 connected, 14 missing) │
└──────────────────────────────────────────────────────┘
Connecting MCP servers...
Connecting to ClinVar... ✓ (3 tools)
Connecting to dbSNP... ✓ (3 tools)
Connecting to Ensembl... ✓ (5 tools)
Connecting to NCBI... ✓ (4 tools)
❯ _
Slash Commands
Full reference with examples: docs/commands.md
| Command | Description |
|---|---|
| Analysis | |
| `/qc` | Quality control (FastQC) |
| `/align` | Align reads to reference genome |
| `/variant-call` | Call variants (GATK/FreeBayes) |
| `/annotate` | Annotate variants (SnpEff/VEP) |
| `/pipeline` | Full pipeline: QC → align → call → annotate |
| `/report` | Generate styled HTML clinical report from VCF |
| Databases | |
| `/lookup` | Look up a gene or variant across databases |
| `/frequency` | Population allele frequencies (gnomAD) |
| `/disease` | Disease associations (OMIM) |
| `/cancer` | Somatic mutation context (COSMIC) |
| `/drug` | Pharmacogenomics annotations (PharmGKB) |
| `/literature` | Search biomedical literature (PubMed) |
| Structure | |
| `/structure` | Protein structure and AlphaFold predictions |
| `/domains` | Protein domain mapping (InterPro) |
| Comparative | |
| `/blast` | BLAST similarity search |
| `/msa` | Multiple sequence alignment |
| `/phylo` | Phylogenetic tree construction |
| Exploration | |
| `/summary` | Summarize a genomic file |
| `/search` | Query databases (NCBI, Ensembl...) |
| `/explain` | Explain a variant, gene, or region |
| Session | |
| `/mcp` | Manage MCP servers (connect, status) |
| `/swarm` | Show background analyses |
| `/provider` | Switch AI provider |
| `/model` | Switch model |
| `/help` | Show available commands |
Supported Databases
| Database | Description |
|---|---|
| NCBI | Gene, nucleotide, and protein search |
| Ensembl | Genome browser, gene annotations, variants |
| ClinVar | Clinical variant interpretations |
| dbSNP | SNP identifiers and allele frequencies |
| gnomAD | Population allele frequencies |
| OMIM | Mendelian disease catalog |
| PharmGKB | Pharmacogenomics annotations |
| COSMIC | Somatic mutations in cancer |
| InterPro | Protein domains and families |
| PubMed | Biomedical literature search |
| AlphaFold | Protein structure predictions |
| UniProt | Protein sequences and annotations |
| PDB | Experimental protein structures |
Protein Structure Analysis
Genomix integrates with Google DeepMind's AlphaFold for structure-based interpretation of variants:
❯ /structure TP53
⚡ uniprot_gene_to_accession(gene_name='TP53')
⚡ alphafold_prediction(uniprot_id='P04637')
⚡ pdb_search_gene(gene_name='TP53')
TP53 (Cellular tumor antigen p53)
UniProt: P04637 | AlphaFold pLDDT: 75.06
PDB: 172 experimental structures
Domains: DNA-binding (IPR008923), Tetramerization (IPR003106)
Hotspot mutations: R175H, R248W, R273H (DNA-binding domain)
When analyzing missense variants, Genomix automatically checks:
- AlphaFold confidence at the variant position
- Protein domain context (via InterPro)
- AlphaMissense pathogenicity score
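AlphaFold reports confidence as pLDDT, a per-residue score from 0 to 100 that is conventionally read in four bands (above 90, 70 to 90, 50 to 70, below 50). The sketch below classifies a score into those bands. The function name and the handling of values exactly on a boundary are choices made for this example, not Genomix behavior.

```python
def plddt_band(score: float) -> str:
    """Classify a pLDDT score into the conventional confidence bands."""
    if not 0 <= score <= 100:
        raise ValueError(f"pLDDT must be in [0, 100], got {score}")
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


# The TP53 value shown above is a single summary figure for the whole protein.
print(plddt_band(75.06))  # confident
```

A missense variant in a low-confidence region is harder to interpret structurally, since such regions are often flexible or disordered.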
Architecture
┌────────────────────────────────────────────────┐
│                  genomix-cli                   │
│                                                │
│  CLI/TUI ── Agent Loop ── Swarm Manager        │
│                 │                              │
│    ┌────────────┼────────────┐                 │
│    ▼            ▼            ▼                 │
│   Tool       Skills       Project              │
│ Registry     System       Manager              │
│    │                                           │
│    ▼                                           │
│ MCP Servers                                    │
│ ├── biotools: samtools, BWA, GATK,             │
│ │   BLAST+, FastQC                             │
│ └── databases: NCBI, Ensembl, ClinVar,         │
│     dbSNP, gnomAD, OMIM, PharmGKB, COSMIC,     │
│     InterPro, PubMed, AlphaFold, UniProt, PDB  │
│                                                │
│ AI Providers                                   │
│ Ollama (local) │ Claude │ OpenAI               │
└────────────────────────────────────────────────┘
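To make the Agent Loop and Tool Registry boxes concrete, here is a minimal sketch of how a tool call such as `read_file(path='raw_variants.vcf')` could be dispatched by name. Everything in it is hypothetical: the `ToolRegistry` class, its methods, and this `read_file` function are illustrations, not Genomix's actual code or the MCP protocol.

```python
from typing import Any, Callable


class ToolRegistry:
    """Hypothetical registry mapping tool names to Python callables."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        if name in self._tools:
            raise ValueError(f"tool already registered: {name}")
        self._tools[name] = func

    def call(self, name: str, **kwargs: Any) -> Any:
        """Dispatch one tool call requested by the model."""
        try:
            func = self._tools[name]
        except KeyError:
            raise KeyError(f"unknown tool: {name}") from None
        return func(**kwargs)


def read_file(path: str, max_chars: int = 2000) -> str:
    """Return at most max_chars characters from the start of a text file."""
    with open(path, encoding="utf-8") as handle:
        return handle.read(max_chars)


registry = ToolRegistry()
registry.register("read_file", read_file)
```

In an agent loop, the model's reply would name a tool and its arguments, the loop would call `registry.call(...)`, and the result would be appended to the conversation before the next model turn.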
AI Providers
Genomix supports 3 AI backends. Switch anytime with /provider in the chat.
Option 1: Ollama (local, default)
Everything stays on your machine. No API key needed. Best for sensitive/patient data.
# Install Ollama (macOS with Homebrew; for other platforms see ollama.com)
brew install ollama
# Pull a model (pick one)
ollama pull qwen3-coder:30b # Best quality, needs 18GB RAM
ollama pull qwen3.5 # Faster, lighter, 128K context
ollama pull llama3.3:70b # Alternative, needs 40GB RAM
# Start Ollama (runs in background)
ollama serve
Config (~/.genomix/config.yaml):
provider:
default: ollama
model: qwen3-coder:30b
No secrets file needed. Privacy mode is automatic.
Option 2: Claude (Anthropic)
Best reasoning quality. Requires an API key from console.anthropic.com.
# 1. Get your API key at https://console.anthropic.com/settings/keys
# 2. Create config (create the directory first if it does not exist)
mkdir -p ~/.genomix
cat > ~/.genomix/config.yaml << 'EOF'
provider:
default: claude
model: claude-sonnet-4-6
EOF
# 3. Store your API key (secure file, never committed to git)
cat > ~/.genomix/secrets.yaml << 'EOF'
anthropic_api_key: "sk-ant-your-key-here"
EOF
chmod 600 ~/.genomix/secrets.yaml
# 4. Launch genomix
genomix
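The `chmod 600` step matters because the secrets file holds a live API key. The sketch below checks that the file is not accessible to group or other users. It assumes a POSIX system and the `~/.genomix/secrets.yaml` path used above; the function name `is_owner_only` is invented for this example.

```python
import os
import stat
from pathlib import Path


def is_owner_only(path: Path) -> bool:
    """True if neither group nor others have any permission on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    secrets = Path.home() / ".genomix" / "secrets.yaml"
    if secrets.exists() and not is_owner_only(secrets):
        print(f"warning: {secrets} is accessible to other users; run chmod 600")
```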
Available Claude models:
| Model | Best for |
|---|---|
| `claude-sonnet-4-6` | Fast, good quality (recommended) |
| `claude-opus-4-6` | Best reasoning, slower |
| `claude-haiku-4-5-20251001` | Fastest, cheapest |
Option 3: OpenAI
Requires an API key from platform.openai.com.
# 1. Get your API key at https://platform.openai.com/api-keys
# 2. Create config (create the directory first if it does not exist)
mkdir -p ~/.genomix
cat > ~/.genomix/config.yaml << 'EOF'
provider:
default: openai
model: gpt-4o
EOF
# 3. Store your API key
cat > ~/.genomix/secrets.yaml << 'EOF'
openai_api_key: "sk-your-key-here"
EOF
chmod 600 ~/.genomix/secrets.yaml
# 4. Launch genomix
genomix
Available OpenAI models:
| Model | Best for |
|---|---|
| `gpt-4o` | Best overall (recommended) |
| `o3` | Strongest reasoning |
| `gpt-4-turbo` | Fast, 128K context |
Switching providers on the fly
Inside a genomix session, switch without restarting:
❯ /provider claude
Switched to provider: claude
❯ /model claude-opus-4-6
Switched to model: claude-opus-4-6
❯ /provider ollama
Switched to provider: ollama
Privacy considerations
| Provider | Data location | Best for |
|---|---|---|
| Ollama | 100% local | Patient data, GDPR, confidential |
| Claude | Anthropic servers | Research, best analysis quality |
| OpenAI | OpenAI servers | Alternative cloud option |
With Ollama, raw sequences never leave your machine. With cloud providers, only tool result summaries are sent (not raw genomic data) when privacy mode is active.
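One way to picture "summaries, not raw genomic data" is a filter that strips long nucleotide runs from text before it is sent to a cloud provider. The sketch below does that with a regular expression. It is an illustration only: how Genomix actually enforces privacy mode is not described here, and the 20-character threshold, the placeholder wording, and the function name are arbitrary choices for this example.

```python
import re

# Runs of 20 or more nucleotide codes; the threshold is an arbitrary choice.
SEQUENCE_RUN = re.compile(r"[ACGTUNacgtun]{20,}")


def redact_sequences(text: str) -> str:
    """Replace long nucleotide runs with a length-only placeholder."""
    return SEQUENCE_RUN.sub(
        lambda m: f"[sequence redacted: {len(m.group())} nt]", text
    )


summary = "Top hit for query ACGTACGTACGTACGTACGTACGT: BRCA1 exon 11"
print(redact_sequences(summary))
```

A filter like this protects raw reads but not derived facts: a sentence stating that a sample carries a specific pathogenic variant is still sensitive, so local models remain the safer choice for patient data.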
Contributing
Contributions welcome! See CONTRIBUTING.md for development setup, project structure, and how to add new MCP servers, skills, or AI providers.
The easiest way to contribute is adding a new database MCP server — each one is a single self-contained file. See the architecture docs for an overview of the system.
License
Apache 2.0