AI agent rules for Data Science, ML & AI Engineering — sync to Claude, Copilot, Codex, Gemini, Cursor, Windsurf

ds-agent-rules

README in Traditional Chinese (繁體中文版)

A portable, composable rules system for AI coding agents — one source of truth for Data Science, Machine Learning, and AI Engineering projects.

Write rules once. Sync to Claude Code · GitHub Copilot · OpenAI Codex · Gemini Code · Cursor · Windsurf — all at once.


The Problem

Without explicit rules, AI agents silently introduce bad habits:

| What goes wrong | Impact |
| --- | --- |
| No random seeds | Irreproducible experiments |
| Random train/test splits on time-series | Data leakage |
| Skipped evaluation baselines | Unverifiable model claims |
| Hardcoded hyperparameters | Untrackable experiments |

ds-agent-rules solves this with a layered, composable rule system that keeps every AI tool aligned.


How It Works

 ┌────────────────────┐
 │   base/core.md     │  ← always loaded
 │   base/ds-ml.md    │  ← project-type overlay
 │   snippets/rag.md  │  ← domain-specific rules
 │   team/*.md        │  ← team overrides (optional)
 └────────┬───────────┘
          │  sync.sh
          ▼
 ┌────────────────────────────────────┐
 │  CLAUDE.md                        │
 │  AGENTS.md                        │
 │  .github/copilot-instructions.md  │
 │  .gemini/styleguide.md            │
 │  .cursorrules                     │
 │  .windsurfrules                   │
 └────────────────────────────────────┘

Layer model: core (always) → overlay (project type) → snippets (domains) → team (overrides)
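
The layer order can be sketched as a plain concatenation. This is a minimal illustration, not the actual sync.sh implementation: the function name compose_rules and the TEAM_DIR variable are hypothetical, and the sketch assumes the base/ and snippets/ layout described under Project Structure.

```shell
# Hypothetical sketch of the layer order; the real sync.sh may differ.
# Usage: compose_rules RULES_DIR OVERLAY [SNIPPET...]
# Set TEAM_DIR to a directory of .md files to append team overrides last.
compose_rules() {
  local rules_dir="$1" overlay="$2" snippet file
  shift 2

  cat "$rules_dir/base/core.md"            # 1. core: always loaded
  cat "$rules_dir/base/$overlay.md"        # 2. overlay: project type

  for snippet in "$@"; do                  # 3. snippets: in the order given
    cat "$rules_dir/snippets/$snippet.md"
  done

  if [ -n "${TEAM_DIR:-}" ]; then          # 4. team overrides: appended last
    for file in "$TEAM_DIR"/*.md; do
      if [ -e "$file" ]; then
        cat "$file"
      fi
    done
  fi
}

# Example (not executed here):
#   compose_rules ~/.ai-rules ds-ml rag > CLAUDE.md
```

Because later layers are appended after earlier ones, team rules are the last thing in the generated file.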


Quickstart

1. Install

Choose your preferred method:

# npm (zero-install via npx)
npx ds-agent-rules init

# pip
pip install ds-agent-rules
ds-agent-rules init

# git clone (full control)
git clone https://github.com/Edwarddev0723/ds-agent-rules ~/.ai-rules
cd ~/.ai-rules && chmod +x sync.sh new-project.sh

2. Pick your path

A) npx / pip: zero-clone workflow
cd /path/to/your/project
npx ds-agent-rules preset llm-project    # npm
ds-agent-rules preset llm-project        # pip

# or interactive
npx ds-agent-rules new-project

B) Interactive setup (git clone): guided walkthrough
cd /path/to/your/project
~/.ai-rules/new-project.sh

Creates .ai-rules.yaml, syncs rules, and scaffolds directories.

C) One-liner with preset: fastest for common setups
cd /path/to/your/project
~/.ai-rules/sync.sh --preset llm-project

D) Config file: recommended for ongoing projects
cd /path/to/your/project
~/.ai-rules/sync.sh --init          # creates .ai-rules.yaml template
vim .ai-rules.yaml                   # edit to match your project
~/.ai-rules/sync.sh                  # sync (auto-reads config)

3. Useful flags

./sync.sh --list                     # show all overlays, snippets, presets
./sync.sh --dry-run ds-ml rag        # preview without writing files
./sync.sh --diff                     # show unified diff before applying changes
./sync.sh --validate                 # check project structure against rules
./sync.sh --output-dir /other/proj   # write to a different project
./sync.sh --team ./team-rules        # include team-specific rules

4. Make targets

make help                            # show all available targets
make lint                            # run ShellCheck on all scripts
make test                            # run bats test suite
make validate                        # validate current project
make ci                              # lint + test (same as CI)

Project Structure

ds-agent-rules/
├── base/                    # Project-type overlays
│   ├── core.md              # Universal rules (always included)
│   ├── ds-ml.md             # Data Science / ML
│   ├── llm-eng.md           # LLM / GenAI Engineering
│   ├── data-eng.md          # Data Engineering
│   ├── software-eng.md      # Traditional Software Engineering
│   └── research.md          # Research / Academic
│
├── snippets/                # Domain-specific rule modules (mix & match)
│   ├── agentic-ai.md        # AI Agents & tool use
│   ├── audio-speech.md      # ASR / TTS / Audio
│   ├── chinese-nlp.md       # Traditional Chinese NLP
│   ├── ctr-prediction.md    # CTR / Recommendation Systems
│   ├── cv.md                # Computer Vision
│   ├── data-labeling.md     # Annotation & Active Learning
│   ├── distributed-training.md  # Multi-GPU/Node (DeepSpeed, FSDP)
│   ├── edge-inference.md    # Mobile / Edge Deployment
│   ├── evaluation-framework.md  # Systematic Evaluation
│   ├── graph-ml.md          # Graph Neural Networks
│   ├── jax.md               # JAX / Flax
│   ├── llm-finetuning.md    # LLM Fine-Tuning (LoRA, RLHF)
│   ├── mlops.md             # MLOps & Deployment
│   ├── nlp-general.md       # General NLP
│   ├── prompt-engineering.md    # Prompt Design & Versioning
│   ├── pytorch.md           # PyTorch
│   ├── rag.md               # RAG Pipeline
│   ├── responsible-ai.md    # Responsible AI & Safety
│   ├── streaming-ml.md      # Online Learning & Streaming
│   ├── synthetic-data.md    # Synthetic Data & Privacy
│   ├── tabular-ml.md        # Tabular ML
│   ├── time-series.md       # Time Series Forecasting
│   └── vlm.md               # Vision-Language Models
│
├── presets/                  # Named combos for one-command setup (15 presets)
├── templates/                # Directory scaffolds per project type (5 templates)
├── tests/                    # bats test suite
│   └── sync.bats
├── .github/
│   ├── workflows/ci.yml      # CI (ShellCheck + bats on ubuntu & macos)
│   ├── PULL_REQUEST_TEMPLATE.md
│   └── ISSUE_TEMPLATE/       # Issue templates (new snippet, bug report)
├── sync.sh                   # Main sync script
├── new-project.sh            # Interactive project initializer
├── Makefile                  # make lint / test / validate / ci
├── CONTRIBUTING.md           # Contributor guide & snippet format spec
├── CHANGELOG.md              # Release history
└── README.md

Presets

Run ./sync.sh --list to see your local presets.

| Preset | Overlay | Included Snippets |
| --- | --- | --- |
| llm-project | ds-ml | llm-finetuning, rag, mlops, responsible-ai |
| agentic-ai | llm-eng | agentic-ai, prompt-engineering, rag, mlops, responsible-ai |
| distributed-llm | ds-ml | llm-finetuning, distributed-training, pytorch, mlops |
| cv-project | ds-ml | cv, mlops |
| recsys-project | ds-ml | ctr-prediction, tabular-ml, mlops |
| tabular-project | ds-ml | tabular-ml, mlops |
| ts-forecast | ds-ml | time-series, mlops |
| nlp-project | ds-ml | nlp-general, evaluation-framework, mlops |
| research-llm | research | llm-finetuning, rag, responsible-ai |
| full-stack-ai | llm-eng | llm-finetuning, rag, mlops, responsible-ai |
| data-platform | data-eng | streaming-ml, mlops |
| graph-ml-project | ds-ml | graph-ml, evaluation-framework, mlops |
| labeling-project | ds-ml | data-labeling, evaluation-framework, responsible-ai |
| edge-deploy | ds-ml | edge-inference, pytorch, mlops |
| vlm-project | ds-ml | vlm, cv, llm-finetuning, evaluation-framework |

Configuration

.ai-rules.yaml (per-project)

Drop this in your project root. sync.sh auto-detects it.

profile: ds-ml
snippets:
  - llm-finetuning
  - rag
  - pytorch
  - mlops

# team_dir: ./team-rules     # optional: team-specific rules
# preset: llm-project        # optional: use a preset instead
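
For scripts that need the same settings, the flat format shown above can be read with sed. This is a rough sketch under stated assumptions: config_value and config_snippets are hypothetical helper names, the code handles only simple "key: value" lines and "- item" list entries, and it is not how sync.sh itself is known to parse the file.

```shell
# Hypothetical helpers for the flat .ai-rules.yaml layout shown above.
# This is not a general YAML parser.

# Print the value of a top-level "key: value" line, ignoring comments.
config_value() {
  local key="$1" file="${2:-.ai-rules.yaml}"
  sed -n -e 's/[[:space:]]*#.*$//' -e "s/^${key}:[[:space:]]*//p" "$file"
}

# Print one snippet name per line, taken from the "- name" list entries.
config_snippets() {
  local file="${1:-.ai-rules.yaml}"
  sed -n -e 's/[[:space:]]*#.*$//' -e 's/^[[:space:]]*-[[:space:]]*//p' "$file"
}

# Example (not executed here):
#   config_value profile      # prints the profile, e.g. ds-ml
#   config_snippets           # prints each snippet on its own line
```

Commented-out keys such as team_dir produce no output, so a caller can treat an empty result as "not set".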

Team Rules

Append company/team-specific .md rules after all snippets:

mkdir team-rules && vim team-rules/our-standards.md

# Via CLI
./sync.sh --team ./team-rules ds-ml rag

# Or in .ai-rules.yaml
# team_dir: ./team-rules

Extending

| Action | Command |
| --- | --- |
| New overlay | cp base/ds-ml.md base/my-type.md → edit → ./sync.sh my-type |
| New snippet | Create snippets/my-domain.md → ./sync.sh ds-ml my-domain |
| New preset | echo "ds-ml my-domain mlops" > presets/my-preset.txt |
| Update a rule | Edit snippet → ./sync.sh → git commit |
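
The preset example above suggests a one-line format: the overlay name first, then the snippet names. The sketch below splits such a file into its two parts. expand_preset is a hypothetical helper, and the format is inferred from that single example rather than from a documented specification.

```shell
# Hypothetical helper, assuming the one-line preset format
# "<overlay> <snippet> [<snippet> ...]".
expand_preset() {
  local overlay snippets
  # read puts the first word in overlay and the rest of the line in snippets;
  # the fallback test accepts a file without a trailing newline.
  read -r overlay snippets < "$1" || [ -n "$overlay" ] || return 1
  printf 'overlay: %s\n' "$overlay"
  printf 'snippets: %s\n' "$snippets"
}

# Example (not executed here):
#   expand_preset presets/my-preset.txt
```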

Installation & Git Strategy

# Option 1: npm (recommended for JS/TS developers)
npm install -g ds-agent-rules        # global install
npx ds-agent-rules sync ds-ml rag    # or run directly via npx

# Option 2: pip (recommended for Python developers)
pip install ds-agent-rules
ds-agent-rules sync ds-ml rag

# Option 3: Standalone (git clone)
git clone https://github.com/Edwarddev0723/ds-agent-rules ~/.ai-rules

# Option 4: Git submodule in dotfiles
cd ~/.dotfiles && git submodule add https://github.com/Edwarddev0723/ds-agent-rules

Committing generated files?

| Scenario | Recommendation |
| --- | --- |
| Solo / personal | .gitignore them, regenerate with sync.sh |
| Team project | Commit: consistent agent behavior across the team |
| Open source | Commit: doubles as contributor onboarding context |
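
For the solo case, the generated files can be added to .gitignore in one step. The helper below is a small sketch, and ignore_generated_rules is a hypothetical name. The six paths are the ones listed in this README's tool-to-file mapping, and each entry is appended only if it is not already present.

```shell
# Hypothetical helper: ignore the six generated rule files.
# Usage: ignore_generated_rules [PATH_TO_GITIGNORE]
ignore_generated_rules() {
  local gitignore="${1:-.gitignore}" path
  touch "$gitignore"
  for path in CLAUDE.md AGENTS.md .github/copilot-instructions.md \
              .gemini/styleguide.md .cursorrules .windsurfrules; do
    # -x matches the whole line, -F treats the path as a literal string
    grep -qxF -- "$path" "$gitignore" || printf '%s\n' "$path" >> "$gitignore"
  done
}

# Example (not executed here):
#   cd /path/to/your/project && ignore_generated_rules
```

Running it more than once is safe, since existing entries are skipped.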

Recommended Workflow

# 1. Start a new project
mkdir my-project && cd my-project && git init

# 2. Initialize (pick one)
~/.ai-rules/new-project.sh              # interactive
~/.ai-rules/sync.sh --preset llm-project # one-liner
~/.ai-rules/sync.sh --init              # config file

# 3. Work with your AI tools — they auto-read the generated files

# 4. Validate project structure
~/.ai-rules/sync.sh --validate

# 5. Evolve your rules
vim ~/.ai-rules/snippets/rag.md
~/.ai-rules/sync.sh
cd ~/.ai-rules && git add -A && git commit -m "rule: ..."

AI Tool → File Mapping

| AI Tool | Config File |
| --- | --- |
| Claude Code | CLAUDE.md |
| GitHub Copilot | .github/copilot-instructions.md |
| OpenAI Codex / ChatGPT | AGENTS.md |
| Google Gemini Code | .gemini/styleguide.md |
| Cursor | .cursorrules |
| Windsurf | .windsurfrules |

Contributing

We welcome contributions! See CONTRIBUTING.md for:

  • Snippet format specification & quality criteria
  • Preset & overlay format
  • Commit conventions
  • PR checklist

Changelog

See CHANGELOG.md for release history.


Who Uses This

Using ds-agent-rules in your project or team? We'd love to hear about it! Open an issue or PR to add your name here.


License

MIT

Download files

Source Distribution

ds_agent_rules-1.1.0.tar.gz (62.4 kB)

Uploaded Source

Built Distribution

ds_agent_rules-1.1.0-py3-none-any.whl (78.1 kB)

Uploaded Python 3

File details

Details for the file ds_agent_rules-1.1.0.tar.gz.

File metadata

  • Download URL: ds_agent_rules-1.1.0.tar.gz
  • Upload date:
  • Size: 62.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for ds_agent_rules-1.1.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | ba880056e6fd5ff30102532852ba773582acc4e8215f9a90fb6084370a7be1e4 |
| MD5 | 334d32dcac20dbcca96fd80d0f69df43 |
| BLAKE2b-256 | eb3ae6f03c2267445c5f1196475136929f367f71486a86716ef31644bf4b9d8e |

File details

Details for the file ds_agent_rules-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: ds_agent_rules-1.1.0-py3-none-any.whl
  • Upload date:
  • Size: 78.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for ds_agent_rules-1.1.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | c652c32deabef86d21c0d88c5cd76103ead0d1f521eb33e45a8ecf9f8e8f8742 |
| MD5 | f734e5875a2b93136beac50c66123faf |
| BLAKE2b-256 | 10fdddad3a0579495c5aeddf810edb3f442a03e2330212a73c3ec9d45248d6a8 |
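
A downloaded file can be checked against the SHA256 digests listed above. The helper below is a minimal sketch: verify_sha256 is a hypothetical name, and it assumes that either sha256sum (GNU coreutils) or shasum (common on macOS) is available on PATH.

```shell
# Hypothetical helper: succeed only if FILE has the expected SHA256 digest.
# Usage: verify_sha256 FILE EXPECTED_HEX_DIGEST
verify_sha256() {
  local file="$1" expected="$2" actual
  if command -v sha256sum >/dev/null 2>&1; then
    actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
  else
    actual="$(shasum -a 256 "$file" | cut -d ' ' -f 1)"
  fi
  [ "$actual" = "$expected" ]
}

# Example (not executed here), using the source distribution digest above:
#   verify_sha256 ds_agent_rules-1.1.0.tar.gz \
#     ba880056e6fd5ff30102532852ba773582acc4e8215f9a90fb6084370a7be1e4 \
#     && echo "checksum OK"
```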
