Multi-language code analyzer with LSP and Neo4j integration
Source Atlas
Source Atlas is a multi-language code analyzer that combines Tree-sitter parsing, Language Server Protocol (LSP) integration, and a Neo4j graph database to build code knowledge graphs.
Features
- Multi-Language Support: Analyze Java, Python, Go, and TypeScript codebases
- Deep Code Analysis: Extract classes, methods, dependencies, and relationships
- LSP Integration: Leverage Language Server Protocol for semantic analysis
- Knowledge Graph: Build rich code graphs in Neo4j for advanced querying
- AST-Based: Uses Tree-sitter for accurate syntax parsing
- Incremental Analysis: Track code changes with AST hashing
- Relationship Tracking: Discover implements, extends, uses, and calls relationships
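The incremental-analysis feature listed above relies on AST hashing. The README does not describe the actual hashing scheme, so the following is only a minimal sketch of the general idea: it substitutes Python's built-in `ast` module for Tree-sitter and uses SHA-256, and the names `ast_hash` and `changed_files` are hypothetical, not part of Source Atlas.

```python
import ast
import hashlib
from typing import Dict, List


def ast_hash(source: str) -> str:
    """Return a SHA-256 digest of the syntax tree parsed from `source`."""
    tree = ast.parse(source)
    # ast.dump leaves out comments, blank lines, and (with
    # include_attributes=False) line/column positions, so only
    # structural changes to the code alter the digest.
    canonical = ast.dump(tree, include_attributes=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_files(previous: Dict[str, str], sources: Dict[str, str]) -> List[str]:
    """List paths whose AST hash is new or differs from the stored hash.

    `previous` maps path -> hash from the last run (hypothetical storage);
    `sources` maps path -> current file contents.
    """
    return sorted(
        path
        for path, text in sources.items()
        if previous.get(path) != ast_hash(text)
    )
```

Hashing the tree rather than the raw text means that reformatting or editing comments does not mark a file as changed, which is presumably the point of hashing ASTs instead of file bytes.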
Architecture
graph TB
A[Source Code] --> B[Tree-sitter Parser]
B --> C[AST Analysis]
C --> D[LSP Service]
D --> E[Code Analyzer]
E --> F[Code Chunks]
F --> G[Neo4j Knowledge Graph]
Components:
- Analyzers: Language-specific code analyzers (Java, Python, Go, TypeScript)
- Extractors: Extract specific code elements (classes, methods, endpoints)
- LSP Service: Integrates with language servers for semantic information
- Neo4j Service: Manages code graph database operations
- Models: Domain models for code chunks, methods, and relationships
Prerequisites
- Python: 3.8 or higher
- Neo4j: 5.x running locally or remotely
- Language-specific tools (for the languages you want to analyze):
  - Java: JDK 11+ (for LSP server)
  - Python: Python 3.8+
  - Go: Go 1.16+
  - TypeScript: Node.js 14+
Installation
1. Clone the repository
git clone https://github.com/quyen-ngv/source-atlas.git
cd source-atlas
2. Create virtual environment
python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/Mac
source .venv/bin/activate
3. Install dependencies
pip install -r requirements.txt
4. Set up Neo4j
Download and install Neo4j Desktop or use Docker:
docker run -d \
--name neo4j \
-p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/your_password \
neo4j:5.14.0
5. Configure environment
Create a .env file from the template:
cp .env.example .env
Edit .env with your settings:
APP_NEO4J_URL=bolt://localhost:7687
APP_NEO4J_USER=neo4j
APP_NEO4J_PASSWORD=your_password
APP_NEO4J_DATABASE=neo4j
Quick Start
Basic Usage
python -m source_atlas analyze \
--project-path /path/to/your/project \
--language java \
--project-id my-project \
--output ./output
Using as a Library
from pathlib import Path
from analyzers.analyzer_factory import AnalyzerFactory
from neo4jdb.neo4j_service import Neo4jService
# Create analyzer
analyzer = AnalyzerFactory.create_analyzer(
    language="java",
    root_path="/path/to/project",
    project_id="my-project",
    branch="main"
)
# Analyze project
with analyzer:
    chunks = analyzer.parse_project(Path("/path/to/project"))
# Import to Neo4j
neo4j_service = Neo4jService(
    url="bolt://localhost:7687",
    user="neo4j",
    password="your_password"
)
neo4j_service.import_code_chunks(
    chunks=chunks,
    batch_size=500,
    main_branch='main'
)
Configuration
Environment Variables
| Variable | Description | Default | Required |
|---|---|---|---|
| APP_NEO4J_URL | Neo4j connection URL | bolt://localhost:7687 | Yes |
| APP_NEO4J_USER | Neo4j username | neo4j | Yes |
| APP_NEO4J_PASSWORD | Neo4j password | - | Yes |
| APP_NEO4J_DATABASE | Neo4j database name | neo4j | Yes |
| NEO4J_MAX_CONNECTION_POOL_SIZE | Max connection pool size | 50 | No |
| NEO4J_CONNECTION_TIMEOUT | Connection timeout (seconds) | 30.0 | No |
See docs/configuration.md for detailed configuration options.
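As an illustration of how the variables in the table could be read, here is a minimal sketch using only the standard library. The `Neo4jSettings` dataclass and `load_settings` function are hypothetical names, not the project's own configuration code, and falling back to the listed defaults when a "required" variable is unset is an assumption of this sketch.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Neo4jSettings:
    """Hypothetical container for the settings listed in the table."""
    url: str
    user: str
    password: str
    database: str
    max_connection_pool_size: int
    connection_timeout: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Neo4jSettings:
    """Build settings from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    password = env.get("APP_NEO4J_PASSWORD")
    if not password:
        # The table lists no default for the password, so fail early.
        raise ValueError("APP_NEO4J_PASSWORD is required and has no default")
    return Neo4jSettings(
        url=env.get("APP_NEO4J_URL", "bolt://localhost:7687"),
        user=env.get("APP_NEO4J_USER", "neo4j"),
        password=password,
        database=env.get("APP_NEO4J_DATABASE", "neo4j"),
        # Environment values are strings, so convert the numeric ones.
        max_connection_pool_size=int(env.get("NEO4J_MAX_CONNECTION_POOL_SIZE", "50")),
        connection_timeout=float(env.get("NEO4J_CONNECTION_TIMEOUT", "30.0")),
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to test, for example `load_settings({"APP_NEO4J_PASSWORD": "your_password"})`.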
Documentation
- Architecture Overview - System design and components
- Configuration Guide - All configuration options
- Contributing Guidelines - How to contribute
- Security Policy - Security and vulnerability reporting
Examples
Analyze a Java Project
python -m source_atlas analyze \
--project-path ./examples/java_project \
--language java \
--project-id example-java \
--branch main
Query the Knowledge Graph
// Find all classes in a package
MATCH (c:Class {package: "com.example.service"})
RETURN c.className, c.filePath
// Find method call relationships
MATCH (m1:Method)-[:CALLS]->(m2:Method)
RETURN m1.name, m2.name
// Find implementation hierarchies
MATCH (c:Class)-[:IMPLEMENTS]->(i:Class)
RETURN c.fullClassName, i.fullClassName
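The queries above can also be run from Python. The sketch below wraps the first query in a parameterized helper; the function name `classes_in_package` is hypothetical, and the `session` argument is assumed to behave like a session from the official `neo4j` Python driver, whose `run` method accepts query parameters as keyword arguments. The label and property names are taken from the Cypher examples above.

```python
from typing import Any, Dict, List

# Parameterized form of the "classes in a package" query shown above.
CLASSES_IN_PACKAGE = (
    "MATCH (c:Class {package: $package}) "
    "RETURN c.className AS className, c.filePath AS filePath"
)


def classes_in_package(session: Any, package: str) -> List[Dict[str, Any]]:
    """Return className/filePath pairs for every Class node in `package`.

    `session` is any object with a `run(query, **params)` method that
    yields records supporting lookup by key.
    """
    result = session.run(CLASSES_IN_PACKAGE, package=package)
    return [
        {"className": record["className"], "filePath": record["filePath"]}
        for record in result
    ]
```

With the official driver, a session would typically come from `GraphDatabase.driver(url, auth=(user, password))` followed by `driver.session(database="neo4j")`. Using a `$package` parameter rather than string formatting avoids quoting problems and lets Neo4j reuse the query plan.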
Project Structure
source_atlas/
├── analyzers/              # Language-specific code analyzers
│   ├── base_analyzer.py
│   ├── java_analyzer.py
│   └── analyzer_factory.py
├── extractors/             # Code element extractors
│   ├── java/
│   ├── python/
│   ├── go/
│   └── typescript/
├── lsp/                    # LSP service integration
│   ├── lsp_service.py
│   └── implements/
├── models/                 # Domain models
│   └── domain_models.py
├── neo4jdb/                # Neo4j integration
│   ├── neo4j_service.py
│   └── neo4j_dto.py
├── utils/                  # Utility functions
└── config/                 # Configuration
Contributing
We welcome contributions! Please see our Contributing Guidelines for details.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Tree-sitter - Incremental parsing system
- Neo4j - Graph database platform
- LSP - Language Server Protocol
Contact
- Author: Nguyen Van Quyen
- Email: quyennv.4work@gmail.com
- GitHub: @quyen-ngv
Issues & Support
If you encounter any issues or have questions:
- Check our documentation
- Search existing issues
- Create a new issue
Made with ❤️ by Nguyen Van Quyen