Multi-language code analyzer with LSP and Neo4j integration

Source Atlas

License: MIT | Python 3.8+

Source Atlas is a multi-language code analyzer that combines Tree-sitter parsing, Language Server Protocol (LSP) integration, and a Neo4j graph database to build code knowledge graphs.

Features

  • Multi-Language Support: Analyze Java, Python, Go, and TypeScript codebases
  • Deep Code Analysis: Extract classes, methods, dependencies, and relationships
  • LSP Integration: Leverage Language Server Protocol for semantic analysis
  • Knowledge Graph: Build rich code graphs in Neo4j for advanced querying
  • AST-Based: Uses Tree-sitter for accurate syntax parsing
  • Incremental Analysis: Track code changes with AST hashing
  • Relationship Tracking: Discover implements, extends, uses, and calls relationships
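
To illustrate the idea behind "AST hashing" for incremental analysis, here is a minimal sketch. It uses Python's standard-library `ast` module as a stand-in for Tree-sitter, and the function name `ast_fingerprint` is hypothetical; Source Atlas's actual hashing scheme may differ. The point is that a digest of the syntax tree stays stable under comment and whitespace edits but changes when the code's structure changes.

```python
import ast
import hashlib


def ast_fingerprint(source: str) -> str:
    """Return a SHA-256 digest of the syntax tree parsed from `source`.

    Hypothetical helper: uses the stdlib `ast` module, not Tree-sitter.
    """
    tree = ast.parse(source)
    # ast.dump omits comments, blank lines, and (by default) line/column
    # attributes, so the digest depends only on the code's structure.
    canonical = ast.dump(tree)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = "def add(a, b):\n    return a + b\n"
reformatted = "def add(a, b):  # sum two values\n\n    return a + b\n"
changed = "def add(a, b):\n    return a - b\n"

# Cosmetic edits keep the fingerprint; a semantic edit changes it.
print(ast_fingerprint(original) == ast_fingerprint(reformatted))
print(ast_fingerprint(original) == ast_fingerprint(changed))
```

An analyzer could store such a fingerprint per file or per method and re-analyze only the units whose fingerprint differs from the stored one.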

๐Ÿ—๏ธ Architecture

graph TB
    A[Source Code] --> B[Tree-sitter Parser]
    B --> C[AST Analysis]
    C --> D[LSP Service]
    D --> E[Code Analyzer]
    E --> F[Code Chunks]
    F --> G[Neo4j Knowledge Graph]

Components:

  • Analyzers: Language-specific code analyzers (Java, Python, Go, TypeScript)
  • Extractors: Extract specific code elements (classes, methods, endpoints)
  • LSP Service: Integrates with language servers for semantic information
  • Neo4j Service: Manages code graph database operations
  • Models: Domain models for code chunks, methods, and relationships
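
As a rough picture of how code chunks and their relationships fit together, the sketch below models a chunk as a dataclass and walks `CALLS` edges transitively in memory. The `CodeChunk` fields and the `reachable` helper are assumptions made for illustration; they are not the classes defined in `models/domain_models.py`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class CodeChunk:
    """Hypothetical chunk: one class or method plus its outgoing relationships."""
    name: str
    kind: str  # e.g. "class" or "method"
    # Relationship type -> names of target chunks, e.g. {"CALLS": ["Repo.save"]}
    relations: Dict[str, List[str]] = field(default_factory=dict)


def reachable(chunks: List[CodeChunk], start: str, relation: str) -> Set[str]:
    """Names reachable from `start` by following `relation` edges transitively."""
    by_name = {c.name: c for c in chunks}
    seen: Set[str] = set()
    stack = [start]
    while stack:
        chunk = by_name.get(stack.pop())
        if chunk is None:
            continue  # target lies outside the analyzed chunks
        for target in chunk.relations.get(relation, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


chunks = [
    CodeChunk("OrderService.place", "method", {"CALLS": ["Repo.save", "Mailer.send"]}),
    CodeChunk("Repo.save", "method", {"CALLS": ["Db.execute"]}),
    CodeChunk("Mailer.send", "method"),
]
print(sorted(reachable(chunks, "OrderService.place", "CALLS")))
# prints ['Db.execute', 'Mailer.send', 'Repo.save']
```

In the real tool this kind of traversal would normally be a graph query in Neo4j rather than a Python loop.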

Prerequisites

  • Python: 3.8 or higher
  • Neo4j: 5.x running locally or remotely
  • Language-specific tools (for the languages you want to analyze):
    • Java: JDK 11+ (for LSP server)
    • Python: Python 3.8+
    • Go: Go 1.16+
    • TypeScript: Node.js 14+

Installation

1. Clone the repository

git clone https://github.com/quyen-ngv/source-atlas.git
cd source-atlas

2. Create virtual environment

python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/Mac
source .venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Set up Neo4j

Download and install Neo4j Desktop or use Docker:

docker run -d \
  --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/your_password \
  neo4j:5.14.0

5. Configure environment

Create a .env file from the template:

cp .env.example .env

Edit .env with your settings:

APP_NEO4J_URL=bolt://localhost:7687
APP_NEO4J_USER=neo4j
APP_NEO4J_PASSWORD=your_password
APP_NEO4J_DATABASE=neo4j

Quick Start

Basic Usage

python -m source_atlas analyze \
  --project-path /path/to/your/project \
  --language java \
  --project-id my-project \
  --output ./output
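
If you want to drive the command above from a Python script, one option is to assemble the argument list and pass it to `subprocess.run`. This is a sketch only: the helper name `build_analyze_command` is hypothetical, and it uses just the flags shown in this README (`--project-path`, `--language`, `--project-id`, `--output`, `--branch`).

```python
import sys
from typing import List, Optional


def build_analyze_command(
    project_path: str,
    language: str,
    project_id: str,
    output: Optional[str] = None,
    branch: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `python -m source_atlas analyze` (hypothetical helper)."""
    cmd = [
        sys.executable, "-m", "source_atlas", "analyze",
        "--project-path", project_path,
        "--language", language,
        "--project-id", project_id,
    ]
    # Optional flags are appended only when a value is given.
    if output is not None:
        cmd += ["--output", output]
    if branch is not None:
        cmd += ["--branch", branch]
    return cmd


cmd = build_analyze_command("/path/to/your/project", "java", "my-project", output="./output")
print(" ".join(cmd[1:]))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing a list rather than a shell string avoids quoting problems when the project path contains spaces.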

Using as a Library

from pathlib import Path
from analyzers.analyzer_factory import AnalyzerFactory
from neo4jdb.neo4j_service import Neo4jService

# Create analyzer
analyzer = AnalyzerFactory.create_analyzer(
    language="java",
    root_path="/path/to/project",
    project_id="my-project",
    branch="main"
)

# Analyze project
with analyzer:
    chunks = analyzer.parse_project(Path("/path/to/project"))

# Import to Neo4j
neo4j_service = Neo4jService(
    url="bolt://localhost:7687",
    user="neo4j",
    password="your_password"
)
neo4j_service.import_code_chunks(
    chunks=chunks,
    batch_size=500,
    main_branch='main',
    base_branch='main',
    pull_request_id=None
)
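
The `batch_size=500` argument suggests that chunks are written to Neo4j in groups rather than one at a time. The README does not show how the import is batched internally, so the following is only a sketch of the general technique; `batched` is a hypothetical helper, not part of Source Atlas.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# 1200 items with batch_size=500 produce two full batches and one remainder.
sizes = [len(batch) for batch in batched(list(range(1200)), 500)]
print(sizes)  # prints [500, 500, 200]
```

Larger batches mean fewer round trips to the database, while smaller ones keep each transaction's memory footprint down.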

Configuration

Environment Variables

Variable                        Description                   Default                Required
APP_NEO4J_URL                   Neo4j connection URL          bolt://localhost:7687  Yes
APP_NEO4J_USER                  Neo4j username                neo4j                  Yes
APP_NEO4J_PASSWORD              Neo4j password                -                      Yes
APP_NEO4J_DATABASE              Neo4j database name           neo4j                  Yes
NEO4J_MAX_CONNECTION_POOL_SIZE  Max connection pool size      50                     No
NEO4J_CONNECTION_TIMEOUT        Connection timeout (seconds)  30.0                   No
See docs/configuration.md for detailed configuration options.
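
For readers wiring these variables into their own scripts, here is a minimal sketch of reading them with the defaults from the table. The `Neo4jSettings` dataclass and `load_settings` function are hypothetical and are not Source Atlas's own configuration code. The sketch assumes that only `APP_NEO4J_PASSWORD`, which has no default, must be set explicitly.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Neo4jSettings:
    """Hypothetical container for the variables listed in the table above."""
    url: str
    user: str
    password: str
    database: str
    max_pool_size: int
    connection_timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Neo4jSettings:
    """Read Neo4j settings from `env`, falling back to the documented defaults."""
    password = env.get("APP_NEO4J_PASSWORD")
    if not password:
        # The table lists no default for the password, so fail early.
        raise ValueError("APP_NEO4J_PASSWORD must be set")
    return Neo4jSettings(
        url=env.get("APP_NEO4J_URL", "bolt://localhost:7687"),
        user=env.get("APP_NEO4J_USER", "neo4j"),
        password=password,
        database=env.get("APP_NEO4J_DATABASE", "neo4j"),
        max_pool_size=int(env.get("NEO4J_MAX_CONNECTION_POOL_SIZE", "50")),
        connection_timeout=float(env.get("NEO4J_CONNECTION_TIMEOUT", "30.0")),
    )


settings = load_settings({"APP_NEO4J_PASSWORD": "your_password"})
print(settings.url, settings.max_pool_size, settings.connection_timeout)
```

Accepting any mapping instead of reading `os.environ` directly makes the loader easy to exercise in tests.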

Documentation

Examples

Analyze a Java Project

python -m source_atlas analyze \
  --project-path ./examples/java_project \
  --language java \
  --project-id example-java \
  --branch main

Query the Knowledge Graph

// Find all classes in a package
MATCH (c:Class {package: "com.example.service"})
RETURN c.className, c.filePath

// Find method call relationships
MATCH (m1:Method)-[:CALLS]->(m2:Method)
RETURN m1.name, m2.name

// Find implementation hierarchies
MATCH (c:Class)-[:IMPLEMENTS]->(i:Class)
RETURN c.fullClassName, i.fullClassName
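
When running queries like these from Python, it is generally safer to pass values as Cypher parameters than to interpolate them into the query string. The sketch below builds the first query in parameterized form. The helper name `classes_in_package` is hypothetical, and the `Class` label and its `package`, `className`, and `filePath` properties are taken from the examples above.

```python
from typing import Any, Dict, Tuple


def classes_in_package(package: str) -> Tuple[str, Dict[str, Any]]:
    """Return a parameterized Cypher query and its parameters (hypothetical helper)."""
    query = (
        "MATCH (c:Class {package: $package}) "
        "RETURN c.className AS className, c.filePath AS filePath"
    )
    # The package name travels as a parameter, never as part of the query text.
    return query, {"package": package}


query, params = classes_in_package("com.example.service")
print(query)
print(params)
```

With the official `neo4j` Python driver, the pair can be passed to `session.run(query, params)` inside an open session.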

๐Ÿ—‚๏ธ Project Structure

source_atlas/
โ”œโ”€โ”€ analyzers/          # Language-specific code analyzers
โ”‚   โ”œโ”€โ”€ base_analyzer.py
โ”‚   โ”œโ”€โ”€ java_analyzer.py
โ”‚   โ””โ”€โ”€ analyzer_factory.py
โ”œโ”€โ”€ extractors/         # Code element extractors
โ”‚   โ”œโ”€โ”€ java/
โ”‚   โ”œโ”€โ”€ python/
โ”‚   โ”œโ”€โ”€ go/
โ”‚   โ””โ”€โ”€ typescript/
โ”œโ”€โ”€ lsp/               # LSP service integration
โ”‚   โ”œโ”€โ”€ lsp_service.py
โ”‚   โ””โ”€โ”€ implements/
โ”œโ”€โ”€ models/            # Domain models
โ”‚   โ””โ”€โ”€ domain_models.py
โ”œโ”€โ”€ neo4jdb/           # Neo4j integration
โ”‚   โ”œโ”€โ”€ neo4j_service.py
โ”‚   โ””โ”€โ”€ neo4j_dto.py
โ”œโ”€โ”€ utils/             # Utility functions
โ””โ”€โ”€ config/            # Configuration

๐Ÿค Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

๐Ÿ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • Tree-sitter - Incremental parsing system
  • Neo4j - Graph database platform
  • LSP - Language Server Protocol

Contact

Issues & Support

If you encounter any issues or have questions:


Made with ❤️ by Nguyen Van Quyen
