
Multi-language code analyzer with LSP and Neo4j integration

Source Atlas

License: MIT | Python 3.8+

Source Atlas is a multi-language code analyzer that combines Tree-sitter parsing, Language Server Protocol (LSP) integration, and a Neo4j graph database to build code knowledge graphs.

✨ Features

  • 🌐 Multi-Language Support: Analyze Java, Python, Go, and TypeScript codebases
  • 🔍 Deep Code Analysis: Extract classes, methods, dependencies, and relationships
  • 🧠 LSP Integration: Leverage Language Server Protocol for semantic analysis
  • 📊 Knowledge Graph: Build rich code graphs in Neo4j for advanced querying
  • 🎯 AST-Based: Uses Tree-sitter for accurate syntax parsing
  • ⚡ Incremental Analysis: Track code changes with AST hashing
  • 🔗 Relationship Tracking: Discover implements, extends, uses, and calls relationships
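
The idea behind incremental analysis with AST hashing is that a digest of the syntax tree changes only when the code's structure changes, not when its formatting does. The sketch below illustrates that idea only: it uses Python's built-in `ast` module rather than Tree-sitter, and `ast_hash` is a hypothetical helper, not part of the Source Atlas API.

```python
import ast
import hashlib


def ast_hash(source: str) -> str:
    """Return a SHA-256 hex digest of the syntax tree parsed from `source`.

    ast.dump() leaves out line and column attributes by default, and comments
    never reach the tree, so whitespace and comment edits keep the digest stable.
    """
    tree = ast.parse(source)
    canonical = ast.dump(tree, include_attributes=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = "def add(a, b):\n    return a + b\n"
reformatted = "def add(a,b):  # sum\n\n    return a+b\n"
modified = "def add(a, b):\n    return a - b\n"

# A formatting-only edit hashes the same; a semantic edit does not.
print(ast_hash(original) == ast_hash(reformatted))
print(ast_hash(original) == ast_hash(modified))
```

Comparing a stored digest with a freshly computed one is enough to decide whether a file needs to be re-analyzed.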

๐Ÿ—๏ธ Architecture

graph TB
    A[Source Code] --> B[Tree-sitter Parser]
    B --> C[AST Analysis]
    C --> D[LSP Service]
    D --> E[Code Analyzer]
    E --> F[Code Chunks]
    F --> G[Neo4j Knowledge Graph]

Components:

  • Analyzers: Language-specific code analyzers (Java, Python, Go, TypeScript)
  • Extractors: Extract specific code elements (classes, methods, endpoints)
  • LSP Service: Integrates with language servers for semantic information
  • Neo4j Service: Manages code graph database operations
  • Models: Domain models for code chunks, methods, and relationships
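
To make the Models component more concrete, here is a minimal sketch of what domain models for code chunks and relationships could look like. Every class and field name below is an assumption made for illustration; the actual definitions live in `models/domain_models.py` and may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Relationship:
    """A directed edge between two code elements (hypothetical shape)."""
    source: str  # fully qualified name of the originating element
    kind: str    # assumed labels: "IMPLEMENTS", "EXTENDS", "USES", "CALLS"
    target: str  # fully qualified name of the element pointed at


@dataclass
class CodeChunk:
    """One analyzed class plus the edges discovered for it (hypothetical shape)."""
    full_class_name: str
    file_path: str
    ast_hash: str
    methods: List[str] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)

    def edges(self, kind: str) -> List[Relationship]:
        """Return only the relationships of the given kind."""
        return [r for r in self.relationships if r.kind == kind]


chunk = CodeChunk(
    full_class_name="com.example.service.OrderService",
    file_path="src/main/java/com/example/service/OrderService.java",
    ast_hash="abc123",
    methods=["create"],
    relationships=[
        Relationship("com.example.service.OrderService", "IMPLEMENTS",
                     "com.example.service.Service"),
        Relationship("com.example.service.OrderService.create", "CALLS",
                     "com.example.repo.OrderRepository.save"),
    ],
)
print([r.target for r in chunk.edges("CALLS")])
```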

📋 Prerequisites

  • Python: 3.8 or higher
  • Neo4j: 5.x running locally or remotely
  • Language-specific tools (for the languages you want to analyze):
    • Java: JDK 11+ (for LSP server)
    • Python: Python 3.8+
    • Go: Go 1.16+
    • TypeScript: Node.js 14+
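
Before installing, it can help to confirm that the tools for your target languages are on `PATH`. The script below is a small sketch of such a check. The executable names (`java`, `go`, `node`) are assumptions about a typical install, and the script verifies presence only, not the minimum versions listed above.

```python
import shutil
import sys

# Assumed executable names per language; adjust for your environment.
TOOLS = {"java": "java", "go": "go", "typescript": "node"}


def missing_tools(languages, which=shutil.which):
    """Return the executables that cannot be found on PATH for `languages`.

    Languages without an entry in TOOLS are skipped.
    """
    return [
        TOOLS[lang]
        for lang in languages
        if lang in TOOLS and which(TOOLS[lang]) is None
    ]


if __name__ == "__main__":
    if sys.version_info < (3, 8):
        sys.exit("Python 3.8 or higher is required")
    missing = missing_tools(["java", "go", "typescript"])
    print("Missing tools:", ", ".join(missing) or "none")
```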

🚀 Installation

1. Clone the repository

git clone https://github.com/quyen-ngv/source-atlas.git
cd source-atlas

2. Create virtual environment

python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/Mac
source .venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Set up Neo4j

Download and install Neo4j Desktop or use Docker:

docker run -d \
  --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/your_password \
  neo4j:5.14.0

5. Configure environment

Create a .env file from the template:

cp .env.example .env

Edit .env with your settings:

APP_NEO4J_URL=bolt://localhost:7687
APP_NEO4J_USER=neo4j
APP_NEO4J_PASSWORD=your_password
APP_NEO4J_DATABASE=neo4j

💻 Quick Start

Basic Usage

python -m source_atlas analyze \
  --project-path /path/to/your/project \
  --language java \
  --project-id my-project \
  --output ./output

Using as a Library

from pathlib import Path
from analyzers.analyzer_factory import AnalyzerFactory
from neo4jdb.neo4j_service import Neo4jService

# Create analyzer
analyzer = AnalyzerFactory.create_analyzer(
    language="java",
    root_path="/path/to/project",
    project_id="my-project",
    branch="main"
)

# Analyze project
with analyzer:
    chunks = analyzer.parse_project(Path("/path/to/project"))

# Import to Neo4j
neo4j_service = Neo4jService(
    url="bolt://localhost:7687",
    user="neo4j",
    password="your_password"
)
neo4j_service.import_code_chunks(
    chunks=chunks,
    batch_size=500,
    main_branch="main",
    base_branch="main",
    pull_request_id=None,
)

🔧 Configuration

Environment Variables

| Variable | Description | Default | Required |
|---|---|---|---|
| APP_NEO4J_URL | Neo4j connection URL | bolt://localhost:7687 | Yes |
| APP_NEO4J_USER | Neo4j username | neo4j | Yes |
| APP_NEO4J_PASSWORD | Neo4j password | - | Yes |
| APP_NEO4J_DATABASE | Neo4j database name | neo4j | Yes |
| NEO4J_MAX_CONNECTION_POOL_SIZE | Max connection pool size | 50 | No |
| NEO4J_CONNECTION_TIMEOUT | Connection timeout (seconds) | 30.0 | No |

See docs/configuration.md for detailed configuration options.
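
As an illustration of how these variables fit together, the sketch below reads them into a typed settings object. `Neo4jSettings` and `load_settings` are hypothetical names, not part of Source Atlas. The sketch treats the four `APP_NEO4J_*` variables as mandatory, following the Required column, and applies the documented defaults only to the two optional ones.

```python
import os
from dataclasses import dataclass

REQUIRED = (
    "APP_NEO4J_URL",
    "APP_NEO4J_USER",
    "APP_NEO4J_PASSWORD",
    "APP_NEO4J_DATABASE",
)


@dataclass(frozen=True)
class Neo4jSettings:
    url: str
    user: str
    password: str
    database: str
    max_pool_size: int
    connection_timeout: float


def load_settings(env=None) -> Neo4jSettings:
    """Build settings from a mapping (defaults to os.environ).

    Raises KeyError naming every required variable that is unset or empty.
    """
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise KeyError("Missing required settings: " + ", ".join(missing))
    return Neo4jSettings(
        url=env["APP_NEO4J_URL"],
        user=env["APP_NEO4J_USER"],
        password=env["APP_NEO4J_PASSWORD"],
        database=env["APP_NEO4J_DATABASE"],
        max_pool_size=int(env.get("NEO4J_MAX_CONNECTION_POOL_SIZE", "50")),
        connection_timeout=float(env.get("NEO4J_CONNECTION_TIMEOUT", "30.0")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.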

📚 Documentation

🎯 Examples

Analyze a Java Project

python -m source_atlas analyze \
  --project-path ./examples/java_project \
  --language java \
  --project-id example-java \
  --branch main

Query the Knowledge Graph

// Find all classes in a package
MATCH (c:Class {package: "com.example.service"})
RETURN c.className, c.filePath

// Find method call relationships
MATCH (m1:Method)-[:CALLS]->(m2:Method)
RETURN m1.name, m2.name

// Find implementation hierarchies
MATCH (c:Class)-[:IMPLEMENTS]->(i:Class)
RETURN c.fullClassName, i.fullClassName
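
The same graph can be queried from Python. The sketch below wraps a parameterized variant of the call-relationship query in a helper. `find_callers` is a hypothetical function, not part of Source Atlas, and it assumes the official `neo4j` Python driver in a version that provides `Driver.execute_query` (stable since 5.8). The `Method` label, `CALLS` relationship, and `name` property are taken from the queries above.

```python
CALLERS_QUERY = """
MATCH (caller:Method)-[:CALLS]->(callee:Method {name: $name})
RETURN caller.name AS caller
"""


def find_callers(driver, method_name, database="neo4j"):
    """Return the names of methods that call `method_name`.

    `driver` is expected to behave like neo4j.Driver: execute_query() takes
    query parameters as keyword arguments and returns an object whose
    `records` attribute holds the result rows.
    """
    result = driver.execute_query(
        CALLERS_QUERY,
        name=method_name,    # bound to $name in the query
        database_=database,  # trailing underscore marks a driver option
    )
    return [record["caller"] for record in result.records]
```

To run it against a live database, create the driver with `neo4j.GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "your_password"))` and pass it to `find_callers`. Binding the method name as a parameter rather than formatting it into the query string avoids Cypher injection and lets Neo4j reuse the query plan.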

๐Ÿ—‚๏ธ Project Structure

source_atlas/
├── analyzers/          # Language-specific code analyzers
│   ├── base_analyzer.py
│   ├── java_analyzer.py
│   └── analyzer_factory.py
├── extractors/         # Code element extractors
│   ├── java/
│   ├── python/
│   ├── go/
│   └── typescript/
├── lsp/               # LSP service integration
│   ├── lsp_service.py
│   └── implements/
├── models/            # Domain models
│   └── domain_models.py
├── neo4jdb/           # Neo4j integration
│   ├── neo4j_service.py
│   └── neo4j_dto.py
├── utils/             # Utility functions
└── config/            # Configuration

๐Ÿค Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

๐Ÿ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • Tree-sitter - Incremental parsing system
  • Neo4j - Graph database platform
  • LSP - Language Server Protocol

📧 Contact

🐛 Issues & Support

If you encounter any issues or have questions:


Made with ❤️ by Nguyen Van Quyen
