Multi-language code analyzer with LSP and Neo4j integration

Source Atlas

License: MIT | Python 3.8+

Source Atlas is a multi-language code analyzer that combines Tree-sitter parsing, Language Server Protocol (LSP) integration, and a Neo4j graph database to build code knowledge graphs.

✨ Features

  • 🌐 Multi-Language Support: Analyze Java, Python, Go, and TypeScript codebases
  • 🔍 Deep Code Analysis: Extract classes, methods, dependencies, and relationships
  • 🧠 LSP Integration: Leverage Language Server Protocol for semantic analysis
  • 📊 Knowledge Graph: Build rich code graphs in Neo4j for advanced querying
  • 🎯 AST-Based: Uses Tree-sitter for accurate syntax parsing
  • ⚡ Incremental Analysis: Track code changes with AST hashing
  • 🔗 Relationship Tracking: Discover implements, extends, uses, and calls relationships
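
To illustrate the idea behind tracking changes with AST hashing, here is a minimal sketch. It is not Source Atlas's implementation: it uses Python's standard `ast` module as a stand-in for Tree-sitter, and the function names `ast_fingerprint` and `has_changed` are hypothetical.

```python
import ast
import hashlib


def ast_fingerprint(source: str) -> str:
    """Return a SHA-256 hex digest of the syntax tree parsed from `source`.

    Hashing the tree instead of the raw text means comments and
    whitespace-only edits do not change the fingerprint.
    """
    tree = ast.parse(source)
    # ast.dump leaves out line/column attributes by default, so the dump
    # depends only on the structure of the code.
    canonical = ast.dump(tree)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(old_source: str, new_source: str) -> bool:
    """True when the two versions differ structurally."""
    return ast_fingerprint(old_source) != ast_fingerprint(new_source)


original = "def add(a, b):\n    return a + b\n"
reformatted = "def add(a, b):\n    # sum two numbers\n    return a  +  b\n"
modified = "def add(a, b):\n    return a - b\n"

print(has_changed(original, reformatted))  # False
print(has_changed(original, modified))     # True
```

Storing one fingerprint per file or per class would let a later run skip anything whose fingerprint is unchanged.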

๐Ÿ—๏ธ Architecture

graph TB
    A[Source Code] --> B[Tree-sitter Parser]
    B --> C[AST Analysis]
    C --> D[LSP Service]
    D --> E[Code Analyzer]
    E --> F[Code Chunks]
    F --> G[Neo4j Knowledge Graph]

Components:

  • Analyzers: Language-specific code analyzers (Java, Python, Go, TypeScript)
  • Extractors: Extract specific code elements (classes, methods, endpoints)
  • LSP Service: Integrates with language servers for semantic information
  • Neo4j Service: Manages code graph database operations
  • Models: Domain models for code chunks, methods, and relationships
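
As a rough picture of how domain models for chunks, methods, and relationships could feed a graph, here is a small sketch. The class and field names (`CodeChunk`, `MethodInfo`, `to_edges`) are hypothetical and are not taken from `models/domain_models.py`; only the `IMPLEMENTS` and `CALLS` relationship names come from the example queries in this README.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MethodInfo:
    """Hypothetical model of one method and the methods it calls."""
    name: str
    calls: List[str] = field(default_factory=list)


@dataclass
class CodeChunk:
    """Hypothetical model of one class-level unit of analyzed code."""
    full_class_name: str
    file_path: str
    implements: List[str] = field(default_factory=list)
    methods: List[MethodInfo] = field(default_factory=list)


def to_edges(chunk: CodeChunk) -> List[Tuple[str, str, str]]:
    """Flatten a chunk into (source, RELATIONSHIP, target) triples."""
    edges = [(chunk.full_class_name, "IMPLEMENTS", i) for i in chunk.implements]
    for method in chunk.methods:
        source = f"{chunk.full_class_name}.{method.name}"
        edges.extend((source, "CALLS", target) for target in method.calls)
    return edges


chunk = CodeChunk(
    full_class_name="com.example.service.UserServiceImpl",
    file_path="src/main/java/com/example/service/UserServiceImpl.java",
    implements=["com.example.service.UserService"],
    methods=[MethodInfo("findUser", calls=["com.example.repo.UserRepo.findById"])],
)

for edge in to_edges(chunk):
    print(edge)
```

Each triple corresponds to one relationship that a graph import step could write between two nodes.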

📋 Prerequisites

  • Python: 3.8 or higher
  • Neo4j: 5.x running locally or remotely
  • Language-specific tools (for the languages you want to analyze):
    • Java: JDK 11+ (for LSP server)
    • Python: Python 3.8+
    • Go: Go 1.16+
    • TypeScript: Node.js 14+

🚀 Installation

1. Clone the repository

git clone https://github.com/quyen-ngv/source-atlas.git
cd source-atlas

2. Create virtual environment

python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/Mac
source .venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Set up Neo4j

Download and install Neo4j Desktop or use Docker:

docker run -d \
  --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/your_password \
  neo4j:5.14.0
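
Before moving on, it can help to confirm that the Bolt port published by the container is reachable. The sketch below uses only the Python standard library; `bolt_port_open` is a hypothetical helper, and it checks TCP reachability only, not credentials.

```python
import socket


def bolt_port_open(host: str = "localhost", port: int = 7687, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print("Neo4j Bolt port reachable:", bolt_port_open())
```

The container may need a few seconds after `docker run` before the port accepts connections.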

5. Configure environment

Create a .env file from the template:

cp .env.example .env

Edit .env with your settings:

APP_NEO4J_URL=bolt://localhost:7687
APP_NEO4J_USER=neo4j
APP_NEO4J_PASSWORD=your_password
APP_NEO4J_DATABASE=neo4j

💻 Quick Start

Basic Usage

python -m source_atlas analyze \
  --project-path /path/to/your/project \
  --language java \
  --project-id my-project \
  --output ./output

Using as a Library

from pathlib import Path
from analyzers.analyzer_factory import AnalyzerFactory
from neo4jdb.neo4j_service import Neo4jService

# Create analyzer
analyzer = AnalyzerFactory.create_analyzer(
    language="java",
    root_path="/path/to/project",
    project_id="my-project",
    branch="main"
)

# Analyze project
with analyzer:
    chunks = analyzer.parse_project(Path("/path/to/project"))

# Import to Neo4j
neo4j_service = Neo4jService(
    url="bolt://localhost:7687",
    user="neo4j",
    password="your_password"
)
neo4j_service.import_code_chunks(
    chunks=chunks,
    batch_size=500,
    main_branch="main",
    base_branch="main",
    pull_request_id=None,
)
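
The `batch_size=500` argument suggests that chunks are written to Neo4j in groups rather than one at a time. How Source Atlas does this internally is not shown here, so the following is only a sketch of the general batching technique; `batched` is a hypothetical helper.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 1200 items split into batches of 500 leave a final partial batch.
sizes = [len(batch) for batch in batched(range(1200), 500)]
print(sizes)  # [500, 500, 200]
```

Larger batches mean fewer round trips to the database at the cost of bigger transactions.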

🔧 Configuration

Environment Variables

| Variable | Description | Default | Required |
|---|---|---|---|
| APP_NEO4J_URL | Neo4j connection URL | bolt://localhost:7687 | Yes |
| APP_NEO4J_USER | Neo4j username | neo4j | Yes |
| APP_NEO4J_PASSWORD | Neo4j password | - | Yes |
| APP_NEO4J_DATABASE | Neo4j database name | neo4j | Yes |
| NEO4J_MAX_CONNECTION_POOL_SIZE | Max connection pool size | 50 | No |
| NEO4J_CONNECTION_TIMEOUT | Connection timeout (seconds) | 30.0 | No |

See docs/configuration.md for detailed configuration options.
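
For reference, here is one way these variables could be read from the environment with the listed defaults. This is a sketch, not the project's own configuration code: `Neo4jSettings` and `load_settings` are hypothetical names, and treating only the password as mandatory is an assumption based on it being the one variable without a default.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Neo4jSettings:
    url: str
    user: str
    password: str
    database: str
    max_connection_pool_size: int
    connection_timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Neo4jSettings:
    """Build settings from environment variables, applying the documented defaults."""
    password = env.get("APP_NEO4J_PASSWORD")
    if not password:
        # The password has no default, so fail early with a clear message.
        raise RuntimeError("APP_NEO4J_PASSWORD must be set")
    return Neo4jSettings(
        url=env.get("APP_NEO4J_URL", "bolt://localhost:7687"),
        user=env.get("APP_NEO4J_USER", "neo4j"),
        password=password,
        database=env.get("APP_NEO4J_DATABASE", "neo4j"),
        max_connection_pool_size=int(env.get("NEO4J_MAX_CONNECTION_POOL_SIZE", "50")),
        connection_timeout=float(env.get("NEO4J_CONNECTION_TIMEOUT", "30.0")),
    )


settings = load_settings({"APP_NEO4J_PASSWORD": "your_password"})
print(settings.url, settings.max_connection_pool_size, settings.connection_timeout)
```

Passing a plain dictionary instead of `os.environ`, as in the last lines, keeps the loader easy to test.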

📚 Documentation

🎯 Examples

Analyze a Java Project

python -m source_atlas analyze \
  --project-path ./examples/java_project \
  --language java \
  --project-id example-java \
  --branch main

Query the Knowledge Graph

// Find all classes in a package
MATCH (c:Class {package: "com.example.service"})
RETURN c.className, c.filePath

// Find method call relationships
MATCH (m1:Method)-[:CALLS]->(m2:Method)
RETURN m1.name, m2.name

// Find implementation hierarchies
MATCH (c:Class)-[:IMPLEMENTS]->(i:Class)
RETURN c.fullClassName, i.fullClassName
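
The same queries can be issued from Python with the official `neo4j` driver (`pip install neo4j`). The sketch below parameterizes the first query; `classes_in_package` and `run_query` are hypothetical helpers, and the labels and properties are assumed to match the ones in the queries above.

```python
from typing import Any, Dict, List, Tuple


def classes_in_package(package: str) -> Tuple[str, Dict[str, Any]]:
    """Return a parameterized Cypher query and its parameters."""
    query = (
        "MATCH (c:Class {package: $package}) "
        "RETURN c.className AS className, c.filePath AS filePath"
    )
    return query, {"package": package}


def run_query(
    url: str,
    user: str,
    password: str,
    query: str,
    params: Dict[str, Any],
    database: str = "neo4j",
) -> List[Dict[str, Any]]:
    """Run a read query and return each record as a plain dictionary."""
    # Imported lazily so the query builder above works without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(url, auth=(user, password)) as driver:
        with driver.session(database=database) as session:
            # Consume the result inside the session, before it is closed.
            return [record.data() for record in session.run(query, params)]


query, params = classes_in_package("com.example.service")
print(query)
print(params)
# With a running database:
# rows = run_query("bolt://localhost:7687", "neo4j", "your_password", query, params)
```

Passing the package name as a `$package` parameter instead of formatting it into the string avoids quoting problems and lets Neo4j reuse the query plan.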

🗂️ Project Structure

source_atlas/
├── analyzers/          # Language-specific code analyzers
│   ├── base_analyzer.py
│   ├── java_analyzer.py
│   └── analyzer_factory.py
├── extractors/         # Code element extractors
│   ├── java/
│   ├── python/
│   ├── go/
│   └── typescript/
├── lsp/               # LSP service integration
│   ├── lsp_service.py
│   └── implements/
├── models/            # Domain models
│   └── domain_models.py
├── neo4jdb/           # Neo4j integration
│   ├── neo4j_service.py
│   └── neo4j_dto.py
├── utils/             # Utility functions
└── config/            # Configuration

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Tree-sitter - Incremental parsing system
  • Neo4j - Graph database platform
  • LSP - Language Server Protocol

📧 Contact

🐛 Issues & Support

If you encounter any issues or have questions:


Made with ❤️ by Nguyen Van Quyen
