Gossiphs = Gossip Graphs

[!TIP] We provide an easy-to-use Python SDK and support for MCP (Model Context Protocol), allowing you to seamlessly integrate it with your AI.

See Gossiphs MCP Server

"Zero setup" and "blazingly fast" analysis of relationships between code files, usable from Python and Rust. Based on tree-sitter and git analysis. Supports MCP and is ready for AI🤖

What is it

Gossiphs analyzes your commit history together with the relationships between variable declarations and references in your codebase to build a relationship graph of the code files.

It also lets developers query what each file declares and then search for references to those declarations across the entire codebase, which enables more complex analysis.

graph TD
    A[main.py] --- S1[func_main] --- B[module_a.py]
    A --- S2[Handler] --- C[module_b.py]
    B --- S3[func_util] --- D[utils.py]
    C --- S3[func_util] --- D
    A --- S4[func_init] --- E[module_c.py]
    E --- S5[process] --- F[module_d.py]
    E --- S6[Processor] --- H[module_e.py]
    H --- S7[transform] --- I[module_f.py]
    I --- S3[func_util] --- D

Supported Languages

Language support is built on tree-sitter queries, so adding a new language is not too costly. If you're interested, see the Contribution section.

Language Status
Rust
Python
TypeScript
JavaScript
Golang
Java
Kotlin
Swift

You can see the rule files here.

Usage

Python

pip install gossiphs

Analyze your codebase with networkx within 30 lines:

import networkx as nx
from gossiphs import GraphConfig, create_graph, Graph

# Point the analyzer at the repository root (here: two directories up).
config = GraphConfig()
config.project_path = "../.."
graph: Graph = create_graph(config)

nx_graph = nx.DiGraph()

# One node per file, one edge per pair of related files.
for each_file in graph.files():
    nx_graph.add_node(each_file, metadata=graph.file_metadata(each_file))

    related_files = graph.related_files(each_file)
    for each_related_file in related_files:
        # Deduplicate the names of the symbols that link the two files.
        related_symbols = {each_symbol.symbol.name for each_symbol in each_related_file.related_symbols}

        nx_graph.add_edge(
            each_file,
            each_related_file.name,
            related_symbols=list(related_symbols)
        )

print(f"NetworkX graph created with {nx_graph.number_of_nodes()} nodes and {nx_graph.number_of_edges()} edges.")

# Print every edge together with the symbols that justify it.
for src, dest, data in nx_graph.edges(data=True):
    print(f"{src} -> {dest}, related symbols: {data['related_symbols']}")

Output:

NetworkX graph created with 13 nodes and 27 edges.
src/server.rs -> src/main.rs, related symbols: ['server_main']
src/main.rs -> src/graph.rs, related symbols: ['default']
src/main.rs -> examples/mini.rs, related symbols: ['default']
src/main.rs -> src/server.rs, related symbols: ['main']
src/symbol.rs -> src/graph.rs, related symbols: ['link_file_to_symbol', 'list_references', 'list_references_by_definition', 'id', 'enhance_symbol_to_symbol', 'add_file', 'add_symbol', 'list_definitions', 'list_symbols', 'new', 'link_symbol_to_symbol', 'get_symbol']
...

More examples can be found here.
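
Once you have the edges, a plain traversal is often enough for a first impact estimate. The sketch below is illustrative only: it copies a few edges from the sample output above into a list, assumes the printed `src -> dest` direction is the one you want to follow, and uses a hypothetical helper named `reachable_files` that is not part of the gossiphs API.

```python
from collections import defaultdict, deque

# Edges copied from the sample output above (file -> related file).
edges = [
    ("src/server.rs", "src/main.rs"),
    ("src/main.rs", "src/graph.rs"),
    ("src/main.rs", "examples/mini.rs"),
    ("src/main.rs", "src/server.rs"),
    ("src/symbol.rs", "src/graph.rs"),
]


def reachable_files(edges, start):
    """Return every file reachable from `start` by following edges (BFS)."""
    adjacency = defaultdict(list)
    for src, dest in edges:
        adjacency[src].append(dest)

    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)

    # The starting file itself is not part of its own impact set.
    seen.discard(start)
    return sorted(seen)


print(reachable_files(edges, "src/server.rs"))
# ['examples/mini.rs', 'src/graph.rs', 'src/main.rs']
```

If you need the opposite direction (which files lead to a given file), swap `src` and `dest` when building the adjacency map.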

Others

We also provide a CLI and additional usage options, making it easy to directly export CSV files or start an HTTP service.

See usage page.

Goal & Motivation

[!TIP] Create a file relationship index with:

  • low cost
  • acceptable accuracy
  • high versatility for nearly any code repository

Code navigation is a fascinating subject that plays a pivotal role in various domains, such as:

  • Guiding the context during the development process within an IDE.
  • Facilitating more convenient code browsing on websites.
  • Analyzing the impact of code changes in Continuous Integration (CI) systems.
  • ...

In the past, I tried applying LSP/LSIF and techniques like GitHub's stack-graphs to impact analysis, and ran into different challenges with each. For our needs, an approach similar to stack-graphs comes closest to what we expect. Its drawback is clear, though: it requires writing highly language-specific rules, which is a considerable investment for us given that we do not need data of such high precision.

We make some trade-offs against the challenges that stack-graphs currently faces in order to reach the following goals:

  • Zero repo-specific configuration: It can be applied to most languages and repositories without additional configuration.
  • Low extension cost: The cost of adding rules for a new language is low.
  • Acceptable precision: We sacrifice some precision, but aim to keep it at an acceptable level.

How it works

Gossiphs constructs a graph that interconnects symbols of definitions and references.

  1. Extract imports and exports: Identify the imports and exports of each file.
  2. Connect nodes: Establish connections between potential definition and reference nodes.
  3. Refine edges with commit histories: Utilize commit histories to refine the relationships between nodes.

Unlike stack-graphs, we have omitted the highly complex scope analysis and instead opted to refine our edges using commit histories. This approach significantly reduces the complexity of rule writing, as the rules only need to specify which types of symbols should be exported or imported for each file.
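
To make the three steps concrete, here is a toy sketch in Python. It is not how Gossiphs is implemented: the input data, the function names (`connect`, `co_change_counts`, `refine`), and the choice of a simple co-change count as the refinement signal are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Step 1 (assumed already done): names each file defines and references.
files = {
    "utils.py": {"defs": {"func_util"}, "refs": set()},
    "helpers.py": {"defs": {"func_util"}, "refs": set()},
    "module_a.py": {"defs": {"func_main"}, "refs": {"func_util"}},
}

# Hypothetical commit history: the set of files touched by each commit.
commits = [
    {"utils.py", "module_a.py"},
    {"utils.py", "module_a.py"},
    {"helpers.py"},
]


def connect(files):
    """Step 2: link each reference to every same-named definition elsewhere."""
    edges = defaultdict(set)  # (referencing file, defining file) -> symbol names
    for ref_file, info in files.items():
        for name in info["refs"]:
            for def_file, other in files.items():
                if def_file != ref_file and name in other["defs"]:
                    edges[(ref_file, def_file)].add(name)
    return edges


def co_change_counts(commits):
    """Count how often each pair of files was touched in the same commit."""
    counts = defaultdict(int)
    for touched in commits:
        for pair in combinations(sorted(touched), 2):
            counts[pair] += 1
    return counts


def refine(edges, commits):
    """Step 3: attach a co-change score to every candidate edge."""
    counts = co_change_counts(commits)
    scored = {}
    for (ref_file, def_file), symbols in edges.items():
        key = tuple(sorted((ref_file, def_file)))
        scored[(ref_file, def_file)] = (sorted(symbols), counts.get(key, 0))
    return scored


ranked = refine(connect(files), commits)
for (ref_file, def_file), (symbols, score) in sorted(
    ranked.items(), key=lambda item: -item[1][1]
):
    print(f"{ref_file} -> {def_file} {symbols} co-changes={score}")
```

Name matching alone cannot tell which `func_util` is meant, so both candidate edges are created. The commit history then favors `utils.py`, which changed together with `module_a.py` twice, over `helpers.py`, which never did.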

While there is undoubtedly a trade-off in precision, the benefits are clear:

  1. Minimal impact on accuracy: In practical scenarios, the loss of precision is not as significant as one might expect.
  2. Commit history relevance: The use of commit history to reflect the influence between code segments aligns well with our objectives.
  3. Language support: We can easily support the vast majority of programming languages, meeting the analysis needs of various types of repositories.

Precision

Static analysis has its limits, such as dynamic binding. It is therefore unlikely to match the accuracy of LSP, but it can be accurate enough for the scenarios where it is mainly used.

We demonstrate accuracy by comparing our results with those of LSP/LSIF. Admittedly, static inference can almost never recover every reference relationship that LSP finds.

Depending on your needs, you can post-process the results with other methods, such as TF-IDF, to meet more complex requirements.
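
As one possible reading of the TF-IDF suggestion, the sketch below treats the symbol list of each edge as a "document" and down-weights symbols that appear on many edges (names such as `default` or `new` tend to link everything to everything). The sample data and the function name `tfidf_scores` are hypothetical, and the weighting uses the plain textbook formula rather than anything Gossiphs ships.

```python
import math
from collections import Counter

# Hypothetical edges with the symbols that relate the two files.
edge_symbols = {
    ("src/main.rs", "src/graph.rs"): ["default"],
    ("src/main.rs", "examples/mini.rs"): ["default"],
    ("src/symbol.rs", "src/graph.rs"): ["default", "list_references", "add_symbol"],
}


def tfidf_scores(edge_symbols):
    """Score each symbol per edge: term frequency times inverse edge frequency."""
    n_edges = len(edge_symbols)

    # Number of edges on which each symbol appears at least once.
    doc_freq = Counter()
    for symbols in edge_symbols.values():
        doc_freq.update(set(symbols))

    scores = {}
    for edge, symbols in edge_symbols.items():
        term_freq = Counter(symbols)
        total = len(symbols)
        scores[edge] = {
            name: (count / total) * math.log(n_edges / doc_freq[name])
            for name, count in term_freq.items()
        }
    return scores


scores = tfidf_scores(edge_symbols)
for edge, per_symbol in scores.items():
    print(edge, {name: round(value, 3) for name, value in per_symbol.items()})
```

Here `default` appears on every edge, so its weight drops to zero, while `list_references` and `add_symbol` keep a positive weight. Edges whose symbols all score near zero are candidates for pruning.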

| Repo | Coverage of LSP Edges by Gossiphs |
| --- | --- |
| https://github.com/go-gorm/gorm | 442/499 = 88.6% |
| https://github.com/gin-gonic/gin | 238/252 = 94.4% |
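
A coverage figure of this kind can be computed as the share of LSP edges that also show up in the Gossiphs result. The sketch below assumes that an edge is an unordered pair of file paths. The source does not spell out that definition, and the helper name `edge_coverage` and the sample edges are hypothetical.

```python
def edge_coverage(lsp_edges, gossiphs_edges):
    """Return (hits, total, ratio) of LSP edges also found by Gossiphs."""
    # Treat edges as unordered file pairs so direction does not matter.
    lsp = {frozenset(edge) for edge in lsp_edges}
    found = {frozenset(edge) for edge in gossiphs_edges}

    hits = len(lsp & found)
    total = len(lsp)
    ratio = hits / total if total else 0.0
    return hits, total, ratio


# Hypothetical data: LSP reports three file-level edges, Gossiphs finds two
# of them plus one extra edge that LSP does not report.
lsp_edges = [("a.go", "b.go"), ("a.go", "c.go"), ("b.go", "d.go")]
gossiphs_edges = [("b.go", "a.go"), ("a.go", "c.go"), ("c.go", "d.go")]

hits, total, ratio = edge_coverage(lsp_edges, gossiphs_edges)
print(f"{hits}/{total} = {ratio:.1%}")
```

Note that this measures recall against LSP only. Extra edges that Gossiphs reports, such as those derived from commit history, do not lower the figure.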

Contribution

The project is still in a very early and experimental stage. If you are interested, please leave your thoughts through an issue. In the short term, we hope to build better support for more languages.

You just need to:

  1. Edit rules in src/rule.rs
  2. Test it in src/extractor.rs
  3. Try it with your repo in src/graph.rs

Tree-sitter Playground is a good helper.

License

Apache 2.0
