A graph analytics engine built directly on Apache Arrow
Lynxes
A Fast, Zero-Copy Graph Analytics Engine Built Natively on Apache Arrow.
Lynxes is a fast, lazily evaluated graph analytics engine. Unlike Python graph libraries that wrap general-purpose data structures, Lynxes implements a graph-native engine directly on Arrow, so it neither depends on NetworkX or igraph nor pays their overhead.
Why Lynxes
- Zero-Copy Arrow Backing: `NodeFrame` and `EdgeFrame` directly own Apache Arrow `RecordBatch` data. No intermediate copies, no Pandas/Polars dependency.
- Graph Structure as a First-Class Citizen: `EdgeFrame` always maintains a Compressed Sparse Row (CSR) index, so neighbor lookups are O(degree) with no full table scans.
- Lazy by Default: No computation happens until you call `.collect()`. The built-in optimizer runs Predicate Pushdown, Projection Pushdown, Traversal Pruning, and Subgraph Caching before execution.
- Language-Agnostic Core: The query engine, storage engine, and graph algorithms are written entirely in Rust. Python is a thin PyO3 wrapper.
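To make the O(degree) claim concrete, here is a minimal pure-Python sketch of a CSR index. It is an illustration of the general technique only, not Lynxes' Rust implementation; the function names `build_csr` and `neighbors` and the integer node ids are assumptions made for this example.

```python
from typing import List, Tuple


def build_csr(num_nodes: int, edges: List[Tuple[int, int]]) -> Tuple[List[int], List[int]]:
    """Build (offsets, targets) for a directed graph with nodes 0..num_nodes-1."""
    # Count the out-degree of every source node, shifted by one slot.
    offsets = [0] * (num_nodes + 1)
    for src, _ in edges:
        offsets[src + 1] += 1
    # A prefix sum turns the counts into start positions.
    for i in range(num_nodes):
        offsets[i + 1] += offsets[i]
    # Scatter each destination into the next free slot of its source.
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbors(offsets: List[int], targets: List[int], node: int) -> List[int]:
    """Out-neighbors of `node`: one slice, touching only degree(node) entries."""
    return targets[offsets[node]:offsets[node + 1]]


edges = [(0, 1), (0, 2), (1, 2), (3, 0)]
offsets, targets = build_csr(4, edges)
print(offsets)                          # [0, 2, 3, 3, 4]
print(neighbors(offsets, targets, 0))   # [1, 2]
```

The lookup never inspects edges that belong to other nodes, which is the property the CSR bullet above refers to.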
Quickstart
Install
pip install lynxes
# or
uv add lynxes
Build from source
git clone https://github.com/eastlighting1/Lynxes
cd Lynxes/py-lynxes
uv run maturin develop --release
Python API
import lynxes as lx
# Load from .gf text, .gfb binary, or Parquet
g = lx.read_gf("graph.gf")
# g = lx.read_parquet_graph("nodes.parquet", "edges.parquet")
# g = lx.read_gfb("graph.gfb")
# Build a lazy plan — nothing executes yet
result = (
    g.lazy()
    .filter_nodes(lx.col("age") > 25)
    .expand("KNOWS", hops=2, direction="out")
    .aggregate_neighbors("KNOWS", lx.count().alias("friend_count"))
    .sort("friend_count", descending=True)
    .limit(10)
    .collect()
)
print(result)
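The idea that a chained call only records a plan and runs it on `.collect()` can be sketched without Lynxes at all. The `LazyRows` class below is a hypothetical stand-in written for this example; it works on plain lists of dicts and does not reflect how `LazyGraphFrame` is implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class LazyRows:
    """Hypothetical lazy container: records steps, runs them only in collect()."""
    source: List[Row]
    steps: List[Callable[[List[Row]], List[Row]]] = field(default_factory=list)

    def _with(self, step):
        # Return a new plan; the existing one is left untouched.
        return LazyRows(self.source, self.steps + [step])

    def filter(self, pred):
        return self._with(lambda rows: [r for r in rows if pred(r)])

    def sort(self, key, descending=False):
        return self._with(lambda rows: sorted(rows, key=lambda r: r[key], reverse=descending))

    def limit(self, n):
        return self._with(lambda rows: rows[:n])

    def collect(self):
        # Execution happens here, one recorded step after another.
        rows = list(self.source)
        for step in self.steps:
            rows = step(rows)
        return rows


people = [
    {"id": "alice", "age": 31},
    {"id": "bob", "age": 22},
    {"id": "carol", "age": 45},
]
plan = LazyRows(people).filter(lambda r: r["age"] > 25).sort("age", descending=True).limit(1)
print(len(plan.steps))   # 3 recorded steps, nothing evaluated yet
print(plan.collect())    # [{'id': 'carol', 'age': 45}]
```

Because the plan exists as data before it runs, an optimizer has the chance to reorder or fuse steps first.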
Pattern Matching
Cypher-like pattern matching over the lazy execution engine:
result = (
    g.lazy()
    .match_pattern(
        [
            lx.node("person", "Person"),
            lx.edge("WORKS_AT"),
            lx.node("company", "Company"),
        ],
        where_=lx.col("person.age") > 25,
    )
    .collect()
)
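For readers unfamiliar with this style of query, the following sketch shows what a one-hop pattern such as `(person:Person)-[WORKS_AT]->(company:Company)` with a `where` clause means semantically. It uses plain dicts and tuples; `match_one_hop` and the sample data are hypothetical and say nothing about how Lynxes evaluates patterns internally.

```python
nodes = {
    "alice": {"label": "Person", "age": 31},
    "bob": {"label": "Person", "age": 22},
    "acme": {"label": "Company"},
}
edges = [
    ("alice", "WORKS_AT", "acme"),
    ("bob", "WORKS_AT", "acme"),
    ("alice", "KNOWS", "bob"),
]


def match_one_hop(nodes, edges, src_label, edge_type, dst_label, where=None):
    """Return (src_id, dst_id) bindings for (src_label)-[edge_type]->(dst_label)."""
    matches = []
    for src, etype, dst in edges:
        # The edge type must match the pattern's edge step.
        if etype != edge_type:
            continue
        # Both endpoints must carry the labels named in the pattern.
        if nodes[src]["label"] != src_label or nodes[dst]["label"] != dst_label:
            continue
        # The optional predicate sees the bound source and destination nodes.
        if where is not None and not where(nodes[src], nodes[dst]):
            continue
        matches.append((src, dst))
    return matches


result = match_one_hop(
    nodes, edges, "Person", "WORKS_AT", "Company",
    where=lambda person, company: person["age"] > 25,
)
print(result)   # [('alice', 'acme')]
```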
Graph Algorithms
# PageRank
pr = g.pagerank() # → NodeFrame with 'pagerank' column
# Shortest path
path = g.shortest_path("alice", "charlie") # → ["alice", "bob", "charlie"]
# Connected components
cc = g.connected_components() # → NodeFrame with 'component_id' column
# Betweenness centrality
bc = g.betweenness_centrality()
# Community detection (Louvain / Label Propagation)
cm = g.community_detection()
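As a reference point for what `shortest_path` returns, here is a small breadth-first search over an adjacency dict. It assumes an unweighted, directed graph; whether Lynxes uses BFS, Dijkstra, or something else is not stated in this document, so treat this as a sketch of the result shape rather than of the engine.

```python
from collections import deque


def bfs_shortest_path(adj, src, dst):
    """Unweighted shortest path via BFS; returns a list of node ids or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


adj = {"alice": ["bob", "dave"], "bob": ["charlie"], "dave": []}
print(bfs_shortest_path(adj, "alice", "charlie"))   # ['alice', 'bob', 'charlie']
print(bfs_shortest_path(adj, "dave", "alice"))      # None
```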
Remote Connectors
# Neo4j (Cypher)
g = lx.read_neo4j("bolt://localhost:7687", "neo4j", "password")
# ArangoDB (AQL)
g = lx.read_arangodb(
    endpoint="http://localhost:8529",
    database="mydb",
    graph="social",
    vertex_collection="persons",
    edge_collection="knows",
)
# SPARQL endpoint
g = lx.read_sparql(
    endpoint="https://dbpedia.org/sparql",
    node_template="SELECT ?id WHERE { ?id a <Thing> }",
    edge_template="SELECT ?s ?o WHERE { ?s ?p ?o }",
)
Distributed Graph Partitioning
# Partition a large graph across N shards
pg = g.partition(4, strategy="hash") # or "range" / "label"
print(pg.n_shards) # 4
print(pg.stats()) # imbalance ratio, boundary edges, …
# BFS across shard boundaries
nodes, edges = pg.distributed_expand(["alice"], hops=2, direction="out")
# Merge shards back into one GraphFrame
merged = pg.merge()
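The statistics mentioned above can be illustrated with a toy hash partitioner. Everything here is an assumption made for the example: the use of CRC32 as the hash, the helper names, and the definition of imbalance ratio as largest shard size divided by mean shard size. Lynxes may define these differently.

```python
import zlib
from collections import Counter


def shard_of(node_id: str, n_shards: int) -> int:
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(node_id.encode("utf-8")) % n_shards


def partition_stats(node_ids, edges, n_shards):
    """Assign nodes to shards by hash and summarize the resulting partition."""
    assignment = {n: shard_of(n, n_shards) for n in node_ids}
    sizes = Counter(assignment.values())
    counts = [sizes.get(s, 0) for s in range(n_shards)]
    mean = len(node_ids) / n_shards
    # A boundary edge has its endpoints on different shards.
    boundary = sum(1 for s, d in edges if assignment[s] != assignment[d])
    return {
        "shard_sizes": counts,
        "imbalance_ratio": max(counts) / mean,
        "boundary_edges": boundary,
    }


node_ids = ["alice", "bob", "charlie", "dave", "erin"]
edges = [("alice", "bob"), ("bob", "charlie"), ("charlie", "dave"), ("dave", "erin")]
stats = partition_stats(node_ids, edges, 4)
print(stats)
```

Boundary edges are the ones a cross-shard traversal has to follow between shards, so a lower count generally means less coordination during a distributed expand.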
CLI
# Inspect a .gfb file
lynxes inspect graph.gfb
# Convert formats
lynxes convert graph.gf graph.gfb
# Run a filter query
lynxes query graph.gfb --filter "age > 25" --limit 10
API Overview
Top-level functions
| Function | Description |
|---|---|
| `lx.read_gf(path)` | Load a `.gf` text graph |
| `lx.read_gfb(path)` | Load a `.gfb` binary graph |
| `lx.read_parquet_graph(nodes, edges)` | Load from Parquet files |
| `lx.read_neo4j(uri, user, password)` | Connect to Neo4j |
| `lx.read_arangodb(...)` | Connect to ArangoDB |
| `lx.read_sparql(endpoint, ...)` | Connect to SPARQL endpoint |
| `lx.col(name)` | Create a column expression |
| `lx.count()` / `lx.sum(e)` / `lx.mean(e)` | Aggregation expressions |
| `lx.node(alias, label?)` | Pattern node descriptor |
| `lx.edge(type?)` | Pattern edge descriptor |
| `lx.partition_graph(g, n)` | Partition a GraphFrame |
GraphFrame methods
| Method | Returns |
|---|---|
| `.lazy()` | `LazyGraphFrame` |
| `.nodes()` / `.edges()` | `NodeFrame` / `EdgeFrame` |
| `.node_count()` / `.edge_count()` | `int` |
| `.subgraph(ids)` / `.subgraph_by_label(l)` | `GraphFrame` |
| `.pagerank(...)` | `NodeFrame` |
| `.shortest_path(src, dst)` | `list[str]` |
| `.connected_components()` | `NodeFrame` |
| `.betweenness_centrality()` | `NodeFrame` |
| `.community_detection()` | `NodeFrame` |
| `.partition(n, strategy)` | `PartitionedGraph` |
| `.write_gf(path)` / `.write_gfb(path)` | None |
| `.write_parquet_graph(nodes, edges)` | None |
LazyGraphFrame methods
| Method | Description |
|---|---|
| `.filter_nodes(expr)` | Keep nodes matching expression |
| `.filter_edges(expr)` | Keep edges matching expression |
| `.select_nodes(cols)` / `.select_edges(cols)` | Project columns |
| `.expand(type?, hops, direction)` | BFS graph traversal |
| `.aggregate_neighbors(type, agg)` | Aggregate over neighbor edges |
| `.match_pattern(steps, where_?)` | Cypher-like pattern matching |
| `.sort(by, descending)` | Sort result |
| `.limit(n)` | Cap result size |
| `.explain()` | Print logical plan |
| `.collect()` | Execute → GraphFrame |
| `.collect_nodes()` | Execute → NodeFrame |
| `.collect_edges()` | Execute → EdgeFrame |
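The hop-limited BFS behind an `expand`-style step can be sketched as follows. The function is hypothetical and scans a plain edge list on every hop for brevity, which a CSR index avoids. The `"in"` and `"both"` direction values are assumptions, since only `"out"` appears in this document.

```python
def expand(edges, seeds, edge_type=None, hops=1, direction="out"):
    """Return the set of nodes reachable from `seeds` within `hops` steps."""
    visited = set(seeds)
    frontier = set(seeds)
    for _ in range(hops):
        nxt = set()
        for src, etype, dst in edges:
            # Skip edges of other types when a type filter is given.
            if edge_type is not None and etype != edge_type:
                continue
            if direction in ("out", "both") and src in frontier:
                nxt.add(dst)
            if direction in ("in", "both") and dst in frontier:
                nxt.add(src)
        # Only nodes not seen before form the next frontier.
        frontier = nxt - visited
        visited |= frontier
        if not frontier:
            break
    return visited


edges = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "charlie"),
    ("charlie", "KNOWS", "dave"),
    ("alice", "WORKS_AT", "acme"),
]
print(sorted(expand(edges, ["alice"], "KNOWS", hops=2)))   # ['alice', 'bob', 'charlie']
```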
Architecture
Lynxes is organized as a multi-crate Rust workspace with a thin Python layer on top:
py-lynxes/              ← Python package (maturin / PyO3)
  src/lynxes/           ← lynxes Python namespace
  tests/unit/           ← pytest integration tests
  tests/benchmark/      ← NetworkX / igraph comparisons
crates/
  lynxes/               ← Umbrella re-export crate
  lynxes-core/          ← Arrow frames, CSR index, algorithms,
                          expression types, logical plan, optimizer
  lynxes-plan/          ← Logical plan re-exports (thin)
  lynxes-io/            ← File I/O (.gf parser, .gfb binary, Parquet)
  lynxes-connect/       ← Remote connectors (Neo4j, ArangoDB,
                          SPARQL, Arrow Flight, GFConnector)
  lynxes-lazy/          ← LazyGraphFrame + query executor
  lynxes-python/        ← PyO3 binding crate (_lynxes.so)
  lynxes-cli/           ← `lynxes` command-line tool
Execution Pipeline
Python call
    │
    ▼
LazyGraphFrame (plan tree)
    │
    ▼
Optimizer ──── PredicatePushdown
          ──── ProjectionPushdown
          ──── TraversalPruning
          ──── SubgraphCaching
          ──── EarlyTermination
    │
    ▼
Executor ──────────────────────┐
    │                          │
    ▼                          ▼
NodeFrame / EdgeFrame      CSR Index (O(degree))
(Arrow RecordBatch)        BFS / Traversal / Algorithms
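To show why an optimizer stage sits between the plan and the executor, the toy below rewrites a plan so that a filter runs before a sort. The two operations commute, so the result is identical while the sort handles fewer rows. This is a deliberately simplified stand-in for predicate pushdown; the step encoding and the `optimize` / `execute` helpers are hypothetical and are not Lynxes' optimizer passes.

```python
def optimize(steps):
    """Swap each filter with a sort directly in front of it until none remain."""
    steps = list(steps)
    changed = True
    while changed:
        changed = False
        for i in range(1, len(steps)):
            if steps[i][0] == "filter" and steps[i - 1][0] == "sort":
                steps[i - 1], steps[i] = steps[i], steps[i - 1]
                changed = True
    return steps


def execute(rows, steps):
    """Run the steps in order; `work` counts rows handed to sort as a cost proxy."""
    work = 0
    for op, arg in steps:
        if op == "filter":
            rows = [r for r in rows if arg(r)]
        elif op == "sort":
            work += len(rows)
            rows = sorted(rows, key=lambda r: r[arg])
        elif op == "limit":
            rows = rows[:arg]
    return rows, work


rows = [{"id": i, "age": a} for i, a in enumerate([40, 18, 33, 21, 57])]
plan = [("sort", "age"), ("filter", lambda r: r["age"] > 25), ("limit", 2)]

naive_rows, naive_work = execute(rows, plan)
fast_rows, fast_work = execute(rows, optimize(plan))

print([r["age"] for r in fast_rows])   # [33, 40]
print(naive_work, fast_work)           # 5 3
```

Both plans return the same rows, but the rewritten one sorts three rows instead of five.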
Crate Dependency Graph
lynxes-python ──┐
lynxes-cli ──┤
├──► lynxes-lazy ──► lynxes-connect ──┐
│ ├──► lynxes-io ──┐
│ └──► lynxes-plan ─┤
│ ├──► lynxes-core
└───────────────────────────────────────────────────────►┘
Documentation Map
- `DESIGN.md`: In-depth architectural design and engine principles
- `docs/spec/`: Feature and restructure specifications
- `py-lynxes/tests/benchmark/`: Performance benchmarks vs NetworkX / igraph
Contributing
Please read DESIGN.md first. Core principles that are non-negotiable:
- Never wrap Polars: `NodeFrame` / `EdgeFrame` own Arrow `RecordBatch` directly
- CSR is mandatory: `EdgeFrame` always holds a CSR index; no linear scan fallbacks
- Lazy by default: All operations build a `LogicalPlan`; execution happens only on `.collect()`
- No optimization without measurement: Run `cargo bench` before claiming speedups