
Async-native Python library for bulk operations on self-hosted GitLab


labflow

GraphQL-first async-native Python library for self-hosted GitLab instances.

Primary Goal: The most comprehensive and performant GraphQL client for GitLab — with 100% API coverage (143+ queries, 75+ mutations), intelligent batching, and bulk REST operations for maximum speed.

Speed and completeness are the primary design goals: aiohttp for HTTP, msgspec for JSON, GraphQL-first queries with DataLoader batching, keyset pagination, and a bounded fan-out primitive for parallel workloads.



GraphQL-First API

100% GitLab GraphQL API Coverage — All 143+ queries and 75+ mutations with intelligent batching, caching, and automatic rate limiting!

Quick Start — GraphQL

import asyncio
import labflow

async def main():
    async with labflow.Client("https://gitlab.example.com", "your-token") as gl:
        # Execute a pre-built query
        result = await gl.graphql.execute(
            gl.graphql.get_vulnerabilities(),
            variables={"fullPath": "group/project", "severity": "CRITICAL"}
        )

        # Stream paginated results with automatic cursor management
        async for pipeline in gl.graphql.stream(
            gl.graphql.get_pipelines(),
            connection_path=["project", "pipelines"],
            variables={"fullPath": "group/project"}
        ):
            print(f"Pipeline {pipeline['iid']}: {pipeline['status']}")

asyncio.run(main())

GraphQL Features

| Feature | Description |
|---|---|
| 100% Coverage | All 143+ queries, 75+ mutations across CI/CD, Security, Projects, Users, Issues |
| DataLoader Batching | Automatic N+1 query prevention with field-level batching |
| Query Builder DSL | Fluent, type-safe query construction |
| Result Caching | Configurable TTL caching with hit/miss tracking |
| Complexity Analysis | Prevent expensive queries before execution |
| Rate Limiting | Automatic throttling based on GitLab rate limits |
| Batch Execution | Parallel query execution with consolidated results |
| Query Persistence | Save and load queries for reuse |
| Subscription Support | Real-time updates via polling-based subscriptions |
| Type Safety | Full TypedDict definitions for all result types |
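
The result-caching row above can be illustrated with a minimal TTL cache. This is a hand-rolled sketch of the concept, not labflow's actual cache class or API:

```python
import time

class TTLCache:
    """Tiny TTL cache with hit/miss counters (illustrative only)."""

    def __init__(self, ttl: float = 60.0):
        self.ttl = ttl
        self._store: dict[str, tuple[float, object]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is not None and time.monotonic() - entry[0] < self.ttl:
            self.hits += 1
            return entry[1]
        self.misses += 1
        return None

    def set(self, key: str, value: object) -> None:
        self._store[key] = (time.monotonic(), value)

cache = TTLCache(ttl=30.0)
key = "GetProject:group/project"
result = cache.get(key)                       # first lookup: miss
if result is None:
    result = {"project": {"name": "demo"}}    # stand-in for a network call
    cache.set(key, result)
cached = cache.get(key)                       # second lookup: hit
```

A real client keys the cache on the query text plus variables; entries expire after the TTL so stale results are refetched.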

Advanced GraphQL Example

import asyncio
import labflow
from labflow.graphql import DataLoader

async def main():
    async with labflow.Client("https://gitlab.example.com", "your-token") as gl:
        # Use the query builder DSL
        q = gl.graphql.query("GetProject") \
            .arg("fullPath", "ID!") \
            .field("project", args={"fullPath": "$fullPath"}) \
                .field("id") \
                .field("name") \
                .field("openIssuesCount") \
            .end()

        result = await gl.graphql.execute(q, variables={"fullPath": "group/project"})
        print(result["project"]["name"])

        # Batch multiple queries to prevent N+1
        loader = DataLoader(gl.graphql, max_batch_size=100)
        projects = await loader.load_many(
            [("project", {"fullPath": path}) for path in ["group/proj1", "group/proj2"]]
        )

        # Use pre-built mutations
        result = await gl.graphql.execute(
            gl.graphql.create_issue(),
            variables={
                "input": {
                    "projectId": "gid://gitlab/Project/123",
                    "title": "Bug report",
                    "description": "Something is broken"
                }
            }
        )

asyncio.run(main())
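
The fluent-builder idea is easy to sketch in plain Python. The `QueryBuilder` below is a hypothetical toy that renders a flat GraphQL document; labflow's real DSL (shown above) also handles nested fields, arguments, and typing:

```python
class QueryBuilder:
    """Toy fluent builder that renders a flat GraphQL document."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.args: list[tuple[str, str]] = []
        self.fields: list[str] = []

    def arg(self, name: str, gql_type: str) -> "QueryBuilder":
        self.args.append((name, gql_type))
        return self  # returning self is what makes chaining work

    def field(self, name: str) -> "QueryBuilder":
        self.fields.append(name)
        return self

    def build(self) -> str:
        var_defs = ", ".join(f"${n}: {t}" for n, t in self.args)
        body = " ".join(self.fields)
        return f"query {self.name}({var_defs}) {{ {body} }}"

q = QueryBuilder("GetProject").arg("fullPath", "ID!").field("id").field("name").build()
# q is now a plain GraphQL string ready to send to the endpoint
```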

See the GraphQL Quick Reference for the complete usage guide.


Performance

labflow achieves up to a 3.36x speedup over the async-wrapper pattern:

| Mode | Users/sec | vs python-gitlab | vs Async Wrapper | Purpose |
|---|---|---|---|---|
| labflow DEFAULT (GIL off) | 1207/s | 100-200x faster | 3.36x | MAXIMUM SPEED |
| labflow DEFAULT (GIL on) | 713/s | 50-100x faster | 2.01x | SPEED - beats async wrapper |
| async wrapper | 359/s | 50-100x faster | 1.0x | Baseline (what we're beating) |
| labflow SAFE MODE | ~200-300/s | 40-80x faster | ~0.7-0.9x | Production reliability |
| python-gitlab | 60-80/s | baseline | 0.15-0.25x | What we're replacing |

Benchmark: streaming 1000 users on code.swecha.org (GitLab 17.5.5) with free-threaded Python 3.14+

GraphQL Performance

| Operation | Typical cost | Notes |
|---|---|---|
| Single query execution | ~50-100ms | With caching: <10ms |
| Batched queries (100) | ~200-500ms | DataLoader prevents N+1 |
| Streaming pagination | ~1000 nodes/s | Automatic cursor management |
| Mutation execution | ~50-100ms | With automatic retry |

Key Optimizations

  1. Cached msgspec.Decoder - Reuse JSON decoders (+10-20%)
  2. uvloop - Fast asyncio event loop (+15-25%)
  3. GIL Disabled - Freethreaded Python 3.14+ (+50-100%)
  4. DataLoader Batching - Prevents N+1 queries (5-10x fewer requests)
  5. Result Caching - Sub-millisecond cache hits
  6. Keyset pagination - Database index seeks (no OFFSET)
  7. Bounded fan-out - Parallel bulk operations
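
Optimization 4, DataLoader-style batching, can be sketched in pure asyncio: individual `load()` calls issued in the same event-loop tick are coalesced into one batched fetch. `MiniLoader` and `batch_fetch` below are illustrative stand-ins, not labflow internals:

```python
import asyncio

async def batch_fetch(keys: list[str]) -> dict[str, str]:
    # Stand-in for a single GraphQL request resolving many keys at once.
    return {k: f"value-for-{k}" for k in keys}

class MiniLoader:
    """Coalesces individual load() calls into one batched fetch per
    event-loop tick, the core idea behind DataLoader batching."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}
        self._scheduled = False
        self.batches = 0  # how many batched fetches actually ran

    async def load(self, key: str) -> str:
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
            if not self._scheduled:
                # Flush once, after all currently running coroutines
                # have had a chance to register their keys.
                self._scheduled = True
                loop.call_soon(lambda: loop.create_task(self._dispatch()))
        return await self._pending[key]

    async def _dispatch(self) -> None:
        pending, self._pending = self._pending, {}
        self._scheduled = False
        self.batches += 1
        results = await batch_fetch(list(pending))
        for key, fut in pending.items():
            fut.set_result(results[key])

async def main() -> tuple[int, list[str]]:
    loader = MiniLoader()
    values = await asyncio.gather(*(loader.load(k) for k in ["a", "b", "c"]))
    return loader.batches, values

batches, values = asyncio.run(main())
# three loads, but only one batched request went out
```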

See: Performance Documentation | GraphQL Benchmarks

Two Modes: SPEED vs RELIABILITY

labflow provides two modes for different needs:

  1. DEFAULT Mode - Zero overhead, designed to beat the async wrapper

    async with labflow.Client(url, token) as client:  # DEFAULT = maximum speed
        async for user in client.users.stream():  # up to 1207 users/s in benchmarks - beats the async wrapper!
            ...
    
    • Zero overhead - skips validation, rate limit tracking, error handling
    • Maximum speed - matches or exceeds async wrapper
    • Clean API - still cleaner than raw aiohttp
    • ⚠️ Use on reliable servers - self-hosted GitLab without rate limits
  2. SAFE MODE - Full validation, production reliability

    async with labflow.Client(url, token, safe_mode=True) as client:
        async for user in client.users.stream():  # typed, validated objects
            ...
    
    • Full error handling - automatic retry on failures
    • Rate limit handling - automatic backoff on 429
    • Type safety - typed objects with validation
    • ⚠️ ~15% slower - trade-off for reliability

Why only 2 modes? Because the goal is simple:

  • DEFAULT mode → Beat async wrapper (SPEED)
  • SAFE mode → Production reliability (RELIABILITY)

Calculate your savings: Run uv run examples/roi_calculator.py to estimate time and cost savings for your instance.

Why So Much Faster?

| Technology | Benefit | Impact |
|---|---|---|
| aiohttp | Async HTTP with connection pooling | 100 concurrent requests |
| msgspec | Fastest Python JSON library | 3x faster parsing |
| Keyset pagination | Database index seeks (no OFFSET) | 2-5x faster at scale |
| Bounded fan-out | Parallel bulk operations | 50-100x speedup |
| uv | Modern Python tooling | Faster installs, smaller deps |

Installation

uv add labflow

Or with pip: pip install labflow

We recommend uv for Python project and dependency management.

Quick Start — REST API (Bulk Operations)

import asyncio
import labflow

async def main():
    async with labflow.Client("https://gitlab.example.com", "your-token") as gl:
        # Stream all active users
        async for user in gl.users.stream():
            print(user.username)

asyncio.run(main())

Bulk Fan-out Example

Use fanout to run a coroutine over every item in a stream with bounded concurrency:

import asyncio
import labflow
from labflow import fanout

async def get_mr_count(gl: labflow.Client, user: labflow.User) -> dict:
    count = 0
    async for _ in gl.mrs.stream_for_user(user.id, state="merged"):
        count += 1
    return {"user": user.username, "merged_mrs": count}

async def main():
    async with labflow.Client(
        "https://gitlab.example.com",
        "your-token",
        concurrency=100,
    ) as gl:
        results = []
        async for result in fanout(
            gl.users.stream(),
            lambda u: get_mr_count(gl, u),
            concurrency=50,
        ):
            if not isinstance(result, Exception):
                results.append(result)

    print(f"Processed {len(results)} users")

asyncio.run(main())
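
The bounded fan-out primitive itself is a small amount of asyncio. The sketch below shows the core idea with `asyncio.Semaphore`; labflow's `fanout` additionally accepts an async stream as input, as shown above:

```python
import asyncio

async def bounded_fanout(items, func, concurrency: int) -> list:
    """Run func over items with at most `concurrency` coroutines in
    flight; exceptions come back as values instead of being raised."""
    sem = asyncio.Semaphore(concurrency)

    async def run_one(item):
        async with sem:  # blocks while `concurrency` workers are busy
            try:
                return await func(item)
            except Exception as exc:
                return exc

    return await asyncio.gather(*(run_one(i) for i in items))

async def demo() -> list[int]:
    async def square(n: int) -> int:
        await asyncio.sleep(0)  # pretend network call
        return n * n

    return await bounded_fanout(range(5), square, concurrency=2)

squares = asyncio.run(demo())
```

Returning exceptions as values is what lets the caller filter them with `isinstance(result, Exception)`, as in the example above.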

API Coverage

✅ 100% Read-Only API Coverage!

labflow covers all 173 read-only GitLab API v4 endpoints across 28 API categories, including Users, Projects, Groups, Merge Requests, Issues, Pipelines, CI/CD, Security, and more.

Note: labflow's REST layer focuses on read and bulk operations; write operations are available through GraphQL mutations. For REST-based CRUD (create/update/delete), use python-gitlab alongside labflow.

See REST API Guide for complete endpoint list.

Error Handling

import asyncio
import labflow

async def main():
    async with labflow.Client("https://gitlab.example.com", "your-token") as gl:
        try:
            user = await gl.users.get(999999)
        except labflow.NotFoundError:
            print("User not found")
        except labflow.RateLimitError:
            print("Rate limited — reduce concurrency")

asyncio.run(main())
See Error Handling Documentation for details.
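
Safe mode's backoff behaviour can be approximated with a small helper. This is a generic jittered exponential-backoff sketch using a stand-in exception, not labflow's implementation (the library lists stamina, a production retry library, among its dependencies):

```python
import asyncio
import random

async def with_backoff(op, attempts: int = 5, base: float = 0.5):
    """Retry an async operation with jittered exponential backoff."""
    for attempt in range(attempts):
        try:
            return await op()
        except RuntimeError:  # stand-in for a rate-limit error
            if attempt == attempts - 1:
                raise  # out of attempts, let the error propagate
            delay = base * (2 ** attempt) * (1 + random.random() * 0.1)
            await asyncio.sleep(delay)

calls = 0

async def flaky() -> str:
    # Fails twice with a simulated 429, then succeeds.
    global calls
    calls += 1
    if calls < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

result = asyncio.run(with_backoff(flaky, base=0.001))
```

The jitter term spreads retries out so many clients hitting the same rate limit do not all retry in lockstep.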

Requirements

  • Python 3.14+ (free-threaded / no-GIL recommended)
  • Dependencies: aiohttp>=3.10, msgspec>=0.18, stamina>=24.2
