Community Version of the B2B Antigravity PySpark Framework. Essential utilities for AWS FinOps and Cloud cost optimization.

Project description

🚀 Antigravity Lite (FinOps & AWS Glue Tools)

AWS Financial Auditor and Smart S3 Manager for PySpark Ecosystems

🛑 The Silent AWS Glue Killer: Spark's Catalyst Optimizer

Have you ever wondered why your massive PySpark cluster just hangs for hours, consuming 100% CPU without writing a single byte of data when processing a Wide Dataframe?

Many data engineers blame data skew or bad partitioning and respond by upscaling AWS Glue workers to the expensive G.4X or G.8X tiers. But paying for more RAM does not fix the problem. The architectural fix is to isolate the math.

📦 Installation

pip install antigravity-lite

🛠 Included Open-Source Tools

1. Smart S3 Renamer (`S3Finalizer` - Universal API)

Tired of PySpark polluting your data lake with part-00000... file names and empty _SUCCESS files? S3Finalizer is a Boto3-based utility that scans raw outputs and renames them into a clean sequence without breaking cluster concurrency. It works with Apache Spark output, AWS Glue DynamicFrame output, and ordinary S3 files.

from antigravity_lite.io.s3_finalizer import S3Finalizer

finalizer = S3Finalizer(bucket_name="my-corporate-datalake")

# Automatically re-sequence and format any outputs natively
finalizer.sequence_files(
    s3_prefix="raw_zone/sales/",
    pattern="ENTERPRISE_REPORT_{seq:04d}.parquet",
    starts_with="",      # Optional: Target specific outputs (e.g. "0000_part")
    ends_with=".parquet",# Optional: Ignore non-parquet files
    contains="part"      # Optional: Filter
)
# Magic Output: ENTERPRISE_REPORT_0001.parquet
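
S3 has no rename operation, so a rename is a copy to the new key followed by a delete of the old one. The sketch below illustrates that idea under stated assumptions; it is not S3Finalizer's implementation. `plan_renames` and `apply_renames` are hypothetical names, and `s3_client` is assumed to be a Boto3 S3 client.

```python
import posixpath


def plan_renames(keys, s3_prefix, pattern, starts_with="", ends_with="", contains=""):
    """Return (old_key, new_key) pairs for the keys that pass all three name filters."""
    selected = []
    for key in sorted(keys):
        name = posixpath.basename(key)
        if not key.startswith(s3_prefix) or not name:
            continue  # outside the prefix, or a "folder" placeholder key
        if name.startswith(starts_with) and name.endswith(ends_with) and contains in name:
            selected.append(key)
    # Number the survivors from 1, matching the 0001 style shown above.
    return [
        (key, s3_prefix + pattern.format(seq=seq))
        for seq, key in enumerate(selected, start=1)
    ]


def apply_renames(s3_client, bucket, renames):
    """Copy each object to its new key, then delete the original."""
    for old_key, new_key in renames:
        # copy_object handles objects up to 5 GB; larger ones need a multipart copy.
        s3_client.copy_object(
            Bucket=bucket,
            Key=new_key,
            CopySource={"Bucket": bucket, "Key": old_key},
        )
        s3_client.delete_object(Bucket=bucket, Key=old_key)
```

Keeping the planning step free of network calls makes it possible to preview the rename list before touching the bucket.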

2. AWS Glue FinOps Auditor (`AgAuditor`)

Run this standalone tool to scan your AWS CloudWatch telemetry and estimate how much you are wasting each month on oversized AWS Glue workers provisioned just to keep Spark's Catalyst Optimizer from crashing.

from antigravity_lite.auditor.finops import AgAuditor

# Scan the cluster and generate a high-fidelity FinOps report
AgAuditor.run_aws_audit(
    region="us-east-1",
    dias_analisis=7,   # Days of history to analyze (the parameter name is Spanish for "analysis days")
    anonimize=True     # Optional: masks sensitive Job names for safe sharing
)

- Premium TUI Output: Generates professional ASCII tables with ANSI colors and Unicode borders (LinkedIn-ready screenshots).
- Resource Anonymization: Deterministic hashing to mask internal AWS naming conventions and project IDs.
- OOM Risk Detection: Automatically flags Jobs where JVM Heap usage exceeds 85%, indicating severe architectural instability.
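
As a rough illustration of the last two features, the sketch below masks job names with a deterministic hash and flags jobs whose peak heap usage exceeds the 85% threshold. It is not AgAuditor's code: the function names, the `job-` token format, and the input shape (a dict of heap fractions per job, as might be read from CloudWatch) are all assumptions.

```python
import hashlib

HEAP_RISK_THRESHOLD = 0.85  # from the text: usage above 85% is flagged


def anonymize(job_name, length=10):
    """Deterministically mask a job name: the same input always gives the same token."""
    digest = hashlib.sha256(job_name.encode("utf-8")).hexdigest()
    return "job-" + digest[:length]


def flag_oom_risk(heap_usage_by_job, threshold=HEAP_RISK_THRESHOLD):
    """Return the sorted job names whose peak heap fraction exceeds the threshold."""
    return sorted(
        job
        for job, samples in heap_usage_by_job.items()
        if samples and max(samples) > threshold
    )


# Hypothetical samples: heap usage as a fraction of the JVM maximum.
usage = {"etl_sales": [0.42, 0.91], "etl_small": [0.30, 0.55]}
for job in flag_oom_risk(usage):
    print(f"OOM risk: {anonymize(job)}")
```

Because the hash is deterministic, the same job keeps the same masked name across reports, so trends stay comparable without exposing internal naming.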

3. S3 Directory Explorer (`AgS3DirectoryLister`)

Tired of discovering that AWS S3 is a flat namespace and doesn't have real "folders"? Listing hierarchies in Boto3 using the CommonPrefixes property is frustrating. AgS3DirectoryLister abstracts all the pain of native pagination and returns a clean logical "folder" tree.

from antigravity_lite.io import AgS3DirectoryLister

explorer = AgS3DirectoryLister()
child_folders = explorer.list_folders("s3://your-bucket/datalake/bronze/")

# Prints a convenient tree in your terminal, emulating `ls`
explorer.print_tree("s3://your-bucket/datalake/bronze/")

# Counts the exact number of files in the whole hierarchy
total_parquet = explorer.count_files("s3://your-bucket/datalake/bronze/", suffix=".parquet")
print(f"Total Parquet files: {total_parquet}")
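
For context, this is roughly what the underlying Boto3 work looks like: listing with a `/` delimiter makes S3 group deeper keys into `CommonPrefixes`, and a paginator walks every page of results. The sketch is an illustration, not AgS3DirectoryLister's implementation; `split_s3_uri` and `list_child_folders` are hypothetical helper names, and `s3_client` is assumed to be a Boto3 S3 client.

```python
def split_s3_uri(uri):
    """Split "s3://bucket/some/prefix/" into ("bucket", "some/prefix/")."""
    if not uri.startswith("s3://"):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = uri[len("s3://"):].partition("/")
    return bucket, key


def list_child_folders(s3_client, bucket, prefix):
    """Return the immediate "folder" prefixes under `prefix`, across all pages."""
    paginator = s3_client.get_paginator("list_objects_v2")
    folders = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
        # Pages that contain only objects have no CommonPrefixes entry at all.
        for entry in page.get("CommonPrefixes", []):
            folders.append(entry["Prefix"])
    return folders
```

With a real client this would be called as `list_child_folders(boto3.client("s3"), *split_s3_uri("s3://your-bucket/datalake/bronze/"))`.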

4. Multithreaded S3 Smart Copier (`AgS3SmartCopier`)

Cloning or merging massive data lakes in S3 with a traditional sequential script is slow and can take all afternoon, and spinning up a Spark cluster just to "copy data" is a gross waste of AWS billing. AgS3SmartCopier runs the transfers concurrently on a pure-Python ThreadPoolExecutor and applies name filters, copying thousands of files in a fraction of the time of a conventional sequential Boto3 script.

from antigravity_lite.io import AgS3SmartCopier

copier = AgS3SmartCopier()

# Ultra-fast massive copy without spinning up Spark
copier.copy_path(
    origin_path="s3://data-lake/raw/",
    dest_path="s3://data-lake/historical/",
    starts_with="SALES_2026",
    ends_with=".parquet",    # Filters to ignore hidden trash files
    max_workers=10           # Number of worker threads running concurrently
)
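
The general pattern behind a threaded copier can be sketched with the standard library alone: filter the keys by name, then hand each copy to a thread pool so the network waits overlap. This is a minimal sketch, not AgS3SmartCopier's implementation; `select_keys`, `copy_many`, and the `copy_one` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def select_keys(keys, starts_with="", ends_with=""):
    """Keep the keys whose file name passes both name filters."""
    return [
        key
        for key in keys
        if key.rsplit("/", 1)[-1].startswith(starts_with) and key.endswith(ends_with)
    ]


def copy_many(keys, copy_one, max_workers=10):
    """Run copy_one(key) for every key on a thread pool; return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order and re-raises a worker's exception
        # when its result is consumed.
        return list(pool.map(copy_one, keys))
```

Threads suit this workload because each copy spends most of its time waiting on S3, not on the CPU. With Boto3, `copy_one` could wrap a `copy_object` call on a client shared between the threads.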

5. Memory Optimizing Chunker (`DataFrameChunkerLite`)

Does your Spark cluster throw Java Heap Space / OutOfMemoryError when saving Wide DataFrames with dozens of columns? DataFrameChunkerLite intercepts Spark's execution graph, truncating the mathematical lineage using Logarithmic Tree-Reduction methodologies so your cluster survives without scaling your AWS infrastructure.

Important Note: This is the Community Edition and is strictly limited to a maximum of 550 columns. If you run this on a wider dataframe, it will safely reject execution.

from antigravity_lite.core import DataFrameChunkerLite

# 1. Provide the wide dataframe and the primary key
chunker = DataFrameChunkerLite(df_crashing, id_cols=["client_id"], chunk_size=20)

def business_logic(chunk_df, index, multiplier):
    # This logic now runs isolated and safe from Catalyst OOM,
    # using custom parameters passed through the Chunker
    for c in chunk_df.columns:
        if c != "client_id":
            chunk_df = chunk_df.withColumn(c, chunk_df[c] * multiplier)
    return chunk_df

# 🚀 OPTION A: High-Performance Zero-Shuffle (Recommended)
# This avoids expensive Joins, running in seconds instead of minutes.
df_final = chunker.process_inplace(business_logic, multiplier=1.5)

# 🧩 OPTION B: Standard Split-Join Architecture
# Useful for complex logic that requires processing isolation.
results = chunker.process_chunks(business_logic, multiplier=1.5)
df_final = chunker.join_chunks(results)
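
To make the two ideas concrete, the sketch below shows column chunking and a pairwise tree reduction in plain Python. It is an illustration under stated assumptions, not DataFrameChunkerLite's internals: `chunk_columns` and `tree_reduce` are hypothetical names, and raising `ValueError` at the 550-column limit is an assumed way of "rejecting execution".

```python
MAX_COLUMNS = 550  # Community Edition limit stated above


def chunk_columns(columns, id_cols, chunk_size):
    """Split the non-key columns into groups of at most chunk_size columns."""
    if len(columns) > MAX_COLUMNS:
        raise ValueError(f"{len(columns)} columns exceeds the {MAX_COLUMNS}-column limit")
    payload = [c for c in columns if c not in id_cols]
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]


def tree_reduce(items, combine):
    """Combine items pairwise, level by level, in about log2(n) rounds."""
    if not items:
        raise ValueError("tree_reduce needs at least one item")
    level = list(items)
    while len(level) > 1:
        next_level = [
            combine(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:  # carry the unpaired last item up unchanged
            next_level.append(level[-1])
        level = next_level
    return level[0]
```

In PySpark, each group could become `df.select(*id_cols, *group)`, and `combine` could be a join on the key columns, such as `lambda left, right: left.join(right, on=id_cols)`. A tree keeps the join depth near log2(n) instead of the n - 1 nested joins a left-to-right chain produces.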

6. Massive Binary Stream Concatenator (`AgStreamConcatenator`)

Are you trying to concatenate 100 GB of CSV fragments into a single file, only to have Python crash with MemoryError when using pandas or open().read()? This utility streams the data with constant, O(1), memory use. It transfers fixed-size chunks of bytes directly to disk (or S3), keeping a hard limit on RAM usage, while skipping redundant CSV headers.

import glob
from antigravity_lite.io import AgStreamConcatenator, AgS3DirectoryLister

joiner = AgStreamConcatenator()
explorer = AgS3DirectoryLister()

# --- LOCAL TO LOCAL ---
# Pass 500 files dynamically by exploring the local disk
local_files = glob.glob("/Users/data/chunks/*.csv")
joiner.concat_local(
    input_paths=local_files, 
    output_path="massive_combined.csv",
    has_header=True
)

# --- S3 TO S3 ---
# Gather 1,000 files dynamically traversing S3 without bringing them to local disk
cloud_files = explorer.list_files("s3://lake/raw/sales/", suffix=".csv")
joiner.concat_s3(
    input_s3_uris=cloud_files,
    output_s3_uri="s3://lake/gold/massive_sales.csv",
    has_header=True
)
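
The local case can be sketched in a few lines: read each input in fixed-size chunks, write them straight to the output, and drop the first line of every file after the first. This is a minimal sketch, not AgStreamConcatenator's implementation; `concat_csv` and `chunk_size` are hypothetical names, and it assumes the header is a single line with no quoted line breaks.

```python
def concat_csv(input_paths, output_path, has_header=True, chunk_size=1024 * 1024):
    """Concatenate CSV files while holding at most one chunk in memory."""
    with open(output_path, "wb") as dst:
        for index, path in enumerate(input_paths):
            with open(path, "rb") as src:
                if has_header and index > 0:
                    src.readline()  # discard the repeated header line
                last_byte = b""
                while chunk := src.read(chunk_size):
                    dst.write(chunk)
                    last_byte = chunk[-1:]
                # Keep rows from running together if a file lacks a final newline.
                if last_byte and last_byte != b"\n":
                    dst.write(b"\n")
```

Memory use depends on `chunk_size` and the length of one header line, not on the size of the inputs, which is what makes the approach constant-memory.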

💎 Commercial Licensing (Antigravity PRO)

The Lite version can tell you that you're burning thousands of dollars. The Antigravity PRO Enterprise License actually fixes it.

If your AgAuditor report flags an "⚠️ AST/OOM RISK" or your Heap spikes past 85%, you need the DataFrameChunker mathematical engine (Exclusive to the Pro B2B Edition). The enterprise version intercepts Spark's low-level planner and vertically slices the execution plan using Logarithmic Binary Trees (Tree Reduce) to forcibly truncate the AST Lineage. This drops your memory footprint so drastically that you can process half-a-billion operations on tiny G.1X clusters at zero OutOfMemory risk.

💻 Request a Proof-of-Concept or Live Architecture Demo for B2B deployment by connecting via LinkedIn.

Project details

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Release history

- 0.1.40 (Apr 27, 2026)
- 0.1.39 (Apr 27, 2026)
- 0.1.38 (Apr 27, 2026)
- 0.1.37 (Apr 27, 2026)
- 0.1.36 (Apr 27, 2026)
- 0.1.35 (Apr 27, 2026)
- 0.1.34 (Apr 27, 2026)
- 0.1.33 (Apr 17, 2026)
- 0.1.32 (Apr 17, 2026)
- 0.1.31 (Apr 17, 2026)
- 0.1.30 (Apr 17, 2026)
- 0.1.28 (Apr 17, 2026), this version
- 0.1.27 (Apr 17, 2026)
- 0.1.24 (Apr 17, 2026)
- 0.1.23 (Apr 17, 2026)
- 0.1.22 (Apr 17, 2026)
- 0.1.21 (Apr 17, 2026)
- 0.1.20 (Apr 15, 2026)
- 0.1.19 (Apr 10, 2026)
- 0.1.18 (Apr 7, 2026)
- 0.1.16 (Apr 7, 2026)
- 0.1.15 (Apr 7, 2026)
- 0.1.12 (Apr 7, 2026)
- 0.1.10 (Apr 7, 2026)
- 0.1.9 (Apr 7, 2026)
- 0.1.8 (Apr 7, 2026)
- 0.1.7 (Apr 7, 2026)
- 0.1.6 (Apr 6, 2026)
- 0.1.4 (Apr 6, 2026)
- 0.1.3 (Apr 1, 2026)
- 0.1.2 (Apr 1, 2026)
- 0.1.1 (Mar 31, 2026)
- 0.1.0 (Mar 31, 2026)

Download files

Source Distribution

antigravity_lite-0.1.28.tar.gz (23.2 kB)

Uploaded Apr 17, 2026 Source

Built Distribution

antigravity_lite-0.1.28-py3-none-any.whl (21.0 kB)

Uploaded Apr 17, 2026 Python 3

File details

Details for the file antigravity_lite-0.1.28.tar.gz.

File metadata

Download URL: antigravity_lite-0.1.28.tar.gz
Upload date: Apr 17, 2026
Size: 23.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for antigravity_lite-0.1.28.tar.gz
Algorithm	Hash digest
SHA256	`a77d15e179342353c98c83627dc760603037b39181696e069657c09364301580`
MD5	`f53cf0f5162a1c447b364b818e7eeed4`
BLAKE2b-256	`a96c500dccc1c910c49e97033abf950e8b0c6b38cf2a2d43e1b5de233900e17e`

Provenance

The following attestation bundles were made for antigravity_lite-0.1.28.tar.gz:

Publisher: publish-lite.yml on andresvega925/AntigravityFW

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: antigravity_lite-0.1.28.tar.gz
- Subject digest: a77d15e179342353c98c83627dc760603037b39181696e069657c09364301580
- Sigstore transparency entry: 1329139598
- Sigstore integration time: Apr 17, 2026
Source repository:
- Permalink: andresvega925/AntigravityFW@2ca2431e6c262921948bd89a0dd628c9979db1ee
- Branch / Tag: refs/heads/main
- Owner: https://github.com/andresvega925
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-lite.yml@2ca2431e6c262921948bd89a0dd628c9979db1ee
- Trigger Event: push

File details

Details for the file antigravity_lite-0.1.28-py3-none-any.whl.

File metadata

Download URL: antigravity_lite-0.1.28-py3-none-any.whl
Upload date: Apr 17, 2026
Size: 21.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for antigravity_lite-0.1.28-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5c37e0a36326604f0492751158475061971431fed7973b8c3e68db6b65648130`
MD5	`4a9f5cc55917052a988664fd84f32b19`
BLAKE2b-256	`d42f0b2515b3bf0ec1ba448c6f958c8b8d401ed6e7f23ee5656afc8113f88170`

Provenance

The following attestation bundles were made for antigravity_lite-0.1.28-py3-none-any.whl:

Publisher: publish-lite.yml on andresvega925/AntigravityFW

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: antigravity_lite-0.1.28-py3-none-any.whl
- Subject digest: 5c37e0a36326604f0492751158475061971431fed7973b8c3e68db6b65648130
- Sigstore transparency entry: 1329139617
- Sigstore integration time: Apr 17, 2026
Source repository:
- Permalink: andresvega925/AntigravityFW@2ca2431e6c262921948bd89a0dd628c9979db1ee
- Branch / Tag: refs/heads/main
- Owner: https://github.com/andresvega925
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-lite.yml@2ca2431e6c262921948bd89a0dd628c9979db1ee
- Trigger Event: push
