A High-Performance, Production-Ready Python Implementation of C# LINQ with Deferred Execution.
Python PyLINQ (linqex)
📋 About the Project
🚀 Why PyLINQ?
Data manipulation in Python often leads to highly nested comprehensions, unreadable functional chains (map, filter, reduce), or unnecessary memory overhead when processing large data streams.
linqex brings the elegance and power of C# LINQ (Language Integrated Query) directly into the Python ecosystem. It allows you to query, transform, and manipulate iterable sequences using a fluent, declarative syntax, with full type annotations and low per-element overhead.
🚀 The Power of Deferred Execution (Lazy Evaluation)
Standard Python list comprehensions compute the entire result set in memory at once. If you only need the first 3 matching elements from a 10 GB log file, loading it all into memory is disastrous.
linqex is built on a lazy-evaluation architecture using native Python yield generators and the C-based itertools library. The data pipeline you define (e.g., .where().select().order_by()) is not executed until a terminal operation like .to_list(), .first(), or .count() is invoked. Streaming operators such as .where() and .select() hold only one element at a time, so their memory footprint stays at $O(1)$ regardless of input size. Operators that need the whole sequence, such as .order_by(), must still buffer their input once evaluation starts.
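The lazy behaviour described above can be illustrated with the standard library alone. The sketch below is not linqex's implementation; it only assumes plain generators and itertools.islice, and the log_lines source is a hypothetical stand-in for a large file.

```python
from itertools import islice

stats = {"pulled": 0}  # how many items were actually read from the source


def log_lines():
    """Hypothetical stand-in for a huge log file: yields one line at a time."""
    for i in range(1_000_000):
        stats["pulled"] += 1
        level = "ERROR" if i % 1000 == 0 else "INFO"
        yield f"{level} event {i}"


# Building the pipeline does no work yet: nothing has been read.
errors = (line for line in log_lines() if line.startswith("ERROR"))
assert stats["pulled"] == 0

# Pulling three results reads only as far as the third match.
first_three = list(islice(errors, 3))
print(first_three)      # ['ERROR event 0', 'ERROR event 1000', 'ERROR event 2000']
print(stats["pulled"])  # 2001, not 1000000
```

Only 2,001 of the million lines are ever produced, which is the property a deferred pipeline relies on.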
✨ Key Features
- Near-Complete C# LINQ Parity: Supports almost all LINQ operators from .NET 8, including modern additions like .chunk(), .max_by(), and .distinct_by().
- Deferred Execution: Chain as many operations as you want. The engine only computes exactly what it needs, exactly when it needs it.
- Pythonic Fast-Paths: If you pass an in-memory sequence (like a list or tuple), methods like .count(), .element_at(), and .reverse() bypass O(N) iteration and run in O(1) constant time by leveraging Python's __len__ and __getitem__.
- Low Memory Overhead: Uses strict __slots__ across all classes, eliminating per-instance dictionary allocations and keeping memory usage low even when spawning millions of groups or ordered states.
- Strict Exception Parity: Replicates C#'s exception behavior. Operations like .single() raise an exception when the sequence holds more than one element, and .to_dict() raises on duplicate keys instead of silently overwriting them, protecting data integrity.
- Type Safety: Annotated throughout with Python typing generics (Generic[T], TypeVar). It provides IDE autocomplete (VS Code, PyCharm) and supports static analyzers like mypy.
- Stable Multi-Level Sorting: Offers .order_by().then_by_descending() chaining without re-evaluating the source, leveraging Python's built-in Timsort algorithm.
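To see why a stable sort makes multi-level ordering possible, here is a minimal standard-library sketch. It is not linqex's internal code; it only relies on the documented stability of Python's sorted(), and the sample records are made up for illustration.

```python
from operator import itemgetter

people = [
    {"name": "Alice", "dept": "Dev", "age": 28},
    {"name": "Bob", "dept": "HR", "age": 35},
    {"name": "Charlie", "dept": "Dev", "age": 42},
    {"name": "Dave", "dept": "Dev", "age": 22},
]

# Sort by the secondary key first (age, descending)...
by_age_desc = sorted(people, key=itemgetter("age"), reverse=True)
# ...then by the primary key. Because sorted() is stable, records with the
# same "dept" keep the age-descending order established above.
by_dept_then_age_desc = sorted(by_age_desc, key=itemgetter("dept"))

names = [p["name"] for p in by_dept_then_age_desc]
print(names)  # ['Charlie', 'Alice', 'Dave', 'Bob']
```

This is roughly the ordering an order_by on dept followed by then_by_descending on age would be expected to produce.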
⚙️ Architectural Notes
Engineering facts developers need to know when using this library:
- The Generator Exhaustion Reality: Python generators can only be traversed once. If you pass a generator expression (x for x in ...) into Enumerable and execute a terminal operation like .count(), the generator is consumed. A subsequent .to_list() will return an empty list. To perform multiple terminal operations, ensure you pass an in-memory collection (like a list) to the engine or explicitly call .to_list() first.
- Terminal vs. Intermediate Operations: Methods like where, select, and skip are Intermediate (they return a new Enumerable and do no work). Methods like to_list, count, sum, and first are Terminal (they force the evaluation of the pipeline).
- Lookup vs. Dictionary: In LINQ, a Dictionary maps one key to one value, while a Lookup maps one key to a collection of values. linqex strictly follows this. Furthermore, requesting a non-existent key from a .to_lookup() result returns an empty Enumerable instead of throwing a KeyError, making grouped data access safe.
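The generator exhaustion point is plain Python behaviour and can be reproduced without the library. The sketch below uses only built-ins; the variable names are illustrative.

```python
# A generator expression can be consumed exactly once.
squares_gen = (x * x for x in range(5))
first_pass = sum(1 for _ in squares_gen)  # counts 5 items, exhausting the generator
second_pass = list(squares_gen)           # nothing left to iterate

print(first_pass)   # 5
print(second_pass)  # []

# Materializing into a list first allows any number of passes.
squares = [x * x for x in range(5)]
count = len(squares)
values = list(squares)
print(count, values)  # 5 [0, 1, 4, 9, 16]
```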
🚀 Getting Started
🛠️ Dependencies
- No external dependencies.
- Only the Python Standard Library (itertools, collections, functools, typing).
- Fully compatible with Python 3.9+.
📦 Installation
The library has zero external dependencies and works natively with Python's core toolkit.
- Clone the repository
  git clone https://github.com/TahsinCr/python-linqex.git
- Install via pip
  pip install linqex
💻 Usage Examples
1. Standard Data Transformation & Filtering
Cleanly filter, sort, and project data without nested comprehensions.
from linqex import Enumerable
data = [
    {"name": "Alice", "age": 28, "role": "Dev"},
    {"name": "Bob", "age": 35, "role": "HR"},
    {"name": "Charlie", "age": 42, "role": "Dev"},
    {"name": "Dave", "age": 22, "role": "Dev"},
]
# Pipeline is lazy. No iteration happens yet.
devs = (Enumerable(data)
        .where(lambda x: x["role"] == "Dev")
        .where(lambda x: x["age"] > 25)
        .order_by_descending(lambda x: x["age"])
        .select(lambda x: x["name"]))
# Terminal operation executes the pipeline
print(devs.to_list())
# Output: ['Charlie', 'Alice']
2. Aggregations and Fast-Paths
Finding the maximum element based on a specific property, similar to .MaxBy() in C#.
from linqex import Enumerable
inventory = [
    {"id": 1, "product": "Laptop", "price": 1200},
    {"id": 2, "product": "Mouse", "price": 45},
    {"id": 3, "product": "Monitor", "price": 300},
]
stream = Enumerable(inventory)
# Finds the actual dictionary object of the most expensive item
most_expensive = stream.max_by(lambda x: x["price"])
print(most_expensive["product"]) # Output: Laptop
# O(1) Fast-path count execution since the source is a List
total_items = stream.count()
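A fast-path of this kind can be sketched in a few lines. This is an assumption about the general technique, not a copy of linqex's code: the count helper below is hypothetical and simply checks for collections.abc.Sized before falling back to iteration.

```python
from collections.abc import Sized
from typing import Iterable


def count(source: Iterable) -> int:
    """Hypothetical helper: count items, using len() when the source supports it."""
    if isinstance(source, Sized):
        # Lists, tuples, dicts and sets know their length: O(1).
        return len(source)
    # Generators and other one-shot iterables must be walked: O(N).
    return sum(1 for _ in source)


list_count = count([10, 20, 30])
generator_count = count(x for x in range(10))
print(list_count, generator_count)  # 3 10
```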
3. Massive Data Chunking (Memory Safe)
Process millions of records in chunks for database batch inserts without blowing up the RAM.
from linqex import Enumerable
def massive_database_stream():
    # Yields records one at a time instead of building a list in memory
    for i in range(1, 1000000):
        yield {"id": i, "status": "pending"}

stream = Enumerable(massive_database_stream())
# Groups data into lists of 500 items lazily
batches = stream.chunk(500)
for batch in batches.take(3):  # Only process the first 3 batches
    print(f"Executing SQL bulk insert for {len(batch)} items...")
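For readers curious how lazy chunking can work, the following is a minimal standard-library sketch. It is not linqex's implementation; the chunk function here is a hypothetical stand-alone helper built on itertools.islice, written to run on Python 3.9+.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunk(source: Iterable[T], size: int) -> Iterator[List[T]]:
    """Hypothetical helper: lazily yield lists of up to `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(source)
    while True:
        # Pull at most `size` items; only one batch is held in memory at a time.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


batches = list(chunk(range(1, 12), 5))
print(batches)  # [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11]]
```

The final batch may be shorter than the requested size, which bulk-insert code should be prepared for.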
4. Grouping & Analytics (group_by)
Easily group data by a specific key and perform aggregate calculations on the sub-groups.
from linqex import Enumerable
orders = [
    {"customer": "C1", "amount": 100},
    {"customer": "C2", "amount": 50},
    {"customer": "C1", "amount": 200},
    {"customer": "C3", "amount": 300},
]
report = (Enumerable(orders)
          .group_by(lambda o: o["customer"])
          .select(lambda group: {
              "customer": group.key,
              "total_spent": group.sum(lambda x: x["amount"]),
              "order_count": group.count(),
          })
          .to_list())
# [{'customer': 'C1', 'total_spent': 300, 'order_count': 2}, ...]
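The one-key-to-many-values shape behind grouping, and the Lookup behaviour mentioned in the Architectural Notes, can be approximated with a dictionary of lists. This is a standard-library sketch under that assumption, not the library's own data structure.

```python
from collections import defaultdict

orders = [
    {"customer": "C1", "amount": 100},
    {"customer": "C2", "amount": 50},
    {"customer": "C1", "amount": 200},
    {"customer": "C3", "amount": 300},
]

# Build the key -> list-of-values mapping in a single pass.
groups = defaultdict(list)
for order in orders:
    groups[order["customer"]].append(order["amount"])
lookup = dict(groups)  # freeze into a plain dict so reads cannot add keys

# Reading with a default mimics "missing key gives an empty collection".
c1_amounts = lookup.get("C1", [])
missing = lookup.get("C9", [])
totals = {customer: sum(amounts) for customer, amounts in lookup.items()}

print(c1_amounts)  # [100, 200]
print(missing)     # []
print(totals)      # {'C1': 300, 'C2': 50, 'C3': 300}
```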
5. Relational Inner Joins in Memory
Merge two disparate data sources safely and efficiently.
from linqex import Enumerable
employees = [
    {"id": 1, "name": "Alice", "dept_id": 10},
    {"id": 2, "name": "Bob", "dept_id": 20},
]
departments = [
    {"id": 10, "name": "Engineering"},
    {"id": 20, "name": "Sales"},
]
joined_data = Enumerable(employees).join(
    inner=departments,
    outer_key=lambda e: e["dept_id"],
    inner_key=lambda d: d["id"],
    selector=lambda e, d: f"{e['name']} works in {d['name']}"
).to_list()
# ['Alice works in Engineering', 'Bob works in Sales']
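An in-memory inner join is commonly implemented as a hash join: index one side by key, then stream the other side through the index. The sketch below shows that technique with the standard library only. It is an assumption about the general approach rather than linqex's actual code, and inner_join is a hypothetical helper; the extra employee with dept_id 30 is added to show that unmatched rows are dropped.

```python
from collections import defaultdict

employees = [
    {"id": 1, "name": "Alice", "dept_id": 10},
    {"id": 2, "name": "Bob", "dept_id": 20},
    {"id": 3, "name": "Eve", "dept_id": 30},  # no matching department
]
departments = [
    {"id": 10, "name": "Engineering"},
    {"id": 20, "name": "Sales"},
]


def inner_join(outer, inner, outer_key, inner_key, selector):
    """Hypothetical helper: lazily yield selector(o, i) for every key match."""
    # Index the inner side once: O(len(inner)).
    index = defaultdict(list)
    for item in inner:
        index[inner_key(item)].append(item)
    # Stream the outer side, looking up matches in O(1) each.
    for o in outer:
        for i in index.get(outer_key(o), []):
            yield selector(o, i)


joined = list(inner_join(
    employees,
    departments,
    outer_key=lambda e: e["dept_id"],
    inner_key=lambda d: d["id"],
    selector=lambda e, d: f"{e['name']} works in {d['name']}",
))
print(joined)  # ['Alice works in Engineering', 'Bob works in Sales']
```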
🙏 Acknowledgments and License
This project is fully open-source under the MIT License.
- PyPI: linqex on PyPI
- Source Code: Tahsincr/python-linqex
If you find any bugs or want to make an architectural contribution, feel free to open an Issue or submit a Pull Request on GitHub!
📫 Contact
X: @TahsinCrs
LinkedIn: @TahsinCr
Email: TahsinCrs@gmail.com