A Python library for various big data structures including graphs, trees, and more.

BigDataLib

BigDataLib is a Python library that provides ready-made implementations of data structures commonly used with large datasets, exposed through simple functions and classes. It covers graph representations, balanced search trees, range-query trees, and more.

Features

  • Adjacency Matrix: Efficient representation and manipulation of graph edges.
  • Adjacency List: Lightweight representation of graph edges with fast access.
  • Trie (Prefix Tree): Fast prefix-based search and insertion for strings.
  • Segment Tree: Efficient range queries and updates.
  • Fenwick Tree (Binary Indexed Tree): Fast updates and prefix sum queries.
  • Red-Black Tree: Self-balancing binary search tree with efficient insertions and deletions.
  • AVL Tree: Self-balancing binary search tree maintaining height balance.
  • Skip List: Probabilistic data structure for fast search, insertion, and deletion.
  • B-Tree: Balanced tree used for efficient data retrieval in databases.
  • B+ Tree: Similar to B-Tree but with all values stored in leaf nodes.

Installation

You can install the library using pip:

pip install bigdata_lib

Usage

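1. Adjacency Matrix

The adjacency matrix is a 2D array in which entry (i, j) records whether an edge connects node i to node j. It needs memory proportional to the square of the number of nodes, but it is straightforward to build and query.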

from bigdata_lib.adjacency_matrix import create_adjacency_matrix, add_edge_matrix, display_matrix

matrix = create_adjacency_matrix(5)
add_edge_matrix(matrix, 1, 2)
display_matrix(matrix)
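
To make the representation concrete, here is a minimal standalone sketch of an adjacency matrix in plain Python. It is not the library's implementation: the names make_matrix, add_edge, and neighbors are hypothetical, and treating edges as undirected by default is an assumption.

```python
def make_matrix(n):
    """Return an n x n matrix of zeros, meaning no edges yet."""
    return [[0] * n for _ in range(n)]


def add_edge(matrix, u, v, directed=False):
    """Mark an edge u -> v; mirror it unless the graph is directed (assumption)."""
    matrix[u][v] = 1
    if not directed:
        matrix[v][u] = 1


def neighbors(matrix, u):
    """List every node v that has an edge from u by scanning row u."""
    return [v for v, flag in enumerate(matrix[u]) if flag]


m = make_matrix(5)
add_edge(m, 1, 2)
add_edge(m, 1, 4)
print(neighbors(m, 1))  # [2, 4]
```

Checking a single edge is one index lookup, while listing neighbors scans a whole row regardless of how many edges exist.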

2. Adjacency List

The adjacency list is a dictionary where keys are nodes and values are lists of adjacent nodes.

from bigdata_lib.adjacency_list import add_edge_list, display_list

graph = {}
add_edge_list(graph, 1, 2)
display_list(graph)
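
As an illustration of why the dictionary layout is convenient, the sketch below runs a breadth-first traversal over an adjacency list. It is independent of the library: add_edge and bfs_order are hypothetical names, and storing each edge in both directions (an undirected graph) is an assumption.

```python
from collections import defaultdict, deque


def add_edge(graph, u, v):
    """Record an undirected edge by listing each endpoint under the other."""
    graph[u].append(v)
    graph[v].append(u)


def bfs_order(graph, start):
    """Return nodes in the order a breadth-first search visits them."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


g = defaultdict(list)
add_edge(g, 1, 2)
add_edge(g, 1, 3)
add_edge(g, 2, 4)
print(bfs_order(g, 1))  # [1, 2, 3, 4]
```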

3. Trie (Prefix Tree)

A trie is a tree that stores strings character by character, so words sharing a prefix share a path from the root. This makes insertion and prefix-based lookup fast.

from bigdata_lib.trie_structure import Trie

trie = Trie()
trie.insert("example")
print(trie.search("example"))  # Output: True
print(trie.search("test"))     # Output: False
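
For readers curious how such a structure can work internally, here is a minimal standalone trie sketch. It is not the library's code: SimpleTrie, its Node helper, and the starts_with method are hypothetical names introduced for illustration.

```python
class Node:
    """One trie node: child links keyed by character plus an end-of-word flag."""

    def __init__(self):
        self.children = {}
        self.is_word = False


class SimpleTrie:
    def __init__(self):
        self.root = Node()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = Node()
            node = node.children[ch]
        node.is_word = True

    def _walk(self, text):
        """Follow text from the root; return the last node, or None if the path breaks."""
        node = self.root
        for ch in text:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def search(self, word):
        """True only if word was inserted as a complete word."""
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        """True if any inserted word begins with prefix."""
        return self._walk(prefix) is not None


t = SimpleTrie()
t.insert("example")
print(t.search("example"))    # True
print(t.search("exam"))       # False
print(t.starts_with("exam"))  # True
```

The distinction between search and starts_with comes down to the end-of-word flag: "exam" exists as a path but was never inserted as a word.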

4. Segment Tree

A segment tree stores aggregates (such as sums) over intervals of an array, allowing efficient range queries and point updates.

from bigdata_lib.segment_tree import SegmentTree

arr = [1, 3, 5, 7, 9, 11]
segment_tree = SegmentTree(arr)
print(segment_tree.query(1, 3))  # Output: 15 (3 + 5 + 7)
segment_tree.update(1, 10)
print(segment_tree.query(1, 3))  # Output: 22 (10 + 5 + 7)
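
The following standalone sketch shows one common way to get these results: an iterative, array-backed sum segment tree. It is an illustration, not the library's implementation. SumSegmentTree is a hypothetical name, and the inclusive query bounds and overwrite-style update are assumptions chosen to reproduce the outputs 15 and 22 shown above.

```python
class SumSegmentTree:
    def __init__(self, values):
        self.n = len(values)
        # Leaves live at positions n..2n-1; internal node i covers 2i and 2i+1.
        self.tree = [0] * self.n + list(values)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def update(self, index, value):
        """Overwrite values[index] and refresh every ancestor sum."""
        i = index + self.n
        self.tree[i] = value
        i //= 2
        while i >= 1:
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]
            i //= 2

    def query(self, left, right):
        """Sum of values[left..right], both ends inclusive (assumption)."""
        total = 0
        lo = left + self.n
        hi = right + self.n + 1  # switch to a half-open range internally
        while lo < hi:
            if lo & 1:
                total += self.tree[lo]
                lo += 1
            if hi & 1:
                hi -= 1
                total += self.tree[hi]
            lo //= 2
            hi //= 2
        return total


st = SumSegmentTree([1, 3, 5, 7, 9, 11])
print(st.query(1, 3))  # 15
st.update(1, 10)
print(st.query(1, 3))  # 22
```

Both operations touch one node per tree level, so they run in logarithmic time in the array length.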

5. Fenwick Tree (Binary Indexed Tree)

A Fenwick Tree supports fast point updates and prefix sum queries while using a single flat array.

from bigdata_lib.fenwick_tree import FenwickTree

arr = [1, 3, 5, 7, 9, 11]
fenwick_tree = FenwickTree(arr)
print(fenwick_tree.query(3))  # Output: 16 (1 + 3 + 5 + 7)
fenwick_tree.update(1, 10)
print(fenwick_tree.query(3))  # Output: 26 (16 + 10)
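
A minimal standalone sketch of the underlying technique follows. It is not the library's code: PrefixSumTree is a hypothetical name, and the 0-based indices, inclusive prefix query, and additive update are assumptions chosen because they reproduce the outputs 16 and 26 shown above.

```python
class PrefixSumTree:
    def __init__(self, values):
        self.n = len(values)
        self.tree = [0] * (self.n + 1)  # 1-based internally; slot 0 is unused
        for i, v in enumerate(values):
            self.update(i, v)

    def update(self, index, delta):
        """Add delta to the element at index (0-based)."""
        i = index + 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # jump to the next node whose range covers this index

    def query(self, index):
        """Sum of elements 0..index, inclusive."""
        i = index + 1
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit to move to the preceding range
        return total


ft = PrefixSumTree([1, 3, 5, 7, 9, 11])
print(ft.query(3))  # 16
ft.update(1, 10)
print(ft.query(3))  # 26
```

Note that in this sketch update adds a delta rather than overwriting the element, which differs from the overwrite behaviour implied by the segment tree example.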

6. Red-Black Tree

A Red-Black Tree is a self-balancing binary search tree that colors each node red or black and enforces coloring rules that keep the tree height logarithmic in the number of nodes.

from bigdata_lib.red_black_tree import RedBlackTree

rb_tree = RedBlackTree()
rb_tree.insert(10)
rb_tree.insert(20)
rb_tree.insert(15)
# Implement traversal or other operations as needed
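
The coloring rules can be stated as a small checker. The sketch below is standalone and does not use the library: RBNode, black_height, and is_red_black are hypothetical names, and the trees are built by hand. It verifies that the root is black, that no red node has a red child, and that every root-to-leaf path has the same number of black nodes. It does not check key ordering.

```python
class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color  # "R" for red, "B" for black
        self.left = left
        self.right = right


def black_height(node):
    """Return the black height of the subtree, or -1 if a rule is violated."""
    if node is None:
        return 1  # empty leaves count as black
    left = black_height(node.left)
    right = black_height(node.right)
    if left == -1 or right == -1 or left != right:
        return -1
    if node.color == "R":
        for child in (node.left, node.right):
            if child is not None and child.color == "R":
                return -1  # a red node may not have a red child
        return left
    return left + 1


def is_red_black(root):
    return root is None or (root.color == "B" and black_height(root) != -1)


# A valid shape for keys 10, 15, 20: black root with two red children.
good = RBNode(15, "B", RBNode(10, "R"), RBNode(20, "R"))
# An invalid chain: a red node with a red child.
bad = RBNode(10, "B", None, RBNode(20, "R", None, RBNode(30, "R")))
print(is_red_black(good))  # True
print(is_red_black(bad))   # False
```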

7. AVL Tree

An AVL Tree is a self-balancing binary search tree in which the heights of the left and right subtrees of every node differ by at most one.

from bigdata_lib.avl_tree import AVLTree

avl_tree = AVLTree()
root = None
root = avl_tree.insert(root, 10)
root = avl_tree.insert(root, 20)
root = avl_tree.insert(root, 15)
# Implement traversal or other operations as needed
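
To show how rebalancing keeps the height difference bounded, here is a standalone sketch of AVL insertion with rotations. It is not the library's implementation: all names are hypothetical, and ignoring duplicate keys is an assumption. Like the example above, insert returns the new subtree root.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1


def height(node):
    return node.height if node else 0


def refresh(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    refresh(y)
    refresh(x)
    return x


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    refresh(x)
    refresh(y)
    return y


def insert(node, key):
    """Insert key into the subtree rooted at node and return the new root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node  # duplicates are ignored (assumption)
    refresh(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:
        if key > node.left.key:  # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance < -1:
        if key < node.right.key:  # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


root = None
for key in (10, 20, 15):
    root = insert(root, key)
print(root.key, root.left.key, root.right.key)  # 15 10 20
```

Inserting 10, 20, 15 triggers the right-left case: a plain binary search tree would form a zigzag chain, while the two rotations lift 15 to the root.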

8. Skip List

A Skip List is a probabilistic data structure that allows fast search, insertion, and deletion operations.

from bigdata_lib.skip_list import SkipList

skip_list = SkipList(max_level=3, p=0.5)
skip_list.insert(3)
skip_list.insert(6)
skip_list.insert(7)
print(skip_list.search(6))  # Output: True
print(skip_list.search(4))  # Output: False
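
The sketch below illustrates how the max_level and p parameters can shape a skip list. It is standalone and not the library's code: SimpleSkipList and SkipNode are hypothetical names, and the interpretation of p as the probability of promoting a node one more level is an assumption.

```python
import random


class SkipNode:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * (level + 1)  # one next-pointer per level


class SimpleSkipList:
    def __init__(self, max_level=3, p=0.5):
        self.max_level = max_level
        self.p = p
        self.level = 0  # highest level currently in use
        self.head = SkipNode(None, max_level)

    def _random_level(self):
        """Promote with probability p per level, capped at max_level."""
        level = 0
        while random.random() < self.p and level < self.max_level:
            level += 1
        return level

    def insert(self, key):
        # update[i] is the last node before the insertion point on level i.
        update = [self.head] * (self.max_level + 1)
        node = self.head
        for i in range(self.level, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        level = self._random_level()
        self.level = max(self.level, level)
        new_node = SkipNode(key, level)
        for i in range(level + 1):
            new_node.forward[i] = update[i].forward[i]
            update[i].forward[i] = new_node

    def search(self, key):
        node = self.head
        for i in range(self.level, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key


sl = SimpleSkipList(max_level=3, p=0.5)
for key in (3, 6, 7):
    sl.insert(key)
print(sl.search(6))  # True
print(sl.search(4))  # False
```

The random levels affect only how quickly a search skips ahead, not its result, so the two printed values are the same on every run.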

9. B-Tree

A B-Tree is a balanced tree whose nodes hold several sorted keys each. It supports efficient retrieval and insertion and is common in database systems.

from bigdata_lib.b_tree import BTree

b_tree = BTree(t=2)
b_tree.insert(10)
b_tree.insert(20)
b_tree.insert(5)
# Implement traversal or other operations as needed
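
Since each node holds several sorted keys, lookup alternates between a search inside a node and a descent into one child. The standalone sketch below shows that search step on a hand-built tree. It is not the library's implementation: BTreeNode and btree_search are hypothetical names, and insertion with node splitting is deliberately left out.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                 # sorted keys stored in this node
        self.children = children or []   # empty for a leaf, else len(keys) + 1 subtrees


def btree_search(node, key):
    """Return True if key is stored anywhere in the tree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, key)  # first position whose key is >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:
            return False  # reached a leaf without finding the key
        node = node.children[i]
    return False


# Hand-built tree: the root separates three leaves.
tree = BTreeNode(
    [10, 20],
    [BTreeNode([5]), BTreeNode([15]), BTreeNode([25, 30])],
)
print(btree_search(tree, 15))  # True
print(btree_search(tree, 17))  # False
```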

10. B+ Tree

A B+ Tree is similar to a B-Tree but with all values stored in leaf nodes, allowing efficient range queries.

from bigdata_lib.bplus_tree import BPlusTree

bplus_tree = BPlusTree(t=2)
bplus_tree.insert(10)
bplus_tree.insert(20)
bplus_tree.insert(5)
# Implement traversal or other operations as needed
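
Range queries benefit from leaf-only storage when the leaves are chained together, which is a common B+ tree layout. The standalone sketch below assumes such a chain and walks it to collect a key range. It is not the library's code: Leaf and range_scan are hypothetical names, the leaves are linked by hand, and the descent from the root to the first relevant leaf is omitted.

```python
from bisect import bisect_left


class Leaf:
    def __init__(self, keys):
        self.keys = keys  # sorted keys held by this leaf
        self.next = None  # link to the leaf holding the next larger keys


def range_scan(leaf, low, high):
    """Collect keys in [low, high] by walking the leaf chain from leaf."""
    result = []
    while leaf is not None:
        start = bisect_left(leaf.keys, low)  # skip keys below the range
        for key in leaf.keys[start:]:
            if key > high:
                return result  # keys are sorted, so nothing later can match
            result.append(key)
        leaf = leaf.next
    return result


a, b, c = Leaf([5, 10]), Leaf([15, 20]), Leaf([25, 30])
a.next, b.next = b, c
print(range_scan(a, 10, 25))  # [10, 15, 20, 25]
```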

Contributing

Contributions are welcome! Please submit a pull request or open an issue to suggest improvements.

License

This project is licensed under the MIT License - see the LICENSE file for details.
