A versatile Python package for data extraction from JSON-like structures with user-defined format keys, enhanced with synonym retrieval capabilities.

Project description

Complex Parser

Complex Parser is a Python package that extracts data from JSON-like structures and enriches that extraction with synonym retrieval. Whether you are working with deeply nested JSON or simple dictionaries, you select the elements you want through user-defined format keys, and synonym retrieval lets records that use an alternative key name (for example "location" instead of "address") match as well.

Features

Data Extraction

  • Structured Data Extraction: Extract specific data elements from nested JSON-like structures based on user-specified format keys.
  • Customizable Format Keys: Define format keys to precisely target the data elements you need, making it adaptable to a wide range of data structures.
  • Efficient Data Parsing: Parses the data and extracts the relevant elements with minimal computational overhead.
  • Thread-Based Chunking: Splits larger data sets into chunks and uses threads to work through them quickly.
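
The thread-based chunking mentioned above can be pictured roughly as follows. This is a minimal sketch using only the standard library, not the package's actual implementation: the helper names `chunked`, `filter_chunk`, and `filter_in_threads`, and the `chunk_size` and `max_workers` parameters, are hypothetical and chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(records, size):
    """Yield consecutive slices of `records` with at most `size` items each."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def filter_chunk(chunk, wanted_keys):
    """Keep the records in one chunk that contain every wanted key."""
    return [rec for rec in chunk if all(key in rec for key in wanted_keys)]


def filter_in_threads(records, wanted_keys, chunk_size=2, max_workers=4):
    """Filter `records` chunk by chunk on a thread pool (hypothetical helper)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each chunk is filtered by a worker thread.
        parts = pool.map(
            lambda chunk: filter_chunk(chunk, wanted_keys),
            chunked(records, chunk_size),
        )
        # Flatten the per-chunk results back into one list.
        return [rec for part in parts for rec in part]


records = [
    {"name": "A", "address": 1},
    {"name": "B"},
    {"name": "C", "address": 3},
]
filtered = filter_in_threads(records, ["name", "address"])
print(filtered)
```

`Executor.map` yields results in submission order, so the filtered records come back in their original order even though the chunks are processed concurrently.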

Synonym Retrieval

  • Semantic Enrichment: Enhance the semantic richness of your data by retrieving synonyms for key terms using both WordNet and custom synonym lists.
  • Flexible Synonym Loading: Load additional synonyms from custom lists to expand the synonym pool for specific terms, allowing for fine-tuned control over synonym retrieval.
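
As a rough illustration of how custom lists could extend a synonym pool, the sketch below merges two sources of synonyms per format key. It is an assumption about the general idea, not the package's code: `build_synonym_pool` and its `base_synonyms` argument are hypothetical, and `base_synonyms` is a plain dictionary standing in for whatever a WordNet lookup would return.

```python
def build_synonym_pool(format_keys, load_lists=None, base_synonyms=None):
    """Map each format key to the set of key names accepted for it.

    Hypothetical helper: `base_synonyms` stands in for WordNet results,
    `load_lists` holds the user's custom synonym lists.
    """
    # Every format key is always an accepted name for itself.
    pool = {key: {key} for key in format_keys}
    for source in (base_synonyms or {}, load_lists or {}):
        for key, words in source.items():
            # Ignore synonym lists for keys that were not requested.
            if key in pool:
                pool[key].update(words)
    return pool


pool = build_synonym_pool(
    ["name", "address"],
    load_lists={"address": ["location"], "name": ["fullname"]},
)
print(pool)
```

Keeping the pool as a dictionary of sets makes the later membership checks cheap and removes duplicates when the same word appears in both sources.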

Ease of Use

  • Simple Integration: Integrate seamlessly into your Python projects with an intuitive interface and straightforward usage.
  • Comprehensive Documentation: Detailed documentation and examples provided for easy reference and quick integration into your projects.

Installation

You can install Complex Parser via pip:

pip install complex-parser

Usage:

Here's a simple example demonstrating how to use the package:

from complex_parser import extract_data

# Example data
data = {
    "people":[
        {
            "name": "John",
            "age": 30,
            "address": {
                "road": "123 Main St",
                "city": "Anytown"
            }
        }, 
        {
            "name": "Joshua",
            "age": 3100,
            "address": {
                "road": "657 Loud St",
                "city": "Basictown",
                "landmark": "Town Square"
            }
        }, 
        {
            "name": "John",
            "age": 30,
            "location": {
                "road": "8474 Main St",
                "city": "None"
            }
        }, 
        {
            "fullname": "Job",
            "age": 27,
            "destination": {
                "road": "8474 John's St",
                "city": "London"
            }
        }, 
        {
            "unknown": "Job",
            "age": 27,
            "destination": {
                "road": "8474 John's St",
                "city": "London"
            }
        }
    ]
}
# Keys to look for in each record
format_keys = ["name", "address"]
# Custom synonym lists: extra key names to treat as synonyms of each format key
load_lists = {
    "address": ["location"],
    "name": ["fullname"],
}
# Extract data with specified format keys
extracted_data = extract_data(data=data, format_keys=format_keys, load_lists=load_lists)
print(extracted_data)

Results:

[{'name': 'John', 'age': 30, 'address': {'road': '123 Main St', 'city': 'Anytown'}}, {'name': 'Joshua', 'age': 3100, 'address': {'road': '657 Loud St', 'city': 'Basictown', 'landmark': 'Town Square'}}, {'name': 'John', 'age': 30, 'location': {'road': '8474 Main St', 'city': 'None'}}, {'fullname': 'Job', 'age': 27, 'destination': {'road': "8474 John's St", 'city': 'London'}}]
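
The printed result keeps the first four records and drops the fifth, whose "unknown" key matches neither "name" nor any of its synonyms. One reading of this output is that a record is kept when every format key, or one of its synonyms, appears among the record's keys. The sketch below reproduces that rule on a shortened version of the data. It is a guess at the behaviour, not the package's implementation: `matches` and `extract_sketch` are hypothetical names, and treating "destination" as a synonym of "address" is an assumption (presumably it comes from WordNet in the real package).

```python
def matches(record, synonym_pool):
    """True if every format key is present under one of its accepted names."""
    return all(
        any(alias in record for alias in aliases)
        for aliases in synonym_pool.values()
    )


def extract_sketch(data, synonym_pool):
    """Walk nested dicts and lists, collecting every matching dict."""
    found = []

    def walk(node):
        if isinstance(node, dict):
            if matches(node, synonym_pool):
                found.append(node)
                return  # do not descend into a record that already matched
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(data)
    return found


# Assumed synonym pool; "destination" is included by hand for illustration.
synonym_pool = {
    "name": {"name", "fullname"},
    "address": {"address", "location", "destination"},
}
data = {
    "people": [
        {"name": "John", "address": {"city": "Anytown"}},
        {"fullname": "Job", "destination": {"city": "London"}},
        {"unknown": "Job", "destination": {"city": "London"}},
    ]
}
kept = extract_sketch(data, synonym_pool)
print(kept)
```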

License:

This project is licensed under the Mozilla Public License Version 2.0 - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit bug reports, feature requests, or pull requests on the GitHub repository.

Download files

Source Distribution

complex_parser-0.0.2.tar.gz (4.6 kB)

Uploaded Source

Built Distribution

complex_parser-0.0.2-py3-none-any.whl (4.8 kB)

Uploaded Python 3

File details

Details for the file complex_parser-0.0.2.tar.gz.

File metadata

  • Download URL: complex_parser-0.0.2.tar.gz
  • Upload date:
  • Size: 4.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.11.8

File hashes

Hashes for complex_parser-0.0.2.tar.gz
Algorithm    Hash digest
SHA256       10b4e8b485d09575864e795ddff157ec32999d6ac8de83550728d4f968c0f9cc
MD5          3b55209dcd571c7d75e697ede5878a89
BLAKE2b-256  7aec8ae1818bad930516b5e0ea85e692483131994006e396e5089fadabfc8ba8

File details

Details for the file complex_parser-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: complex_parser-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 4.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.11.8

File hashes

Hashes for complex_parser-0.0.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       0099cf21db3f920093dede21b04d0ddb318a86d3e54ffa4ed4424cb9499bd17e
MD5          950a144939e2009ff8e3100f141d89b9
BLAKE2b-256  25d4aabadd38b4987116e5beeb5d183bfbc6264f9db0f5d74f03060f42e79c6f
