Easily convert JSON-like strings to dictionaries

Project description

String2Dict

string2dict is a Python library that transforms complex strings into Python dictionaries. It is particularly useful when working with text output from large language models (LLMs) that needs to be parsed into valid JSON objects or Python dictionaries. Its String2Dict class cleans, sanitizes, and parses such text.

Since LLMs often return data with extra characters, formatting inconsistencies, or embedded code markers, it can be challenging to directly parse the output into JSON or dictionaries. String2Dict aims to simplify this process by handling common formatting issues and providing a robust parsing mechanism.

Key Features

  • Strips Whitespace: Removes unnecessary leading and trailing whitespace from strings.
  • Removes Embedded Markers: Strips code-fence markers, such as ``` and the json language tag, so the string is ready for parsing.
  • Ensures Valid JSON Braces: Adjusts strings to ensure they start and end with curly braces ({}).
  • Supports JSON and Python Parsing: Tries to parse strings using json.loads first, and falls back to ast.literal_eval if needed.
  • Handles Multiple Dictionaries: Extracts and parses multiple dictionary-like strings from a single input.
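
The cleaning steps listed above can be approximated with the standard library alone. The following is a minimal sketch, not String2Dict's actual implementation: the function name clean_llm_output is hypothetical, and trimming to the outermost braces is only one assumed reading of "ensures valid JSON braces".

```python
import re


def clean_llm_output(text: str) -> str:
    """Illustrative cleaning pass (hypothetical helper, not part of string2dict)."""
    # 1. Strip leading and trailing whitespace.
    text = text.strip()
    # 2. Drop Markdown code fences, with or without a "json" language tag.
    text = re.sub(r"```(?:json)?", "", text).strip()
    # 3. Keep only the span from the first "{" to the last "}" (assumed behavior).
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        text = text[start:end + 1]
    return text


cleaned = clean_llm_output('```json\n{"name": "ChatGPT", "version": "4.0"}\n```')
print(cleaned)  # {"name": "ChatGPT", "version": "4.0"}
```

The result is still a string; turning it into a dictionary is a separate parsing step.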

Installation & Usage

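Install the package from PyPI:

pip install string2dict

Example 1: Parsing a Single Dictionary from LLM Output

from string2dict import String2Dict

s2d = String2Dict()

# LLM output wrapped in a Markdown code fence
llm_output = "```json\n{\"name\": \"ChatGPT\", \"version\": \"4.0\"}\n```"
parsed_dict = s2d.run(llm_output)
print(parsed_dict)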

Output:

{'name': 'ChatGPT', 'version': '4.0'}

Example 2: Parsing Multiple Dictionaries from a String

# Input string containing multiple dictionaries
llm_output = """
```json
{"name": "ChatGPT", "version": "4.0"}
{"name": "GPT-3", "version": "3.0"}
```"""

# Extract and convert each dictionary into a list of dictionaries
parsed_dicts = s2d.string_to_dict_list(llm_output)
print(parsed_dicts)

Output:

[
    {'name': 'ChatGPT', 'version': '4.0'},
    {'name': 'GPT-3', 'version': '3.0'}
]

Methods

1. strip_surrounding_whitespace(string: str) -> str

  • Strips leading and trailing whitespace from the input string.
  • Args: string (str) - The input string.
  • Returns: Stripped string.

2. remove_embedded_markers(string: str) -> str

  • Removes embedded markers like json and other code block markers.
  • Args: string (str) - The input string.
  • Returns: Cleaned string.

3. ensure_string_starts_and_ends_with_braces(string: str) -> str

  • Ensures the string starts and ends with curly braces ({}).
  • Args: string (str) - The input string.
  • Returns: Adjusted string.

4. parse_as_json(string: str) -> dict

  • Attempts to parse the string as JSON using json.loads.
  • Args: string (str) - The input JSON string.
  • Returns: Parsed dictionary.

5. parse_with_literal_eval(string: str) -> dict

  • Attempts to parse the string using Python's ast.literal_eval.
  • Args: string (str) - The input string.
  • Returns: Parsed dictionary.

6. run(string: str) -> dict

  • Processes a string through all cleaning and parsing steps, returning a parsed dictionary.
  • Args: string (str) - The input string.
  • Returns: Parsed dictionary or None if parsing fails.

7. string_to_dict_list(string: str) -> list

  • Extracts multiple dictionaries from a string and converts each to a Python dictionary.
  • Args: string (str) - The input string containing one or more dictionaries.
  • Returns: A list of parsed dictionaries, or None if parsing fails.
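
To illustrate the idea behind extracting several dictionaries from one string, here is a rough sketch built on json.JSONDecoder.raw_decode from the standard library. It is an assumption about one workable approach, not the library's own code; the name extract_json_objects is hypothetical, and this version handles only strict JSON objects, not Python-literal dictionaries.

```python
import json


def extract_json_objects(text: str) -> list:
    """Collect every top-level JSON object found in text (hypothetical helper)."""
    decoder = json.JSONDecoder()
    found = []
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one value starting at pos and reports where it ended.
            obj, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # No valid object starts at this brace; move on to the next one.
            pos = text.find("{", pos + 1)
            continue
        if isinstance(obj, dict):
            found.append(obj)
        pos = text.find("{", end)
    return found


sample = '```json\n{"name": "ChatGPT", "version": "4.0"}\n{"name": "GPT-3", "version": "3.0"}\n```'
print(extract_json_objects(sample))
```

Because raw_decode consumes a whole object at a time, nested dictionaries stay inside their parent rather than being reported twice.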

Logging

The String2Dict class supports logging for easier debugging. Set the debug parameter to True when initializing the class to enable detailed logging.
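
As a rough illustration of how a debug flag can switch detailed logging on and off, the sketch below uses only Python's logging module. The helper build_logger and the logger names are hypothetical; how String2Dict wires its debug parameter internally is not shown in this description.

```python
import logging


def build_logger(name: str, debug: bool = False) -> logging.Logger:
    """Return a logger that emits DEBUG records only when debug is True."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG if debug else logging.WARNING)
    return logger


# Attach a basic handler to the root logger so emitted records are printed.
logging.basicConfig(format="%(name)s %(levelname)s: %(message)s")

quiet = build_logger("s2d_sketch.quiet")                 # hypothetical logger name
verbose = build_logger("s2d_sketch.verbose", debug=True)  # hypothetical logger name

quiet.debug("removed code-fence markers")    # suppressed
verbose.debug("removed code-fence markers")  # emitted
```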

Error Handling

The class handles parsing errors gracefully:

  • If json.loads fails, it attempts to use ast.literal_eval.
  • If both methods fail, it logs an error and returns None.
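
The fallback order described above can be sketched as follows. This is an illustrative stand-in under stated assumptions, not the library's source: parse_with_fallback and the logger name are hypothetical, and returning None for non-dictionary results is an added assumption.

```python
import ast
import json
import logging
from typing import Optional

logger = logging.getLogger("string2dict_sketch")  # hypothetical logger name


def parse_with_fallback(text: str) -> Optional[dict]:
    """Try json.loads first, then ast.literal_eval; return None if both fail."""
    try:
        result = json.loads(text)
    except json.JSONDecodeError:
        try:
            # literal_eval accepts Python literals such as single-quoted
            # strings, True/False/None, and trailing commas.
            result = ast.literal_eval(text)
        except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError):
            logger.error("Could not parse input as JSON or as a Python literal")
            return None
    # Assumption: only dictionaries count as a successful parse.
    return result if isinstance(result, dict) else None


print(parse_with_fallback('{"a": 1}'))              # {'a': 1}
print(parse_with_fallback("{'a': 1, 'ok': True}"))  # {'a': 1, 'ok': True}
print(parse_with_fallback("{oops"))                 # None
```

ast.literal_eval evaluates literals only, so it is a safe fallback for untrusted text in a way that eval is not.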


