Modules for converting different file types using AI-based validation and conversion.

datascav-switch

datascav-switch is a Python package for intelligent document format conversion, leveraging generative AI (OpenAI) and a scalable architecture. This project is part of a suite of tools for automation, data extraction, and transformation.


Main Features

  • PDF to Markdown conversion with layout preservation
  • Support for multiple input formats (file, URL, base64, bytes)
  • Parallel processing and dynamic logging
  • Detailed token tracking
  • Native integration with LangChain and tracing via LangSmith
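
The four input forms listed above (file, URL, base64, bytes) can all be derived from a single PDF with the standard library. The sketch below is only an illustration of what each form looks like; `input_variants` is a hypothetical helper, not part of datascav-switch, and it makes no assumption about how the package itself accepts these inputs.

```python
import base64
from pathlib import Path


def input_variants(pdf_path: str) -> dict:
    """Return the same PDF as a path, a file:// URL, raw bytes, and base64 text.

    Hypothetical helper for illustration; not part of datascav-switch.
    """
    path = Path(pdf_path).resolve()  # as_uri() needs an absolute path
    raw = path.read_bytes()
    return {
        "file": str(path),
        "url": path.as_uri(),  # local file:// URL, standing in for a remote URL
        "bytes": raw,
        "base64": base64.b64encode(raw).decode("ascii"),
    }
```

Decoding the `base64` entry with `base64.b64decode` yields the `bytes` entry again, so the variants carry identical content.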

Installation

pip install datascav-switch

Requirements:

  • Python 3.10+
  • OpenAI API key (OPENAI_API_KEY)
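
Both requirements can be checked before running a conversion. This is a minimal sketch using only the standard library; `missing_requirements` is a hypothetical helper name, and the only facts it relies on are the Python 3.10+ floor and the `OPENAI_API_KEY` variable named above.

```python
import os
import sys


def missing_requirements(env=None, version=None) -> list:
    """Return a list of human-readable problems; an empty list means ready to run."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else version
    problems = []
    if tuple(version) < (3, 10):
        problems.append("Python 3.10+ is required")
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems
```

Calling `missing_requirements()` with no arguments inspects the running interpreter and the current environment; the parameters exist only so the check can be exercised with other values.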

Quick Start

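from scav_switch.converters.pdf import ScavToMarkdown

# OPENAI_API_KEY must be set in the environment (see Requirements).
scav = ScavToMarkdown(model='gpt-4.1', verbose=True)
# Convert the PDF and print the resulting Markdown.
markdown = scav.dig('/path/to/file.pdf')
print(markdown)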
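
To convert several files and save the results, the single call above can be wrapped in a small batch helper. This is a sketch under stated assumptions: `convert_many` is a hypothetical function, not part of the package, and it takes any callable that maps a path to a Markdown string. Passing `scav.dig` as that callable assumes it accepts a path, returns a string, and is safe to call from several threads, none of which is confirmed here.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, List


def convert_many(convert: Callable[[str], str], pdf_paths: Iterable[str],
                 out_dir: str, max_workers: int = 4) -> List[Path]:
    """Convert each PDF with `convert` and write <stem>.md into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = [str(p) for p in pdf_paths]
    # Threads are used on the assumption that each call is I/O-bound (API requests).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(convert, paths))  # map() preserves input order
    written = []
    for src, markdown in zip(paths, results):
        target = out / (Path(src).stem + ".md")
        target.write_text(markdown, encoding="utf-8")
        written.append(target)
    return written
```

A call might look like `convert_many(scav.dig, ['a.pdf', 'b.pdf'], 'out')`. Files with the same stem in different folders would overwrite each other, so a real script may want a different naming scheme.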

For complete examples and detailed documentation, see the docs/ folder and the notebooks for each module.


License

MIT
