Modules for converting different file types using AI-based validation and conversion.

datascav-switch

datascav-switch is a Python package for intelligent document format conversion, leveraging generative AI (OpenAI) and a scalable architecture. This project is part of a suite of tools for automation, data extraction, and transformation.


Main Features

  • PDF to Markdown conversion with layout preservation
  • Support for multiple input formats (file, URL, base64, bytes)
  • Parallel processing and dynamic logging
  • Detailed token tracking
  • Native integration with LangChain and tracing via LangSmith
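The feature list names four input forms (file, URL, base64, bytes) but does not show how each is passed to the converter. As a minimal sketch using only the standard library, the helper below prepares three of those forms from a local PDF. The function name `pdf_input_variants` is hypothetical and not part of datascav-switch; how the package expects each form to be supplied should be checked in its docs/ folder.

```python
import base64
from pathlib import Path


def pdf_input_variants(pdf_path: str) -> dict[str, object]:
    """Return one PDF as a path string, raw bytes, and a base64 string.

    Hypothetical helper for illustration only; it does not call
    datascav-switch and makes no claim about the package's API.
    """
    raw = Path(pdf_path).read_bytes()
    return {
        "file": str(pdf_path),                            # filesystem path
        "bytes": raw,                                     # raw file content
        "base64": base64.b64encode(raw).decode("ascii"),  # text-safe encoding
    }
```

The base64 form is useful when the PDF has to travel through a text-only channel such as a JSON payload; decoding it with `base64.b64decode` gives back the original bytes.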

Installation

pip install datascav-switch

Requirements:

  • Python 3.12+
  • OpenAI API key (OPENAI_API_KEY)
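Both requirements can be checked before running a conversion. The following is a minimal sketch, assuming the API key is read from the `OPENAI_API_KEY` environment variable; `check_requirements` is a hypothetical helper, not something the package provides.

```python
import os
import sys


def check_requirements(env=None, version=None) -> list[str]:
    """Return a list of unmet requirements (empty if everything is fine).

    `env` and `version` default to the current process environment and
    interpreter version; they are parameters only to make testing easy.
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else version

    problems = []
    if tuple(version) < (3, 12):
        problems.append(f"Python 3.12+ required, found {version[0]}.{version[1]}")
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```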

Quick Start

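# Requires an OpenAI API key (OPENAI_API_KEY).
from scav_switch.converters.pdf import ScavToMarkdown
# Create a PDF-to-Markdown converter backed by the given OpenAI model.
scav = ScavToMarkdown(model='gpt-4.1', verbose=True)
# Convert the PDF and print the resulting Markdown.
markdown = scav.dig('/path/to/file.pdf')
print(markdown)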
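In practice the Markdown is usually written to disk rather than printed. The sketch below wraps the call shown in the Quick Start, assuming that `dig` returns the Markdown as a string (as the `print(markdown)` call suggests). `convert_to_file` is a hypothetical helper and accepts any object with a `dig` method, so it can be tried without an API key.

```python
from pathlib import Path


def convert_to_file(converter, pdf_path, out_path=None) -> Path:
    """Convert a PDF with `converter.dig` and save the result as Markdown.

    Assumption: `converter.dig(path)` returns a str. If `out_path` is
    omitted, the output is written next to the PDF with a .md suffix.
    """
    markdown = converter.dig(str(pdf_path))
    target = Path(out_path) if out_path else Path(pdf_path).with_suffix(".md")
    target.write_text(markdown, encoding="utf-8")
    return target


# Usage with the converter from the Quick Start (needs OPENAI_API_KEY):
# from scav_switch.converters.pdf import ScavToMarkdown
# convert_to_file(ScavToMarkdown(model='gpt-4.1'), '/path/to/file.pdf')
```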

For complete examples and detailed documentation, see the docs/ folder and the notebooks for each module.


License

MIT
