Parser for all.

🌊 AnyParser

AnyParser provides an API that accurately extracts content from unstructured documents (e.g., PDFs, images, charts) into a structured format.

🌱 Set up your AnyParser API key

To get started, generate your API key from the Sandbox Account Page. Each account comes with 100 free pages.

⚠️ Note: The free API is limited to 10 pages/call.
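
If a document is longer than the per-call limit, one option is to process it in page ranges. The sketch below only does the arithmetic: it splits a page count into ranges of at most 10 pages. The helper name `page_batches` is hypothetical and not part of AnyParser, and how you split the actual file into those ranges is left to your own tooling.

```python
def page_batches(total_pages: int, max_per_call: int = 10) -> list[tuple[int, int]]:
    """Return 1-based, inclusive (first_page, last_page) ranges.

    Hypothetical helper: each range covers at most `max_per_call` pages,
    matching the free-tier limit of 10 pages per call noted above.
    """
    if total_pages < 0 or max_per_call < 1:
        raise ValueError("total_pages must be >= 0 and max_per_call must be >= 1")
    return [
        (start, min(start + max_per_call - 1, total_pages))
        for start in range(1, total_pages + 1, max_per_call)
    ]


# A 23-page document needs three calls on the free tier.
print(page_batches(23))  # [(1, 10), (11, 20), (21, 23)]
```

Since each account includes 100 free pages, `len(page_batches(100))` shows that a full free allowance takes at least 10 calls.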

For more information or to inquire about larger usage plans, feel free to contact us at info@cambioml.com.

To set up your API key (CAMBIO_API_KEY), follow these steps:

  1. Create a .env file in the root directory of your project.
  2. Add the following line to the .env file:
CAMBIO_API_KEY=0cam************************

💻 Installation

1. Set Up a New Conda Environment and Install AnyParser

First, create and activate a new Conda environment, then install AnyParser:

conda create -n any-parse python=3.10 -y
conda activate any-parse
pip3 install any-parser

2. Create an AnyParser Instance Using Your API Key

Use your API key to create an instance of AnyParser. Make sure you’ve set up your .env file to store your API key securely:

import os
from dotenv import load_dotenv
from any_parser import AnyParser

# Load environment variables
load_dotenv(override=True)

# Get the API key from the environment
example_apikey = os.getenv("CAMBIO_API_KEY")

# Create an AnyParser instance
ap = AnyParser(api_key=example_apikey)
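
`os.getenv` returns `None` when the variable is missing, so a typo in the .env file may only surface later as an authentication error. A minimal sketch of a fail-fast check follows; `require_api_key` is a hypothetical helper, not part of AnyParser, and it relies only on the standard library.

```python
import os
from typing import Mapping, Optional


def require_api_key(
    env: Optional[Mapping[str, str]] = None,
    name: str = "CAMBIO_API_KEY",
) -> str:
    """Return the API key from `env`, or raise if it is missing or blank.

    Hypothetical helper: `env` defaults to os.environ, so call it after
    load_dotenv() has populated the environment.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file or environment.")
    return key
```

With this in place, the instance could be created as `AnyParser(api_key=require_api_key())` once `load_dotenv` has run.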

3. Run Synchronous Extraction

To extract data synchronously and receive immediate results:

# Extract content from the file and get the markdown output along with processing time
markdown, total_time = ap.extract(file_path="./data/test.pdf")
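
A common next step is to write the result to disk. The sketch below assumes the `markdown` value returned above is a plain string, which the source implies but does not state; `save_markdown` and the `output` directory are hypothetical names chosen for illustration.

```python
from pathlib import Path


def save_markdown(markdown: str, source_path: str, out_dir: str = "output") -> Path:
    """Write extracted Markdown to <out_dir>/<source file stem>.md and return the path.

    Hypothetical helper: assumes `markdown` is a str.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the output directory if needed
    target = out / (Path(source_path).stem + ".md")
    target.write_text(markdown, encoding="utf-8")
    return target
```

For the call above, `save_markdown(markdown, "./data/test.pdf")` would write `output/test.md`.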

4. Run Asynchronous Extraction

For asynchronous extraction, send the file for processing and fetch results later:

# Send the file to begin asynchronous extraction
file_id = ap.async_extract(file_path="./data/test.pdf")

# Fetch the extracted content using the file ID
markdown = ap.async_fetch(file_id=file_id)
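
Splitting submission from retrieval is most useful with several files: submit them all first, then collect the results. The sketch below uses only the two calls shown above. `extract_many` is a hypothetical helper, and whether `async_fetch` waits for a result that is not ready yet is not covered here, so check the library documentation before relying on it.

```python
from typing import Any, Dict, Iterable


def extract_many(parser: Any, file_paths: Iterable[str]) -> Dict[str, Any]:
    """Submit every file, then fetch each result by its file ID.

    Hypothetical helper: `parser` is any object exposing
    async_extract(file_path=...) and async_fetch(file_id=...),
    such as the `ap` instance created earlier.
    """
    # Submit all files up front so processing can begin for each of them.
    file_ids = {path: parser.async_extract(file_path=path) for path in file_paths}
    # Fetch the results in the same order the files were submitted.
    return {path: parser.async_fetch(file_id=file_id) for path, file_id in file_ids.items()}
```

Usage would look like `results = extract_many(ap, ["./data/a.pdf", "./data/b.pdf"])`, where both paths are placeholders.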

📜 Examples

Check out these examples to see how you can use AnyParser to extract text, numbers, and symbols in fewer than 10 lines of code!

Extract All Text and Layout from a PDF into Markdown Format

Are you an AI engineer looking to accurately extract both the text and layout (e.g., table of contents or Markdown header hierarchy) from a PDF? Check out this 3-minute notebook demo.

Extract a Table from an Image into Markdown Format

Are you a financial analyst needing to accurately extract numbers from a table within an image? Explore this 3-minute notebook example.
