Parser for all.
🌊 AnyParser
AnyParser provides an API to accurately extract your unstructured data (e.g. PDFs, images, charts) into a structured format.
:seedling: Set up your AnyParser API key
AnyParser is still in private beta. If you are interested in testing our document models, please reach out at info@cambioml.com for a FREE API key.
To set up your API key `CAMBIO_API_KEY`, you will need to:
- create a `.env` file in your root folder;
- add the following line to your `.env` file:
CAMBIO_API_KEY=17b************************
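Once the key is in `.env`, a script needs to read it back before calling AnyParser. The following is a minimal sketch using only the Python standard library; the helper names `load_env_file` and `get_api_key` are hypothetical and not part of the `any-parser` package, and the parser assumes plain `KEY=VALUE` lines like the one above.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path=".env"):
    """Return CAMBIO_API_KEY from the environment, else from the .env file."""
    key = os.environ.get("CAMBIO_API_KEY") or load_env_file(path).get("CAMBIO_API_KEY")
    if not key:
        raise RuntimeError("CAMBIO_API_KEY is not set; add it to your .env file")
    return key
```

An already exported environment variable takes precedence over the file, so the same script can run locally and in environments where no `.env` file is present.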
:computer: Installation
conda create -n any-parse python=3.10 -y
conda activate any-parse
pip3 install any-parser
Bash file usage
To use AnyParser via `curl` requests, you can run the following bash command from the root folder of this repository:
bash parse.sh <your apiKey> <file path> <prompt for parse (optional, default="")>
For example, to extract a table from a PDF file, you can run the following command:
bash parse.sh gl************************************** /path/to/your/file.pdf "Return the table in a JSON format with each box's key and value."
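The same invocation can be driven from Python when parsing several files in a loop. This is a rough sketch that only wraps the `bash parse.sh <apiKey> <file path> <prompt>` command shown above; the function names `build_parse_command` and `run_parse` are hypothetical, and since the output of `parse.sh` is not described here, the wrapper simply returns whatever the script prints.

```python
import subprocess


def build_parse_command(api_key, file_path, prompt=""):
    """Build the argument list for: bash parse.sh <apiKey> <file path> <prompt>."""
    # An empty prompt matches the script's documented default of "".
    return ["bash", "parse.sh", api_key, file_path, prompt]


def run_parse(api_key, file_path, prompt="", repo_root="."):
    """Run parse.sh from the repository root and return its standard output."""
    result = subprocess.run(
        build_parse_command(api_key, file_path, prompt),
        cwd=repo_root,        # parse.sh is expected in the repository root
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str rather than bytes
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or apostrophes, as the example prompt above does.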
:scroll: Examples
AnyParser can extract text, numbers and symbols from PDFs, images, etc. Check out each notebook below to run AnyParser within 10 lines of code!
Extract a Table from PDF into Excel
Do you want to extract a complicated table from a financial report (PDF) into an Excel spreadsheet? Check out this notebook (3-min read)!
Extract a Table from an Image into Markdown Format
Are you a financial analyst who needs to extract ACCURATE numbers from a table in an image or a PDF? Check out this notebook (3-min read)!
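If the extracted table comes back as Markdown, it may need one more step before it opens in a spreadsheet. The sketch below is not taken from the notebooks; it assumes the output is a simple pipe-delimited Markdown table with no escaped `|` characters inside cells, and the helper names `markdown_table_to_rows` and `rows_to_csv` are hypothetical. It uses only the standard library and produces CSV text, which Excel can open.

```python
import csv
import io


def markdown_table_to_rows(markdown):
    """Split a pipe-delimited Markdown table into rows of cell strings."""
    rows = []
    for line in markdown.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore anything that is not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header separator row, e.g. | --- | ---: |
        if all("-" in cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    return rows


def rows_to_csv(rows):
    """Serialize rows as CSV text, quoting cells that contain commas."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Example with made-up values, for illustration only.
table = """
| Item    | Amount |
| ------- | -----: |
| Revenue | 1,200  |
"""
print(rows_to_csv(markdown_table_to_rows(table)))
```

Cells are kept as strings so that formatted figures such as `1,200` are written exactly as extracted rather than being reinterpreted as numbers.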
Hashes for any_parser-0.0.12-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | b8c93c6e455d25d6d1f24b902b8a2588a20d1da8a76d002af63930ad9178060d
MD5 | acce29994f5a38fd7c583847c9ead8c1
BLAKE2b-256 | 8635d4d79a2b771e82075387ab024a2771eca60903f69060ab14711fbf6b0864