SDK to interact with the NuMind models API.

Maintainers

NuMind

Project description

NuMind SDK

Python SDK to interact with NuMind's models API: NuExtract and NuMarkdown.

Installation

pip install numind

Usage and code examples

Create a client

You must first get an API key on the NuExtract platform.

import os

from numind import NuMind

# Create a client object to interact with the API
# Providing the `api_key` is not required if the `NUMIND_API_KEY` environment variable
# is already set.
client = NuMind(api_key=os.environ["NUMIND_API_KEY"])
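If the variable is missing, the lookup above fails with a bare KeyError. A minimal sketch of a friendlier lookup follows. `require_api_key` is a hypothetical helper, not part of the SDK; only the `NUMIND_API_KEY` variable name comes from this README.

```python
import os
from collections.abc import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key found in `env`, or raise a readable error if it is unset or empty."""
    api_key = env.get("NUMIND_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "NUMIND_API_KEY is not set; create a key on the NuExtract platform first."
        )
    return api_key


# Demonstration with an explicit mapping instead of the real environment.
print(require_api_key({"NUMIND_API_KEY": "example-key"}))
```

The returned string can then be passed as `api_key` when creating the client.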

Create an async client

You can create an async client by using the NuMindAsync class:

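import asyncio

from numind import NuMindAsync

client = NuMindAsync(api_key="API_KEY")
project_id = "PROJECT_ID"  # replace with the ID of your own project
# One dict of keyword arguments per extraction request
requests = [{"input_text": "This is a text example"}]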

async def main():
    return [
        await client.extract_structured_data(project_id, **request_kwargs)
        for request_kwargs in requests
    ]


responses = asyncio.run(main())

The methods and their usage are the same as for the synchronous NuMind client, except that the API methods are coroutines that must be awaited.
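The loop in the example above awaits each request before starting the next one. The sketch below shows one way to run several awaited calls concurrently with `asyncio.gather`, capped by a semaphore. `fake_extract` is a hypothetical stand-in so that the snippet runs without an API key, and `run_all` and `max_concurrency` are illustrative names. Whether the platform tolerates concurrent requests, and at what rate, is not stated in this README.

```python
import asyncio


async def fake_extract(project_id: str, **kwargs) -> dict:
    """Hypothetical stand-in for an awaited client method (no network access)."""
    await asyncio.sleep(0.01)
    return {"project_id": project_id, "echo": kwargs.get("input_text")}


async def run_all(call, project_id: str, requests: list, max_concurrency: int = 4) -> list:
    """Await `call` once per request, with at most `max_concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(request_kwargs: dict):
        async with semaphore:
            return await call(project_id, **request_kwargs)

    # gather returns the results in the same order as the requests
    return await asyncio.gather(*(one(kwargs) for kwargs in requests))


requests = [{"input_text": "first"}, {"input_text": "second"}]
responses = asyncio.run(run_all(fake_extract, "PROJECT_ID", requests))
print(responses)
```

To try it against the real service, one would pass a coroutine method of the async client in place of `fake_extract`.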

NuExtract: Extract structured information "on the fly"

To extract structured information without creating a project, pass the input template directly to the extract_structured_data method, which provides a more user-friendly way to interact with the API:

template = {
    "destination": {
        "name": "verbatim-string",
        "zip_code": "string",
        "country": "string"
    },
    "accommodation": "verbatim-string",
    "activities": ["verbatim-string"],
    "duration": {
        "time_unit": ["day", "week", "month", "year"],
        "time_quantity": "integer"
    }
}
input_text = """My dream vacation would be a month-long escape to the stunning islands of Tahiti.
I’d stay in an overwater bungalow in Bora Bora, waking up to crystal-clear turquoise waters and breathtaking sunrises.
Days would be spent snorkeling with vibrant marine life, paddleboarding over coral gardens, and basking on pristine white-sand beaches.
I’d explore lush rainforests, hidden waterfalls, and the rich Polynesian culture through traditional dance, music, and cuisine.
Evenings would be filled with romantic beachside dinners under the stars, with the soothing sound of waves as the perfect backdrop."""

output = client.extract_structured_data(template=template, input_text=input_text)
print(output)

# Can also work with files, replace the path with your own
# from pathlib import Path
# output = client.extract(template=template, input_file="file.ppt")

{
    "destination": {
        "name": "Tahiti",
        "zip_code": "98730",
        "country": "France"
    },
    "accommodation": "overwater bungalow in Bora Bora",
    "activities": [
        "snorkeling",
        "paddleboarding",
        "basking",
        "explore lush rainforests, hidden waterfalls, and the rich Polynesian culture"
    ],
    "duration": {
        "time_unit": null,
        "time_quantity": null
    }
}
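In the output above, both `duration` fields come back as null. A small sketch for listing such unfilled fields follows. It assumes the result is available as plain Python dicts and lists (for example after parsing the JSON); the actual return type of the SDK call is not stated here, and `null_paths` is a hypothetical helper.

```python
from typing import Any


def null_paths(value: Any, prefix: str = "") -> list[str]:
    """Return the dotted path of every field whose value is None."""
    if value is None:
        return [prefix]
    paths: list[str] = []
    if isinstance(value, dict):
        for key, child in value.items():
            paths += null_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            paths += null_paths(child, f"{prefix}[{index}]")
    return paths


output = {
    "destination": {"name": "Tahiti", "zip_code": "98730", "country": "France"},
    "accommodation": "overwater bungalow in Bora Bora",
    "activities": ["snorkeling", "paddleboarding"],
    "duration": {"time_unit": None, "time_quantity": None},
}
print(null_paths(output))  # ['duration.time_unit', 'duration.time_quantity']
```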

Create a good template

NuExtract uses JSON schemas as extraction templates. A template specifies which information to retrieve and the type of each field. The supported types are:

string: free text whose value can be abstract, i.e. it need not appear in the document and may be deduced from calculations, reasoning, or external knowledge;
verbatim-string: purely extractive text whose value must be present in the document. Some flexibility may be allowed in the formatting, e.g. new lines and escaped characters (such as \n) in a document may be represented by a space;
integer: an integer number;
number: any number, either a floating point number or an integer;
boolean: a boolean whose value should be either true or false;
date-time: a date or time whose value should follow the ISO 8601 standard (YYYY-MM-DDThh:mm:ss). It may have "reduced" accuracy, i.e. omit date or time components that are not useful in a given case. For example, if the extracted value is a date, YYYY-MM-DD is a valid format. The same applies to times with the hh:mm:ss format (keeping the leading T symbol). Additionally, the "least significant" component may be omitted if it is not required or specified. For example, a month and year can be given as YYYY-MM, omitting the day component DD, and an hour can be given as hh, omitting the minutes and seconds. When combining a date and a time, only the least significant time components can be omitted, e.g. YYYY-MM-DDThh:mm, which omits the seconds.
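The reduced-accuracy rules for date-time values can be sketched as a regular expression. This is an illustrative check written from the description above, not the validation NuExtract performs. It assumes that time-only values keep the leading T and that a year alone is acceptable, and it checks only the shape of the value, not calendar ranges (month 13 would pass). `looks_like_date_time` is a hypothetical name.

```python
import re

_DATE = r"[0-9]{4}(?:-[0-9]{2}(?:-[0-9]{2})?)?"        # YYYY, YYYY-MM or YYYY-MM-DD
_FULL_DATE = r"[0-9]{4}-[0-9]{2}-[0-9]{2}"              # YYYY-MM-DD only
_TIME = r"T[0-9]{2}(?::[0-9]{2}(?::[0-9]{2})?)?"        # Thh, Thh:mm or Thh:mm:ss

# A date alone, a time alone, or a complete date followed by a (possibly reduced) time.
_PATTERN = re.compile(rf"(?:{_DATE}|{_TIME}|{_FULL_DATE}{_TIME})")


def looks_like_date_time(value: str) -> bool:
    """Return True if `value` has one of the shapes described for the date-time type."""
    return _PATTERN.fullmatch(value) is not None


for candidate in ["2026-05-11", "2026-05", "T14:30", "2026-05-11T14:30", "2026-05T14", "14:30:00"]:
    print(candidate, looks_like_date_time(candidate))
```

The last two candidates are rejected: the first combines a time with an incomplete date, the second omits the leading T.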

Additionally, the value of a field can be:

a nested dictionary, i.e. another branch, describing elements associated with their parent node (key);
an array of items of the form ["type"], whose values are elements of a given "type", which can also be a dictionary of unspecified depth;
an enum, i.e. a list of elements to choose from, of the form ["choice1", "choice2", ...]. When providing a value of this type, set it to the chosen item itself, e.g. "choice1", and not to an array containing the item such as ["choice1"];
a multi-enum, i.e. a list from which multiple elements can be picked, of the form [["choice1", "choice2", ...]] (double square brackets).
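The shapes listed above can be told apart mechanically. The sketch below classifies each value of a template and rejects unknown leaf types. It is a heuristic written from this description only: a one-element list is read as an array when its item is a known type name or a dictionary, and as an enum otherwise. `describe` and `LEAF_TYPES` are hypothetical names, not part of the SDK.

```python
from typing import Any

LEAF_TYPES = {"string", "verbatim-string", "integer", "number", "boolean", "date-time"}


def describe(node: Any) -> str:
    """Classify one template value; raise ValueError if its shape is not recognised."""
    if isinstance(node, str):
        if node not in LEAF_TYPES:
            raise ValueError(f"unknown leaf type: {node!r}")
        return node
    if isinstance(node, dict):
        for child in node.values():
            describe(child)  # validate nested branches recursively
        return "object"
    if isinstance(node, list) and node:
        first = node[0]
        if len(node) == 1 and isinstance(first, list):
            if first and all(isinstance(choice, str) for choice in first):
                return "multi-enum"
            raise ValueError(f"unrecognised multi-enum: {node!r}")
        if len(node) == 1 and (isinstance(first, dict) or first in LEAF_TYPES):
            return f"array of {describe(first)}"
        if all(isinstance(choice, str) for choice in node):
            return "enum"
    raise ValueError(f"unrecognised template value: {node!r}")


template = {
    "destination": {"name": "verbatim-string", "zip_code": "string", "country": "string"},
    "accommodation": "verbatim-string",
    "activities": ["verbatim-string"],
    "duration": {"time_unit": ["day", "week", "month", "year"], "time_quantity": "integer"},
}
for key, value in template.items():
    print(key, "->", describe(value))
```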

Inferring a template

The "infer_template" method lets you quickly create, from a text description, a template that you can start working with.

from numind.openapi_client import TemplateRequest
from pydantic import StrictStr

description = "Create a template that extracts key information from an order confirmation email. The template should be able to pull details like the order ID, customer ID, date and time of the order, status, total amount, currency, item details (product ID, quantity, and unit price), shipping address, any customer requests or delivery preferences, and the estimated delivery date."
input_schema = client.post_api_infer_template(
    template_request=TemplateRequest(description=StrictStr(description))
)

Create a project

A project lets you define an information extraction task from a template and examples.

from numind.openapi_client import CreateProjectRequest

project_id = client.post_api_structured_extraction(
    CreateProjectRequest(
        name="vacation",
        description="Extraction of locations and activities",
        template=template,
    )
)

The project_id can also be found in the "API" tab of a project on the NuExtract website.

Add examples to a project to teach NuExtract via ICL (In-Context Learning)

from pathlib import Path

# Prepare examples, here a text and a file
example_1_input = "This is a text example"
example_1_expected_output = {
    "destination": {"name": None, "zip_code": None, "country": None}
}
with Path("example_2.odt").open("rb") as file:  # read bytes
    example_2_input = file.read()
example_2_expected_output = {
    "destination": {"name": None, "zip_code": None, "country": None}
}
examples = [
    (example_1_input, example_1_expected_output),
    (example_2_input, example_2_expected_output),
]

# Add the examples to the project
client.add_examples_to_structured_extraction_project(project_id, examples)
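When there are many examples, the list of (input, expected output) pairs can be built from a folder instead of by hand. The sketch below assumes a hypothetical layout in which each input file sits next to a `.json` file of the same name holding its expected output. It reads every input as bytes, as in the file example above. `load_examples` is an illustrative helper, not part of the SDK.

```python
import json
import tempfile
from pathlib import Path


def load_examples(folder: Path) -> list[tuple[bytes, dict]]:
    """Pair every `<name>.json` expected output with its single `<name>.<ext>` input file."""
    examples = []
    for expected_path in sorted(folder.glob("*.json")):
        inputs = [
            path
            for path in folder.glob(f"{expected_path.stem}.*")
            if path.suffix != ".json"
        ]
        if len(inputs) != 1:
            raise ValueError(f"expected exactly one input file for {expected_path.name}")
        expected = json.loads(expected_path.read_text(encoding="utf-8"))
        examples.append((inputs[0].read_bytes(), expected))
    return examples


# Demonstration in a temporary folder so that the snippet is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "trip.txt").write_text("This is a text example", encoding="utf-8")
    (folder / "trip.json").write_text(
        json.dumps({"destination": {"name": None}}), encoding="utf-8"
    )
    examples = load_examples(folder)
print(examples)
```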

Extract structured information from text

output_schema = client.extract_structured_data(project_id, input_text=input_text)

Extract structured information from a file

from pathlib import Path

file_path = Path("document.odt")
with file_path.open("rb") as file:
    input_file = file.read()
output_schema = client.extract(project_id, input_file=input_file)

NuMarkdown: Convert a document to RAG-ready Markdown

from pathlib import Path

file_path = Path("document.pdf")
with file_path.open("rb") as file:
    input_file = file.read()
markdown = client.extract_content(input_file)

Documentation

Extracting Information from Documents

Once your project is ready, you can use it to extract information from documents in real time via this RESTful API.

Each project has its own extraction endpoint:

https://nuextract.ai/api/projects/{projectId}/extract

You provide a document and the endpoint returns the extracted information according to the task defined in the project. To use it, you need to:

Create an API key in the Account section
Replace {projectId} with the project ID found in the API tab of the project

You can test your extraction endpoint in your terminal with the following curl command (make sure to replace the values of NUEXTRACT_API_KEY, PROJECT_ID and FILE_NAME):

NUEXTRACT_API_KEY="_your_api_key_here_"; \
PROJECT_ID="a24fd84a-44ab-4fd4-95a9-bebd46e4768b"; \
FILE_NAME="path/to/document.odt"; \
curl "https://nuextract.ai/api/projects/${PROJECT_ID}/extract" \
  -X POST \
  -H "Authorization: Bearer ${NUEXTRACT_API_KEY}" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @"${FILE_NAME}"

You can also use the Python SDK, by replacing the project_id, api_key and file_path variables in the following code:

from numind import NuMind
from pathlib import Path

client = NuMind(api_key=api_key)
file_path = Path("path", "to", "document.odt")
with file_path.open("rb") as file:
    input_file = file.read()
output_schema = client.post_api_projects_projectid_extract(project_id, input_file)
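Where neither curl nor the SDK is available, the same request can be assembled with the Python standard library. The sketch below only builds the request object, mirroring the URL, method and headers of the curl command in this section; sending it is left commented out because it needs a real key and project. `build_extract_request` is a hypothetical helper name.

```python
import urllib.request


def build_extract_request(project_id: str, api_key: str, document: bytes) -> urllib.request.Request:
    """Build (but do not send) the POST request described by the curl command."""
    return urllib.request.Request(
        url=f"https://nuextract.ai/api/projects/{project_id}/extract",
        data=document,  # raw file bytes, like curl's --data-binary
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/octet-stream",
        },
    )


request = build_extract_request("PROJECT_ID", "API_KEY", b"raw document bytes")
print(request.get_method(), request.full_url)

# Sending requires valid credentials, so it is not executed here:
# with urllib.request.urlopen(request) as response:
#     print(response.read())
```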

Using the Platform via API

Everything you can do on the web platform can also be done via the API; check the user guide to learn how the platform works. This is useful, for example, to create projects automatically or to make a production deployment more robust.

Main resources

Project - user project, identified by projectId
File - uploaded file, identified by fileId, stored up to two weeks if not tied to an Example
Document - internal representation of a document, identified by documentId, created from a File or a text, stored up to two weeks if not tied to an Example
Example - document-extraction pair given to teach NuExtract, identified by exampleId, created from a Document

Most common API operations

Creating a Project via POST /api/projects
Changing the template of a Project via PATCH /api/projects/{projectId}
Uploading a file as a File via POST /api/files (stored for up to two weeks)
Creating a Document from a text or a File via POST /api/documents/text or POST /api/files/{fileId}/convert-to-document
Adding an Example to a Project via POST /api/projects/{projectId}/examples
Changing Project settings via POST /api/projects/{projectId}/settings
Locking a Project via POST /api/projects/{projectId}/lock
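The operations listed above can be kept in a small lookup table that turns an operation and its path parameters into a method and a full URL. This is a sketch only: the dictionary keys and the `resolve` helper are hypothetical names, while the paths and the base URL are the ones given in this document. No request is sent.

```python
from urllib.parse import quote

BASE_URL = "https://nuextract.ai"

# Operation name (illustrative) -> (HTTP method, path template from the list above)
OPERATIONS = {
    "create_project": ("POST", "/api/projects"),
    "change_template": ("PATCH", "/api/projects/{projectId}"),
    "upload_file": ("POST", "/api/files"),
    "create_document_from_text": ("POST", "/api/documents/text"),
    "create_document_from_file": ("POST", "/api/files/{fileId}/convert-to-document"),
    "add_example": ("POST", "/api/projects/{projectId}/examples"),
    "change_settings": ("POST", "/api/projects/{projectId}/settings"),
    "lock_project": ("POST", "/api/projects/{projectId}/lock"),
}


def resolve(operation: str, **path_params: str) -> tuple[str, str]:
    """Return (HTTP method, absolute URL) for an operation, percent-encoding path parameters."""
    method, path = OPERATIONS[operation]
    encoded = {name: quote(value, safe="") for name, value in path_params.items()}
    return method, BASE_URL + path.format(**encoded)


print(resolve("add_example", projectId="a24fd84a-44ab-4fd4-95a9-bebd46e4768b"))
```

A missing path parameter raises a KeyError, which surfaces mistakes before any request is built.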

This Python package is automatically generated by the OpenAPI Generator project:

API version:
Package version: 1.0.0
Generator version: 7.22.0
Build package: org.openapitools.codegen.languages.PythonClientCodegen

Documentation for API Endpoints

All URIs are relative to https://nuextract.ai

Class	Method	HTTP request
ContentExtractionApi	get_api_content_extraction_jobs_contentextractionjobid	GET /api/content-extraction/jobs/{contentExtractionJobId}
ContentExtractionApi	post_api_content_extraction_jobs	POST /api/content-extraction/jobs
ContentExtractionProjectManagementApi	delete_api_content_extraction_contentprojectid	DELETE /api/content-extraction/{contentProjectId}
ContentExtractionProjectManagementApi	get_api_content_extraction	GET /api/content-extraction
ContentExtractionProjectManagementApi	patch_api_content_extraction_contentprojectid	PATCH /api/content-extraction/{contentProjectId}
ContentExtractionProjectManagementApi	patch_api_content_extraction_contentprojectid_settings	PATCH /api/content-extraction/{contentProjectId}/settings
ContentExtractionProjectManagementApi	post_api_content_extraction	POST /api/content-extraction
ContentExtractionProjectManagementApi	post_api_content_extraction_contentprojectid_reset_settings	POST /api/content-extraction/{contentProjectId}/reset-settings
DefaultApi	get_api_debug_status_code	GET /api/debug/status/{code}
DefaultApi	get_api_health	GET /api/health
DefaultApi	get_api_inference_status	GET /api/inference-status
DefaultApi	get_api_ping	GET /api/ping
DefaultApi	get_api_version	GET /api/version
DocumentsApi	get_api_documents_documentid	GET /api/documents/{documentId}
DocumentsApi	get_api_documents_documentid_content	GET /api/documents/{documentId}/content
DocumentsApi	post_api_documents_documentid_new_owner	POST /api/documents/{documentId}/new-owner
DocumentsApi	post_api_documents_text	POST /api/documents/text
FilesApi	get_api_files_fileid	GET /api/files/{fileId}
FilesApi	get_api_files_fileid_content	GET /api/files/{fileId}/content
FilesApi	post_api_files	POST /api/files
FilesApi	post_api_files_fileid_convert_to_document	POST /api/files/{fileId}/convert-to-document
InferenceApi	post_api_content_extraction_contentprojectid_jobs_document_documentid	POST /api/content-extraction/{contentProjectId}/jobs/document/{documentId}
InferenceApi	post_api_structured_extraction_structuredprojectid_jobs_document_documentid	POST /api/structured-extraction/{structuredProjectId}/jobs/document/{documentId}
InferenceApi	post_api_structured_extraction_structuredprojectid_jobs_text	POST /api/structured-extraction/{structuredProjectId}/jobs/text
InferenceApi	post_api_template_generation_jobs_document_documentid	POST /api/template-generation/jobs/document/{documentId}
InferenceApi	post_api_template_generation_jobs_text	POST /api/template-generation/jobs/text
JobsApi	get_api_jobs	GET /api/jobs
JobsApi	get_api_jobs_jobid_status	GET /api/jobs/{jobId}/status
JobsApi	get_api_jobs_jobid_stream	GET /api/jobs/{jobId}/stream
StructuredDataExtractionApi	get_api_structured_extraction_jobs_structuredextractionjobid	GET /api/structured-extraction/jobs/{structuredExtractionJobId}
StructuredDataExtractionApi	post_api_structured_extraction_structuredprojectid_jobs	POST /api/structured-extraction/{structuredProjectId}/jobs
StructuredExtractionExamplesApi	delete_api_structured_extraction_structuredprojectid_examples_structuredexampleid	DELETE /api/structured-extraction/{structuredProjectId}/examples/{structuredExampleId}
StructuredExtractionExamplesApi	get_api_structured_extraction_structuredprojectid_examples	GET /api/structured-extraction/{structuredProjectId}/examples
StructuredExtractionExamplesApi	get_api_structured_extraction_structuredprojectid_examples_structuredexampleid	GET /api/structured-extraction/{structuredProjectId}/examples/{structuredExampleId}
StructuredExtractionExamplesApi	post_api_structured_extraction_structuredprojectid_examples	POST /api/structured-extraction/{structuredProjectId}/examples
StructuredExtractionExamplesApi	put_api_structured_extraction_structuredprojectid_examples_structuredexampleid	PUT /api/structured-extraction/{structuredProjectId}/examples/{structuredExampleId}
StructuredExtractionProjectManagementApi	delete_api_structured_extraction_structuredprojectid	DELETE /api/structured-extraction/{structuredProjectId}
StructuredExtractionProjectManagementApi	get_api_structured_extraction	GET /api/structured-extraction
StructuredExtractionProjectManagementApi	get_api_structured_extraction_structuredprojectid	GET /api/structured-extraction/{structuredProjectId}
StructuredExtractionProjectManagementApi	get_api_structured_extraction_structuredprojectid_thumbnail	GET /api/structured-extraction/{structuredProjectId}/thumbnail
StructuredExtractionProjectManagementApi	patch_api_structured_extraction_structuredprojectid	PATCH /api/structured-extraction/{structuredProjectId}
StructuredExtractionProjectManagementApi	patch_api_structured_extraction_structuredprojectid_settings	PATCH /api/structured-extraction/{structuredProjectId}/settings
StructuredExtractionProjectManagementApi	post_api_structured_extraction	POST /api/structured-extraction
StructuredExtractionProjectManagementApi	post_api_structured_extraction_structuredprojectid_duplicate	POST /api/structured-extraction/{structuredProjectId}/duplicate
StructuredExtractionProjectManagementApi	post_api_structured_extraction_structuredprojectid_lock	POST /api/structured-extraction/{structuredProjectId}/lock
StructuredExtractionProjectManagementApi	post_api_structured_extraction_structuredprojectid_reset_settings	POST /api/structured-extraction/{structuredProjectId}/reset-settings
StructuredExtractionProjectManagementApi	post_api_structured_extraction_structuredprojectid_share	POST /api/structured-extraction/{structuredProjectId}/share
StructuredExtractionProjectManagementApi	post_api_structured_extraction_structuredprojectid_unlock	POST /api/structured-extraction/{structuredProjectId}/unlock
StructuredExtractionProjectManagementApi	post_api_structured_extraction_structuredprojectid_unshare	POST /api/structured-extraction/{structuredProjectId}/unshare
TemplateGenerationApi	get_api_template_generation_jobs_templatejobid	GET /api/template-generation/jobs/{templateJobId}
TemplateGenerationApi	post_api_template_generation_jobs	POST /api/template-generation/jobs
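The method names in the table appear to follow mechanically from the HTTP request column: the verb, then the path with slashes and hyphens turned into underscores and braces dropped, all in lower case. The sketch below reproduces that pattern, which can help locate the method for a given endpoint. It is a pattern observed in the rows above, not a documented guarantee of the generator, and `method_name` is a hypothetical helper.

```python
import re


def method_name(http_request: str) -> str:
    """Derive the generated method name from a string such as 'GET /api/jobs/{jobId}/status'."""
    verb, path = http_request.split(maxsplit=1)
    path = path.replace("{", "").replace("}", "")  # drop path-parameter braces
    return (verb + re.sub(r"[/-]", "_", path)).lower()


for row in [
    "GET /api/jobs/{jobId}/status",
    "POST /api/files/{fileId}/convert-to-document",
    "POST /api/structured-extraction/{structuredProjectId}/jobs/text",
]:
    print(row, "->", method_name(row))
```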

Documentation For Models

Documentation For Authorization

Authentication schemes defined for the API:

oauth2Auth

Type: OAuth
Flow: accessCode
Authorization URL: https://users.numind.ai/realms/extract-platform/protocol/openid-connect/auth
Scopes:
openid: OpenID connect
profile: view profile
email: view email
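For the accessCode flow, the first step is to send the user to the authorization URL with the standard OAuth 2.0 authorization-code parameters. The sketch below only builds that URL from the values listed above. The `client_id`, `redirect_uri` and `state` values are placeholders: this document does not say how clients are registered, so treat them as assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

AUTHORIZATION_URL = (
    "https://users.numind.ai/realms/extract-platform/protocol/openid-connect/auth"
)
SCOPES = ("openid", "profile", "email")


def authorization_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build the URL that starts an OAuth 2.0 authorization-code flow."""
    query = urlencode(
        {
            "response_type": "code",  # authorization-code ("accessCode") flow
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "scope": " ".join(SCOPES),
            "state": state,  # opaque value echoed back, used to prevent CSRF
        }
    )
    return f"{AUTHORIZATION_URL}?{query}"


# Placeholder values for illustration only.
url = authorization_url("my-client-id", "https://example.com/callback", "random-state")
params = parse_qs(urlsplit(url).query)
print(params["scope"])
```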

Release history

This version

0.3.0

May 11, 2026

0.2.2

Mar 25, 2026

0.2.1

Feb 13, 2026

0.2.0

Dec 19, 2025

0.1.3

Oct 24, 2025

0.1.2

Oct 13, 2025

0.1.1

Oct 8, 2025

0.1.0

Sep 24, 2025

0.0.3

Sep 22, 2025

0.0.2

Aug 29, 2025

0.0.1.post2

Jul 28, 2025

0.0.1.post1

Jun 23, 2025

0.0.1

Jun 18, 2025

0.0.0.dev0 pre-release

Feb 27, 2025

Download files

Download the file for your platform.

Source Distribution

numind-0.3.0.tar.gz (482.9 kB)

Uploaded May 11, 2026 Source

Built Distribution

numind-0.3.0-py3-none-any.whl (302.5 kB)

Uploaded May 11, 2026 Python 3

File details

Details for the file numind-0.3.0.tar.gz.

File metadata

Download URL: numind-0.3.0.tar.gz
Upload date: May 11, 2026
Size: 482.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for numind-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`6a701a10bc1fb2c73ecc0ef7f5d3965a2a484ee19c72af9e7862a5d25f451d65`
MD5	`937a2821154f8be35203e453171de14c`
BLAKE2b-256	`77254efb3c97a608123be71d2e5ccca4d3eae065a1271679bfa28bdd3f2b2db0`
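A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below hashes a file in chunks and compares the result with an expected hex digest. The helper names are hypothetical, and the demonstration uses a temporary file rather than the real archive so that it runs anywhere.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as file:
        while chunk := file.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the expected one, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Demonstration on a temporary file; for the real sdist, pass the downloaded
# numind-0.3.0.tar.gz and the SHA256 value listed in the table above.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"example payload")
    expected = hashlib.sha256(b"example payload").hexdigest()
    ok = matches_digest(sample, expected)
    bad = matches_digest(sample, "0" * 64)
print(ok, bad)
```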

Provenance

The following attestation bundles were made for numind-0.3.0.tar.gz:

Publisher: publish-pypi.yml on numindai/nuextract-platform-sdk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: numind-0.3.0.tar.gz
- Subject digest: 6a701a10bc1fb2c73ecc0ef7f5d3965a2a484ee19c72af9e7862a5d25f451d65
- Sigstore transparency entry: 1507750620
- Sigstore integration time: May 11, 2026
Source repository:
- Permalink: numindai/nuextract-platform-sdk@8ec06d397a50d0987d61476624da4a6d926dddeb
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/numindai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@8ec06d397a50d0987d61476624da4a6d926dddeb
- Trigger Event: release

File details

Details for the file numind-0.3.0-py3-none-any.whl.

File metadata

Download URL: numind-0.3.0-py3-none-any.whl
Upload date: May 11, 2026
Size: 302.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for numind-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4a32a1547e192e4acc2938fb01bb88a2fe1e81e5992286fb001177f6836b8b7f`
MD5	`bff28337ac5a4cab8455de002e11d894`
BLAKE2b-256	`a3460b2d4e1d87bbb50f556c1d69bd27ca283092aab0e319d0553178161481b0`

Provenance

The following attestation bundles were made for numind-0.3.0-py3-none-any.whl:

Publisher: publish-pypi.yml on numindai/nuextract-platform-sdk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: numind-0.3.0-py3-none-any.whl
- Subject digest: 4a32a1547e192e4acc2938fb01bb88a2fe1e81e5992286fb001177f6836b8b7f
- Sigstore transparency entry: 1507750710
- Sigstore integration time: May 11, 2026
Source repository:
- Permalink: numindai/nuextract-platform-sdk@8ec06d397a50d0987d61476624da4a6d926dddeb
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/numindai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@8ec06d397a50d0987d61476624da4a6d926dddeb
- Trigger Event: release
