Open-source tool for fast, accurate data extraction from scientific literature, combining LLMs with human-in-the-loop validation.

Project description

Extralit

Extract structured data from scientific literature with human validation

Extralit is an open-source platform that transforms how researchers extract structured data from scientific literature. Want to get started? Check out our documentation.

Why use Extralit?

Accelerate Scientific Data Collection

Manual data extraction from research papers is slow and error-prone, often taking 6-12 months for systematic reviews. Extralit combines AI-powered extraction with human validation to reduce this to weeks while maintaining research-grade accuracy.

Take Control of Your Research Data

Most scientific data extraction tools are inflexible black boxes. Extralit is open source and puts you in control: define custom extraction schemas, validate results, and integrate with your existing research workflows.

Scale Your Literature Reviews

Whether you're conducting a systematic review, meta-analysis, or building a scientific knowledge base, Extralit helps you efficiently process hundreds of papers. Our platform handles complex tables, figures, and relationships while preserving scientific rigor.

🏘️ Community

We're an open-source project built for researchers, by researchers. Here's how to get involved:

  • Slack Community: Connect with other researchers and developers
  • Documentation: Learn how to use and contribute to Extralit
  • Roadmap: See what we're building and share your ideas

Real-World Impact

Extralit is already accelerating research at leading institutions:

  • Gates Foundation: Reduced systematic review time for malaria intervention studies from 6 months to 6 weeks
  • Life Science Research: Streamlined extraction of clinical trial endpoints, genetic markers, and intervention protocols
  • Meta-Analysis: Enabled rapid synthesis of evidence across hundreds of papers while maintaining rigorous validation

👨‍💻 Getting Started

Installation

Install Extralit using pip:

pip install extralit

Initialize the client:

import extralit as ex

client = ex.Extralit(
    api_url="https://your-deployment-url", 
    api_key="your-api-key"
)

Create an extraction schema

Define what data you want to extract:

schema = ex.Schema(
    name="clinical_trial",
    fields=[
        ex.TextField(name="intervention", required=True),
        ex.NumericField(name="sample_size", required=True),
        ex.TextField(name="outcome_measure"),
        ex.TableField(name="results_table")
    ]
)

project = client.create_project(
    name="trial_extraction",
    schema=schema
)

Add documents and start extraction

# Add PDFs to extract from
project.add_documents("path/to/papers/*.pdf")

# Start extraction
extractions = project.extract()

# Review and validate results
validated_data = project.validate(extractions)
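
The kind of check a human reviewer performs in the validation step can be sketched in plain Python. This is a minimal illustration only, not Extralit's API: `FieldSpec`, `CLINICAL_TRIAL_FIELDS`, and `missing_required` are hypothetical names that mirror the field names and `required` flags of the `clinical_trial` schema above, and the record is assumed to be a plain dict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical stand-in for a schema field: a name and a required flag."""
    name: str
    required: bool = False


# Mirrors the names and required flags of the clinical_trial schema above.
CLINICAL_TRIAL_FIELDS = [
    FieldSpec("intervention", required=True),
    FieldSpec("sample_size", required=True),
    FieldSpec("outcome_measure"),
    FieldSpec("results_table"),
]


def missing_required(record, fields):
    """Return the names of required fields that are absent or empty in `record`."""
    return [
        f.name
        for f in fields
        if f.required and record.get(f.name) in (None, "", [])
    ]
```

Calling `missing_required(record, CLINICAL_TRIAL_FIELDS)` on an extracted record returns an empty list when every required field is filled in, which makes it easy to route incomplete extractions back to a reviewer.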

Need more help? Check out our detailed tutorials.

🥇 Contributors

Want to contribute? Great! Check out our contribution guide or join our Slack community.

Download files

Source Distribution

extralit-0.4.1.tar.gz (235.9 kB)

Uploaded Source

Built Distribution

extralit-0.4.1-py3-none-any.whl (312.1 kB)

Uploaded Python 3

File details

Details for the file extralit-0.4.1.tar.gz.

File metadata

  • Download URL: extralit-0.4.1.tar.gz
  • Upload date:
  • Size: 235.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.24.1 CPython/3.13.3 Linux/6.11.0-1014-azure

File hashes

Hashes for extralit-0.4.1.tar.gz
Algorithm    Hash digest
SHA256       36312a760072ba6e37143701edc9de65f632ad7c27019496afa38d60c7645043
MD5          62ac57d9aa7500b5f17fd636454dca0d
BLAKE2b-256  f8f39c910a283d24d46cff38558827c24546466eeefc9fe695778b551fca45b6
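
A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. The sketch below assumes the source distribution has already been saved locally. The helper names `sha256_of_file` and `matches_expected` are illustrative and not part of Extralit.

```python
import hashlib
import hmac

# SHA256 digest listed above for extralit-0.4.1.tar.gz
EXPECTED_SHA256 = "36312a760072ba6e37143701edc9de65f632ad7c27019496afa38d60c7645043"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True if the file's SHA256 digest equals `expected_hex` (case-insensitive)."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

For example, `matches_expected("extralit-0.4.1.tar.gz", EXPECTED_SHA256)` should return `True` for an intact download. To check the wheel instead, pass its file name and the SHA256 digest listed for it.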

File details

Details for the file extralit-0.4.1-py3-none-any.whl.

File metadata

  • Download URL: extralit-0.4.1-py3-none-any.whl
  • Upload date:
  • Size: 312.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.24.1 CPython/3.13.3 Linux/6.11.0-1014-azure

File hashes

Hashes for extralit-0.4.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       1142c46e4b931472a37d47eea99229d90ddb2f77d58b489e0a5b6b3d1570bc2b
MD5          f3100fd353d1fb1b18b6c14b2c8d6ea2
BLAKE2b-256  4aa8790dada816ed8c71515b9b0db60f7328413527ca148a919d3de90df776a2
