Python client library for Diffbot APIs

Diffbot Python Library

Installation

python3 -m pip install diffbot-python

Or, for local development:

pip install -e ".[dev]"

Usage

Authentication

The CLI and the library can share a single credential. The client always takes its token explicitly, but resolve_token() performs the same lookup the CLI uses, checking these sources in order:

  1. An explicit token passed to resolve_token(token).
  2. The DIFFBOT_API_TOKEN environment variable.
  3. A DIFFBOT_API_TOKEN=... line in ~/.diffbot/credentials.
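The lookup order above can be sketched roughly as follows. This is an illustration only, not the library's code: lookup_token is a hypothetical name, and how the real resolve_token() parses the file or reports a missing token is not stated here.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def lookup_token(
    token: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    credentials_path: Optional[Path] = None,
) -> Optional[str]:
    """Hypothetical helper: return the first token found, or None."""
    # 1. An explicit token wins.
    if token:
        return token

    # 2. Then the DIFFBOT_API_TOKEN environment variable.
    env = os.environ if env is None else env
    from_env = env.get("DIFFBOT_API_TOKEN")
    if from_env:
        return from_env

    # 3. Finally a DIFFBOT_API_TOKEN=... line in the credentials file.
    path = credentials_path or Path.home() / ".diffbot" / "credentials"
    if path.is_file():
        for line in path.read_text().splitlines():
            key, sep, value = line.partition("=")
            if sep and key.strip() == "DIFFBOT_API_TOKEN" and value.strip():
                return value.strip()

    # ASSUMPTION: returning None is this sketch's choice; the real
    # resolve_token() may behave differently when nothing is found.
    return None
```

The env and credentials_path parameters exist only so the sketch can be exercised without touching your real environment.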

Set it once and it works for both the CLI and your scripts. Either export it:

export DIFFBOT_API_TOKEN=<TOKEN>

…or write it to the shared credentials file (handy for keeping it out of your shell environment):

mkdir -p ~/.diffbot
printf 'DIFFBOT_API_TOKEN=%s\n' '<TOKEN>' > ~/.diffbot/credentials
chmod 600 ~/.diffbot/credentials
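If you would rather create the credentials file from a setup script, the shell steps above translate to Python roughly like this. write_credentials is a hypothetical helper, not part of the library.

```python
import os
from pathlib import Path


def write_credentials(token: str, directory: Path) -> Path:
    """Write DIFFBOT_API_TOKEN=<token> to <directory>/credentials, owner-only."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "credentials"
    # Request mode 0600 at creation so a new file is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"DIFFBOT_API_TOKEN={token}\n")
    # The mode passed to os.open only applies to new files, so enforce it
    # explicitly in case the file already existed with looser permissions.
    os.chmod(path, 0o600)
    return path
```

Calling write_credentials("<TOKEN>", Path.home() / ".diffbot") would mirror the mkdir, printf, and chmod commands.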

With either in place, resolve the token and pass it to the client:

from diffbot import Diffbot, resolve_token

db = Diffbot(token=resolve_token())  # from env var or ~/.diffbot/credentials
data = db.extract("https://www.example.com")

Extract structured content

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
data = db.extract("https://www.example.com")
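The shape of the value returned by extract() is not documented here. As a sketch, assuming it mirrors Diffbot's REST responses (a dict with an "objects" list whose items carry "type" and "title"), you might summarize it like this. summarize_objects is a hypothetical helper.

```python
from typing import Any, Dict, List


def summarize_objects(data: Dict[str, Any]) -> List[str]:
    """Return one 'type: title' line per extracted object."""
    lines = []
    # ASSUMPTION: results live under an "objects" list, as in Diffbot's
    # REST responses; adjust the keys if the library returns another shape.
    for obj in data.get("objects", []):
        kind = obj.get("type", "unknown")
        title = obj.get("title", "(untitled)")
        lines.append(f"{kind}: {title}")
    return lines
```

Using .get() with defaults keeps the helper from raising if a key turns out to be absent.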

Ask Diffbot LLM

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
for chunk in db.ask([{"role": "user", "content": "What's the capital of France?"}]):
    print(chunk, end="")
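Since ask() is consumed as a stream, you may want the complete answer as one string as well. A minimal sketch, assuming each chunk is a plain str (which the print call above suggests but does not guarantee); stream_and_collect is a hypothetical helper.

```python
from typing import Iterable, Optional, TextIO


def stream_and_collect(chunks: Iterable[str], out: Optional[TextIO] = None) -> str:
    """Echo chunks to `out` as they arrive and return the full text."""
    parts = []
    for chunk in chunks:
        if out is not None:
            out.write(chunk)
            out.flush()  # show partial output immediately
        parts.append(chunk)
    return "".join(parts)
```

You would pass the iterator returned by db.ask(...) as chunks and sys.stdout as out.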

Crawl a site for structured content

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
for event in db.crawl("https://www.example.com", hops=1):
    print(event)
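While experimenting, it can help to look at only the first few crawl events. The sketch below works on any iterator and makes no assumption about the event shape; first_events is a hypothetical helper.

```python
from itertools import islice
from typing import Any, Iterable, List


def first_events(events: Iterable[Any], limit: int) -> List[Any]:
    """Consume at most `limit` items from a stream and return them."""
    return list(islice(events, limit))
```

Note that this only stops reading locally. Whether abandoning the iterator also stops the crawl on Diffbot's side is not stated here, so treat that as unknown.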

Query the Knowledge Graph

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
results = db.dql('type:Organization name:"Diffbot"')
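When the query depends on user input, building the DQL string by hand gets error-prone. A small sketch that produces queries in the same form as the one above; build_dql is a hypothetical helper, and because DQL's escaping rules are not covered here it rejects values containing double quotes rather than guessing.

```python
def build_dql(type_name: str, **fields: str) -> str:
    """Build a query such as: type:Organization name:"Diffbot"."""
    parts = [f"type:{type_name}"]
    for name, value in fields.items():
        if '"' in value:
            # ASSUMPTION: escaping is out of scope for this sketch.
            raise ValueError(f"double quote not supported in {name!r} value")
        parts.append(f'{name}:"{value}"')
    return " ".join(parts)
```

The result can be passed straight to db.dql(...).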

Web Search

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
results = db.web_search("diffbot knowledge graph")
for r in results["search_results"]:
    print(r["score"], r["title"], r["pageUrl"])
    print(r["content"])
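To keep only the strongest hits, you could filter and sort on the score field used above. This sketch assumes score is numeric and that higher means more relevant, which is not stated here; top_results is a hypothetical helper.

```python
from typing import Any, Dict, List


def top_results(
    results: Dict[str, Any], min_score: float = 0.0, limit: int = 5
) -> List[Dict[str, Any]]:
    """Return up to `limit` results scoring at least `min_score`, best first."""
    # ASSUMPTION: a missing score is treated as 0.0.
    def score(r: Dict[str, Any]) -> float:
        return r.get("score", 0.0)

    hits = [r for r in results.get("search_results", []) if score(r) >= min_score]
    hits.sort(key=score, reverse=True)
    return hits[:limit]
```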

Entities (NLP)

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
result = db.entities("Apple CEO Tim Cook announced record quarterly earnings.")
for entity in result["entities"]:
    print(entity["name"], entity.get("type"), entity.get("id"))
print("sentiment:", result.get("sentiment"))
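A common next step is grouping the entity names by type. The sketch relies only on the "entities", "name", and "type" keys used above and treats a missing type as "unknown"; names_by_type is a hypothetical helper.

```python
from collections import defaultdict
from typing import Any, Dict, List


def names_by_type(result: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group entity names under their type, in order of appearance."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in result.get("entities", []):
        # ASSUMPTION: "type" is a single string when present.
        grouped[entity.get("type") or "unknown"].append(entity["name"])
    return dict(grouped)
```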

Async Usage

Extract structured content

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        data = await db.extract("https://www.example.com")
        print(data)

asyncio.run(main())
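The main reason to reach for the async client is running several requests at once. A sketch of bounded concurrency follows; it assumes only that the client has an awaitable extract(url) method, as shown above. Whether one DiffbotAsync instance is safe to share across concurrent calls, and what rate limits apply, is not stated here. extract_many is a hypothetical helper.

```python
import asyncio
from typing import Any, List, Sequence


async def extract_many(
    db: Any, urls: Sequence[str], max_concurrency: int = 3
) -> List[Any]:
    """Extract several URLs with at most `max_concurrency` requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def extract_one(url: str) -> Any:
        async with semaphore:  # wait for a free slot
            return await db.extract(url)

    # gather() returns results in the same order as `urls`.
    return await asyncio.gather(*(extract_one(url) for url in urls))
```

Inside an async with DiffbotAsync(...) as db: block you would call await extract_many(db, urls).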

Ask Diffbot LLM

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        async for chunk in db.ask([{"role": "user", "content": "What's the capital of France?"}]):
            print(chunk, end="")

asyncio.run(main())
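A streamed answer can stall, so you may want an overall deadline. The sketch below wraps any async iterator of str chunks in asyncio.wait_for; it assumes chunks are plain strings, and collect_with_timeout is a hypothetical helper.

```python
import asyncio
from typing import AsyncIterable


async def collect_with_timeout(chunks: AsyncIterable[str], timeout: float) -> str:
    """Join all chunks, raising asyncio.TimeoutError if it takes too long."""

    async def collect() -> str:
        parts = []
        async for chunk in chunks:
            parts.append(chunk)
        return "".join(parts)

    # wait_for cancels the inner coroutine when the deadline passes.
    return await asyncio.wait_for(collect(), timeout)
```

You would pass db.ask(...) as chunks and catch asyncio.TimeoutError around the call.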

Crawl a site for structured content

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        async for event in db.crawl("https://www.example.com", hops=1):
            print(event)

asyncio.run(main())

Query the Knowledge Graph

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        results = await db.dql('type:Organization name:"Diffbot"')
        print(results)

asyncio.run(main())

Web Search

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        results = await db.web_search("diffbot knowledge graph")
        for r in results["search_results"]:
            print(r["score"], r["title"], r["pageUrl"])
            print(r["content"])

asyncio.run(main())

Entities (NLP)

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        result = await db.entities("Apple CEO Tim Cook announced record quarterly earnings.")
        for entity in result["entities"]:
            print(entity["name"], entity.get("type"), entity.get("id"))
        print("sentiment:", result.get("sentiment"))

asyncio.run(main())

CLI

This library also includes a CLI exposed as the db command.

To make db available from anywhere, install it as an isolated tool with uv:

uv tool install .

This drops a db executable into ~/.local/bin (ensure it is on your PATH). Use --force to reinstall or upgrade after changes, or --editable to have source edits take effect immediately. Alternatively, a plain pip install . (or pip install -e .) also installs the db entry point into the active environment.

Set your token, then run any of the subcommands:

export DIFFBOT_API_TOKEN=your-token-here

db extract https://www.example.com
db ask "What's the capital of France?"
db crawl https://www.example.com --hops 1
db crawl-list-jobs
db crawl-delete-job crawl-1234567890
db web-search "diffbot knowledge graph"
db web-search "diffbot knowledge graph" -n 5 -f json
db entities "Apple CEO Tim Cook announced record quarterly earnings."
db entities "Apple CEO Tim Cook announced record quarterly earnings." -f dql
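If you need to drive the CLI from a script instead of importing the library, a thin subprocess wrapper could look like the sketch below. It uses only the flags shown above (-n and -f json) and assumes that -f json prints a single JSON document to stdout, which is not confirmed here. Both function names are hypothetical.

```python
import json
import subprocess
from typing import Any, List


def web_search_argv(query: str, n: int) -> List[str]:
    """Build the argument list for: db web-search QUERY -n N -f json."""
    return ["db", "web-search", query, "-n", str(n), "-f", "json"]


def run_web_search(query: str, n: int = 5) -> Any:
    """Run the CLI and parse its output (requires db on PATH and a token)."""
    completed = subprocess.run(
        web_search_argv(query, n),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit
    )
    # ASSUMPTION: stdout holds exactly one JSON document.
    return json.loads(completed.stdout)
```

Passing the arguments as a list avoids shell quoting problems with queries that contain spaces.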

Tests

Run the mock test suite:

python -m pytest

Run live integration tests against the real API (requires a valid token). As elsewhere, the token is resolved from the DIFFBOT_API_TOKEN environment variable or ~/.diffbot/credentials:

DIFFBOT_API_TOKEN=your_token python -m pytest -m live
