Diffbot Python Library

Python client library for Diffbot APIs.

Installation

python3 -m pip install diffbot-python

Or, for local development, from the root of a repository checkout:

pip install -e ".[dev]"

Usage

Authentication

The CLI and the library can share a single credential. The token always has to be passed to the client explicitly, but resolve_token() gives you the same lookup the CLI uses, in this order:

  1. An explicit token passed to resolve_token(token).
  2. The DIFFBOT_API_TOKEN environment variable.
  3. A DIFFBOT_API_TOKEN=... line in ~/.diffbot/credentials.
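The lookup order above can be sketched as a small standalone function. This is an illustrative re-creation under stated assumptions, not the library's actual resolve_token(): the name lookup_token and the credentials_path parameter are hypothetical, and the way the credentials file is parsed (first matching KEY=value line, surrounding whitespace ignored) is a guess at reasonable behavior.

```python
import os
from pathlib import Path
from typing import Optional

ENV_VAR = "DIFFBOT_API_TOKEN"


def lookup_token(
    token: Optional[str] = None,
    credentials_path: Optional[Path] = None,
) -> Optional[str]:
    """Hypothetical helper mirroring the documented lookup order."""
    # 1. An explicit token wins over everything else.
    if token:
        return token
    # 2. Next, the environment variable.
    env_value = os.environ.get(ENV_VAR)
    if env_value:
        return env_value
    # 3. Finally, a DIFFBOT_API_TOKEN=... line in the credentials file.
    #    (Assumed parsing: first matching KEY=value line.)
    path = credentials_path or Path.home() / ".diffbot" / "credentials"
    if path.is_file():
        for line in path.read_text().splitlines():
            key, sep, value = line.strip().partition("=")
            if sep and key.strip() == ENV_VAR and value.strip():
                return value.strip()
    return None


# An explicit argument short-circuits the other two sources.
token = lookup_token("explicit-token")
print(token)
```

In real code, prefer resolve_token() itself; the sketch only makes the precedence concrete.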

Set it once and it works for both the CLI and your scripts. Either export it:

export DIFFBOT_API_TOKEN=<TOKEN>

…or write it to the shared credentials file (handy for keeping it out of your shell environment):

mkdir -p ~/.diffbot
printf 'DIFFBOT_API_TOKEN=%s\n' '<TOKEN>' > ~/.diffbot/credentials
chmod 600 ~/.diffbot/credentials

With either in place, resolve the token and pass it to the client:

from diffbot import Diffbot, resolve_token

db = Diffbot(token=resolve_token())  # from env var or ~/.diffbot/credentials
data = db.extract("https://www.example.com")

Extract structured content

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
data = db.extract("https://www.example.com")

Ask Diffbot LLM

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
for chunk in db.ask([{"role": "user", "content": "What's the capital of France?"}]):
    print(chunk, end="")
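If you want the whole answer as one string instead of printing it as it streams, the chunks can be joined. A minimal sketch, assuming each chunk yielded by db.ask() is a plain str (the README only shows chunks being printed); collect_answer is a hypothetical helper, and a fake stream stands in for db.ask() so the snippet runs without a token.

```python
from typing import Iterable


def collect_answer(chunks: Iterable[str]) -> str:
    """Join streamed chunks into a single string (assumes str chunks)."""
    return "".join(chunks)


# Stand-in for db.ask([...]); replace with the real call in your script.
fake_stream = iter(["Par", "is"])
answer = collect_answer(fake_stream)
print(answer)
```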

Crawl a site for structured content

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
for event in db.crawl("https://www.example.com", hops=1):
    print(event)

Query the Knowledge Graph

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
results = db.dql('type:Organization name:"Diffbot"')

Web Search

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
results = db.web_search("diffbot knowledge graph")
for r in results["search_results"]:
    print(r["score"], r["title"], r["pageUrl"])
    print(r["content"])
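Since each result carries a score, a common next step is to keep only the best few. A minimal sketch, assuming score is numeric and that higher means more relevant (the README does not state either); top_results is a hypothetical helper and the sample data is made up, using only the keys shown above.

```python
from typing import Any, Dict, List


def top_results(results: Dict[str, Any], n: int = 3) -> List[Dict[str, Any]]:
    """Return the n highest-scoring entries from results["search_results"]."""
    hits = results.get("search_results", [])
    # Assumption: a larger score means a better match.
    return sorted(hits, key=lambda r: r.get("score", 0.0), reverse=True)[:n]


# Made-up response with the same keys the example above reads.
sample = {
    "search_results": [
        {"score": 0.42, "title": "B", "pageUrl": "https://www.example.com/b", "content": "..."},
        {"score": 0.91, "title": "A", "pageUrl": "https://www.example.com/a", "content": "..."},
        {"score": 0.67, "title": "C", "pageUrl": "https://www.example.com/c", "content": "..."},
    ]
}

for r in top_results(sample, n=2):
    print(r["score"], r["title"], r["pageUrl"])
```

With a live client, pass the return value of db.web_search(...) in place of sample.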

Entities (NLP)

from diffbot import Diffbot

db = Diffbot(token="YOUR_TOKEN")
result = db.entities("Apple CEO Tim Cook announced record quarterly earnings.")
for entity in result["entities"]:
    print(entity["name"], entity.get("type"), entity.get("id"))
print("sentiment:", result.get("sentiment"))
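The entity list can be regrouped by type for easier downstream use. A minimal sketch using only the keys read above (name and type); names_by_type is a hypothetical helper, and the sample values, including the type labels, are invented for illustration rather than taken from a real response.

```python
from collections import defaultdict
from typing import Any, Dict, List


def names_by_type(result: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group entity names under their type; entities without a type go under "unknown"."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in result.get("entities", []):
        grouped[entity.get("type") or "unknown"].append(entity["name"])
    return dict(grouped)


# Invented sample shaped like the result used above.
sample = {
    "entities": [
        {"name": "Tim Cook", "type": "Person", "id": "E1"},
        {"name": "Apple", "type": "Organization", "id": "E2"},
        {"name": "earnings"},
    ]
}

print(names_by_type(sample))
```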

Async Usage

Extract structured content

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        data = await db.extract("https://www.example.com")
        print(data)

asyncio.run(main())
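One reason to reach for the async client is to extract several pages concurrently. A minimal sketch with asyncio.gather, assuming a single DiffbotAsync instance tolerates concurrent extract() calls and that your plan's rate limits allow it (the README confirms neither); extract_many is a hypothetical helper, and FakeClient stands in for DiffbotAsync so the snippet runs offline.

```python
import asyncio
from typing import Any, List, Sequence


async def extract_many(db: Any, urls: Sequence[str]) -> List[Any]:
    """Call db.extract() for every URL concurrently; results keep the order of urls."""
    return await asyncio.gather(*(db.extract(url) for url in urls))


class FakeClient:
    """Offline stand-in exposing the same awaitable extract() method."""

    async def extract(self, url: str) -> dict:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return {"url": url}


urls = ["https://www.example.com/a", "https://www.example.com/b"]
results = asyncio.run(extract_many(FakeClient(), urls))
print(results)
```

With the real client, call extract_many(db, urls) inside the async with DiffbotAsync(...) as db: block.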

Ask Diffbot LLM

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        async for chunk in db.ask([{"role": "user", "content": "What's the capital of France?"}]):
            print(chunk, end="")

asyncio.run(main())

Crawl a site for structured content

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        async for event in db.crawl("https://www.example.com", hops=1):
            print(event)

asyncio.run(main())

Query the Knowledge Graph

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        results = await db.dql('type:Organization name:"Diffbot"')
        print(results)

asyncio.run(main())

Web Search

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        results = await db.web_search("diffbot knowledge graph")
        for r in results["search_results"]:
            print(r["score"], r["title"], r["pageUrl"])
            print(r["content"])

asyncio.run(main())

Entities (NLP)

import asyncio
from diffbot import DiffbotAsync

async def main():
    async with DiffbotAsync(token="YOUR_TOKEN") as db:
        result = await db.entities("Apple CEO Tim Cook announced record quarterly earnings.")
        for entity in result["entities"]:
            print(entity["name"], entity.get("type"), entity.get("id"))
        print("sentiment:", result.get("sentiment"))

asyncio.run(main())

CLI

This library also includes a CLI exposed as the db command.

To make db available from anywhere, install it as an isolated tool with uv, running from the repository root:

uv tool install .

This drops a db executable into ~/.local/bin (ensure it is on your PATH). Use --force to reinstall or upgrade after changes, or --editable to have source edits take effect immediately. Alternatively, a plain pip install . (or pip install -e .) also installs the db entry point into the active environment.

Set the token, then run any of the subcommands:

export DIFFBOT_API_TOKEN=your-token-here

db extract https://www.example.com
db ask "What's the capital of France?"
db crawl https://www.example.com --hops 1
db crawl-list-jobs
db crawl-delete-job crawl-1234567890
db web-search "diffbot knowledge graph"
db web-search "diffbot knowledge graph" -n 5 -f json
db entities "Apple CEO Tim Cook announced record quarterly earnings."
db entities "Apple CEO Tim Cook announced record quarterly earnings." -f dql

Tests

Run the mock test suite:

python -m pytest

Run live integration tests against the real API (requires a valid token). The token is resolved the same way as everywhere else: from the DIFFBOT_API_TOKEN environment variable or ~/.diffbot/credentials.

DIFFBOT_API_TOKEN=your_token python -m pytest -m live
