AI-powered CLI for WordLift knowledge graph and SEO workflows.

Project description

worai

Command-line toolkit for WordLift operations and SEO checks. Pronunciation: "waw-RYE"

Docs: https://docs.wordlift.io/worai/

Install

pipx install worai
pip install worai

Runtime dependency note:

wordlift-sdk>=5.0.0,<6.0.0 (installed automatically by pip)
copier (required by worai graph sync create, installed automatically by pip)

If you plan to run seocheck, install Playwright browsers:

playwright install chromium

Quick Start

worai --help
worai seocheck https://example.com/sitemap.xml
worai google-search-console --site sc-domain:example.com --client-secrets ./client_secrets.json
worai <command> --help

Configuration

Config file (TOML) discovery order:

--config
WORAI_CONFIG
./worai.toml
~/.config/worai/config.toml
~/.worai.toml
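
The discovery order above can be read as a first-match search. The following Python sketch illustrates that reading; it is not worai's actual implementation. The helper name resolve_config_path and its parameters are hypothetical, and it assumes that explicit values (--config, WORAI_CONFIG) are used as given while the three default locations count only if the file exists.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_config_path(
    cli_config: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the first config candidate in the documented order, or None.

    Hypothetical helper: explicit values are returned as given; default
    locations are returned only when the file exists.
    """
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd
    home = Path.home() if home is None else home

    if cli_config:  # 1. --config
        return Path(cli_config).expanduser()
    if env.get("WORAI_CONFIG"):  # 2. WORAI_CONFIG
        return Path(env["WORAI_CONFIG"]).expanduser()

    defaults = [
        cwd / "worai.toml",  # 3. ./worai.toml
        home / ".config" / "worai" / "config.toml",  # 4. ~/.config/worai/config.toml
        home / ".worai.toml",  # 5. ~/.worai.toml
    ]
    for candidate in defaults:
        if candidate.is_file():
            return candidate
    return None
```

Passing cwd and home explicitly keeps the function easy to exercise against temporary directories.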

Profiles:

[profile.<name>] with --profile or WORAI_PROFILE

Common keys:

wordlift.api_key
gsc.id
gsc.client_secrets
ga.id
ga.client_secrets
oauth.token (shared token for GSC + GA)
postprocessor_runtime (graph sync runtime: subprocess or persistent; profile override supported)
ingest.source (auto|urls|sitemap|sheets|local)
ingest.loader (auto|simple|proxy|playwright|premium_scraper|web_scrape_api|passthrough)
ingest.passthrough_when_html (default: true)

Supported environment variables:

WORAI_CONFIG — path to a config TOML file (overrides discovery order).
WORAI_PROFILE — profile name under [profile.<name>].
WORAI_LOG_LEVEL — default log level (debug|info|warning|error).
WORAI_LOG_FORMAT — default log format (text|json).
WORDLIFT_KEY — WordLift API key for entity operations.
WORDLIFT_API_KEY — alternate WordLift API key name (also accepted by some commands).
GSC_CLIENT_SECRETS — path to OAuth client secrets JSON for GSC.
GSC_ID — GSC property URL.
OAUTH_TOKEN — path to store the shared OAuth token (GSC + GA).
GSC_OUTPUT — default output CSV path for GSC export.
GA_ID — GA4 property ID for Analytics sections.
GA_CLIENT_SECRETS — path to OAuth client secrets JSON for GA4.
GSC_TOKEN / GA_TOKEN — legacy aliases for OAUTH_TOKEN (must point to the same file if used).
WORAI_DISABLE_UPDATE_CHECK — set to 1|true|yes|on to disable startup update checks.
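
A flag such as WORAI_DISABLE_UPDATE_CHECK could be interpreted along the following lines. This is a minimal sketch: the helper name env_flag is hypothetical, and matching that ignores case and surrounding whitespace is an assumption the list above does not state.

```python
import os
from typing import Mapping, Optional

# Values documented as enabling WORAI_DISABLE_UPDATE_CHECK.
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the variable holds one of the documented truthy values."""
    env = os.environ if env is None else env
    # Assumption: comparison ignores case and surrounding whitespace.
    return env.get(name, "").strip().lower() in TRUTHY
```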

.env support:

worai loads .env from the current working directory (and parent lookup) at startup.
Values from .env are treated as environment variables.
Existing environment variables take precedence over .env values.
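
The three .env rules above (parent lookup, values treated as environment variables, existing variables win) can be sketched with the standard library alone. This is an illustration of the described behavior, not the loader worai uses; the function names and the simple KEY=VALUE parsing are assumptions.

```python
from pathlib import Path
from typing import Dict, Mapping, Optional


def find_dotenv(start: Path) -> Optional[Path]:
    """Look for .env in `start`, then in each parent directory."""
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / ".env"
        if candidate.is_file():
            return candidate
    return None


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def merge_env(dotenv_values: Mapping[str, str], env: Mapping[str, str]) -> Dict[str, str]:
    """Combine both sources; existing environment variables win."""
    merged = dict(dotenv_values)
    merged.update(env)
    return merged
```

With this sketch, a WORDLIFT_KEY exported in the shell would shadow a WORDLIFT_KEY line in .env, while keys present only in .env would still be picked up.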

Example environment setup:

# $HOME is used because the shell does not expand ~ inside double quotes.
export WORDLIFT_KEY="wl_..."
export WORAI_CONFIG="$HOME/worai.toml"
export WORAI_PROFILE="dev"
export GSC_CLIENT_SECRETS="$HOME/client_secrets.json"
export OAUTH_TOKEN="$HOME/oauth_token.json"

Example worai.toml:

[defaults]
log_level = "info"

[wordlift]
api_key = "wl_..."

[gsc]
id = "sc-domain:example.com"
client_secrets = "/path/to/client_secrets.json"

[ga]
id = "123456789"
client_secrets = "/path/to/client_secrets.json"

[oauth]
token = "/path/to/oauth_token.json"

[ingest]
source = "auto"
loader = "web_scrape_api"
passthrough_when_html = true

Ingestion profile examples:

[profile.inventory_local]
ingest.source = "local"
ingest.loader = "passthrough"
ingest.passthrough_when_html = true

[profile.inventory_remote]
ingest.source = "sitemap"
ingest.loader = "web_scrape_api"

[profile.graph_sync_proxy]
urls = ["https://example.com/a", "https://example.com/b"]
ingest.source = "urls"
ingest.loader = "proxy"
web_page_import_timeout = "60s"

Commands

Full docs: https://docs.wordlift.io/worai/

seocheck — run SEO checks for sitemap URLs and URL lists.
google-search-console — export GSC page metrics as CSV.
dedupe — deduplicate WordLift entities by schema:url.
canonicalize-duplicate-pages — select canonical URLs using GSC KPIs.
delete-entities-from-csv — delete entities listed in a CSV.
find-faq-page-wrong-type — find and patch FAQPage typing issues.
find-missing-names — find entities missing schema:name/headline.
find-url-by-type — list schema:url values by type from RDF.
graph — run graph-specific workflows.
link-groups — build or apply LinkGroup data from CSV.
patch — patch entities from RDF.
structured-data — generate JSON-LD/YARRRML mappings or materialize RDF from YARRRML.
validate — validate JSON-LD with SHACL shapes (use structured-data validate page for webpage URLs).
self update — check for new worai versions and optionally run the upgrade command.
upload-entities-from-turtle — upload .ttl files with resume.
dil-import - upload DILs from a CSV file.

Command help:

worai <command> --help

Autocompletion:

worai --install-completion
worai --show-completion

Updates:

worai checks for new versions periodically and prints a non-blocking notice when an update is available.
Run worai self update to check manually and to see or apply the suggested upgrade command.

Examples

seocheck

worai seocheck https://example.com/sitemap.xml
worai seocheck https://example.com/sitemap.xml --output-dir ./seocheck-report --save-html
worai seocheck https://example.com/sitemap.xml --output-dir ./seocheck-report --no-open-report
worai seocheck https://example.com/sitemap.xml --user-agent "Mozilla/5.0 ..."
worai seocheck https://example.com/sitemap.xml --sitemap-fetch-mode browser
worai seocheck https://example.com/sitemap.xml --no-report-ui
worai seocheck https://example.com/sitemap.xml --recheck-failed --recheck-from ./seocheck-report

google-search-console

worai google-search-console --site sc-domain:example.com --client-secrets ./client_secrets.json
- Uses OAuth redirect port 8080 by default.

seoreport (with Analytics)

worai seoreport --site sc-domain:example.com --ga-id 123456789 --format html

canonicalize-duplicate-pages

worai canonicalize-duplicate-pages --input gsc_pages.csv --output canonical_targets.csv --kpi-window 28d --kpi-metric clicks
worai canonicalize-duplicate-pages --input gsc_pages.csv --entity-type Product

dedupe

worai dedupe --dry-run

find-faq-page-wrong-type

worai find-faq-page-wrong-type ./data.ttl --dry-run --replace-type
worai find-faq-page-wrong-type ./data.ttl --patch --replace-type

find-missing-names

worai find-missing-names ./data.ttl

find-url-by-type

worai find-url-by-type ./data.ttl schema:Service schema:Product

link-groups

worai link-groups ./links.csv --format turtle
worai link-groups ./links.csv --apply --dry-run --concurrency 4

graph

worai --config ./worai.toml graph sync run --profile acme
worai graph sync run --profile acme --debug
worai graph sync create ./acme-graph
worai graph sync create ./acme-graph --template ./graph-sync-template --defaults
worai graph sync create ./acme-graph --data-file ./answers.yml --non-interactive
worai graph sync create ./acme-graph --vcs-ref v1.2.3
worai graph property delete seovoc:html --dry-run
worai graph property delete https://w3id.org/seovoc/html --yes --workers 4
- graph property delete sends X-include-Private: true by default for both GraphQL match discovery and entity PATCH requests.
- graph sync create runs Copier in trusted mode by default so template _tasks execute.
- Mapping docs (for [profile.<name>]): docs/graph-sync-mappings-reference.md, docs/graph-sync-mappings-guide.md, docs/graph-sync-mappings-examples.md
- web_page_import_timeout is configured in seconds in worai.toml (60 -> 60000 ms in SDK).
- postprocessor_runtime = "persistent" in worai.toml sets SDK env POSTPROCESSOR_RUNTIME=persistent for graph sync run (profile value overrides global).
- SDK wordlift-sdk 5.1.1+ postprocessor context migration:
  - context.settings -> context.profile (for example context.profile["settings"]["api_url"])
  - context.account.key -> context.account_key
  - context.account remains the clean /me account object
- SDK 5 ingestion defaults to INGEST_LOADER=web_scrape_api; legacy web_page_import_mode=default maps to web_scrape_api.
- WEB_PAGE_IMPORT_MODE is emitted as an SDK-valid fetch mode:
  - ingest.loader=proxy -> WEB_PAGE_IMPORT_MODE=proxy
  - ingest.loader=premium_scraper -> WEB_PAGE_IMPORT_MODE=premium_scraper
  - ingest.loader=web_scrape_api (and other loaders) -> WEB_PAGE_IMPORT_MODE=default
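
The loader-to-mode mapping and the timeout conversion described in the notes above can be summarized in a few lines of Python. Only the mapping pairs and the 60 -> 60000 example come from the notes; the function names are hypothetical, and the sketch does not claim to mirror the SDK's code.

```python
from typing import Dict

# Loaders that map to a same-named WEB_PAGE_IMPORT_MODE, as listed in the notes.
_MODE_BY_LOADER: Dict[str, str] = {
    "proxy": "proxy",
    "premium_scraper": "premium_scraper",
}


def web_page_import_mode(loader: str) -> str:
    """Map ingest.loader to WEB_PAGE_IMPORT_MODE; other loaders yield "default"."""
    return _MODE_BY_LOADER.get(loader, "default")


def timeout_seconds_to_ms(seconds: float) -> int:
    """Convert a timeout given in seconds to milliseconds."""
    return int(round(seconds * 1000))
```

For instance, web_page_import_mode("web_scrape_api") yields "default" and timeout_seconds_to_ms(60) yields 60000.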

patch

worai patch ./data.ttl --dry-run --add-types

structured-data

worai structured-data create https://example.com/article Review --output-dir ./structured-data
worai structured-data create https://example.com/article --type Review --output-dir ./structured-data
worai structured-data create https://example.com/article --type Review --debug
worai structured-data create https://example.com/article --type Review --max-xhtml-chars 40000 --max-nesting-depth 2
worai structured-data generate https://example.com/sitemap.xml --yarrrml ./mapping.yarrrml --output-dir ./out
worai structured-data generate https://example.com/page --yarrrml ./mapping.yarrrml --format jsonld
worai structured-data inventory https://example.com/sitemap.xml --output ./structured-data-inventory.csv
worai structured-data inventory ./urls.txt --output ./structured-data-inventory.csv
worai structured-data inventory https://docs.google.com/spreadsheets/d/<id>/edit --sheet-name URLs_US --output ./structured-data-inventory.csv
worai structured-data inventory https://example.com/sitemap.xml --destination-sheet-id <spreadsheet_id> --destination-sheet-name Inventory
worai structured-data inventory https://example.com/sitemap.xml --output ./structured-data-inventory.csv --concurrency auto
worai structured-data inventory /path/to/debug_cloud/us --source-type debug-cloud --output ./structured-data-inventory.csv
worai structured-data inventory /path/to/debug_cloud/us --ingest-source local --ingest-loader passthrough --output ./structured-data-inventory.csv
worai structured-data inventory https://example.com/sitemap.xml --ingest-loader web_scrape_api --output ./structured-data-inventory.csv

validate

worai validate jsonld --shape review-snippet --shape schema-review ./data.jsonld
worai validate jsonld --format raw https://api.wordlift.io/data/example.jsonld
worai structured-data validate page https://example.com/article --shape review-snippet

self update

worai self update --check-only
worai self update --yes

upload-entities-from-turtle

worai upload-entities-from-turtle ./entities --recursive --limit 50

dil-import

worai dil-import <wordlift_key> <path_to_csv_file>

Troubleshooting

Playwright missing browsers:
- playwright install chromium
YARRRML conversion:
- npm install -g @rmlio/yarrrml-parser
RML execution:
- morph-kgc is included in project dependencies
Dependency notes:
- Common runtime libs (e.g., requests, rdflib, tqdm, advertools, Google auth helpers) are provided transitively by wordlift-sdk.
OAuth token issues:
- Remove the token file and re-run worai google-search-console.
- If you are prompted to re-authenticate on every run, delete the token file to force a new consent flow that includes a refresh token.

Project details

Release history

6.17.6 (Mar 26, 2026)
6.17.5 (Mar 26, 2026)
6.17.4 (Mar 25, 2026)
6.17.3 (Mar 25, 2026)
6.17.2 (Mar 25, 2026)
6.17.1 (Mar 25, 2026)
6.17.0 (Mar 23, 2026)
6.16.5 (Mar 13, 2026)
6.16.4 (Mar 13, 2026)
6.16.3 (Mar 13, 2026)
6.16.2 (Mar 12, 2026)
6.16.1 (Mar 12, 2026)
6.16.0 (Mar 12, 2026)
6.15.1 (Mar 12, 2026)
6.14.2 (Mar 12, 2026)
6.14.1 (Mar 12, 2026)
6.14.0 (Mar 11, 2026)
6.13.1 (Mar 11, 2026)
6.13.0 (Mar 11, 2026)
6.12.12 (Mar 10, 2026)
6.12.11 (Mar 10, 2026)
6.12.10 (Mar 10, 2026)
6.12.9 (Mar 10, 2026)
6.12.8 (Mar 10, 2026)
6.12.7 (Mar 10, 2026)
6.12.6 (Mar 10, 2026)
6.12.5 (Mar 10, 2026)
6.12.4 (Mar 10, 2026)
6.12.3 (Mar 10, 2026)
6.12.2 (Mar 10, 2026)
6.12.1 (Mar 9, 2026)
6.11.2 (Mar 9, 2026)
6.11.1 (Mar 9, 2026)
6.11.0 (Mar 6, 2026)
6.10.0 (Mar 5, 2026)
6.9.7 (Mar 4, 2026)
6.9.6 (Mar 4, 2026)
6.9.5 (Mar 4, 2026)
6.9.4 (Mar 1, 2026)
6.9.3 (Mar 1, 2026)
6.8.0 (Feb 27, 2026)
6.7.3 (Feb 27, 2026)
6.7.2 (Feb 27, 2026)
6.7.0 (Feb 27, 2026)
6.6.4 (Feb 26, 2026)
6.6.3 (Feb 26, 2026)
6.6.2 (Feb 26, 2026)
6.6.1 (Feb 26, 2026)
6.6.0 (Feb 26, 2026)
6.5.3 (Feb 26, 2026)
6.5.2 (Feb 26, 2026)
6.5.1 (Feb 25, 2026)
6.5.0 (Feb 25, 2026)
6.2.9 (Feb 25, 2026)
6.2.8 (Feb 25, 2026)
6.2.6 (Feb 25, 2026)
6.2.5 (Feb 25, 2026)
6.2.4 (Feb 25, 2026)
6.2.3 (Feb 24, 2026)
6.2.2 (Feb 24, 2026)
6.2.1 (Feb 24, 2026)
6.2.0 (Feb 24, 2026)
6.1.0 (Feb 24, 2026)
6.0.0 (Feb 24, 2026)
4.3.0 (Feb 23, 2026)
4.2.1 (Feb 22, 2026)
4.2.0 (Feb 22, 2026) (this version)
4.1.2 (Feb 22, 2026)
4.1.1 (Feb 20, 2026)
4.1.0 (Feb 20, 2026)
4.0.1 (Feb 20, 2026)
4.0.0 (Feb 20, 2026)
3.4.0 (Feb 19, 2026)
3.3.0 (Feb 19, 2026)
3.2.0 (Feb 19, 2026)
3.1.0 (Feb 19, 2026)
3.0.1 (Feb 19, 2026)
3.0.0 (Feb 19, 2026)
2.9.0 (Feb 18, 2026)
2.8.0 (Feb 18, 2026)
2.7.0 (Feb 18, 2026)
2.6.0 (Feb 18, 2026)
2.5.0 (Feb 18, 2026)
2.4.0 (Feb 18, 2026)
2.3.0 (Feb 18, 2026)
2.2.0 (Feb 17, 2026)
2.1.0 (Feb 17, 2026)
2.0.0 (Feb 16, 2026)
1.17.0 (Feb 14, 2026)
1.16.2 (Feb 13, 2026)
1.14.0 (Feb 5, 2026)
1.13.1 (Feb 5, 2026)
1.13.0 (Feb 5, 2026)
1.12.0 (Feb 5, 2026)
1.11.0 (Feb 4, 2026)
1.9.0 (Feb 3, 2026)
1.8.0 (Feb 2, 2026)
1.7.0 (Feb 1, 2026)
1.6.0 (Jan 30, 2026)
1.5.0 (Jan 15, 2026)
1.4.0 (Jan 14, 2026)
1.3.1 (Jan 13, 2026)
1.3.0 (Jan 12, 2026)
1.2.0 (Jan 12, 2026)
1.1.0 (Jan 12, 2026)
1.0.4 (Jan 12, 2026)
1.0.3 (Jan 12, 2026)
1.0.2 (Jan 12, 2026)
1.0.1 (Jan 12, 2026)
1.0.0 (Jan 12, 2026)

Download files

Source Distribution

worai-4.2.0.tar.gz (107.4 kB)

Uploaded Feb 22, 2026 Source

Built Distribution

worai-4.2.0-py3-none-any.whl (119.4 kB)

Uploaded Feb 22, 2026 Python 3

File details

Details for the file worai-4.2.0.tar.gz.

File metadata

Download URL: worai-4.2.0.tar.gz
Upload date: Feb 22, 2026
Size: 107.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for worai-4.2.0.tar.gz
Algorithm	Hash digest
SHA256	`31ff9b72fa93403643b9df45ffca2f6960cf16bf60eaac3894b6f3e2ef9883f7`
MD5	`396bb2bbafe8d02a2a4c69a9c84511c8`
BLAKE2b-256	`3ae9319bf7ed7fc9f965a1500ec7e3cec5a20ae19e9dfcf91bf704889fbca687`
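
A published SHA256 digest can be compared against a downloaded file before installing. The following is a minimal sketch using only hashlib; the helper names sha256_of and matches_published_hash are illustrative and not part of worai or PyPI tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published value, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For an intact download of the source distribution, matches_published_hash(Path("worai-4.2.0.tar.gz"), "31ff9b72fa93403643b9df45ffca2f6960cf16bf60eaac3894b6f3e2ef9883f7") would be expected to return True.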

Provenance

The following attestation bundles were made for worai-4.2.0.tar.gz:

Publisher: publish.yml on wordlift/worai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: worai-4.2.0.tar.gz
- Subject digest: 31ff9b72fa93403643b9df45ffca2f6960cf16bf60eaac3894b6f3e2ef9883f7
- Sigstore transparency entry: 976447092
- Sigstore integration time: Feb 22, 2026
Source repository:
- Permalink: wordlift/worai@51f47fa18e8786658e57a8917b6afce79c2f6d59
- Branch / Tag: refs/tags/4.2.0
- Owner: https://github.com/wordlift
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@51f47fa18e8786658e57a8917b6afce79c2f6d59
- Trigger Event: push

File details

Details for the file worai-4.2.0-py3-none-any.whl.

File metadata

Download URL: worai-4.2.0-py3-none-any.whl
Upload date: Feb 22, 2026
Size: 119.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for worai-4.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bf49c9dc701aee50aa3a8b275a91b4dc0d6c9e52df33e70c9253feca8876ef5f`
MD5	`5d4b038b09c82fbc21e3b03cdf345cdd`
BLAKE2b-256	`5d75805580740e9525764a6b3f00d50ef689f6de077573417df310ae082ab58e`

Provenance

The following attestation bundles were made for worai-4.2.0-py3-none-any.whl:

Publisher: publish.yml on wordlift/worai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: worai-4.2.0-py3-none-any.whl
- Subject digest: bf49c9dc701aee50aa3a8b275a91b4dc0d6c9e52df33e70c9253feca8876ef5f
- Sigstore transparency entry: 976447095
- Sigstore integration time: Feb 22, 2026
Source repository:
- Permalink: wordlift/worai@51f47fa18e8786658e57a8917b6afce79c2f6d59
- Branch / Tag: refs/tags/4.2.0
- Owner: https://github.com/wordlift
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@51f47fa18e8786658e57a8917b6afce79c2f6d59
- Trigger Event: push
