AI-powered CLI for WordLift knowledge graph and SEO workflows.
worai
Command-line toolkit for WordLift operations and SEO checks. Pronunciation: "waw-RYE"
Docs: https://docs.wordlift.io/worai/
Install
pipx install worai
pip install worai
Full docs: https://docs.wordlift.io/worai/
Runtime dependency note:
- wordlift-sdk>=5.0.0,<6.0.0 (installed automatically by pip)
- copier (required by worai graph sync create, installed automatically by pip)
If you plan to run seocheck, install Playwright browsers:
playwright install chromium
Quick Start
worai --help
worai seocheck https://example.com/sitemap.xml
worai google-search-console --site sc-domain:example.com --client-secrets ./client_secrets.json
worai <command> --help
Configuration
Config file (TOML) discovery order:
- --config
- WORAI_CONFIG
- ./worai.toml
- ~/.config/worai/config.toml
- ~/.worai.toml
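The discovery order above can be sketched in Python as follows. This is an illustrative sketch, not worai's actual implementation: the function name `discover_config` and its parameters are hypothetical, and it assumes that an explicit `--config` or `WORAI_CONFIG` value is used as given, without checking that the file exists.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def discover_config(cli_config: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None,
                    cwd: Optional[Path] = None,
                    home: Optional[Path] = None) -> Optional[Path]:
    """Return the first config path in the documented order, or None."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd
    home = Path.home() if home is None else home

    # Explicit sources win outright: --config first, then WORAI_CONFIG.
    # Assumption: explicit paths are returned without an existence check.
    if cli_config:
        return Path(cli_config).expanduser()
    if env.get("WORAI_CONFIG"):
        return Path(env["WORAI_CONFIG"]).expanduser()

    # Otherwise fall back to the first well-known file that exists.
    candidates = [
        cwd / "worai.toml",
        home / ".config" / "worai" / "config.toml",
        home / ".worai.toml",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Calling `discover_config()` with no arguments uses the real environment, working directory, and home directory; the keyword arguments exist only so the order can be exercised in isolation.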
Profiles:
- [profile.<name>] with --profile or WORAI_PROFILE
Common keys:
- wordlift.api_key
- gsc.id
- gsc.client_secrets
- ga.id
- ga.client_secrets
- oauth.token (shared token for GSC + GA)
- postprocessor_runtime (graph sync runtime: subprocess or persistent; profile override supported)
- ingest.source (auto|urls|sitemap|sheets|local)
- ingest.loader (auto|simple|proxy|playwright|premium_scraper|web_scrape_api|passthrough)
- ingest.passthrough_when_html (default: true)
Supported environment variables:
- WORAI_CONFIG: path to a config TOML file (overrides discovery order).
- WORAI_PROFILE: profile name under [profile.<name>].
- WORAI_LOG_LEVEL: default log level (debug|info|warning|error).
- WORAI_LOG_FORMAT: default log format (text|json).
- WORDLIFT_KEY: WordLift API key for entity operations.
- WORDLIFT_API_KEY: alternate WordLift API key name (also accepted by some commands).
- GSC_CLIENT_SECRETS: path to OAuth client secrets JSON for GSC.
- GSC_ID: GSC property URL.
- OAUTH_TOKEN: path to store the shared OAuth token (GSC + GA).
- GSC_OUTPUT: default output CSV path for GSC export.
- GA_ID: GA4 property ID for Analytics sections.
- GA_CLIENT_SECRETS: path to OAuth client secrets JSON for GA4.
- GSC_TOKEN / GA_TOKEN: legacy aliases for OAUTH_TOKEN (must point to the same file if used).
- WORAI_DISABLE_UPDATE_CHECK: set to 1|true|yes|on to disable startup update checks.
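Two of the rules above (the truthy values for WORAI_DISABLE_UPDATE_CHECK and the requirement that the legacy token aliases point to the same file) can be illustrated with a small sketch. The helper names are hypothetical, and case-insensitive matching and the `ValueError` on a mismatch are assumptions, not documented worai behavior.

```python
from typing import Mapping, Optional

# Values documented as disabling the startup update check.
TRUTHY = {"1", "true", "yes", "on"}


def update_check_disabled(env: Mapping[str, str]) -> bool:
    """True when WORAI_DISABLE_UPDATE_CHECK holds one of the documented values."""
    # Assumption: surrounding whitespace and letter case are ignored.
    return env.get("WORAI_DISABLE_UPDATE_CHECK", "").strip().lower() in TRUTHY


def resolve_oauth_token(env: Mapping[str, str]) -> Optional[str]:
    """Pick the shared token path, rejecting legacy aliases that disagree."""
    names = ("OAUTH_TOKEN", "GSC_TOKEN", "GA_TOKEN")
    paths = {env[name] for name in names if env.get(name)}
    if len(paths) > 1:
        # Assumption: a mismatch is treated as an error.
        raise ValueError("OAUTH_TOKEN, GSC_TOKEN and GA_TOKEN must point to the same file")
    return paths.pop() if paths else None
```

Both helpers take a plain mapping, so `os.environ` can be passed directly.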
.env support:
- worai loads .env from the current working directory (and parent lookup) at startup.
- values from .env are treated as environment variables.
- existing environment variables take precedence over .env values.
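The precedence rule for .env can be sketched as below. This is a simplified illustration with hypothetical helper names; it handles only plain KEY=VALUE lines and is not the parser worai uses.

```python
from pathlib import Path
from typing import Dict, MutableMapping, Optional


def find_dotenv(start: Path) -> Optional[Path]:
    """Look for .env in start, then in each parent directory."""
    for directory in [start, *start.parents]:
        candidate = directory / ".env"
        if candidate.is_file():
            return candidate
    return None


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def apply_dotenv(env: MutableMapping[str, str], values: Dict[str, str]) -> None:
    """Add .env values without overwriting variables that are already set."""
    for key, value in values.items():
        env.setdefault(key, value)
```

Using `setdefault` is what gives existing environment variables precedence: a key already present in `env` is left untouched.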
Example environment setup:
export WORDLIFT_KEY="wl_..."
export WORAI_CONFIG="$HOME/worai.toml"
export WORAI_PROFILE="dev"
export GSC_CLIENT_SECRETS="$HOME/client_secrets.json"
export OAUTH_TOKEN="$HOME/oauth_token.json"
Example worai.toml:
[defaults]
log_level = "info"
[wordlift]
api_key = "wl_..."
[gsc]
id = "sc-domain:example.com"
client_secrets = "/path/to/client_secrets.json"
[ga]
id = "123456789"
client_secrets = "/path/to/client_secrets.json"
[oauth]
token = "/path/to/oauth_token.json"
[ingest]
source = "auto"
loader = "web_scrape_api"
passthrough_when_html = true
Ingestion profile examples:
[profile.inventory_local]
ingest.source = "local"
ingest.loader = "passthrough"
ingest.passthrough_when_html = true
[profile.inventory_remote]
ingest.source = "sitemap"
ingest.loader = "web_scrape_api"
[profile.graph_sync_proxy]
urls = ["https://example.com/a", "https://example.com/b"]
ingest.source = "urls"
ingest.loader = "proxy"
web_page_import_timeout = "60s"
Commands
Full docs: https://docs.wordlift.io/worai/
- seocheck: run SEO checks for sitemap URLs and URL lists.
- google-search-console: export GSC page metrics as CSV.
- dedupe: deduplicate WordLift entities by schema:url.
- canonicalize-duplicate-pages: select canonical URLs using GSC KPIs.
- delete-entities-from-csv: delete entities listed in a CSV.
- find-faq-page-wrong-type: find and patch FAQPage typing issues.
- find-missing-names: find entities missing schema:name/headline.
- find-url-by-type: list schema:url values by type from RDF.
- graph: run graph-specific workflows.
- link-groups: build or apply LinkGroup data from CSV.
- patch: patch entities from RDF.
- structured-data: generate JSON-LD/YARRRML mappings or materialize RDF from YARRRML.
- validate: validate JSON-LD with SHACL shapes (use structured-data validate page for webpage URLs).
- self update: check for new worai versions and optionally run the upgrade command.
- upload-entities-from-turtle: upload .ttl files with resume.
- dil-import: upload DILs from a CSV file.
Command help:
worai <command> --help
Autocompletion:
worai --install-completion
worai --show-completion
Updates:
- worai checks for new versions periodically and prints a non-blocking notice when an update is available.
- run worai self update to check manually and see/apply the suggested upgrade command.
Examples
seocheck
worai seocheck https://example.com/sitemap.xml
worai seocheck https://example.com/sitemap.xml --output-dir ./seocheck-report --save-html
worai seocheck https://example.com/sitemap.xml --output-dir ./seocheck-report --no-open-report
worai seocheck https://example.com/sitemap.xml --user-agent "Mozilla/5.0 ..."
worai seocheck https://example.com/sitemap.xml --sitemap-fetch-mode browser
worai seocheck https://example.com/sitemap.xml --no-report-ui
worai seocheck https://example.com/sitemap.xml --recheck-failed --recheck-from ./seocheck-report
google-search-console
worai google-search-console --site sc-domain:example.com --client-secrets ./client_secrets.json
- Uses OAuth redirect port 8080 by default.
seoreport (with Analytics)
worai seoreport --site sc-domain:example.com --ga-id 123456789 --format html
canonicalize-duplicate-pages
worai canonicalize-duplicate-pages --input gsc_pages.csv --output canonical_targets.csv --kpi-window 28d --kpi-metric clicks
worai canonicalize-duplicate-pages --input gsc_pages.csv --entity-type Product
dedupe
worai dedupe --dry-run
find-faq-page-wrong-type
worai find-faq-page-wrong-type ./data.ttl --dry-run --replace-type
worai find-faq-page-wrong-type ./data.ttl --patch --replace-type
find-missing-names
worai find-missing-names ./data.ttl
find-url-by-type
worai find-url-by-type ./data.ttl schema:Service schema:Product
link-groups
worai link-groups ./links.csv --format turtle
worai link-groups ./links.csv --apply --dry-run --concurrency 4
graph
worai --config ./worai.toml graph sync run --profile acme
worai graph sync run --profile acme --debug
worai graph sync create ./acme-graph
worai graph sync create ./acme-graph --template ./graph-sync-template --defaults
worai graph sync create ./acme-graph --data-file ./answers.yml --non-interactive
worai graph sync create ./acme-graph --vcs-ref v1.2.3
worai graph property delete seovoc:html --dry-run
worai graph property delete https://w3id.org/seovoc/html --yes --workers 4
- graph property delete sends X-include-Private: true by default for both GraphQL match discovery and entity PATCH requests.
- graph sync create runs Copier in trusted mode by default so template _tasks execute.
- Mapping docs (for [profile.<name>]): docs/graph-sync-mappings-reference.md, docs/graph-sync-mappings-guide.md, docs/graph-sync-mappings-examples.md
- web_page_import_timeout is configured in seconds in worai.toml (60 -> 60000 ms in SDK).
- postprocessor_runtime = "persistent" in worai.toml sets SDK env POSTPROCESSOR_RUNTIME=persistent for graph sync run (profile value overrides global).
- SDK wordlift-sdk 5.1.1+ postprocessor context migration:
  - context.settings -> context.profile (for example context.profile["settings"]["api_url"])
  - context.account.key -> context.account_key
  - context.account remains the clean /me account object
- SDK 5 ingestion defaults to INGEST_LOADER=web_scrape_api; legacy web_page_import_mode=default maps to web_scrape_api.
- WEB_PAGE_IMPORT_MODE is emitted as an SDK-valid fetch mode:
  - ingest.loader=proxy -> WEB_PAGE_IMPORT_MODE=proxy
  - ingest.loader=premium_scraper -> WEB_PAGE_IMPORT_MODE=premium_scraper
  - ingest.loader=web_scrape_api (and other loaders) -> WEB_PAGE_IMPORT_MODE=default
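The loader-to-fetch-mode mapping and the seconds-to-milliseconds conversion described in the notes above can be written out as a short sketch. The function names are hypothetical and this is not the code worai runs; only the environment variable names WEB_PAGE_IMPORT_MODE and POSTPROCESSOR_RUNTIME come from the notes.

```python
from typing import Dict

# Loaders that map to a same-named fetch mode; every other loader becomes "default".
SAME_NAMED_MODES = {"proxy", "premium_scraper"}


def web_page_import_mode(loader: str) -> str:
    """Map an ingest.loader value to the WEB_PAGE_IMPORT_MODE value."""
    return loader if loader in SAME_NAMED_MODES else "default"


def timeout_ms(seconds: float) -> int:
    """Convert a timeout given in seconds to milliseconds."""
    return int(round(seconds * 1000))


def sdk_env(loader: str, runtime: str) -> Dict[str, str]:
    """Collect the documented SDK environment variables for a graph sync run."""
    return {
        "WEB_PAGE_IMPORT_MODE": web_page_import_mode(loader),
        "POSTPROCESSOR_RUNTIME": runtime,
    }
```

For example, `sdk_env("web_scrape_api", "persistent")` yields `WEB_PAGE_IMPORT_MODE=default`, matching the last mapping rule, and `timeout_ms(60)` gives the 60000 ms figure quoted above.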
patch
worai patch ./data.ttl --dry-run --add-types
structured-data
worai structured-data create https://example.com/article Review --output-dir ./structured-data
worai structured-data create https://example.com/article --type Review --output-dir ./structured-data
worai structured-data create https://example.com/article --type Review --debug
worai structured-data create https://example.com/article --type Review --max-xhtml-chars 40000 --max-nesting-depth 2
worai structured-data generate https://example.com/sitemap.xml --yarrrml ./mapping.yarrrml --output-dir ./out
worai structured-data generate https://example.com/page --yarrrml ./mapping.yarrrml --format jsonld
worai structured-data inventory https://example.com/sitemap.xml --output ./structured-data-inventory.csv
worai structured-data inventory ./urls.txt --output ./structured-data-inventory.csv
worai structured-data inventory https://docs.google.com/spreadsheets/d/<id>/edit --sheet-name URLs_US --output ./structured-data-inventory.csv
worai structured-data inventory https://example.com/sitemap.xml --destination-sheet-id <spreadsheet_id> --destination-sheet-name Inventory
worai structured-data inventory https://example.com/sitemap.xml --output ./structured-data-inventory.csv --concurrency auto
worai structured-data inventory /path/to/debug_cloud/us --source-type debug-cloud --output ./structured-data-inventory.csv
worai structured-data inventory /path/to/debug_cloud/us --ingest-source local --ingest-loader passthrough --output ./structured-data-inventory.csv
worai structured-data inventory https://example.com/sitemap.xml --ingest-loader web_scrape_api --output ./structured-data-inventory.csv
validate
worai validate jsonld --shape review-snippet --shape schema-review ./data.jsonld
worai validate jsonld --format raw https://api.wordlift.io/data/example.jsonld
worai structured-data validate page https://example.com/article --shape review-snippet
self update
worai self update --check-only
worai self update --yes
upload-entities-from-turtle
worai upload-entities-from-turtle ./entities --recursive --limit 50
dil-import
worai dil-import <wordlift_key> <path_to_csv_file>
Troubleshooting
- Playwright missing browsers: playwright install chromium
- YARRRML conversion: npm install -g @rmlio/yarrrml-parser
- RML execution: morph-kgc is included in project dependencies
- Dependency notes:
  - Common runtime libs (e.g., requests, rdflib, tqdm, advertools, Google auth helpers) are provided transitively by wordlift-sdk.
- OAuth token issues:
  - Remove the token file and re-run worai google-search-console.
  - If you are prompted to re-auth every run, delete the token file to force a new consent flow that includes a refresh token.