Search Providers package for Mediacloud
Media Cloud Providers Library
A package of search providers for Media Cloud, wrapping up interfaces for different social media platforms.
Install with pip (pip install .) and the install.sh script.
Requires environment variables set for various interfaces to work correctly.
Build
Run pip install flit twine first so you can build and deploy to PyPI.
- Bump the version number in pyproject.toml
- Add a note about changes to the version history below
- Commit the changes and tag it with a semantic version number
- A GitHub Action builds and publishes the package to PyPI when a tagged version is pushed
Version History
- v0.0.0 - Testing new release pipeline, temporary, do not use.
- v3.1.2 - Fix random sampling behavior in ES provider to be genuinely random, bugfix related to marginal sorting error, additional counters for fine-grained visibility
- v3.1.1 - Fix ES Provider to send None as last page pagination token
- v3.1.0 - Add new ProviderException classes to pass more meaningful errors to consumer processes
- v3.0.1 - Fix ES Provider to accept sort_{order,field} paging arguments like NSA-based Provider
- v3.0.0 - New "OnlineNewsMediaCloudProvider" using Elasticsearch DSL for direct access to the ES cluster. Retain old provider as "OnlineNewsMediaCloudOldProvider" for now.
- v2.2.0 - Added an optional argument to providers to toggle caching behavior, added more specific error on 504
- v2.1.1 - Bugfix
- v2.1.0 - Mediacloud news client code incorporated into this package
- v2.0.5 - Build-system in pyproject.toml
- v2.0.4 - reintroduce stopwords
- v2.0.3 - version bump for automatic releases
- v2.0.2 - respect domain filters on Media Cloud searches
- v2.0.1 - more work on caching strategies
- v2.0.0 - change CachingManager interface to support online news providers better
- v1.0.1 - fix default timeout option that applies across all providers
- v1.0.0 - Remove legacy Media Cloud, add timeout option to provider_for
- v0.5.3 - Temporary fix to onlinenews-mediacloud search handling
- v0.5.3 - Tweaks to onlinenews-mediacloud for compatibility with new database pattern
- v0.5.2 - Fix to allow override of chunking in MC client
- v0.5.1 - Fix use of media cloud to respect domains clause on story list paging
- v0.5.0 - Integrate new mediacloud-news-client into onlinenews-mediacloud
- v0.4.0 - Specify custom base URLs via new string param to provider_by_name and provider_for
- v0.3.0 - Add support for paging through stories directly, and including text in paged results for speed
- v0.2.6 - Fixed querying by domain on new mediacloud system
- v0.2.5 - Alignment with new mediacloud system. Old onlinenews provider is now "onlinenews-mclegacy", "onlinenews-mediacloud" now queries the new index.
- v0.2.4 - Added support for api keys via "provider_by_name"
- v0.2.3 - removed support for API keys in environment variables; now expected as an argument in providers.provider_for
- v0.2.2 - transition to use the dedicated mediacloud-api-legacy package to avoid version conflicts
- v0.2.1 - add in a date hack to resolve a lower-level bug in the Media Cloud legacy count-over-time results
- v0.2.0 - add in support for Media Cloud legacy database
- v0.1.7 - corrected support for a "filters" kwarg in online_news
- v0.1.6 - Added support for a "filters" kwarg in online_news
- v0.1.5 - Added politeness wait to all chunked queries in twitter provider
- v0.1.4 - Added Query Chunking for large collections in the Twitter provider
- v0.1.3 - Added Query Chunking for large queries in the onlinenews provider
- v0.1.2 - Test Completeness
- v0.1.1 - Parity with web-search module, and language model
- v0.1.0 - Initial pypi upload