
Python package to interact with Factiva news-related APIs. Services are described in the Dow Jones Developer Platform.

Project description


This library simplifies integration with Factiva's news-related API services.

The following services are currently implemented.

  • Snapshots: Lets you run each step (snapshot creation, monitoring, download and local exploration) individually, or run the whole process with a single method.

  • Streams: In addition to creating streams and getting their details, provides methods to implement a stream listener and push the content to other locations suited to high-availability setups.

Both components rely on the API-Key authentication method, so a valid user key is a prerequisite for using either service.
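The usage examples further down note that user_key can be omitted when the key exists as an environment variable. The fallback can be sketched as below. This is only an illustration, not the library's code: resolve_user_key is a hypothetical helper, and the variable name FACTIVA_USERKEY is an assumption that should be checked against the library documentation.

```python
import os

# Assumed variable name; this README does not state it.
USER_KEY_ENV = "FACTIVA_USERKEY"


def resolve_user_key(user_key=None, env=None):
    """Return the explicit key if given, else the one from the environment."""
    env = os.environ if env is None else env
    key = user_key or env.get(USER_KEY_ENV, "")
    if not key:
        raise ValueError(
            f"No user key: pass user_key or set {USER_KEY_ENV}"
        )
    return key
```

Passing env explicitly keeps the helper easy to test without touching the real process environment.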

Installation

To install this library, run the following command.

$ pip install --upgrade factiva-news

Using Library services

Both services, Snapshots and Streams, are implemented in this library.

Environment variables

To use the Stream Listener options, set the following environment variables according to the listener tool you select.

To use the BigQuery Stream Listener:

$ export GOOGLE_APPLICATION_CREDENTIALS="/Users/Files/credentials.json"
$ export STREAMLOG_BQ_TABLENAME=project.dataset.table

To use the MongoDB Stream Listener:

$ export MONGODB_CONNECTION_STRING=mongodb://localhost:27017
$ export MONGODB_DATABASE_NAME=factiva-news
$ export MONGODB_COLLECTION_NAME=stream-listener

To define custom directories (if these are not set, the project root path is used):

$ export DOWNLOAD_FILES_DIR=/users/downloads
$ export STREAM_FILES_DIR=/users/listeners
$ export LOG_FILES_DIR=/users/logs
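As a rough illustration of the fallback behaviour described above, the following sketch reads the directory variables and reports which listener variables are still unset. The variable names come from the export lines in this section. The helper names resolve_dirs and missing_listener_vars are hypothetical, and using the current working directory as the "project root" is an assumption.

```python
import os
from pathlib import Path

# Variable names as listed in the export lines of this section.
DIR_VARS = ("DOWNLOAD_FILES_DIR", "STREAM_FILES_DIR", "LOG_FILES_DIR")
LISTENER_VARS = {
    "bigquery": ("GOOGLE_APPLICATION_CREDENTIALS", "STREAMLOG_BQ_TABLENAME"),
    "mongodb": (
        "MONGODB_CONNECTION_STRING",
        "MONGODB_DATABASE_NAME",
        "MONGODB_COLLECTION_NAME",
    ),
}


def resolve_dirs(env=None, fallback=None):
    """Map each directory variable to a Path, using fallback when unset."""
    env = os.environ if env is None else env
    # Assumption: the current working directory stands in for the project root.
    fallback = Path.cwd() if fallback is None else Path(fallback)
    return {
        name: Path(env[name]) if env.get(name) else fallback
        for name in DIR_VARS
    }


def missing_listener_vars(listener, env=None):
    """Return the variables the chosen listener needs that are not set."""
    env = os.environ if env is None else env
    return [name for name in LISTENER_VARS[listener] if not env.get(name)]
```

Calling missing_listener_vars("mongodb") before starting a listener gives an early, readable error instead of a connection failure later on.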

Snapshots

Creating a new snapshot and downloading it to a local repository requires just a few lines of code.

from factiva.news.snapshot import Snapshot
my_query = "publication_datetime >= '2020-01-01 00:00:00' AND LOWER(language_code) = 'en'"
my_snapshot = Snapshot(
    user_key='abcd1234abcd1234abcd1234abcd1234',  # Can be omitted if it exists as an environment variable
    query=my_query)
my_snapshot.process_extract()  # This operation can take several minutes to complete

After the process completes, the output files are stored in a subfolder named after the extraction job ID.

The code above creates a new snapshot, using my_query as the selection criteria and user_key for user authentication. After the job is validated internally, a snapshot ID is obtained along with the list of files to download. Files are downloaded automatically to a folder named after the snapshot ID, and the contents are loaded as a pandas DataFrame into the variable news_articles. The process may take several minutes, but it automates most of the extraction work.
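Once the extraction finishes, the downloaded files can be located by folder name. The sketch below is a hypothetical helper, not part of the library. It assumes the job-ID folder is created under DOWNLOAD_FILES_DIR, or under the current working directory when that variable is unset, and it makes no assumption about the file format.

```python
import os
from pathlib import Path


def list_snapshot_files(job_id, base_dir=None):
    """Return the sorted files found in <base_dir>/<job_id>."""
    if base_dir is None:
        # Assumption: downloads land under DOWNLOAD_FILES_DIR when it is set.
        base_dir = os.environ.get("DOWNLOAD_FILES_DIR") or Path.cwd()
    folder = Path(base_dir) / job_id
    if not folder.is_dir():
        raise FileNotFoundError(f"No snapshot folder at {folder}")
    # Only regular files are returned; nested folders are ignored.
    return sorted(p for p in folder.iterdir() if p.is_file())
```

The returned paths can then be passed to whichever reader matches the delivered file format.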

Streams

Create a stream instance and get the details needed to configure the stream client and listen to the content as it is delivered.

from factiva.news.stream import Stream

stream_query = Stream(
    user_key='abcd1234abcd1234abcd1234abcd1234',   # Can be omitted if it exists as an environment variable
    user_key_stats=True,
    query="publication_datetime >= '2021-04-01 00:00:00' AND LOWER(language_code)='en' AND UPPER(source_code) = 'DJDN'",
    )

print(stream_query.create())
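The query strings in both examples follow the same pattern: conditions on publication_datetime, language_code and source_code joined with AND. A small helper can assemble them from Python values. build_query is a hypothetical name, and the sketch covers only the three fields that appear in the examples above. It interpolates values without any escaping, so it should only be fed trusted input.

```python
from datetime import datetime


def build_query(since, language_code, source_code=None):
    """Compose a query string in the style of the examples above."""
    clauses = [
        # Timestamps are rendered as 'YYYY-MM-DD HH:MM:SS'.
        f"publication_datetime >= '{since:%Y-%m-%d %H:%M:%S}'",
        f"LOWER(language_code) = '{language_code.lower()}'",
    ]
    if source_code is not None:
        clauses.append(f"UPPER(source_code) = '{source_code.upper()}'")
    return " AND ".join(clauses)
```

For instance, build_query(datetime(2020, 1, 1), "en") reproduces the my_query string used in the Snapshots example.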

Download files


Source Distribution

factiva-news-0.2.5.tar.gz (29.0 kB)

Uploaded Source

Built Distribution

factiva_news-0.2.5-py3-none-any.whl (34.3 kB)

Uploaded Python 3
