A tool to import Retraction Watch data

Retraction Watch Database Importer

This Airflow script imports the Retraction Watch database into the annotations system of the Crossref Labs API.

Input Format

The script expects an S3 folder that contains CSV files with Retraction Watch data.

Each CSV file must have the following column headings, capitalized exactly as shown:

  • DOI
  • RetractionDOI
  • Reason
  • RetractionNature
  • Notes
  • URLS

The first row of each CSV file must contain the headings. A DOI can have multiple entries (e.g. an expression of concern and a retraction), but only one entry of each type is imported per DOI. You cannot have two retractions or two expressions of concern for the same DOI.
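
The input rules above can be sketched in a few lines of Python. This is only an illustration, not the importer's actual code: the function name `read_entries` is hypothetical, and it assumes that the RetractionNature column identifies the entry type and that the first row seen for a given DOI and type is the one kept.

```python
import csv
import io

REQUIRED_HEADINGS = ["DOI", "RetractionDOI", "Reason", "RetractionNature", "Notes", "URLS"]


def read_entries(csv_text):
    """Parse CSV text, check the headings, and keep one entry per DOI and type."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headings = reader.fieldnames or []
    # The comparison is case-sensitive, matching the capitalization rule.
    missing = [h for h in REQUIRED_HEADINGS if h not in headings]
    if missing:
        raise ValueError(f"Missing headings: {missing}")

    entries = {}
    for row in reader:
        # Assumption: RetractionNature holds the entry type, and the first
        # row seen for a (DOI, type) pair wins.
        key = (row["DOI"], row["RetractionNature"])
        entries.setdefault(key, row)
    return list(entries.values())
```

A file with two retractions and one expression of concern for the same DOI would yield two entries under these assumptions.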

Idempotency

The script is idempotent. Running it multiple times imports only new data, and the results are the same after every run.
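
One way to picture this property is an import step that skips entries it has already stored. The sketch below is hypothetical: it uses an in-memory dict as a stand-in for the annotations system, and it assumes entries are identified by DOI and RetractionNature. The real script may achieve idempotency differently.

```python
def import_entries(store, entries):
    """Add entries that are not yet in store; return how many were added."""
    added = 0
    for entry in entries:
        # Assumption: (DOI, RetractionNature) uniquely identifies an entry.
        key = (entry["DOI"], entry["RetractionNature"])
        if key in store:
            continue  # already imported on an earlier run
        store[key] = entry
        added += 1
    return added
```

Calling `import_entries` twice with the same entries leaves `store` unchanged after the first call.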

Archiving

After processing a CSV input file, the script moves it to an archive folder in the same S3 bucket.
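
S3 has no native move operation, so a move is usually a copy followed by a delete. The sketch below shows that pattern with the boto3 client methods `copy_object` and `delete_object`. The function name `archive_object` and the `archive/` prefix are assumptions for illustration, not names taken from the script.

```python
def archive_object(s3, bucket, key, archive_prefix="archive/"):
    """Copy an object under archive_prefix in the same bucket, then delete the original."""
    filename = key.rsplit("/", 1)[-1]
    archive_key = archive_prefix + filename  # hypothetical archive location
    s3.copy_object(
        Bucket=bucket,
        Key=archive_key,
        CopySource={"Bucket": bucket, "Key": key},
    )
    # Delete only after the copy call has returned without raising.
    s3.delete_object(Bucket=bucket, Key=key)
    return archive_key
```

Here `s3` would typically be a client created with `boto3.client("s3")`. Passing it in as a parameter keeps the function easy to test with a stand-in object.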

Periodic Runs and Missing Input Files

The script is designed to be run periodically. If it does not find any input files, it will raise an exception. This is by design.
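
A fail-loudly check of this kind might look like the sketch below, so that a scheduled run with nothing to process shows up as a failed task rather than a silent success. The exception class, the function name, and the `.csv` suffix filter are all assumptions. `list_objects_v2` is a real boto3 client method, but a single call returns at most 1,000 keys, so a larger folder would need pagination.

```python
class NoInputFilesError(Exception):
    """Raised when the input folder contains no files to import (hypothetical name)."""


def list_input_keys(s3, bucket, prefix):
    """Return the CSV keys under prefix, or raise if there are none."""
    response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # "Contents" is absent from the response when nothing matches the prefix.
    keys = [
        obj["Key"]
        for obj in response.get("Contents", [])
        if obj["Key"].lower().endswith(".csv")
    ]
    if not keys:
        raise NoInputFilesError(f"No input files under s3://{bucket}/{prefix}")
    return keys
```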

© Crossref 2023
