A library for fetching scientific articles from various sources

Scista

Scista is a Python library for searching and downloading scientific articles from various sources, including OpenAlex, CORE, and Unpaywall.

Installation

pip install scista

Requirements

  • Python 3.7+
  • API key for CORE (obtained from the CORE API service)
  • Email address for Unpaywall
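
To keep the CORE key and the Unpaywall email out of source code, they can be read from the environment. The following is a minimal sketch, not part of the library: the variable names SCISTA_CORE_API_KEY and SCISTA_UNPAYWALL_EMAIL and the helper load_credentials are assumptions made for illustration. Only the keyword names core_api_key and email_for_unpaywall come from the Usage section.

```python
import os


def load_credentials(env=None):
    """Collect ArticleFetcher keyword arguments from environment variables.

    SCISTA_CORE_API_KEY and SCISTA_UNPAYWALL_EMAIL are hypothetical names
    chosen for this sketch; the library itself does not read them.
    """
    env = os.environ if env is None else env
    required = ("SCISTA_CORE_API_KEY", "SCISTA_UNPAYWALL_EMAIL")

    # Fail early with a clear message instead of sending empty credentials.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    return {
        "core_api_key": env["SCISTA_CORE_API_KEY"],
        "email_for_unpaywall": env["SCISTA_UNPAYWALL_EMAIL"],
    }
```

With both variables set, the fetcher could then be created as ArticleFetcher(**load_credentials()).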

Search Filters

The library provides several optional search filters that can be used individually or in combination. In the examples below, fetcher is an ArticleFetcher instance created as shown under Usage:

articles = fetcher.fetch_articles(
    topic="quantum computing",     # Search by topic in title
    num_articles=5,               # Number of articles to fetch (default: 5)
    categories=["Physics"],       # Filter by scientific categories
    from_date="2023-01-01",      # Start date (format: YYYY-MM-DD)
    to_date="2023-12-31",        # End date (format: YYYY-MM-DD)
    sort_by_date=True,           # Sort by date (newest first if True)
    journals=["1234-5678"]       # Filter by journal ISSN(s)
)

All filters are optional. You can use any combination of them:

# Search only by topic
articles = fetcher.fetch_articles(topic="quantum computing")

# Search by category and date range
articles = fetcher.fetch_articles(
    categories=["Physics"],
    from_date="2023-01-01",
    to_date="2023-12-31"
)

# Get latest articles from specific journals
articles = fetcher.fetch_articles(
    journals=["1234-5678", "8765-4321"],
    sort_by_date=True,
    num_articles=10
)

Filter Details

  • topic: Search for articles with this topic in the title
  • num_articles: Maximum number of articles to fetch (default: 5)
  • categories: Scientific categories to filter by. Can be a single category or a list
  • from_date: Start date in YYYY-MM-DD format
  • to_date: End date in YYYY-MM-DD format
  • sort_by_date: If True, sorts by date descending (newest first)
  • journals: Filter by journal ISSN(s). Can be a single ISSN or a list
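
Because dates must be in YYYY-MM-DD format and categories and journals accept either a single value or a list, it can help to validate the filters before calling fetch_articles. The sketch below is an illustration only: build_filters and ISSN_PATTERN are hypothetical helpers, not part of scista, and how the library itself reacts to malformed input is not documented here.

```python
import re
from datetime import datetime

# An ISSN is four digits, a hyphen, three digits, and a check character (digit or X).
ISSN_PATTERN = re.compile(r"^\d{4}-\d{3}[\dXx]$")


def _as_list(value):
    """Wrap a single string in a list; copy any other iterable into a list."""
    return [value] if isinstance(value, str) else list(value)


def build_filters(topic=None, categories=None, from_date=None,
                  to_date=None, journals=None):
    """Return a dict of validated keyword arguments for fetch_articles."""
    filters = {}
    if topic:
        filters["topic"] = topic
    if categories:
        filters["categories"] = _as_list(categories)

    parsed = {}
    for name, value in (("from_date", from_date), ("to_date", to_date)):
        if value:
            # strptime raises ValueError if the string is not a valid date.
            parsed[name] = datetime.strptime(value, "%Y-%m-%d").date()
            filters[name] = parsed[name].isoformat()
    if len(parsed) == 2 and parsed["from_date"] > parsed["to_date"]:
        raise ValueError("from_date must not be later than to_date")

    if journals:
        issns = _as_list(journals)
        bad = [issn for issn in issns if not ISSN_PATTERN.match(issn)]
        if bad:
            raise ValueError("Malformed ISSN(s): " + ", ".join(bad))
        filters["journals"] = issns
    return filters
```

The result can be unpacked into the call, for example fetcher.fetch_articles(**build_filters(topic="quantum computing", journals="1234-5678")).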

Usage

import logging
from scista import ArticleFetcher

# Configure logging (optional)
logging.basicConfig(
    level=logging.INFO,  # Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

# Initialize with your CORE API key and Unpaywall email
fetcher = ArticleFetcher(
    core_api_key="your_core_api_key",
    email_for_unpaywall="your_email@example.com"
)

# Search for articles
articles = fetcher.fetch_articles(
    topic="quantum computing",  # Topic to search
    num_articles=5,            # Number of articles
    categories=["Physics"],    # Category
    from_date="2023-01-01",   # Start date
    to_date="2023-12-31",     # End date
    sort_by_date=True         # Sort by date
)

# Process results
for i, article in enumerate(articles, 1):
    print(f"\nArticle {i}:")
    print(article)
    
    # Save PDF if available
    if article.pdf_url:
        article.save_pdf(f"article_{i}.pdf")
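
When many articles are downloaded, one failed PDF should not stop the rest. The following sketch wraps the loop above in a helper that writes into a directory and skips failures. save_available_pdfs is a hypothetical function, and since the exceptions raised by save_pdf are not documented here, the sketch assumes any exception signals a failed download.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def save_available_pdfs(articles, out_dir="pdfs"):
    """Save every article that has a pdf_url; return the paths written."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)

    saved = []
    for i, article in enumerate(articles, 1):
        if not article.pdf_url:
            continue  # no PDF available for this article
        path = target / f"article_{i}.pdf"
        try:
            article.save_pdf(str(path))
        except Exception as exc:  # assumption: failure types are not documented
            logger.warning("Could not save article %d: %s", i, exc)
            continue
        saved.append(path)
    return saved
```

The file names keep the article's position in the result list, so article_3.pdf always corresponds to the third article even if earlier downloads were skipped.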

Logging

The library uses Python's standard logging module. You can configure logging to suit your needs:

import logging

# Basic configuration
logging.basicConfig(
    level=logging.INFO,  # Logging level
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

# Or more complex configuration
logger = logging.getLogger('scista')
handler = logging.FileHandler('scista.log')
handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

Logging levels:

  • DEBUG: Detailed debug information
  • INFO: Confirmation of successful operations
  • WARNING: Warnings about potential problems
  • ERROR: Errors that do not interrupt the program
  • CRITICAL: Critical errors
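
The levels can also be set for the library alone, leaving the rest of the application untouched. This minimal sketch assumes the library logs through the logger named 'scista', as the configuration example above suggests; set_scista_level is a hypothetical helper.

```python
import logging


def set_scista_level(level=logging.WARNING):
    """Set the threshold of the 'scista' logger without changing other loggers."""
    logger = logging.getLogger("scista")
    logger.setLevel(level)
    return logger


# Application messages stay at INFO, while the library only reports
# warnings and errors.
logging.basicConfig(level=logging.INFO)
scista_logger = set_scista_level(logging.WARNING)
```

Messages below the chosen level are dropped by the 'scista' logger itself, so they never reach the handlers configured on the root logger.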

Functionality

  • Search for articles by topic, category, journal, and date range
  • Get article metadata (title, authors, DOI, etc.)
  • Download full texts and PDF versions of articles
  • Support for multiple data sources

License

MIT License

Author

AlestackOverglow

Source Distribution

scista-0.1.0.tar.gz (6.6 kB)

Uploaded Source

Built Distribution

scista-0.1.0-py3-none-any.whl (7.1 kB)

Uploaded Python 3

File details

Details for the file scista-0.1.0.tar.gz.

File metadata

  • Download URL: scista-0.1.0.tar.gz
  • Upload date:
  • Size: 6.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.1

File hashes

Hashes for scista-0.1.0.tar.gz
Algorithm Hash digest
SHA256 0d5085dc7b49afec947c13bac339f08a15c68e032e43011be58d4371fc0d38b0
MD5 6b6d43c4755849aa836d54d0efa0d350
BLAKE2b-256 58ab46a7791bcd0ee45cde70f76e78c357c16356872c5a2282aa6e6dda4d1ee3

File details

Details for the file scista-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: scista-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 7.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.1

File hashes

Hashes for scista-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 a33000e75cdab36ae7c0b866bf91849e95e917c30d0c368e7c42f341db80a2cd
MD5 733fac09016acc373b2fae2a58b6d838
BLAKE2b-256 5118a66e7cd30a7ad84d7bc0b2dde0e7e8088aa99398fbadfb704d73d73a2c90
