A library for fetching scientific articles from various sources
Scista
Scista is a Python library for searching and downloading scientific articles from various sources, including OpenAlex, CORE, and Unpaywall.
Installation
pip install scista
Requirements
- Python 3.7+
- API key for CORE (available from the CORE API website)
- Email for Unpaywall
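Since both credentials are required, it can be convenient to keep them out of source code. The following is a minimal sketch that reads them from environment variables. The variable names SCISTA_CORE_API_KEY and SCISTA_UNPAYWALL_EMAIL and the helper load_credentials are assumptions made for this example; the library does not define them.

```python
import os

# Hypothetical environment variable names, chosen for this example only.
REQUIRED_VARS = ("SCISTA_CORE_API_KEY", "SCISTA_UNPAYWALL_EMAIL")


def load_credentials(env=None):
    """Read the CORE API key and the Unpaywall email from the environment.

    Returns a dict whose keys match the ArticleFetcher keyword arguments
    shown in the Usage section.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "core_api_key": env["SCISTA_CORE_API_KEY"],
        "email_for_unpaywall": env["SCISTA_UNPAYWALL_EMAIL"],
    }
```

The result can then be unpacked into the constructor, for example ArticleFetcher(**load_credentials()).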
Search Filters
The library provides several optional search filters that can be used individually or in combination:
articles = fetcher.fetch_articles(
    topic="quantum computing",  # Search by topic in title
    num_articles=5,             # Number of articles to fetch (default: 5)
    categories=["Physics"],     # Filter by scientific categories
    from_date="2023-01-01",     # Start date (format: YYYY-MM-DD)
    to_date="2023-12-31",       # End date (format: YYYY-MM-DD)
    sort_by_date=True,          # Sort by date (newest first if True)
    journals=["1234-5678"]      # Filter by journal ISSN(s)
)
All filters are optional. You can use any combination of them:
# Search only by topic
articles = fetcher.fetch_articles(topic="quantum computing")

# Search by category and date range
articles = fetcher.fetch_articles(
    categories=["Physics"],
    from_date="2023-01-01",
    to_date="2023-12-31"
)

# Get latest articles from specific journals
articles = fetcher.fetch_articles(
    journals=["1234-5678", "8765-4321"],
    sort_by_date=True,
    num_articles=10
)
Filter Details
- topic: Search for articles with this topic in the title
- num_articles: Maximum number of articles to fetch (default: 5)
- categories: Scientific categories to filter by. Can be a single category or a list
- from_date: Start date in YYYY-MM-DD format
- to_date: End date in YYYY-MM-DD format
- sort_by_date: If True, sorts by date descending (newest first)
- journals: Filter by journal ISSN(s). Can be a single ISSN or a list
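The date filters expect YYYY-MM-DD strings, but this description does not say whether fetch_articles validates them. A caller-side check along the following lines can catch a malformed or reversed range before any request is made. The helpers parse_filter_date and check_date_range are hypothetical names introduced for this sketch and are not part of scista.

```python
import re
from datetime import date

# Strict YYYY-MM-DD shape: four digits, two digits, two digits.
_DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_filter_date(value):
    """Parse a YYYY-MM-DD string, raising ValueError if it is malformed."""
    if not _DATE_PATTERN.fullmatch(value):
        raise ValueError(f"Expected YYYY-MM-DD, got {value!r}")
    # fromisoformat rejects impossible dates such as 2023-02-30.
    return date.fromisoformat(value)


def check_date_range(from_date=None, to_date=None):
    """Validate both bounds and return them unchanged if they are consistent."""
    start = parse_filter_date(from_date) if from_date is not None else None
    end = parse_filter_date(to_date) if to_date is not None else None
    if start is not None and end is not None and start > end:
        raise ValueError(f"from_date {from_date} is after to_date {to_date}")
    return from_date, to_date
```

The validated strings can be passed straight through as the from_date and to_date arguments.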
Usage
import logging
from scista import ArticleFetcher

# Configure logging (optional)
logging.basicConfig(
    level=logging.INFO,  # Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

# Initialize with your API keys
fetcher = ArticleFetcher(
    core_api_key="your_core_api_key",
    email_for_unpaywall="your_email@example.com"
)

# Search for articles
articles = fetcher.fetch_articles(
    topic="quantum computing",  # Topic to search
    num_articles=5,             # Number of articles
    categories=["Physics"],     # Category
    from_date="2023-01-01",     # Start date
    to_date="2023-12-31",       # End date
    sort_by_date=True           # Sort by date
)

# Process results
for i, article in enumerate(articles, 1):
    print(f"\nArticle {i}:")
    print(article)
    # Save PDF if available
    if article.pdf_url:
        article.save_pdf(f"article_{i}.pdf")
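When downloading many PDFs, one failed download should probably not stop the rest. The sketch below wraps the loop above in a helper that relies only on the pdf_url attribute and the save_pdf method shown in the usage example. The helper name save_available_pdfs and the out_dir parameter are hypothetical, and catching a broad Exception is an assumption, because the errors save_pdf may raise are not documented here.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def save_available_pdfs(articles, out_dir="pdfs"):
    """Save every article that has a pdf_url and return the paths written."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, article in enumerate(articles, 1):
        if not article.pdf_url:
            continue  # No PDF link for this article
        path = target / f"article_{i}.pdf"
        try:
            article.save_pdf(str(path))
        except Exception:
            # Log the traceback and move on to the next article.
            logger.exception("Could not save PDF for article %d", i)
            continue
        saved.append(path)
    return saved
```

Calling save_available_pdfs(articles) after fetch_articles returns the list of files that were actually written, which makes it easy to report how many downloads succeeded.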
Logging
The library uses Python's standard logging module. You can configure logging to suit your needs:
import logging
# Basic configuration
logging.basicConfig(
    level=logging.INFO,  # Logging level
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
# Or more complex configuration
logger = logging.getLogger('scista')
handler = logging.FileHandler('scista.log')
handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
Logging levels:
- DEBUG: Detailed debug information
- INFO: Confirmation of successful operations
- WARNING: Warnings about potential problems
- ERROR: Errors that do not interrupt the program
- CRITICAL: Critical errors
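If the configuration code runs more than once (for example in a notebook), calling addHandler each time attaches duplicate handlers and every message is written several times. A minimal sketch of a repeatable setup follows. It assumes the library logs under the 'scista' logger name used in the example above; the helper configure_scista_logging and its marker attribute are hypothetical.

```python
import logging

LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"


def configure_scista_logging(level=logging.WARNING, logfile=None):
    """Attach exactly one handler to the 'scista' logger, even if called repeatedly."""
    logger = logging.getLogger("scista")
    # Remove only handlers installed by earlier calls of this helper.
    for old in list(logger.handlers):
        if getattr(old, "_added_by_helper", False):
            logger.removeHandler(old)
            old.close()
    handler = logging.FileHandler(logfile) if logfile else logging.StreamHandler()
    handler._added_by_helper = True  # Marker attribute used only by this sketch
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    logger.setLevel(level)
    # Stop records from also reaching root handlers, which would print them twice
    # when logging.basicConfig has been called as well.
    logger.propagate = False
    return logger
```

Setting propagate to False is a design choice for this sketch; leave it at the default if the root logger's handlers should also receive the library's messages.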
Functionality
- Search for articles by topic, category and date
- Get article metadata (title, authors, DOI, etc.)
- Download full texts and PDF versions of articles
- Support for multiple data sources
License
MIT License
Author
AlestackOverglow
File details
Details for the file scista-0.1.0.tar.gz.
File metadata
- Download URL: scista-0.1.0.tar.gz
- Upload date:
- Size: 6.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0d5085dc7b49afec947c13bac339f08a15c68e032e43011be58d4371fc0d38b0 |
| MD5 | 6b6d43c4755849aa836d54d0efa0d350 |
| BLAKE2b-256 | 58ab46a7791bcd0ee45cde70f76e78c357c16356872c5a2282aa6e6dda4d1ee3 |
File details
Details for the file scista-0.1.0-py3-none-any.whl.
File metadata
- Download URL: scista-0.1.0-py3-none-any.whl
- Upload date:
- Size: 7.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a33000e75cdab36ae7c0b866bf91849e95e917c30d0c368e7c42f341db80a2cd |
| MD5 | 733fac09016acc373b2fae2a58b6d838 |
| BLAKE2b-256 | 5118a66e7cd30a7ad84d7bc0b2dde0e7e8088aa99398fbadfb704d73d73a2c90 |