Web scraping API for Finnish websites
finscraper
The library provides an easy-to-use API for fetching data from various Finnish websites:
Website | Type | Spider API class |
---|---|---|
Ilta-Sanomat | News article | ISArticle |
Iltalehti | News article | ILArticle |
YLE Uutiset | News article | YLEArticle |
Suomi24 | Discussion thread | Suomi24Page |
Muusikoiden.net | Discussion thread | MNetPage |
Vauva | Discussion thread | VauvaPage |
Oikotie Asunnot | Apartment ad | OikotieApartment |
Tori | Item deal | ToriDeal |
Documentation is available at https://finscraper.readthedocs.io, and a simple online demo is also available.
Installation
pip install finscraper
Quickstart
Fetch 10 news articles as a pandas DataFrame from Ilta-Sanomat:
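from finscraper.spiders import ISArticle
# Scrape 10 articles; scrape() returns the spider, so the calls can be chained
spider = ISArticle().scrape(10)
# Retrieve the scraped articles as a pandas DataFrame
articles = spider.get()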
The API is similar for all the spiders, so the other classes in the table above can be used in the same way.
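Because the spiders share the same calls, the Quickstart pattern can be wrapped in a small helper that accepts any spider class. The following is a minimal sketch, not part of finscraper itself: `fetch` and `StubSpider` are hypothetical names introduced here, and `StubSpider` only imitates the `scrape()` and `get()` calls shown in the Quickstart so that the example runs offline without the library installed.

```python
class StubSpider:
    """Hypothetical offline stand-in mimicking the Quickstart's scrape()/get() calls."""

    def __init__(self):
        self._items = []

    def scrape(self, n):
        # Pretend to fetch n items and return self so calls can be chained,
        # mirroring `ISArticle().scrape(10)` in the Quickstart.
        self._items.extend({"id": i} for i in range(n))
        return self

    def get(self):
        # The real spiders return a pandas DataFrame; a plain list keeps
        # this sketch free of dependencies.
        return list(self._items)


def fetch(spider_cls, n):
    """Scrape n items with any spider class that follows the Quickstart pattern."""
    spider = spider_cls().scrape(n)
    return spider.get()


items = fetch(StubSpider, 3)
print(len(items))  # prints 3

# With finscraper installed, the same helper could be called as, for example:
#   from finscraper.spiders import ILArticle, Suomi24Page
#   articles = fetch(ILArticle, 10)
#   threads = fetch(Suomi24Page, 10)
```

Passing the class rather than an instance keeps each call independent, since a fresh spider is created every time `fetch` runs.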
Contributing
Please see CONTRIBUTING.md for more information.
Jesse Myrberg (jesse.myrberg@gmail.com)