The ultimate library for data scientists to scrape data from https://www.lefaso.net

Project description

lefaso-net-scraper

Description

lefaso-net-scraper is a Python library for extracting articles from www.lefaso.net, a popular online news source in Burkina Faso. It collects article content together with the comments that readers post on lefaso.net.

Data Format

Field                    Description
article_topic            article topic
article_title            article title
article_published_date   article published date
article_origin           article origin
article_url              article url
article_content          article content
article_comments         article comments
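
The field list above can be written down as a typed record, which helps when checking scraped rows before analysis. This is only a sketch: the field names come from the table, but the value types are assumptions (the README does not document them), and `ArticleRecord`, `EXPECTED_FIELDS`, and `missing_fields` are hypothetical names that are not part of lefaso-net-scraper.

```python
from typing import Any, Dict, List, TypedDict


class ArticleRecord(TypedDict):
    """One scraped article; field names come from the Data Format table."""

    article_topic: str
    article_title: str
    article_published_date: str  # assumed to be a string; the format is not documented
    article_origin: str
    article_url: str
    article_content: str
    article_comments: Any  # structure is not documented, so it is left untyped


# Field names in the order listed in the Data Format table.
EXPECTED_FIELDS = tuple(ArticleRecord.__annotations__)


def missing_fields(record: Dict[str, Any]) -> List[str]:
    """Return the documented field names that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]
```

Calling `missing_fields(record)` on each scraped record gives an empty list when every documented field is present.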

Installation

  • With poetry
poetry add lefaso-net-scraper
  • With pip
pip install lefaso-net-scraper

Usage

# coding: utf-8

from lefaso_net_scraper import LefasoNetScraper

section_url = 'https://lefaso.net/spip.php?rubrique473'
scraper = LefasoNetScraper(section_url)
data = scraper.run()
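
Once `run()` has returned, a quick summary can confirm that the scrape produced what was expected. The sketch below assumes `data` is a list of dictionaries keyed by the field names from the Data Format section, which the README implies but does not state. `count_by_topic` is a hypothetical helper, and the sample records are hand-written placeholders rather than real scraper output.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def count_by_topic(records: Iterable[Dict[str, Any]]) -> Counter:
    """Count records per article_topic; records without a topic are grouped under None."""
    return Counter(record.get("article_topic") for record in records)


# Hand-written placeholder records, not real scraper output.
sample = [
    {"article_topic": "Politique", "article_title": "Example A"},
    {"article_topic": "Politique", "article_title": "Example B"},
    {"article_topic": "Sport", "article_title": "Example C"},
]
topic_counts = count_by_topic(sample)
```

With real data, `count_by_topic(data)` would replace the call on `sample`.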

Setting the pagination range

# coding: utf-8

from lefaso_net_scraper import LefasoNetScraper

section_url = 'https://lefaso.net/spip.php?rubrique473'
scraper = LefasoNetScraper(section_url)
scraper.set_pagination_range(start=20, stop=100)
data = scraper.run()

Save data to CSV

# coding: utf-8

from lefaso_net_scraper import LefasoNetScraper
import pandas as pd

section_url = 'https://lefaso.net/spip.php?rubrique473'
scraper = LefasoNetScraper(section_url)
data = scraper.run()
df = pd.DataFrame.from_records(data)
df.to_csv('path/to/df.csv')
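
A CSV cell holds a single string, so if `article_comments` is a nested value such as a list (an assumption, since the README does not document its structure), it would be stored as its text representation. JSON Lines is one alternative that keeps nested values intact. The sketch below uses only the standard library; `records_to_jsonl` is a hypothetical helper, not part of lefaso-net-scraper, and the sample record is a hand-written placeholder.

```python
import json
from typing import Any, Dict, Iterable


def records_to_jsonl(records: Iterable[Dict[str, Any]]) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    # ensure_ascii=False keeps accented French characters readable in the output.
    return "".join(json.dumps(record, ensure_ascii=False) + "\n" for record in records)


# Hand-written placeholder record, not real scraper output.
sample = [{"article_title": "Été au Faso", "article_comments": ["Merci", "Bravo"]}]
jsonl_text = records_to_jsonl(sample)
```

To save the result, the returned text can be written with `pathlib.Path('path/to/articles.jsonl').write_text(jsonl_text, encoding='utf-8')`, assuming every value in the records is JSON-serializable.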


Support this project and others!


Buy Me A Coffee

Download files

Source Distribution

lefaso_net_scraper-0.3.0.tar.gz (3.7 kB view details)

Uploaded Source

Built Distribution

lefaso_net_scraper-0.3.0-py3-none-any.whl (4.9 kB view details)

Uploaded Python 3

File details

Details for the file lefaso_net_scraper-0.3.0.tar.gz.

File metadata

  • Download URL: lefaso_net_scraper-0.3.0.tar.gz
  • Size: 3.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.6.1 CPython/3.10.12 Linux/6.2.0-1012-azure

File hashes

Hashes for lefaso_net_scraper-0.3.0.tar.gz
Algorithm     Hash digest
SHA256        0ae57e9eaa43e05a5ab082f5a50f93626c4ac39ca79bf53a7cbde298c6621935
MD5           602a93aaad4faea763ed92f2c8279e9c
BLAKE2b-256   ca6c746c0bb3753d6239cfa347abc31c43a59d9cc8a653516c75479928cdb33f

File details

Details for the file lefaso_net_scraper-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: lefaso_net_scraper-0.3.0-py3-none-any.whl
  • Size: 4.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.6.1 CPython/3.10.12 Linux/6.2.0-1012-azure

File hashes

Hashes for lefaso_net_scraper-0.3.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        c3fd2140465210892fb8cb127a4bce5be48e3a1d748446d4a4cf6bcee2cf98bb
MD5           6c641f4a48da9fe0ff776112a9490ad2
BLAKE2b-256   431157c31d4f87b9632686ada00af7069514ee6e8efd8d6d46df5c851b633d63
