This Python script uses Selenium to scrape course data from Coursera, including each course's link, title, institute, rating, recent views, students enrolled, time requirement, skills, learner count, difficulty level, duration, and sub-course details.
Coursera Scraper
This repository contains a Python script that uses Selenium to scrape course data from Coursera.
Scraped Fields
The script scrapes the following fields for each course:
- Link: The URL of the course page.
- Title: The title of the course.
- Institute: The institution offering the course (if available).
- Rating: The course rating.
- Recent Views: The number of recent views for the course.
- Students Enrolled: The number of students enrolled in the course.
- Time Requirement: The approximate time required to complete the course.
- Skills: A list of skills covered in the course.
- Learner Count: The total number of learners who have taken the course.
- Difficulty Level: The difficulty level of the course.
- Duration: The duration of the course.
- Sub-course: Additional information about the course or its sub-courses.
The scraped data is stored in a dictionary format for each course.
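As a rough illustration of the per-course dictionary, the sketch below fills in any missing field with `None` so that every record has the same shape. The key names in `EXPECTED_FIELDS` and the helper `normalize_course` are hypothetical, chosen to mirror the field list above; the keys the script actually emits may be spelled differently.

```python
from typing import Any

# Hypothetical key names derived from the field list above.
# The real script may use different spellings or capitalization.
EXPECTED_FIELDS = [
    "link",
    "title",
    "institute",
    "rating",
    "recent_views",
    "students_enrolled",
    "time_requirement",
    "skills",
    "learner_count",
    "difficulty_level",
    "duration",
    "sub_course",
]


def normalize_course(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict with every expected field present (None if missing)."""
    return {field: raw.get(field) for field in EXPECTED_FIELDS}


# Made-up record for illustration only; not real scraped data.
example = normalize_course({"title": "Example Course", "skills": ["Python"]})
```

Normalizing records this way makes optional fields such as the institute safe to read without checking whether the key exists.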
Prerequisites
You can install the package and its required Python dependencies by running the following command:
pip install coursera-scraper
Usage
from scraper.main import scraper

for course_detail in scraper(keyword='python'):
    # course_detail is a dictionary holding the scraped fields for one course
    print(course_detail)
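One possible way to process the results is to write each course dictionary to a JSON Lines file as it arrives. The helper `save_courses`, its `limit` parameter, and the output file name are assumptions made for this sketch and are not part of the package; it accepts any iterable of dictionaries, so it should work with the iterable returned by `scraper(keyword=...)`.

```python
import json
from itertools import islice
from typing import Any, Iterable, Optional


def save_courses(
    courses: Iterable[dict[str, Any]],
    path: str,
    limit: Optional[int] = None,
) -> int:
    """Write up to `limit` course dicts to `path`, one JSON object per line.

    Returns the number of records written. `limit=None` writes everything.
    """
    count = 0
    with open(path, "w", encoding="utf-8") as fh:
        # islice stops early, so a long-running scrape is not fully consumed.
        for course in islice(courses, limit):
            fh.write(json.dumps(course, ensure_ascii=False) + "\n")
            count += 1
    return count


# Hypothetical usage with the package (requires a working Selenium setup):
#   from scraper.main import scraper
#   save_courses(scraper(keyword='python'), 'courses.jsonl', limit=20)
```

Writing line by line keeps memory use flat and preserves whatever has been scraped so far if the browser session fails partway through.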
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
License
Built Distribution
Hashes for coursera_scraper-0.1.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 46dca9e6f1cc4734ba8a2b6319cbdb66972231f85c89c66ed29a9b6552568c1d
MD5 | c66ce0e75eac695415008ab4c1b25198
BLAKE2b-256 | 72517add00859e587f34858e1e205c770b0d184c427a9fdaec0d9268af83383e