AutoScrap
Lightweight Web Scraping Automation for Everyone
Installation
Install from PyPI:
pip install autoscrap
Or, for development, install the dependencies from a checkout of the source:
pip install -r requirements.txt
Features
- Simple functions for web scraping:
  - get_text(url, tag): Fetches all text within a given HTML tag from a URL.
  - extract_table(url): Extracts the first HTML table from a URL as a list of lists or a pandas DataFrame.
- No need to learn BeautifulSoup or Selenium.
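To give a feel for what the two helpers do conceptually, here is a minimal sketch that works on an HTML string using only the standard library. It is not AutoScrap's implementation: the names `texts_in_tag` and `first_table` are hypothetical, no network request is made, and it assumes reasonably well-formed markup with explicit closing tags.

```python
from html.parser import HTMLParser


class _TagTextCollector(HTMLParser):
    """Collects the text found inside every occurrence of one tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0      # > 0 while inside the target tag
        self.buffer = []    # text pieces of the element being read
        self.results = []   # one string per outermost matching element

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self.buffer).strip())
                self.buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


class _FirstTableCollector(HTMLParser):
    """Collects the rows of the first <table> as lists of cell strings."""

    def __init__(self):
        super().__init__()
        self.table_depth = 0  # 1 means directly inside the first table
        self.done = False     # True once the first table has closed
        self.in_cell = False
        self.cell = []
        self.row = None
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "table":
            self.table_depth += 1
        elif self.table_depth == 1:
            if tag == "tr":
                self.row = []
            elif tag in ("td", "th") and self.row is not None:
                self.in_cell = True
                self.cell = []

    def handle_endtag(self, tag):
        if self.done or self.table_depth == 0:
            return
        if tag == "table":
            self.table_depth -= 1
            self.done = self.table_depth == 0
        elif self.table_depth == 1:
            if tag in ("td", "th") and self.in_cell:
                self.row.append("".join(self.cell).strip())
                self.in_cell = False
            elif tag == "tr" and self.row is not None:
                self.rows.append(self.row)
                self.row = None

    def handle_data(self, data):
        if self.in_cell and not self.done:
            self.cell.append(data)


def texts_in_tag(html, tag):
    """Hypothetical helper: text of every <tag> element in an HTML string."""
    parser = _TagTextCollector(tag.lower())
    parser.feed(html)
    parser.close()
    return parser.results


def first_table(html):
    """Hypothetical helper: first table in an HTML string as a list of lists."""
    parser = _FirstTableCollector()
    parser.feed(html)
    parser.close()
    return parser.rows
```

A real scraper also has to fetch the page, handle encodings, and cope with unclosed tags, which is the work a library like this is meant to hide.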
Usage (Python)
from autoscrap.core import get_text, extract_table
# Get all text inside <p> tags
paragraphs = get_text('https://example.com', 'p')
print(paragraphs)
# Extract the first table as a list of lists
rows = extract_table('https://example.com/table')
print(rows)
# Extract as pandas DataFrame (requires pandas)
df = extract_table('https://example.com/table', as_dataframe=True)
print(df)
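If pandas is not available, the list-of-lists result can still be post-processed with the standard library. The sketch below assumes, without confirmation from the package, that the rows are lists of strings and that the first row holds the column headers; `rows_to_records` and `rows_to_csv` are hypothetical helper names, and `sample_rows` stands in for a real `extract_table` result.

```python
import csv
import io


def rows_to_records(rows):
    """Turn [[header...], [values...], ...] into a list of dicts."""
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def rows_to_csv(rows):
    """Serialize the rows to CSV text, one line per row."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Stand-in for the output of extract_table (assumed shape).
sample_rows = [["name", "qty"], ["apple", "3"], ["pear", "5"]]
print(rows_to_records(sample_rows))
print(rows_to_csv(sample_rows))
```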
Usage (Command Line)
# Extract all <p> tag text from a page
python -m autoscrap.cli get_text https://example.com p
# Extract the first table as plain text
python -m autoscrap.cli extract_table https://example.com
# Extract the first table as a pandas DataFrame (requires pandas)
python -m autoscrap.cli extract_table https://example.com --as-dataframe
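The commands above suggest two subcommands with positional arguments and one optional flag. A command-line interface of that shape could be wired up with `argparse` roughly as follows; this is an illustrative sketch, not the actual contents of `autoscrap.cli`, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser mirroring the two commands shown above (assumed layout)."""
    parser = argparse.ArgumentParser(prog="autoscrap")
    sub = parser.add_subparsers(dest="command", required=True)

    # get_text <url> <tag>
    get_text = sub.add_parser("get_text", help="print the text inside a tag")
    get_text.add_argument("url")
    get_text.add_argument("tag")

    # extract_table <url> [--as-dataframe]
    table = sub.add_parser("extract_table", help="print the first table")
    table.add_argument("url")
    table.add_argument("--as-dataframe", action="store_true")
    return parser


args = build_parser().parse_args(
    ["extract_table", "https://example.com", "--as-dataframe"]
)
print(args.command, args.url, args.as_dataframe)
```

Note that `argparse` exposes the `--as-dataframe` flag as the attribute `as_dataframe`, which lines up with the keyword argument used in the Python API.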
Running Tests
pytest
Download files
Source Distribution
autoscrap-0.1.0.tar.gz (3.5 kB)
Built Distribution
autoscrap-0.1.0-py3-none-any.whl (3.5 kB)
File details
Details for the file autoscrap-0.1.0.tar.gz.
File metadata
- Download URL: autoscrap-0.1.0.tar.gz
- Upload date:
- Size: 3.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 61e024fe57f4723bc059be5fe2379d8469f87d7b5470f8dea4eaa1fefee06fda |
| MD5 | 937e7ffabae153d43f7f2e25e8d72607 |
| BLAKE2b-256 | 574d21f38509f6690470b6af1fc5731bb2a0c997b65b5016ead3cb0e3d7b0283 |
File details
Details for the file autoscrap-0.1.0-py3-none-any.whl.
File metadata
- Download URL: autoscrap-0.1.0-py3-none-any.whl
- Upload date:
- Size: 3.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f212577e56cfb7977b98d241f6a2994aeaf5f911342fcf2cb2808bc1c1e9f2c4 |
| MD5 | 20d183da6c3d98346e57cddd50737a3c |
| BLAKE2b-256 | 04dfbf6f8f6d9e1bd63b5eeb617296147e3a59e126dc93e60ac79fc79d718ef6 |