A small package containing code that is commonly needed when scraping
Project description
Scrap Utils
This is a small package that collects code I found myself repeating in scraping projects.
It has the following functions:
load_json(filepath, encoding=None, errors=None, parse_float=None, parse_int=None, parse_constant=None)
dump_json(data, filepath, encoding=None, errors=None, indent=4, skipkeys=False, ensure_ascii=False, separators=None, sort_keys=False)
to_csv(dataset, filepath, mode="a", encoding=None, errors=None, newline='', header=True, dialect='excel', **fmtparams)
requests_get(url, trials=0, sleep_time=30, max_try=math.inf, **requests_kwargs)
requests_post(url, trials=0, sleep_time=30, max_try=math.inf, **requests_kwargs)
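The signatures above suggest thin wrappers around the standard library and a retry loop around an HTTP call. The following is only a minimal sketch of that idea, not the package's actual code: the function bodies are assumptions, and get_with_retry, flaky_fetch and the example URL are hypothetical names introduced for illustration. The retry helper takes the fetch function as an argument so the example runs without a network connection.

```python
import json
import math
import os
import tempfile
import time


def load_json(filepath, encoding=None, errors=None, parse_float=None,
              parse_int=None, parse_constant=None):
    """Assumed behaviour: open the file and decode it with json.load."""
    with open(filepath, encoding=encoding, errors=errors) as f:
        return json.load(f, parse_float=parse_float, parse_int=parse_int,
                         parse_constant=parse_constant)


def dump_json(data, filepath, encoding=None, errors=None, indent=4,
              skipkeys=False, ensure_ascii=False, separators=None,
              sort_keys=False):
    """Assumed behaviour: open the file for writing and encode with json.dump."""
    with open(filepath, "w", encoding=encoding, errors=errors) as f:
        json.dump(data, f, indent=indent, skipkeys=skipkeys,
                  ensure_ascii=ensure_ascii, separators=separators,
                  sort_keys=sort_keys)


def get_with_retry(fetch, url, sleep_time=30, max_try=math.inf, **kwargs):
    """Hypothetical helper: call fetch(url, **kwargs) until it succeeds.

    Re-raises the last exception once max_try attempts have failed.
    """
    trials = 0
    while True:
        try:
            return fetch(url, **kwargs)
        except Exception:
            trials += 1
            if trials >= max_try:
                raise
            time.sleep(sleep_time)


# Round trip a small record through a temporary JSON file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "items.json")
    dump_json({"title": "café", "price": 3.5}, path, encoding="utf-8")
    restored = load_json(path, encoding="utf-8")

# Simulate a request that fails twice before succeeding.
calls = []


def flaky_fetch(url, **kwargs):
    calls.append(url)
    if len(calls) < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = get_with_retry(flaky_fetch, "https://example.com",
                        sleep_time=0, max_try=5)
```

With the requests library installed, requests.get or requests.post could be passed as fetch. Note that those calls raise on connection problems but not on HTTP error status codes, so a real helper may also want to check the response status.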
To-do list I'm considering:
- remove print statements
- add unit tests
- soup_get()
- driver_get()
- start_firefox()
- read_csv()
Feel free to add your contribution here.
Download files
Source Distribution
scrap-utils-0.0.1.tar.gz (3.2 kB)
Built Distribution
Hashes for scrap_utils-0.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | dfa02d84d0015bd13025a164d07a62e74bfe96a344fe880ba6a159ecbe14df08
MD5 | 4cff4046c0208b80edb7ba9c9fe729b1
BLAKE2b-256 | ce8e21f3a23558c0bb8082d7304ba4ee894893c2d031e75e58bd45a99903c72c