vk-url-scraper
Library to scrape data and especially media links (videos and photos) from vk.com URLs.
TODO
- docs online from sphinx
Quick usage
Install with:
pip install vk-url-scraper
from vk_url_scraper import VkScraper
vks = VkScraper("username", "password")
# scrape any "photo" URL
res = vks.scrape("https://vk.com/photo1_278184324?rev=1")
# scrape any "wall" URL
res = vks.scrape("https://vk.com/wall-1_398461")
# scrape any "video" URL
res = vks.scrape("https://vk.com/video-6596301_145810025")
print(res[0]["text"])  # e.g. get the text of the first result
# Every scrape* function returns a list of dicts like
{
"id": "wall_id",
"text": "text in this post",
"datetime": utc datetime of post,
"attachments": {
# if photo, video, link exists
"photo": [list of urls with max quality],
"video": [list of urls with max quality],
"link": [list of urls with max quality],
},
"payload": "original JSON response converted to dict which you can parse for more data"
}
See [docs] for all available functions.
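As a rough sketch of how the returned list could be consumed, the snippet below flattens the attachments of each result into (id, kind, url) tuples. The helper name collect_media_urls and the sample data are assumptions made for illustration; they are not part of the library, and the sample only imitates the structure documented above.

```python
from datetime import datetime, timezone


def collect_media_urls(results):
    """Flatten scrape results into (post_id, kind, url) tuples.

    Hypothetical helper: assumes each item follows the dict layout
    documented above, where attachment kinds may be missing.
    """
    media = []
    for item in results:
        attachments = item.get("attachments", {})
        for kind in ("photo", "video", "link"):
            # A kind is only present if the post has that attachment.
            for url in attachments.get(kind, []):
                media.append((item["id"], kind, url))
    return media


# Made-up sample shaped like the documented return value (not real VK data).
sample_results = [
    {
        "id": "wall-1_1",
        "text": "example post",
        "datetime": datetime(2022, 1, 1, tzinfo=timezone.utc),
        "attachments": {
            "photo": ["https://example.com/a.jpg"],
            "video": ["https://example.com/b.mp4"],
        },
        "payload": {},
    }
]

for post_id, kind, url in collect_media_urls(sample_results):
    print(post_id, kind, url)
```

In real use, sample_results would be replaced by the value returned from vks.scrape(...).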
Development
- Set up the environment with
pip install -r requirements
or
pipenv install -r requirements
- To run all checks (this also fixes style):
make run-checks
or run each step individually:
- To fix style:
black .
and
isort .
then validate lint with
flake8 .
- To do type checking:
mypy .
- To test:
pytest .
(use pytest -v --color=yes --doctest-modules tests/ vk_url_scraper/ for verbose output, colors, and to test docstring examples)
- To generate the Sphinx docs:
make docs
(edit config.py if needed)
Releasing new version
- edit version.py with the new version number
- git tag vx.y.z to tag the version
- git push origin vx.y.z to push the tag -> this will trigger the workflow and put the project on PyPI
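A small sketch of how the tag name could be derived from a version string before tagging, assuming the vx.y.z convention above means three dot-separated numbers. The function tag_for_version is a hypothetical helper, not something shipped with the project, and it takes the version as an argument rather than reading version.py.

```python
import re

# Assumed tag convention: "v" followed by three dot-separated integers.
TAG_PATTERN = re.compile(r"^v\d+\.\d+\.\d+$")


def tag_for_version(version):
    """Return the git tag for a version string, e.g. "0.1.5" -> "v0.1.5"."""
    tag = f"v{version}"
    if not TAG_PATTERN.match(tag):
        raise ValueError(f"version {version!r} does not look like x.y.z")
    return tag


print(tag_for_version("0.1.5"))
```

The resulting string is what would be passed to git tag and git push origin.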
Download files
Source Distribution
vk-url-scraper-0.1.5.tar.gz (6.5 kB)
Built Distribution
vk_url_scraper-0.1.5-py3-none-any.whl
Hashes for vk_url_scraper-0.1.5-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | de74b161e8bae153160e1a6f0521457cb38a02a91e1dc598a41aef236d966b70
MD5 | 67cae5ae6bfd7b87fb7f50626c9d755a
BLAKE2b-256 | deabd9d58e44e73faf56846ddff699325e16baa16a99f599dd694db0de449dc1