vk-url-scraper

Library to scrape data and especially media links (videos and photos) from vk.com URLs.

TODO

  • docs online from sphinx

Quick usage

Install with pip install vk-url-scraper.

from vk_url_scraper import VkScraper

vks = VkScraper("username", "password")

# scrape any "photo" URL
res = vks.scrape("https://vk.com/photo1_278184324?rev=1")

# scrape any "wall" URL
res = vks.scrape("https://vk.com/wall-1_398461")

# scrape any "video" URL
res = vks.scrape("https://vk.com/video-6596301_145810025")
print(res[0]["text"])  # e.g. get the text of the first result

# Every scrape* function returns a list of dicts shaped like:
# {
#     "id": "wall_id",
#     "text": "text in this post",
#     "datetime": utc datetime of post,
#     "attachments": {
#         # keys are present only if a photo, video, or link exists
#         "photo": [list of urls with max quality],
#         "video": [list of urls with max quality],
#         "link": [list of urls with max quality],
#     },
#     "payload": original JSON response converted to a dict, which you can parse for more data
# }

See [docs] for all available functions.
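The result shape described above can be consumed as in the following minimal sketch. It runs on hand-written sample data rather than a live scrape, so the sample values and the collect_media_urls helper are illustrative assumptions and not part of the library.

```python
from datetime import datetime, timezone

# Hypothetical sample shaped like the documented result; not real scraper output.
sample_results = [
    {
        "id": "wall-1_398461",
        "text": "text in this post",
        "datetime": datetime(2022, 1, 1, tzinfo=timezone.utc),
        "attachments": {
            "photo": ["https://example.com/a.jpg"],
            "video": ["https://example.com/b.mp4"],
        },
        "payload": {},
    }
]


def collect_media_urls(results):
    """Flatten photo and video URLs from a list of result dicts (hypothetical helper)."""
    urls = []
    for item in results:
        attachments = item.get("attachments", {})
        for kind in ("photo", "video"):
            # A key may be absent when the post has no attachment of that kind.
            urls.extend(attachments.get(kind, []))
    return urls


print(collect_media_urls(sample_results))
```

Using .get with a default keeps the loop safe for posts that carry no attachments of a given kind.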

Development

  1. Set up the environment with pip install -r requirements or pipenv install -r requirements
  2. To run all checks, use make run-checks (this also fixes style), or run them individually:
    1. To fix style: black . and isort ., then flake8 . to validate lint
    2. To do type checking: mypy .
    3. To test: pytest . (or pytest -v --color=yes --doctest-modules tests/ vk_url_scraper/ for verbose output, colors, and testing of docstring examples)
  3. Run make docs to generate the Sphinx docs; edit config.py if needed

Releasing new version

  1. Edit version.py with the new version number
  2. Run git tag vx.y.z to tag the version
  3. Run git push origin vx.y.z; this triggers the workflow that publishes the project to PyPI
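The vx.y.z tag format used in the steps above can be checked before tagging with a small sketch like the one below. The tag_for_version helper is hypothetical, and whether the release workflow enforces this format is an assumption; the sketch only builds the tag name from a version string and rejects anything that is not three dot-separated numbers.

```python
import re

# Matches tags such as v0.1.5: a literal "v" followed by three dot-separated numbers.
TAG_PATTERN = re.compile(r"v\d+\.\d+\.\d+")


def tag_for_version(version: str) -> str:
    """Return the git tag name for a version string (hypothetical helper)."""
    tag = f"v{version}"
    if TAG_PATTERN.fullmatch(tag) is None:
        raise ValueError(f"version {version!r} does not fit the x.y.z format")
    return tag


print(tag_for_version("0.1.5"))
```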

Download files

Source Distribution

vk-url-scraper-0.1.5.tar.gz (6.5 kB)

Built Distribution

vk_url_scraper-0.1.5-py3-none-any.whl (6.5 kB)
