
Scrape data from allabolag.se.

Project description

This is a scraper for collecting data from allabolag.se. It has no formal relationship with the site.

It is written and maintained for Newsworthy, but could possibly come in handy for other people as well.

Installing

pip install allabolag

Example usage

from allabolag import Company

company = Company("559071-2807")

# show all available data about the company in a raw...
print(company.raw_data)

# ...or cleaned format
print(company.data)
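Organisation numbers such as "559071-2807" are ten digits long, and the last digit is a Luhn check digit. The following is a minimal sketch for catching typos before any request is made. Note that `is_valid_orgnr` is a hypothetical helper and not part of allabolag, and that it checks only the format and checksum, not whether the company exists.

```python
import re


def is_valid_orgnr(orgnr: str) -> bool:
    """Return True if orgnr looks like NNNNNN-NNNN and passes the Luhn check.

    Hypothetical helper for illustration; not part of the allabolag package.
    """
    if not re.fullmatch(r"[0-9]{6}-?[0-9]{4}", orgnr):
        return False
    digits = [int(ch) for ch in orgnr if ch.isdigit()]
    total = 0
    # Walk from the right and double every second digit, starting with
    # the digit just left of the check digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A caller could run this check first and only construct `Company(orgnr)` when it returns True.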

You can also iterate over the list of recent liquidations.

from allabolag import iter_liquidated_companies

for company in iter_liquidated_companies(until="2019-06-01"):
  print(company)
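The `until` argument above is a fixed date string. As a sketch, the cut-off can instead be computed relative to today. This assumes `until` accepts any date in the YYYY-MM-DD form used in the example; `until_days_ago` and `print_recent_liquidations` are hypothetical names, not part of allabolag.

```python
from datetime import date, timedelta
from typing import Optional


def until_days_ago(days: int, today: Optional[date] = None) -> str:
    """Return the date `days` days before `today` as a YYYY-MM-DD string."""
    if today is None:
        today = date.today()
    return (today - timedelta(days=days)).isoformat()


def print_recent_liquidations(days: int) -> None:
    """Print liquidations back to `days` days ago.

    Assumes `until` is the cut-off date, as in the example above.
    Requires the allabolag package to be installed.
    """
    # Imported lazily so the date helper works without the package.
    from allabolag import iter_liquidated_companies

    for company in iter_liquidated_companies(until=until_days_ago(days)):
        print(company)
```

Passing `today` explicitly makes the helper easy to test; in normal use it can be left out.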

Use AWS API Gateway to rotate IP addresses

from allabolag import AWSGatewayRequestClient, Company, iter_liquidated_companies

# Pass an initialized client instance (not the class)
request_client = AWSGatewayRequestClient()
company = Company("559071-2807", request_client=request_client)

for company in iter_liquidated_companies(until="2019-06-01", request_client=request_client):
    print(company)

Developing

To run tests:

python3 -m pytest

Deployment

To deploy a new version to PyPI:

  1. Update the changelog below.

  2. Update the version in setup.py.

  3. Build: python3 setup.py sdist bdist_wheel

  4. Upload: python3 -m twine upload dist/allabolag-X.Y.Z*

…assuming you have Twine installed (pip install twine) and configured.
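The version set in setup.py and the file pattern given to Twine have to agree. A small sketch of how the two could be kept in sync is shown here. It assumes setup.py contains the version as a literal such as version="0.7.1"; `read_version` and `upload_pattern` are hypothetical helpers, not part of this project.

```python
import re


def read_version(setup_py_source: str) -> str:
    """Extract the version string from the text of a setup.py file.

    Assumes the version is written as a literal, e.g. version="0.7.1".
    """
    match = re.search(r"""version\s*=\s*["']([^"']+)["']""", setup_py_source)
    if match is None:
        raise ValueError("no literal version= argument found")
    return match.group(1)


def upload_pattern(version: str) -> str:
    """Return the dist/ glob to pass to twine for this version."""
    return f"dist/allabolag-{version}*"
```

For example, reading setup.py as text and passing it through both helpers gives the argument for the upload step.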

Changelog

  • 0.7.1 - Update request client to use an initialized client instance rather than the class

  • 0.7.0 - Add AWSGatewayRequestClient to enable requests through rotating IP addresses with AWS API Gateway

  • 0.6.1 - Bug fix: Actually use headers in requests

  • 0.6.0

    • Add headers to request

    • Minor dependency updates

    • Use logger for debugging

  • 0.5.1 - Fix return type for Company.liquidation

  • 0.5.0 - Add Company.liquidation

  • 0.4.1

    • Remove debug output

    • Don’t crash when we reach the end of a list

  • 0.4.0

    • Add option to start from page N

    • Add custom exception for missing company

  • 0.3.1 - Add cache for company data

  • 0.3.0 - Add Company.remarks (a list of remarks, e.g. “Konkurs”)

  • 0.2.1 - Make iter_list() more generic, by accepting the whole URL fragment

  • 0.2.0 - Add iter_list() function

  • 0.1.7

    • Bug fix: Add encoding for Python 2.7

  • 0.1.6

    • Fixes bug when company has remark about Svensk Handels Varningslistan

  • 0.1.5

    • Make Python 2.7 compatible.

  • 0.1.4

    • Updating _iter_liquidate_companies to handle rebuilt site.

  • 0.1.3

    • Bug fixes

  • 0.1.0

    • First version
