Scrape data from allabolag.se.

Project description

This is a scraper for collecting data from allabolag.se. It has no formal relationship with the site.

It is written and maintained for Newsworthy, but could possibly come in handy for other people as well.

Installing

pip install allabolag

Example usage

from allabolag import Company

company = Company("559071-2807")

# show all available data about the company in a raw...
print(company.raw_data)

# ...or cleaned format
print(company.data)
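
Since every lookup triggers a request to the site, it can be worth checking an organisation number locally before creating a Company. The sketch below is not part of allabolag: is_plausible_org_number is a hypothetical helper, and it assumes the usual 10-digit form (NNNNNN-NNNN, hyphen optional) whose last digit is a Luhn check digit.

```python
import re

# Assumption: six digits, an optional hyphen, then four digits.
_ORG_NUMBER = re.compile(r"^(\d{6})-?(\d{4})$")


def is_plausible_org_number(value):
    """Return True if value has the expected shape and a valid Luhn check digit."""
    match = _ORG_NUMBER.match(value.strip())
    if match is None:
        return False
    digits = [int(ch) for ch in match.group(1) + match.group(2)]
    total = 0
    for index, digit in enumerate(digits):
        # Luhn: double every other digit, starting with the first of the ten.
        if index % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

This only tells you that the string is well formed, not that the company exists on allabolag.se.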

You can also iterate over the list of recent liquidations.

from allabolag import iter_liquidated_companies

for company in iter_liquidated_companies(until="2019-06-01"):
    print(company)
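
If you only need a handful of entries, you can cap the iteration instead of walking the whole list. A minimal sketch, assuming the iterator fetches results lazily (which the name suggests but the README does not state); take is a hypothetical helper, not part of allabolag.

```python
from itertools import islice


def take(iterable, limit):
    """Return at most `limit` items from `iterable` as a list."""
    # islice stops pulling from the iterable once `limit` items are consumed.
    return list(islice(iterable, limit))


# Intended use (needs the package and network access):
# from allabolag import iter_liquidated_companies
# latest = take(iter_liquidated_companies(until="2019-06-01"), 20)
```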

Use AWS API Gateway to rotate IP addresses

from allabolag import AWSGatewayRequestClient, Company, iter_liquidated_companies

# create the client once and pass it to every call that should use it
request_client = AWSGatewayRequestClient()
company = Company("559071-2807", request_client=request_client)

for company in iter_liquidated_companies(until="2019-06-01", request_client=request_client):
    print(company)
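
When looking up several companies, the same request client can be reused for all of them. The sketch below wraps that pattern in a hypothetical helper, fetch_company_data, which is not part of allabolag. It takes the class to instantiate as an argument (company_factory) so it can be tried with a stand-in class without network access, and it assumes only what the examples above show: a constructor that accepts request_client and an object with a data attribute.

```python
def fetch_company_data(org_numbers, company_factory, request_client=None):
    """Map each organisation number to the `.data` of the object built for it."""
    # Only pass request_client when one was supplied, so the factory's
    # default is used otherwise.
    extra = {} if request_client is None else {"request_client": request_client}
    return {
        number: company_factory(number, **extra).data
        for number in org_numbers
    }


# Intended use (needs the package, network access and AWS credentials):
# from allabolag import AWSGatewayRequestClient, Company
# data = fetch_company_data(["559071-2807"], Company, AWSGatewayRequestClient())
```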

Developing

To run tests:

python3 -m pytest

Deployment

To deploy a new version to PyPI:

  1. Update Changelog below.

  2. Update version in setup.py

  3. Build: python3 setup.py sdist bdist_wheel

  4. Upload: python3 -m twine upload dist/allabolag-X.Y.Z*

…assuming you have Twine installed (pip install twine) and configured.

Changelog

  • 0.8.0

    • Handle Koncernredovisning
    • Make RequestClient Python 3.8 compatible

  • 0.7.1

    • Update request client to use an initialized client instance rather than the class

  • 0.7.0

    • Add AWSGatewayRequestClient to enable requests through rotating IPs with AWS API Gateway

  • 0.6.1

    • Bug fix: Actually use header in requests.

  • 0.6.0

    • Add headers to request
    • Minor dependency updates
    • Use logger for debugging

  • 0.5.1

    • Fix return type for Company.liquidation

  • 0.5.0

    • Add Company.liquidation

  • 0.4.1

    • Remove debug output
    • Don’t crash when we reach the end of a list

  • 0.4.0

    • Add option to start from page N
    • Add custom exception for missing company

  • 0.3.1

    • Add cache for company data

  • 0.3.0

    • Add Company.remarks (a list of remarks, e.g. “Konkurs”)

  • 0.2.1

    • Make iter_list() more generic by accepting the whole URL fragment

  • 0.2.0

    • Add iter_list() function

  • 0.1.7

    • Bug fix: Add encoding for Python 2.7

  • 0.1.6

    • Fixes bug when company has remark about Svensk Handels Varningslistan

  • 0.1.5

    • Make Python 2.7 compatible.

  • 0.1.4

    • Updating _iter_liquidate_companies to handle rebuilt site.

  • 0.1.3

    • Bug fixes

  • 0.1.0

    • First version
