A tool for parsing crime statistics reports (form 4-ЕГС) from crimestat.ru.

Project description

crimestat3000

A tool for automated parsing of Russian crime statistics reports (form 4-ЕГС) from crimestat.ru. All you need to know is which section of the report you need, and which sheets and columns. (Beware: these tend to change over the years, so check for that and, if needed, split your parsing process into several parts with different configurations.)

There's no need to download files manually -- crimestat3000 takes care of that without generating temporary files. But if you happen to have the files locally, pass the path to their location as the local_dir argument to slightly increase processing speed.

A 4-ЕГС report shows cumulative sums since the beginning of the year. By default crimestat3000 turns them into monthly values -- you can switch this off by setting the cumsum argument to True.
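To illustrate what converting cumulative sums into monthly values means, here is a minimal sketch in plain Python. It is not crimestat3000's own code: the function name cumulative_to_monthly and the sample numbers are made up for illustration, and the sketch covers a single year (the running total starts again each January).

```python
def cumulative_to_monthly(cumulative):
    """Turn within-year running totals into per-month values.

    Each month's value is the difference between its running total
    and the previous month's running total (0 before January).
    """
    monthly = []
    previous = 0
    for value in cumulative:
        monthly.append(value - previous)
        previous = value
    return monthly


# Hypothetical running totals for January..April of one year.
cumulative_counts = [120, 250, 390, 500]
monthly_counts = cumulative_to_monthly(cumulative_counts)
print(monthly_counts)  # [120, 130, 140, 110]
```

With cumsum set to True you would instead get the running totals as they appear in the report.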

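You can also optionally specify the level of detail you need. Some sheets contain information on a specific part or paragraph of a previously mentioned article -- you can drop those, keep those, or start by parsing all the sheets there are and make an informed decision later. Finally, you can set the shorten_descr argument to True to turn column names like "Строка 12: умышленное причинение легкого вреда здоровью, совершенное по мотивам политической, идеологической, расовой, национальной или религиозной ненависти или вражды либо по мотивам ненависти или вражды в отношении какой-либо социальной группы п. «б» ч. 2 ст. 115 УК РФ" (in English: "Line 12: intentional infliction of minor harm to health, committed on grounds of political, ideological, racial, national or religious hatred or enmity, or on grounds of hatred or enmity toward any social group, item «б» of part 2 of article 115 of the Criminal Code of the Russian Federation") into 115_ч2_б.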
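The shortening of column names can be pictured with a small regular-expression sketch. This is only an assumption about how such a label could be derived, not the package's actual implementation: the pattern ARTICLE_RE and the function shorten are hypothetical names, and the sketch only handles references of the form "п. «…» ч. N ст. N".

```python
import re

# Hypothetical pattern: optional paragraph ("п. «б»"), optional part ("ч. 2"),
# then the article number ("ст. 115").
ARTICLE_RE = re.compile(
    r"(?:\bп\.\s*«(?P<para>[^»]+)»\s*)?"
    r"(?:\bч\.\s*(?P<part>\d+(?:\.\d+)?)\s*)?"
    r"\bст\.\s*(?P<article>\d+(?:\.\d+)?)"
)


def shorten(description):
    """Build a short label such as '115_ч2_б' from a long column name.

    Returns the description unchanged if no article reference is found.
    """
    match = ARTICLE_RE.search(description)
    if match is None:
        return description
    label = match.group("article")
    if match.group("part"):
        label += "_ч" + match.group("part")
    if match.group("para"):
        label += "_" + match.group("para")
    return label


long_name = (
    "Строка 12: умышленное причинение легкого вреда здоровью, совершенное "
    "по мотивам политической, идеологической, расовой, национальной или "
    "религиозной ненависти или вражды либо по мотивам ненависти или вражды "
    "в отношении какой-либо социальной группы п. «б» ч. 2 ст. 115 УК РФ"
)
short_name = shorten(long_name)
print(short_name)  # 115_ч2_б
```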

Here's an example call:

import crimestat3000 as cs

kwargs = {
    # required arguments
    'first_month': '01-2016',
    'last_month' : '12-2016',
    'section'    : 2,

    # optional arguments (defaults noted in the comments)
    # 'sheets'     : 'all',         # 'all' or a list of sheets; default 'all'
    'keep'         : 'all',         # one of 'all', 'article', 'article+'; default 'all'
    'columns'      : ['C', 'E'],    # default 'C', usually the sheet's total
    'shorten_descr': True,          # default False
    # 'local_dir'  : None,          # None or a path to a local directory; default None
    # 'cumsum'     : False,         # True or False; default False
}

table_2016 = cs.parse.period(**kwargs)
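If sheets or columns differ between years, as cautioned above, one way to organize the work is to keep one configuration per period and call the parser once for each. The sketch below is an assumption about how you might structure that, not part of crimestat3000: the helper parse_in_parts is hypothetical, and the column choices in the two configurations are invented for illustration. The parser is passed in as an argument, so the helper itself does not depend on the package.

```python
def parse_in_parts(parse_period, configs):
    """Call parse_period once per configuration; return results in order."""
    return [parse_period(**config) for config in configs]


# Hypothetical configurations: the differing 'columns' values are made up
# to show a layout change between two years.
configs = [
    {'first_month': '01-2015', 'last_month': '12-2015',
     'section': 2, 'columns': ['C']},
    {'first_month': '01-2016', 'last_month': '12-2016',
     'section': 2, 'columns': ['C', 'E']},
]

# With the package installed, the call would look like this:
#   import crimestat3000 as cs
#   tables = parse_in_parts(cs.parse.period, configs)
```

How to combine the resulting tables depends on the type cs.parse.period returns, which is not stated here.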

Download files

Source Distribution

crimestat3000-0.1.3.tar.gz (17.6 kB)

Built Distribution

crimestat3000-0.1.3-py3-none-any.whl (18.1 kB)

File details

Details for the file crimestat3000-0.1.3.tar.gz.

File metadata

  • Download URL: crimestat3000-0.1.3.tar.gz
  • Size: 17.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for crimestat3000-0.1.3.tar.gz
Algorithm Hash digest
SHA256 df04168be6724f8e37c9072eafe0f5914d3ea5c333f6a98fb0b50aa36efcc42a
MD5 ef201f0deebf1d17f7d89e58549d2f3f
BLAKE2b-256 f30173daa0d0a21c15d3b057f362161792f7708d0c3cd7328478ff77d1010c0d

File details

Details for the file crimestat3000-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: crimestat3000-0.1.3-py3-none-any.whl
  • Size: 18.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.12

File hashes

Hashes for crimestat3000-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 7cafaea1b3909268ee34a9eb75f68d71167b1308e2b52038a134f50a62040b6e
MD5 015241e9dbcbff0cd04a781b8b9139e9
BLAKE2b-256 d8c18ab17e74efeab46c543ced1e2c3f2e6e7207ff2393e0d1a79058ecbc469b
