cnparser is a parser library for Corporate Number Publication Site data.
cnparser
cnparser is a Python library for loading and enriching Corporate Number Publication Site data provided by the National Tax Agency of Japan. It currently supports parsing only the latest data.
Installation
cnparser can be installed with pip.
$ python -m pip install cnparser
GitHub Install
Installing the latest version from GitHub:
$ git clone https://github.com/new-village/cnparser
$ cd cnparser
$ python setup.py install
Usage
This section demonstrates how to use this library to load and process data from the National Tax Agency's Corporate Number Publication Site.
Direct Data Loading
To download data for a specific prefecture, pass the prefecture name to the load function; it returns a DataFrame containing that prefecture's data. The prefecture name must be given in Roman characters (list of the supported prefectures).
Calling load without any arguments downloads data for all prefectures in Japan.
>>> import cnparser
>>> df = cnparser.load("Shimane")
CSV Data Loading
If you already have a downloaded CSV file, use the read_csv function. By passing the file path as an argument, you can obtain a DataFrame with headers from the CSV data.
>>> import cnparser
>>> df = cnparser.read_csv("path/to/data.csv")
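The phrase "with headers" suggests that the downloaded CSV file has no header row of its own. The following is a minimal standard-library sketch of that idea, not cnparser's implementation: the column names in `COLUMNS`, the helper `read_headerless_csv`, and the sample record are all hypothetical, and the real files contain many more columns.

```python
import csv
import io

# Hypothetical subset of column names, chosen for illustration only.
COLUMNS = ["corporate_number", "name", "post_code"]


def read_headerless_csv(stream, columns):
    """Read CSV records that lack a header row and label each field."""
    reader = csv.reader(stream)
    return [dict(zip(columns, record)) for record in reader]


# Made-up sample record standing in for a downloaded file.
sample = io.StringIO('1234567890123,"サンプル商事",6900001\n')
rows = read_headerless_csv(sample, COLUMNS)
```

Each element of `rows` is a dictionary keyed by column name, which is roughly what a labelled DataFrame row gives you.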
Data Enrichment Functionality
The enrich function standardises and transforms the values of specific fields in the loaded DataFrame.
>>> import cnparser
>>> df = cnparser.enrich(df)
By default, enrich applies every processing step. To apply only specific steps, pass their names as arguments.
>>> import cnparser
>>> df = cnparser.enrich(df, "enrich_kana" ...)
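One way to picture the "all steps by default, named steps on request" behaviour is a small registry keyed by step name. The sketch below is an assumption about the general pattern, not cnparser's code: it works on a plain dictionary instead of a DataFrame, and each stand-in step only records that it ran.

```python
from typing import Callable, Dict


def _mark(step: str) -> Callable[[dict], dict]:
    """Build a stand-in step that only records its own name."""
    def run(row: dict) -> dict:
        return {**row, "applied": row.get("applied", []) + [step]}
    return run


# Registry of step names taken from the README; the bodies are placeholders.
PROCESSORS: Dict[str, Callable[[dict], dict]] = {
    name: _mark(name)
    for name in ("enrich_kana", "enrich_kind", "enrich_post_code")
}


def enrich(row: dict, *steps: str) -> dict:
    """Apply the named steps in order, or every registered step if none are named."""
    selected = steps or tuple(PROCESSORS)
    for name in selected:
        if name not in PROCESSORS:
            raise ValueError(f"unknown processing step: {name}")
        row = PROCESSORS[name](row)
    return row
```

Calling `enrich({})` runs all three stand-in steps, while `enrich({}, "enrich_kana")` runs only the one named.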
The processes supported by the enrich function are as follows:
- enrich_kana: Adds a standardized furigana column, furigana, to the DataFrame. If furigana is NaN, the value is filled in by converting name to kana. Currently only kanji and katakana conversions are supported; alphabet conversions are not.
- enrich_kind: Adds the kind label to legal_entity.
- enrich_post_code: Adds the postcode, formatted as XXX-XXXX, to post_code.
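The three steps listed above can be sketched at the level of single values. This is an illustration under stated assumptions, not the library's implementation: the function names are hypothetical, a missing furigana is represented as `None` instead of NaN, postcodes are assumed to be the usual seven-digit Japanese form, `KIND_LABELS` is a small illustrative subset of corporate-type codes, and kanji-to-kana conversion is left out because it needs a dictionary-based converter.

```python
import re

# Illustrative subset of corporate-type codes (assumed, not exhaustive).
KIND_LABELS = {"301": "株式会社", "305": "合同会社"}

# Katakana block, which also covers the long-vowel mark and the middle dot.
KATAKANA = re.compile(r"[\u30A0-\u30FF\s]+")


def format_post_code(value):
    """Return a seven-digit postcode as NNN-NNNN; leave other input unchanged."""
    digits = re.sub(r"\D", "", value or "")
    return f"{digits[:3]}-{digits[3:]}" if len(digits) == 7 else value


def label_kind(kind):
    """Look up the label for a corporate-type code, or None if unknown."""
    return KIND_LABELS.get(kind)


def fill_furigana(name, furigana):
    """Keep an existing furigana; otherwise reuse a name written in katakana."""
    if furigana:
        return furigana
    if name and KATAKANA.fullmatch(name):
        return name
    return None  # kanji names would need a real kana converter
```

For example, `format_post_code("6900001")` gives `"690-0001"`, and `fill_furigana("トヨタ", None)` returns the katakana name unchanged.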
Download files
File details
Details for the file cnparser-1.7.0.tar.gz.
File metadata
- Download URL: cnparser-1.7.0.tar.gz
- Upload date:
- Size: 14.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aa7608b2968d228f87515f3b96555f6b5fefe19c7cab852d9f5e4955eb2e8988 |
| MD5 | 806a3861c81a54d87165d9321a1e6e9e |
| BLAKE2b-256 | ead86f2ecb469371afa6950a5b83c6d09929742a202cbc803d8efdf39250d919 |
File details
Details for the file cnparser-1.7.0-py3-none-any.whl.
File metadata
- Download URL: cnparser-1.7.0-py3-none-any.whl
- Upload date:
- Size: 14.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 87b3cef2a295e066dde1294777ae5c610a56d3c6e81916724c0311d279523809 |
| MD5 | caf4d48f20b9d8c8fff4483c76e7974c |
| BLAKE2b-256 | 702eeafe7fb6f745e3117b3071223cac3ff38c26340f804f813bd53288419fea |