
(Karvy/Kfintech/CAMS) Consolidated Account Statement (CAS) PDF parser


CASParser


Parse Consolidated Account Statement (CAS) PDF files generated by CAMS/KFINTECH.

casparser also includes a command-line tool with the following analysis features:

  • summary - print portfolio summary
  • (BETA) gains - print capital gains report (summary and detailed)
    • with an option to generate CSV files for ITR in Schedule 112A format

Installation

pip install -U casparser

To install with the faster PyMuPDF parser:

pip install -U 'casparser[fast]'

Note: Enabling this optional dependency may change the license terms that apply. See the License section for details.

Usage

import casparser
data = casparser.read_cas_pdf("/path/to/cas/file.pdf", "password")

# Get data in json format
json_str = casparser.read_cas_pdf("/path/to/cas/file.pdf", "password", output="json")

# Get transactions data in csv string format
csv_str = casparser.read_cas_pdf("/path/to/cas/file.pdf", "password", output="csv")

Data structure

{
    "statement_period": {
        "from": "YYYY-MMM-DD",
        "to": "YYYY-MMM-DD"
    },
    "file_type": "CAMS/KARVY/UNKNOWN",
    "cas_type": "DETAILED/SUMMARY",
    "investor_info": {
        "email": "string",
        "name": "string",
        "mobile": "string",
        "address": "string"
    },
    "folios": [
        {
            "folio": "string",
            "amc": "string",
            "PAN": "string",
            "KYC": "OK/NOT OK",
            "PANKYC": "OK/NOT OK",
            "schemes": [
                {
                    "scheme": "string",
                    "isin": "string",
                    "amfi": "string",
                    "advisor": "string",
                    "rta_code": "string",
                    "rta": "string",
                    "open": "number",
                    "close": "number",
                    "close_calculated": "number",
                    "valuation": {
                      "date": "date",
                      "nav": "number",
                      "value": "number"
                    },
                    "transactions": [
                        {
                            "date": "YYYY-MM-DD",
                            "description": "string",
                            "amount": "number",
                            "units": "number",
                            "nav": "number",
                            "balance": "number",
                            "type": "string",
                            "dividend_rate": "number"
                        }
                    ]
                }
            ]
        }
    ]
}
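
As a rough illustration of how the structure above can be traversed, the sketch below sums scheme valuations per AMC. It runs on a small hand-written dictionary rather than a real statement: the sample values and the helper name valuation_by_amc are hypothetical and not part of casparser. Numeric fields are converted through Decimal(str(...)) on the assumption that they may arrive as numbers or as strings, depending on the output format.

```python
from decimal import Decimal

# Hypothetical sample shaped like the documented structure; all values are made up.
data = {
    "folios": [
        {
            "folio": "1234567/89",
            "amc": "Example AMC",
            "schemes": [
                {
                    "scheme": "Example Equity Fund",
                    "close": "150.000",
                    "valuation": {"date": "2021-03-31", "nav": "20.00", "value": "3000.00"},
                    "transactions": [],
                },
                {
                    "scheme": "Example Debt Fund",
                    "close": "0.000",
                    "valuation": {"date": "2021-03-31", "nav": "10.00", "value": "0.00"},
                    "transactions": [],
                },
            ],
        }
    ]
}


def valuation_by_amc(cas):
    """Sum scheme valuations per AMC across all folios (illustrative helper)."""
    totals = {}
    for folio in cas["folios"]:
        for scheme in folio["schemes"]:
            # str() first so ints, floats, strings and Decimals are all accepted.
            value = Decimal(str(scheme["valuation"]["value"]))
            totals[folio["amc"]] = totals.get(folio["amc"], Decimal("0")) + value
    return totals


totals = valuation_by_amc(data)
print(totals)
```

With a real statement, the dictionary returned by casparser.read_cas_pdf would presumably take the place of the sample data.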

Notes:

  • Transaction type can be any value from the following
    • PURCHASE
    • PURCHASE_SIP
    • REDEMPTION
    • SWITCH_IN
    • SWITCH_IN_MERGER
    • SWITCH_OUT
    • SWITCH_OUT_MERGER
    • DIVIDEND_PAYOUT
    • DIVIDEND_REINVESTMENT
    • SEGREGATION
    • STAMP_DUTY_TAX
    • TDS_TAX
    • STT_TAX
    • MISC
  • dividend_rate is applicable only for DIVIDEND_PAYOUT and DIVIDEND_REINVESTMENT transactions.
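
The type field makes it straightforward to group or filter transactions. The following is a minimal sketch on hand-written sample transactions. The values, the sign of the redemption amount, the use of None for missing fields, and the helper names amount_by_type and dividend_transactions are assumptions made for illustration, not casparser behaviour.

```python
from collections import defaultdict

# Hypothetical transactions shaped like the documented structure; values are made up.
transactions = [
    {"date": "2020-04-01", "type": "PURCHASE_SIP", "amount": 5000.0, "dividend_rate": None},
    {"date": "2020-04-01", "type": "STAMP_DUTY_TAX", "amount": 0.25, "dividend_rate": None},
    {"date": "2020-06-15", "type": "DIVIDEND_PAYOUT", "amount": 125.0, "dividend_rate": 0.5},
    {"date": "2020-09-10", "type": "REDEMPTION", "amount": -2000.0, "dividend_rate": None},
]

# dividend_rate is documented as applicable only to these two types.
DIVIDEND_TYPES = {"DIVIDEND_PAYOUT", "DIVIDEND_REINVESTMENT"}


def amount_by_type(txns):
    """Total the amount field per transaction type."""
    totals = defaultdict(float)
    for txn in txns:
        # Treat a missing amount as zero (an assumption for this sketch).
        totals[txn["type"]] += txn["amount"] or 0.0
    return dict(totals)


def dividend_transactions(txns):
    """Keep only the transactions for which dividend_rate is meaningful."""
    return [txn for txn in txns if txn["type"] in DIVIDEND_TYPES]


totals = amount_by_type(transactions)
dividends = dividend_transactions(transactions)
print(totals)
print(dividends)
```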

CLI

casparser also comes with a command-line interface that prints a summary of the parsed portfolio and can save the parsed data in several formats.

Usage: casparser [-o output_file.json|output_file.csv] [-p password] [-s] [-a] CAS_PDF_FILE

  -o, --output FILE               Output file path. Saves the parsed data as json or csv
                                  depending on the file extension. For other extensions, the
                                  summary output is saved. [See note below]

  -s, --summary                   Print Summary of transactions parsed.
  -p PASSWORD                     CAS password
  -a, --include-all               Include schemes with zero valuation in the
                                  summary output
  -g, --gains                     Generate Capital Gains Report (BETA)
  --gains-112a ask|FY2020-21      Generate Capital Gains Report - 112A format for
                                  a given financial year - Use 'ask' for a prompt
                                  from available options (BETA)
  --force-pdfminer                Force PDFMiner parser even if MuPDF is
                                  detected

  --version                       Show the version and exit.
  -h, --help                      Show this message and exit.

CLI examples

# Print portfolio summary
casparser /path/to/cas.pdf -p password

# Print portfolio and capital gains summary
casparser /path/to/cas.pdf -p password -g

# Save parsed data as a json file
casparser /path/to/cas.pdf -p password -o pdf_parsed.json

# Save parsed data as a csv file
casparser /path/to/cas.pdf -p password -o pdf_parsed.csv

# Save capital gains transactions in csv files (pdf_parsed-gains-summary.csv and
# pdf_parsed-gains-detailed.csv)
casparser /path/to/cas.pdf -p password -g -o pdf_parsed.csv

Note: the casparser CLI treats two output file extensions specially [-o file.json / file.csv]. Any other extension falls back to the summary table.

  1. json - The complete parsed data is exported in JSON format (including investor info).
  2. csv - Summary info is exported in CSV format if the input file is a summary statement or if the summary flag (-s/--summary) is passed to the CLI. Otherwise, the full transaction history is exported. If the -g flag is present, two additional files, '{basename}-gains-summary.csv' and '{basename}-gains-detailed.csv', are created with the capital gains data.
  3. any other extension - The summary table is saved in the file.
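
The extension rules above can be modelled roughly as follows. This is only a sketch of the documented behaviour, not casparser's actual implementation. The function name planned_outputs is hypothetical, and extensions are matched exactly in lower case because the handling of upper-case extensions is not described here.

```python
from pathlib import Path


def planned_outputs(output_file, gains=False):
    """Return (export kind, file names) implied by the -o path and the -g flag."""
    path = Path(output_file)
    if path.suffix == ".json":
        kind = "json"
    elif path.suffix == ".csv":
        kind = "csv"
    else:
        # Any other extension: the summary table is saved instead.
        kind = "summary"

    files = [path.name]
    if kind == "csv" and gains:
        # '{basename}-gains-summary.csv' and '{basename}-gains-detailed.csv'
        files.append(f"{path.stem}-gains-summary.csv")
        files.append(f"{path.stem}-gains-detailed.csv")
    return kind, files


print(planned_outputs("pdf_parsed.csv", gains=True))
```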


ISIN & AMFI code support

Since v0.4.3, casparser can identify the ISIN and AMFI code for parsed schemes via the helper module casparser-isin. If the parser fails to assign an ISIN or AMFI code to a scheme, try updating the local ISIN database by running:

casparser-isin --update

If it still fails, please raise an issue at casparser-isin with the failing scheme name(s).

License

CASParser is distributed under the MIT license by default. However, enabling the optional dependency (mupdf/fast) implies the use of PyMuPDF / MuPDF, in which case the GNU GPL v3 and GNU Affero GPL v3 licenses apply. Copies of all licenses are included in this repository. (IANAL)

Resources

  1. CAS from CAMS
  2. CAS from Karvy/Kfintech
