Legal Entity Name Understanding

LENU - Legal Entity Name Understanding



LENU is a Python library that helps to understand and work with Legal Entity Names in the context of the Legal Entity Identifier (LEI) Standard (ISO 17442) as well as the Entity Legal Form (ELF) Code List Standard (ISO 20275).

The library uses Machine Learning with Transformers and scikit-learn. It provides and uses pre-trained ELF Detection models published at https://huggingface.co/Sociovestix. This code, as well as the LEI data and models, is distributed under the Creative Commons Zero 1.0 Universal license.

The project was started in November 2021 as a collaboration between the Global Legal Entity Identifier Foundation (GLEIF) and Sociovestix Labs to explore how Machine Learning can help detect the legal form (ELF Code) from a legal name.

It provides:

  • an interface to download LEI and ELF Code data from GLEIF's public website
  • an interface to train and make use of Machine Learning models to classify ELF Codes from given Legal Names
  • an interface to use pre-trained ELF Detection models published on https://huggingface.co/Sociovestix

Dependencies

LENU requires

  • python (>=3.8, <3.10)
  • scikit-learn - Provides Machine Learning functionality for token based modelling
  • transformers - Downloading and applying Neural Network Models
  • pytorch - Machine Learning Framework to train Neural Network Models
  • pandas - For reading and handling data
  • Typer - Adds the command line interface
  • requests and pydantic - For downloading LEI data from GLEIF's website
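
The Python range listed above can be checked before installing. The following is a minimal sketch, not part of LENU itself; the function name `is_supported_python` is hypothetical, and the bounds are copied from the requirement list above.

```python
import sys

# Bounds taken from the requirement listed above: python (>=3.8, <3.10).
MIN_VERSION = (3, 8)
MAX_VERSION_EXCLUSIVE = (3, 10)


def is_supported_python(version=None):
    """Return True if (major, minor) falls inside the documented range.

    `version` may be any sequence starting with (major, minor); it defaults
    to the interpreter that is running this code.
    """
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    verdict = "supported" if is_supported_python() else "outside the documented range"
    print(f"Python {current} is {verdict}")
```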

Installation

via PyPI:

pip install lenu

From GitHub:

pip install git+https://github.com/Sociovestix/lenu

Editable install from a locally cloned repository:

git clone https://github.com/Sociovestix/lenu
pip install -e lenu
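
After any of the install variants above, you may want to confirm that the package is visible to your interpreter. This sketch uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the distribution name `lenu` is taken from the `pip install lenu` command above.

```python
from importlib import metadata


def installed_version(package="lenu"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("lenu is not installed in this environment")
    else:
        print(f"lenu {version} is installed")
```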

Usage

Create folders to hold the LEI and ELF Code data and to store your models:

mkdir data
mkdir models

Download LEI data and ELF Code data into your data folder:

lenu download

Train a (default) ELF Code Classification model. An ELF Classification model is always jurisdiction-specific and is trained on Legal Names from that jurisdiction.

Examples:

lenu train DE       # Germany
lenu train US-DE    # United States - Delaware
lenu train IT       # Italy

# enable logging to see more information like the number of samples and accuracy
lenu --enable-logging train CH 
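
The jurisdiction arguments in the examples above are either a two-letter country code (`DE`) or a country code plus a subdivision (`US-DE`). As a rough sketch, the shape of such a code can be checked before calling the CLI. This is an assumption about the format based only on the examples shown; `parse_jurisdiction` and `Jurisdiction` are hypothetical names, and which jurisdictions LENU actually accepts depends on the downloaded GLEIF data, not on this check.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: two letters, optionally followed by "-" and a subdivision
# of one to three letters or digits (as in "DE" or "US-DE").
_JURISDICTION = re.compile(r"^([A-Z]{2})(?:-([A-Z0-9]{1,3}))?$")


class Jurisdiction(NamedTuple):
    country: str
    subdivision: Optional[str]


def parse_jurisdiction(code: str) -> Jurisdiction:
    """Split a jurisdiction code into country and optional subdivision."""
    match = _JURISDICTION.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a jurisdiction code of the assumed shape: {code!r}")
    return Jurisdiction(match.group(1), match.group(2))
```

A code that passes this check can still be rejected by `lenu train` if no data exists for it.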

Identify the ELF Code of a legal name by using a trained model. The tool returns the best-scoring ELF Codes.

lenu elf DE "Hans Müller KG"
#   ELF Code                  Entity Legal Form name Local name     Score
# 0     8Z6G                              Kommanditgesellschaft  0.979568
# 1     V2YH                       Stiftung des privaten Rechts  0.001141
# 2     OL20  Einzelunternehmen, eingetragener Kaufmann, ein...  0.000714

You can also use pre-trained models, which is recommended in most cases:

# Model available at https://huggingface.co/Sociovestix/lenu_DE
lenu elf Sociovestix/lenu_DE "Hans Müller KG"  
#  ELF Code      Entity Legal Form name Local name     Score
#0     8Z6G                  Kommanditgesellschaft  0.999445
#1     2HBR  Gesellschaft mit beschränkter Haftung  0.000247
#2     FR3V       Gesellschaft bürgerlichen Rechts  0.000071
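
To classify several names from a script, one option is to call the command line tool through `subprocess`. This is a sketch that relies only on the `lenu elf <model> <name>` form shown above; the helper names `build_elf_command` and `run_elf` are hypothetical, and the sketch assumes the `lenu` executable is on your `PATH`.

```python
import subprocess


def build_elf_command(model, legal_name):
    """Build the argument list for `lenu elf <model> <legal_name>`.

    `model` is either a jurisdiction with a locally trained model (e.g. "DE")
    or a pre-trained model name such as "Sociovestix/lenu_DE".
    """
    return ["lenu", "elf", model, legal_name]


def run_elf(model, legal_name):
    """Run the CLI and return its printed table as text.

    Raises subprocess.CalledProcessError if the command exits with an error
    and FileNotFoundError if `lenu` is not installed.
    """
    result = subprocess.run(
        build_elf_command(model, legal_name),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_elf("Sociovestix/lenu_DE", "Hans Müller KG"))
```

Passing the arguments as a list avoids shell quoting problems with names that contain spaces or non-ASCII characters.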

Support and Contributing

Feel free to reach out to either Sociovestix Labs or GLEIF if you need support in using this library, in utilizing LEI data in general, or in case you would like to contribute to this library in any form.
