Library to add a normalized company names column to an Excel file

About the problem:

Excel files often contain misspelled or non-normalized company names. When such a file has at least a column named organization, saving and processing its data as-is risks losing data integrity. This library processes the Excel file and adds a new column named canonical_name.
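To make the idea concrete, here is a minimal sketch of how an organization value could be mapped to a canonical name. It is not the library's actual implementation: it assumes fuzzy string matching with Python's standard difflib module, and the helper names canonical_name_for and add_canonical_column, the cutoff value, and the sample rows are all hypothetical.

```python
from difflib import get_close_matches


def canonical_name_for(organization, canonicals, cutoff=0.6):
    """Return the closest canonical name, or None when nothing is close enough.

    Hypothetical helper: the real library may use a different matching strategy.
    """
    # Compare in upper case so that casing differences do not affect the match.
    lookup = {name.upper(): name for name in canonicals}
    matches = get_close_matches(organization.upper(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


def add_canonical_column(rows, canonicals):
    """Return copies of the rows with an extra canonical_name key."""
    return [
        {**row, "canonical_name": canonical_name_for(row["organization"], canonicals)}
        for row in rows
    ]


# Illustrative data only: each dict stands in for one row of the Excel sheet.
canonicals = ["MICROSOFT TECHNOLOGY LICENSING", "MICRON TECHNOLOGY"]
rows = [
    {"organization": "Microsoft Technology Licencing"},
    {"organization": "MICRON TECHNOLGY INC"},
]
result = add_canonical_column(rows, canonicals)
for row in result:
    print(row["organization"], "->", row["canonical_name"])
```

In this sketch the two misspelled rows resolve to MICROSOFT TECHNOLOGY LICENSING and MICRON TECHNOLOGY respectively, and a value with no close match gets None instead of a guess.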

Technologies used in this library

How to use the library

This library can be used in two ways: by installing it locally through Poetry, or by installing it from the PyPI repository. Both ways are described below.

Observations: This document assumes that you're familiar with Poetry and Python virtual environments, and that you already have them installed on your machine.

First Way - Installing locally using Poetry:

1 - Run the command below to open a shell inside the virtualenv created by Poetry

poetry shell

2 - Run the command below so that Poetry installs the library locally

poetry install

Second Way - Installing using pip:

  • Run the command below to install via pip
pip install normalize-companies-names

Executing the library

1 - Run the command below to see the information about the library

normalize --help

Result:

Usage: normalize [OPTIONS]

Options:

-c, --canonicals TEXT Canonicals companies names separated by comma. (e.g 'MICROSOFT TECHNOLOGY LICENSING,MICRON TECHNOLOGY,DELTA TECHNOLOGY,ELTA TECHNOLOGY') [required]

-i, --input_filepath TEXT Path to the Excel file that need to be processed. [required]

-o, --output_filepath TEXT Path to save the processed Excel file. [required]

--help Show this message and exit.

2 - Run the command below to process a file and save the processed copy

normalize -c MICROSOFT,MICRON,ELTA,DELTA -i ./data/patent-records.xlsx -o ./data/
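When the command above has to be launched from a Python script, the argument list can be assembled programmatically. The sketch below only builds and prints the command line; the helper name build_normalize_command is hypothetical, and the options used are the ones listed in the help output above. Actually running the command requires the library to be installed.

```python
import shlex


def build_normalize_command(canonicals, input_filepath, output_filepath):
    """Build the argument list for the normalize command (hypothetical helper)."""
    return [
        "normalize",
        "-c", ",".join(canonicals),  # canonical names are passed comma-separated
        "-i", input_filepath,
        "-o", output_filepath,
    ]


command = build_normalize_command(
    ["MICROSOFT", "MICRON", "ELTA", "DELTA"],
    "./data/patent-records.xlsx",
    "./data/",
)

# shlex.join quotes arguments where needed, e.g. canonical names containing spaces.
print(shlex.join(command))

# To execute it (only works once the library is installed):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a canonical name contains spaces, such as MICRON TECHNOLOGY.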


Source Distribution

normalize_companies_names-1.0.1.tar.gz (2.9 kB)
