This package implements a multiclass cascade classifier for text classification in the context of web scraping.
Implementation of a multi-class cascade classifier in a package
General Presentation
This repository includes the second part of my second-year internship at ENSAE (National School of Statistics and Economic Administration), which I carried out at INRAE (National Research Institute for Agriculture, Food and the Environment) over a 4-month period.
More specifically, it contains the development and implementation of a cascade classifier as a functional Python package, available on PyPI.
To find out more about the code, take a look at the project Wiki!
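As a rough illustration of the cascade idea, the sketch below predicts a coarse label first and then a finer label with a model trained only on that coarse group. It assumes a two-level hierarchy, which the `secteur` and `famille` file names under `examples/` suggest but do not confirm. The classes `TokenCountClassifier` and `CascadeClassifier`, the toy scoring rule, and the sample data are all hypothetical and are not this package's API.

```python
from collections import Counter, defaultdict


class TokenCountClassifier:
    """Toy model (hypothetical): scores a label by how often the text's tokens were seen under it."""

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.counts[label].update(text.lower().split())
        return self

    def predict_one(self, text):
        tokens = text.lower().split()
        # Sorting the labels makes ties resolve deterministically.
        return max(
            sorted(self.counts),
            key=lambda label: sum(self.counts[label][t] for t in tokens),
        )


class CascadeClassifier:
    """Hypothetical two-level cascade: one coarse model, then one fine model per coarse label."""

    def fit(self, texts, coarse_labels, fine_labels):
        self.coarse_model = TokenCountClassifier().fit(texts, coarse_labels)
        # Group the training rows by coarse label so each fine model sees only its own group.
        grouped = defaultdict(lambda: ([], []))
        for text, coarse, fine in zip(texts, coarse_labels, fine_labels):
            grouped[coarse][0].append(text)
            grouped[coarse][1].append(fine)
        self.fine_models = {
            coarse: TokenCountClassifier().fit(group_texts, group_fine)
            for coarse, (group_texts, group_fine) in grouped.items()
        }
        return self

    def predict_one(self, text):
        coarse = self.coarse_model.predict_one(text)
        fine = self.fine_models[coarse].predict_one(text)
        return coarse, fine


# Made-up training data, for illustration only.
texts = [
    "fresh apple juice",
    "organic apple cider",
    "whole milk bottle",
    "cheddar cheese block",
    "steel garden shovel",
    "garden hose reel",
]
sectors = ["food", "food", "food", "food", "tools", "tools"]
families = ["drinks", "drinks", "dairy", "dairy", "digging", "watering"]

cascade = CascadeClassifier().fit(texts, sectors, families)
print(cascade.predict_one("apple juice bottle"))
```

Training one fine model per coarse label keeps each second-stage problem small, at the cost of propagating any first-stage mistake to the final prediction.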
Project hierarchy
├── LICENSE
├── README.md
├── dist/                          <- Folder containing the package
├── examples/                      <- For testing
│   ├── data/
│   │   └── merged_final.csv
│   ├── log/
│   ├── metrics/
│   │   ├── classification_report_famille.xlsx
│   │   ├── classification_report_secteur.xlsx
│   │   ├── confusion_matrix_famille.xlsx
│   │   ├── confusion_matrix_secteur.xlsx
│   │   ├── general_stats.txt
│   │   └── predictions.csv
│   ├── models/
│   │   ├── hyper-family.yaml
│   │   ├── hyper-sector.yaml
│   │   └── secteurs.joblib
│   ├── predict_out/
│   │   └── predictions.csv
│   └── train_test/
│       ├── test_split.csv
│       └── train_split.csv
├── pyproject.toml                 <- To generate the package
├── setup.cfg                      <- To generate the package
└── src/                           <- Package source code
    ├── multiclass_cascade_classifier/
    │   ├── Scripts.py
    │   ├── Skeleton.py
    │   ├── __init__.py
    │   ├── base/
    │   │   ├── ClassifierHelper.py
    │   │   ├── DataFrameNormalizer.py
    │   │   ├── DataHelper.py
    │   │   ├── DataPredicter.py
    │   │   ├── DataTrainer.py
    │   │   ├── DataVectorizer.py
    │   │   ├── FeaturesManipulator.py
    │   │   ├── HyperSelector.py
    │   │   ├── LogJournal.py
    │   │   ├── MetricsGenerator.py
    │   │   ├── PreProcessing.py
    │   │   ├── VariablesChecker.py
    │   │   ├── __init__.py
    │   │   └── variables/         <- Contains general variables
    │   │       ├── Variables.py
    │   │       └── __init__.py
    │   ├── predict.py
    │   ├── split.py
    │   ├── test.py
    │   └── train.py
    └── multiclass_cascade_classifier.egg-info/
Installation via PyPI
pip install multiclass_cascade_classifier
Note: if this doesn't work, check the file name. It may change depending on the version.
You can now import and use the modules in this package! To find out more, check out the wiki!
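To confirm that the installation succeeded, a small check with the standard library can report the installed version and whether the top-level module is found. This is a minimal sketch: the helper names `installed_version` and `is_importable` are hypothetical, and it assumes the distribution and the importable module are both named `multiclass_cascade_classifier`, as the `pip` command and the `src/` folder suggest.

```python
from importlib import metadata, util


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_importable(module_name):
    """Return True if a top-level module with this name is found on the import path."""
    return util.find_spec(module_name) is not None


# Assumed name, taken from the pip command and the src/ folder.
name = "multiclass_cascade_classifier"
print(name, "version:", installed_version(name))
print(name, "importable:", is_importable(name))
```

A version of `None` means pip did not install the distribution into the interpreter that runs this script, which is a common cause of import errors when several Python environments coexist.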
Hashes for multiclass_cascade_classifier-1.0.14.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 33d6e4a59e60f0461bdf2542b160eb4ba0b9ea7c7bdbe4f4127cfb7923e90295
MD5 | 612ee88aef1aeee2ec87ab0de34dccbc
BLAKE2b-256 | de19fd450386d584910bea1331b965b5096d90da48067ab2863fa524cd24c6eb
Hashes for multiclass_cascade_classifier-1.0.14-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 1b3c86ee5151b23bf505ca20a9d8284bd9384175fb3b9bac7eb3ccd1ddd5e360
MD5 | d63583edcb3ce6edc77149afd682416c
BLAKE2b-256 | 5a174b24c6654297260e0dddeb4b8dc4e9f78894b97304bd90c0793e52feb1b4
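A manually downloaded file can be checked against the SHA256 digests listed above with Python's `hashlib`. The following is a minimal sketch: the helper names `sha256_of` and `matches` are hypothetical, and the local path assumes the wheel was saved under its published file name in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest listed above for the 1.0.14 wheel.
EXPECTED_WHEEL_SHA256 = "1b3c86ee5151b23bf505ca20a9d8284bd9384175fb3b9bac7eb3ccd1ddd5e360"

# Assumed local download location; adjust to wherever the file was saved.
wheel = Path("multiclass_cascade_classifier-1.0.14-py3-none-any.whl")
if wheel.exists():
    print("wheel digest matches:", matches(wheel, EXPECTED_WHEEL_SHA256))
else:
    print("wheel not found at", wheel)
```

Reading in chunks keeps memory use constant regardless of file size. When installing with `pip` directly, this manual step is usually unnecessary.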