Mine implicit features using a generative feature language model.

Project description

GFLM: mine implicit features using a generative feature language model

Description

This package implements Generative Feature Language Models (GFLM) for mining implicit features from text.

Given the following input:

  • a text dataset
  • a set of predefined features

the package computes:

  • a mapping of explicit and implicit features onto the data
  • results from both the gflm_word and gflm_section algorithms
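
To make the explicit/implicit distinction concrete, the sketch below tags explicit mentions only: a section counts as an explicit mention of a feature when the feature word itself appears in it. Sections with no match are the ones an implicit-feature model has to handle. The function name, the simple tokenizer, and the sample sections are illustrative assumptions and are not part of the feature_mining API.

```python
import re

def explicit_mentions(sections, features):
    """Map each section index to the set of features named literally in it.

    Hypothetical helper for illustration; not part of feature_mining.
    """
    feature_set = {f.lower() for f in features}
    result = {}
    for idx, text in enumerate(sections):
        # Lowercase and split on non-alphanumerics to get a bag of tokens.
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        result[idx] = tokens & feature_set
    return result

# Made-up sections and features, for illustration only.
sections = [
    "The battery lasts all day.",
    "It fits easily in my pocket.",  # no feature word: "size" is only implied
    "Great sound and a bright screen.",
]
features = ["battery", "size", "sound", "screen"]

mentions = explicit_mentions(sections, features)
for idx, found in mentions.items():
    print(idx, sorted(found))
```

The second section returns an empty set even though a reader would associate it with "size"; that gap is what implicit feature mining targets.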

Install

pip install feature_mining
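
A quick way to confirm the install worked is to check that the module can be located. This is a minimal sketch using only the standard library; the helper name is an assumption, and the import name feature_mining is taken from the usage example in this README.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None

print("feature_mining installed:", is_installed("feature_mining"))
```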

Sample Usage

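Usage:
    from feature_mining import FeatureMining
    fm = FeatureMining()
    # Load the iPod dataset (full_set=False presumably selects a subset)
    fm.load_ipod(full_set=False)
    # Fit the model, then compute the predictions
    fm.fit()
    fm.predict()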

Results:
    - prediction using 'section': fm.gflm.gflm_section
    - prediction using 'word': fm.gflm.gflm_word

Display result:
    fm.section_features()
    print(fm.gflm_section_result.sort_values(by=['gflm_section'], ascending=False)[['feature', 'section_text']].head(20))
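
The display call above ranks rows by their gflm_section value and prints the top entries. The toy frame below reuses the three column names from that call to show the same pandas chain in isolation; the rows and scores are made up for illustration and are not package output.

```python
import pandas as pd

# Made-up rows; only the column names come from the README's display call.
toy = pd.DataFrame({
    "feature": ["battery", "size", "sound"],
    "section_text": ["lasts all day", "fits in my pocket", "a bit tinny"],
    "gflm_section": [0.62, 0.91, 0.35],
})

# Sort by score (highest first), keep two columns, and take the top rows.
top = (
    toy.sort_values(by=["gflm_section"], ascending=False)
       [["feature", "section_text"]]
       .head(2)
)
print(top)
```

Because the score column is dropped after sorting, the printed table shows only the feature and the section text, in ranked order.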

Package created based on the following paper

S. Karmaker Santu, P. Sondhi and C. Zhai, "Generative Feature Language Models for Mining Implicit Features from Customer Reviews", Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16, 2016.
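
For intuition, here is a simplified sketch of the kind of mixture model this line of work builds on: each word in a section is assumed to come either from a background language model or from one of several feature language models, and EM estimates the share of the section attributed to each feature. This is an assumption-laden illustration, not the paper's exact formulation and not this package's implementation; the function name, the parameters (lambda_b, iters), and the toy language models are all hypothetical.

```python
def em_section_mixture(word_counts, feature_lms, background_lm, lambda_b=0.5, iters=50):
    """Estimate pi[f]: the share of one section's words attributed to feature f.

    word_counts:   {word: count} for a single section
    feature_lms:   {feature: {word: p(word | feature)}}, held fixed
    background_lm: {word: p(word | background)}, held fixed
    """
    features = list(feature_lms)
    pi = {f: 1.0 / len(features) for f in features}  # uniform start
    eps = 1e-12  # guards against division by zero for unseen words
    for _ in range(iters):
        expected = {f: 0.0 for f in features}
        for w, c in word_counts.items():
            # E-step: how likely is w to come from each feature vs. background?
            joint = {f: pi[f] * feature_lms[f].get(w, 0.0) for f in features}
            topic_mass = sum(joint.values())
            bg = lambda_b * background_lm.get(w, 0.0)
            p_bg = bg / (bg + (1 - lambda_b) * topic_mass + eps)
            for f in features:
                p_f = joint[f] / (topic_mass + eps)
                expected[f] += c * (1 - p_bg) * p_f
        # M-step: renormalize the expected counts into mixing weights.
        total = sum(expected.values())
        if total > 0:
            pi = {f: expected[f] / total for f in features}
    return pi

# Toy language models, invented for this example.
feature_lms = {
    "battery": {"battery": 0.4, "charge": 0.3, "hours": 0.3},
    "size": {"size": 0.4, "pocket": 0.3, "small": 0.3},
}
background_lm = {"it": 0.4, "in": 0.3, "my": 0.3}
section = {"it": 1, "fits": 1, "in": 1, "my": 1, "pocket": 1}

pi = em_section_mixture(section, feature_lms, background_lm)
print(pi)
```

In this toy section the word "size" never appears, yet "pocket" pulls the mixing weight toward the size feature. A section-level decision could then threshold pi, and a word-level decision could look at the per-word posteriors instead; whether gflm_section and gflm_word work exactly this way is not confirmed here.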

Pydocs (Code Documentation)

Accessible via this link: http://htmlpreview.github.io/?https://github.com/nfreundlich/CS410_CourseProject/blob/dev/docs/feature_mining.html

(Apologies for the color scheme - it was the default)

Tutorial

See the Jupyter notebook tutorial: https://github.com/nfreundlich/CS410_CourseProject/blob/dev/tutorial.ipynb

Video presentation and tutorial

Link to YouTube: https://www.youtube.com/watch?v=mjJHkyrkxHM

Package on PyPI

https://pypi.org/project/feature-mining/

Slides

https://github.com/nfreundlich/CS410_CourseProject/blob/dev/docs/CS_410_GFLM_Slides.pdf

Known Issues

Explicit feature mentions are not removed from the GFLM word/sentence results: https://github.com/nfreundlich/CS410_CourseProject/issues/28
