Capstone Text Mining Techniques

Reason this release was yanked: broken

Project description

CAPSTONE-TEXT-MINING README

Version 0.0.7

What is it?

Capstone-Text-Mining is a Python package developed for a graduate analytics program capstone project for a specific client. The client is a data analytics consulting and software-as-a-service provider with its own Machine Learning Operations (MLOps) platform, and it specializes in paid search and marketing analytics.

The package allows an end user to apply text mining and natural language processing (NLP) techniques to analyze, evaluate, and identify superior keywords for paid search campaigns. It integrates typical text data cleanup steps, text mining and NLP approaches, and modeling techniques to evaluate the effectiveness of the keywords.

Main Features

The package provides the following capabilities:

  • Ability to load paid search campaign data, including keywords, campaign metadata, and outcome metrics (e.g., clickthrough rates)
  • Application of common text data cleaning techniques, including removal of punctuation and stopwords, tokenization, etc.
  • Application of various text mining and NLP techniques to the keywords to develop a variety of features. These methods include:
      • Topic modeling
      • Named entity recognition
      • Hand labeling of text features
      • Graph model of text
      • Sentiment analysis
  • Regression and classification model creation, comparing a baseline model (without text mining) against a model that includes text mining features, to estimate the improvement or “lift” the keyword provides in terms of an outcome metric (e.g., clickthrough rate, CTR)
  • Data visualization to identify the most impactful text-based features and to aid in the analysis and evaluation of keywords. This includes graph visualizations and SHAP plots.
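
The text cleaning steps listed above (punctuation removal, tokenization, stopword removal) can be sketched with the standard library alone. This is a minimal illustration and not the package's own API: the function name `clean_keyword` and the tiny `STOPWORDS` set are assumptions made for the example.

```python
import string

# Hypothetical, deliberately small stopword list used only for illustration.
STOPWORDS = {"a", "an", "and", "the", "for", "in", "of", "on", "to", "with"}

# Map every ASCII punctuation character to a space so that
# "running-shoes" becomes two tokens instead of one fused token.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_keyword(text):
    """Lowercase, strip punctuation, tokenize on whitespace, drop stopwords."""
    lowered = text.lower()
    no_punct = lowered.translate(_PUNCT_TO_SPACE)
    tokens = no_punct.split()
    return [tok for tok in tokens if tok not in STOPWORDS]


if __name__ == "__main__":
    print(clean_keyword("Best Running-Shoes for Men!"))
```

Replacing punctuation with spaces rather than deleting it keeps hyphenated terms as separate tokens, which tends to suit short paid search keywords. A real pipeline would likely use a fuller stopword list.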

Where to get it?

The source code is currently hosted on GitHub at: https://github.com/mfligiel/Capstone_Text_Mining

The package can be installed from the Python Package Index (PyPI):

pip install Capstone_Text_Mining

Getting Started

We recommend sourcing or creating a campaign-level paid search dataset. Each record in the dataset should represent a specific campaign and keyword option. Additional levels of granularity may be added (e.g., by week, day, or channel). In addition to the keyword, each record should include campaign variables that establish a baseline for evaluating the effectiveness of the keyword.

The text mining feature engineering functions can then be applied to the keyword(s) to generate text-based features for your dataset. The resulting data can be visualized, and models can be fitted to it to evaluate the effectiveness of the selected keywords.
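
As a rough illustration of the workflow described above, the sketch below builds a tiny campaign-level dataset, derives one text-based feature from each keyword, and compares a baseline predictor against one that also uses the text feature. Everything here is hypothetical: the `CampaignRecord` fields, the sample numbers, and the `has_price_term` feature are invented for the example, and simple group means stand in for the regression and classification models the package provides. The comparison is in-sample only, so it shows the shape of a lift calculation and is not a valid evaluation.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CampaignRecord:
    """One row per campaign and keyword (hypothetical schema)."""
    campaign: str
    keyword: str
    impressions: int
    clicks: int

    @property
    def ctr(self):
        # Clickthrough rate; guard against zero impressions.
        return self.clicks / self.impressions if self.impressions else 0.0


# Made-up sample data for illustration only.
records = [
    CampaignRecord("spring_sale", "cheap running shoes", 1000, 50),
    CampaignRecord("spring_sale", "discount running shoes", 1000, 40),
    CampaignRecord("spring_sale", "running shoes", 2000, 60),
    CampaignRecord("brand", "cheap trail shoes", 500, 30),
    CampaignRecord("brand", "trail shoes", 1500, 45),
    CampaignRecord("brand", "trail running shoes", 1000, 20),
]

PRICE_TERMS = {"cheap", "discount", "sale"}


def has_price_term(keyword):
    """Hypothetical text feature: does the keyword mention price?"""
    return any(tok in PRICE_TERMS for tok in keyword.lower().split())


def group_mean_predictions(rows, key):
    """Predict each row's CTR as the mean CTR of its group."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row.ctr)
    return [mean(groups[key(row)]) for row in rows]


def mean_abs_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


actual = [r.ctr for r in records]

# Baseline: campaign variables only (no text mining).
baseline_pred = group_mean_predictions(records, key=lambda r: r.campaign)
# Text model: campaign variables plus the keyword-derived feature.
text_pred = group_mean_predictions(
    records, key=lambda r: (r.campaign, has_price_term(r.keyword))
)

baseline_mae = mean_abs_error(actual, baseline_pred)
text_mae = mean_abs_error(actual, text_pred)
lift = baseline_mae - text_mae  # reduction in error from the text feature

if __name__ == "__main__":
    print(f"baseline MAE: {baseline_mae:.4f}")
    print(f"text MAE:     {text_mae:.4f}")
    print(f"lift:         {lift:.4f}")
```

Here lift is expressed as the reduction in mean absolute error. Other definitions, such as the relative improvement in a held-out metric, are equally reasonable; the source does not specify which one the package uses.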


Download files


Source Distribution

capstone_text_mining-0.0.7.tar.gz (15.0 kB)

Uploaded Source

Built Distribution

capstone_text_mining-0.0.7-py3-none-any.whl (18.7 kB)

Uploaded Python 3

File details

Details for the file capstone_text_mining-0.0.7.tar.gz.

File metadata

  • Download URL: capstone_text_mining-0.0.7.tar.gz
  • Upload date:
  • Size: 15.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.1 pkginfo/1.7.0 requests/2.23.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.10

File hashes

Hashes for capstone_text_mining-0.0.7.tar.gz
Algorithm    Hash digest
SHA256       5b02a2a97828c59c296cd045d3916c29afc0fee3f14e303b59ec4daa47b35f18
MD5          7b5b9a65e8d7c77fb0b49796c5f2fc8a
BLAKE2b-256  e1128e31960eb9826afe888e4139e1bb1bc238d06951a9474dd66b5092a293f8


File details

Details for the file capstone_text_mining-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: capstone_text_mining-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 18.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.1 pkginfo/1.7.0 requests/2.23.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.10

File hashes

Hashes for capstone_text_mining-0.0.7-py3-none-any.whl
Algorithm    Hash digest
SHA256       f6ba5d696d1e2d23815224751a6a7488bc502728b235fb44bf4a757f5426e68a
MD5          02e90827661d3fea87d1206f95e1dcc3
BLAKE2b-256  9f1638c5048bc4e83525244d1ad47591fcc577cdf66bf6348c90ec3e03f6ccaa

