In this introductory sample, we'll try to predict a sentiment (positive or negative) for customer reviews. In the world of machine learning, this type of prediction is known as binary classification.

Project description

sentimentdl_glove_imdb_en

Sentiment Classification of IMDb User Reviews using LSTM

An end-to-end toolkit for building a movie review sentiment classification LSTM model with Keras, including the trained model as an .h5 file. The model is trained on IMDb movie reviews.

As part of model training, we train LSTM models and explain why LSTMs are well suited to handling sequential text data.
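To illustrate why word order matters for sentiment, here is a minimal sketch (not part of the project code) that compares two made-up reviews. A bag-of-words representation, which ignores order, cannot tell them apart, while the token sequences differ. Sequence models such as LSTMs read the tokens in order, so they can use this difference.

```python
from collections import Counter


def bag_of_words(text):
    """Count word occurrences, discarding word order."""
    return Counter(text.lower().split())


# Two illustrative reviews with the same words but opposite sentiment.
review_a = "the plot was good not bad"
review_b = "the plot was bad not good"

# Identical word counts: an order-free representation sees no difference.
same_bag = bag_of_words(review_a) == bag_of_words(review_b)

# Different token sequences: an order-aware model can separate them.
same_sequence = review_a.split() == review_b.split()

print(same_bag, same_sequence)  # True False
```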

Features List

  1. Modular project structure
  2. Python package setup configured; the package is available on PyPI
  3. Connection to an MS SQL database via pyodbc (you can install the latest MS SQL driver for Python from here)
  4. Logging and exception handling to MS SQL by calling stored procedures
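As a rough sketch of how features 3 and 4 could fit together, the snippet below logs an exception by calling a stored procedure through a pyodbc connection. The procedure name `usp_log_error`, its three parameters, and the helper names are hypothetical, chosen only for illustration; the project's actual procedures are defined by the scripts in the "references" folder.

```python
import traceback

# Hypothetical stored procedure and parameter list (an assumption, not the
# project's real schema). "{CALL ...}" is the ODBC call escape syntax.
LOG_ERROR_CALL = "{CALL usp_log_error (?, ?, ?)}"


def get_connection(conn_str):
    """Open a pyodbc connection; pyodbc is imported lazily so the rest of
    this sketch can be exercised without the driver installed."""
    import pyodbc
    return pyodbc.connect(conn_str)


def log_exception(conn, source, exc):
    """Send the exception type and traceback to the logging procedure."""
    details = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    params = (source, type(exc).__name__, details)
    cursor = conn.cursor()
    cursor.execute(LOG_ERROR_CALL, params)  # parameters are bound, not formatted
    conn.commit()
    return params
```

A caller would wrap the risky step in `try`/`except Exception as exc` and pass `exc` to `log_exception` together with a connection from `get_connection`. Binding the values with `?` placeholders avoids building SQL strings by hand.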

MS SQL

You can find the create-table and stored procedure scripts in the "references" folder.

Download Datasets

You can download the required datasets from here and place them in the "data/raw" folder.

Plan of Action

  1. Load IMDb Movie Reviews dataset (50,000 reviews)
  2. Pre-process the dataset by removing special characters, numbers, etc. from user reviews, and convert the sentiment labels positive and negative to the numbers 1 and 0, respectively
  3. Import GloVe word embeddings to build an embedding dictionary, then use it to build the embedding matrix for our corpus
  4. Train LSTM models using Keras, then analyse model performance and results
  5. Finally, perform predictions on real IMDb movie reviews
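Steps 2 and 3 above can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: the function names are hypothetical, stripping HTML tags is an added assumption (the plan only mentions special characters and numbers), and the word index is assumed to start at 1 with row 0 reserved for padding, as is common with Keras tokenizers.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")          # assumption: reviews may contain HTML tags
NON_LETTER_RE = re.compile(r"[^a-z\s]")  # digits, punctuation, other symbols
SPACE_RE = re.compile(r"\s+")
LABELS = {"positive": 1, "negative": 0}


def clean_review(text):
    """Lowercase, drop tags and non-letters, and collapse whitespace."""
    text = TAG_RE.sub(" ", text.lower())
    text = NON_LETTER_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


def encode_label(label):
    """Map 'positive' to 1 and 'negative' to 0; other labels raise KeyError."""
    return LABELS[label.strip().lower()]


def load_glove(lines):
    """Build {word: vector} from GloVe-style lines: a word, then its floats."""
    embeddings = {}
    for line in lines:
        parts = line.split()
        if len(parts) > 1:
            embeddings[parts[0]] = [float(v) for v in parts[1:]]
    return embeddings


def build_embedding_matrix(word_index, embeddings, dim):
    """Row i holds the vector of the word with index i; words missing from
    the embedding dictionary (and padding row 0) stay all zeros."""
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vector = embeddings.get(word)
        if vector is not None:
            matrix[idx] = vector
    return matrix
```

For example, `clean_review("Great movie!!! 10/10 <br />Loved it.")` returns `"great movie loved it"`. With real data, `load_glove` would be given an open GloVe text file, and the resulting list of rows would typically be converted to a NumPy array before being passed as the weights of a Keras embedding layer.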

Steps to run on Windows

  • Prerequisites: Python 3.9 (ensure Python is added to PATH) + Git Client

  • Open Git CMD >> navigate to the working directory >> clone this GitHub repo (or download the project files from GitHub directly)

    git clone https://github.com/MusaddiqueHussainLabs/sentimentdl_glove_imdb_en.git  
    
  • Open Windows PowerShell >> navigate to the new working directory (cloned repo folder)

  • Run Project

    • Using Conda Environment:

      conda env create -f conda_env_win.yml   # create conda environment called 'app_env'
      conda env list                          # check if app_env is created
      conda activate app_env                  # activate app_env
      python main.py                          # run the project
      conda deactivate                        # close conda environment once done
      
    • Using PIP + Virtualenv:

      pip install virtualenv                  # install virtualenv
      virtualenv ENV                          # create a virtual environment named ENV
      .\ENV\Scripts\activate                  # activate ENV
      pip install -r .\pip_requirements.txt   # install project dependencies
      python main.py                          # run the project
      deactivate                              # close the virtual environment once done
      

Bug / Feature Request

If you find a bug (the model could not handle the input or gave undesired results), please open an issue on the GitHub repository and include the input you used and the expected result.

Download files

Source Distribution

mhlabs_sentiment-0.0.1.tar.gz (53.6 MB)

Built Distribution

mhlabs_sentiment-0.0.1-py3-none-any.whl (53.7 MB, Python 3)
