MED(Minimum Effective Dose) to NLP
MED2NLP
MED (Minimum Effective Dose) is our philosophy (Tao Da MED), with the simple goal of making NLP as easy and effective as possible.
Background
Deep learning for NLP has advanced rapidly over the last few years, and there is a wide array of excellent resources available. This project is an attempt to build on and simplify the work of many of the smartest data scientists in the industry. I am an active member of the fast.ai community, and most of this project is based on the work of other community members and their blog posts. Some of the most notable resources used were:
- Keita Kurita's article A Tutorial to Fine-Tuning BERT with Fast AI
- Dev Sharma's article Using RoBERTa with Fastai for NLP
- Thilina Rajapakse's article Simple Transformers — Multi-Class Text Classification with BERT, RoBERTa, XLNet, XLM, and DistilBERT
Getting started
Since we are going to take advantage of some of the SOTA deep learning libraries and projects, you will need to install Fastai and 🤗Transformers. I also highly recommend using Anaconda to set up a virtual environment.
Basic system setup
( Anaconda / Text_editor / git )
- add steps/requirements
1.xx setup - virtual environment
To set up your virtual environment, open your terminal and enter the following commands:
1.1 - Create a conda environment named med2conda with Python 3.7
conda create -n med2conda python=3.7
1.2 - Activate your conda environment. The command line should now show (med2conda), and anything you install is added to this environment.
conda activate med2conda
1.3 - Add PyTorch and the CUDA toolkit to your conda env (this example uses cudatoolkit=10.0; choose the version that matches your GPU driver)
conda install pytorch cudatoolkit=10.0 -c pytorch
1.4 - Add fastai to your conda env
conda install -c pytorch -c fastai fastai
1.5 - Add transformers to your conda env
conda install -c conda-forge transformers
1.6 - Add jupyter notebooks to your conda env
conda install jupyter notebook
1.6b - Some setups also need nb_conda so that Jupyter can find the conda environment
conda install nb_conda
1.7 - When you are done using the med2nlp library, exit your conda env with:
conda deactivate
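To install the med2nlp library itself, run the following while the med2conda environment is active:
pip install med2nlp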
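After the setup steps, it can help to confirm that the libraries are visible from inside the med2conda environment. The following is a minimal sketch, not part of med2nlp: the helper name find_missing is hypothetical, and the import names torch, fastai, and transformers are assumed to match the packages installed above.

```python
import importlib.util


def find_missing(packages):
    """Return the names in `packages` that cannot be found for import."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in packages if importlib.util.find_spec(name) is None]


# Assumed import names for the libraries installed in steps 1.3 to 1.5.
REQUIRED = ["torch", "fastai", "transformers"]

missing = find_missing(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

Run it with the environment active. If a package is reported missing, repeat the matching install step.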
Assumptions
Your data is in a dataframe
The best way to simplify your life is to put your data into a standard format so that you can use your tools in a consistent manner. This is a fairly common practice and is used a lot in pipelines; some people refer to it as tidy data. As a starting point, I am going to assume that your data is in a dataframe (pandas, RAPIDS, etc.).
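As an illustration of this assumption, here is a minimal pandas sketch that puts texts and labels into one dataframe with one row per example. The column names text and label and the helper to_standard_frame are hypothetical choices for this example; they are not names that med2nlp is known to require.

```python
import pandas as pd

# Hypothetical column names, chosen only for this illustration.
TEXT_COL = "text"
LABEL_COL = "label"


def to_standard_frame(texts, labels):
    """Build a one-row-per-example dataframe and drop incomplete rows."""
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    df = pd.DataFrame({TEXT_COL: texts, LABEL_COL: labels})
    # Drop rows where the text or the label is missing, then renumber the index.
    return df.dropna(subset=[TEXT_COL, LABEL_COL]).reset_index(drop=True)


df = to_standard_frame(
    ["great movie", "terrible plot", None],
    ["positive", "negative", "positive"],
)
print(df)
```

Keeping every dataset in the same two-column shape means the same downstream code can be reused without per-dataset changes.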
TODO - add more assumptions
How to use
When you want to use the med2nlp library, start your med2conda env from the terminal with:
conda activate med2conda
Make sure you are in the med2nlp folder, then start a Jupyter notebook with:
jupyter notebook
Get your data into a dataframe.
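Inside the notebook, one common way to get data into a dataframe is pandas.read_csv. The sketch below uses an in-memory string as a stand-in for a real file; the source column names review and sentiment, and the target names text and label, are assumptions made for illustration.

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk; with a real file, pass its path to pd.read_csv.
csv_text = io.StringIO(
    "review,sentiment\n"
    "great movie,positive\n"
    "terrible plot,negative\n"
)

df = pd.read_csv(csv_text)

# Rename the source columns to the (hypothetical) standard names.
df = df.rename(columns={"review": "text", "sentiment": "label"})
print(df.head())
```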