# smart_image_classifier
[![license](https://img.shields.io/github/license/mashape/apistatus.svg?maxAge=2592000)](https://github.com/anuragmishracse/smart_image_classifier/blob/master/LICENSE)
**SM**art **I**mage **C**lassifier (abbreviated as **SMIC**) is a _deep learning_ library built on top of Keras, using the TensorFlow backend, for building _image classification_ models.
Highlights of this library:
1. It searches for an optimal set of hyperparameters for the classification model
2. It works with any training set, provided the data is organized in a format the library understands
3. You can build an image classifier in under 5 lines of code
_It is advised that you use a GPU for training your models, as it might take days using a CPU._
---------------------
## Requirements
Current implementation of the library depends on the following:
```
1. tqdm
2. pandas
3. numpy
4. opencv-python
5. tensorflow
6. h5py
7. keras==2.0.9
```
Please make sure these requirements are satisfied.
They can be installed with `pip install -r requirements.txt`.
## Installation
This package can be installed by
```
pip install smic
```
and you're done.
## Train / Test data organization
The train and test images should be placed in separate directories. The required layout is:
```
/path/to/data/folder/
|---->train
|----|----->trainImage1 #Image names can be anything
|----|----->trainImage2
|----|----->trainImage3
|----|----->___so on___
|---->test
|----|----->testImage1 #Image names can be anything
|----|----->testImage2
|----|----->testImage3
|----|----->___so on___
|---->trainLabels.csv #Contains records in `"trainImage1","cat"` format
|---->testLabels.csv #Contains records in `"testImage1","dog"` format
```
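The label CSVs can be produced with a few lines of standard-library Python. The helper below is hypothetical (it is not part of SMIC); it writes records in the quoted `"trainImage1","cat"` form shown above.

```python
# Hypothetical helper (not part of SMIC): write a labels CSV in the
# `"trainImage1","cat"` record format shown in the layout above.
import csv

def write_labels(csv_path, records):
    """records: iterable of (image_name, label) pairs."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f, quoting=csv.QUOTE_ALL)  # quote every field
        for image_name, label in records:
            writer.writerow([image_name, label])

write_labels("trainLabels.csv", [("trainImage1", "cat"), ("trainImage2", "dog")])
```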
## Usage
Building an image classification model is made really easy.
```python
from smic import SMIC
clf = SMIC()
clf.prepare_train_data('/path/to/data/folder')
hyperparameters = clf.search_optimal_parameters()
clf.fit(hyperparameters, epochs = 50, batch_size=32)
```
`hyperparameters` is a dict returned by `search_optimal_parameters()` containing the hyperparameters that work best for the task at hand.
If you prefer to supply your own hyperparameters, skip the `search_optimal_parameters()` call and build the dict yourself, for example:
```python
hyperparameters = {'transfer_model' : 'vgg16', 'optimizer' : 'sgd',
'top_layers' : [['dense', 512, 'relu'],['dense', 512, 'relu']]}
```
Pass this dict as an argument to `.fit()`.
### Supported hyperparameters and values:
```
'transfer_model' : ['vgg16', 'vgg19', 'resnet50', 'inception_v3']
'optimizer': ['sgd', 'rmsprop', 'adam']
'top_layers': A list of all the layers that you want to add on top of the pre-trained CNN.
Eg: [['dense', 512, 'relu'],['dense', 512, 'relu'],...]
Here 'dense' is the type of layer, 512 is the output dimension and
'relu' is the activation function.
```
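Since `.fit()` accepts a plain dict, it is easy to catch a typo before training starts. The sketch below is not SMIC's own code; it simply checks a hyperparameters dict against the supported values listed above.

```python
# A minimal sketch (not part of SMIC) that validates a hyperparameters
# dict against the supported values listed above, before calling .fit().
SUPPORTED = {
    "transfer_model": ["vgg16", "vgg19", "resnet50", "inception_v3"],
    "optimizer": ["sgd", "rmsprop", "adam"],
}

def validate_hyperparameters(hp):
    for key in ("transfer_model", "optimizer"):
        if hp.get(key) not in SUPPORTED[key]:
            raise ValueError(f"{key} must be one of {SUPPORTED[key]}")
    for layer in hp.get("top_layers", []):
        layer_type, units, activation = layer  # e.g. ['dense', 512, 'relu']
        if layer_type != "dense" or not isinstance(units, int):
            raise ValueError(f"unsupported top layer: {layer}")
    return hp

validate_hyperparameters({"transfer_model": "vgg16", "optimizer": "sgd",
                          "top_layers": [["dense", 512, "relu"]]})
```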
## TODO
1. The library currently assumes the dataset fits into memory; use batch processing to support datasets larger than RAM.
2. Hyperparameter tuning currently searches over the optimizer, the transfer-learning CNN, and the number of top layers; add support for more hyperparameters such as momentum, dropout, and regularization.
3. Add image data augmentation, which can help when learning from smaller datasets.
4. The dataset needs to be organized in the above-mentioned format; add support for other formats such as:
```
Train
|--->Cat
|--->|----catImage1
|--->|----catImage2
|--->|----......
|--->Dog
|--->|----dogImage1
|--->|----dogImage2
|--->|----......
.............
```
5. Add tests and set up CI, so that changes can be verified before they land in the repo.
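Until TODO item 4 lands, the class-per-subfolder layout above can be converted into the flat layout plus labels CSV that the library currently expects. The converter below is a hypothetical sketch, not part of SMIC, and assumes the CSV records use the extensionless image name shown in the layout examples.

```python
# Hypothetical converter (not part of SMIC): turn a class-per-subfolder
# layout (Train/Cat/..., Train/Dog/...) into the flat image directory
# plus labels CSV that the library currently expects.
import csv
import os
import shutil

def flatten_class_folders(src_dir, dst_dir, labels_csv):
    os.makedirs(dst_dir, exist_ok=True)
    with open(labels_csv, "w", newline="") as f:
        writer = csv.writer(f, quoting=csv.QUOTE_ALL)
        for label in sorted(os.listdir(src_dir)):      # each subfolder is a class
            class_dir = os.path.join(src_dir, label)
            if not os.path.isdir(class_dir):
                continue
            for image_name in sorted(os.listdir(class_dir)):
                shutil.copy(os.path.join(class_dir, image_name),
                            os.path.join(dst_dir, image_name))
                # Assumption: records use the extensionless name, lowercase label
                writer.writerow([os.path.splitext(image_name)[0], label.lower()])
```

Note that image names must be unique across classes, since all files end up in one flat directory.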
## Note to community
1. As a community, a lot of effort is still needed to develop a systematic approach to hyperparameter tuning, so suggestions and ideas are welcome.
2. Pull requests are welcome for the TODO items above or any other improvement.
3. For any issues or queries, open a new issue or contact me by email.