Kaggler

Code for Kaggle Data Science Competitions.

# Kaggler
Kaggler is a Python package for Kaggle data science competitions. It is distributed under version 3 of the GNU General Public License.

It provides online learning algorithms for classification, inspired by Kaggle user [tinrtgu's code](http://goo.gl/K8hQBx). It uses a sparse input format that handles large sparse data efficiently. The core code is optimized for speed with Cython.

# Algorithms
The following algorithms are currently available:

## Online learning algorithms
* Stochastic Gradient Descent (SGD)
* Follow-the-Regularized-Leader (FTRL)
* Factorization Machine (FM)
* Neural Networks (NN) - with a single (NN) or two (NN_H2) ReLU hidden layers

## Batch learning algorithm
* Neural Networks (NN) - with a single hidden layer and L-BFGS optimization

# Install
## Using pip
The Python package is available on PyPI and can be installed with pip:
```
sudo pip install -U Kaggler
```

## From source code
If you want to install it from source code:
```
python setup.py build_ext --inplace
sudo python setup.py install
```

# Input Format
Input files use the libsvm-style sparse format: each line holds the target value followed by `index:value` pairs for the non-zero features.
```
1 1:1 4:1 5:0.5
0 2:1 5:1
```
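To make the format concrete, here is a minimal parsing sketch in plain Python. `parse_sparse_line` is a hypothetical helper written for illustration; it is not part of Kaggler's API, and Kaggler's own `read_sparse` may represent rows differently.

```python
def parse_sparse_line(line):
    """Parse one libsvm-style line into (target, {feature_index: value})."""
    tokens = line.split()
    target = float(tokens[0])          # first token is the label
    features = {}
    for token in tokens[1:]:
        index, value = token.split(':')
        features[int(index)] = float(value)
    return target, features


# The two example lines shown above
rows = [parse_sparse_line(line) for line in ["1 1:1 4:1 5:0.5", "0 2:1 5:1"]]
```

Only the non-zero features are stored, which is what keeps large sparse data sets compact.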

# Example
## SGD
```
from kaggler.online_model import SGD

clf = SGD(n=2**20,           # number of hashed features
          a=.01,             # learning rate
          l1=1e-6,           # L1 regularization parameter
          l2=1e-6,           # L2 regularization parameter
          interaction=True)  # use feature interaction or not

for x, y in clf.read_sparse('train.sparse'):
    p = clf.predict(x)       # predict for an input
    clf.update(x, p - y)     # update the model with the target using error

for x, _ in clf.read_sparse('test.sparse'):
    p = clf.predict(x)
```
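The loop above passes the error `p - y` to `update`. For logistic loss, the gradient with respect to weight `w_i` is `(p - y) * x_i`, so the error is all an online learner needs. The following is a minimal plain-Python sketch of that idea. `TinySGD` is a hypothetical class for illustration only: it is not Kaggler's Cython implementation, and it omits feature hashing, regularization, and interactions.

```python
import math


class TinySGD:
    """Illustrative online logistic regression (hypothetical, not Kaggler's SGD)."""

    def __init__(self, n, a=0.01):
        self.a = a            # learning rate
        self.w = [0.0] * n    # one weight per feature index

    def predict(self, x):
        # x is a list of (index, value) pairs for the non-zero features
        z = sum(self.w[i] * v for i, v in x)
        z = max(min(z, 35.0), -35.0)   # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, e):
        # e = p - y; for log loss, d(loss)/d(w_i) = e * x_i
        for i, v in x:
            self.w[i] -= self.a * e * v


clf = TinySGD(n=8, a=0.5)
data = [([(1, 1.0), (4, 1.0), (5, 0.5)], 1), ([(2, 1.0), (5, 1.0)], 0)]
for _ in range(100):                   # several passes over a toy data set
    for x, y in data:
        p = clf.predict(x)
        clf.update(x, p - y)
```

Each update touches only the weights of the non-zero features in `x`, so the cost per example scales with the number of non-zeros rather than with `n`.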

## FTRL
```
from kaggler.online_model import FTRL

clf = FTRL(n=2**20,           # number of hashed features
           a=.1,              # alpha in the per-coordinate rate
           b=1,               # beta in the per-coordinate rate
           l1=1.,             # L1 regularization parameter
           l2=1.,             # L2 regularization parameter
           interaction=True)  # use feature interaction or not

for x, y in clf.read_sparse('train.sparse'):
    p = clf.predict(x)        # predict for an input
    clf.update(x, p - y)      # update the model with the target using error

for x, _ in clf.read_sparse('test.sparse'):
    p = clf.predict(x)
```
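To show how `a`, `b`, `l1`, and `l2` enter the model, here is a plain-Python sketch of the per-coordinate FTRL-Proximal update for logistic regression. It assumes Kaggler's `FTRL` follows that standard formulation, which the source does not spell out. `TinyFTRL` and its attributes are hypothetical names, and hashing and interactions are left out.

```python
import math


class TinyFTRL:
    """Illustrative FTRL-Proximal learner (hypothetical, not Kaggler's FTRL)."""

    def __init__(self, n, a=0.1, b=1.0, l1=1.0, l2=1.0):
        self.a, self.b, self.l1, self.l2 = a, b, l1, l2
        self.z = [0.0] * n    # accumulated adjusted gradients
        self.c = [0.0] * n    # accumulated squared gradients

    def _weight(self, i):
        # Weights are derived lazily from z; L1 zeroes out small coordinates
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        rate = (self.b + math.sqrt(self.c[i])) / self.a + self.l2
        return -(z - sign * self.l1) / rate

    def predict(self, x):
        s = sum(self._weight(i) * v for i, v in x)
        s = max(min(s, 35.0), -35.0)   # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, e):
        # e = p - y, so g is the log-loss gradient for coordinate i
        for i, v in x:
            g = e * v
            sigma = (math.sqrt(self.c[i] + g * g) - math.sqrt(self.c[i])) / self.a
            self.z[i] += g - sigma * self._weight(i)
            self.c[i] += g * g


clf = TinyFTRL(n=8, a=0.1, b=1.0, l1=0.1, l2=1.0)
data = [([(1, 1.0), (4, 1.0), (5, 0.5)], 1), ([(2, 1.0), (5, 1.0)], 0)]
for _ in range(200):                   # several passes over a toy data set
    for x, y in data:
        p = clf.predict(x)
        clf.update(x, p - y)
```

In this formulation the effective step size for coordinate `i` shrinks as its squared gradients accumulate, and any coordinate whose `z` stays within `l1` of zero has a weight of exactly zero.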

## FM
```
from kaggler.online_model import FM

clf = FM(n=int(1e5),       # number of features
         dim=4,            # size of factors for interactions
         a=.01)            # learning rate

for x, y in clf.read_sparse('train.sparse'):
    p = clf.predict(x)     # predict for an input
    clf.update(x, p - y)   # update the model with the target using error

for x, _ in clf.read_sparse('test.sparse'):
    p = clf.predict(x)
```
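The `dim` parameter sets the length of the factor vector learned for each feature. As a sketch of what a factorization machine computes, the snippet below scores one sparse row using the standard second-order FM formula and checks it against the naive pairwise sum. The function names, weights, and factor values are made up for illustration, and Kaggler's internals may differ (for classification, a sigmoid would typically be applied to this raw score).

```python
def fm_score(x, w0, w, v):
    """Raw FM score. x: list of (index, value); w: linear weights; v[i]: factors of feature i."""
    dim = len(v[0])
    score = w0 + sum(w[i] * val for i, val in x)
    for f in range(dim):
        # Pairwise term per factor: 0.5 * ((sum v x)^2 - sum (v x)^2)
        s = sum(v[i][f] * val for i, val in x)
        s_sq = sum((v[i][f] * val) ** 2 for i, val in x)
        score += 0.5 * (s * s - s_sq)
    return score


def fm_score_naive(x, w0, w, v):
    """Same score via the explicit sum over feature pairs, for comparison."""
    score = w0 + sum(w[i] * val for i, val in x)
    for a in range(len(x)):
        for b in range(a + 1, len(x)):
            (i, xi), (j, xj) = x[a], x[b]
            score += sum(vi * vj for vi, vj in zip(v[i], v[j])) * xi * xj
    return score


n, dim = 6, 2
w = [0.1 * i for i in range(n)]                       # made-up linear weights
v = [[0.1 * (i + 1), -0.05 * i] for i in range(n)]    # made-up factor vectors
x = [(1, 1.0), (4, 1.0), (5, 0.5)]

fast = fm_score(x, 0.0, w, v)
slow = fm_score_naive(x, 0.0, w, v)
```

The first form needs time linear in the number of non-zero features times `dim`, while the naive form is quadratic in the number of non-zeros.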

## NN with a single hidden layer
```
from kaggler.online_model import NN

clf = NN(n=int(1e5),       # number of features
         h=16,             # number of hidden units
         a=.1,             # learning rate
         l2=1e-6)          # L2 regularization parameter

for x, y in clf.read_sparse('train.sparse'):
    p = clf.predict(x)     # predict for an input
    clf.update(x, p - y)   # update the model with the target using error

for x, _ in clf.read_sparse('test.sparse'):
    p = clf.predict(x)
```
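As a rough picture of what `h` hidden units mean, the sketch below runs the forward pass of a network with one ReLU hidden layer and a sigmoid output on a sparse row. It is an assumption-laden illustration, not Kaggler's implementation: `nn_predict`, the weight layout, the random initialization, and the omission of bias terms are all choices made here for brevity.

```python
import math
import random


def nn_predict(x, w1, w2):
    """x: list of (index, value); w1[j][i]: input i -> hidden j; w2[j]: hidden j -> output."""
    hidden = []
    for j in range(len(w1)):
        z = sum(w1[j][i] * v for i, v in x)   # only non-zero inputs contribute
        hidden.append(max(z, 0.0))            # ReLU activation
    out = sum(w2[j] * hidden[j] for j in range(len(hidden)))
    return 1.0 / (1.0 + math.exp(-out))       # sigmoid output


random.seed(0)
n, h = 8, 4
w1 = [[random.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(h)]
w2 = [random.uniform(-0.1, 0.1) for _ in range(h)]
p = nn_predict([(1, 1.0), (4, 1.0), (5, 0.5)], w1, w2)
```

Because the input is sparse, the hidden-layer sums only visit the non-zero features of each row.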

# Package Documentation
Package documentation is available [here](http://pythonhosted.org//Kaggler).

# Release history
* 0.9.15: Mar 6, 2022
* 0.9.14: Mar 5, 2022
* 0.9.13: Jun 12, 2021
* 0.9.12: Jun 12, 2021
* 0.9.11: Jun 10, 2021
* 0.9.10: Jun 8, 2021
* 0.9.9: Jun 4, 2021
* 0.9.8: Jun 2, 2021
* 0.9.7: Jun 1, 2021
* 0.9.6: May 15, 2021
* 0.9.5: May 18, 2021
* 0.9.4: May 2, 2021
* 0.9.3: May 1, 2021
* 0.9.2: May 1, 2021
* 0.9.1: May 1, 2021
* 0.9.0: Apr 29, 2021
* 0.8.13: Apr 15, 2021
* 0.8.12: Oct 15, 2020
* 0.8.11: Mar 30, 2020
* 0.8.10: Mar 17, 2020
* 0.8.9: Jan 21, 2020
* 0.8.8: Dec 11, 2019
* 0.8.7: Oct 9, 2019
* 0.8.6: Oct 3, 2019
* 0.8.5: Sep 30, 2019
* 0.8.4: Sep 25, 2019
* 0.8.3: Sep 25, 2019
* 0.8.2: Aug 5, 2019
* 0.8.1: Aug 3, 2019
* 0.8.0: Aug 3, 2019
* 0.7.0: May 17, 2019
* 0.6.9: Apr 26, 2019
* 0.6.8: Apr 9, 2019
* 0.6.7: Apr 9, 2019
* 0.6.6: Apr 9, 2019
* 0.6.5: Apr 9, 2019
* 0.6.4: Mar 16, 2019
* 0.6.3: Jan 2, 2019
* 0.6.2: Dec 22, 2018
* 0.6.1: Jul 14, 2018
* 0.6.0: Jun 28, 2018
* 0.5.2: Mar 14, 2017
* 0.5.1: Mar 14, 2017
* 0.5.0: Jan 12, 2017
* 0.4.4: Nov 18, 2016
* 0.4.3: Oct 22, 2016
* 0.4.0: Sep 12, 2015
* 0.3.8: Apr 17, 2015
* 0.3.7: Feb 15, 2015 (this version)
* 0.3.6: Feb 12, 2015
* 0.3.5: Feb 12, 2015
* 0.3.4: Feb 12, 2015
* 0.3.3: Feb 11, 2015
* 0.3.2: Feb 11, 2015
* 0.3.1: Feb 10, 2015
* 0.3.dev (pre-release): Feb 10, 2015
* 0.2.0: Jan 29, 2015
* 0.1.1: Sep 25, 2014
* 0.1.0: Jul 22, 2014

# Download files
Source distribution: Kaggler-0.3.7.tar.gz (25.4 kB), uploaded Feb 15, 2015.

Hashes for Kaggler-0.3.7.tar.gz:

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `35f0a360139e4c43810c1909b429ddc0388b3013a764c19f4bb1aedf569358a5` |
| MD5 | `48d397ae65e9e64e7ae259520226a4e5` |
| BLAKE2b-256 | `cbdf64ed21d423fa72822b6cc5b067e1929857f8553edb1063527a4c68843366` |