Combination Dependent Learning to Rank

# cdl2r
Combination Dependent Learning to Rank (組合せ依存型ランキング学習).

## requirements
- Python 3.6.x or 3.7.x

## dependencies
- NumPy
- Pandas

## installation
```shell
$ pip install cdl2r
```

## usage
### 1. prepare your dataset
The dataset format is similar to the SVM-rank format.
The difference is that every line must also specify an `eid`.
A line is defined as follows.
The `|` symbol means `OR` (so `<str>|<int>` means the value must be either a str or an int).

```txt
<line> .=. <label> qid:<qid> eid:<eid> <features>#<comments>

<label> .=. <float>|<str as a class>
<qid> .=. <str>|<int>
<eid> .=. <str>|<int>
<features> .=. <dim>:<value>
<dim> .=. <0 or Natural Number>
<value> .=. <float>
<comments> .=. <Any text will do>
```

Here is an example.

```txt
0.5 qid:1 eid:x 1:0.1 2:-0.2 3:0.3#comment A
0.0 qid:1 eid:y 1:-0.1 2:0.2 4:0.4
-0.5 qid:1 eid:z 2:-0.2 3:0.3 4:-0.4#comment C
0.5 qid:2 eid:y 1:0.1 2:-0.2 3:0.3
0.0 qid:2 eid:z 1:-0.1 2:0.2 4:0.4
-0.5 qid:2 eid:w 2:-0.2 3:0.3 4:-0.4#comment E
```
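
To make the line format concrete, here is a minimal sketch of a parser for a single line. It is not the parser used by `cdl2r`. The names `Record` and `parse_line` are hypothetical, `qid` and `eid` are kept as strings, a label that looks numeric is read as a float, and features are assumed to be whitespace-separated `<dim>:<value>` pairs as in the example above.

```python
from typing import Dict, NamedTuple, Optional, Union


class Record(NamedTuple):
    """One parsed dataset line (hypothetical helper type, not part of cdl2r)."""
    label: Union[float, str]
    qid: str
    eid: str
    features: Dict[int, float]
    comment: Optional[str]


def parse_line(line: str) -> Record:
    # Everything after the first '#' is treated as a free-text comment.
    body, sep, comment = line.rstrip("\n").partition("#")
    tokens = body.split()
    if len(tokens) < 3:
        raise ValueError("expected '<label> qid:<qid> eid:<eid> <features>'")
    raw_label, qid_token, eid_token, *feature_tokens = tokens

    # A label is either a float or a class name given as a string.
    try:
        label: Union[float, str] = float(raw_label)
    except ValueError:
        label = raw_label

    if not qid_token.startswith("qid:") or not eid_token.startswith("eid:"):
        raise ValueError("qid and eid must follow the label, in that order")

    # Each feature token has the form <dim>:<value>.
    features: Dict[int, float] = {}
    for token in feature_tokens:
        dim, _, value = token.partition(":")
        features[int(dim)] = float(value)

    return Record(label, qid_token[4:], eid_token[4:], features,
                  comment if sep else None)


record = parse_line("0.5 qid:1 eid:x 1:0.1 2:-0.2 3:0.3#comment A")
```

For the first example line, `record.eid` is `'x'`, `record.features` maps dimensions 1, 2 and 3 to their values, and `record.comment` is `'comment A'`.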

### 2. load your dataset
```python
from cdl2r.dataset import load_data

# loading dataset as a DataFrame object
data_path = '/path/to/dataset'
n_dimensions = 10
train = load_data(data_path, n_dimensions)
# train.columns
# >>> Index(['label', 'qid', 'eid', 'features'], dtype='object')
```

### 3. fit the model
```python
from cdl2r.models import CDFMRegressor

# define your model
model = CDFMRegressor(n_factors=8, n_iterations=300, init_eta=1e-2)
# fitting, printing out epoch losses if verbose is True
model.fit(train, verbose=True)
```

### 4. save the model
```python
import pickle

with open('/path/to/file.pkl', mode='wb') as fp:
    pickle.dump(model, fp)
```
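
Restoring a saved model is the mirror image of the step above. The following is a minimal sketch of a save and load round trip using only the standard `pickle` module. The helpers `save_model` and `load_model` are hypothetical names, and a plain dict stands in for a fitted model so that the snippet runs without `cdl2r` installed.

```python
import os
import pickle
import tempfile


def save_model(model, path):
    # Same call as in the step above: binary write mode plus pickle.dump.
    with open(path, mode='wb') as fp:
        pickle.dump(model, fp)


def load_model(path):
    # Binary read mode plus pickle.load restores the saved object.
    with open(path, mode='rb') as fp:
        return pickle.load(fp)


# A plain dict stands in for a fitted model here (assumption for illustration).
stand_in_model = {'n_factors': 8, 'n_iterations': 300}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, 'model.pkl')
    save_model(stand_in_model, model_path)
    restored = load_model(model_path)
```

Only unpickle files you trust, since `pickle.load` can execute arbitrary code while loading.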

### 5. make predictions
```python
# loading test dataset (replace the placeholder path with your own)
test_path = '/path/to/test_dataset'
test = load_data(test_path, n_dimensions)
pred = model.predict(test)
# pred.columns
# >>> Index(['pred_label', 'qid', 'eid', 'features'], dtype='object')
```
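
Once predictions are available, a typical next step is to order the entities within each query. The sketch below does this in plain Python on `(qid, eid, score)` triples. The helper `rank_by_query` is hypothetical, the scores are made up for illustration, and it assumes that a higher `pred_label` means a better rank.

```python
from collections import defaultdict


def rank_by_query(rows):
    """Group (qid, eid, score) triples by qid and sort each group by score.

    Assumes that a higher score means a better rank.
    """
    grouped = defaultdict(list)
    for qid, eid, score in rows:
        grouped[qid].append((eid, score))
    return {
        qid: [eid for eid, _ in sorted(items, key=lambda item: item[1], reverse=True)]
        for qid, items in grouped.items()
    }


# Made-up scores for illustration only, not real model output.
rows = [
    (1, 'x', 0.4), (1, 'y', 0.1), (1, 'z', -0.3),
    (2, 'y', 0.2), (2, 'z', 0.6),
]
rankings = rank_by_query(rows)
```

If `pred` is a DataFrame with the columns shown above, the triples could be obtained with `pred[['qid', 'eid', 'pred_label']].itertuples(index=False, name=None)`.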

## development
### build Cython modules
```shell
$ python setup.py build_ext --inplace
```

### profiling
```shell
# decorate each function you want to profile with `@profile` in your script.
$ kernprof -l -v <script>.py
```
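
As an illustration, a script prepared for `kernprof -l` might look like the sketch below. `kernprof -l` injects `profile` into builtins at run time. The fallback at the top is an optional idiom that lets the same script run without kernprof. The function `sum_of_pairwise_products` is a hypothetical workload, not part of `cdl2r`.

```python
import builtins

# `kernprof -l` adds `profile` to builtins. When the script is run directly,
# fall back to a no-op decorator so that `@profile` does not raise NameError.
if not hasattr(builtins, 'profile'):
    def profile(func):
        return func


@profile
def sum_of_pairwise_products(values):
    # Hypothetical workload: sum of products over all unordered pairs.
    total = 0.0
    for i, left in enumerate(values):
        for right in values[i + 1:]:
            total += left * right
    return total


if __name__ == '__main__':
    print(sum_of_pairwise_products([1.0, 2.0, 3.0]))
```

Run under `kernprof -l -v`, the per-line timings for the decorated function are printed after the script finishes.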

### pylint
- max-line-length: 130
- disable snake-case

### release
```shell
# build
$ python setup.py bdist_wheel

# testing upload
$ twine upload --repository testpypi dist/<cdl2r-version-pkg>
$ pip install --index-url https://test.pypi.org/simple/ <cdl2r-version-pkg>

# upload
$ twine upload --repository pypi dist/<cdl2r-version-pkg>
```

