A set of learning-to-rank algorithms.

FastRank

My most frequently used learning-to-rank algorithms, ported to Rust for efficiency.

Python Usage

pip install fastrank

Configuring Models

from fastrank import CModel, CDataset, CQRel, TrainRequest

RANDOM_FOREST = False

if RANDOM_FOREST:
    train_request = TrainRequest.random_forest()
    params = train_request.params
    params.num_trees = 200
    params.feature_sampling_rate = 0.5
    params.instance_sampling_rate = 0.5
else:
    train_request = TrainRequest.coordinate_ascent()
    params = train_request.params
    params.init_random = True
    params.normalize = True
    
# No matter what, deterministic seed and limit print statements.
params.quiet = True
params.seed = 16710601535089033473

Loading SVMrank/Ranklib files:

import os

query_dir = os.path.join(os.environ['HOME'], 'code', 'queries', 'trec_news')
qrels = CQRel.load_file(os.path.join(query_dir, 'newsir18-entity.qrel'))

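# NOTE: data_dir is not defined in this snippet; point it at the directory holding your feature files.
dataset = CDataset.open_ranksvm(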
    os.path.join(data_dir, "ent.ranklib.gz"),
    os.path.join(data_dir, "feature_names.json"),
)
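For readers unfamiliar with the SVMrank/Ranklib text format, the sketch below parses a single line of it in pure Python. It does not use or mirror fastrank's own loader; the `Instance` type and `parse_ranklib_line` function are hypothetical names introduced only for illustration, and the sample line is made up. It assumes the commonly used layout `<label> qid:<id> <index>:<value> ... # comment`, with feature indices starting at 1.

```python
from typing import Dict, NamedTuple, Optional


class Instance(NamedTuple):
    """One judged (query, document) pair; a hypothetical type for this sketch."""
    label: float
    qid: str
    features: Dict[int, float]
    comment: Optional[str]


def parse_ranklib_line(line: str) -> Instance:
    """Parse one line such as '2 qid:q1 1:0.5 3:1.25 # doc42'."""
    # Everything after the first '#' is a free-form comment (often a document id).
    body, _, comment = line.partition("#")
    tokens = body.split()
    if len(tokens) < 2 or not tokens[1].startswith("qid:"):
        raise ValueError("expected '<label> qid:<id> ...': %r" % line)
    label = float(tokens[0])
    qid = tokens[1][len("qid:"):]
    # Features are sparse: indices that are absent are simply not stored.
    features = {}
    for token in tokens[2:]:
        index, _, value = token.partition(":")
        features[int(index)] = float(value)
    return Instance(label, qid, features, comment.strip() or None)


example = parse_ranklib_line("2 qid:q1 1:0.5 3:1.25 # doc42")
print(example)
```

Keeping the features in a dictionary keyed by the 1-based index makes it easy to see why a feature named "0" may show up as unused in files written by such tools.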

Train & Evaluate Models

import numpy as np
from sklearn.model_selection import KFold

EVAL_MEASURE = "NDCG@5"

models = []
evals = []
# random_state only applies when shuffle=True; recent scikit-learn versions reject it otherwise.
folds = KFold(n_splits=5, shuffle=False)
features = dataset.feature_names()
features.remove("0") # ranksvm starts at 1 for many tools
queries = sorted(dataset.queries())

# Keep only the selected features before splitting by query.
fdataset = dataset.subsample_feature_names(features)

for train_idx, test_idx in folds.split(queries):
    train_queries = [queries[i] for i in train_idx]
    test_queries = [queries[i] for i in test_idx]
    train = fdataset.subsample_queries(train_queries)
    test = fdataset.subsample_queries(test_queries)
    model = train.train_model(train_request)
    eval_dict = test.evaluate(model, EVAL_MEASURE, qrels)
    evals.append(eval_dict)
    models.append(model)
    print("  NDCG@5 = %1.3f" % np.mean(list(eval_dict.values())))
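As a rough guide to what a measure like "NDCG@5" computes, here is a minimal pure-Python sketch. It assumes the exponential-gain variant (gain `2^rel - 1`, discount `log2(rank + 1)`); the README does not say which variant fastrank implements, so treat this as an illustration rather than a description of the library. The function names and the relevance labels are hypothetical.

```python
import math
from typing import Dict, Sequence


def dcg_at_k(rels: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top k relevance labels."""
    # Position i is 0-based, so rank 1 is divided by log2(2) == 1.
    return sum((2.0 ** r - 1.0) / math.log2(i + 2) for i, r in enumerate(rels[:k]))


def ndcg_at_k(ranked_rels: Sequence[float], k: int) -> float:
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(ranked_rels, reverse=True), k)
    if ideal == 0.0:
        return 0.0  # no relevant documents for this query
    return dcg_at_k(ranked_rels, k) / ideal


# Made-up relevance labels, listed in the order a model ranked the documents.
per_query: Dict[str, float] = {
    "q1": ndcg_at_k([2, 1, 0], k=5),  # already in ideal order
    "q2": ndcg_at_k([0, 1, 2], k=5),  # worst possible order
}
mean_ndcg = sum(per_query.values()) / len(per_query)
print("  NDCG@5 = %1.3f" % mean_ndcg)
```

Averaging the per-query scores at the end corresponds to the mean taken over the evaluation dictionary in the loop above.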

Code Structure

fastrank

The core algorithms and data structures are implemented in Rust.

cfastrank

A very thin layer of Rust code provides a C-compatible API. A manylinux version is published to PyPI. Don't install this manually; install the fastrank package and let it be pulled in as a dependency.

pyfastrank

A pure-Python library accesses the core algorithms using cffi via cfastrank. A version is published to PyPI.

Download files

Files for fastrank, version 0.4.1

Filename                          Size    File type  Python version
fastrank-0.4.1-py3-none-any.whl   8.7 kB  Wheel      py3
fastrank-0.4.1.tar.gz             7.9 kB  Source     None
