# mobileclick
mobileclick provides baseline methods and scripts for the NTCIR-12 MobileClick-2 task: http://www.mobileclick.org/

[![Circle CI](https://circleci.com/gh/mpkato/mobileclick.svg?&style=shield)](https://circleci.com/gh/mpkato/mobileclick)
[![Coverage Status](https://coveralls.io/repos/mpkato/mobileclick/badge.svg)](https://coveralls.io/r/mpkato/mobileclick)
[![Code Climate](https://codeclimate.com/github/mpkato/mobileclick/badges/gpa.svg)](https://codeclimate.com/github/mpkato/mobileclick)

## Requirements

Minimum requirements:
- Python 2.7
- NumPy
- nltk
- BeautifulSoup

Requirements for Japanese runs:
- mecab-python


## Installation
Install mobileclick via PyPI:

```
$ pip install mobileclick
```

You can also install mobileclick from the source code:

```
$ python setup.py install
```

Install MeCab and mecab-python (needed only for Japanese runs):
```
$ sh mecab_install.sh
```

## Installed scripts
Download the MobileClick data (sign up at http://www.mobileclick.org/ first):
```
$ mobileclick_download_data
Please input the email and password for http://www.mobileclick.org
Email: <Your email address>
Password: <Your password>
```

Replicate the random iUnit ranking baseline:
```
$ mobileclick_lang_model_ranking_method --runname random_ranking_method --query data/MC2-training/en/1C2-E-queries.tsv --iunit data/MC2-training/en/1C2-E-iunits.tsv --indexdir data/MC2-training-documents/1C2-E.INDX --pagedir data/MC2-training-documents/1C2-E.HTML
```

Replicate the LM-based iUnit ranking baseline:
```
$ mobileclick_lang_model_ranking_method --runname lang_model_ranking_method --query data/MC2-training/en/1C2-E-queries.tsv --iunit data/MC2-training/en/1C2-E-iunits.tsv --indexdir data/MC2-training-documents/1C2-E.INDX --pagedir data/MC2-training-documents/1C2-E.HTML --language english
```
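
For intuition about what an LM-based ranking does, here is a minimal sketch of unigram query-likelihood scoring with add-one smoothing. It is not the package's implementation: the function name `lm_score`, the whitespace tokenization, the smoothing choice, and the sample texts are all assumptions made for illustration.

```python
from __future__ import division

import math
from collections import Counter


def lm_score(iunit_text, document_texts):
    """Log-likelihood of an iUnit under a unigram model of the documents.

    Hypothetical helper: add-one smoothing over whitespace tokens.
    """
    doc_tokens = [t for d in document_texts for t in d.lower().split()]
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    vocab_size = len(counts) + 1  # one extra slot for unseen words
    score = 0.0
    for token in iunit_text.lower().split():
        score += math.log((counts[token] + 1) / (total + vocab_size))
    return score


# Made-up documents and iUnits, only to exercise the function.
docs = ["the museum opens at nine", "museum tickets cost ten dollars"]
iunits = ["parking is free", "museum opens at nine"]

# Higher (less negative) scores come first.
ranked = sorted(iunits, key=lambda u: lm_score(u, docs), reverse=True)
```

Because the score sums log-probabilities, longer iUnits are penalized; a real method might normalize by length.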

## Generate your runs
The current version supports only the iUnit ranking subtask.

### 1. Create a subclass of BaseRankingMethod

```python
# Relative import: works when this file lives in the same package
# as base_ranking_method.
from .base_ranking_method import BaseRankingMethod


class YourRankingMethod(BaseRankingMethod):
    def init(self, tasks):
        '''
        Initialization

        `tasks` is a list of Task instances.
            Task.query: Query
            Task.iunits: a list of Iunit instances
            Task.indices: a list of Index instances

        Query:
            Query.qid: Query ID
            Query.body: string of the query

        Iunit:
            Iunit.qid: Query ID
            Iunit.uid: iUnit ID
            Iunit.body: string of the iUnit

        Index (index information of a webpage in the provided document collection):
            Index.qid: Query ID
            Index.filepath: filepath of an HTML file
            Index.rank: rank in a search engine result page
            Index.title: webpage title
            Index.url: webpage url
            Index.body: summary of the webpage
        '''

    def rank(self, task):
        '''
        Output ranked pairs of an iUnit and a score

        e.g. Random ranking method
            return [(i, 0) for i in task.iunits]
        '''
        # Placeholder taken from the example above: every iUnit gets score 0.
        # Replace with your own scoring.
        return [(i, 0) for i in task.iunits]
```
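
As a slightly less trivial illustration of `rank`, the sketch below scores each iUnit by the number of words it shares with the query. To stay self-contained it does not import the package: `Query`, `Iunit`, and `Task` here are stand-in namedtuples that only mirror the attributes listed in the docstring above, and `OverlapRankingMethod` is a hypothetical name. In real use you would subclass `BaseRankingMethod` instead. Sorting the pairs by score is an assumption; the source does not say whether `rank` must return them in order.

```python
from collections import namedtuple

# Stand-ins mirroring the documented attributes (not the package's classes).
Query = namedtuple("Query", ["qid", "body"])
Iunit = namedtuple("Iunit", ["qid", "uid", "body"])
Task = namedtuple("Task", ["query", "iunits", "indices"])


class OverlapRankingMethod(object):
    """Hypothetical method: score = number of words shared with the query."""

    def init(self, tasks):
        # Nothing to precompute for this simple scorer.
        pass

    def rank(self, task):
        query_terms = set(task.query.body.lower().split())
        scored = []
        for iunit in task.iunits:
            iunit_terms = set(iunit.body.lower().split())
            scored.append((iunit, len(query_terms & iunit_terms)))
        # Highest overlap first.
        return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up task, only to exercise the method.
task = Task(
    query=Query("Q1", "tokyo tower hours"),
    iunits=[
        Iunit("Q1", "U1", "open until 11 pm"),
        Iunit("Q1", "U2", "tokyo tower opening hours 9 am"),
    ],
    indices=[],
)
ranking = OverlapRankingMethod().rank(task)
```

Each element of `ranking` is an `(iunit, score)` pair, the same shape as the random ranking example in the docstring.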

### 2. Generate a run
```python
# Task and YourRankingMethod must already be imported or defined.
tasks = Task.read(
    "data/MC2-training/en/1C2-E-queries.tsv",
    "data/MC2-training/en/1C2-E-iunits.tsv",
    "data/MC2-training-documents/1C2-E.INDX",
    "data/MC2-training-documents/1C2-E.HTML")
method = YourRankingMethod()
run = method.generate_run("YourRun", "This is your run", tasks)
run.save('./')  # writes the run file into the current directory
```

Upload "./YourRun.tsv" to http://www.mobileclick.org/ for evaluation!

## License
MIT License (see LICENSE file).

Files for mobileclick, version 0.1.1
- mobileclick-0.1.1.tar.gz (8.5 kB), source distribution
