Skip to main content

A package to mesure diversity of log files

Project description

LogDiv: A Python Module for Computing Diversity in Transaction Logs

LogDiv is a Python module for the computation of the diversity of items requested by users in transaction logs.

It takes two inputs:

  1. A log file with transactions.
  2. A file with item atributes.

Computing the diversity of items requested by users is a task of interest in many fields, such as sociology, recommender systems, e-commerce, and media studies. Check the example below.

Getting Started

Prerequisites

LogDiv requires:

  • Python
  • Numpy - Essential
  • Pandas - Essential
  • Matplotlib - Essential
  • tqdm - Optionnal: progression bar
  • Graph-tool - Optionnal: only one function requires it
$ pip install numpy
$ pip install panda
$ pip install matplotlib 
$ pip install tqdm 

Installing

To install LogDiv, you need to execute:

$ pip install logdiv

Specification

Entries format

LogDiv needs a specific format of entries to run:

  • A file describing all requests under a table format, whose fields are:
  • user ID
  • timestamp
  • requested item ID
  • referrer item ID
  • A file describing all pages visited under a table format, whose fields are:
  • item ID
  • topic
  • category

YAML file

Codes that use LogDiv are directed by a YAML file: if you want to modify entry files, or the features you want to compute, you just need to modify the YAML file, not the code itself. This file is self-explanatory.

Examples

You dispose of two examples to familiarize yourself with LogDiv:

  • Example 1 uses a short dataset to show how to use LogDiv
  • Example 2 uses a dataset of more than 500 thousands of requests to show what kind of results can be obtained

These examples (dataset, script and yaml file) can be found in datasets directory.

Example 1

The following example illustrates the entries format of the package.

user timestamp requested_item referrer_item
user1 2019-07-03 00:00:00 v1 v4
user1 2019-07-03 00:01:00 v4 v2
user1 2019-07-03 00:01:10 v4 v6
user1 2019-07-03 00:01:20 v4 v6
user1 2019-07-03 00:02:00 v6 v9
user1 2019-07-03 03:00:00 v8 v10
user1 2019-07-03 03:01:00 v8 v5
user2 2019-07-05 12:00:00 v3 v5
user2 2019-07-05 12:00:30 v5 v7
user2 2019-07-05 12:00:45 v7 v9
user2 2019-07-05 12:01:00 v9 v6
user3 2019-07-05 18:00:00 v10 v5
user3 2019-07-05 18:01:15 v10 v7
user3 2019-07-05 18:03:35 v10 v9
user3 2019-07-05 18:06:00 v7 v4
user3 2019-07-05 18:07:22 v5 v2
item class1 class2 class3
v1 Football beginner France
v2 Tennis pro England
v3 Football beginner World
v4 Tennis advanced France
v5 Rugby medium Italy
v6 Football beginner England
v7 Tennis pro Spain
v8 Football beginner France
v9 Tennis advanced World
v10 Rugby medium World

Example 2

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

logdiv-0.0.2.1.tar.gz (23.5 kB view hashes)

Uploaded Source

Built Distribution

logdiv-0.0.2.1-py3-none-any.whl (36.4 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page