pyniverse: a Python package to analyse generic results of the Zooniverse volunteers

This Python package is intended to allow Zooniverse Project Owners to quickly run some simple analysis on the classification CSVs that the Zooniverse backend allows you to export via the Data Exports page.

How to install

Download or clone the GitHub repo, then install the package into your $HOME directory:

$ cd pyniverse/
$ ls
LICENSE            README.md          bin                examples           pyniverse          setup.py
$ python setup.py install --user

How to use

Most of the logic in pyniverse lives in a single class, Classifications, which provides a variety of methods, including several that plot graphs. A script in the bin/ folder, analyse-zooniverse-classifications.py, creates an instance of the class by passing it the path of the CSV file downloaded from the Zooniverse and then calls several of its methods. Let's see how it works.

$ cd examples/
$ analyse-zooniverse-classifications.py --input_file dat/test-zooniverse-classifications.csv.bz2
Reading classifications from CSV file...
    Total classifications:  218629
              Total users:    4529
         Gini coefficient:   -0.78

 Top   10 users have done:    18.6 %
 Top  100 users have done:    44.4 %
 Top 1000 users have done:    82.8 %

This step should take no more than 30 seconds. In addition to the information printed above, you should find some graphs in pdf/. If you don't specify the name of the output files with the --output_stem option, the program uses the default stem, which is test.

$ ls pdf/
test-classifications-day.pdf      test-classifications-week.pdf     test-user-distribution-log.pdf    test-users-month.pdf
test-classifications-month.pdf    test-user-distribution-linear.pdf test-users-day.pdf                test-users-week.pdf

Three main graphs are produced. The first is simply the number of classifications against time, plotted by day, by week and by month, with a cumulative line added.

Number of classifications per week
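
As a rough illustration of what such a graph contains, the sketch below counts classifications per ISO week and builds the running total. It is not the package's own implementation. The function name classifications_per_week is hypothetical, and the timestamp format (as found in a created_at column of the export) is an assumption.

```python
from collections import Counter
from datetime import datetime
from itertools import accumulate


def classifications_per_week(timestamps):
    """Count classifications per ISO (year, week) and add a running total.

    Hypothetical helper for illustration; not part of pyniverse.
    """
    counts = Counter()
    for ts in timestamps:
        # Assumed format "YYYY-MM-DD hh:mm:ss UTC"; only the date part is used.
        day = datetime.strptime(ts[:10], "%Y-%m-%d").date()
        iso = day.isocalendar()
        counts[(iso[0], iso[1])] += 1
    weeks = sorted(counts)
    per_week = [counts[w] for w in weeks]
    cumulative = list(accumulate(per_week))
    return weeks, per_week, cumulative


# Made-up timestamps, purely for demonstration.
stamps = [
    "2016-03-01 10:00:00 UTC",
    "2016-03-02 11:30:00 UTC",
    "2016-03-09 09:15:00 UTC",
]
weeks, per_week, cumulative = classifications_per_week(stamps)
print(weeks, per_week, cumulative)
```

Grouping by day or by month follows the same pattern with a different key.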

The next is the number of users trying the project for the first time, again by day, by week and by month.

Number of new users per day
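
One way to derive such a count, sketched here under assumptions rather than taken from the package, is to record the earliest classification date for each user and then count users per date. The helper new_users_per_day is hypothetical, and the input is assumed to be (user name, timestamp) pairs such as the user_name and created_at columns of the export.

```python
from collections import Counter


def new_users_per_day(rows):
    """Count users by the date of their first classification.

    Hypothetical helper for illustration; not part of pyniverse.
    rows is an iterable of (user, timestamp) pairs, where the timestamp
    is assumed to start with an ISO date, "YYYY-MM-DD".
    """
    first_seen = {}
    for user, ts in rows:
        day = ts[:10]  # ISO date strings sort chronologically
        if user not in first_seen or day < first_seen[user]:
            first_seen[user] = day
    return dict(sorted(Counter(first_seen.values()).items()))


# Made-up rows, purely for demonstration.
rows = [
    ("alice", "2016-03-01 10:00:00 UTC"),
    ("bob", "2016-03-01 12:00:00 UTC"),
    ("alice", "2016-03-02 09:00:00 UTC"),
    ("carol", "2016-03-03 08:00:00 UTC"),
]
new_users = new_users_per_day(rows)
print(new_users)
```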

The last is the cumulative user distribution, which shows how asymmetric the contributions of the users are.

User Distribution
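
The same asymmetry is summarised by the Gini coefficient and the "Top N users" percentages printed by the script. The sketch below shows one standard way to compute both from per-user classification counts. It is an assumption about the approach, not the package's code, and the function names gini and top_share are hypothetical. This textbook definition yields a value between 0 and 1, so it may differ in sign convention from the figure the script prints.

```python
def gini(counts):
    """Gini coefficient of non-negative counts (0 = perfectly equal).

    Hypothetical helper for illustration; not part of pyniverse.
    """
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    # Standard formula on ascending-sorted values with 1-based ranks.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def top_share(counts, n_top):
    """Percentage of all classifications done by the n_top busiest users."""
    xs = sorted(counts, reverse=True)
    return 100.0 * sum(xs[:n_top]) / sum(xs)


# Made-up per-user counts, purely for demonstration.
per_user = [1, 1, 1, 7]
g = gini(per_user)
share = top_share(per_user, 1)
print(round(g, 2), share)
```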

How to cite

If you use this package, please cite it using the DOI below

DOI
