Package for curating data annotation efforts.

Project description

pyAnno 2.0 is a Python library for the analysis and diagnostic testing of annotation and curation efforts. pyAnno implements statistical models that infer the following from categorical data labeled by multiple annotators:

  • annotator accuracies and biases,
  • gold standard categories of items,
  • prevalence of categories in population, and
  • population distribution of annotator accuracies and biases.

The models include a generalization of Dawid and Skene’s (1979) multinomial model with Dirichlet priors on prevalence and estimator accuracy, and the two models introduced in Rzhetsky et al. (2009). The implementation supports maximum likelihood and maximum a posteriori estimation of the parameters, as well as drawing samples from the full posterior distribution over annotator accuracy.
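To make the Dawid and Skene style of model more concrete, here is a minimal, self-contained sketch of expectation-maximization for it in plain Python. It is not pyAnno's API or implementation: the function name `dawid_skene_em`, its parameters, and the toy data are all hypothetical. The `pseudo` argument is an assumption of this sketch; it adds pseudo-counts, which corresponds to a maximum a posteriori estimate under symmetric Dirichlet priors with concentration `1 + pseudo`, and approaches maximum likelihood as `pseudo` goes to zero.

```python
def dawid_skene_em(annotations, n_classes, pseudo=0.1, n_iter=50):
    """EM sketch for a Dawid-Skene-style model (illustrative, not pyAnno code).

    annotations[i][j] is the label (0 .. n_classes-1) that annotator j gave
    to item i, or None if that annotator skipped the item.
    pseudo must be > 0 in this sketch to avoid division by zero.
    """
    n_items = len(annotations)
    n_annotators = len(annotations[0])
    K = n_classes

    # Initialise each item's class posterior from its vote fractions.
    post = []
    for row in annotations:
        votes = [sum(1 for y in row if y == k) for k in range(K)]
        total = sum(votes)
        post.append([v / total for v in votes] if total else [1.0 / K] * K)

    for _ in range(n_iter):
        # M-step: prevalence of each class, smoothed by pseudo-counts.
        prevalence = [
            (sum(p[k] for p in post) + pseudo) / (n_items + K * pseudo)
            for k in range(K)
        ]
        # M-step: accuracy[j][k][l] = P(annotator j reports l | true class k).
        accuracy = []
        for j in range(n_annotators):
            table = []
            for k in range(K):
                counts = [pseudo] * K
                for i, row in enumerate(annotations):
                    if row[j] is not None:
                        counts[row[j]] += post[i][k]
                norm = sum(counts)
                table.append([c / norm for c in counts])
            accuracy.append(table)

        # E-step: posterior over the true class of each item.
        for i, row in enumerate(annotations):
            weights = list(prevalence)
            for j, y in enumerate(row):
                if y is not None:
                    for k in range(K):
                        weights[k] *= accuracy[j][k][y]
            norm = sum(weights)
            post[i] = [w / norm for w in weights]

    return prevalence, accuracy, post


# Toy data: annotators 0 and 1 always agree; annotator 2 agrees half the time.
annotations = [
    [0, 0, 0], [0, 0, 1], [1, 1, 1], [1, 1, 0],
    [0, 0, 0], [1, 1, 1], [0, 0, 1], [1, 1, 0],
]
prevalence, accuracy, post = dawid_skene_em(annotations, n_classes=2)

# Most probable ("gold standard") class for each item.
gold = [max(range(2), key=lambda k: p[k]) for p in post]
```

The returned values mirror the quantities listed above: `prevalence` estimates the category prevalence, `accuracy[j]` is the confusion table of annotator `j` (capturing both accuracy and bias), and `post` gives a distribution over the gold standard category of each item. On this toy data the sketch sides with the two annotators who agree and treats the third as close to uninformative.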


pyAnno is licensed under a modified BSD license (2-clause). For more information, see



The documentation is hosted at .


  • Pietro Berkes (Enthought, Ltd.)
  • Bob Carpenter (Columbia University, Statistics)
  • Andrey Rzhetsky (University of Chicago, Medicine)
  • James Evans (University of Chicago, Sociology)

Download files

Filename (size), file type, Python version:

  • pyanno-2.0.2.macosx-10.5-i386.exe (296.3 kB), Windows Installer, any
  • pyanno-2.0.2-py2.7.egg (377.8 kB), Egg, 2.7
  • pyanno-2.0.2.tar.gz (292.5 kB), Source, None