Powerful command line tools for reference management with ASReview
ASReview-datatools
This package is currently under development. See ASReview-statistics for a stable version compatible with ASReview LAB <=0.19.x.
ASReview-datatools is an extension for the ASReview LAB software. It can be used to describe and clean your (input) data from the command line.
Installation
The ASReview-datatools extension requires Python 3.6+ and ASReview LAB version 1.
The easiest way to install the datatools extension is to install from PyPI:
pip install asreview-datatools
After installation, asreview should detect the datatools extension automatically. To verify this, run:
asreview --help
If the output lists asreview data describe, the extension is installed successfully.
Getting started
data describe
Describe a dataset
asreview data describe MY_DATASET.csv
Export the results to a file (output.json)
asreview data describe MY_DATASET.csv -o output.json
Describe the van_de_schoot_2017 dataset from the benchmark platform.
asreview data describe benchmark:van_de_schoot_2017 -o output.json
{
  "asreviewVersion": "1.0rc2+14.gac96c1a",
  "apiVersion": "0.4+4.g3f54294",
  "data": {
    "items": [
      {
        "id": "n_records",
        "title": "Number of records",
        "description": "The number of records in the dataset.",
        "value": 6189
      },
      {
        "id": "n_relevant",
        "title": "Number of relevant records",
        "description": "The number of relevant records in the dataset.",
        "value": 43
      },
      {
        "id": "n_irrelevant",
        "title": "Number of irrelevant records",
        "description": "The number of irrelevant records in the dataset.",
        "value": 6146
      },
      {
        "id": "n_unlabeled",
        "title": "Number of unlabeled records",
        "description": "The number of unlabeled records in the dataset.",
        "value": 0
      },
      {
        "id": "n_missing_title",
        "title": "Number of records with missing title",
        "description": "The number of records in the dataset with missing title.",
        "value": 5
      },
      {
        "id": "n_missing_abstract",
        "title": "Number of records with missing abstract",
        "description": "The number of records in the dataset with missing abstract.",
        "value": 764
      },
      {
        "id": "n_duplicates",
        "title": "Number of duplicate records (basic algorithm)",
        "description": "The number of duplicate records in the dataset based on similar text.",
        "value": 104
      }
    ]
  }
}
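The exported JSON can be post-processed with a few lines of Python. The sketch below is only an illustration: it uses a shortened copy of the output above instead of reading output.json, and the helper name summarize is hypothetical, not part of the extension.

```python
import json

# Shortened copy of the describe output shown above. In practice you would
# load the exported file instead, e.g. json.load(open("output.json")).
report_text = """
{
  "asreviewVersion": "1.0rc2+14.gac96c1a",
  "apiVersion": "0.4+4.g3f54294",
  "data": {
    "items": [
      {"id": "n_records", "title": "Number of records", "value": 6189},
      {"id": "n_relevant", "title": "Number of relevant records", "value": 43},
      {"id": "n_duplicates", "title": "Number of duplicate records (basic algorithm)", "value": 104}
    ]
  }
}
"""


def summarize(report):
    """Map each item id to its value (hypothetical helper, not an ASReview API)."""
    return {item["id"]: item["value"] for item in report["data"]["items"]}


stats = summarize(json.loads(report_text))

# Share of relevant records in the dataset.
relevant_share = stats["n_relevant"] / stats["n_records"]
print(f"{relevant_share:.2%} of the records are labeled relevant")
```

Keying the items by their id keeps the lookup independent of the order in which the statistics appear in the file.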
data convert
Convert the format of a dataset. For example, convert a RIS dataset into a CSV, Excel, or TAB dataset.
asreview data convert MY_DATASET.ris MY_OUTPUT.csv
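When several files need converting, the command above can be generated in a loop. The following is a minimal sketch that only builds the command lines; the function name convert_commands and the file names are made up for illustration, and the only CLI syntax it relies on is the asreview data convert INPUT OUTPUT form shown above.

```python
from pathlib import Path


def convert_commands(inputs, target_suffix=".csv"):
    """Build one `asreview data convert` command per input file.

    Hypothetical helper: it only assembles argument lists and does not
    run anything.
    """
    commands = []
    for name in inputs:
        source = Path(name)
        # Output file: same name as the input, with the target extension.
        target = source.with_suffix(target_suffix)
        commands.append(["asreview", "data", "convert", str(source), str(target)])
    return commands


for cmd in convert_commands(["search_1.ris", "search_2.ris"]):
    print(" ".join(cmd))
    # To actually run a command: subprocess.run(cmd, check=True)
```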
data dedup
Remove duplicate records with a simple and straightforward deduplication algorithm. The algorithm concatenates the title and abstract of each record and then removes all non-alphanumeric tokens. Records whose resulting text is identical are treated as duplicates and removed.
asreview data dedup MY_DATASET.ris
Export the deduplicated dataset to a file (output.csv)
asreview data dedup MY_DATASET.ris -o output.csv
Deduplicate the van_de_schoot_2017 dataset from the benchmark platform.
asreview data dedup benchmark:van_de_schoot_2017
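The deduplication idea described in this section can be sketched in plain Python. This is not the extension's actual code: the function names are hypothetical, "non-alphanumeric tokens" is interpreted here as non-alphanumeric characters, and lowercasing as well as keeping records with no text at all are assumptions made for the example.

```python
import re


def normalize(title, abstract):
    """Concatenate title and abstract and keep only letters and digits.

    Lowercasing is an assumption of this sketch, not stated in the text above.
    """
    text = f"{title or ''} {abstract or ''}"
    return re.sub(r"[^A-Za-z0-9]", "", text).lower()


def drop_duplicates(records):
    """Keep the first record for each normalized text.

    Records whose normalized text is empty are always kept (assumption),
    so that records without title and abstract are not merged together.
    """
    seen = set()
    unique = []
    for record in records:
        key = normalize(record.get("title"), record.get("abstract"))
        if key and key in seen:
            continue  # same text as an earlier record: treat as duplicate
        seen.add(key)
        unique.append(record)
    return unique


records = [
    {"title": "Deep Learning!", "abstract": "A survey."},
    {"title": "deep learning", "abstract": "A survey"},
    {"title": "Other topic", "abstract": None},
]
unique = drop_duplicates(records)
print(len(records) - len(unique), "duplicate(s) removed")
```

Because punctuation and whitespace are stripped before comparison, the first two records collapse to the same key even though their raw text differs.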
License
This extension is MIT licensed.
Contact
Use the issue tracker or see more contact options in the ASReview LAB repository.