A toolkit to aid in scientific mapping
tesci
An interactive toolkit for merging data from multiple citation databases
Overview
TeslaSCIToolkit (abbrev. tesci) is a scientific mapping tool that comes with the following features:
- Merging data from multiple citation databases
- Restricting access to sensitive columns in data sources with aggregations
- Exporting transformed data into other repositories
- CI/CD integration, currently GitHub Actions
For examples and use cases, see the examples directory.
Quickstart
Aggregating data from a single database source
To create an aggregation of simple.csv based on average salary and age, use either of the two approaches below.
1. Interactive approach
tesci start -d simple.csv -o exported.csv
tesci aggregate avg -c salary -a avg_salary
tesci aggregate avg -c age -a avg_age
tesci apply
2. Configuration approach
aggregate:
  - alias: avg_salary
    column: salary
    function: avg
  - alias: avg_age
    column: age
    function: avg
data:
  dest: exported.csv
  src: simple.csv
The result is a transformation from simple.csv to exported.csv.
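To make the transformation concrete, here is a minimal Python sketch of what averaging the salary and age columns amounts to. It is not tesci's implementation. The sample rows, the `aggregate_avg` helper, and the single-row output layout are all assumptions made for illustration; only the column names and the avg_salary and avg_age aliases come from the example above.

```python
import csv
import io
from statistics import mean

# Hypothetical contents for simple.csv; the real file is not shown here.
SIMPLE_CSV = """name,salary,age
alice,50000,30
bob,70000,40
carol,60000,50
"""


def aggregate_avg(rows, spec):
    """Average each source column and return one row keyed by alias.

    `spec` maps alias -> source column, mirroring the alias/column
    pairs used in the aggregation example.
    """
    rows = list(rows)
    return {
        alias: mean(float(row[column]) for row in rows)
        for alias, column in spec.items()
    }


reader = csv.DictReader(io.StringIO(SIMPLE_CSV))
result = aggregate_avg(reader, {"avg_salary": "salary", "avg_age": "age"})

# Serialize the aggregated row as CSV text (assumed output layout).
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(result))
writer.writeheader()
writer.writerow(result)
exported = out.getvalue()
```

Only the aggregated values appear in `exported`; the individual salary and age entries do not, which is the idea behind restricting access to sensitive columns with aggregations.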
Merging data from multiple citation databases
After retrieving data sources from the citation databases of your choice, place them in a directory. Then, specify the configuration used for merging. An example of a configuration is here.
After specifying your configuration, the merge can be run with:
tesci similarity merge --first-src PATH --second-src PATH --dest DIR
where PATH and DIR refer to relative filesystem paths and directories.
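How tesci decides that two records refer to the same publication is not described here. The subcommand name suggests some form of similarity matching, so the following is a minimal sketch of one common approach, assuming records are matched on title similarity using Python's standard difflib. The `normalize`, `similarity`, and `merge_records` helpers, the record fields, the 0.9 threshold, and the sample records are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(title.lower().split())


def similarity(a, b):
    """Return a ratio in [0, 1] describing how alike two titles are."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def merge_records(first, second, threshold=0.9):
    """Merge two lists of records, treating similar titles as the same work."""
    merged = [dict(record) for record in first]
    for candidate in second:
        match = next(
            (m for m in merged
             if similarity(m["title"], candidate["title"]) >= threshold),
            None,
        )
        if match is None:
            # No similar title found: keep the record as a new entry.
            merged.append(dict(candidate))
        else:
            # Fill in fields the first source lacks; keep its values otherwise.
            for key, value in candidate.items():
                match.setdefault(key, value)
    return merged


# Hypothetical records standing in for two citation database exports.
first = [{"title": "Mapping Science with Citation Data", "year": 2019}]
second = [
    {"title": "mapping science with  citation data", "doi": "10.0000/example"},
    {"title": "A Different Paper", "year": 2020},
]
merged = merge_records(first, second)
```

In this sketch the first source wins whenever both sources supply the same field, and the second source only contributes missing fields and unmatched records. The pairwise comparison is quadratic in the number of records, which is acceptable for a small illustration but would need indexing or blocking for large exports.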
License
Licensed under either of Apache License, Version 2.0 or MIT license.