A python package to partially automate search term selection and writing search strategies for systematic reviews
Project description
Ananse
This project is a collaboration between Dr. Effah Antwi,Research Scientist, Natural Resources Canada, Canadian Forest Service
and Dr Wiafe Owusu-Banahene, Department of Computer Engineering, School of Engineering Sciences, University of Ghana, Legon, Accra, Ghana.
The Ananse
package is a python package designed to partially automate search term selection and writing search strategies for systematic reviews. Read the documentation at baasare.github.io/ananse and ananse.readthedocs.io/
Setup
Ananse requires python 3.7 or higher
Using pip
pip install ananse
Directly from the repository
git clone https://github.com/baasare/ananse.git python ananse/setup.py install
Quick start
Writing your own script
from ananse import Ananse min_len = 1 # minimum keyword length max_len = 4 # maximum keyword length # Create an instance of the package test_run = Ananse() # Import your naive search results from the current working directory imports = test_run.import_naive_results(path="./") # Columns to deduplicate imported search results columns = ['title', 'abstract'] #de-duplicate the imported search results data = test_run.deduplicate_dataframe(imports, columns) #extract keywords from article title and abstract as well as author and database tagged keywords all_terms = test_run.extract_terms(data, min_len=min_len, max_len=max_len) #create Document-Term Matrix, with columns as terms and rows as articles dtm, term_column = test_run.create_dtm(data.text, keywords=all_terms, min_len=max_len, max_len=max_len) #create co-occurrence network using Document-Term Matrix graph_network = test_run.create_network(dtm, all_terms) #plot histogram and node strength of the network test_run.plot_degree_histogram(graph_network) test_run.plot_degree_distribution(graph_network) #Determine cutoff for the relevant keywords cutoff_strengths = test_run.find_cutoff(graph_network, "spline", "degree", degrees=3, knot_num=1, percent=0.879956, diagnostics=True) #get suggested keywords and save to a csv file suggested_keywords = test_run.get_keywords(graph_network, "degree", cutoff_strengths, save_keywords=True) #Print suggested keywords for word in suggested_keywords: print(word)
Using Ananse Test Script
python tests/ananse_test
References
This is a python implementation of the R package as mentioned in paper An automated approach to identifying search terms for systematic reviews using keyword co‐occurrence networks by Eliza M. Grames, Andrew N. Stillman Morgan W. Tingley and Chris S. Elphick
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Filename, size | File type | Python version | Upload date | Hashes |
---|---|---|---|---|
Filename, size ananse-1.1.2-py3-none-any.whl (9.7 kB) | File type Wheel | Python version py3 | Upload date | Hashes View |
Filename, size ananse-1.1.2.tar.gz (11.2 kB) | File type Source | Python version None | Upload date | Hashes View |