A small wrapper for connecting MongoDB collections to Prodigy
mondigy
Mondigy is a small library for using a MongoDB database as a data source for Prodigy annotation applications.
Motivation
Prodigy natively supports loading text data from files and dataset objects, but annotating data that is stored in a MongoDB database requires a custom data loader. With mondigy, you write a small config file containing your database connection details and query, and mondigy streams the matching documents from MongoDB to Prodigy.
Features
- Annotate text data from MongoDB
- Pipe data directly from your MongoDB database to your Prodigy application
Code Example
Let's define a db connection and start annotating data from our MongoDB database!
Step 1. Create a config file. For this example, we'll call it my_db_config.json.
This config gets the first 1000 entries that are in_stock from the products collection of our database, in order of decreasing date_added.
my_db_config.json
{
    "host": "my.database.com",
    "user": "mongo_user",
    "password": "mongo_pass",
    "database": "my_db",
    "auth_source": "admin",
    "collection": "products",
    "text_field": "description",
    "other_fields": ["product_name", "product_id"],
    "sort": ["date_added", -1],
    "query": {"in_stock": true},
    "limit": 1000
}
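To make the config fields concrete, here is a minimal sketch of how they could map onto a pymongo query. This is not Mondigy's actual implementation: the helper names build_uri and build_find_args are hypothetical, and the way host, user, password, and auth_source are combined into a connection URI is an assumption.

```python
from urllib.parse import quote_plus

# Same values as my_db_config.json above.
CONFIG = {
    "host": "my.database.com",
    "user": "mongo_user",
    "password": "mongo_pass",
    "database": "my_db",
    "auth_source": "admin",
    "collection": "products",
    "text_field": "description",
    "other_fields": ["product_name", "product_id"],
    "sort": ["date_added", -1],
    "query": {"in_stock": True},
    "limit": 1000,
}


def build_uri(config):
    """Build a MongoDB connection URI (hypothetical helper, assumed mapping)."""
    # Credentials must be percent-encoded in case they contain ':' or '@'.
    user = quote_plus(config["user"])
    password = quote_plus(config["password"])
    return (
        f"mongodb://{user}:{password}@{config['host']}"
        f"/?authSource={config['auth_source']}"
    )


def build_find_args(config):
    """Translate the config into keyword arguments for pymongo's Collection.find."""
    sort_field, direction = config["sort"]
    return {
        # Only documents matching this filter are returned.
        "filter": config.get("query", {}),
        # Fetch just the text field and the extra fields we want to keep.
        "projection": [config["text_field"], *config.get("other_fields", [])],
        # pymongo expects a list of (field, direction) pairs; -1 is descending.
        "sort": [(sort_field, direction)],
        # A limit of 0 means "no limit" in pymongo.
        "limit": config.get("limit", 0),
    }
```

With pymongo installed, MongoClient(build_uri(CONFIG))[CONFIG["database"]][CONFIG["collection"]].find(**build_find_args(CONFIG)) would return a cursor over the matching documents.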
Step 2. Start your Prodigy server and pipe your MongoDB collection into it by supplying the paths to your config file and the Mondigy loader.
prodigy mongo-loader my_db_config.json -F mondigy/loader.py | prodigy ner.manual ner_test en_core_web_sm - --label FEATURE,KEYWORD
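In a pipe like this, the first command is expected to print one JSON task per line to standard output, and the second reads those tasks from standard input (the - argument). The following is a rough sketch of that output step, assuming each document becomes a Prodigy task with a text key and the remaining fields stored under meta. The function names to_task and write_tasks are hypothetical, the sample documents are made up, and Mondigy's real loader may differ.

```python
import io
import json
import sys


def to_task(doc, text_field, other_fields=()):
    """Convert one MongoDB document (a dict) into a Prodigy-style task."""
    return {
        "text": doc[text_field],
        # Keep selected fields as metadata; skip any the document lacks.
        "meta": {name: doc[name] for name in other_fields if name in doc},
    }


def write_tasks(docs, text_field, other_fields=(), out=sys.stdout):
    """Write one JSON task per line (JSONL) and return how many were written."""
    count = 0
    for doc in docs:
        # Documents without text cannot be annotated, so skip them.
        if not doc.get(text_field):
            continue
        out.write(json.dumps(to_task(doc, text_field, other_fields)) + "\n")
        count += 1
    return count


# Made-up plain dicts standing in for documents returned by a MongoDB cursor.
sample_docs = [
    {"description": "Waterproof hiking boots", "product_name": "TrailMax", "product_id": 17},
    {"product_name": "No description", "product_id": 18},
]

buffer = io.StringIO()
written = write_tasks(sample_docs, "description", ["product_name", "product_id"], out=buffer)
```

Only the named fields are copied into each task, so MongoDB's _id value (which the json module cannot serialize by default) never reaches json.dumps.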
Step 3. Annotate!
Installation & Setup
To install Mondigy, simply clone this repo via git clone https://github.com/jdagdelen/mondigy.git.
Mondigy will set up the collections it requires in your MongoDB database. They are named following the convention _p.<collection>. Don't delete these collections or manually edit any of the documents in them.
License
MIT © John Dagdelen