Observe PoI text data from various sources, segment it, and then inform about it
Project description
Obsei: OBserve, SEgment and Inform
Obsei is intended to be a workflow automation tool for text segmentation needs. Obsei consists of -
- OBserver, which observes platforms like Twitter, Facebook, App Stores, Google reviews, and Amazon reviews and feeds that information to,
- SEgmenter, which performs text classification and sentiment analysis and feeds that information to,
- Informer, which sends it to a ticketing system, data store, or other places for further action and analysis.
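The three stages above can be pictured as a simple chain. The following is a minimal sketch only: the names `TextRecord`, `observe`, `segment`, and `inform` are hypothetical, and the keyword rule is a toy stand-in for a real sentiment model. It does not use or reflect Obsei's actual classes.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class TextRecord:
    """Hypothetical container for one piece of observed text."""
    source: str
    text: str
    labels: Dict[str, float] = field(default_factory=dict)


# Toy vocabulary used only for this illustration.
NEGATIVE_WORDS = {"broken", "crash", "refund", "cannot"}


def observe(raw_posts: Iterable[str], source: str) -> List[TextRecord]:
    """Observer stage: wrap raw text from a platform into records."""
    return [TextRecord(source=source, text=post) for post in raw_posts]


def segment(records: Iterable[TextRecord]) -> List[TextRecord]:
    """Segmenter stage: attach a toy 'negative' score to each record."""
    segmented = []
    for record in records:
        words = set(record.text.lower().split())
        record.labels["negative"] = 1.0 if words & NEGATIVE_WORDS else 0.0
        segmented.append(record)
    return segmented


def inform(records: Iterable[TextRecord]) -> List[str]:
    """Informer stage: format messages; a real informer would send them
    to a ticketing system or data store instead."""
    return [
        f"[{r.source}] negative={r.labels['negative']}: {r.text}"
        for r in records
    ]


if __name__ == "__main__":
    posts = ["App crash on login", "Love the new update"]
    for message in inform(segment(observe(posts, source="twitter"))):
        print(message)
```

Keeping each stage as a function that takes and returns plain records means any one of them could be swapped out without touching the other two.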
Installation
To use as SDK
Install via PyPI:
pip install obsei
Install from master branch (if you want to try the latest features):
git clone https://github.com/lalitpagaria/obsei.git
cd obsei
pip install --editable .
To update your installation, just do a git pull. The --editable flag will update changes immediately.
To use as REST interface
Start docker with default configuration file:
docker run -d --name obsei -p 9898:9898 lalitpagaria/obsei:latest
Start docker with custom configuration file (assuming you have the configuration file config.yaml at /home/user/obsei/config on the host machine):
docker run -d --name obsei -v "/home/user/obsei/config:/home/user/config" -e "OBSEI_CONFIG_PATH=/home/user/config" -e "OBSEI_CONFIG_FILENAME=config.yaml" -p 9898:9898 lalitpagaria/obsei:latest
Start docker locally with docker-compose:
docker-compose up --build
The following environment variables are useful to customize various parameters -
- OBSEI_CONFIG_PATH: Configuration file path (default: ../config)
- OBSEI_CONFIG_FILENAME: Configuration file name (default: rest.yaml)
- OBSEI_NUM_OF_WORKERS: Number of workers for REST API server (default: 1)
- OBSEI_WORKER_TIMEOUT: Worker idle timeout in seconds (default: 180)
- OBSEI_SERVER_PORT: REST API server port (default: 9898)
- OBSEI_WORKER_TYPE: Gunicorn worker type (default: uvicorn.workers.UvicornWorker)
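To make the defaults above concrete, here is a minimal sketch of how a launcher script might read these variables. The `ServerSettings` class and `load_settings` function are hypothetical names introduced for illustration, not part of Obsei; only the variable names and default values come from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ServerSettings:
    """Hypothetical holder for the documented environment variables."""
    config_path: str
    config_filename: str
    num_of_workers: int
    worker_timeout: int
    server_port: int
    worker_type: str

    @property
    def config_file(self) -> str:
        # Full path built from the directory and the file name.
        return os.path.join(self.config_path, self.config_filename)


def load_settings(env: Mapping[str, str] = os.environ) -> ServerSettings:
    """Read each variable, falling back to the documented default."""
    return ServerSettings(
        config_path=env.get("OBSEI_CONFIG_PATH", "../config"),
        config_filename=env.get("OBSEI_CONFIG_FILENAME", "rest.yaml"),
        num_of_workers=int(env.get("OBSEI_NUM_OF_WORKERS", "1")),
        worker_timeout=int(env.get("OBSEI_WORKER_TIMEOUT", "180")),
        server_port=int(env.get("OBSEI_SERVER_PORT", "9898")),
        worker_type=env.get(
            "OBSEI_WORKER_TYPE", "uvicorn.workers.UvicornWorker"
        ),
    )


if __name__ == "__main__":
    settings = load_settings()
    print(settings.config_file, settings.server_port)
```

Passing a plain dictionary instead of `os.environ` makes it easy to check how an override behaves, for example `load_settings({"OBSEI_SERVER_PORT": "8080"})`.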
Use cases
Obsei use cases include, but are not limited to -
- Automatic customer issue ticketing based on sentiment analysis
- Proper tagging of tickets, such as login issue, signup issue, delivery issue, etc., for faster disposal
- Checking effectiveness of social media marketing campaign
- Extraction of deeper insights from feedback on various platforms
- Research purposes
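As an illustration of the first two use cases, the sketch below decides whether a piece of feedback should open a ticket and which tag it should carry. It assumes a hypothetical analyzer output in the form of a dictionary of label scores between 0 and 1. The function name, the threshold, and the data shape are all assumptions made for this example, not Obsei's output format.

```python
from typing import Dict, Optional

# Tags taken from the use case list above.
TICKET_TAGS = ("login issue", "signup issue", "delivery issue")


def choose_ticket_tag(
    scores: Dict[str, float], negative_threshold: float = 0.5
) -> Optional[str]:
    """Return the best tag for a new ticket, or None if no ticket is needed.

    `scores` is a hypothetical analyzer result, e.g.
    {"negative": 0.9, "login issue": 0.8, "delivery issue": 0.1}.
    """
    # Only sufficiently negative feedback opens a ticket.
    if scores.get("negative", 0.0) < negative_threshold:
        return None
    # Pick the known tag with the highest classification score.
    tag_scores = {tag: scores.get(tag, 0.0) for tag in TICKET_TAGS}
    return max(tag_scores, key=tag_scores.get)


if __name__ == "__main__":
    feedback = {"negative": 0.9, "login issue": 0.8, "delivery issue": 0.1}
    print(choose_ticket_tag(feedback))
```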
Components
- Source: Twitter (Facebook, Instagram, Google reviews, Amazon reviews, App Store reviews, Slack, Microsoft Teams, Chat-bots, etc. planned in the future)
- Analyzer: Sentiment and Text classification (QA, Natural Search, FAQ, Summarization, etc. planned in the future)
- Sink: HTTP API, ElasticSearch, DailyGet, and Jira (Salesforce, Zendesk, Hubspot, Slack, Microsoft Teams, etc. planned in the future)
- Processor: Simple integration between Source, Analyzer and Sink (Rich workflows using a rule engine planned in the future)
Examples
Refer to the example folder for obsei usage examples.
Attribution
This would not have been possible without the following open source software -
- searchtweets-v2: For Twitter's API v2 wrapper
- vaderSentiment: For rule-based sentiment analysis
- transformers: For text-classification pipeline
- tweet-preprocessor: For tweet preprocessing and cleaning
- atlassian-python-api: To interact with Jira
- elasticsearch: To interact with Elasticsearch
- hydra: To elegantly configure Obsei
- apscheduler: To schedule tasks to execute the desired workflow
- pydantic: For data validation
- fastapi & gunicorn: For HTTP server and API interface
Citing Obsei
If you use obsei in your research, please use the following BibTeX entry:
@Misc{Pagaria2020Obsei,
author = {Lalit Pagaria},
title = {Obsei - A workflow automation tool for text segmentation need},
howpublished = {Github},
year = {2020},
url = {https://github.com/lalitpagaria/obsei}
}