ChatDB is a toolkit for easily storing chat messages in a database.
ChatDB for NLP
ChatDB is a toolkit for easily storing conversations, such as chat messages, in a database. You can use ChatDB to store text while you are collecting data for NLP.
DBMS: Neo4j
Installation
Choose either A or B.
A. Using Neo4j Desktop
If you work on the host OS and use Neo4j Desktop, installing ChatDB from PyPI is recommended:
pip install chatdb
Download Neo4j Desktop from the following: https://neo4j.com/download/
B. Using Neo4j in a Docker container
Use Git to clone the repository from GitHub:
git clone https://github.com/A03ki/chatdb.git
cd chatdb
If you work on the host OS:
pip install -e .
docker-compose up -d db
If you work in a Docker container:
docker-compose up -d
docker-compose exec app /bin/sh -c "[ -e /bin/bash ] && /bin/bash || /bin/sh"
Usage
First, store the text data in a database.
from chatdb import Graph, Status
# Create Status
s1 = Status(text="How are you today?")
s2 = Status(text="I’m okay, thanks. And you?")
s3 = Status(text="I’m awesome.")
# Construct a relationship between Statuses
s1.reply_from(s2) # s2.reply_to(s1)
s2.reply_from(s3) # s3.reply_to(s2)
# Create the handler for Neo4j
# When working in a Docker container
graph = Graph("bolt://db:7687", password="your_password")
# When working on the host OS
# graph = Graph("bolt://localhost:7687", password="your_password")
# Store data (merging s2 also stores the linked s1 and s3, as the output below shows)
graph.merge(s2)
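The two Graph lines above differ only in the host name ("db" inside the Docker network, "localhost" on the host OS). As a minimal sketch, the choice could be moved out of the code and into the environment. The helper neo4j_uri and the NEO4J_HOST variable are hypothetical names used for illustration; they are not part of ChatDB.

```python
import os


def neo4j_uri(default_host: str = "localhost", port: int = 7687) -> str:
    """Build a Bolt URI, reading the host from the NEO4J_HOST variable.

    NEO4J_HOST is a hypothetical variable name chosen for this sketch.
    """
    host = os.environ.get("NEO4J_HOST", default_host)
    return f"bolt://{host}:{port}"


print(neo4j_uri())
```

With this in place, Graph(neo4j_uri(), password="your_password") would connect to localhost by default and to the db service when NEO4J_HOST=db is set in the container.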
Next, extract the text from the database.
from chatdb import Graph, TextOutputer, Status
graph = Graph("bolt://db:7687", password="your_password")
# graph = Graph("bolt://localhost:7687", password="your_password")
outputer = TextOutputer(graph)
print(outputer.match([Status]).extract_text())
print(outputer.match([Status]*2).extract_text())
print(outputer.match([Status]*3).extract_text())
Output:
[['I’m okay, thanks. And you?'], ['How are you today?'], ['I’m awesome.']]
[['I’m okay, thanks. And you?', 'I’m awesome.'], ['How are you today?', 'I’m okay, thanks. And you?']]
[['How are you today?', 'I’m okay, thanks. And you?', 'I’m awesome.']]
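Judging from the output above, match([Status]*n) appears to return every chain of n statuses connected by reply links. The following plain Python sketch reproduces that shape without a database. It is an illustration only, not ChatDB's implementation; reply_chains and the replies dictionary are hypothetical names, and the order of the chains may differ from what Neo4j returns.

```python
def reply_chains(replies, length):
    """Return every chain of `length` texts that follows reply links.

    `replies` maps a text to the list of texts that reply to it.
    """
    # Every text that appears anywhere is a possible starting point.
    texts = list(replies)
    for children in replies.values():
        for child in children:
            if child not in texts:
                texts.append(child)

    # Start with chains of one text and extend each by one reply per step.
    chains = [[text] for text in texts]
    for _ in range(length - 1):
        chains = [
            chain + [reply]
            for chain in chains
            for reply in replies.get(chain[-1], [])
        ]
    return chains


replies = {
    "How are you today?": ["I’m okay, thanks. And you?"],
    "I’m okay, thanks. And you?": ["I’m awesome."],
}

for n in (1, 2, 3):
    print(reply_chains(replies, n))
```

As in the database output, a length of 1 gives three single statuses, a length of 2 gives two pairs, and a length of 3 gives the one full conversation.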
You can also use the Neo4j Browser to check the data. Go to http://localhost:7474 in your web browser and run the query MATCH (n:Status) RETURN n.
To delete all data, run: MATCH (n:Status) DETACH DELETE n
For more information on how to use Neo4j Browser, see https://neo4j.com/developer/neo4j-browser/.
Support for collecting Tweet data
pip install tweepy
This example stores the timeline of the Twitter, Inc. account and the tweets that this account is replying to.
import tweepy
from chatdb import Graph, SimpleTweetStatus
from chatdb.tools import TweetArchiver
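# consumer_key, consumer_secret, access_token and access_token_secret
# are your own Twitter API credentials
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
# wait_on_rate_limit_notify is a Tweepy 3.x parameter; it was removed in Tweepy 4.0
api = tweepy.API(auth, wait_on_rate_limit=True,
                 wait_on_rate_limit_notify=True)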
graph = Graph("bolt://db:7687", password="your_password")
# graph = Graph("bolt://localhost:7687", password="your_password")
archiver = TweetArchiver(graph, SimpleTweetStatus)
statuses = api.user_timeline(screen_name="Twitter")
for status in statuses:
    in_reply_to_status_id_str = status.in_reply_to_status_id_str
    if in_reply_to_status_id_str:
        in_reply_to_status = api.get_status(in_reply_to_status_id_str)
        archiver.add_status(**in_reply_to_status._json)
    archiver.add_status(**status._json)
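The loop above stores the tweet being replied to before the reply itself. The sketch below isolates that ordering with plain dictionaries so it can be run without Twitter credentials or a database. It assumes the final add_status call runs for every timeline entry, not only for replies. The names statuses_to_store, fetch_status, parents and timeline, and the sample tweets, are hypothetical stand-ins for the Tweepy calls and are not part of ChatDB or Tweepy.

```python
def statuses_to_store(timeline, fetch_status):
    """Yield status dicts in storage order: the parent first, then the reply."""
    for status in timeline:
        parent_id = status.get("in_reply_to_status_id_str")
        if parent_id:
            # Stand-in for api.get_status(parent_id)._json
            yield fetch_status(parent_id)
        yield status


# Made-up sample data shaped like the fields the loop reads.
parents = {
    "1": {"id_str": "1", "text": "Is the API down?",
          "in_reply_to_status_id_str": None},
}
timeline = [
    {"id_str": "2", "text": "We are looking into it.",
     "in_reply_to_status_id_str": "1"},
    {"id_str": "3", "text": "Good morning.",
     "in_reply_to_status_id_str": None},
]

order = [s["id_str"] for s in statuses_to_store(timeline, parents.__getitem__)]
print(order)  # ['1', '2', '3']
```

Each yielded dictionary corresponds to one archiver.add_status(**status) call in the real loop.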
For more information on how to use Tweepy, see the Tweepy documentation.