A set of search patterns for querying a corpus of event-based, community-detected tweets. It could be modified to query most social-network (node-edge) data.
askcomm: Python 3 module - Search patterns for event-based, community-detected twitter data.
By Chris Lindgren chris.a.lindgren@gmail.com
Distributed under the BSD 3-clause license. See LICENSE.txt or http://opensource.org/licenses/BSD-3-Clause for details.
Overview
A set of search patterns for querying a corpus of event-based, community-detected tweets. It could be modified to query most social-network (node-edge) data. The queries are best suited to content produced within the detected-community subgraph data.
It assumes you have:
- imported your corpus as a pandas DataFrame,
- included metadata information, such as a list of dates and list of groups to reorganize your corpus, and
- pre-processed your documents as community-detected data across periodic events.
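As a rough illustration of the inputs listed above, the sketch below builds a tiny corpus, a hub table, and a list of period dates with pandas. The column names 'period', 'info_module', 'name', and 'dates' are taken from the query_controller example in this README; 'username', 'tweet', and 'mentions' are assumptions made for the illustration, and the data is invented.

```python
import pandas as pd

# Hypothetical tweet corpus: one row per tweet.
# 'username', 'tweet', and 'mentions' are assumed column names.
corpus = pd.DataFrame({
    "username": ["alice", "bob", "carol"],
    "tweet": ["hi @bob", "reply to @alice #event", "unrelated post"],
    "mentions": [["bob"], ["alice"], []],
    "dates": ["2020-01-01", "2020-01-02", "2020-01-09"],
})

# Hypothetical hub data: top users per detected community (module) per period.
hubs = pd.DataFrame({
    "period": [1, 1, 2],
    "info_module": [1, 2, 1],
    "name": ["alice", "carol", "bob"],
})

# Metadata: one list of date strings per period, in period order.
period_dates = [
    ["2020-01-01", "2020-01-02"],  # period 1
    ["2020-01-08", "2020-01-09"],  # period 2
]

# Slice the corpus down to the tweets posted during period 1.
period_1 = corpus[corpus["dates"].isin(period_dates[0])]
print(period_1[["username", "dates"]])
```

Your own corpus will likely have more columns; the point is only that dates and groups are available to reorganize it by period and community.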
Functions
query_controller
: Accepts corpus and hub user data and searches for tweets germane to each detected community (module) across a range of periods and communities. It uses the find_mentions function to run a cross-reference search within a period's date range, with 2 options: 'mentions_only' or 'user_and_mentions'. 'mentions_only' searches a column holding a List of mentions per tweet. 'user_and_mentions' cross-references the author of a tweet with the list of mentions. It returns a Dict of the top result tweets found during that period.
query_controller(
    hubs=df_hubs,  # community-detected data
    hub_col_period='period',  # column name for periods
    hub_col_module='info_module',  # column name for community name
    hub_col_users='name',  # column name for
    period_range=[1, 10],  # range of desired periods
    module_range=[1, 10],  # range of desired communities/modules
    corpus=c_htg,  # content corpus
    period_dates=period_dates,  # List of lists with dates to
    col_dates='dates'  # column name for dates
)
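To make the two search options more concrete, here is a minimal pandas sketch of one possible reading of them. It is not askcomm's implementation: the helper name match_hub_tweets, its parameters, and the column names 'username' and 'mentions' are all hypothetical, and it assumes 'user_and_mentions' keeps a tweet when a hub user either wrote it or is mentioned in it.

```python
import pandas as pd

def match_hub_tweets(tweets, hub_users, mode="mentions_only",
                     col_user="username", col_mentions="mentions"):
    """Return the rows of `tweets` that involve any user in `hub_users`.

    Hypothetical helper for illustration only; not part of askcomm.
    """
    hub_users = set(hub_users)
    # True where the tweet's mention list contains at least one hub user.
    mentioned = tweets[col_mentions].apply(lambda m: bool(hub_users & set(m)))
    if mode == "mentions_only":
        return tweets[mentioned]
    if mode == "user_and_mentions":
        # Assumed reading: also keep tweets authored by a hub user.
        authored = tweets[col_user].isin(hub_users)
        return tweets[authored | mentioned]
    raise ValueError(f"unknown mode: {mode!r}")

# Invented example data.
tweets = pd.DataFrame({
    "username": ["alice", "bob", "dave", "erin"],
    "mentions": [["bob"], [], ["alice"], ["frank"]],
})

only_mentions = match_hub_tweets(tweets, ["alice"], mode="mentions_only")
user_and_mentions = match_hub_tweets(tweets, ["alice"], mode="user_and_mentions")
print(only_mentions)
print(user_and_mentions)
```

With this data, the first call keeps only dave's tweet (it mentions alice), while the second also keeps the tweet alice wrote herself.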
convert_to_df
: Converts the Dict output from query_controller into a DataFrame with the top result per user. If no tweet is found for a user, it appends None.
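The conversion described above might look roughly like the following. The exact shape of the Dict returned by query_controller is not documented here, so the nested layout below ({period: {user: [tweets]}}) and the output column names are assumptions chosen only to show the "top result or None" idea.

```python
import pandas as pd

# Assumed result layout: {period: {user: [matching tweets, best first]}}.
results = {
    1: {"alice": ["top tweet by alice", "second tweet"], "carol": []},
    2: {"bob": ["top tweet by bob"]},
}

rows = []
for period, users in results.items():
    for user, tweets in users.items():
        rows.append({
            "period": period,
            "user": user,
            # Keep the first (top) tweet, or record None when nothing matched.
            "top_tweet": tweets[0] if tweets else None,
        })

df_top = pd.DataFrame(rows, columns=["period", "user", "top_tweet"])
print(df_top)
```

Users with no match stay in the table with a missing value, so the absence of a result is itself visible in the output.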
find_ht
: Queries a subset of isolated mentioned or authored tweets against a hashtag group list. It returns the matching subset as a DataFrame.
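A hashtag query of this kind can be sketched in plain pandas as below. The function filter_by_hashtags and the 'hashtags' column (a list of tags per tweet) are hypothetical stand-ins, not find_ht's actual signature or required input format.

```python
import pandas as pd

def filter_by_hashtags(tweets, hashtags, col_hashtags="hashtags"):
    """Return rows whose hashtag list shares at least one tag with `hashtags`.

    Hypothetical helper for illustration only; not part of askcomm.
    """
    # Normalize case and strip a leading '#' so '#Event' matches 'event'.
    wanted = {h.lower().lstrip("#") for h in hashtags}

    def has_wanted(tags):
        return any(t.lower().lstrip("#") in wanted for t in tags)

    return tweets[tweets[col_hashtags].apply(has_wanted)]

# Invented example data.
subset = pd.DataFrame({
    "tweet": ["big day #Event #news", "no tags here", "something #other"],
    "hashtags": [["Event", "news"], [], ["other"]],
})

matches = filter_by_hashtags(subset, ["#event", "#rally"])
print(matches)
```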
find_links
: Queries links in tweets with a search string. It returns the matching subset as a DataFrame.
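For comparison, a link search over a DataFrame column can be written with pandas string methods. This is only a sketch under assumptions: the helper filter_by_link and the 'urls' column (one URL string per tweet, possibly missing) are invented here and may not match how find_links expects its input.

```python
import pandas as pd

def filter_by_link(tweets, search, col_links="urls"):
    """Return rows whose link contains `search` as a literal substring.

    Hypothetical helper for illustration only; not part of askcomm.
    """
    # regex=False treats `search` literally; na=False drops rows with no link.
    mask = tweets[col_links].str.contains(search, case=False, regex=False, na=False)
    return tweets[mask]

# Invented example data.
linked = pd.DataFrame({
    "tweet": ["read this", "no link", "breaking story"],
    "urls": ["https://example.com/a", None, "https://news.example.org/story"],
})

org_links = filter_by_link(linked, "example.org")
print(org_links)
```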
Other functions include find_mentions and print_subset.
askcomm works only with Python 3.x and is not backwards-compatible.
Warning: askcomm performs little to no custom error-handling, so make sure your inputs are formatted properly. If you have questions, please let me know via email.
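Because of this, it can help to check your inputs yourself before querying. The sketch below is one way to do that; check_columns is a hypothetical helper, not something askcomm provides, and the column names are the ones used in the query_controller example above.

```python
import pandas as pd

def check_columns(df, required, label="DataFrame"):
    """Raise a descriptive error if `df` is not a DataFrame or lacks columns.

    Hypothetical helper for illustration only; not part of askcomm.
    """
    if not isinstance(df, pd.DataFrame):
        raise TypeError(f"{label} must be a pandas DataFrame, got {type(df).__name__}")
    missing = [c for c in required if c not in df.columns]
    if missing:
        raise KeyError(f"{label} is missing columns: {missing}")

# Invented example data.
hubs = pd.DataFrame({"period": [1], "info_module": [1], "name": ["alice"]})

# Passes silently when every required column is present.
check_columns(hubs, ["period", "info_module", "name"], label="hubs")
```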
System requirements
- pandas
Installation
- Download this repo onto your computer.
- Store the folder in a meaningful location.
- Open a terminal.
- In the terminal, navigate to the root of the folder.
- In the terminal, run
pip install .
Known Issues or Limitations
- Please contact me if you discover any issues.
Example notebooks
- Coming soon.
Distribution update terminal commands
# Create new distribution of code for archiving
sudo python setup.py sdist bdist_wheel

# Distribute to Python Package Index
python -m twine upload --repository-url https://upload.pypi.org/legacy/ dist/*