A set of search patterns for querying a corpus of event-based, community-detected tweets; it could be modified to query most social-network (node-edge) data.

Project description

askcomm: Python 3 module - Search patterns for event-based, community-detected Twitter data.

By Chris Lindgren chris.a.lindgren@gmail.com

Distributed under the BSD 3-clause license. See LICENSE.txt or http://opensource.org/licenses/BSD-3-Clause for details.

Overview

askcomm provides a set of search patterns for querying a corpus of event-based, community-detected tweets. It could be modified to query most social-network (node-edge) data. The queries are well suited to content produced within the detected-community subgraph data.

It assumes you have:

  • imported your corpus as a pandas DataFrame,
  • included metadata information, such as a list of dates and list of groups to reorganize your corpus, and
  • pre-processed your documents as community-detected data across periodic events.
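The assumptions listed above can be pictured with a small sketch of the input data. The column names 'period', 'info_module', 'name', and 'dates' come from the example call in this README; the 'username', 'mentions', and 'tweet' columns, the sample values, and the layout of period_dates are assumptions made only for illustration.

```python
import pandas as pd

# Hypothetical corpus: one row per tweet. Only 'dates' is a column name taken
# from this README; 'username', 'mentions', and 'tweet' are assumed names.
corpus = pd.DataFrame({
    "dates": ["2018-01-01", "2018-01-02", "2018-01-09"],
    "username": ["alice", "bob", "carol"],
    "mentions": [["bob"], ["alice", "carol"], []],
    "tweet": ["hi @bob", "hello @alice @carol", "no mentions here"],
})

# Hypothetical hub data: hub users per period and detected community (module).
hubs = pd.DataFrame({
    "period": [1, 1, 2],
    "info_module": [1, 2, 1],
    "name": ["alice", "bob", "carol"],
})

# Metadata: one list of dates per period (assumed layout).
period_dates = [
    ["2018-01-01", "2018-01-02"],  # period 1
    ["2018-01-08", "2018-01-09"],  # period 2
]

# Slice the corpus down to the tweets posted during the first period.
period_1 = corpus[corpus["dates"].isin(period_dates[0])]
print(len(period_1))
```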

Functions

query_controller: Accepts corpus and hub-user data and searches for tweets germane to each detected community (module) across a range of periods and communities. It uses the find_mentions function to run a cross-reference search within a period's data range, with two options: 'mentions_only' or 'user_and_mentions'. 'mentions_only' searches a column that holds a list of mentions per tweet. 'user_and_mentions' cross-references the author of a tweet with the list of mentions. It returns a dict of the top result tweets found during that period.

query_controller(
    hubs=df_hubs,  # community-detected data
    hub_col_period='period',  # column name for periods
    hub_col_module='info_module',  # column name for community name
    hub_col_users='name',  # column name for
    period_range=[1, 10],  # range of desired periods
    module_range=[1, 10],  # range of desired communities/modules
    corpus=c_htg,  # content corpus
    period_dates=period_dates,  # List of lists with dates to
    col_dates='dates'  # column name for dates
)
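To make the two search options more concrete, here is a minimal sketch of one plausible reading of them. It is not askcomm's implementation: match_hub_tweets is a hypothetical helper, the 'username' and 'mentions' column names are assumed, and 'user_and_mentions' is interpreted here as "the tweet mentions a hub user or was authored by one".

```python
import pandas as pd


def match_hub_tweets(df, hub_users, option="mentions_only"):
    """Return rows of df related to any user in hub_users (illustrative only)."""
    hub_set = set(hub_users)
    # True when a tweet's mention list names at least one hub user.
    mentions_hub = df["mentions"].apply(lambda names: bool(hub_set & set(names)))
    if option == "mentions_only":
        return df[mentions_hub]
    if option == "user_and_mentions":
        # Also keep tweets authored by a hub user (assumed interpretation).
        authored_by_hub = df["username"].isin(hub_set)
        return df[mentions_hub | authored_by_hub]
    raise ValueError(f"unknown option: {option}")


# Hypothetical tweets; column names are assumptions.
tweets = pd.DataFrame({
    "username": ["alice", "bob", "carol"],
    "mentions": [["bob"], ["dave"], []],
})

only_mentions = match_hub_tweets(tweets, ["bob"], "mentions_only")
with_authors = match_hub_tweets(tweets, ["bob"], "user_and_mentions")
```

With "bob" as the only hub user, the first call keeps the tweet that mentions him, and the second also keeps the tweet he wrote.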

convert_to_df: Converts the dict output from query_controller into a DataFrame with the top result per user. If no tweet is found, it appends None.
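The conversion step might look roughly like the sketch below. The shape of the results dict (user name mapped to a list of matching tweets, best first) and the output column names are assumptions; the dict that query_controller actually returns may be organized differently.

```python
import pandas as pd

# Assumed shape: {user: list of matching tweets, best match first}.
results = {
    "alice": ["top tweet about alice", "second tweet"],
    "bob": [],  # no tweet found for this user
}

rows = []
for user, tweets in results.items():
    # Keep only the top result; fall back to None when nothing was found.
    top = tweets[0] if tweets else None
    rows.append({"user": user, "top_tweet": top})

df_top = pd.DataFrame(rows, columns=["user", "top_tweet"])
print(df_top)
```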

find_ht: Queries a subset of isolated mentioned or authored tweets with a hashtag group list. It returns another subset as a DataFrame.

find_links: Queries links in tweets with a search string. It returns the subset as a DataFrame.
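As a rough illustration of hashtag and link filtering with pandas, the sketch below uses two hypothetical helpers, filter_by_hashtags and filter_by_link. They are not askcomm's find_ht or find_links, and the 'hashtags' and 'links' column names are assumed.

```python
import pandas as pd

# Hypothetical subset of tweets; column names are assumptions.
subset = pd.DataFrame({
    "hashtags": [["python", "data"], ["news"], []],
    "links": ["https://example.com/a", "https://example.org/b", ""],
})


def filter_by_hashtags(df, hashtag_group, col="hashtags"):
    """Keep rows whose hashtag list shares at least one tag with hashtag_group."""
    wanted = {tag.lower() for tag in hashtag_group}
    mask = df[col].apply(lambda tags: any(t.lower() in wanted for t in tags))
    return df[mask]


def filter_by_link(df, search, col="links"):
    """Keep rows whose link text contains the literal search string."""
    # regex=False treats the search string literally; na=False skips missing links.
    return df[df[col].str.contains(search, regex=False, na=False)]


ht_subset = filter_by_hashtags(subset, ["Python"])
link_subset = filter_by_link(subset, "example.org")
```

Both helpers return a filtered DataFrame, so they can be chained on the output of an earlier query.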

Other functions include: find_mentions and print_subset.

askcomm works only with Python 3.x and is not backwards-compatible with Python 2.

Warning: askcomm performs little to no custom error-handling, so make sure your inputs are formatted properly. If you have questions, please let me know via email.

System requirements

  • pandas

Installation

  1. Download this repo onto your computer.
  2. Store the folder in a meaningful location.
  3. Open a terminal.
  4. In the terminal, navigate to the root of the folder.
  5. In the terminal, run pip install .
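After the steps above, a quick way to confirm the install is to check whether Python can locate the module. This sketch assumes the import name matches the package name, askcomm; is_installed is a hypothetical helper built on the standard library only.

```python
import importlib.util


def is_installed(module_name):
    """Return True if module_name can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


# Prints True once the package has been installed into the active environment.
print(is_installed("askcomm"))
```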

Known Issues or Limitations

  • Please contact me if you discover any issues.

Example notebooks

  • Coming soon.

Distribution update terminal commands

# Create new distribution of code for archiving
python setup.py sdist bdist_wheel

# Distribute to Python Package Index
python -m twine upload --repository-url https://upload.pypi.org/legacy/ dist/*
