
A package for Group Communication Analysis with improved text processing and visualization



GCA Analyzer


Introduction

GCA Analyzer is a Python package for analyzing group communication dynamics using NLP techniques and quantitative metrics. It provides comprehensive tools for understanding participation patterns, interaction dynamics, content newness, and communication density in group communications.

Features

  • Multi-language Support: Built-in support for Chinese and other languages through large language models (LLMs)
  • Built-in Sample Data: Includes ready-to-use sample conversations for immediate testing
  • Notebook Integration: Jupyter examples for quick runs in Google Colab
  • Comprehensive Metrics: Analyzes group interactions through multiple dimensions
  • Automated Analysis: Finds optimal analysis windows and generates detailed statistics
  • Flexible Configuration: Customizable parameters for different analysis needs
  • Easy Integration: Command-line interface and Python API support

Quick Start

🚀 Latest Update: Colab Support

Try GCA Analyzer Online in Google Colab

Open in Colab

No installation required. Click the Colab badge above to run GCA Analyzer directly in your browser and get started quickly.


Installation

# Install from PyPI
pip install gca-analyzer

# For development
git clone https://github.com/etShaw-zh/gca_analyzer.git
cd gca_analyzer
pip install -e .

Basic Usage

Option 1: Use Built-in Sample Data (Recommended for First-time Users)

Start immediately with built-in sample data:

# Use built-in sample data
python -m gca_analyzer --sample-data

# Preview the sample data first
python -m gca_analyzer --sample-data --preview

# Interactive mode with sample data (recommended)
python -m gca_analyzer --interactive

Sample Data Contents:

  • 42 different engineering project conversations across multiple teams
  • 2,727 authentic conversation messages
  • 48 different participants from collaborative engineering sessions
  • Original data filtered to include only conversations with ≥30 messages for meaningful analysis
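The ≥30-message filter described above can be reproduced on your own data. The following is a minimal sketch, not the package's own preprocessing code: it assumes rows with the `conversation_id` column used elsewhere in this README, and the helper name `filter_conversations` is hypothetical.

```python
import csv
import io
from collections import Counter

MIN_MESSAGES = 30  # threshold stated in the sample-data description


def filter_conversations(rows, min_messages=MIN_MESSAGES):
    """Keep only rows whose conversation has at least `min_messages` messages."""
    counts = Counter(row["conversation_id"] for row in rows)
    return [row for row in rows if counts[row["conversation_id"]] >= min_messages]


# Tiny illustrative input: conversation "A" has 3 messages, "B" has 1.
raw = io.StringIO(
    "conversation_id,person_id,time,text\n"
    "A,p1,0:01,hi\n"
    "A,p2,0:02,hello\n"
    "A,p1,0:03,ready?\n"
    "B,p3,0:01,anyone here?\n"
)
rows = list(csv.DictReader(raw))

# A lower threshold is used here only so the toy data shows the effect.
kept = filter_conversations(rows, min_messages=3)
print(sorted({row["conversation_id"] for row in kept}))
```

With the default threshold of 30, both toy conversations would be dropped; real data would normally be read from a file with `csv.DictReader` in the same way.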

Data Source & Citation: This sample data is adapted from the Epistemic Network Analysis (ENA) Web Tool example dataset. When using this sample data in research or publications, please cite:

  • Shaffer, D. W., Collier, W., & Ruis, A. R. (2016). A tutorial on epistemic network analysis: Analyzing the structure of connections in cognitive, social, and interaction data. Journal of Learning Analytics, 3(3), 9-45.
  • ENA Web Tool: https://app.epistemicnetwork.org/

Option 2: Use Your Own Data

  1. Prepare your communication data in CSV format with the following required columns:
conversation_id,person_id,time,text
1A,student1,0:08,Hello teacher!
1A,teacher,0:10,Hello everyone!
  2. Run analysis:

    Interactive Mode:

    python -m gca_analyzer --interactive
    # or
    python -m gca_analyzer -i
    

    Command Line Mode:

    python -m gca_analyzer --data your_data.csv
    

    Advanced Options:

    python -m gca_analyzer --data your_data.csv --output results/ --model-name your-model --console-level INFO
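Before running the analysis, it can help to confirm that a CSV file carries the four required columns. The sketch below uses only the Python standard library. The helpers `missing_columns` and `build_command` are hypothetical names introduced for illustration and are not part of the gca_analyzer API. Whether the analyzer tolerates extra columns or a different column order is not stated here, so the check only tests for presence.

```python
import csv
import io

# Column names taken from the CSV example in this README.
REQUIRED_COLUMNS = ("conversation_id", "person_id", "time", "text")


def missing_columns(csv_file):
    """Return the required columns absent from the CSV header, in order."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # empty file -> empty header
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


def build_command(data_path, output_dir=None):
    """Assemble the documented CLI call as an argument list (not executed here)."""
    cmd = ["python", "-m", "gca_analyzer", "--data", data_path]
    if output_dir is not None:
        cmd += ["--output", output_dir]
    return cmd


good = io.StringIO(
    "conversation_id,person_id,time,text\n1A,student1,0:08,Hello teacher!\n"
)
bad = io.StringIO("conversation_id,speaker,text\n1A,student1,Hello teacher!\n")

print(missing_columns(good))
print(missing_columns(bad))
print(build_command("your_data.csv", "results/"))
```

The argument list returned by `build_command` can be passed to `subprocess.run` once the header check reports no missing columns.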
    

Analysis Results

The analyzer generates comprehensive statistics for GCA measures:

Descriptive Statistics

  • Participation

    • Measures relative contribution frequency
    • Negative values indicate below-average participation
    • Positive values indicate above-average participation
  • Responsivity

    • Measures how well participants respond to others
    • Higher values indicate better response behavior
  • Internal Cohesion

    • Measures consistency in individual contributions
    • Higher values indicate more coherent messaging
  • Social Impact

    • Measures influence on group discussion
    • Higher values indicate a stronger impact on others
  • Newness

    • Measures introduction of new content
    • Higher values indicate more novel contributions
  • Communication Density

    • Measures information content per message
    • Higher values indicate more information-rich messages

Results are saved as CSV files in the specified output directory.
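To illustrate the sign convention of the Participation measure, here is a minimal sketch that scores each participant as their share of messages minus the equal share 1/k, where k is the number of participants. This formula is an assumption made for illustration and may differ from what the package computes. The function name `participation_scores` is hypothetical.

```python
from collections import Counter


def participation_scores(speakers):
    """Share of messages per participant minus the equal share 1/k."""
    if not speakers:
        return {}
    counts = Counter(speakers)
    total = len(speakers)
    k = len(counts)  # number of distinct participants
    return {person: counts[person] / total - 1 / k for person in counts}


# Four messages from three participants: "ann" speaks twice, the others once.
scores = participation_scores(["ann", "bob", "ann", "cy"])
for person, score in sorted(scores.items()):
    print(f"{person}: {score:+.3f}")
```

Under this assumed definition the scores sum to zero, so a positive value means a participant contributed more than an equal share and a negative value means less.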

Visualizations

The analyzer provides interactive and informative visualizations:


  • Radar Plots: Compare measures across participants
  • Distribution Plots: Visualize measure distributions

Results are saved as interactive HTML files in the specified output directory.

Citation


If you use GCA Analyzer in your research, please cite it as follows:

@software{xiao2025gca,
  author       = {Xiao, J.},
  title        = {etShaw-zh/gca_analyzer: GCA analyzer: A python package for group communication analysis},
  version      = {v0.4.5},
  year         = {2025},
  url          = {https://doi.org/10.5281/zenodo.15906956},
  note         = {Computer software},
}
