A package for Group Communication Analysis with improved text processing and visualization
GCA Analyzer
Introduction
GCA Analyzer is a Python package for analyzing group communication dynamics using NLP techniques and quantitative metrics. It provides comprehensive tools for understanding participation patterns, interaction dynamics, content newness, and communication density in group communications.
Features
- Multi-language Support: Built-in support for Chinese and other languages through LLMs
- Built-in Sample Data: Includes ready-to-use sample conversations for immediate testing
- Notebook Integration: Jupyter examples for quick runs in Google Colab
- Comprehensive Metrics: Analyzes group interactions through multiple dimensions
- Automated Analysis: Finds optimal analysis windows and generates detailed statistics
- Flexible Configuration: Customizable parameters for different analysis needs
- Easy Integration: Command-line interface and Python API support
Quick Start
🚀 Latest Update: Colab Support
No installation required. Open the project's Google Colab notebook to run GCA Analyzer directly in your browser.
Installation
# Install from PyPI
pip install gca-analyzer
# For development
git clone https://github.com/etShaw-zh/gca_analyzer.git
cd gca_analyzer
pip install -e .
Basic Usage
Option 1: Use Built-in Sample Data (Recommended for First-time Users)
Start immediately with built-in sample data:
# Use built-in sample data
python -m gca_analyzer --sample-data
# Preview the sample data first
python -m gca_analyzer --sample-data --preview
# Interactive mode with sample data (recommended)
python -m gca_analyzer --interactive
Sample Data Contents:
- 42 different engineering project conversations across multiple teams
- 2,727 authentic conversation messages
- 48 different participants from collaborative engineering sessions
- Original data filtered to include only conversations with ≥30 messages for meaningful analysis
Data Source & Citation: This sample data is adapted from the Epistemic Network Analysis (ENA) Web Tool example dataset. When using this sample data in research or publications, please cite:
- Shaffer, D. W., Collier, W., & Ruis, A. R. (2016). A tutorial on epistemic network analysis: Analyzing the structure of connections in cognitive, social, and interaction data. Journal of Learning Analytics, 3(3), 9-45.
- ENA Web Tool: https://app.epistemicnetwork.org/
Option 2: Use Your Own Data
- Prepare your communication data in CSV format with required columns:
conversation_id,person_id,time,text
1A,student1,0:08,Hello teacher!
1A,teacher,0:10,Hello everyone!
- Run the analysis:
Interactive Mode:
python -m gca_analyzer --interactive # or python -m gca_analyzer -i
Command Line Mode:
python -m gca_analyzer --data your_data.csv
Advanced Options:
python -m gca_analyzer --data your_data.csv --output results/ --model-name your-model --console-level INFO
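Before running any of the commands above on your own data, it can help to confirm that the CSV has the four required columns. The following is a minimal sketch using only the Python standard library. The helper name `check_gca_csv` is hypothetical and not part of gca_analyzer, and the check for empty values is an assumption about what makes a row usable, not a documented rule of the package.

```python
import csv
import io

# Column names taken from the CSV example in this README.
REQUIRED_COLUMNS = ("conversation_id", "person_id", "time", "text")


def check_gca_csv(handle):
    """Return (row_count, problems) for a CSV opened as a text handle.

    Hypothetical helper for illustration; not part of gca_analyzer.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    present = [c for c in REQUIRED_COLUMNS if c in header]
    problems = [f"missing column: {c}" for c in REQUIRED_COLUMNS if c not in header]

    row_count = 0
    for row in reader:
        row_count += 1
        for column in present:
            # Assumption: an empty required field makes the row unusable.
            if not (row.get(column) or "").strip():
                problems.append(f"line {reader.line_num}: empty {column}")
    return row_count, problems


# The two sample rows shown above, held in memory instead of a file.
sample = io.StringIO(
    "conversation_id,person_id,time,text\n"
    "1A,student1,0:08,Hello teacher!\n"
    "1A,teacher,0:10,Hello everyone!\n"
)
rows, problems = check_gca_csv(sample)
print(rows, problems)  # 2 []
```

For a file on disk, pass `open("your_data.csv", newline="", encoding="utf-8")` instead of the in-memory sample. Message text that contains commas must be quoted in the CSV; the `csv` module handles quoted fields when reading.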
Analysis Results
The analyzer generates comprehensive statistics for GCA measures:
- Participation
  - Measures relative contribution frequency
  - Negative values indicate below-average participation
  - Positive values indicate above-average participation
- Responsivity
  - Measures how well participants respond to others
  - Higher values indicate better response behavior
- Internal Cohesion
  - Measures consistency in individual contributions
  - Higher values indicate more coherent messaging
- Social Impact
  - Measures influence on group discussion
  - Higher values indicate a stronger impact on others
- Newness
  - Measures introduction of new content
  - Higher values indicate more novel contributions
- Communication Density
  - Measures information content per message
  - Higher values indicate more information-rich messages
Results are saved as CSV files in the specified output directory.
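To make the sign convention for Participation concrete, here is a small sketch. It assumes participation is a participant's share of all messages minus the equal share 1/k for a group of k participants. This formula is an assumption chosen to illustrate the "above or below average" reading; the package may compute or scale the measure differently. The function name and speaker labels are hypothetical.

```python
from collections import Counter


def participation_scores(speakers):
    """Share of messages per participant minus the equal share 1/k.

    Illustrative assumption only; not necessarily gca_analyzer's formula.
    """
    counts = Counter(speakers)
    total = sum(counts.values())
    k = len(counts)
    return {person: n / total - 1 / k for person, n in counts.items()}


# Six messages from three hypothetical participants.
speakers = ["ana", "ana", "ana", "ben", "ben", "cy"]
scores = participation_scores(speakers)
for person, score in scores.items():
    print(f"{person}: {score:+.3f}")
```

Under this assumption, `ana` (3 of 6 messages) scores above zero, `ben` (2 of 6, exactly the equal share) scores zero, and `cy` (1 of 6) scores below zero. The scores sum to zero across the group.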
Visualizations
The analyzer provides interactive and informative visualizations:
- Radar Plots: Compare measures across participants
- Distribution Plots: Visualize measure distributions
Visualizations are saved as interactive HTML files in the specified output directory.
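After a run, the output directory holds both the CSV statistics and the HTML visualizations. The sketch below groups whatever files are there by type, without assuming any particular file names or subfolder layout. The helper `list_outputs` and the placeholder file names in the demo are hypothetical; the README does not say what the analyzer names its files.

```python
import tempfile
from pathlib import Path


def list_outputs(output_dir):
    """Group CSV and HTML files under output_dir, searching subfolders too."""
    out = Path(output_dir)
    return {
        "csv": sorted(p.relative_to(out).as_posix() for p in out.rglob("*.csv")),
        "html": sorted(p.relative_to(out).as_posix() for p in out.rglob("*.html")),
    }


# Demo with placeholder files; real runs would pass the --output directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "measures.csv").write_text("placeholder\n")  # hypothetical name
    (root / "plots").mkdir()
    (root / "plots" / "radar.html").write_text("<html></html>\n")  # hypothetical name
    found = list_outputs(root)

print(found)
```

In practice, call `list_outputs("results/")` with the same path given to `--output`.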
Citation
If you use GCA Analyzer in your research, please cite it as follows:
@software{xiao2025gca,
author = {Xiao, J.},
title = {etShaw-zh/gca_analyzer: GCA analyzer: A python package for group communication analysis},
version = {v0.4.5},
year = {2025},
url = {https://doi.org/10.5281/zenodo.15906956},
note = {Computer software},
}