SocialVec is a framework of Social Embeddings for eliciting social world knowledge from social networks.
SocialVec is a framework for learning social entity embeddings from a large-scale Twitter dataset covering 1.3 million users and the accounts they follow. The SocialVec package provides pre-trained embeddings for approximately 200,000 popular Twitter accounts.
Free software: MIT license
Package Features
This package includes the following features:
Access to pre-trained SocialVec embeddings:
Pre-trained embeddings for approximately 200,000 popular Twitter accounts.
Embeddings are 100-dimensional, trained using the Skip-gram model with negative sampling (SGNS).
Entity similarity computation:
Calculate cosine similarity between SocialVec embeddings to assess social similarity between entities.
Enables tasks like:
Identifying similar entities (e.g., universities similar to UC Berkeley).
Recommending Twitter accounts based on existing followings.
Assessing the political leaning of news sources.
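The similarity computation described above can be sketched in plain Python. This is only an illustration under stated assumptions: the entity names and 3-dimensional vectors are invented toy data (the real embeddings are 100-dimensional), and `cosine_similarity` and `most_similar` are hypothetical helpers, not functions of the SocialVec package.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(query, embeddings, top_n=3):
    """Rank every entity except `query` by cosine similarity to it."""
    q = embeddings[query]
    scored = [(name, cosine_similarity(q, vec))
              for name, vec in embeddings.items() if name != query]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

# Toy vectors, invented for illustration only.
toy_embeddings = {
    "university_a": [0.9, 0.1, 0.0],
    "university_b": [0.8, 0.2, 0.1],
    "news_outlet": [0.1, 0.9, 0.2],
    "sports_team": [0.0, 0.2, 0.9],
}

ranking = most_similar("university_a", toy_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these toy vectors the other university ranks first, because its vector points in nearly the same direction as the query.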
Entity analogy exploration:
Experiment with relational arithmetic on SocialVec embeddings to explore entity analogies, similar to word analogies.
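As a rough sketch of such relational arithmetic, the analogy "a is to b as c is to ?" can be answered by finding the entity nearest to vec(b) - vec(a) + vec(c), as is commonly done for word analogies. The names, the toy 3-dimensional vectors, and the `complete_analogy` helper below are assumptions made for illustration and are not part of the SocialVec API.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def complete_analogy(a, b, c, embeddings):
    """Return the entity closest to vec(b) - vec(a) + vec(c), excluding a, b and c."""
    target = [vb - va + vc
              for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = {n: v for n, v in embeddings.items() if n not in (a, b, c)}
    return max(candidates, key=lambda n: cosine_similarity(target, candidates[n]))

# Toy vectors, invented for illustration: the third dimension loosely marks "is a person".
toy_embeddings = {
    "show_a": [1.0, 0.0, 0.0],
    "host_a": [1.0, 0.0, 1.0],
    "show_b": [0.0, 1.0, 0.0],
    "host_b": [0.0, 1.0, 1.0],
    "other_account": [0.3, 0.3, 0.0],
}

answer = complete_analogy("show_a", "host_a", "show_b", toy_embeddings)
print(answer)
```

Excluding the three query entities from the candidates is a common convention, since one of them is otherwise often the nearest neighbor of the target vector.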
Potential Applications
The SocialVec package can be used for a wide range of tasks, including:
Recommendation systems: Recommending Twitter accounts or other content based on user social affinity captured by the embeddings.
Social analysis: Investigating social trends and relationships between entities on Twitter.
Bias detection: Identifying potential biases in social media content or user behavior based on social context.
Inferring personal traits: Predicting user characteristics like age, gender, or political leaning based on their social connections on Twitter.
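One simple way to use the embeddings for recommendation or trait inference is to represent a user by the average of the embeddings of the accounts they follow. The sketch below takes that approach as an assumption; it is not necessarily how SocialVec or its classifier works, and the account names, 2-dimensional vectors, and the `user_vector` and `recommend` helpers are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def user_vector(followed, embeddings):
    """Average the embeddings of the followed accounts that have a vector."""
    vectors = [embeddings[name] for name in followed if name in embeddings]
    if not vectors:
        raise ValueError("none of the followed accounts has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def recommend(followed, embeddings, top_n=2):
    """Rank accounts the user does not follow yet by similarity to the user vector."""
    profile = user_vector(followed, embeddings)
    scored = [(name, cosine_similarity(profile, vec))
              for name, vec in embeddings.items() if name not in followed]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

# Toy vectors, invented for illustration only.
toy_embeddings = {
    "tech_blog_1": [0.9, 0.1],
    "tech_blog_2": [0.8, 0.3],
    "tech_blog_3": [0.85, 0.2],
    "cooking_show": [0.1, 0.9],
}

recommendations = recommend(["tech_blog_1", "tech_blog_2"], toy_embeddings)
print(recommendations)
```

The same averaged user vector could serve as the input features of a classifier that predicts a user trait.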
Examples
Here are some practical examples of what you can do with SocialVec:
Finding similar entities: Retrieve universities similar to UC Berkeley based on the cosine similarity of their SocialVec embeddings.
Recommending Twitter accounts: Suggest accounts similar to those followed by a specific user, leveraging social context captured in the embeddings.
Assessing political leaning: Determine the political bias of news sources by comparing their similarity to embeddings of politically polarized accounts (e.g., accounts of prominent politicians).
Exploring entity analogies: Complete analogies like “X-Factor : Simon Cowell :: The Voice : ?” using vector arithmetic on SocialVec embeddings.
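The political-leaning example can be sketched as a difference of similarities to two reference accounts. This is a minimal illustration under stated assumptions: the anchor accounts, the outlets, the 2-dimensional vectors, and the `leaning_score` helper are all invented here, and the scoring rule is one plausible choice rather than the package's documented method.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def leaning_score(source, anchor_a, anchor_b, embeddings):
    """Positive when `source` is closer to `anchor_a`, negative when closer to `anchor_b`."""
    vec = embeddings[source]
    return (cosine_similarity(vec, embeddings[anchor_a])
            - cosine_similarity(vec, embeddings[anchor_b]))

# Toy vectors, invented for illustration only.
toy_embeddings = {
    "politician_a": [1.0, 0.0],
    "politician_b": [0.0, 1.0],
    "outlet_x": [0.9, 0.2],
    "outlet_y": [0.3, 0.8],
}

score_x = leaning_score("outlet_x", "politician_a", "politician_b", toy_embeddings)
score_y = leaning_score("outlet_y", "politician_a", "politician_b", toy_embeddings)
print(f"outlet_x: {score_x:+.3f}, outlet_y: {score_y:+.3f}")
```

In practice, averaging over several anchor accounts per side would likely give a more stable score than relying on a single account each.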
Notes
This README covers the pre-trained embeddings provided by the package. Specific implementation details and additional functionality will be defined as part of the package’s development.
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.1.0 (2022-09-29)
First release on PyPI.
0.1.1 (2022-09-29)
Include config.yaml in the distribution.
0.1.2 (2022-10-02)
Rearrange config.yaml
Support multiple versions of the SocialVec model
Fix bug when searching for similarity using username
0.1.3 (2022-10-14)
Initial version of SocialVecClassifier
0.1.4 (2022-11-08)
Updates to SocialVecClassifier
0.1.5 (2023-11-07)
Update a dedicated model for the SocialVecClassifier (2020c)
0.1.6 (2023-11-09)
Modify requirements to support more recent Python versions
0.1.7 (2024-10-22)
Add the option to load the model to RAM in case there is no write permission to the package folder (which
0.1.7.1 (2024-10-22)
Add PyPI documentation