
SocialVec is a framework of Social Embeddings for eliciting social world knowledge from social networks.

Project description


SocialVec is a framework for learning social entity embeddings from a large-scale Twitter dataset of 1.3 million users and the accounts they follow. The SocialVec package provides pre-trained embeddings for approximately 200,000 popular Twitter accounts.

  • Free software: MIT license

What are SocialVec Embeddings?

SocialVec embeddings are low-dimensional vector representations of popular Twitter accounts. They are trained on co-occurrence patterns observed in the Twitter social network: accounts that are frequently followed by the same users are considered socially related and receive similar vectors. This is analogous to word embeddings, where words that appear in similar contexts have similar vector representations.
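To make the co-occurrence idea concrete, here is a minimal sketch that counts how often two accounts are followed by the same user. The follow lists and the `cofollow_counts` helper are hypothetical and are not part of the SocialVec package; the actual training uses SGNS rather than raw pair counts.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each user maps to the set of accounts they follow.
follows = {
    "user_a": {"nasa", "spacex", "natgeo"},
    "user_b": {"nasa", "spacex"},
    "user_c": {"natgeo", "bbcearth"},
}

def cofollow_counts(follows):
    """Count how many users follow both accounts in each unordered pair."""
    counts = Counter()
    for accounts in follows.values():
        # Sort so that each pair has one canonical (alphabetical) order.
        for pair in combinations(sorted(accounts), 2):
            counts[pair] += 1
    return counts

counts = cofollow_counts(follows)
print(counts.most_common(3))
```

Pairs with high counts are the ones an embedding model would be expected to place close together.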

Package Features

This package includes the following features:

  • Access to pre-trained SocialVec embeddings:

    • Pre-trained embeddings for approximately 200,000 popular Twitter accounts.

    • Embeddings are 100-dimensional, trained using the Skip-gram model with negative sampling (SGNS).

  • Entity similarity computation:

    • Calculate cosine similarity between SocialVec embeddings to assess social similarity between entities.

    • Enables tasks like:

      • Identifying similar entities (e.g., universities similar to UC Berkeley).

      • Recommending Twitter accounts based on existing followings.

      • Assessing the political leaning of news sources.

  • Entity analogy exploration:

    • Experiment with relational arithmetic on SocialVec embeddings to explore entity analogies, similar to word analogies.
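The similarity feature described above can be sketched in plain Python. The three-dimensional vectors below are made-up values for illustration only (real SocialVec embeddings are 100-dimensional), and `cosine_similarity` and `most_similar` are hypothetical helpers, not the package's own API.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # Similarity is undefined for zero vectors; treat as 0.
    return dot / (norm_u * norm_v)

def most_similar(query, embeddings, top_n=3):
    """Rank all other entities by cosine similarity to the query entity."""
    q = embeddings[query]
    scored = [
        (name, cosine_similarity(q, vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]

# Made-up toy vectors, not real SocialVec embeddings.
toy_embeddings = {
    "UCBerkeley": [0.9, 0.1, 0.0],
    "Stanford": [0.8, 0.2, 0.1],
    "MIT": [0.7, 0.3, 0.0],
    "ESPN": [0.0, 0.1, 0.9],
}

for name, score in most_similar("UCBerkeley", toy_embeddings, top_n=2):
    print(f"{name}: {score:.3f}")
```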

Potential Applications

The SocialVec package can be used for a wide range of tasks, including:

  • Recommendation systems: Recommending Twitter accounts or other content based on user social affinity captured by the embeddings.

  • Social analysis: Investigating social trends and relationships between entities on Twitter.

  • Bias detection: Identifying potential biases in social media content or user behavior based on social context.

  • Inferring personal traits: Predicting user characteristics like age, gender, or political leaning based on their social connections on Twitter.
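As one possible illustration of the recommendation use case, the sketch below averages the vectors of the accounts a user already follows and ranks the remaining accounts by cosine similarity to that average. Averaging is an assumption made here for simplicity, not a method confirmed by the package; the account vectors are made-up two-dimensional values and `recommend` is a hypothetical helper.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def recommend(followed, embeddings, top_n=2):
    """Rank accounts the user does not follow by similarity to a mean profile."""
    known = [embeddings[name] for name in followed if name in embeddings]
    if not known:
        return []  # No followed account has an embedding.
    # Component-wise mean of the followed accounts' vectors.
    profile = [sum(column) / len(known) for column in zip(*known)]
    candidates = [
        (name, cosine_similarity(profile, vec))
        for name, vec in embeddings.items()
        if name not in followed
    ]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in candidates[:top_n]]

# Made-up toy vectors, not real SocialVec embeddings.
toy_embeddings = {
    "nasa": [0.9, 0.1],
    "spacex": [0.8, 0.2],
    "esa": [0.85, 0.15],
    "espn": [0.1, 0.9],
    "nba": [0.2, 0.8],
}

print(recommend({"nasa", "spacex"}, toy_embeddings))
```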

Examples

Here are some practical examples of what you can do with SocialVec:

  • Finding similar entities: Retrieve universities similar to UC Berkeley based on the cosine similarity of their SocialVec embeddings.

  • Recommending Twitter accounts: Suggest accounts similar to those followed by a specific user, leveraging social context captured in the embeddings.

  • Assessing political leaning: Estimate the political leaning of news sources by comparing their similarity to embeddings of politically polarized accounts (e.g., accounts of prominent politicians).

  • Exploring entity analogies: Complete analogies like “X-Factor : Simon Cowell :: The Voice : ?” using vector arithmetic on SocialVec embeddings.
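The last two examples can be sketched with toy vectors. Everything below is an assumption for illustration: the entity names are placeholders, the vectors are made up, and `analogy` and `leaning_score` are hypothetical helpers rather than SocialVec functions. The analogy follows the usual word-embedding recipe (b - a + c, then nearest neighbour), and the leaning score is simply the difference between similarities to two anchor accounts.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def analogy(a, b, c, embeddings):
    """Solve 'a : b :: c : ?' by finding the entity nearest to b - a + c."""
    target = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    # Exclude the three input entities from the candidate answers.
    candidates = {n: v for n, v in embeddings.items() if n not in (a, b, c)}
    return max(candidates, key=lambda n: cosine_similarity(target, candidates[n]))

def leaning_score(entity, left_anchor, right_anchor, embeddings):
    """Positive when the entity is closer to the right anchor, negative otherwise."""
    vec = embeddings[entity]
    return (cosine_similarity(vec, embeddings[right_anchor])
            - cosine_similarity(vec, embeddings[left_anchor]))

# Made-up toy vectors with placeholder names.
shows = {
    "show_a": [1.0, 0.0, 0.0],
    "judge_a": [1.0, 0.0, 1.0],
    "show_b": [0.0, 1.0, 0.0],
    "judge_b": [0.0, 1.0, 1.0],
    "unrelated": [0.5, 0.5, 0.0],
}
print(analogy("show_a", "judge_a", "show_b", shows))

news = {
    "left_anchor": [1.0, 0.0],
    "right_anchor": [0.0, 1.0],
    "outlet_x": [0.9, 0.2],
    "outlet_y": [0.1, 0.8],
}
print(leaning_score("outlet_x", "left_anchor", "right_anchor", news))
print(leaning_score("outlet_y", "left_anchor", "right_anchor", news))
```

In practice, averaging several anchor accounts per side would likely give a more stable score than a single anchor each.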

Advantages of SocialVec

  • Captures social world knowledge: Unlike embeddings derived from factual knowledge bases like Wikipedia or Wikidata, SocialVec embeddings reflect relationships between entities based on social media interactions.

  • Wider coverage: SocialVec represents a broader range of entities, as many Twitter accounts do not have corresponding Wikipedia pages.

Notes

This README covers the pre-trained embeddings provided by the package. Specific implementation details and additional functionality will be defined as part of the package’s development.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2022-09-29)

  • First release on PyPI.

0.1.1 (2022-09-29)

  • Include config.yaml in the distribution.

0.1.2 (2022-10-02)

  • Rearrange config.yaml

  • Support multiple versions of the SocialVec model

  • Fix bug when searching for similarity using username

0.1.3 (2022-10-14)

  • Initial version of SocialVecClassifier

0.1.4 (2022-11-08)

  • Updates to SocialVecClassifier

0.1.5 (2023-11-07)

  • Update a dedicated model for the SocialVecClassifier (2020c)

0.1.6 (2023-11-09)

  • Modify requirements to support more up-to-date Python versions

0.1.7 (2024-10-22)

  • Add the option to load the model to RAM in case there is no write permission to the package folder (which

0.1.7.1 (2024-10-22)

  • Add PyPI documentation

