
SocialVec is a framework of Social Embeddings for eliciting social world knowledge from social networks.


The SocialVec package provides pre-trained embeddings for approximately 200,000 popular Twitter accounts. SocialVec is a framework for learning social entity embeddings, derived from a large-scale Twitter dataset encompassing 1.3 million users and the accounts they follow.

  • Free software: MIT license

What are SocialVec Embeddings?

SocialVec embeddings are low-dimensional vector representations of popular Twitter accounts. They are trained on co-occurrence patterns in the Twitter follow network: accounts that are frequently followed by the same users are treated as socially related. This parallels word embeddings, where words that appear in similar contexts receive similar vectors.
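
How co-following turns into training data can be sketched as follows. This is a minimal illustration, assuming each user's follow list is treated as one context in which every account pairs with every other account. The README does not spell out the actual pairing or windowing scheme, and the handles and the `cofollow_pairs` helper are hypothetical.

```python
from collections import Counter
from itertools import permutations


def cofollow_pairs(follow_lists):
    """Yield (target, context) pairs for accounts followed by the same user.

    Assumption: every ordered pair of distinct accounts in one user's
    follow list counts as one co-occurrence.
    """
    for accounts in follow_lists:
        unique = sorted(set(accounts))  # drop duplicate follows
        yield from permutations(unique, 2)


# Hypothetical follow lists, one per user.
follow_lists = [
    ["@nasa", "@spacex", "@esa"],
    ["@nasa", "@spacex"],
]

pairs = list(cofollow_pairs(follow_lists))
counts = Counter(pairs)
print(counts[("@nasa", "@spacex")])  # number of users who follow both
```

Pairs like these are the kind of (target, context) input a Skip-gram model with negative sampling consumes. Accounts that share many followers end up with many shared pairs and therefore nearby vectors.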

Package Features

This package includes the following features:

  • Access to pre-trained SocialVec embeddings:

    • Pre-trained embeddings for approximately 200,000 popular Twitter accounts.

    • Embeddings are 100-dimensional, trained using the Skip-gram model with negative sampling (SGNS).

  • Entity similarity computation:

    • Calculate cosine similarity between SocialVec embeddings to assess social similarity between entities.

    • Enables tasks like:

      • Identifying similar entities (e.g., universities similar to UC Berkeley).

      • Recommending Twitter accounts based on existing followings.

      • Assessing the political leaning of news sources.

  • Entity analogy exploration:

    • Experiment with relational arithmetic on SocialVec embeddings to explore entity analogies, similar to word analogies.
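
Both operations reduce to a few lines of vector arithmetic. The sketch below uses plain Python and made-up 3-dimensional vectors rather than the package's 100-dimensional embeddings. The entity names and the helpers `cosine`, `most_similar`, and `analogy` are illustrative assumptions, not the SocialVec API.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def most_similar(query_vec, embeddings, exclude=(), topn=3):
    """Rank entities by cosine similarity to query_vec, best first."""
    scored = [
        (name, cosine(query_vec, vec))
        for name, vec in embeddings.items()
        if name not in exclude
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


def analogy(a, b, c, embeddings, topn=1):
    """Answer 'a : b :: c : ?' by searching near the vector b - a + c."""
    target = [y - x + z for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    return most_similar(target, embeddings, exclude={a, b, c}, topn=topn)


# Toy vectors (hypothetical): two shows and their hosts.
toy = {
    "show_a": [1.0, 0.0, 0.0],
    "host_a": [1.0, 1.0, 0.0],
    "show_b": [0.0, 0.0, 1.0],
    "host_b": [0.0, 1.0, 1.0],
    "unrelated": [-1.0, 0.0, -1.0],
}

nearest = most_similar(toy["show_a"], toy, exclude={"show_a"}, topn=1)
answer = analogy("show_a", "host_a", "show_b", toy)
print(nearest[0][0], answer[0][0])
```

Excluding the three query entities from the analogy search is a common convention with word embeddings, since the nearest neighbor of `b - a + c` is otherwise often one of the inputs.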

Potential Applications

The SocialVec package can be used for a wide range of tasks, including:

  • Recommendation systems: Recommending Twitter accounts or other content based on user social affinity captured by the embeddings.

  • Social analysis: Investigating social trends and relationships between entities on Twitter.

  • Bias detection: Identifying potential biases in social media content or user behavior based on social context.

  • Inferring personal traits: Predicting user characteristics like age, gender, or political leaning based on their social connections on Twitter.
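
As one illustration of the recommendation use case, a user can be represented by the average of the embeddings of the accounts they already follow, and unfollowed accounts can then be ranked by similarity to that profile. This is a sketch under stated assumptions: the averaging strategy, the `recommend` helper, the 2-dimensional vectors, and the handles are all hypothetical and are not taken from the package.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def recommend(followed, embeddings, topn=2):
    """Suggest accounts closest to the mean vector of the followed accounts."""
    known = [embeddings[name] for name in followed if name in embeddings]
    if not known:
        return []  # no followed account has an embedding
    dim = len(known[0])
    profile = [sum(vec[i] for vec in known) / len(known) for i in range(dim)]
    candidates = [
        (name, cosine(profile, vec))
        for name, vec in embeddings.items()
        if name not in followed
    ]
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:topn]]


# Toy embeddings (hypothetical handles and values).
embeddings = {
    "@astro_news": [0.9, 0.1],
    "@rocket_fan": [0.8, 0.2],
    "@space_photos": [0.85, 0.15],
    "@cooking_daily": [0.1, 0.9],
}

suggestions = recommend(["@astro_news", "@rocket_fan"], embeddings, topn=1)
print(suggestions)
```

The same profile vector could serve as a feature vector for a downstream classifier when inferring user traits, although the README does not describe how the package's own classifier is built.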

Examples

Here are some practical examples of what you can do with SocialVec:

  • Finding similar entities: Retrieve universities similar to UC Berkeley based on the cosine similarity of their SocialVec embeddings.

  • Recommending Twitter accounts: Suggest accounts similar to those followed by a specific user, leveraging social context captured in the embeddings.

  • Assessing political leaning: Determine the political bias of news sources by comparing their similarity to embeddings of politically polarized accounts (e.g., accounts of prominent politicians).

  • Exploring entity analogies: Complete analogies like “X-Factor : Simon Cowell :: The Voice : ?” using vector arithmetic on SocialVec embeddings.
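
The political leaning example can be sketched as a difference of average similarities to two sets of anchor accounts. The scoring rule below is one plausible formulation and an assumption on our part, as the README does not specify the exact method. The `leaning_score` helper and the 2-dimensional toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def leaning_score(source_vec, left_anchors, right_anchors):
    """Mean similarity to right anchors minus mean similarity to left anchors.

    Positive values mean the source sits closer to the right-leaning anchors,
    negative values closer to the left-leaning ones.
    """
    left = sum(cosine(source_vec, v) for v in left_anchors) / len(left_anchors)
    right = sum(cosine(source_vec, v) for v in right_anchors) / len(right_anchors)
    return right - left


# Toy anchor vectors standing in for embeddings of polarized accounts.
left_anchors = [[1.0, 0.0], [0.9, 0.1]]
right_anchors = [[0.0, 1.0], [0.1, 0.9]]

score_a = leaning_score([0.2, 0.8], left_anchors, right_anchors)
score_b = leaning_score([0.8, 0.2], left_anchors, right_anchors)
print(score_a > 0, score_b < 0)
```

The magnitude of such a score depends on which anchors are chosen, so it is better read as a relative ordering of sources than as an absolute measure.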

Advantages of SocialVec

  • Captures social world knowledge: Unlike embeddings derived from factual knowledge bases like Wikipedia or Wikidata, SocialVec embeddings reflect relationships between entities based on social media interactions.

  • Wider coverage: SocialVec represents a broader range of entities, as many Twitter accounts do not have corresponding Wikipedia pages.

Notes

This README covers the pre-trained embeddings provided by the package. Specific implementation details and additional functionality will be defined as part of the package’s development.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2022-09-29)

  • First release on PyPI.

0.1.1 (2022-09-29)

  • Include config.yaml in the distribution.

0.1.2 (2022-10-02)

  • Rearrange config.yaml.

  • Support multiple versions of the SocialVec model.

  • Fix bug when searching for similarity using username.

0.1.3 (2022-10-14)

  • Initial version of SocialVecClassifier

0.1.4 (2022-11-08)

  • Updates to SocialVecClassifier

0.1.5 (2023-11-07)

  • Update a dedicated model for the SocialVecClassifier (2020c)

0.1.6 (2023-11-09)

  • Modify requirements to support more recent Python versions.

0.1.7 (2024-10-22)

  • Add the option to load the model to RAM in case there is no write permission to the package folder (which

0.1.7.1 (2024-10-22)

  • Add pypi documentation

