Twitter Demographer

Twitter Demographer provides a simple API to enrich your Twitter data with additional variables such as sentiment, user location, gender, and age. The tool is extensible: you can add your own components to the system.

Features

Starting from a plain set of tweet IDs, Twitter Demographer rehydrates the tweets and adds additional variables to your dataset.

You are not forced to use any specific component. The tool is designed to be modular, so you decide which components to add and which to leave out.

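from twitter_demographer.twitter_demographer import Demographer
from twitter_demographer.components import Rehydrate
from twitter_demographer.demographics.m3 import GenderAndAge
# GeoNamesDecoder is used below but was not imported in the original snippet.
# The module path here is an assumption; check the package layout.
from twitter_demographer.geolocation.geonames import GeoNamesDecoder
import pandas as pd

# Replace these placeholders with your own credentials.
twitter_bearer_token = "TWITTER BEARER"
geonames_token = "GEONAMES TOKEN"

demo = Demographer()

# Each component adds one or more columns to the dataset.
component_1 = Rehydrate(twitter_bearer_token)
component_2 = GeoNamesDecoder(geonames_token)
component_3 = GenderAndAge()

data = pd.DataFrame({"tweet_ids": ["1431271582861774854", "1467887357668077581",
                                   "1467887350084689928", "1467887352647462912"]})
print(data)

# Components are registered in the order they should run.
demo.add_component(component_1)
demo.add_component(component_2)
demo.add_component(component_3)

print(demo.infer(data))

Output:
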
             tweet_ids      screen_name              name           location user_id_str  ...  geo_location_country  geo_location_address    age gender   is_org
0  1431271582861774854  federicobianchy  Federico Bianchi  Milano, Lombardia  2332157006  ...                 Italy                 Milan  19-29   male  non-org
1  1467887357668077581  federicobianchy  Federico Bianchi  Milano, Lombardia  2332157006  ...                 Italy                 Milan  19-29   male  non-org
2  1467887350084689928  federicobianchy  Federico Bianchi  Milano, Lombardia  2332157006  ...                 Italy                 Milan  19-29   male  non-org
3  1467887352647462912  federicobianchy  Federico Bianchi  Milano, Lombardia  2332157006  ...                 Italy                 Milan  19-29   male  non-org
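
To make the modular design more concrete, here is a minimal sketch of how a chain of components could be run in order, with each step adding columns that later steps may read. This is not the library's implementation: FakeRehydrate, FakeGeoDecoder and run_pipeline are hypothetical names, the table is a plain dict of lists rather than a DataFrame, and no network calls are made.

```python
class FakeRehydrate:
    """Hypothetical stand-in for a rehydration step (no network calls)."""

    def inputs(self):
        return ["tweet_ids"]

    def outputs(self):
        return ["location"]

    def infer(self, data):
        # Pretend every tweet was posted by a user with the same location.
        return {"location": ["Milano, Lombardia" for _ in data["tweet_ids"]]}


class FakeGeoDecoder:
    """Hypothetical stand-in for a geocoding step."""

    def inputs(self):
        return ["location"]

    def outputs(self):
        return ["geo_location_country"]

    def infer(self, data):
        # Toy rule: any non-empty location maps to the same country.
        return {"geo_location_country": ["Italy" if loc else None
                                         for loc in data["location"]]}


def run_pipeline(components, data):
    """Run components in order, adding each one's output columns to a copy of data."""
    result = dict(data)
    for component in components:
        # A component can only run if the columns it declares as inputs exist.
        missing = [col for col in component.inputs() if col not in result]
        if missing:
            raise KeyError(f"{type(component).__name__} needs missing columns: {missing}")
        result.update(component.infer(result))
    return result


table = {"tweet_ids": ["1431271582861774854", "1467887357668077581"]}
enriched = run_pipeline([FakeRehydrate(), FakeGeoDecoder()], table)
print(sorted(enriched))
```

In this sketch, dropping FakeGeoDecoder from the list simply omits its column, while running it without FakeRehydrate raises an error because the "location" column it depends on is never produced.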

Use-Case

Say you want to run a Hugging Face classifier on some Twitter data you have. For example, you might want to detect the sentiment of your tweets. The data you have might

Components

Twitter Demographer is built from components that can be chained together to build tools. For example, the GeoNamesDecoder, which predicts a user's location from a free-text string, looks like this:

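import geocoder  # third-party "geocoder" package, which provides geocoder.geonames

# Assumed import path for the base class; the original snippet omits it.
from twitter_demographer.components import Component


class GeoNamesDecoder(Component):
    """Resolve free-text user locations to a country and an address via GeoNames."""

    def __init__(self, key):
        super().__init__()
        self.key = key  # GeoNames credential passed to the geocoder

    def outputs(self):
        # Columns this component adds to the dataset.
        return ["geo_location_country", "geo_location_address"]

    def inputs(self):
        # Columns that must already be present in the dataset.
        return ["location"]

    def infer(self, data):
        # Collect one value per row for each output column.
        geo = self.initialize_return_dict()
        for val in data["location"]:
            if val is None:
                # No location string: append placeholders to keep rows aligned.
                geo["geo_location_country"].append(None)
                geo["geo_location_address"].append(None)
            else:
                g = geocoder.geonames(val, key=self.key)
                geo["geo_location_country"].append(g.country)
                geo["geo_location_address"].append(g.address)
        return geo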
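
The same three-method shape (inputs, outputs, infer) could host other models, such as the sentiment classifier mentioned in the use-case. The sketch below is self-contained and makes several assumptions: Component is a minimal stand-in for the package's base class, initialize_return_dict is assumed to return one empty list per output column, SentimentComponent and toy_classifier are hypothetical names, and the "text" input column is assumed to exist after rehydration.

```python
class Component:
    """Minimal stand-in for the package's base class (an assumption)."""

    def inputs(self):
        raise NotImplementedError

    def outputs(self):
        raise NotImplementedError

    def initialize_return_dict(self):
        # Assumed behaviour: one empty list per declared output column.
        return {name: [] for name in self.outputs()}


class SentimentComponent(Component):
    """Hypothetical component that labels each tweet text with a sentiment."""

    def __init__(self, classifier):
        super().__init__()
        # Any callable that maps a string to a label.
        self.classifier = classifier

    def inputs(self):
        # Assumed column name for the tweet text.
        return ["text"]

    def outputs(self):
        return ["sentiment"]

    def infer(self, data):
        result = self.initialize_return_dict()
        for text in data["text"]:
            if text is None:
                # Keep rows aligned when a tweet has no text.
                result["sentiment"].append(None)
            else:
                result["sentiment"].append(self.classifier(text))
        return result


def toy_classifier(text):
    """Placeholder for a real model: a trivial keyword rule."""
    return "positive" if "love" in text.lower() else "neutral"


component = SentimentComponent(toy_classifier)
labels = component.infer({"text": ["I love this", None, "Train is late"]})
print(labels)
```

Taking the classifier as a plain callable keeps the component independent of any particular model. In practice the callable might wrap a Hugging Face pipeline and return the "label" field of its prediction, but that wiring is left out here.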

Limitations and Ethical Considerations

Inferring user attributes always carries the risk of compromising user privacy. While this process can be useful for understanding and explaining phenomena in the social sciences, one should always consider the issues it can create.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2021-12-16)

  • First release on PyPI.
