Twitter Demographer

Twitter Demographer provides a simple API to enrich your Twitter data with additional variables such as sentiment, user location, gender, and age. The tool is extensible: you can add your own components to the system.

Features

Given a simple set of tweet IDs, Twitter Demographer rehydrates them and adds further variables to your dataset.

You are not forced to use any specific component. The tool is designed to be modular, so you decide what to add and what to remove.

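from twitter_demographer.twitter_demographer import Demographer
from twitter_demographer.components import Rehydrate
from twitter_demographer.demographics.m3 import GenderAndAge
# GeoNamesDecoder is used below but was never imported. The module path here is
# an assumption; adjust it to match your installed version.
from twitter_demographer.geolocation.geonames import GeoNamesDecoder
import pandas as pd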

twitter_bearer_token = "TWITTER BEARER"
geonames_token = "GEONAMES TOKEN"

demo = Demographer()

component_1 = Rehydrate(twitter_bearer_token)
component_2 = GeoNamesDecoder(geonames_token)
component_3 = GenderAndAge()

data = pd.DataFrame({"tweet_ids": ["1431271582861774854", "1467887357668077581",
                                   "1467887350084689928", "1467887352647462912"]})
print(data)
demo.add_component(component_1)
demo.add_component(component_2)
demo.add_component(component_3)

print(demo.infer(data))

Output:

             tweet_ids      screen_name              name           location user_id_str  ...  geo_location_country  geo_location_address    age gender   is_org
0  1431271582861774854  federicobianchy  Federico Bianchi  Milano, Lombardia  2332157006  ...                 Italy                 Milan  19-29   male  non-org
1  1467887357668077581  federicobianchy  Federico Bianchi  Milano, Lombardia  2332157006  ...                 Italy                 Milan  19-29   male  non-org
2  1467887350084689928  federicobianchy  Federico Bianchi  Milano, Lombardia  2332157006  ...                 Italy                 Milan  19-29   male  non-org
3  1467887352647462912  federicobianchy  Federico Bianchi  Milano, Lombardia  2332157006  ...                 Italy                 Milan  19-29   male  non-org

Use-Case

Say you want to run a Hugging Face classifier on some Twitter data you have, for example to detect its sentiment. The data you have might

Components

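Twitter Demographer is built from components that can be chained together into a pipeline. Each component declares the columns it reads (inputs) and the columns it produces (outputs). For example, the GeoNamesDecoder, which predicts a user's location from a free-text location string, looks like this.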

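import geocoder  # third-party geocoding package used in infer()

# Assumption: the Component base class lives in the same module as Rehydrate.
from twitter_demographer.components import Component


class GeoNamesDecoder(Component):

    def __init__(self, key):
        super().__init__()
        self.key = key  # GeoNames token

    def outputs(self):
        # Columns this component adds to the dataset.
        return ["geo_location_country", "geo_location_address"]

    def inputs(self):
        # Columns that must already be present in the dataset.
        return ["location"]

    def infer(self, data):
        geo = self.initialize_return_dict()
        for val in data["location"]:
            if val is None:
                # No location string: keep the rows aligned with placeholders.
                geo["geo_location_country"].append(None)
                geo["geo_location_address"].append(None)
            else:
                g = geocoder.geonames(val, key=self.key)
                geo["geo_location_country"].append(g.country)
                geo["geo_location_address"].append(g.address)
        return geo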
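
To show how a custom component might follow the same pattern, here is a minimal, self-contained sketch. It does not import the library: the Component class below is a hypothetical stand-in that only mimics the inputs/outputs/infer interface seen above, and HasLocation and run_components are illustrative names, not part of Twitter Demographer. It also uses a plain dict of lists in place of a pandas DataFrame, so treat it as an outline of the idea rather than the library's actual implementation.

```python
class Component:
    """Hypothetical stand-in for the library's Component base class."""

    def inputs(self):
        return []

    def outputs(self):
        return []

    def initialize_return_dict(self):
        # One empty list per declared output column.
        return {name: [] for name in self.outputs()}


class HasLocation(Component):
    """Illustrative component: flags rows whose location string is non-empty."""

    def inputs(self):
        return ["location"]

    def outputs(self):
        return ["has_location"]

    def infer(self, data):
        result = self.initialize_return_dict()
        for val in data["location"]:
            # None and whitespace-only strings both count as "no location".
            result["has_location"].append(bool(val and val.strip()))
        return result


def run_components(data, components):
    """Apply components in order, adding each one's outputs as new columns."""
    enriched = dict(data)
    for component in components:
        missing = [c for c in component.inputs() if c not in enriched]
        if missing:
            raise KeyError(f"{type(component).__name__} is missing inputs: {missing}")
        enriched.update(component.infer(enriched))
    return enriched


rows = {"location": ["Milano, Lombardia", None, "  "]}
enriched = run_components(rows, [HasLocation()])
print(enriched["has_location"])  # [True, False, False]
```

Declaring inputs and outputs up front lets a pipeline check that every component's required columns exist before it runs, which is why the order in which components are added matters.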

Limitations and Ethical Considerations

Inferring user attributes always carries the risk of compromising user privacy. While this process can be useful for understanding and explaining phenomena in the social sciences, one should always consider the issues it can create.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2021-12-16)

  • First release on PyPI.
