
Project description

abraham

Algorithmically predict public sentiment on a topic using flair sentiment analysis.

Installation

Installation is simple; just install via pip.

$ pip3 install abraham3k

Basic Usage

The simplest way to use the package is through the _summary functions.

from abraham3k.prophets import Isaiah

watched = ["amd", "tesla"]

darthvader = Isaiah(
      news_source="newsapi",
      newsapi_key="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
      bearer_token="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
      weights={"desc": 0.33, "text": 0.33, "title": 0.34},
)

scores = darthvader.news_summary(
      watched,
      window=2,  # how many days back from up_to to get news from
      up_to="2021-4-22T00:00:00Z",
)
print(scores)

'''
{'amd': (56.2, 43.8), 'tesla': (40.4, 59.6)} # a dict of {topic: (positive count, negative count)}
'''


scores = darthvader.twitter_summary(
      watched,
      start_time="2021-4-20T00:00:00Z" # note the variable name difference from above
      end_time="2021-4-22T00:00:00Z",
)
print(scores)

'''
{'amd': (57, 43), 'tesla': (42, 58)} # a dict of {topic: (positive count, negative count)}
'''
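
The summary dicts are plain Python dictionaries, so they are easy to post-process. Below is a minimal sketch (not part of the library) that ranks topics by their positive score, assuming a scores dict shaped like the output above.

# `scores` is assumed to have the shape shown above: {topic: (positive, negative)}
scores = {"amd": (56.2, 43.8), "tesla": (40.4, 59.6)}

# Sort topics from most to least positive.
ranked = sorted(scores.items(), key=lambda kv: kv[1][0], reverse=True)

for topic, (positive, negative) in ranked:
    print(f"{topic}: positive={positive}, negative={negative}")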

You can run the news_sentiment function to get the raw scores for the news. It returns a nested dictionary with a key for each topic.

from abraham3k.prophets import Isaiah

darthvader = Isaiah(news_source="google") 

scores = darthvader.news_sentiment(
    ["amd", "microsoft", "tesla", "theranos"],
    window=2,
)
print(scores['tesla']['text'])

'''
                                                 desc              datetime  probability sentiment
0   The latest PassMark ranking show AMD Intel swi...  2021-04-22T18:45:03Z     0.999276  NEGATIVE
1   The X570 chipset AMD offer advanced feature se...  2021-04-22T14:33:07Z     0.999649  POSITIVE
2   Apple released first developer beta macOS 11.4...  2021-04-21T19:10:02Z     0.990774  POSITIVE
3   Prepare terror PC. The release highly anticipa...  2021-04-22T18:00:02Z     0.839055  POSITIVE
4   Stressing ex x86 Canadian AI chip startup Tens...  2021-04-22T13:00:07Z     0.759295  POSITIVE
..                                                ...                   ...          ...       ...
95  Orthopaedic Medical Group Tampa Bay (OMG) exci...  2021-04-21T22:46:00Z     0.979155  POSITIVE
96  OtterBox appointed Leader, proudly 100% Austra...  2021-04-21T23:00:00Z     0.992927  POSITIVE
97  WATG, world's leading global destination hospi...  2021-04-21T22:52:00Z     0.993889  POSITIVE
98  AINQA Health Pte. Ltd. (Headquartered Singapor...  2021-04-22T02:30:00Z     0.641172  POSITIVE
99  Press Release Nokia publish first-quarter repo...  2021-04-22T05:00:00Z     0.894449  NEGATIVE
'''
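
Each leaf of the returned dict is a pandas DataFrame, so ordinary pandas operations apply. The sketch below (not part of the library) filters by classifier confidence and counts labels; it assumes the 'sentiment' and 'probability' columns shown in the sample output above, which may differ between versions.

from abraham3k.prophets import Isaiah

darthvader = Isaiah(news_source="google")

# {topic: {"title": DataFrame, "desc": DataFrame, "text": DataFrame}}
scores = darthvader.news_sentiment(["tesla"], window=2)

text_df = scores["tesla"]["text"]

# Keep only rows the classifier is reasonably confident about (0.9 is an arbitrary cutoff).
confident = text_df[text_df["probability"] > 0.9]

# Tally positive vs negative labels among the confident rows.
counts = confident["sentiment"].value_counts()
print(counts.get("POSITIVE", 0), "positive,", counts.get("NEGATIVE", 0), "negative")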

The same approach works for the Twitter API (see below for setting up Twitter access).

from abraham3k.prophets import Isaiah

darthvader = Isaiah(
    news_source="google",
    bearer_token="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",  # needed for the twitter methods
)

scores = darthvader.twitter_sentiment(
    ["amd", "microsoft", "tesla", "theranos"]
)
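
The start_time and end_time strings use the %Y-%m-%dT%H:%M:%SZ format shown in the docstrings below. Here is a small sketch (not part of the library) for building them with datetime instead of hard-coding them; the three-day lookback is arbitrary.

from datetime import datetime, timedelta, timezone

from abraham3k.prophets import Isaiah

TWITTER_TF = "%Y-%m-%dT%H:%M:%SZ"  # timestamp format the twitter methods expect

end = datetime.now(timezone.utc)
start = end - timedelta(days=3)  # arbitrary three-day window

darthvader = Isaiah(bearer_token="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")
scores = darthvader.twitter_sentiment(
    ["amd", "tesla"],
    start_time=start.strftime(TWITTER_TF),
    end_time=end.strftime(TWITTER_TF),
)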

Changing News Sources

Isaiah supports two news sources: Google News and NewsAPI. The default is Google News, but you can switch to NewsAPI by passing Isaiah(news_source='newsapi', newsapi_key='<your api key>') when instantiating. I'd highly recommend using NewsAPI; it's much better than the Google News API. Setup is simple: just head to the register page and sign up to get your API key.
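
For example, a NewsAPI-backed instance might look like the sketch below; reading the key from a NEWSAPI_KEY environment variable is purely an illustration, any way of supplying the string works.

import os

from abraham3k.prophets import Isaiah

# NEWSAPI_KEY is a hypothetical environment variable holding your newsapi.org key.
darthvader = Isaiah(news_source="newsapi", newsapi_key=os.environ["NEWSAPI_KEY"])

scores = darthvader.news_summary(["amd", "tesla"], window=2)
print(scores)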

Twitter Functionality

I'd highly recommend integrating Twitter. It's really simple; just head to Twitter Developer to sign up and get your bearer_token.
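
Once you have a bearer token, pass it to the constructor alongside your news settings. A minimal sketch, with the BEARER_TOKEN environment variable purely as an illustration:

import os

from abraham3k.prophets import Isaiah

darthvader = Isaiah(
    news_source="google",
    bearer_token=os.environ["BEARER_TOKEN"],  # illustrative; supply the token however you prefer
)

scores = darthvader.twitter_summary(["amd", "tesla"])
print(scores)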

Updates

I've made it pretty simple (at least for me) to push updates. From the project directory, I can run $ ./build-push 1.2.0 "update install requirements", where 1.2.0 is the version and "update install requirements" is the git commit message. This pushes the update to PyPI and to the GitHub repository.

Notes

There's another algorithm in progress (SALT), consisting of salt.py and salt.ipynb in the abraham3k/ directory plus the entire models/ directory. It's not ready for use yet, so don't worry about importing it.

Contributions

Pull requests welcome!

Detailed Usage

View the full docstrings here.

Help on module abraham3k.prophets in abraham3k:
    class Isaiah(builtins.object)
     |  Isaiah(news_source='google', newsapi_key=None, bearer_token=None, weights={'title': 0.33, 'desc': 0.33, 'text': 0.34}, loud=False) -> None
     |  
     |  Performs sentiment analysis on a search term by taking care of gathering
     |  all the articles and scoring. Named after the biblical prophet
     |  
     |  ...
     |  
     |  Attributes
     |  ----------
     |  sia : Elijiah
     |      Elijiah analyzer
     |  news_source : str
     |      where to get the news from (google or newsapi)
     |  splitting : bool
     |      whether or not to recursively analyze each sentence
     |  weights : dict
     |      how to weight the title, desc, and text attributes
     |      ex: {"title": 0.2, "desc": 0.3, "text": 0.5}
     |  loud : bool
     |      print unnecessary output (usually for debugging)
     |  bearer_token : str
     |      bearer token for the twitter api
     |  
     |  Methods
     |  -------
     |  get_articles(topics, up_to=today, window=2)
     |      gets articles for a list of search terms
     |  compute_total_avg(results_df, meta)
     |      computes avg scores for each row and column of an entire dataframe
     |  score_all(topic_results, meta)
     |      takes care of scoring the entire dataframe for each topic
     |  news_summary(topics, window=2, up_to=today)
     |      takes a list of topics and computes the avg scores for each
     |  news_sentiment(topics, window=2, up_to=today)
     |      takes a list of topics and gets the raw scores for each
     |      (per topic per text type per row)
     |  
     |  Methods defined here:
     |  
     |  __init__(self, news_source='google', newsapi_key=None, bearer_token=None, weights={'title': 0.33, 'desc': 0.33, 'text': 0.34}, loud=False) -> None
     |      Parameters
     |      ----------
     |      news_source : str = "google"
     |          where to get the news from
     |      newsapi_key : str = None
     |          api key to connect to newsapi.org
     |      bearer_token : str  = None
     |          bearer token for the twitter api
     |      splitting : bool = False
     |          recursively analyze each sentence or not
     |      weights : dict = {"title": 0.33, "desc": 0.33, "text": 0.34}
     |          how to weight the title, desc, and text attributes
     |      loud : bool = False
     |          print unnecessary output (usually for debugging)
     |  
     |  get_articles(self, topics: list, window: int = 2, up_to: str = '2021-04-23T21:54:23Z') -> Dict
     |      Takes a list of topics and returns a dict of topics : pd.dataframe
     |      
     |      Parameters
     |      ----------
     |      topics : list
     |          list of terms to search for
     |      up_to : str = datetime.now().strftime(TWITTER_TF)
     |          latest date to get news for
     |      window : int = 2
     |          how many days back to search for
     |      
     |      Returns
     |      -------
     |      dict
     |          in format {topic: <pd.DataFrame>, topic: <pd.DataFrame>, ... } with
     |          dataframe being of the results with columns ['title', 'author',
     |              'source', 'desc', 'text', 'datetime', 'url', 'urlToImage']
     |          ex:
     |          {
     |              'coinbase': <pd.DataFrame>,
     |              'bitcoin': <pd.DataFrame>,
     |              ...
     |          }
     |  
     |  news_sentiment(self, topics: list, window: int = 2, up_to: str = '2021-04-23T21:54:23Z')
     |      Gets the WHOLE sentiment for each topic. No or minimal averaging occurs.
     |      
     |      Parameters
     |      ----------
     |      topics : list
     |          list of terms to search for
     |      up_to : str = datetime.now().strftime(TWITTER_TF)
     |          latest date to get news for
     |      window : int = 2
     |          how many days back to search for
     |      
     |      Returns
     |      -------
     |      scores : dict
     |          returns a 2d dict, set up like so:
     |          {
     |              topic: {"title": titles, "desc": desc, "text": text}
     |          }
     |          where title, desc, and text are dataframes and each row looks like this:
     |          neg    neu    pos  compound                   sentence              datetime
     |        0.173  0.827  0.000   -0.5859  Tesla working vehicle ...  2021-04-20T09:31:36Z
     |  
     |  news_summary(self, topics: list, window: int = 2, up_to: str = '2021-04-23T21:54:23Z')
     |      Gets the summary sentiment for each topic
     |      
     |      Parameters
     |      ----------
     |      topics : list
     |          list of terms to search for
     |      up_to : str = datetime.now().strftime(TWITTER_TF)
     |          latest date to get news for
     |      window : int = 2
     |          how many days back to search for
     |      
     |      Returns
     |      -------
     |      scores : dict
     |          a dict of dicts arranged as {topic: scores},
     |          where scores is a tuple (positive count, negative count)
     |  
     |  twitter_sentiment(self, topics: list, start_time='2021-04-21T21:54:23Z', end_time='2021-04-23T21:54:23Z')
     |      Gets the WHOLE sentiment for each topic. No or minimal averaging occurs.
     |      
     |      Parameters
     |      ----------
     |      topics : list
     |          list of terms to search for
     |      start_time : str = (datetime.now() - timedelta(2)).strftime(TWITTER_TF)
     |          earliest time to search from, in the time format %Y-%m-%dT%H:%M:%SZ
     |      end_time : str = datetime.now().strftime(TWITTER_TF)
     |          latest time to search up to, in the time format %Y-%m-%dT%H:%M:%SZ
     |      
     |      Returns
     |      -------
     |      scores : dict
     |          a dict of dataframe of scores for each tweet
     |  
     |  twitter_summary(self, topics: list, start_time='2021-04-21T21:54:23Z', end_time='2021-04-23T21:54:23Z')
     |      Gets the summary sentiment for each topic from twitter
     |      
     |      Parameters
     |      ----------
     |      topics : list
     |          list of terms to search for
     |      start_time : str = (datetime.now() - timedelta(2)).strftime(TWITTER_TF)
     |          earliest time to search from, in the time format %Y-%m-%dT%H:%M:%SZ
     |      end_time : str = datetime.now().strftime(TWITTER_TF)
     |          latest time to search up to, in the time format %Y-%m-%dT%H:%M:%SZ
     |      
     |      Returns
     |      -------
     |      scores : dict
     |          a dict of dicts arranged as {topic: scores},
     |          where scores is a tuple (positive count, negative count)
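
As a quick illustration of get_articles, here's a sketch (not from the library docs) that fetches articles for a couple of topics and inspects the columns listed in the docstring above; column availability may vary by news source.

from abraham3k.prophets import Isaiah

darthvader = Isaiah(news_source="google")

# Returns {topic: DataFrame}; the docstring above lists columns such as
# 'title', 'author', 'source', 'desc', 'text', 'datetime', 'url', 'urlToImage'.
articles = darthvader.get_articles(["coinbase", "bitcoin"], window=2)

for topic, df in articles.items():
    print(topic, len(df), "articles")
    print(df[["title", "datetime"]].head())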
