This Python library provides corpora in English and several local African languages (e.g. Yoruba, Hausa, Pidgin). It also performs sentiment analysis on brands.

Project description

USAGE

Brand Sentiment Analysis

brand = The name of the brand you would like to perform sentiment analysis on, e.g. "MTN".
csvFileName = The name of the CSV file you would like to save your output to; default is brandNews.csv. (optional parameter)
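from anjie import brandSentimentAnalysis
import pandas as pd
# Run sentiment analysis for the brand and write the results to a CSV file.
brandSentimentAnalysis.anjie_brands(brand = "MTN", csvFileName = 'brandNews')
# Load the results. This assumes the library appends ".csv" to csvFileName;
# adjust the path if the file on disk is named differently.
df = pd.read_csv("brandNews.csv")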
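
The column names of the output file are not documented here, so it can help to confirm that the file was written and to look at its header before loading it. The following is a minimal sketch using only the standard library. `preview_csv` is a hypothetical helper, not part of anjie, and the file name brandNews.csv is an assumption about what the call above writes.

```python
import csv
from pathlib import Path


def preview_csv(path, max_rows=5):
    """Return (header, first rows) of a CSV file, or raise if it is missing.

    Hypothetical helper for illustration; it is not part of anjie.
    """
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"Expected output file not found: {csv_path}")
    with csv_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows at all
        rows = [row for _, row in zip(range(max_rows), reader)]
    return header, rows


if __name__ == "__main__":
    # Assumed file name; adjust it to whatever the scraper wrote on your machine.
    try:
        header, rows = preview_csv("brandNews.csv")
        print("columns:", header)
        print("rows previewed:", len(rows))
    except FileNotFoundError as error:
        print(error)
```

If the file is missing, the message shows the path that was tried, which makes a mismatch between csvFileName and the file on disk easy to spot.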

Scraping English Corpus

noRows = The number of rows of news you want.
csvFileName = The name of the CSV file you would like to save your output to; default is news.csv. (optional parameter)
News categories include ['news', 'sports', 'metro-plus', 'politics', 'business', 'entertainment', 'editorial', 'columnist'].
removeCategories = [] : The news categories you don't want in the scraped corpus. (optional parameter) e.g. englishCorpus.scrape(noRows = 150, removeCategories = ['metro-plus', 'politics'])
onlyCategories = [] : The only news categories you want in the scraped corpus. (optional parameter) e.g. englishCorpus.scrape(noRows = 150, onlyCategories = ['news', 'sports', 'metro-plus', 'entertainment', 'editorial', 'columnist'])


from anjie import englishCorpus
import pandas as pd

englishCorpus.scrape(noRows = 150)
df = pd.read_csv("news.csv")
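
To make the difference between removeCategories and onlyCategories concrete, here is a small sketch of one plausible reading of the two filters. It is not anjie's implementation: `select_categories` is a hypothetical function, and rejecting calls that pass both filters is my own assumption, since the behaviour in that case is not described above.

```python
# Category names copied from the list above.
ENGLISH_CATEGORIES = [
    "news", "sports", "metro-plus", "politics",
    "business", "entertainment", "editorial", "columnist",
]


def select_categories(all_categories, remove_categories=None, only_categories=None):
    """Illustrative only: return the categories that would be scraped."""
    if remove_categories and only_categories:
        # Assumption: the two filters are treated as mutually exclusive.
        raise ValueError("Pass removeCategories or onlyCategories, not both")
    if only_categories:
        return [c for c in all_categories if c in only_categories]
    if remove_categories:
        return [c for c in all_categories if c not in remove_categories]
    return list(all_categories)


print(select_categories(ENGLISH_CATEGORIES, remove_categories=["metro-plus", "politics"]))
# ['news', 'sports', 'business', 'entertainment', 'editorial', 'columnist']
```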

Scraping Hausa Corpus
noRows = The number of rows of news you want. Only 60 rows of the Hausa corpus are currently available.
csvName = The name of the CSV file you would like to save your output to; default is hausa_news.csv. (optional parameter)


from anjie import hausaCorpus
hausaCorpus.scrape(noRows = 10)
import pandas as pd
df = pd.read_csv("hausa_news.csv")
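
Because only 60 Hausa rows are available, a caller may want to cap the requested count before calling scrape. What the library does when asked for more is not stated here, so the sketch below simply clamps the value up front. `clamp_rows` is a hypothetical helper, not part of anjie.

```python
HAUSA_ROWS_AVAILABLE = 60  # figure taken from the description above


def clamp_rows(requested, available=HAUSA_ROWS_AVAILABLE):
    """Return the number of rows to request, never more than are available."""
    if requested < 1:
        raise ValueError("noRows must be a positive integer")
    return min(requested, available)


print(clamp_rows(10))   # 10
print(clamp_rows(150))  # 60
```

The result could then be passed as noRows, for example hausaCorpus.scrape(noRows = clamp_rows(150)).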

Scraping Pidgin English corpus
noRows = The number of rows of news you want.
csvFileName = The name of the CSV file you would like to save your output to; default is pidgin_corpus.csv. (optional parameter)
News categories include ['nigeria', 'africa', 'sport', 'entertainment'].
removeCategories = [] : The news categories you don't want in the scraped corpus. (optional parameter) e.g. pidginCorpus.scrape(noRows = 150, removeCategories = ['entertainment'])
onlyCategories = [] : The only news categories you want in the scraped corpus. (optional parameter) e.g. pidginCorpus.scrape(noRows = 150, onlyCategories = ['nigeria', 'sport', 'entertainment'])
from anjie import pidginCorpus
import pandas as pd

pidginCorpus.scrape(noRows = 20)
df = pd.read_csv("pidgin_corpus.csv")
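
The Pidgin categories differ from the English ones (for instance 'sport' here versus 'sports' there), so a quick check before scraping can catch a misspelled name. This is a sketch only: `validate_categories` is a hypothetical helper, and how anjie itself reacts to an unknown category is not documented here.

```python
# Category names copied from the list above.
PIDGIN_CATEGORIES = ["nigeria", "africa", "sport", "entertainment"]


def validate_categories(requested, allowed):
    """Return the requested names unchanged, or raise if any is not allowed."""
    unknown = [name for name in requested if name not in allowed]
    if unknown:
        raise ValueError(
            f"Unknown categories {unknown}; expected a subset of {allowed}"
        )
    return list(requested)


print(validate_categories(["nigeria", "sport"], PIDGIN_CATEGORIES))
# ['nigeria', 'sport']

try:
    validate_categories(["sports"], PIDGIN_CATEGORIES)  # English spelling
except ValueError as error:
    print(error)
```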

Scraping Yoruba Corpus
noRows = The number of rows of news you want.
csvFileName = The name of the CSV file you would like to save your output to; default is yoruba_corpus.csv. (optional parameter)


from anjie import yorubaCorpus
import pandas as pd

yorubaCorpus.scrape(noRows = 20)
df = pd.read_csv("yoruba_corpus.csv")
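
Once several corpora have been scraped, it may be convenient to combine them into one collection with a language label on each row. The sketch below does this with the standard library and makes no assumption about column names. `combine_corpora` and the "language" key are hypothetical additions of mine; the file names are the documented defaults.

```python
import csv
from pathlib import Path

# Default output names as documented above; the language labels are my own.
DEFAULT_FILES = {
    "english": "news.csv",
    "hausa": "hausa_news.csv",
    "pidgin": "pidgin_corpus.csv",
    "yoruba": "yoruba_corpus.csv",
}


def combine_corpora(files=None, directory="."):
    """Read every corpus CSV that exists and tag each row with its language."""
    files = DEFAULT_FILES if files is None else files
    combined = []
    for language, file_name in files.items():
        path = Path(directory) / file_name
        if not path.is_file():
            continue  # skip corpora that have not been scraped yet
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["language"] = language
                combined.append(row)
    return combined


rows = combine_corpora()
print(f"{len(rows)} rows loaded")
```

Each row is a dictionary keyed by whatever header the CSV has, so the result can be passed to pandas.DataFrame(rows) if a DataFrame is preferred.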

GitHub link for the project: https://github.com/Free-tek/Anjie_local_language_corpus_generator

Download files

Source Distribution

anjie-1.0.0.tar.gz (2.0 kB)

Uploaded Source

Built Distribution

anjie-1.0.0-py3-none-any.whl (2.7 kB)

Uploaded Python 3

File details

Details for the file anjie-1.0.0.tar.gz.

File metadata

  • Download URL: anjie-1.0.0.tar.gz
  • Upload date:
  • Size: 2.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.7.3

File hashes

Hashes for anjie-1.0.0.tar.gz

Algorithm    Hash digest
SHA256       532c736e4e8411ba59647e5193bd5499aeac89fe21fa7a0261f2a2ad0f984f20
MD5          a902a3a716f07d72a23b913ca41bb5bb
BLAKE2b-256  4f37495584ed57eeb20943be430e2a27b91654bd53395f27d443056f7ce52e57
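
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's hashlib. The helper names are hypothetical, and the local file path is an assumption about where the download was saved.

```python
import hashlib

# SHA256 digest for anjie-1.0.0.tar.gz, copied from the listing above.
EXPECTED_SHA256 = "532c736e4e8411ba59647e5193bd5499aeac89fe21fa7a0261f2a2ad0f984f20"


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed download location; change it to where the archive was saved.
    archive = "anjie-1.0.0.tar.gz"
    try:
        print("digest matches:", matches_expected(archive))
    except FileNotFoundError:
        print(f"{archive} not found; download it first")
```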

File details

Details for the file anjie-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: anjie-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 2.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.2 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.7.3

File hashes

Hashes for anjie-1.0.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       1fb27b2ed5d5025cf890e17f6d1c962d7fa00e034c64a4c99cae6a0870b51162
MD5          ae465761cf10ec7e0784c19d4a665ab0
BLAKE2b-256  e86612c31b5a45e0e0abb35757aaf6a39b46b323aad5d3a97b7406f8f512551b
