
An Indonesian Headline Detection Python API.


headline_detector

Indonesian Headline Detection Python API

This Python library provides APIs for detecting headlines in text, particularly posts from social media platforms such as Twitter. It uses models developed and trained on a dataset of Twitter posts containing both headline and non-headline texts, prepared with the assistance of journalism professionals to ensure data quality.

$ pip install headline-detector

Available scenarios and performance

Model         Scenario 1  Scenario 2  Scenario 3  Scenario 4  Scenario 5  Scenario 6
Fasttext      0.8766      0.8714      0.8793      0.8714      0.8714      0.8661
CNN           0.9081      0.9081      0.8950      0.8898      0.8950      0.8898
IndoBERTweet  0.9895      0.9921      0.9738      0.9580      0.9843      0.9685

All values are accuracy scores.
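
To make the accuracy figures easier to compare, the short sketch below picks the highest-scoring scenario for each model. The numbers are copied from the table; the `ACCURACY` dictionary and the `best_scenario` helper are illustrative names and are not part of the library.

```python
from typing import Tuple

# Accuracy values copied from the table above; list index 0 is Scenario 1.
ACCURACY = {
    "Fasttext": [0.8766, 0.8714, 0.8793, 0.8714, 0.8714, 0.8661],
    "CNN": [0.9081, 0.9081, 0.8950, 0.8898, 0.8950, 0.8898],
    "IndoBERTweet": [0.9895, 0.9921, 0.9738, 0.9580, 0.9843, 0.9685],
}


def best_scenario(model: str) -> Tuple[int, float]:
    """Return (scenario number, accuracy) for the model's best scenario.

    Ties resolve to the lowest scenario number.
    """
    scores = ACCURACY[model]
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return best_index + 1, scores[best_index]


for model in ACCURACY:
    scenario, accuracy = best_scenario(model)
    print(f"{model}: scenario {scenario} ({accuracy:.4f})")
```

Accuracy is only one consideration; the throughput figures may matter more for large batches.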

Model Throughput

Model         Throughput (approx. texts/second)
IndoBERTweet  1.3
CNN           281.60
Fasttext      2048.41

Tested on an Intel i7-6700K with 32 GB of RAM.
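
Throughput depends on the hardware, so it may be worth measuring on your own machine. The sketch below is one possible way to do that and is not necessarily how the figures above were obtained. `measure_throughput` and `dummy_predict` are hypothetical helpers; `dummy_predict` stands in for a real detector's `predict_text` method so that the example runs without downloading a model.

```python
import time
from typing import Callable, List, Sequence


def measure_throughput(
    predict: Callable[[Sequence[str]], List[int]],
    texts: Sequence[str],
    repeats: int = 3,
) -> float:
    """Return texts processed per second, using the fastest of several runs."""
    best_elapsed = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        predict(texts)
        best_elapsed = min(best_elapsed, time.perf_counter() - start)
    # Guard against a zero reading from a very coarse timer.
    return len(texts) / best_elapsed if best_elapsed > 0 else float("inf")


def dummy_predict(texts: Sequence[str]) -> List[int]:
    """Hypothetical stand-in for detector.predict_text; labels everything 0."""
    return [0 for _ in texts]


sample_texts = ["nama kamu siapa?"] * 1000
rate = measure_throughput(dummy_predict, sample_texts)
print(f"{rate:.2f} texts/second")
```

To time a real model, pass a loaded detector's `predict_text` method in place of `dummy_predict`.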

Usage

Each prediction is either 0 (non-headline) or 1 (headline).

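# Import the three detector classes provided by the library
from headline_detector import FasttextDetector, IndoBERTweetDetector, CNNDetector

# Load the Fasttext model for scenario 1
detector = FasttextDetector.load_from_scenario(1)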
data = detector.predict_text(
    [
        "nama kamu siapa?",
        "Kapolda Jatim Teddy Minahasa Dikabarkan Ditangkap Terkait Narkoba  https://t.co/LD9X6VFaUR",
    ]
)
print(data)  # output: [0, 1]

# Load the CNN model for scenario 3
detector = CNNDetector.load_from_scenario(3)
data = detector.predict_text(
    [
        "nama kamu siapa?",
        "Kapolda Jatim Teddy Minahasa Dikabarkan Ditangkap Terkait Narkoba  https://t.co/LD9X6VFaUR",
    ]
)
print(data)  # output: [0, 1]

# Load the IndoBERTweet model for scenario 5
detector = IndoBERTweetDetector.load_from_scenario(5)
data = detector.predict_text(
    [
        "nama kamu siapa?",
        "Kapolda Jatim Teddy Minahasa Dikabarkan Ditangkap Terkait Narkoba  https://t.co/LD9X6VFaUR",
    ]
)
print(data)  # output: [0, 1]

# 0 is non-headline
# 1 is headline
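
Because `predict_text` returns one label per input text, in input order, the predictions can be zipped back onto the texts to keep only the headlines. The sketch below shows this under that assumption. `keep_headlines` and `LinkStubDetector` are hypothetical names and are not part of headline_detector. The stub labels any text containing a link as a headline, purely so the example runs without a trained model.

```python
from typing import List, Sequence


def keep_headlines(detector, texts: Sequence[str]) -> List[str]:
    """Return only the texts that the detector labels as 1 (headline)."""
    predictions = detector.predict_text(list(texts))
    return [text for text, label in zip(texts, predictions) if label == 1]


class LinkStubDetector:
    """Hypothetical stand-in exposing the same predict_text interface.

    It is NOT a real model: it labels a text 1 if it contains a link.
    """

    def predict_text(self, texts: List[str]) -> List[int]:
        return [1 if "https://" in text else 0 for text in texts]


tweets = [
    "nama kamu siapa?",
    "Kapolda Jatim Teddy Minahasa Dikabarkan Ditangkap Terkait Narkoba  https://t.co/LD9X6VFaUR",
]
headlines = keep_headlines(LinkStubDetector(), tweets)
print(headlines)
```

With the real library, any detector returned by `load_from_scenario` could be passed in place of the stub.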

Paper

Coming soon.

