A package to classify scientific documents by field of study

Project description

Overview

A machine learning model that classifies scientific documents (articles and theses) by field of study.

Available languages: Arabic, French, English.

Training set: 117,976 samples. Test set: 50,558 samples. Accuracy: 87%.
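The two set sizes imply a split of roughly 70/30. The short calculation below only restates the figures given above. The page does not say how the split was drawn or on which set the accuracy was measured, so nothing beyond the ratio is computed.

```python
# Set sizes as reported in the overview.
TRAIN_SIZE = 117976
TEST_SIZE = 50558

total = TRAIN_SIZE + TEST_SIZE
train_fraction = TRAIN_SIZE / total
test_fraction = TEST_SIZE / total

print(f"total samples: {total}")
print(f"train: {train_fraction:.1%}, test: {test_fraction:.1%}")
```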

Available labels: 'Sciences and technology', 'Matter sciences', 'Mathematics and computer science', 'Natural and life sciences', 'Earth and universe sciences', 'Economics, marketing and management', 'Law and political sciences', 'Literature and foreign languages', 'social and human sciences', 'Sport and physical activities', 'Health sciences', 'Architecture and urban planning'.

Use

from doc_classifier import classify

summary = 'This article analyzes the basic classification of machine learning, including supervised learning, unsupervised learning, and reinforcement learning. It combines analysis on common algorithms in machine learning, such as decision tree algorithm, random forest algorithm, artificial neural network algorithm, SVM algorithm, Boosting and Bagging algorithm, BP algorithm. Through the development of theoretical systems, further improvement of autonomous learning capabilities, the integration of multiple digital technologies, and the promotion of personalized custom services, the purpose is to improve people's awareness of machine learning and accelerate the speed of popularization of machine learning.'

label = classify(summary)

print(label)
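To classify several abstracts at once, one option is a small wrapper around classify. The sketch below is not part of the package: classify_many, unexpected_labels and KNOWN_LABELS are hypothetical helper names, and it assumes classify returns a single label string taken from the "Available labels" list, which the page suggests but does not state. The classifier is passed in as an argument, so the helpers can be tried with a stand-in function before the package and its model are installed.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple

# Label names copied verbatim from the "Available labels" list above.
KNOWN_LABELS = frozenset({
    "Sciences and technology",
    "Matter sciences",
    "Mathematics and computer science",
    "Natural and life sciences",
    "Earth and universe sciences",
    "Economics, marketing and management",
    "Law and political sciences",
    "Literature and foreign languages",
    "social and human sciences",
    "Sport and physical activities",
    "Health sciences",
    "Architecture and urban planning",
})


def classify_many(
    texts: Iterable[str], classify_fn: Callable[[str], str]
) -> Tuple[List[str], Counter]:
    """Apply classify_fn to each text; return the labels and a per-label count."""
    labels = [classify_fn(text) for text in texts]
    return labels, Counter(labels)


def unexpected_labels(labels: Iterable[str]) -> List[str]:
    """Return, sorted, any labels that are not in the documented list."""
    return sorted(set(labels) - KNOWN_LABELS)


if __name__ == "__main__":
    # Stand-in classifier so the sketch runs without the package installed.
    # With the package available, pass the real function instead:
    #     from doc_classifier import classify
    def stand_in(text: str) -> str:
        return "Mathematics and computer science"

    labels, counts = classify_many(["first abstract", "second abstract"], stand_in)
    print(labels)
    print(counts.most_common())
    print(unexpected_labels(labels))
```

Checking the results against the documented list is a cheap way to notice when the assumption about the return value does not hold.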

Download files

Source Distribution

doc_classifier-0.0.9.tar.gz (28.2 MB)

Uploaded Source

Built Distribution

doc_classifier-0.0.9-py3-none-any.whl (28.6 MB)

Uploaded Python 3

File details

Details for the file doc_classifier-0.0.9.tar.gz.

File metadata

  • Download URL: doc_classifier-0.0.9.tar.gz
  • Upload date:
  • Size: 28.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.6

File hashes

Hashes for doc_classifier-0.0.9.tar.gz
Algorithm Hash digest
SHA256 c1dd1480d9402890b56fca287ef7d280229719fc18b9c15f09eb95190d39c2f8
MD5 3d777dd19d8b06d8ef9b08d0f2bab16e
BLAKE2b-256 fa69f0f4b0ef517e0c64fabd9fcf3d252130865994da1c12e81b31f9e361a2d2
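
A downloaded file can be checked against the published SHA256 digest with Python's standard hashlib module. This is a minimal sketch: sha256_of and matches_published_hash are hypothetical helper names, and the path in the usage note assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read 1 MiB at a time so a large archive is never fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the call would be matches_published_hash("doc_classifier-0.0.9.tar.gz", "c1dd1480d9402890b56fca287ef7d280229719fc18b9c15f09eb95190d39c2f8"), which should return True for an intact download.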

File details

Details for the file doc_classifier-0.0.9-py3-none-any.whl.

File metadata

  • Download URL: doc_classifier-0.0.9-py3-none-any.whl
  • Upload date:
  • Size: 28.6 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.6

File hashes

Hashes for doc_classifier-0.0.9-py3-none-any.whl
Algorithm Hash digest
SHA256 edccf3b1ab09f52cf44b980dbac8128cd99856655bc3b4b36bdc274e896d76cd
MD5 dfd03a9cebaffaeaade5d40c759df77e
BLAKE2b-256 540332c5c5f624e0544b2808430d05045a594e177add05c27e4e39003dd9a647
