
A package to generate comprehensive insights from documents using NLP techniques.

Project description

Document Insights Generator

The Document Insights Generator is a Python package that uses natural language processing (NLP) techniques to extract valuable insights from text documents. The tool supports PDF and Word (.docx) documents.
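As a rough illustration of what reading a .docx file involves, the sketch below pulls paragraph text out of a Word document using only the Python standard library. It is not taken from this package: the function name extract_docx_text is hypothetical, and the package's own extraction code is not shown in this description and may rely on a different library.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements inside word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_docx_text(path_or_file):
    """Return the paragraph text of a .docx file, one paragraph per line.

    Hypothetical helper for illustration; accepts a path or a file-like object.
    """
    # A .docx file is a ZIP archive; the main body lives in word/document.xml.
    with zipfile.ZipFile(path_or_file) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # <w:p> is a paragraph; its visible text is split across <w:t> runs.
    for para in root.iter(f"{{{W_NS}}}p"):
        text = "".join(node.text or "" for node in para.iter(f"{{{W_NS}}}t"))
        if text:
            paragraphs.append(text)
    return "\n".join(paragraphs)
```

This ignores tables, headers, footers, and footnotes, so it is only a starting point for plain body text.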

Features

- Text extraction from PDF and DOCX documents.
- Keyword extraction using TF-IDF.
- Named Entity Recognition (NER) using dslim/bert-base-NER transformer model.
- Topic modeling using Latent Dirichlet Allocation (LDA).
- Answers questions about the document content using GPT-2 model from the OpenAI API.
- Provides references based on the document’s content.

Installation

You can install the Document Insights Generator from PyPI:

    pip install documentinsightsgenerator

This will also install the required dependencies.

Usage

Here is a basic example of using the Document Insights Generator:

    from documentinsightsgenerator import DocumentInsightsGenerator

    # Initialize the DocumentInsightsGenerator with the API key
    dig = DocumentInsightsGenerator(api_key="your-openai-api-key")

    # Load a document
    dig.load_document("path/to/your/document.pdf")

    # Ask a question about the document
    answer = dig.answer_question("What is the main topic of the document?")
    print(f"Answer: {answer}\n")

For more detailed examples, please refer to the examples directory.
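To give a feel for the TF-IDF keyword extraction listed under Features, here is a minimal standard-library sketch. It is an assumption-laden illustration rather than the package's implementation: the names tokenize and tfidf_keywords are hypothetical, and the weighting is the textbook form tf × ln(N / df), which may differ from whatever variant the package uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_keywords(documents, doc_index, top_n=3):
    """Rank the terms of documents[doc_index] by TF-IDF score.

    Hypothetical helper: returns a list of (term, score) pairs,
    highest score first, ties broken alphabetically.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    tokens = tokenized[doc_index]
    if not tokens:
        return []

    term_counts = Counter(tokens)
    scores = {
        # Term frequency times inverse document frequency.
        term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
        for term, count in term_counts.items()
    }
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]
```

With this weighting, a term that occurs in every document scores zero, which is why common words such as "the" drop out of the ranking without an explicit stop-word list.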

Contributing

We welcome contributions! Please see our contributing guidelines for more details.

License

This project is licensed under the terms of the MIT license. See LICENSE for more information.

Release history

This version

0.1

Download files

Source Distribution

DocumentInsightsGenerator-0.1.tar.gz (7.2 kB)

Uploaded Source

Built Distribution

DocumentInsightsGenerator-0.1-py3-none-any.whl (8.2 kB)

Uploaded Python 3

File details

Details for the file DocumentInsightsGenerator-0.1.tar.gz.

File hashes

Hashes for DocumentInsightsGenerator-0.1.tar.gz
Algorithm Hash digest
SHA256 e8445b155f3ebc8278620459857fce0def28930bf12426f5928a4e581d57bcbb
MD5 7606cbe92f94c62d1b461af88ae2743d
BLAKE2b-256 11bb6557c99e6b2c519eccb0dcda137a3c2978ba2740b2314a3d0134c678c940

File details

Details for the file DocumentInsightsGenerator-0.1-py3-none-any.whl.

File hashes

Hashes for DocumentInsightsGenerator-0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 d6fec488c423e8970fd7d5cc11b4d6233795f5eb7735e00b5bd5f4d012f7ab74
MD5 417b337dafdef78db54abc901fd1b1b8
BLAKE2b-256 2cf1229ec2b0237b6f03555b42eb4c219a6adae053c595ba1a3dc889b615b10b
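
The digests listed for each file can be checked against a downloaded copy. The sketch below shows one way to do that with Python's hashlib. The helper names file_digest and verify_file are hypothetical, and treating "BLAKE2b-256" as BLAKE2b with a 32-byte digest is an assumption about how that column is computed.

```python
import hashlib
import hmac


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of the file at path.

    Hypothetical helper. algorithm is a hashlib name such as "sha256" or
    "md5", or "blake2b-256" for BLAKE2b with a 32-byte digest (assumed here
    to match the BLAKE2b-256 row).
    """
    if algorithm == "blake2b-256":
        hasher = hashlib.blake2b(digest_size=32)
    else:
        hasher = hashlib.new(algorithm)
    # Read in chunks so large files do not need to fit in memory.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_file(path, expected_hex, algorithm="sha256"):
    """Return True if the file's digest matches expected_hex."""
    actual = file_digest(path, algorithm)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

For example, calling verify_file on a downloaded DocumentInsightsGenerator-0.1.tar.gz with the SHA256 value listed for that file should return True if the download is intact.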
