🎾 ReadmeDocsFetcher Node for Haystack
This custom component for Haystack fetches documentation pages from the ReadMe documentation you have access to. It uses a MarkdownConverter to convert the fetched pages into a list of Haystack Documents. You can use this node on its own or within an indexing pipeline.
Installation
pip install readmedocs-fetcher-haystack
Usage in Haystack
- To initialize a `ReadmeDocsFetcher` you have to provide an `api_key` parameter. This is your ReadMe Docs API key.
- There are also 4 optional parameters to initialize the `ReadmeDocsFetcher`:
  - `slugs`: To fetch a list of specific pages from your documentation. E.g. if you want to fetch 'https://docs.haystack.deepset.ai/docs/installation', the slug would be `installation`. If not set, all of the available pages will be fetched.
  - `base_url`: Optionally provide this to add the full URL of a documentation page to the `meta` of the created document. For example `base_url='https://docs.haystack.deepset.ai'`.
  - `version`: If not set, the latest stable version of your docs will be fetched.
  - `markdown_converter`: When documents are fetched from ReadMe, temporary `.md` files are created and a `MarkdownConverter` is used to create a list of Haystack `Documents`. If not provided at initialization, a `MarkdownConverter` with the default parameters is used.
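The relationship between a page URL, its slug, and `base_url` can be sketched as follows. The helpers `slug_from_url` and `page_url` are hypothetical and are not part of `readmedocs-fetcher-haystack`. The `/docs/<slug>` path layout is an assumption taken from the example URL above.

```python
from urllib.parse import urlparse


def slug_from_url(url: str) -> str:
    """Return the last path segment of a ReadMe page URL (hypothetical helper)."""
    path = urlparse(url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


def page_url(base_url: str, slug: str) -> str:
    """Rebuild a full page URL, assuming the '/docs/<slug>' layout."""
    return f"{base_url.rstrip('/')}/docs/{slug}"


slug = slug_from_url("https://docs.haystack.deepset.ai/docs/installation")
print(slug)
print(page_url("https://docs.haystack.deepset.ai", slug))
```

A slug derived this way could then be passed in the `slugs` list, and the rebuilt URL shows the kind of value you might expect in a document's `meta` when `base_url` is set.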
Standalone
import os
from dotenv import load_dotenv
from haystack.nodes import MarkdownConverter
from readmedocs_fetcher_haystack import ReadmeDocsFetcher
load_dotenv()
README_API_KEY = os.getenv('README_API_KEY')
converter = MarkdownConverter(remove_code_snippets=False)
readme_fetcher = ReadmeDocsFetcher(api_key=README_API_KEY, markdown_converter=converter, base_url="https://docs.haystack.deepset.ai")
readme_fetcher.fetch_docs()
To fetch a single doc from a specific version:
readme_fetcher.fetch_docs(slugs=["nodes_overview"], version="v1.18")
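The standalone example reads the key with `os.getenv`, which returns `None` when the variable is missing. A minimal sketch of failing early in that case is shown below. The helper `require_env` is hypothetical and not part of this package.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Example usage (after load_dotenv() has run):
# README_API_KEY = require_env("README_API_KEY")
```

With this in place, a missing key surfaces as a clear error before any fetcher is created.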
In a Pipeline
import os
from dotenv import load_dotenv
from haystack import Pipeline
from haystack.nodes import MarkdownConverter, PreProcessor
from haystack.document_stores import InMemoryDocumentStore
from readmedocs_fetcher_haystack import ReadmeDocsFetcher
load_dotenv()
README_API_KEY = os.getenv('README_API_KEY')
converter = MarkdownConverter(remove_code_snippets=False)
readme_fetcher = ReadmeDocsFetcher(api_key=README_API_KEY, markdown_converter=converter, base_url="https://docs.haystack.deepset.ai")
preprocessor = PreProcessor()
doc_store = InMemoryDocumentStore()
pipe = Pipeline()
pipe.add_node(component=readme_fetcher, name="ReadmeFetcher", inputs=["File"])
pipe.add_node(component=preprocessor, name="Preprocessor", inputs=["ReadmeFetcher"])
pipe.add_node(component=doc_store, name="DocumentStore", inputs=["Preprocessor"])
pipe.run()
To fetch a single documentation page:
pipe.run(params={"ReadmeFetcher":{"slugs": ["nodes_overview"]}})