llama-index readers airbyte_salesforce integration

Airbyte Salesforce Loader

pip install llama-index-readers-airbyte-salesforce

The Airbyte Salesforce Loader lets you load records from different Salesforce objects as documents.

Usage

Here's an example usage of the AirbyteSalesforceReader.

from llama_index.readers.airbyte_salesforce import AirbyteSalesforceReader

salesforce_config = {
    # ...
}
reader = AirbyteSalesforceReader(config=salesforce_config)
documents = reader.load_data(stream_name="asset")

Configuration

Check out the Airbyte documentation page for details about how to configure the reader. The JSON schema that the config object should adhere to is available on GitHub: https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-salesforce/source_salesforce/spec.yaml

The general shape looks like this:

{
    "client_id": "<oauth client id>",
    "client_secret": "<oauth client secret>",
    "refresh_token": "<oauth refresh token>",
    "start_date": "<date from which to start retrieving records, in ISO format, e.g. 2020-10-20T00:00:00Z>",
    "is_sandbox": False,  # set to True if you're using a sandbox environment
    "streams_criteria": [  # Array of filters for salesforce objects that should be loadable
        {
            "criteria": "exacts",
            "value": "Account",
        },  # Exact name of salesforce object
        {"criteria": "starts with", "value": "Asset"},  # Prefix of the name
        # Other allowed criteria: ends with, contains, starts not with, ends not with, not contains, not exacts
    ],
}
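Since the config carries OAuth secrets, it can be convenient to assemble it from environment variables rather than hard-coding it. The following is a minimal sketch under stated assumptions: the function name `salesforce_config_from_env` and all `SALESFORCE_*` variable names are hypothetical and not part of the loader, and the stream filter is only an example.

```python
import os
from typing import Mapping, Optional


def salesforce_config_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a config dict in the shape shown above.

    The SALESFORCE_* variable names are hypothetical; use whatever
    naming fits your deployment.
    """
    env = os.environ if env is None else env
    return {
        # Required secrets: a missing variable raises KeyError early
        "client_id": env["SALESFORCE_CLIENT_ID"],
        "client_secret": env["SALESFORCE_CLIENT_SECRET"],
        "refresh_token": env["SALESFORCE_REFRESH_TOKEN"],
        # Optional values with illustrative defaults
        "start_date": env.get("SALESFORCE_START_DATE", "2020-10-20T00:00:00Z"),
        "is_sandbox": env.get("SALESFORCE_IS_SANDBOX", "false").lower() == "true",
        # Example filter only: expose the Account object
        "streams_criteria": [{"criteria": "exacts", "value": "Account"}],
    }
```

The resulting dict can then be passed as `config=` when constructing the reader.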

By default, all fields are stored as metadata in the documents, and the text is set to the JSON representation of all the fields. To construct the document text yourself, pass a record_handler to the reader:

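from llama_index.core import Document


def handle_record(record, id):
    # Use the record's "title" field as the text and keep all fields as metadata
    return Document(
        doc_id=id, text=record.data["title"], extra_info=record.data
    )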


reader = AirbyteSalesforceReader(
    config=salesforce_config, record_handler=handle_record
)
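A record handler often needs text built from more than one field, and some fields may be missing or empty. As a rough sketch, the helper below (the name `build_text` and the field names in the comment are hypothetical) joins the selected fields into "name: value" lines and skips absent ones.

```python
from typing import Any, Mapping, Sequence


def build_text(data: Mapping[str, Any], fields: Sequence[str]) -> str:
    """Join the chosen fields of a record into 'name: value' lines.

    Fields that are missing, None, or empty strings are skipped.
    """
    lines = []
    for name in fields:
        value = data.get(name)
        if value is not None and value != "":
            lines.append(f"{name}: {value}")
    return "\n".join(lines)


# Inside a record handler this could be used as, for example:
#   text=build_text(record.data, ["Name", "Description"])
# where "Name" and "Description" are illustrative field names.
```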

Lazy loads

The reader.load_data method collects all documents and returns them as a list. With a large number of documents, this can consume a lot of memory. Use reader.lazy_load_data instead to get an iterator that can be consumed document by document, without keeping all documents in memory.
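One way to consume such an iterator is in fixed-size batches, so that only one batch is held in memory at a time. The sketch below is a generic helper, not part of the loader; the name `in_batches` is hypothetical, and the usage comment assumes lazy_load_data accepts the same stream_name argument as load_data.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield lists of up to `size` items without materializing the whole input."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls at most `size` items from the shared iterator
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Possible usage with the reader (assumed call signature):
#   for batch in in_batches(reader.lazy_load_data(stream_name="asset"), 100):
#       ...  # index or store this batch, then let it go out of scope
```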

Incremental loads

This loader supports loading data incrementally (only returning documents that weren't loaded last time or got updated in the meantime):

reader = AirbyteSalesforceReader(config={...})
documents = reader.load_data(stream_name="asset")
current_state = reader.last_state  # can be pickled away or stored otherwise

updated_documents = reader.load_data(
    stream_name="asset", state=current_state
)  # only loads documents that were updated since last time
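To make incremental loads survive across runs, the state has to be persisted somewhere. The following sketch pickles it to a local file; the helper names `save_state` and `load_state` and the file path are hypothetical, and any other storage would work as well.

```python
import pickle
from pathlib import Path
from typing import Any, Optional


def save_state(state: Any, path: Path) -> None:
    """Pickle the state to a temporary file, then move it into place."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_bytes(pickle.dumps(state))
    tmp.replace(path)  # avoids leaving a half-written state file behind


def load_state(path: Path) -> Optional[Any]:
    """Return the previously saved state, or None if nothing was saved yet."""
    if not path.exists():
        return None
    return pickle.loads(path.read_bytes())


# Possible usage with the reader:
#   state = load_state(Path("salesforce_state.pkl"))
#   if state is None:
#       documents = reader.load_data(stream_name="asset")
#   else:
#       documents = reader.load_data(stream_name="asset", state=state)
#   save_state(reader.last_state, Path("salesforce_state.pkl"))
```

Only unpickle state files that your own process wrote, since loading a pickle from an untrusted source can execute arbitrary code.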

This loader is designed to be used as a way to load data into LlamaIndex.


