llama-index readers airbyte_salesforce integration

Airbyte Salesforce Loader

pip install llama-index-readers-airbyte-salesforce

The Airbyte Salesforce Loader allows you to access different Salesforce objects.

Usage

Here's an example usage of the AirbyteSalesforceReader.

from llama_index.readers.airbyte_salesforce import AirbyteSalesforceReader

salesforce_config = {
    # ...
}
reader = AirbyteSalesforceReader(config=salesforce_config)
documents = reader.load_data(stream_name="asset")

Configuration

See the Airbyte documentation page for details on configuring the reader. The config object must adhere to the JSON schema published on GitHub: https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-salesforce/source_salesforce/spec.yaml

The general shape looks like this:

{
    "client_id": "<oauth client id>",
    "client_secret": "<oauth client secret>",
    "refresh_token": "<oauth refresh token>",
    "start_date": "<date from which to start retrieving records, in ISO format, e.g. 2020-10-20T00:00:00Z>",
    "is_sandbox": False,  # set to True if you're using a sandbox environment
    "streams_criteria": [  # Array of filters for salesforce objects that should be loadable
        {
            "criteria": "exacts",
            "value": "Account",
        },  # Exact name of salesforce object
        {"criteria": "starts with", "value": "Asset"},  # Prefix of the name
        # Other allowed criteria: ends with, contains, starts not with, ends not with, not contains, not exacts
    ],
}
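To keep OAuth secrets out of source code, the config can be assembled from environment variables. The following is a minimal sketch, not part of the loader: the helper build_salesforce_config and the variable names SALESFORCE_CLIENT_ID, SALESFORCE_CLIENT_SECRET and SALESFORCE_REFRESH_TOKEN are assumptions made for illustration. Only the resulting keys follow the shape shown above.

```python
import os


def build_salesforce_config(
    start_date, object_names, is_sandbox=False, env=os.environ
):
    """Assemble the reader config, reading OAuth secrets from `env`.

    The environment variable names are hypothetical; rename them to
    match your deployment.
    """
    return {
        "client_id": env["SALESFORCE_CLIENT_ID"],
        "client_secret": env["SALESFORCE_CLIENT_SECRET"],
        "refresh_token": env["SALESFORCE_REFRESH_TOKEN"],
        "start_date": start_date,
        "is_sandbox": is_sandbox,
        # One "exacts" filter per Salesforce object name.
        "streams_criteria": [
            {"criteria": "exacts", "value": name} for name in object_names
        ],
    }
```

A missing variable raises KeyError right away, so a misconfigured environment surfaces before any request is made.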

By default, all fields are stored as metadata in the documents, and the text is set to the JSON representation of all the fields. To control how each document is built, pass a record_handler to the reader:

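from llama_index.core import Document


def handle_record(record, id):
    # Use one field as the document text and keep every field as metadata.
    return Document(
        doc_id=id, text=record.data["title"], extra_info=record.data
    )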


reader = AirbyteSalesforceReader(
    config=salesforce_config, record_handler=handle_record
)
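When a single field is not enough for the document text, a record handler can combine several fields. The helper below is a sketch only: record_to_text is a hypothetical name, and the field names "Name" and "Description" in the commented usage are assumptions that depend on the Salesforce object being loaded.

```python
def record_to_text(data, fields):
    """Join the selected fields into 'field: value' lines.

    Fields that are missing, None, or empty strings are skipped.
    """
    lines = []
    for field in fields:
        value = data.get(field)
        if value is not None and value != "":
            lines.append(f"{field}: {value}")
    return "\n".join(lines)


# Possible use inside a record handler (Document comes from llama_index.core):
# def handle_record(record, id):
#     text = record_to_text(record.data, ["Name", "Description"])
#     return Document(doc_id=id, text=text, extra_info=record.data)
```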

Lazy loads

The reader.load_data method collects all documents and returns them as a list. With a large number of documents, this can exhaust memory. reader.lazy_load_data instead returns an iterator that can be consumed document by document, without keeping all documents in memory.
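One way to consume such an iterator is in fixed-size batches, so that only one batch is held in memory at a time. The helper in_batches below is a hypothetical utility, not part of the loader, and it works on any iterable. The commented usage assumes lazy_load_data accepts the same stream_name argument as load_data.

```python
from itertools import islice


def in_batches(documents, batch_size):
    """Yield lists of up to `batch_size` items from any iterable, lazily."""
    iterator = iter(documents)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Possible use with the reader (assumed call signature):
# for batch in in_batches(reader.lazy_load_data(stream_name="asset"), 100):
#     process(batch)  # `process` stands in for your own indexing step
```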

Incremental loads

This loader supports loading data incrementally (only returning documents that weren't loaded last time or got updated in the meantime):

reader = AirbyteSalesforceReader(config={...})
documents = reader.load_data(stream_name="asset")
current_state = reader.last_state  # can be pickled away or stored otherwise

updated_documents = reader.load_data(
    stream_name="asset", state=current_state
)  # only loads documents that were updated since last time
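To carry the state across separate runs, it has to be persisted somewhere. The sketch below pickles it to a local file. save_state, load_state and the file path are hypothetical names chosen for illustration, and the commented usage assumes that passing state=None results in a full load.

```python
import pickle
from pathlib import Path


def save_state(state, path):
    """Pickle the reader state to `path`, overwriting any previous state."""
    Path(path).write_bytes(pickle.dumps(state))


def load_state(path):
    """Return the previously saved state, or None if nothing was saved yet."""
    state_file = Path(path)
    if not state_file.exists():
        return None
    return pickle.loads(state_file.read_bytes())


# Possible use across runs (assumed behavior for state=None):
# state = load_state("salesforce_state.pkl")
# documents = reader.load_data(stream_name="asset", state=state)
# save_state(reader.last_state, "salesforce_state.pkl")
```

Only unpickle state files you wrote yourself, since pickle can execute arbitrary code when loading untrusted data.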

This loader is designed to load data into LlamaIndex.
