
A Python library for the Salesforce Bulk API (that actually works)

Changes over salesforce-bulk

The salesforce-bulk library was used to export 18k records to Wingify's Salesforce system. Although the library was super useful, it is broken in places, no longer maintained, and was a pain to work with while figuring out the bugs. [@bholagabbar](https://github.com/bholagabbar) decided to fix all the issues faced and release a new, usable library, **salesforce-bulkipy**. This library has been tested successfully on our Salesforce Sandbox.

salesforce-bulkipy will be actively maintained, unlike salesforce-bulk

Installation

``sudo pip install salesforce-bulkipy``

In case your setup fails, you may have a few essential tools missing. Try ``sudo apt-get install build-essential libssl-dev libffi-dev python-dev``

Authentication

To access the Bulk API, you need to authenticate a user into Salesforce. There are two possible ways to achieve this. Both methods work irrespective of whether your organisation has Two-Factor Authentication enabled, so that's a massive overhead taken care of.

The code samples shown read credentials from a config.properties file. Feel free to adapt the input method to your setup.
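For reference, here is a minimal config.properties sketch matching the keys these samples read; all values are placeholders to replace with your own credentials:

# config.properties (placeholder values)
[Section]
username = user@example.com
password = your-password
security_token = your-security-token
session_id = your-session-id
host = yourinstance.salesforce.com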

1. username, password, security_token

from salesforce_bulkipy import SalesforceBulkipy
import ConfigParser  # Python 2 module name; use configparser on Python 3

config = ConfigParser.RawConfigParser()
config.read('config.properties')

username = config.get('Section', 'username')
password = config.get('Section', 'password')
security_token = config.get('Section', 'security_token')

# Optional parameter: sandbox=True to authenticate against a sandbox org
bulk = SalesforceBulkipy(username=username, password=password, security_token=security_token)

# Authentication Successful!

2. session_id, host

from salesforce_bulkipy import SalesforceBulkipy
import ConfigParser  # Python 2 module name; use configparser on Python 3

config = ConfigParser.RawConfigParser()
config.read('config.properties')

session_id = config.get('Section', 'session_id')
host = config.get('Section', 'host')

# Optional parameter: sandbox=True to authenticate against a sandbox org
bulk = SalesforceBulkipy(session_id=session_id, host=host)

# Authentication Successful!

Operations

The basic sequence for driving the Bulk API is:

  1. Create a new job

  2. Add one or more batches to the job

  3. Wait for each batch to finish

  4. Close the job

Bulk Insert, Update, Upsert, Delete

All Bulk upload operations work the same. You set the operation when you create the job. Then you submit one or more documents that specify records with columns to insert/update/delete.

For the upsert operation, you also need to specify something called the external_key, which can be any (preferably unique) attribute of your custom Salesforce object. Every record to upsert is checked against this key in Salesforce. Say your external key is Id: for every record you push, Salesforce checks whether a record with the same Id already exists. If yes, that record is updated; otherwise a new record is created.

For the delete operation, you should only submit the Id for each record.
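As an illustration, here is a minimal delete sketch. It assumes this fork keeps the upstream salesforce-bulk create_delete_job helper (a delete example is not shown elsewhere in these docs), and the Ids are placeholders:

from salesforce_bulkipy import SalesforceBulkipy
from salesforce_bulkipy import CsvDictsAdapter

bulk = SalesforceBulkipy(username=username, password=password, security_token=security_token)

# Each record carries only its Salesforce Id (placeholder values shown)
records_to_delete = [{'Id': '001xx0000000001'}, {'Id': '001xx0000000002'}]

job = bulk.create_delete_job("CustomObjectName", contentType='CSV')
csv_iter = CsvDictsAdapter(iter(records_to_delete))
batch = bulk.post_bulk_batch(job, csv_iter)
bulk.wait_for_batch(job, batch)
bulk.close_job(job)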

For efficiency you should use the post_bulk_batch method to post each batch of data. (Note that a batch can have a maximum of 10,000 records and be at most 1 GB in size.) You pass a generator or iterator into this function and it will stream data via POST to Salesforce. For help sending CSV formatted data you can use the salesforce_bulkipy.CsvDictsAdapter class. It takes an iterator returning dictionaries and returns an iterator which produces CSV data.
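Since a batch caps out at 10,000 records, one simple way to stay under the limit is to chunk your records and post one batch per chunk. A sketch, assuming records is an in-memory list of dicts:

from salesforce_bulkipy import CsvDictsAdapter

BATCH_SIZE = 10000  # Bulk API per-batch record limit

def post_in_batches(bulk, job, records):
    # Split records into chunks of at most BATCH_SIZE and post each as its own batch
    batches = []
    for start in range(0, len(records), BATCH_SIZE):
        chunk = records[start:start + BATCH_SIZE]
        csv_iter = CsvDictsAdapter(iter(chunk))
        batches.append(bulk.post_bulk_batch(job, csv_iter))
    return batches

You would then call wait_for_batch on each returned batch before closing the job.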

Concurrency mode: When creating the job, you can pass concurrency=Serial or concurrency=Parallel to set the concurrency mode for the job.
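For example, to force batches in a job to be processed one at a time (a sketch, assuming the concurrency keyword is forwarded through create_insert_job to job creation as in the upstream library):

# Batches in this job will be processed serially rather than in parallel
job = bulk.create_insert_job("CustomObjectName", contentType='CSV', concurrency='Serial')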

Bulk Insert Example

from salesforce_bulkipy import SalesforceBulkipy
from salesforce_bulkipy import CsvDictsAdapter

bulk = SalesforceBulkipy(username=username, password=password, security_token=security_token)

records_to_insert = [{}, {}]  # A list of custom object record dicts

# Bulk Insert
job = bulk.create_insert_job("CustomObjectName", contentType='CSV')
csv_iter = CsvDictsAdapter(iter(records_to_insert))
batch = bulk.post_bulk_batch(job, csv_iter)
bulk.wait_for_batch(job, batch)
bulk.close_job(job)
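After the batch finishes, you can inspect the per-record outcome. A sketch, assuming the fork keeps upstream's get_batch_result_iter behaviour for upload jobs, where each result row reports one submitted record:

# Each row reports the outcome for one submitted record
# (Salesforce returns columns such as Id, Success, Created, Error)
for result in bulk.get_batch_result_iter(job, batch, parse_csv=True):
    print(result)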

Bulk Query Example

from salesforce_bulkipy import SalesforceBulkipy

bulk = SalesforceBulkipy(username=username, password=password, security_token=security_token)

# Bulk Query
query = ''  # SOQL Query, e.g. "SELECT Id, Name FROM Object_Name"
job = bulk.create_query_job("Object_Name", contentType='CSV')
batch = bulk.query(job, query)
bulk.wait_for_batch(job, batch)
bulk.close_job(job)

# Result
results = bulk.get_batch_result_iter(job, batch, parse_csv=True)
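With parse_csv=True, each result row comes back as a dict keyed by column name, so the iterator can be consumed directly. A short usage sketch:

# Iterate the query results; each row is a dict keyed by column name
for row in results:
    print(row)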

Bulk Upsert Example

from salesforce_bulkipy import SalesforceBulkipy
from salesforce_bulkipy import CsvDictsAdapter

bulk = SalesforceBulkipy(username=username, password=password, security_token=security_token)

records_to_upsert = [{}, {}]  # A list of custom object record dicts

# Bulk Upsert, matching existing records on the external key "Unique_id"
job = bulk.create_upsert_job("Object_Name", external_id_name="Unique_id", contentType='CSV')
csv_iter = CsvDictsAdapter(iter(records_to_upsert))
batch = bulk.post_bulk_batch(job, csv_iter)
bulk.wait_for_batch(job, batch)
bulk.close_job(job)

Credits and Contributions

This repository is a maintained fork of heroku/salesforce-bulk. The changes incorporated here are a result of a joint effort by [@lambacck](https://github.com/lambacck), [@Jeremydavisvt](https://github.com/Jeremydavisvt), [@alexhughson](https://github.com/alexhughson) and [@bholagabbar](https://github.com/bholagabbar). Thanks to [@heroku](https://github.com/heroku) for creating the original useful library.

Feel free to contribute by creating Issues and Pull Requests. We’ll test and merge them.
