Primary Key Chunking Flashcards
How do you enable PK Chunking for Bulk API?
Set the Sforce-Enable-PKChunking header on the Bulk API job-creation request
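As a sketch, enabling PK chunking is just one extra HTTP header on the request that creates the Bulk API job. The session ID below is a made-up placeholder, and the helper function is my own illustration, not part of any Salesforce SDK:

```python
# Sketch: headers for a Bulk API create-job POST with PK chunking enabled.
# SESSION_ID is a hypothetical placeholder obtained from a prior login call.
SESSION_ID = "00Dxx0000000000!AQ..."

def bulk_job_headers(session_id, enable_pk_chunking=True):
    """Build the header dict for a Bulk API job-creation request."""
    headers = {
        "X-SFDC-Session": session_id,
        "Content-Type": "application/xml; charset=UTF-8",
    }
    if enable_pk_chunking:
        # PK chunking is switched on per job via this header.
        headers["Sforce-Enable-PKChunking"] = "TRUE"
    return headers

# The job itself would then be created with a POST to an endpoint like:
#   https://<instance>.salesforce.com/services/async/<version>/job
print(bulk_job_headers(SESSION_ID)["Sforce-Enable-PKChunking"])  # TRUE
```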
When would you use PK Chunking?
When you need to extract tens or hundreds of millions of records from Salesforce, you get better performance and reliability if you split the job into a number of separate queries that each retrieve a smaller portion of the data. The PK Chunking feature of the Bulk API automates this process: it uses the primary key (record ID) of an object to break the data into manageable chunks and queries each chunk separately.
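Conceptually, the server splits the object's ID space into fixed-size ranges and runs one query per range. The sketch below uses plain integers in place of real 15/18-character Salesforce record IDs, purely to illustrate the mechanism; the actual splitting is done server-side, so you never write this code yourself:

```python
def pk_chunk_ranges(min_id, max_id, chunk_size):
    """Yield (lo, hi) half-open ID ranges covering [min_id, max_id]."""
    lo = min_id
    while lo <= max_id:
        hi = min(lo + chunk_size, max_id + 1)
        yield lo, hi
        lo = hi

# Each range becomes one independent query (and one batch):
for lo, hi in pk_chunk_ranges(1, 250_000, 100_000):
    print(f"SELECT Name FROM Account WHERE Id >= {lo} AND Id < {hi}")
```

With 250,000 records and a chunk size of 100,000, this yields three queries: two full chunks and one final partial chunk.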
What is PK Chunking?
It is a supported feature of the Salesforce Bulk API, so it does all the work of splitting a large query into manageable chunks for you
Which objects can PK Chunking be used on?
It can be used with custom objects and with many standard objects, including:
- Account
- Campaign
- CampaignMember
- Case
- CaseHistory
- Contact
- Event
- EventRelation
- Lead
- LoginHistory
- Opportunity
- Task
- User
- Custom Objects
How are the chunks (of PK Chunking) processed?
Each chunk is processed as a separate batch that counts toward your daily batch limit
Can you combine filtering and PK Chunking?
Yes, you can do that by including a WHERE clause in the Bulk API query
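Under the hood, each chunk's ID-range filter is ANDed onto your WHERE clause, so every chunk still honors the original filter. A hedged sketch of the effective per-chunk query (Salesforce composes this server-side; the IDs are placeholders):

```python
def effective_chunk_query(soql, lo_id, hi_id):
    """Illustrative only: combine a user query with a chunk's ID-range filter."""
    if " where " in soql.lower():
        # Existing filter: AND the ID range onto it.
        return f"{soql} AND Id >= '{lo_id}' AND Id < '{hi_id}'"
    # No filter: the ID range becomes the WHERE clause.
    return f"{soql} WHERE Id >= '{lo_id}' AND Id < '{hi_id}'"

q = "SELECT Id, Name FROM Account WHERE BillingCountry = 'US'"
print(effective_chunk_query(q, "001xx0000000001", "001xx0000100001"))
```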
It’s recommended that you enable PK chunking when bulk queries consistently time out, or when:
- A table exceeds 10K records
- A table exceeds 50K records
- A table exceeds 100K records
- A table exceeds 10M records
A table exceeds 10M records
What is the default size for a PK Chunk?
100,000 records
What is the maximum size for a PK Chunk?
250,000 records
What is the most efficient chunk size (recommended) for an organization?
It depends on a number of factors, such as the data volume, and the filters of the query.
Customers may need to experiment with different chunk sizes to determine what is most optimal for their implementation.
The default chunk size is 100,000 records, and a chunk can be as large as 250,000 records, but increasing the records per chunk can reduce performance
When using PK Chunking, how would you specify the chunk size in the header?
Sforce-Enable-PKChunking: chunkSize=100000
What is a best practice for querying a supported object’s share table using PK Chunking?
In this case, determining the chunks is more efficient if the boundaries are defined on the parent object's record IDs rather than the share table's record IDs. For example, the following header could be used for a Bulk API query against the OpportunityShare object using PK Chunking:
Sforce-Enable-PKChunking: chunkSize=150000; parent=Opportunity
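The header options compose as semicolon-separated key=value pairs. A small helper (my own sketch, not an official Salesforce API) that builds the header value:

```python
def pk_chunking_header(chunk_size=None, parent=None):
    """Build the Sforce-Enable-PKChunking header value from its options."""
    opts = []
    if chunk_size is not None:
        opts.append(f"chunkSize={chunk_size}")
    if parent is not None:
        # For share tables, chunk on the parent object's record IDs.
        opts.append(f"parent={parent}")
    # With no options, simply enable the feature.
    return "; ".join(opts) if opts else "TRUE"

print(pk_chunking_header(150000, "Opportunity"))  # chunkSize=150000; parent=Opportunity
```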