Large Data Volumes Flashcards
What technique can be used to reduce the amount of data in Salesforce?
By using Mashups:
Maintain large data sets in a different application, and then make that application available to Salesforce as needed.
What are Mashups?
Mashups use the Salesforce presentation layer to display both Salesforce-hosted and externally hosted data
Provide two mashup designs supported by Salesforce
1. External website
2. Callouts (sketched below)
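A real Salesforce callout is written in Apex, but the pattern is easy to sketch in Python: keep the large data set in an external service and fetch only what is needed at display time. The endpoint URL and parameters below are hypothetical.

```python
import requests

# Hypothetical external service that owns the large data set.
EXTERNAL_API = "https://erp.example.com/api/orders"

def fetch_orders_for_account(account_id: str) -> list:
    """Fetch order records on demand instead of replicating them in Salesforce."""
    resp = requests.get(EXTERNAL_API, params={"accountId": account_id}, timeout=10)
    resp.raise_for_status()
    return resp.json()  # read at display time, so the data is never stale
```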
Name two advantages of Using Mashups
1. Data is never stale
2. No proprietary method needs to be developed to integrate the two systems
Name two disadvantages of Using Mashups
1. Accessing data takes more time
2. Functionality is reduced. For example, reporting and workflow do not work on the external data
In addition, because of their real-time restrictions, mashups are limited to short interactions and small amounts of data.
What is the impact of soft deleted records?
While the data is soft deleted, it still affects database performance because the data is still resident, and deleted records have to be excluded from any queries.
How long does data stay in the recycle bin?
Records stay in the Recycle Bin for 15 days, or until the Recycle Bin reaches its size limit, whichever comes first.
How can you hard delete records?
Use the Bulk API’s hard delete function to delete large data volumes.
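A minimal sketch of a Bulk API hard delete using the open-source simple_salesforce Python client. The credentials, the object, and the Archive__c filter field are placeholder assumptions, and the running user's profile must grant the "Bulk API Hard Delete" permission.

```python
from simple_salesforce import Salesforce

# Placeholder credentials; the user's profile must also grant the
# "Bulk API Hard Delete" permission for hard_delete to succeed.
sf = Salesforce(username="user@example.com", password="secret",
                security_token="token")

# Hypothetical Archive__c flag marks the records to purge. Hard-deleted
# records skip the Recycle Bin, so they stop weighing on query performance.
doomed = sf.bulk.Account.query("SELECT Id FROM Account WHERE Archive__c = true")
sf.bulk.Account.hard_delete([{"Id": r["Id"]} for r in doomed])
```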
What is the best practice when you want to improve the performance of loading data from the API?
Use the Salesforce Bulk API when you have more than a few hundred thousand records
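For example, a sketch of a large load through the Bulk API with simple_salesforce (the credentials and record data are illustrative):

```python
from simple_salesforce import Salesforce

sf = Salesforce(username="user@example.com", password="secret",
                security_token="token")

# Several hundred thousand rows: route them through the Bulk API in
# 10,000-record batches rather than the synchronous REST API.
records = [{"LastName": f"Load Test {i}"} for i in range(300_000)]
sf.bulk.Contact.insert(records, batch_size=10_000)
```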
What is the best practice when you want to delete large volumes of data
When deleting large volumes of data, a process that involves deleting one million or more records, use the hard delete option of the Bulk API.
Deleting large volumes of data might take significant time due to the complexity of the deletion process
What is the best practice when you want to make the data deletion process more efficient
When deleting records that have many children, delete the children first
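A children-first sketch with simple_salesforce, assuming a hypothetical Invoice__c child object with an Account__c lookup (hence Account__r in SOQL) and an Archive__c flag on Account:

```python
from simple_salesforce import Salesforce

sf = Salesforce(username="user@example.com", password="secret",
                security_token="token")

# Delete the children first so Salesforce doesn't have to resolve the
# parent-child relationship for every parent row it removes.
children = sf.bulk.Invoice__c.query(
    "SELECT Id FROM Invoice__c WHERE Account__r.Archive__c = true")
sf.bulk.Invoice__c.delete([{"Id": r["Id"]} for r in children])

# Then the now-childless parents.
parents = sf.bulk.Account.query("SELECT Id FROM Account WHERE Archive__c = true")
sf.bulk.Account.delete([{"Id": r["Id"]} for r in parents])
```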
What is the best practice when you want to avoid sharing computations
Avoid having any user own more than 10,000 records
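One way to spot ownership skew before it hurts is an aggregate SOQL query; a sketch with simple_salesforce:

```python
from simple_salesforce import Salesforce

sf = Salesforce(username="user@example.com", password="secret",
                security_token="token")

# Flag any owner holding more than 10,000 Accounts; sharing
# recalculations get expensive for such skewed owners.
result = sf.query(
    "SELECT OwnerId, COUNT(Id) total FROM Account "
    "GROUP BY OwnerId HAVING COUNT(Id) > 10000")
for row in result["records"]:
    print(row["OwnerId"], row["total"])
```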
What is the best practice when you want to improve performance when you have a large amount of data
Use a data-tiering strategy that spreads data across multiple objects and brings in data on demand from another object or external store
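A minimal tiering sketch with simple_salesforce: keep hot rows in the primary object and fall back to a hypothetical Order_Archive__c object on a miss. All object and field names here are assumptions.

```python
from simple_salesforce import Salesforce

sf = Salesforce(username="user@example.com", password="secret",
                security_token="token")

def get_order(order_number: str) -> dict:
    """Look in the hot object first, then the hypothetical archive tier."""
    hot = sf.query(
        f"SELECT Id, Status__c FROM Order__c WHERE OrderNumber__c = '{order_number}'")
    if hot["records"]:
        return hot["records"][0]
    cold = sf.query(
        f"SELECT Id, Status__c FROM Order_Archive__c WHERE OrderNumber__c = '{order_number}'")
    return cold["records"][0] if cold["records"] else {}
```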
What is the best practice when you want to reduce the time it takes to create full copies of production sandboxes with large data volumes
When creating copies of production sandboxes, exclude field history if it isn’t required, and don’t change a lot of data until the sandbox copy is created
Provide a solution for the following situation:
The customer designed a custom integration to synchronize Salesforce data with external customer applications.
The integration process involved:
- Querying Salesforce for all data in a given object
- Loading this data into the external systems
- Querying Salesforce again to get IDs of all the data so the integration process could determine what data has been deleted from Salesforce
The objects contained several million records. The integration also used a dedicated API user that was part of the sharing hierarchy, which limited the records retrieved. The queries took minutes to complete.
The solution was to give the query access to all the data and then to use selective filters to get the appropriate records.
For example, using an administrator as the API user would have provided access to all of the data and prevented sharing from being considered in the query.
An additional solution would have been to create a delta extraction, lowering the volume of data that needed to be processed.
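A delta-extraction sketch with simple_salesforce: each run pulls only records created, changed, or deleted since the last run, filtering on the indexed SystemModstamp field so the query stays selective. Persisting the watermark between runs is assumed.

```python
from simple_salesforce import Salesforce

sf = Salesforce(username="user@example.com", password="secret",
                security_token="token")

# Watermark from the previous run (normally persisted somewhere durable).
last_run = "2024-06-01T00:00:00Z"

# Only rows touched since the watermark, not the whole multi-million-row object.
changed = sf.query_all(
    "SELECT Id, Name, SystemModstamp FROM Account "
    f"WHERE SystemModstamp >= {last_run}")["records"]

# include_deleted=True uses the queryAll endpoint, which also returns
# soft-deleted rows, so the external system can mirror deletions too.
deleted = sf.query_all(
    "SELECT Id FROM Account WHERE IsDeleted = true "
    f"AND SystemModstamp >= {last_run}", include_deleted=True)["records"]
```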