Data Skewing Flashcards
Provide a solution for the following situation:
The customer had hundreds of thousands of account records and 15 million invoices, which were stored in a custom object in a master-detail relationship with the account. Each account record took a long time to display because the invoices related list was slow to render.
The delay in displaying the invoices related list was related to data skew. While most account records had few invoice records, there were some records that had thousands of them.
To reduce the delay, the customer tried to reduce the number of invoice records for those parents and keep data skew to a minimum in child objects. Using the Enable Separate Loading of Related Lists setting allowed the account detail to render while the customer was waiting for the related list query to complete.
What is the best practice for making deployments more efficient when the batch contains many parent/child records?
Distribute child records so that no parent has more than 10,000 child records. For example, in a deployment that has many contacts but does not use accounts, set up several dummy accounts and distribute the contacts among them.
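The distribution step can be pictured with a small Python sketch. The helper name `distribute_children`, the "Dummy Account" names, and the in-memory lists are illustrative assumptions, not a Salesforce API; the 10,000 cap is the figure from the card above.

```python
import math

MAX_CHILDREN_PER_PARENT = 10_000  # cap taken from the guidance above


def distribute_children(child_ids, max_per_parent=MAX_CHILDREN_PER_PARENT):
    """Assign children to hypothetical dummy parents so no parent exceeds the cap."""
    if not child_ids:
        return {}
    # Smallest number of parents that can hold every child under the cap.
    parent_count = math.ceil(len(child_ids) / max_per_parent)
    assignment = {f"Dummy Account {i + 1}": [] for i in range(parent_count)}
    parents = list(assignment)
    # Round-robin keeps the parents evenly loaded.
    for index, child_id in enumerate(child_ids):
        assignment[parents[index % parent_count]].append(child_id)
    return assignment


contacts = [f"contact-{n}" for n in range(25_000)]
buckets = distribute_children(contacts)
print({name: len(ids) for name, ids in buckets.items()})
```

With 25,000 contacts this sketch creates three dummy parents, each holding fewer than 10,000 children.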
What is record ownership skew?
A large number of records of the same object type owned by a single user.
Record ownership is a powerful feature for managing record access. When individual users own the records they create, the role hierarchy makes sure managers have access to the data owned by their subordinates. But when a single user owns a high percentage of the data for any one object, Force.com must perform large sharing recalculations when you move that user in the hierarchy. The recalculations can be even worse when you add the user to, or remove the user from, a role or public group that uses a sharing rule to make its data visible to other users in the organization.
How can you try to limit record ownership skew?
- Design ownership strategy from the beginning so users own the data they create, then use the role hierarchy and sharing rules to provide access to others
- When you have a single owner with a large amount of data, place them in their own role at the top of the hierarchy
What is parent/child data skew?
A large number of child records are associated with the same parent record. Performance can degrade, for example, when the ownership of the child contacts changes.
How can you avoid parent/child data skew?
Salesforce recommends that you keep the number of child records assigned to a single parent below 10,000.
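As a rough illustration of checking data against that ceiling, the Python sketch below counts children per parent and reports any parent at or above 10,000. The function name `find_skewed_parents` and the sample IDs are hypothetical; it assumes you already have an export with one parent ID per child record.

```python
from collections import Counter

SKEW_THRESHOLD = 10_000  # recommended ceiling from the card above


def find_skewed_parents(parent_ids, threshold=SKEW_THRESHOLD):
    """Return {parent_id: child_count} for parents at or above the threshold."""
    counts = Counter(parent_ids)
    return {pid: n for pid, n in counts.items() if n >= threshold}


# Hypothetical export: one parent id per child record.
child_parent_ids = ["acct-A"] * 12_000 + ["acct-B"] * 40 + ["acct-C"] * 9_999
skewed = find_skewed_parents(child_parent_ids)
print(skewed)
```

Only `acct-A` is reported here, because `acct-C` stays just below the ceiling.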
What is account data skew?
Accounts and Opportunities have special data relationships that maintain parent and child record access under private sharing models. Too many child records associated with the same parent record cause data skew.
What two issues can happen with account data skew?
Record locking
Sharing Issues
Why does record locking occur (or when will it)?
It occurs with account data skew, when a large number of contacts under the same account are updated in multiple threads. For each update, the system locks both the contact and its parent account to maintain integrity.
Why do sharing issues occur (or when will they)?
They occur with account data skew. If you change the owner of an account, the system may need to examine every one of the account’s child records and adjust their sharing as well. That may include recalculating the role hierarchy and sharing rules.
What is lookup skew?
When a very large number of records are associated with a single record in the lookup object (the object you’re searching against).
What strategies/techniques can you follow for mitigating problems related to Lookup Skew?
When you encounter any type of lock exception, try the following:
- Reducing record save time (i.e., increase save performance, optimize trigger/class code, reduce workflow, consider asynchronous operations, etc.)
- Distributing the skew
- Using a picklist field instead of a lookup field
- Reducing the load (i.e. from automated processes and integrations running concurrently)
Describe the reason locking happens (for example, when you have account data skew).
When you add a new contact and click Save, the database automatically locks the parent account as it begins the DML operation, before it actually inserts the contact. The database releases the lock after executing the triggers and standard save operations.
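The ordering described above (lock the parent, insert the child, run triggers, release) can be mimicked with a toy Python model. This is only an analogy: `threading.Lock` stands in for the database row lock on the account, and `save_contact` and the contact names are hypothetical.

```python
import threading

account_lock = threading.Lock()   # stands in for the row lock on the parent account
events = []                       # ordered log of what happened


def save_contact(name):
    # The parent is locked first, before the child is inserted.
    with account_lock:
        events.append(("lock", name))
        events.append(("insert", name))
        events.append(("triggers", name))
        events.append(("release", name))  # logged just before the lock is freed


threads = [threading.Thread(target=save_contact, args=(n,)) for n in ("Ann", "Bob", "Cy")]
for t in threads:
    t.start()
for t in threads:
    t.join()

for event in events:
    print(event)
```

Because every save under the same account must hold the same lock, the three saves run one after another rather than in parallel, which is the contention that a heavily skewed account amplifies.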
What is a side effect of account skew (and the data locking issue)?
Parent Implicit Sharing
In a private sharing model, the built-in implicit sharing feature provides record accessibility, and its parent implicit sharing provides read access to an account for users who have access to standard child objects such as Contacts, Cases and Opportunities.
So when you create a contact, sharing calculations determine during the save operation if a parent implicit share to the account should be created.
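A minimal Python sketch of that decision follows, assuming access is held in a plain dictionary. The function `has_parent_implicit_share`, the user names, and the data layout are illustrative assumptions, not how the platform stores sharing.

```python
def has_parent_implicit_share(user, account_contacts, contact_access):
    """True if `user` can access at least one contact under the account.

    account_contacts: iterable of contact ids under one account (hypothetical data).
    contact_access: dict mapping contact id -> set of users with access.
    """
    return any(user in contact_access.get(cid, set()) for cid in account_contacts)


contacts_under_account = ["c1", "c2", "c3"]
access = {"c1": {"jane"}, "c2": {"raj"}, "c3": {"raj"}}

print(has_parent_implicit_share("jane", contacts_under_account, access))  # True

# Jane loses access to c1; the check must now look at every sibling contact.
access["c1"] = {"raj"}
print(has_parent_implicit_share("jane", contacts_under_account, access))  # False
```

In the worst case the check visits every child of the account, so its cost grows with the number of children under that one parent.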
Explain the impact of account data skew using a scenario.
User Jane has access to contact Bob Smith and, through it, has a parent implicit share to a single generic account that holds 300,000 contacts. Her manager changes ownership of the contact Bob Smith to another salesperson and clicks Save.
The sharing calculations now run for a longer period of time because they have to determine whether to delete the parent implicit share. The calculations check if Jane has access to the remaining 299,999 contacts under the single generic account.
If another salesperson tries to add a new contact for the same account while the sharing calculations are occurring, that request will wait for Force.com to release the lock on the account, resulting in lock contention and reduced database concurrency. Because this is a synchronous request, this request starts counting against the concurrent Apex request limit if the wait exceeds 5 seconds. If the wait exceeds 10 seconds, the salesperson will get an “UNABLE_TO_LOCK_ROW” error.
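The two thresholds in this scenario can be summarized in a short Python sketch. The function name `classify_lock_wait` and the returned labels are hypothetical; the 5-second and 10-second values are the ones stated in the card.

```python
CONCURRENT_LIMIT_AFTER_SECONDS = 5   # from the scenario above
LOCK_TIMEOUT_SECONDS = 10            # from the scenario above


def classify_lock_wait(wait_seconds):
    """Describe what happens to a synchronous request waiting on a locked account."""
    if wait_seconds > LOCK_TIMEOUT_SECONDS:
        return "UNABLE_TO_LOCK_ROW"
    if wait_seconds > CONCURRENT_LIMIT_AFTER_SECONDS:
        return "counts against concurrent Apex request limit"
    return "completes normally"


for wait in (2, 7, 12):
    print(wait, "->", classify_lock_wait(wait))
```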