Large Data Volumes Flashcards

1
Q

What is the criteria for Standard Indexes

A

Standard Indexed Fields
Used if the filter matches less than 30% of the total records, up to one million records.
For example, a standard index is used if:
• A query is executed against a table with two million records, and the filter matches 600,000 or fewer records.
• A query is executed against a table with five million records, and the filter matches one million or fewer records.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Skinny Tables

A

Salesforce creates skinny tables to contain frequently used fields and to avoid joins, and it keeps the skinny tables in sync with their source tables when the source tables are modified. To enable skinny tables, contact salesforce.com Customer Support.

Skinny tables are most useful with tables containing millions of records. They can be created on custom objects, and on Account, Contact, Opportunity, Lead, and Case objects. In addition, they can enhance performance for reports, list views, and SOSL.

Skinny tables can contain a maximum of 100 columns.
• Skinny tables cannot contain fields from other objects.
• Skinny tables are not copied to sandbox organizations. To have production skinny tables activated in a sandbox organization, contact salesforce.com Customer Support.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Custom Indexes

A

Custom Indexed Fields
Used if the filter matches less than 10% of the total records, up to 333,333 records.
For example, a custom index is used if:
• A query is executed against a table with 500,000 records, and the filter matches 50,000 or fewer records.
• A query is executed against a table with five million records, and the filter matches 333,333 or fewer records.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What happens if one of the search criteria is not indexed?

A

If the criteria for an indexed field are not met, only that index is excluded from the query—other indexes might still be used if they are in the WHERE clause and meet the thresholds for records.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Do all fields in the OR clause need to be indexed for it to work?

A

All fields in the OR clause must be indexed in order for any index to be used

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are the OOB standard indexed fields?

A
  • RecordTypeId
  • Division
  • CreatedDate
  • Systemmodstamp (LastModifiedDate)
  • Name
  • Email (for contacts and leads)
  • Foreign key relationships (lookups and master-detail)
  • The unique Salesforce record ID, which is the primary key for each object

Salesforce also supports custom indexes on custom fields, with the exception of multi-select picklists, text area (long), text area(rich), non-deterministic formula fields, and encrypted text fields.

External IDs cause an index to be created on that field, which is then considered by the Force.com query optimizer. External IDs can be created on only the following fields.
• Auto Number
• Email
• Number
• Text
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What are two column custom indexes

A

Two-column custom indexes are a specialized feature of the Salesforce platform. They are useful for list views and other situations in which you want to use one field to select the records to display and a second field to sort those records. For example, an Account list view that selects by State and sorts by City can use a two-column index with State in the first column and City in the second.

Two-column indexes are subject to the same restrictions as single-column indexes, with one exception. Two-column indexes can have nulls in the second column by default, whereas single-column indexes cannot, unless
salesforce.com Customer Support has explicitly enabled the option to include nulls.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are the techniques to optimize performance for LDV?

A
  • Mashups
    External website or Ape Callout to get data
  • Defer Sharing Calculation
    Perform sharing calc at night or weekend
  • Using SOQL and SOSL
  • Deleting Data
    Use bulk APIs hard delete option
  • Search
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are the advantages and disadvantages of Mashups?

A

Advantages of Using Mashups
• Data is never stale.
• No proprietary method needs to be developed to integrate the two systems.

Disadvantages of Using Mashups
• Accessing data takes more time.
• Functionality is reduced. For example, reporting and workflow do not work on the external data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

When would you use SOQL?

A

Use SOQL when:
• You know in which objects or fields the data resides.
• You want to:
◊ Retrieve data from a single object or from multiple objects that are related to one another
◊ Count the number of records that meet specified criteria
◊ Sort results as part of the query
◊ Retrieve data from number, date, or checkbox fields

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

When would you use SOSL?

A

Use SOSL when:
• You don’t know in which object or field the data resides, and you want to find it in the most efficient way possible.
• You want to:
◊ Retrieve multiple objects and fields efficiently, and the objects might or might not be related to one another
◊ Retrieve data for a particular division in an organization using the divisions feature, and you want to find it in the most efficient way possible

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Consider the following when using SOQL or SOSL

A

• Both SOQL WHERE filters and SOSL search queries can specify text you should look for. When a given search can use either language, SOSL is generally faster than SOQL if the search expression uses leading wildcards or a CONTAINS term.
• In some cases, when multiple WHERE filters are being used in SOQL, indexes cannot be used even though the fields in the WHERE clause can be indexed. In this situation, decompose the single query into multiple queries, each of which should
have one WHERE filter, and then combine the results.
• Executing a query with a WHERE filter that has null values for picklists or foreign key fields doesn’t use the index, and should
be avoided.

• When designing custom query-search user interfaces, it’s important to:
◊ Keep the number of fields to be searched or queried to a minimum. Using a large number of fields leads to a large
number of permutations, which can be difficult to tune.
◊ Determine whether SOQL , SOSL, or a combination of the two is appropriate for the search.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Best Practices for Reporting?

A

Reduce the number of records to query—use a value in the data to segment the query. For example, query for only a single state instead of for all states. (Divisions)

• Minimize the number of:
◊ Objects queried in the report
◊ Relationships used to generate the report
• De-normalize data when practical—“over de-normalizing” the data results in more overhead. Use summarized data stored on the parent record for the report. This practice is more efficient than having the report summarize the child records.

Reduce the number of fields queried—only add fields to a
report, list view, or SOQL query that is required.

• Reduce the amount of data by archiving unused
records—move unused records to a custom object table to reduce the size of the report object.
• Use report filters that emphasize the use of standard or
custom indexed fields. Use index fields in report filters,
whenever possible.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Best Practices when Loading Data from API

A

Use the Salesforce Bulk API when you have more than a few hundred thousand records.

Use the fastest operation possible—insert() is fastest,
update() is next, and upsert() is next after that. If
possible, also break upsert() into two operations:
create() and update().
• Ensure that data is clean before loading when using the
Bulk API. Errors in batches trigger single-row processing
for that batch, and that processing heavily impacts
performance.

When updating, send only fields that have changed (delta-only loads).

Use Public Read/Write security during initial load to avoid
sharing calculation overhead

For custom integrations:
• Authenticate once per load, not on each record.
• Use GZIP compression and HTTP keep-alive to avoid
drops during lengthy save operations.

If possible for initial loads, populate roles before populating sharing rules. If possible, add people and data before creating and assigning groups and queues.

Disable Apex triggers, workflow rules, and validations during loads; investigate the use of batch Apex to process records after the load is complete.

When changing child records, group them by parent—group records by the field ParentId in the same batch to minimize locking conflicts.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Best Practices while querying data from the API

A

• Use the getUpdated() and getDeleted() SOAP API
to sync an external system with Salesforce at intervals
greater than 5 minutes. Use the outbound messaging
feature for more frequent syncing.

• When using a query that can return more than one million results, consider using the query capability of the Bulk API, which might be more suitable.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Best Practices for Searching

A

Keep searches specific and avoid using wildcards, if possible. For example, search with Michael instead of Mi*.

Use single-object searches for greater speed and accuracy.

Use the setup area for searching to enable language
optimizations, and turn on enhanced lookups and
auto-complete for better performance on lookup fields.

In some cases, partition data with divisions.

17
Q

Best Practices for SOQL and SOSL

A

Decompose the query—if you are using two indexed fields joined by an AND in the WHERE clause, and your search has exceeded the index threshold, break the query into two queries and join the results.

If querying on formula fields is required, make sure they are deterministic formulas. Avoid filtering with formula fields that contain dynamic, non-deterministic references.

Use values such as NA to replace NULLS options.

Use SOQL and SOSL where appropriate, keep queries
focused, and minimize the amount of data being queried or searched.

18
Q

Best Practices for Deleting

A

When deleting large volumes of data, a process that involves deleting one million or more records, use the hard delete option of the Bulk API. Deleting large volumes of data might take significant time due to the complexity of the deletion process

19
Q

General Best Practices

A

Avoid having any user own more than 10,000 records

Use a data-tiering strategy that spreads data across multiple objects, and brings in data on demand from another object or external store

When creating copies of production sandboxes, exclude field history if it isn’t required, and don’t change a lot of data until the sandbox copy is created.

Distribute child records so that no parent has more than
10,000 child records. For example, in a deployment that hasmany contacts but does not use accounts, set up several dummy accounts and distribute the contacts among them.