Optomizing SOQL Flashcards
How does Salesforce store Tenant data
Salesforce stores the application data for all virtual tables in a few large database tables, which are partitioned by tenant and serve as heap storage. The platform’s engine then materializes virtual table data at runtime by considering the corresponding metadata
Why does traditional performance-tuning techniques not work in Salesforce?
The Salesforce plaform storage model manages virtual database structures using a set of metadata, data and pivot tables
How long does it take for new data to be searchable within Salesforce?
15 minutes or more
What are divisions?
Divisions are a means of partitioning the data of large deployments to reduce the number of records returned by queries and reports
How can you enable divisions?
Contact Salesforce Customer Support
What is the difference between SOQL and SOSL
A SOQL query is the equivalent of a SELECT SQL statement and a SOSL query is a programmatic way of performing a text-based search
What does the SOQL statements execute with
Database
What does the SOSL statements execute with
Search indexes
What call does SOQL statements use?
query() call
What call does SOSL statements use
search() call
When will you use SOQL (name 5)
- You know in which objects or fields the data resides
- Retrieve data from a single object or from multiple objects that are related to one another
- Count the number of records that meet specified criteria
- Sort results as part of the query
- Retrieve data from number, date or checkbox fields
When will you use SOSL (Name 3)
- You don’t know in which object or field the data resides and you want to find it in the most efficient way possible
- Retrieve multiple objects and fields that might or might not be related to one another
- Retrieve data for a particular division in an organization using the divisions feature
Which search (SOQL or SOSL) is faster when the search expression uses the CONTAINS term
SOSL
How can you improve the performance of a SOQL query that has multiple where filters?
Decompose the single query into multiple queries, each of which should have one WHERE filter, and then combine the results
What type of where filters should be avoided?
Using a where filter that has null values for picklists or foreign key fields
Why would the following query perform poorly (or is it written correctly)?
SELECT Contact__c, Max_Score__c, CategoryName__c, Category__Team_Name__c
FROM Interest__c
WHERE Contact__c != null
AND Contact__c IN :contacts
AND override__c != 0
AND ((override__c != null AND override__c > 0)
OR (score__c != null AND score__c > 0))
AND Category__c != null
AND ((Category_Team_IsActive__c = true OR CategoryName__c IN :selectvalues)
AND (Category_Team_Name__c != null AND Category_Team_Name__c IN
:selectTeamValues))
Nulls in the criteria prevented the use of indexes, and some of the criteria was redundant and extended execution time.
Why would the following query perform poorly (or is it written correctly)?
SELECT Contact__c, Max_Score__c, CategoryName__c, Category_Team_Name__c
FROM Interest__c
WHERE Contact__c IN :contacts
AND (override__c > 0 OR score__c > 0)
AND Category__c != ‘Default’
AND ((Category_Team_Name__c IN :selectvalues AND Category_Team_IsActive__c = true)
OR CategoryName__c IN :selectvalues)
This query is written correctly :)
What problem can you run into if your query uses dynamic values in the WHERE fields
Null values can be passed in. Don’t let the query run to determine there are no records, instead, check for nulls and avoid the query (if possible)
Is anything wrong with the following pseudo code?
SELECT Name
FROM Account
WHERE Account_ID___c = :acctid;
if (rows found == 0) return “Not Found”
If acctid is null, the entire account table is scanned row by row until all data is examined
Is anything wrong with the following pseudo code?
if (acctid != null) { SELECT Name FROM Account WHERE Account_Id\_\_\_c = :acctid } else { return "Not Found" }
It’s good! And much better than the following
SELECT Name
FROM Account
WHERE Account_ID___c = :acctid;
if (rows found == 0) return “Not Found”
What is the best practice when you want to reduce the number of records to return while searching
Keep searches specific and avoid using wildcards if possible. For example, search with Michael instead of Mi*
What is the best practice when you want to reduce the number of joins while searching
Use single object searches for greater speed and accuracy
What is the best practice when you want to improve efficiency while searching (Hint: What do you need to configure within Salesforce)
Use the setup area for searching to enable language optimizations, and turn on enhanced lookups and auto-complete for better performance on lookup fields
What is the best practice when you want to improve search performance (hint: If you have a lot of region based data)
In some cases, partition data with divisions
What is the best practice when you want to allow indexed searches when SOQL queries with multiple WHERE filters cannot use indexes
Decompose the query - if you are using two indexed fields joined by an OR in the WHERE clause, and your search has exceeded the index threshold, break the query into two queries and join the results
What is the best practice when you want to avoid querying on formula fields, which are computed in real time
If querying on formula fields is required, make sure they are deterministic formulas. Avoid filtering with formula fields that contain dynamic, non-deterministic references
What is the best practice when you want to execute queries with null values in a where filter for picklists or foreign key fields
Use values such as NA to replace NULLS options
What is the best practice when you want to design custom query and search user interfaces according to best practices
Use SOQL and SOSL where appropriate, keep queries focused, and minimize the amount of data being queried or searched
What is the best practice when you want to avoid timeouts on large SOQL queries
Tune the SOQL query
Reduce query scope
Use selective filters
Consider using Bulk API with bulk query
If you’ve used the previous suggestions and still get timeouts, consider adding a LIMIT clause (starting with 100,000 records) to your queries.
If using batch Apex for your queries, use chaining to get sets of records (using LIMIT) or consider moving filter logic to the execute method
Provide a solution for the following custom search functionality situation:
The customer needed to search LDV across multiple objects using specific values and wildcards. They created a custom VF page that would allow the user to enter 1-20different fields, and then search using SOQL on those combinations of fields.
Search optimization became difficult because:
- When many values were entered, the WHERE clause was large and difficult to tune
- When wildcards were introduced, the queries took longer
- Querying across multiple objects was sometimes required to satisfy the overall search query. This practice resulted in multiple queries occurring, which extended the search
- SOQL is not always appropriate for all query types
The solutions were to :
- use only essential search fields to reduce the number of fields that ould be searched. Restricting the number of simultaneous fields that could be used during a single search to the common use cases allowed Salesforce to tune with indexes
- Denormalize the data from the multiple objects into a single custom object to avoid having to make multiple querying calls
- Dynamically determine the use of SOQL or SOSL to perform the search based on both the number of fields searched and the types of values entered. For example, very specific values (no wild cards) used SOQL to query, which allowed indexes to enhance performance
Provide a solution for the following situation:
The customer had the following query:
SELECT Id, Product_Code__c
FROM Customer_Product__c
WHERE CreatedDate = Last_N_Days:3
The amount of data in the object exceeded the threshold for standard indexes: 30% of the total records up to one million records. The query performed poorly.
The query was rewritten as: SELCT Id, Product_Code\_\_c FROM Customer_Product\_\_c WHERE CreatedDate = Last_N_Days:3 ORDER BY CreatedDate LIMIT 99999
In this query the threshold checks were not done, and the CreatedDate index was used to find the records. This kind of query returns a max of 99,999 records in the order that they were created within the last 3 days, assuming that 99,999 or fewer records were created during the last 3 days
Note: In general, when querying for data that has been added over the Last_N_Days, if you specify an ORDER BY query on an indexed field with a limit of fewer than 100,000 records, the ORDER BY index is used to do the query
What is the result from having too much data?
Consider the user experience (aka, paging through a list of thousands of records). Ensure you have enough selective filters. Design SOQL, reports and list views with LDV in mind
What could happen when you perform large data loads (Including what happens when you have record deletions)?
Large data loads and deletions can affect query performance
Recent deleted records stay in the recycle bin for 15 days (or less) and is still available for SOQL searches. These records are also included in the total number of records used by the Force.com query optimizer
Options: Use hard delete or contact Customer Support to physically delete the records if they don’t need to be in the recycle bin
What can happen when you use leading % wildcards in your reports or queries?
A LIKE Condition with leading % wildcard does not use an index. In report/list views the CONTAINS clause translates into %string%
What is the result of using NOT and != in SOQL queries
They do not get used by force.com query optimizer at all (even if the field is indexed). Instead use = or IN with the reciprocal values
What happens when you use complex joins in a SOQL
Complex AND/OR and sub queries might not perform as well as multiple issued queries. Non deterministic formulas can’t be indexed and result in additional joins. Having this as a separate field that is updated via workflow / trigger is better
What is wrong with the following query? What should be used instead?
Account[] accts = [SELECT Id FROM Account];
If the results are too large, this syntax causes a runtime exception. Instead use a SOQL query for loop
SOQL query for loops are great and all, but there are certain situations where they do not work. Name one with the alternate option to take instead
A SOQL query for loop that is used to mass update records can get a governor limit exception error. Instead of using a SOQL query in a for loop, the preferred method of mass updating records is to use batch Apex which minimizes the risk of hitting governor limits.
How many queued or active batch jobs can you have at one time?
- Evaluate the current count by viewing the Scheduled jobs page in SFDC or programmatically using SOAP API to query the AsyncApexJob object
When will the system terminate nonselective SOQL queries?
If it contains more than 200,000 records
What makes a SOQL query selective?
When one of the query filters is on an indexed field and the query filter reduces the resulting number of rows below a system-defined threshold
Performance improves when two or more filters used in the WHERE clause meet the mentioned conditions
Assuming the account object has more than 200,000 records (including soft-deleted records still in the recycle bin)
Is the following query selective or non-selective, and why?
Select id from Account where id in ()
Selective.
- The where clause is on an indexed field (id)
- Number of records returned would be fewer than the selectivity threshold
Assuming the account object has more than 200,000 records (including soft-deleted records still in the recycle bin)
Is the following query selective or non-selective, and why?
Select Id from Account where Name != ‘’
Non Selective.
Even though Name is indexed (primary key), this filter returns most of the records making the query non-selective
Assuming the account object has more than 200,000 records (including soft-deleted records still in the recycle bin)
Is the following query selective or non-selective, and why?
Select id from account where Name != ‘’ and customfield__c = ‘ValueA’
Depends :)
The first filter (name) is not selective
If the second returns less records than the threshold and the field is indexed, then the query is selective
What are the maximum number of records that can be returned by a SOSL query?
2000
What are the maximum number of records that can be returned by a SOQL query?
50 000
Why should you use the Query Plan Tool that is available in the Dev Console
- You can check the query plan for any SOQL queries that execute slowly
- Provides insight on the different plans
- Provides cost of using index compared to a full table scan (if filters are indexed)
- If cost of the table scan is lower than the index, and the query is timing out, need to perform further analysis on using other filters
How do you enable the Query Plan Tool in dev console?
Help > Preferences and then set “Enable Query Plan” to true
In the Query Plan Tool, what does the Cardinality mean?
The estimated number of records that the leading operation type would return. For example, the number of records returned if using an index table.
In the Query Plan Tool, what does the Fields section show?
The indexed field(s) used by the Query Optimizer. If the leading operation type is Index, the fields value is Index. Otherwise, the fields value is null
In the Query Plan Tool, what does the Leading Operation Type section show?
The main operation type that Salesforce will use to optimize the query
- Index: The query will use an index on the queried object
- Sharing: The query will use an index based on the sharing rules associated with the user who is executing the query. If there are sharing rules that limit which records that user can access, SFDC can use those rules to optimize the query
- TableScan: The query will scan all records for the queried object
- Other: The query will use optimizations internal to SFDC
In the Query Plan Tool, what does the Cost section show?
The cost of the query compared to the Force.com Query Optimizer’s selectivity threshold. Values above 1 mean that the query won’t be selective.
In the Query Plan Tool, what does the sObject Cardinality section mean?
The approximate record count for the queried object
In the Query Plan Tool, what does the sObjectType section show
The name of the queried object
Scenario: Users complain about slow reports. What could be the reason?
Typically the reports would use non-selective filters on un-indexed fields, and all sales folk are creating multiple reports all with the same problem.
Possible solution is to create a library of public, controlled, and optimized reports that meet the requirements.
Formula fields also unknowingly slow reports.
When will you typically choose a SOSL query over a SOQL query?
- You want to retrieve data from a number, date or checkbox field
- You’re searching for a specific distinct term that you know exists within a field
- You need to retrieve more than 2,000 records and you’re searching over text fields by using an OR clause
- You know in which objects or fields the data resides
When you’re searching for a specific distinct term that you know exists within a field
What differentiates SOSL from SOQL? (You can pick more than one if you think they’re correct)
- Syntax
- SOSL Searches the search index instead of the org database
- SOSL Searches more efficiently when you don’t know in which object the data resides
All of them!
SOSL works with:
- REST only
- SOAP only
- REST, SOAP and Apex
- SOQL only
REST, SOAP and Apex
What does a search for a single object look like in SOSL?
- FIND {cloud} RETURNING Account
- FIND in ACCOUNT RETURNING “cloud”
- FIND “Cloud” in ACCOUNT
- FIND(cloud) RIGHT NOW
FIND {cloud} RETURNING Account
What does a search for multiple objects look like in SOSL?
- FIND {sneakers} RETURNING ALL ARTICLES
- FIND {sneakers} in ALL OBJECTS
- FIND {sneakers} RETURNING Produc2, ContentVersion, FeedItem
- FIND {sneakers} RETURNING Account
FIND {sneakers} RETURNING Produc2, ContentVersion, FeedItem
Which REST resource adds auto-suggest functionality?
- Auto Suggested Records
- Search Suggested Article Title Matches
- RETURNING FieldSpec
- search()
Search Suggested Article Title Matches
The platform automatically recalculates optimizer statistics in the background when …
Your data set changes by 25% or more
What is the Force.com Query Optimizer?
The Force.com (or Lightning Query Optimizer) works behind the scenes to determine the best path to the data being requested based on the filters in the query. It will determine the best index from which to drive the query, the best table from which to drive the query if no good index is available, and more.
What does it mean in the Query Plan Tool when the Cost is above 1
The query will not be selective
For the Query Plan Tool, what are four leading operation types?
- Index
- Sharing
- Table Scan
- Other (optimizations internal to Salesforce)