Regular Expressions (REGEX) Flashcards
Which of the definitions below is most accurate for regular expressions in Google Analytics?
A set of characters and metacharacters that are used to match text in a specified pattern
A regular expression is a set of literal characters and metacharacters (such as wildcards) that are used to match text in a specified pattern for many different purposes within Google Analytics. Regular expressions provide great flexibility and, at the same time, fine-grained specificity for configuring goals and funnels, advanced segments, view filters, and table filters.
In a regular expression, what does the dot enable you to do?
You can use the dot as a wildcard to match a single character
You can use the dot as a wildcard to match any single character. By itself, it will not match multiple characters.
To represent a literal dot in a regular expression, you can precede the dot with a backslash. (A dot preceding a backslash would not serve the same purpose.)
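As a rough illustration of the difference between an unescaped and an escaped dot, here is a minimal sketch in Python. Python's re module is not the engine Google Analytics uses, and the /page.html paths are made up for the example, so treat this as an approximation of the behavior described above.

```python
import re

# Hypothetical request URIs used only for illustration.
candidates = ["/page.html", "/page-html", "/pagehtml"]

# Unescaped dot: any single character is accepted in that position.
wildcard = re.compile(r"^/page.html$")
# Escaped dot: only a literal "." is accepted in that position.
literal = re.compile(r"^/page\.html$")

wildcard_hits = [bool(wildcard.search(s)) for s in candidates]
literal_hits = [bool(literal.search(s)) for s in candidates]

print(wildcard_hits)  # [True, True, False]
print(literal_hits)   # [True, False, False]
```

The third candidate fails both patterns because the dot, escaped or not, still has to consume exactly one character.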
For which of the following reasons could you use regular expressions in Google Analytics?
- To match multiple referrers in the definition of an advanced segment
- To filter internal traffic out of a view by specifying a range of IP addresses
- To filter for specific values directly within a report table
- To match multiple pages to a single step in a goal funnel
Which regular expression below would match all three of the following request URIs in a single funnel step?
/downloads/cvwriting/advice
/downloads/interviewskills/advice
/downloads/applications/advice
a. /downloads/.+/advice
b. /downloads/.?/advice
c. /downloads/*/advice
d. /download/./advice/
a. /downloads/.+/advice
For a regular expression to match any of these page URLs as a single funnel step, you need to combine two regular expression metacharacters.
The . (dot) metacharacter matches any single character (except the line break characters \r and \n), and the + metacharacter matches one or more occurrences of the preceding character.
Thus, within a regular expression, .+ matches any text string with a minimum length of one character. This is similar to .*, but * matches zero or more occurrences of the preceding character rather than one or more.
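The four answer options can be checked against the three request URIs with a short Python sketch. It assumes that re.search (a match anywhere in the string) approximates how Google Analytics evaluates a regular expression, and it writes the URIs without internal spaces.

```python
import re

uris = [
    "/downloads/cvwriting/advice",
    "/downloads/interviewskills/advice",
    "/downloads/applications/advice",
]

# The four answer options from the question.
options = {
    "a": r"/downloads/.+/advice",
    "b": r"/downloads/.?/advice",
    "c": r"/downloads/*/advice",
    "d": r"/download/./advice/",
}

# Keep only the options that match every one of the three URIs.
matching = [key for key, pattern in options.items()
            if all(re.search(pattern, uri) for uri in uris)]
print(matching)  # ['a']

# .+ needs at least one character between the slashes; .* does not.
plus_on_empty = bool(re.search(r"/downloads/.+/advice", "/downloads//advice"))
star_on_empty = bool(re.search(r"/downloads/.*/advice", "/downloads//advice"))
print(plus_on_empty, star_on_empty)  # False True
```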
In a regular expression, what is \d the equivalent of?
[0-9]
Backslash d means a match for any one digit (zero through nine). \d{2} means a match for any two digits, such as 27, and \d{2,4} means a match for two to four digits, such as 398 or 4192.
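The digit examples above can be tried out in Python as a quick sketch. One assumption to note: in Python, \d also matches non-ASCII digits in text strings unless the re.ASCII flag is set, so the flag is used here to keep \d equivalent to [0-9]. re.fullmatch is used so that the whole string has to fit the pattern.

```python
import re

def is_full_match(pattern: str, text: str) -> bool:
    """Return True if the entire text matches the pattern (ASCII digits only)."""
    return re.fullmatch(pattern, text, flags=re.ASCII) is not None

one_digit = is_full_match(r"\d", "7")              # a single digit
two_digits = is_full_match(r"\d{2}", "27")         # exactly two digits
three_of_2_to_4 = is_full_match(r"\d{2,4}", "398")   # within two to four digits
four_of_2_to_4 = is_full_match(r"\d{2,4}", "4192")   # within two to four digits
too_short = is_full_match(r"\d{2,4}", "5")         # only one digit
too_long = is_full_match(r"\d{2,4}", "12345")      # five digits

print(one_digit, two_digits, three_of_2_to_4, four_of_2_to_4)  # True True True True
print(too_short, too_long)  # False False
```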
Which of the following expressions will match Bing or Google, whether or not the first letter is capitalized?
A or B
a. ([G]oogle|[B]ing)
b. ([Gg]oogle|[Bb]ing)
b. ([Gg]oogle|[Bb]ing)
When evaluated against a regular expression, different cases of letters are treated as completely separate literal characters. Since there is no metacharacter that enables a match with an uppercase or lowercase version of a character, you need to specify a character set if you want to match either.
Thus, to match Bing or Google regardless of initial capitals, use () to group the word options and | as the “or” operator to separate them, and use [] to define a set of characters, in this case Gg or Bb, any one of which a single character can match.
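A small Python sketch comparing the two answer options follows. It uses re.fullmatch so that each test word has to match in its entirety; the test words themselves are chosen only for illustration.

```python
import re

option_a = re.compile(r"([G]oogle|[B]ing)")
option_b = re.compile(r"([Gg]oogle|[Bb]ing)")

words = ["Google", "google", "Bing", "bing", "GOOGLE"]

a_hits = [bool(option_a.fullmatch(w)) for w in words]
b_hits = [bool(option_b.fullmatch(w)) for w in words]

print(a_hits)  # [True, False, True, False, False]
print(b_hits)  # [True, True, True, True, False]
```

Option b accepts either case for the first letter only, which is why the all-caps GOOGLE still fails.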
What is the simplest way of generating a regular expression for a view filter to exclude a range of IP addresses?
Use the Network report to identify the lowest and highest IP address in the range and specify a character range for the final octet
At analyticsmarket.com/freetools/ipregex, you can access a tool for formulating the first and last IP address of a range into a single regular expression that matches any IP address within the range. You can use the resulting regular expression in a view filter to exclude traffic data for visits to your website that originate from your own organization (or any other). Note that Google’s own IP address range tool is no longer available.
Your organization’s IT team should be able to provide the lower and upper limits of your IP address range.
If you wish to exclude your company’s internal traffic from your views, which regular expression would you use to express the IP range 84.71.220.1 to 84.71.220.10 for your Exclude filter?
^84\.71\.220\.([1-9]|10)$
To represent the possible values for the fourth octet in the IP address range, you can use ([1-9]|10) to signify any single digit between 1 and 9 or the number 10. Each dot is preceded by a backslash so that it matches only a literal dot rather than any single character. The ^ beginning-of-string anchor and the $ end-of-string anchor also prevent other unwanted matches, such as 184.71.220.6 or 84.71.220.108 in this case.
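The IP range expression can be checked with a brief Python sketch. The pattern is written with each dot escaped as a literal dot, and the list of test addresses is made up for the example; Python's re module is assumed to be close enough to the Google Analytics engine for this pattern.

```python
import re

ip_range = re.compile(r"^84\.71\.220\.([1-9]|10)$")

inside = ["84.71.220.1", "84.71.220.5", "84.71.220.10"]
outside = ["84.71.220.0", "84.71.220.11", "184.71.220.6", "84.71.220.108"]

inside_hits = [bool(ip_range.search(ip)) for ip in inside]
outside_hits = [bool(ip_range.search(ip)) for ip in outside]

print(inside_hits)   # [True, True, True]
print(outside_hits)  # [False, False, False, False]

# With unescaped dots, a string that is not an IP address at all can slip through.
unescaped = re.compile(r"^84.71.220.([1-9]|10)$")
loose_hit = bool(unescaped.search("84x71x220x5"))
strict_hit = bool(ip_range.search("84x71x220x5"))
print(loose_hit, strict_hit)  # True False
```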
You define a goal using the Regular Expression match type and the following regular expression value:
^/products/tools
Which of the following request URIs would generate conversions?
a. /products/tools
b. /products/tools/?prodid=2002
c. /promo/products/tools
d. /promo/products/tools/?prodid=2002
a. /products/tools
b. /products/tools/?prodid=2002
The caret (^) symbol represents the beginning of a text string. In order to match, a string must not contain any characters before this beginning-of-string anchor.
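As a quick check of the anchor, the four request URIs can be run through Python, assuming re.search behaves like the goal's Regular Expression match type for this pattern.

```python
import re

goal = re.compile(r"^/products/tools")

uris = {
    "a": "/products/tools",
    "b": "/products/tools/?prodid=2002",
    "c": "/promo/products/tools",
    "d": "/promo/products/tools/?prodid=2002",
}

# Only URIs that start with /products/tools satisfy the anchored pattern.
converting = [key for key, uri in uris.items() if goal.search(uri)]
print(converting)  # ['a', 'b']
```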
You have defined a URL destination goal with Regular Expression as the match type and the following value for the request URI:
^/news
Which of the following request URIs would match this goal?
a. /news-summer
b. /news/archives
c. /index/news
d. /news1
a. /news-summer
b. /news/archives
d. /news1
The caret symbol (^) serves as a beginning-of-string anchor in a regular expression and in this case would allow only request URIs beginning with /news to match the goal.
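To see what the caret contributes here, the sketch below compares the anchored pattern with an unanchored version in Python. The unanchored pattern is not part of the question; it is included only for contrast, and re.search is assumed to approximate the goal matching.

```python
import re

uris = ["/news-summer", "/news/archives", "/index/news", "/news1"]

anchored = [u for u in uris if re.search(r"^/news", u)]
unanchored = [u for u in uris if re.search(r"/news", u)]

print(anchored)    # ['/news-summer', '/news/archives', '/news1']
print(unanchored)  # all four URIs, including '/index/news'
```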