Web Analytics Flashcards

1
Q

Web usage mining

A

Discovering interesting patterns in how visitors use a website.

2
Q
Web content mining
A

Extracting useful information or knowledge from web page contents.

3
Q
Web structure mining
A

Mining the hyperlink structure of the web.

4
Q

What is server log analysis?

A

The process of examining data captured by web servers when users interact with a website.
Every user request (e.g., visiting a page) creates an entry in a log file.

5
Q

What information can be found in server logs?

A

User information (IP address)
Date and Time
HTTP request method (GET, POST)
Resource requested (page, image, etc.)
HTTP status code (success, error, redirect)
Number of bytes transferred
Referrer (previous webpage)
Browser and platform (user agent)
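
As a rough illustration of these fields, the sketch below parses one entry in the widely used Combined Log Format with a regular expression. The sample line and the names `LOG_PATTERN` and `parse_log_line` are made up for this example, and real servers may be configured with a different log format.

```python
import re
from datetime import datetime

# Combined Log Format:
# host ident user [time] "request line" status bytes "referrer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return the fields of one log entry as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Month abbreviations such as "Oct" assume an English/C locale.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # Servers write "-" when no response body was sent.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


# Made-up sample entry for illustration.
SAMPLE_LINE = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" '
    '200 2326 "https://www.example.com/start" "Mozilla/5.0 (X11; Linux x86_64)"'
)
entry = parse_log_line(SAMPLE_LINE)
```

Each named group corresponds to one item in the list above; the user agent string would still need further parsing to separate browser from platform.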

6
Q

How can server log analysis be useful?

A

Understand user behavior (popular pages, broken links)
Identify traffic sources (search engines, social media)
Optimize website performance for different devices
Detect suspicious activities (security threats)

7
Q

What are Cookies?

A

Cookies are small pieces of data that websites store on your computer’s hard drive (or similar storage) through your web browser. They act like a little note that the website leaves behind to remember you.

8
Q

How do cookies work?

A

A web server sends a cookie to your browser.
Your browser stores the cookie.
Every time you visit the same website again, your browser sends the cookie back to the server.
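
The three steps can be sketched with Python's standard `http.cookies` module. This is only a simulation under assumptions: the cookie name `visitor_id`, its value, and the one-hour lifetime are invented for illustration, and the browser's part is played by a plain string.

```python
from http.cookies import SimpleCookie

# Step 1 (server): build the Set-Cookie header sent with the first response.
outgoing = SimpleCookie()
outgoing["visitor_id"] = "a1b2c3d4"              # hypothetical random user ID
outgoing["visitor_id"]["domain"] = "yoursite.com"
outgoing["visitor_id"]["max-age"] = 3600         # expire after one hour
set_cookie_header = outgoing.output(header="Set-Cookie:")

# Step 2 (browser, simulated): the browser stores the name/value pair and
# later sends it back in a Cookie request header.
cookie_header = "visitor_id=a1b2c3d4"

# Step 3 (server): parse the Cookie header of a later request.
incoming = SimpleCookie()
incoming.load(cookie_header)
visitor_id = incoming["visitor_id"].value
```

Note that only the name and value travel back to the server; attributes such as the domain and lifetime stay in the browser and decide whether the cookie is sent at all.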

9
Q

What Information Do Cookies Contain?

A

A unique user ID (often random letters and numbers)
The website’s domain name (e.g., yoursite.com)
An expiration date (optional)
Additional customized data (optional)

10
Q

Session Cookies vs. Persistent Cookies

A

Session Cookies: These are temporary and disappear when you close your browser. They’re used to track your activity during a single visit.
Persistent Cookies: These stay on your computer until they expire or you delete them. They’re used to remember you as a returning visitor.

11
Q

First-Party vs. Third-Party Cookies:

A

First-Party Cookies: These are set by the website you’re directly visiting.
Third-Party Cookies: These are set by a different domain (e.g., an ad network) embedded on the website you’re visiting.

12
Q

Zombie Cookies:

A

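Zombie Cookies: These are cookies that are recreated after the user deletes them, typically from copies kept outside the browser's regular cookie storage. They are often regarded as a privacy or security violation.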

13
Q

Evercookie:

A

Evercookie: This is a notorious example of a “zombie cookie” program.

14
Q

What is Page Tagging?

A

Page tagging is a method for collecting website usage data on the client-side (user’s browser) as opposed to the server-side. This is achieved by embedding a small snippet of JavaScript code into your web pages.

15
Q

Page Views

A

Number of times a page was viewed.

16
Q

Visits

A

Identifying and counting visits (sessions)

17
Q

Sessionization

A

Procedure for determining which page views are part of the same visit.

18
Q

In sessionization, visitors are tracked by

A
  • IP address + user agent
  • cookies
  • URI parameters
  • a combination of the above
19
Q

Why is sessionization based on IP address alone unreliable?

A

Multiple devices can share the same IP address behind a router or firewall.
ISPs often dynamically assign IP addresses, leading to changes over time.
Some ISPs use web proxies that can alter IP addresses for each request.

21
Q

What’s a better approach to sessionization than relying solely on IP address?

A

* IP address + user agent
* Explicit session identifiers
  – held in (transient) cookies
  – passed along as URI parameters (e.g., shop/detail.php?sesID=984533193)
* Vendors often allow a combination of these

22
Q

What additional heuristics are used in sessionization?

A

Referrer Data: Analyzing the referring website can help determine if a new visit is from a returning user.
Session Timeout:
A period of inactivity beyond a threshold (e.g., 30 minutes) can indicate a new session.
A maximum total session duration can also be imposed.
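
The inactivity-timeout heuristic can be sketched as follows. This is a minimal illustration, not a vendor's actual algorithm: the `sessionize` function, the tuple layout, and the visitor keys (IP address plus user agent joined into one string) are assumptions, and neither the referrer heuristic nor a maximum session duration is implemented.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # example inactivity threshold


def sessionize(page_views, timeout=SESSION_TIMEOUT):
    """Group (visitor_key, timestamp, url) tuples into sessions.

    A new session starts whenever the gap since the same visitor's
    previous page view exceeds the timeout.
    """
    sessions = []
    open_session = {}  # visitor_key -> session currently being extended
    for visitor, timestamp, url in sorted(page_views, key=lambda v: (v[0], v[1])):
        current = open_session.get(visitor)
        if current is None or timestamp - current["views"][-1][0] > timeout:
            current = {"visitor": visitor, "views": []}
            sessions.append(current)
            open_session[visitor] = current
        current["views"].append((timestamp, url))
    return sessions


# Made-up page views: the third Firefox view comes 80 minutes after the second.
views = [
    ("203.0.113.7|Firefox", datetime(2024, 1, 1, 9, 0), "/"),
    ("203.0.113.7|Firefox", datetime(2024, 1, 1, 9, 10), "/products"),
    ("203.0.113.7|Firefox", datetime(2024, 1, 1, 10, 30), "/"),
    ("198.51.100.2|Chrome", datetime(2024, 1, 1, 9, 5), "/about"),
]
sessions = sessionize(views)
```

With this sample data the Firefox visitor ends up with two sessions and the Chrome visitor with one.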

23
Q

Problems with cookies

A

– Cookies count browsers, not people
  * people often share computers
  * people own or have access to multiple computers
  * people might use different browsers
– Cookies get blocked or erased

24
Q

How to address the cookie problems

A

* The most important cookies should be first-party cookies
* Measure cookie rejection rates
* Consider site registration/log in
* Make it worthwhile for visitors to keep cookies
* Do not look at absolute values but at trends

25
Q

Types of visitors

A

– New visitors: unique visitors whose first-ever visit falls within the period
– Repeat visitors: unique visitors who visit multiple times during the period
– Return visitors: unique visitors who also visited prior to the period
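
The three definitions can be turned into a small classification sketch. The `classify_visitors` function and the sample data are hypothetical, and the sketch assumes the complete visit history is available, since "first-ever" cannot be decided otherwise. The categories are not mutually exclusive: a visitor can be both new and repeat.

```python
from collections import defaultdict
from datetime import date


def classify_visitors(visits, period_start, period_end):
    """Classify unique visitors for a reporting period.

    visits: iterable of (visitor_id, visit_date) pairs covering all history.
    """
    visited_before = set()
    visits_in_period = defaultdict(int)
    for visitor, day in visits:
        if day < period_start:
            visited_before.add(visitor)
        elif day <= period_end:
            visits_in_period[visitor] += 1
    unique_visitors = set(visits_in_period)
    return {
        "new": unique_visitors - visited_before,
        "repeat": {v for v, n in visits_in_period.items() if n > 1},
        "return": unique_visitors & visited_before,
    }


# Made-up history; the reporting period is February 2024.
history = [
    ("a", date(2024, 1, 20)),
    ("a", date(2024, 2, 3)),
    ("b", date(2024, 2, 5)),
    ("b", date(2024, 2, 9)),
    ("c", date(2024, 2, 10)),
]
groups = classify_visitors(history, date(2024, 2, 1), date(2024, 2, 29))
```

Here visitor "a" is a return visitor, "b" is both new and repeat, and "c" is new only.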

26
Q

New visitors

A

Unique visitors whose first-ever visit falls within the period.

27
Q

Repeat visitors

A

Unique visitors who visit multiple times during the period.

28
Q

Return visitors

A

Unique visitors who also visited prior to the period.

29
Q

Time on site

A

Timestamp of the last (recorded) activity minus the timestamp of the first activity in the session.
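
A minimal sketch of this calculation, using made-up timestamps and a hypothetical `time_on_site` helper:

```python
from datetime import datetime


def time_on_site(timestamps):
    """Last recorded activity minus first activity, as a timedelta."""
    return max(timestamps) - min(timestamps)


# Three page views in one session.
session = [
    datetime(2024, 1, 1, 9, 0, 0),
    datetime(2024, 1, 1, 9, 4, 30),
    datetime(2024, 1, 1, 9, 12, 0),
]
duration = time_on_site(session)

# A visit with a single recorded activity has a measured duration of zero.
single_view = time_on_site([datetime(2024, 1, 1, 9, 0, 0)])
```

Because only recorded activity counts, the time spent reading the last page of a visit is not captured by this metric.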

30
Q

Bounce rate

A

Ratio of visits in which the visitor left instantly.

31
Q

Site Bounce Rate:

A

(Number of single-page view visits) / (Total number of visits)

32
Q

Page Bounce Rate:

A

(Number of single-page view visits for a specific page) / (Number of visits where that page was the entry page)
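
Both bounce-rate formulas can be sketched in a few lines. The function names and the sample visits are assumptions for illustration; each visit is represented as the list of pages viewed, in order.

```python
def site_bounce_rate(visits):
    """Single-page-view visits divided by all visits."""
    if not visits:
        return 0.0
    bounces = sum(1 for pages in visits if len(pages) == 1)
    return bounces / len(visits)


def page_bounce_rate(visits, page):
    """Single-page-view visits on `page` divided by visits that entered on `page`."""
    entries = [pages for pages in visits if pages and pages[0] == page]
    if not entries:
        return 0.0
    bounces = sum(1 for pages in entries if len(pages) == 1)
    return bounces / len(entries)


# Made-up visits: two of the four consist of a single page view.
visits = [
    ["/home"],
    ["/home", "/pricing"],
    ["/blog"],
    ["/home", "/blog", "/pricing"],
]
site_rate = site_bounce_rate(visits)           # 2 bounces out of 4 visits
home_rate = page_bounce_rate(visits, "/home")  # 1 bounce out of 3 entries on /home
```

Note that the page bounce rate uses entries on that page as its denominator, not all views of the page.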

33
Q

Bounce rate segmentation

A

– traffic source
– search engine
– top landing pages
– countries

34
Q

What is a referrer?

A

A referrer is the web page that directed a visitor to your website.

35
Q

Why might referrers not always be known?

A

Bookmarks: Visitors might have bookmarked your site.
Direct Typing: The URL might have been typed directly into the browser.
Email Marketing: Traffic might come from an email campaign.
Browser Security Settings: Strict privacy settings can prevent referrer information from being sent.

36
Q

How can you leverage referrer information?

A

Segmentation: Apply segmentation of other metrics (e.g., bounce rate, time on site) based on the referrer. This helps you understand how visitors from different sources interact with your website.
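
As one possible illustration, the sketch below segments bounce rate by referrer domain. The `bounce_rate_by_referrer` function, the "(direct/unknown)" label, and the sample visits are assumptions made for this example.

```python
from collections import defaultdict
from urllib.parse import urlparse


def bounce_rate_by_referrer(visits):
    """Compute bounce rate per referrer domain.

    visits: iterable of (referrer_url_or_None, page_view_count) pairs.
    """
    totals = defaultdict(int)
    bounces = defaultdict(int)
    for referrer, page_views in visits:
        # Visits without a referrer are grouped under one label.
        source = urlparse(referrer).netloc if referrer else "(direct/unknown)"
        totals[source] += 1
        if page_views == 1:
            bounces[source] += 1
    return {source: bounces[source] / totals[source] for source in totals}


# Made-up visits for illustration.
sample_visits = [
    ("https://www.google.com/search?q=shoes", 1),
    ("https://www.google.com/search?q=boots", 4),
    ("https://news.example.org/post", 1),
    (None, 2),
]
rates = bounce_rate_by_referrer(sample_visits)
```

The same grouping could be applied to time on site or any other per-visit metric.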

37
Q

What is a conversion?

A

A conversion is a visitor performing a specific action that is considered a valuable outcome for a website or app.

38
Q

What is a conversion chain?

A

A conversion chain is a sequence of steps that a visitor must complete to achieve a conversion.

39
Q

What are different ways to count conversions?

A

Number of times conversion occurred: Total number of conversions.
Visits where conversion occurred: Number of visits that resulted in a conversion.
Unique visitors that converted: Number of distinct visitors who completed a conversion.
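
The three counts can differ for the same data, as this small sketch shows. The `count_conversions` function and the event layout, one (visitor, visit) pair per conversion, are assumptions for illustration.

```python
def count_conversions(events):
    """Count conversions in three ways.

    events: iterable of (visitor_id, visit_id) pairs, one per conversion.
    """
    events = list(events)
    return {
        "conversions": len(events),
        "converting_visits": len({(visitor, visit) for visitor, visit in events}),
        "converting_visitors": len({visitor for visitor, _ in events}),
    }


# Visitor "a" converts twice in visit 1 and once in visit 2; "b" converts once.
events = [("a", 1), ("a", 1), ("a", 2), ("b", 1)]
counts = count_conversions(events)
```

For this sample there are 4 conversions, 3 converting visits, and 2 converting visitors.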

40
Q

A Sankey diagram

A

A Sankey diagram can be used to analyze how users navigate a site: it shows the flow of visits between pages, with wider bands for more frequent paths.
