Proctor Exam 2- Secondary Data Syndicated and Big Data Flashcards

1
Q

Machine learning can be applied anywhere there is a need for quick automatic decisions based on ongoing feedback from patterns in the environment.

A

Machine learning can be applied anywhere there is a need for quick automatic decisions based on ongoing feedback from patterns in the environment.

2
Q

Problems that lend themselves to machine learning:

  1. The data causes problems for traditional analytic techniques, such as where variables
    are highly correlated, data is non-linear, or where there are far more variables than
    records (so-called “wide and shallow” datasets)
  2. Accuracy is more important than understanding
  3. Potential outputs are defined, but the action is dependent on conditions which themselves cannot be easily predicted or identified before the event happens.
  4. Rules and associations might be perceived or deduced, but are not easily described by logical rules
A

Problems that lend themselves to machine learning:

  1. The data causes problems for traditional analytic techniques, such as where variables
    are highly correlated, data is non-linear, or where there are far more variables than
    records (so-called “wide and shallow” datasets)
  2. Accuracy is more important than understanding
  3. Potential outputs are defined, but the action is dependent on conditions which themselves cannot be easily predicted or identified before the event happens.
  4. Rules and associations might be perceived or deduced, but are not easily described by logical rules
3
Q

What makes this a good machine-learning problem is that the decisions and
the variables constantly change and the value of one variable and the right decision may depend
on the values of many more variables. Humans instinctively make these assessments, but it is
impossible to discretely list every rule and situation for a computer to look up and evaluate.

A

What makes this a good machine-learning problem is that the decisions and
the variables constantly change and the value of one variable and the right decision may depend
on the values of many more variables. Humans instinctively make these assessments, but it is
impossible to discretely list every rule and situation for a computer to look up and evaluate.

4
Q

Machines learn by studying data to detect patterns or by applying known rules (algorithms) to:
Categorize like or unlike people or things
Identify patterns and relationships that were unknown before analysis
Predict likely outcomes or actions based on identified patterns
Detect anomalous or unexpected behaviors

A

Machines learn by studying data to detect patterns or by applying known rules (algorithms) to:
Categorize like or unlike people or things
Identify patterns and relationships that were unknown before analysis
Predict likely outcomes or actions based on identified patterns
Detect anomalous or unexpected behaviors
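
The last task in the list, detecting anomalous behavior, can be sketched with a simple z-score rule. This is only an illustration: the function name `find_anomalies`, the threshold of 2 standard deviations, and the order counts are hypothetical and not from the course material.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return values that lie more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

daily_orders = [100, 102, 98, 101, 99, 100, 180]  # hypothetical data
anomalies = find_anomalies(daily_orders)
print(anomalies)  # [180]
```

A z-score rule is one of the simplest anomaly detectors; it assumes a single numeric variable and a roughly stable baseline.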

5
Q

Machines learn through essentially an exhaustive process of trial and error, sifting through information, comparing the information to a goal, making adjustments, and trying again

A

Machines learn through essentially an exhaustive process of trial and error, sifting through information, comparing the information to a goal, making adjustments, and trying again
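
The guess, compare, adjust, and retry loop can be sketched as below. It is a minimal illustration, assuming a one-parameter model (a slope) and squared error as the goal; `fit_slope`, the learning rate, and the toy data are hypothetical.

```python
def fit_slope(xs, ys, learning_rate=0.01, rounds=500):
    """Learn w in y = w * x by repeatedly guessing, comparing, and adjusting."""
    w = 0.0  # initial guess
    for _ in range(rounds):
        # Compare current predictions to the goal (the observed ys).
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        # Adjust the guess in the direction that reduces the error, then try again.
        w -= learning_rate * gradient
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # toy data generated with a true slope of 2
learned = fit_slope(xs, ys)
print(round(learned, 4))  # 2.0
```

The technique shown is gradient descent, one common way machines carry out this kind of trial and error.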

6
Q

Within machine learning: the traditional advanced analytic techniques you will learn about later in this course are not well
suited to the unstructured nature of some big data

A

Within machine learning: the traditional advanced analytic techniques you will learn about later in this course are not well
suited to the unstructured nature of some big data

7
Q

Machine learning, however, takes advantage of a computer’s ability to
follow rules and execute swift comparisons as a fast way to understand patterns and meaning in data.

The algorithms automatically sort data, testing and comparing what they have seen in the past to what they are seeing in the present. The learning may lead to a new understanding of behavior, or it might serve as automatic input to an action executed by another computer process.

A

Machine learning, however, takes advantage of a computer’s ability to
follow rules and execute swift comparisons as a fast way to understand patterns and meaning in
data.

The algorithms automatically sort data, testing and comparing what they have seen in the past to
what they are seeing in the present. The learning may lead to a new understanding of behavior, or it might serve as automatic input to an action executed by another computer process.

8
Q

Supervised learning always has a predetermined outcome provided by the programmer. The machine seeks faster, more efficient, or more accurate ways to meet the goal based on the data and the programmer's input.

A

Supervised learning always has a predetermined outcome provided by the programmer. The machine seeks faster, more efficient, or more accurate ways to meet the goal based on the data and the programmer's input.

9
Q

Supervised Learning

Process
Machine is given pre-classified data and discovers a pattern associated with the classification. As more data becomes available, the machine adjusts its associations and gets better at classifying

Example
Sorting out junk emails from wanted content

Limitations
Only works on one task at a time.
User may not be able to interpret the associations behind the sorting.

Traditional stat tool
Regression
Classification
Decision Trees
Random Forests
Bayesian statistics
A

Supervised Learning

Process
Machine is given pre-classified data and discovers a pattern associated with the classification. As more data becomes available, the machine adjusts its associations and gets better at classifying

Example
Sorting out junk emails from wanted content

Limitations
Only works on one task at a time.
User may not be able to interpret the associations behind the sorting.

Traditional stat tool
Regression
Classification
Decision Trees
Random Forests
Bayesian statistics
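
The junk-email example can be sketched with a small naive Bayes classifier, one Bayesian approach to supervised learning. This is an illustration only: the functions `train` and `classify`, the labels, and the four training messages are hypothetical.

```python
import math
from collections import Counter

def train(labeled_messages):
    """Count word frequencies per label from pre-classified messages."""
    word_counts = {"junk": Counter(), "wanted": Counter()}
    label_counts = Counter()
    for text, label in labeled_messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Pick the label with the higher naive Bayes log-score."""
    vocabulary = set(word_counts["junk"]) | set(word_counts["wanted"])
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in word_counts:
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

training_data = [  # hypothetical pre-classified messages
    ("win a free prize now", "junk"),
    ("free money claim now", "junk"),
    ("meeting agenda for monday", "wanted"),
    ("lunch on monday with the team", "wanted"),
]
model = train(training_data)
prediction_a = classify("claim your free prize", *model)
prediction_b = classify("agenda for the team meeting", *model)
```

Adding more labeled messages to `training_data` changes the word counts, which is how the classifier's associations adjust as more data becomes available.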
10
Q

Unsupervised Learning

Here the data determines the outcome. The algorithm’s mission is to extract structure from the data, and to present the structure in a way that is useful to us. Data is segmented and scored based on what the computer itself decides is relevant or related.

A

Unsupervised Learning

Here the data determines the outcome. The algorithm’s mission is to extract structure from the data, and to present the structure in a way that is useful to us. Data is segmented and scored based on what the computer itself decides is relevant or related.

11
Q

Unsupervised Learning

Process
Machine is given a lot of data and told to hunt for patterns and “clusters” of things related to each other. It draws its own conclusions about relationships.

Example
Recommendation engines
Loyalty card targeting

Limitation
Usually requires human input after the fact

Traditional Statistics Tool
Factor Analysis
Cluster Analysis
Multidimensional Scaling
Principal Component Analysis
A

Unsupervised Learning

Process
Machine is given a lot of data and told to hunt for patterns and “clusters” of things related to each other. It draws its own conclusions about relationships.

Example
Recommendation engines
Loyalty card targeting

Limitation
Usually requires human input after the fact

Traditional Statistics Tool
Factor Analysis
Cluster Analysis
Multidimensional Scaling
Principal Component Analysis
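
The hunt for clusters can be sketched with k-means, a common cluster analysis technique, here on a single variable. The function `k_means_1d`, the starting centers, and the spend figures are hypothetical.

```python
def k_means_1d(values, centers, rounds=10):
    """Assign each value to its nearest center, then move each center to its group's mean."""
    clusters = []
    for _ in range(rounds):
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

weekly_spend = [12, 15, 14, 80, 85, 90]  # hypothetical loyalty-card spend
centers, clusters = k_means_1d(weekly_spend, centers=[0.0, 100.0])
print(clusters)  # [[12, 15, 14], [80, 85, 90]]
```

The algorithm finds the two groups without being given labels, but a person still has to decide what the groups mean (for example, light versus heavy shoppers), which is the human input after the fact noted under Limitation.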
12
Q

Reinforcement Learning

This type of learning has no supervisor, but instead it has a reward signal that defines success. Similar to human learning, when success is rewarded, the machine tries to learn the patterns that result in receiving the reinforcement signal. The machine’s decisions affect the subsequent data it receives.

A

Reinforcement Learning

This type of learning has no supervisor, but instead it has a reward signal that defines success. Similar to human learning, when success is rewarded, the machine tries to learn the patterns that result in receiving the reinforcement signal. The machine’s decisions affect the subsequent data it receives.

13
Q

Reinforcement Learning

Process
Machine not only analyzes data but uses the output to improve efficiency or create new strategies. Learns how to apply a set of rules toward an outcome in the most efficient way.

Example
Game playing bots
War Game Simulations

Limitation
Strategies may not be understandable by humans so may be limited to one situation.

Traditional Statistics Tool
Game Theory
Linear Programming

A

Reinforcement Learning

Process
Machine not only analyzes data but uses the output to improve efficiency or create new strategies. Learns how to apply a set of rules toward an outcome in the most efficient way.

Example
Game playing bots
War Game Simulations

Limitation
Strategies may not be understandable by humans so may be limited to one situation.

Traditional Statistics Tool
Game Theory
Linear Programming
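
Learning from a reward signal can be sketched with a simplified multi-armed bandit. This is an illustration under strong assumptions: rewards are fixed numbers rather than noisy outcomes, exploration follows a fixed schedule, and `learn_best_action` and the reward values are hypothetical.

```python
def learn_best_action(rewards, rounds=60, explore_every=5):
    """Estimate each action's reward while mostly repeating the best-known action."""
    estimates = [0.0] * len(rewards)
    pulls = [0] * len(rewards)
    for step in range(rounds):
        if step % explore_every == 0:
            # Explore: cycle through the actions in turn.
            action = (step // explore_every) % len(rewards)
        else:
            # Exploit: choose the action with the highest estimated reward so far.
            action = max(range(len(rewards)), key=lambda a: estimates[a])
        reward = rewards[action]  # reward signal returned by the environment
        pulls[action] += 1
        # Move the running average toward the reward just observed.
        estimates[action] += (reward - estimates[action]) / pulls[action]
    return estimates, pulls

estimates, pulls = learn_best_action([1.0, 5.0, 2.0])
print(pulls)  # [8, 48, 4]
```

Most pulls end up on the action with the highest reward, which shows how the machine's own decisions shape the data it receives next.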

14
Q

Unsupervised Learning works well if we have little or limited knowledge of the data. The best examples of this application are so-called targeting engines or recommendation engines, as when a
supermarket checkout machine issues you a coupon at checkout.

A

Unsupervised Learning works well if we have little or limited knowledge of the data. The best examples of this application are so-called targeting engines or recommendation engines, as when a
supermarket checkout machine issues you a coupon at checkout.

15
Q

The best examples of reinforcement machine learning are machines that play games. Typically the
machine is taught the rules of the game and given a goal to win.

A

The best examples of reinforcement machine learning are machines that play games. Typically the
machine is taught the rules of the game and given a goal to win.

16
Q

Big Data is “The record of all interactions with people, institutions, and things recorded and stored digitally.”
Big data, then, is the digital trail left by humans and their connected machines.

A

Big Data is “The record of all interactions with people, institutions, and things recorded and stored digitally.”
Big data, then, is the digital trail left by humans and their connected machines.

17
Q

The 7 V’s of Big Data

Volume 
Velocity
Variety
Variability
Visualization
Veracity
Value
A

The 7 V’s of Big Data

Volume 
Velocity
Variety
Variability
Visualization
Veracity
Value
18
Q

Within the healthcare industry, pharmaceutical syndicated services track sales, price, and
distribution of most pharmaceuticals.

A

Within the healthcare industry, pharmaceutical syndicated services track sales, price, and
distribution of most pharmaceuticals.

19
Q

The majority of pharmaceutical data comes from patient billing and processing at every point of
the drug supply chain.

A

The majority of pharmaceutical data comes from patient billing and processing at every point of
the drug supply chain.

20
Q

Unlike most CPG categories, however, the pharmaceutical sales and
distribution chain involves many government regulations, physicians, other providers, and
insurance companies as mediators of sales to the patient as the ultimate consumer. As we have
said, drug products historically have not been easily tracked using industry-standard digital codes
such as the UPC.

A

Unlike most CPG categories, however, the pharmaceutical sales and
distribution chain involves many government regulations, physicians, other providers, and
insurance companies as mediators of sales to the patient as the ultimate consumer. As we have
said, drug products historically have not been easily tracked using industry-standard digital codes
such as the UPC.

21
Q

Big Data Researcher Skills
These are the most common skills found on big data analytic teams:

Programming
Data Manipulation
Exploratory Data Analytics
Mathematics / Statistics
Business Skills / Domain Expertise
People Skills / Communication Skills
A

Big Data Researcher Skills
These are the most common skills found on big data analytic teams:

Programming
Data Manipulation
Exploratory Data Analytics
Mathematics / Statistics
Business Skills / Domain Expertise
People Skills / Communication Skills
22
Q

The evolution of big data has produced an entirely new field called data science, an interdisciplinary field about scientific methods, processes, and systems to extract knowledge or insights from data in various forms.

A

The evolution of big data has produced an entirely new field called data science, an interdisciplinary field about scientific methods, processes, and systems to extract knowledge or insights from data in various forms.

23
Q

In addition to the typical and historical skills of qualitative techniques and traditional statistical skills of quantitative surveys and analytics, modern researchers are concerned with:
Data Curation
Data Governance
Data Provenance

A

In addition to the typical and historical skills of qualitative techniques and traditional statistical skills of quantitative surveys and analytics, modern researchers are concerned with:
Data Curation
Data Governance
Data Provenance

24
Q

Data Curation

Right data is assembled for the right question

A

Data Curation

Right data is assembled for the right question

25
Q

Data Governance

Data is secure & accurate

A

Data Governance

Data is secure & accurate

26
Q

Data Provenance

Data from reputable sources & tracked through all potential uses

A

Data Provenance

Data from reputable sources & tracked through all potential uses

27
Q

Machine Learning is also being used to understand the return on investment of marketing itself
(MROI), that is, measuring how much money is generated by investing in marketing

A

Machine Learning is also being used to understand the return on investment of marketing itself
(MROI), that is, measuring how much money is generated by investing in marketing

28
Q

The datasets
that track marketing spending are large, have more variables than cases, and contain
relationships that are often non-linear mixed with responses that are well behaved and
straightforward.

A

The datasets
that track marketing spending are large, have more variables than cases, and contain
relationships that are often non-linear mixed with responses that are well behaved and
straightforward.

29
Q

Prediction and understanding are related but independent goals of market research; it is simply a decision of the business regarding the
problem at hand whether one goal might be favored over the other.

A

Prediction and understanding are related but independent goals of market research; it is simply a decision of the business regarding the
problem at hand whether one goal might be favored over the other.

30
Q

striking the right balance between prediction and understanding is still required

A

striking the right balance between prediction and understanding is still required

31
Q

The complexity of big data
and the algorithms that read it have increased the frequency with which predictive models take
precedence as more and more marketing occurs in digital environments where automation is
possible and desirable.

A

The complexity of big data
and the algorithms that read it have increased the frequency with which predictive models take
precedence as more and more marketing occurs in digital environments where automation is
possible and desirable.

32
Q

Data that is collected for any purpose other than to meet the needs of your
particular study and subsequently used in research is called “secondary” data.

A

Data that is collected for any purpose other than to meet the needs of your
particular study and subsequently used in research is called “secondary” data.

33
Q

Secondary data in some detail. It is data collected for:
any purpose other than to meet the needs of your particular study.
non-specific research purposes, called “syndicated” or multi-client data.
another purpose and subsequently used in research.

A

Secondary data in some detail. It is data collected for:
any purpose other than to meet the needs of your particular study.
non-specific research purposes, called “syndicated” or multi-client data.
another purpose and subsequently used in research.

34
Q

The second most important question in your research design, just behind the original purpose of
the research, is “What is already known about your research goal?”

A

The second most important question in your research design, just behind the original purpose of
the research, is “What is already known about your research goal?”

35
Q

To reiterate and re-emphasize: All market research designs for every research project should
begin with an assessment of what is already known about your research problem. The answer to
that question almost always involves the search for and use of secondary data in its many forms,
whether you are merely searching the Internet or buying needed data from a broker.

A

To reiterate and re-emphasize: All market research designs for every research project should
begin with an assessment of what is already known about your research problem. The answer to
that question almost always involves the search for and use of secondary data in its many forms,
whether you are merely searching the Internet or buying needed data from a broker.

36
Q

the advantage of each imagined example of secondary research is that it costs
less time, money and effort.

A

the advantage of each imagined example of secondary research is that it costs
less time, money and effort.

37
Q

No single organization outside of governments could fund entire population studies such as censuses
or large-scale public health studies, for example

A

No single organization outside of governments could fund entire population studies such as censuses
or large-scale public health studies, for example

38
Q

Secondary data may be older than you like or might be an annual estimate rather than the
monthly trend information that would be best for your project; or the best information that you
can find may be at a regional level and you might need data about a city in the region

A

Secondary data may be older than you like or might be an annual estimate rather than the
monthly trend information that would be best for your project; or the best information that you
can find may be at a regional level and you might need data about a city in the region

39
Q

Two key elements to review to assess the potential for bias are the original purpose of the
research or data and the methodology of the original research.

A

Two key elements to review to assess the potential for bias are the original purpose of the
research or data and the methodology of the original research.

40
Q

If possible, it is always a good idea to evaluate the original sources rather than a summary or an
article describing the research

A

If possible, it is always a good idea to evaluate the original sources rather than a summary or an
article describing the research

41
Q

The best way to evaluate accuracy is to examine multiple sources of the same estimates

A

The best way to evaluate accuracy is to examine multiple sources of the same estimates
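
One way to examine multiple sources of the same estimate is to check how far each one sits from the middle value. This is a sketch only: `compare_sources`, the 10% tolerance, and the three figures are hypothetical.

```python
from statistics import median

def compare_sources(estimates, tolerance=0.10):
    """Flag whether each source is within `tolerance` (as a fraction) of the median estimate."""
    mid = median(estimates.values())
    return {source: abs(value - mid) / mid <= tolerance
            for source, value in estimates.items()}

estimates = {"source_a": 1040, "source_b": 1000, "source_c": 1500}  # hypothetical figures
agreement = compare_sources(estimates)
print(agreement)  # {'source_a': True, 'source_b': True, 'source_c': False}
```

A source that disagrees with the others is not necessarily wrong, but it is the one whose purpose and methodology deserve a closer look.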

42
Q

Finally, you should evaluate the credibility and reputation of the source of your data

A

Finally, you should evaluate the credibility and reputation of the source of your data

43
Q

One part of the syndicated data industry builds these profiles and makes them
available to buyers.

A

One part of the syndicated data industry builds these profiles and makes them
available to buyers.

44
Q

The benefit from knowing more about your customer is pretty obvious, and there are many ways
to know them. One part of the syndicated data industry builds these profiles and makes them
available to buyers.

A

The benefit from knowing more about your customer is pretty obvious, and there are many ways
to know them. One part of the syndicated data industry builds these profiles and makes them
available to buyers.

45
Q

Psychographics pertain to people’s values, attitudes, lifestyles, and personalities. These deeper,
more personal and less obviously observed criteria are helpful in understanding consumer
motivation.

A

Psychographics pertain to people’s values, attitudes, lifestyles, and personalities. These deeper,
more personal and less obviously observed criteria are helpful in understanding consumer
motivation.

46
Q

Demographics: Grouping people by age, gender,
ethnicity, income, education, geographic or other
physical and structural characteristics.

A

Demographics: Grouping people by age, gender,
ethnicity, income, education, geographic or other
physical and structural characteristics.

47
Q

Behavioral: Grouping people by common behaviors,
e.g., products purchased, websites or stores visited,
shopping behaviors.

A

Behavioral: Grouping people by common behaviors,
e.g., products purchased, websites or stores visited,
shopping behaviors.

48
Q

Psychographics: Grouping people by shared values,
principles, personality, interests, and lifestyle.

A

Psychographics: Grouping people by shared values,
principles, personality, interests, and lifestyle.

49
Q

Market Share is one of the oldest and simplest forms of measuring performance and seems so
basic that many do not realize that Arthur Nielsen, Sr. actually invented it as a tracking measure in
his first Nielsen syndicated database in the 1930s

A

Market Share is one of the oldest and simplest forms of measuring performance and seems so
basic that many do not realize that Arthur Nielsen, Sr. actually invented it as a tracking measure in
his first Nielsen syndicated database in the 1930s

50
Q

Simply tracking the sales of a product relative to total category sales is still a simple and powerful measure of performance

A

Simply tracking the sales of a product relative to total category sales is still a simple and powerful measure of performance
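
Market share as described here is simple arithmetic. A minimal sketch, with a hypothetical function name and sales figures:

```python
def market_share(product_sales, category_sales):
    """Return product sales as a percentage of total category sales."""
    if category_sales <= 0:
        raise ValueError("category sales must be positive")
    return 100.0 * product_sales / category_sales

share = market_share(product_sales=250_000, category_sales=2_000_000)
print(share)  # 12.5
```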

51
Q

A Development Index compares the sales of a product to either population (as sales per capita)
or to total possible ACV in a market.

A

A Development Index compares the sales of a product to either population (as sales per capita)
or to total possible ACV in a market.
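
The per-capita version of a development index can be sketched as follows. The formula is the common textbook form (market sales per capita divided by overall sales per capita, times 100) and is an assumption here, as are the function name and figures; the ACV-based version is not shown.

```python
def development_index(market_sales, market_population, total_sales, total_population):
    """Index a market's sales per capita against overall sales per capita (100 = average)."""
    market_per_capita = market_sales / market_population
    overall_per_capita = total_sales / total_population
    return 100.0 * market_per_capita / overall_per_capita

# Hypothetical figures: one market within a national total.
bdi = development_index(market_sales=30_000, market_population=200_000,
                        total_sales=1_000_000, total_population=10_000_000)
print(round(bdi))  # 150
```

An index above 100 means the market buys more per person than average; below 100 means it buys less.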

52
Q

The Developmental Indices allow you to quickly spot performance gaps and detect any regional
issues that might result from retail performance, sales personnel performance or even product preference

A

The Developmental Indices allow you to quickly spot performance gaps and detect any regional
issues that might result from retail performance, sales personnel performance or even product preference

53
Q

High Category Indices show reasonable demand for products similar to ours,
but for some reason, we are not appealing to the market or there is a barrier to our sales that
bears further investigation.

A

High Category Indices show reasonable demand for products similar to ours,
but for some reason, we are not appealing to the market or there is a barrier to our sales that
bears further investigation.

54
Q

The BDI/CDI chart can help visualize where our new product might
have high potential (High CDI) but face strong competition (High Competitive BDI). When over 50
markets are commonly available, this matrix helps organize and visualize the information.

A

The BDI/CDI chart can help visualize where our new product might
have high potential (High CDI) but face strong competition (High Competitive BDI). When over 50
markets are commonly available, this matrix helps organize and visualize the information.
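
Sorting markets into the cells of such a matrix can be sketched like this. The cut-off of 100 for "high", the labels, the function `quadrant`, and the three markets are assumptions for illustration.

```python
def quadrant(cdi, competitive_bdi, threshold=100.0):
    """Label a market by category potential (CDI) and competitor strength (competitive BDI)."""
    potential = "high potential" if cdi >= threshold else "low potential"
    competition = "strong competition" if competitive_bdi >= threshold else "weak competition"
    return f"{potential}, {competition}"

# Hypothetical (CDI, competitive BDI) pairs per market.
markets = {"Market A": (130, 140), "Market B": (125, 70), "Market C": (80, 60)}
labels = {name: quadrant(cdi, bdi) for name, (cdi, bdi) in markets.items()}
for name, label in labels.items():
    print(name, "->", label)
```

With 50 or more markets, grouping by label gives the same view as the chart in list form.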

55
Q

Advanced analytic techniques can precisely
estimate and predict the effects of the change for every item, product, brand, or category, but
often simple observational techniques and arithmetic can provide much real information to
inform price decisions

A

Advanced analytic techniques can precisely
estimate and predict the effects of the change for every item, product, brand, or category, but
often simple observational techniques and arithmetic can provide much real information to
inform price decisions

56
Q

Every company generates and retains data as part of conducting its business, and much of it is useful in answering market research questions.

A

Every company generates and retains data as part of conducting its business, and much of it is useful in answering market research questions.

57
Q

Most modern businesses compile information in customer databases, data warehouses, or
enterprise decision support systems. Increasingly this information is augmented by external
information obtained from sources we will discuss later so that massive amounts of information
are available internally.

A

Most modern businesses compile information in customer databases, data warehouses, or
enterprise decision support systems. Increasingly this information is augmented by external
information obtained from sources we will discuss later so that massive amounts of information
are available internally.

58
Q

Modern computing power is enabling businesses to track and manage
millions of customers, whether consumers or businesses, in a practice called
Customer Relationship Management (CRM).

A

Modern computing power is enabling businesses to track and manage
millions of customers, whether consumers or businesses, in a practice called
Customer Relationship Management (CRM).

59
Q

Data mining is the application of usually automated analytic techniques to find patterns in data
that can be used to grow a business

A

Data mining is the application of usually automated analytic techniques to find patterns in data
that can be used to grow a business

60
Q

two types of external data: Syndicated Services and Big Data.

A

two types of external data: Syndicated Services and Big Data.

61
Q

Technologies that promise to automate and discover new
insights about customers and consumers are transforming the market research industry as well.

Technology and the Internet have now automated many market research processes that replace what was considerable human effort. Data coding, text analysis, sample selection,
and questionnaire management and creation are all examples

Automated facial recognition algorithms can now detect emotional response to advertising
in real time.

Advanced data fusion techniques interconnect and link attitudes and opinions from
thousands of surveys that may share only the minimum of actual questionnaire content

A

Technologies that promise to automate and discover new
insights about customers and consumers are transforming the market research industry as well.

Technology and the Internet have now automated many market research processes that replace what was considerable human effort. Data coding, text analysis, sample selection,
and questionnaire management and creation are all examples

Automated facial recognition algorithms can now detect emotional response to advertising
in real time.

Advanced data fusion techniques interconnect and link attitudes and opinions from
thousands of surveys that may share only the minimum of actual questionnaire content

62
Q

So it makes sense that primary and secondary research often work together in the same
project

A

So it makes sense that primary and secondary research often work together in the same
project

63
Q

The use of multiple sources validates your data and investigation by
cross-verifying the same ideas and information. This process is sometimes known as
data triangulation, taken from the navigation process of locating an unknown point in space through geometric relationships among other known points. You can validate by triangulating
data sources, research methods, or even theories

A

The use of multiple sources validates your data and investigation by
cross-verifying the same ideas and information. This process is sometimes known as
data triangulation, taken from the navigation process of locating an unknown point in space through geometric relationships among other known points. You can validate by triangulating
data sources, research methods, or even theories

64
Q

Data triangulation is simply using evidence from many different
types of data sources, such as interviews, documents, public records, social media conversations,
or observations.

A

Data triangulation is simply using evidence from many different
types of data sources, such as interviews, documents, public records, social media conversations,
or observations.

65
Q
Theory triangulation is a bit more
complicated because the theories
are helping you understand your
data better rather than the sorts of
integration required in data and
methodology triangulation. Here you
are applying different theories to the
data to help make sense of it. One theory might support your data and
another might undermine it.
A
Theory triangulation is a bit more
complicated because the theories
are helping you understand your
data better rather than the sorts of
integration required in data and
methodology triangulation. Here you
are applying different theories to the
data to help make sense of it. One theory might support your data and
another might undermine it.
66
Q

It is important to remember when working with secondary data that it often contains personal
data; that is, data that can be used either directly or indirectly (e.g., by combining it with other
data) to identify a specific individual.

A

It is important to remember when working with secondary data that it often contains personal
data; that is, data that can be used either directly or indirectly (e.g., by combining it with other
data) to identify a specific individual.

67
Q

Two key responsibilities that
apply to secondary data:
1. Researchers must ensure that personal data used in research is thoroughly protected from
unauthorized access and never disclosed without the consent of the data subject.
2. Researchers must always behave ethically and not do anything that might cause harm to a
data subject or damage the reputation of market, opinion, and social research.

A

Two key responsibilities that
apply to secondary data:
1. Researchers must ensure that personal data used in research is thoroughly protected from
unauthorized access and never disclosed without the consent of the data subject.
2. Researchers must always behave ethically and not do anything that might cause harm to a
data subject or damage the reputation of market, opinion, and social research.

68
Q

This data is collected
or purchased by research companies and then the curated data is sold to multiple buyers to help
track or interpret their businesses. The companies collect the data once, but sell it many times to
multiple buyers as a subscription. The research companies also leverage the data for research
products and consulting services that may be customized for their customers.

A

This data is collected
or purchased by research companies and then the curated data is sold to multiple buyers to help
track or interpret their businesses. The companies collect the data once, but sell it many times to
multiple buyers as a subscription. The research companies also leverage the data for research
products and consulting services that may be customized for their customers.

69
Q

Many research companies syndicate what are called “omnibus surveys,” where questions on
many topics are asked during the same interview.

A

Many research companies syndicate what are called “omnibus surveys,” where questions on
many topics are asked during the same interview.

70
Q

By keeping track of unique but
anonymous identifiers and using advanced data fusion techniques, CivicScience builds a massive
database of opinions, preferences, attitudes, and demographic information that it provides to its
customers by subscription.

A

By keeping track of unique but
anonymous identifiers and using advanced data fusion techniques, CivicScience builds a massive
database of opinions, preferences, attitudes, and demographic information that it provides to its
customers by subscription.

71
Q

Household Panel Sales Data

The types of syndicated sales tracking data discussed so far are very accurate estimates of sales
and share in a market, but they hide some very important dynamics of purchasing necessary to
understand trends, diagnose performance, and grow products with the right marketing plans

A

Household Panel Sales Data

The types of syndicated sales tracking data discussed so far are very accurate estimates of sales
and share in a market, but they hide some very important dynamics of purchasing necessary to
understand trends, diagnose performance, and grow products with the right marketing plans

72
Q

Market Measurement

What the business does not know, however, is how much its competitors are selling, and the need
for that knowledge drives the largest scale type of syndicated data: sales tracking for consumer
goods.

A

Market Measurement

What the business does not know, however, is how much its competitors are selling, and the need
for that knowledge drives the largest scale type of syndicated data: sales tracking for consumer
goods.

73
Q

The market for consumer packaged goods (CPG), known as Fast Moving Consumer Goods
(FMCG) outside the United States, is particularly well suited for this kind of tracking. The
distribution channels are well known, products are identified easily, and there are lots of CPG
businesses that can pay for the information and support the costs of collecting the data.

A

The market for consumer packaged goods (CPG), known as Fast Moving Consumer Goods
(FMCG) outside the United States, is particularly well suited for this kind of tracking. The
distribution channels are well known, products are identified easily, and there are lots of CPG
businesses that can pay for the information and support the costs of collecting the data.

74
Q

The Universal Product Code or UPC, developed in the early 1970s to
help automate inventory tracking, proved also to be an essential
factor in automating collection of market information and almost
instantly amplified our ability to understand and optimize price,
promotion, and distribution.

A

The Universal Product Code or UPC, developed in the early 1970s to
help automate inventory tracking, proved also to be an essential
factor in automating collection of market information and almost
instantly amplified our ability to understand and optimize price,
promotion, and distribution.

75
Q

The UPC scanner that allowed quick check-out and payment at the supermarket also
automatically entered each transaction into an electronic database that grew with each
transaction

A

The UPC scanner that allowed quick check-out and payment at the supermarket also
automatically entered each transaction into an electronic database that grew with each
transaction

76
Q

By the late 1970s, a research company named Information Resources Incorporated (IRI) realized
the potential for using UPCs as the basis for automated and powerful data collection for market
research.

A

By the late 1970s, a research company named Information Resources Incorporated (IRI) realized
the potential for using UPCs as the basis for automated and powerful data collection for market
research.

77
Q

The basic structure of all syndicated scanner databases
is similar no matter who builds or sells it. Databases
consist of five essential elements:
Product, Markets, Class of Trade, Time Periods, and Measures.

A

The basic structure of all syndicated scanner databases
is similar no matter who builds or sells it. Databases
consist of five essential elements:
Product, Markets, Class of Trade, Time Periods, and Measures.
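
As a rough illustration of the five elements, one row of such a database could be modeled as below. This is only a sketch: the class name, field names, and all values are hypothetical and do not describe any vendor's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class ScannerRecord:
    """One row of a syndicated scanner database (hypothetical layout)."""
    product: str          # the item, typically identified by its UPC
    market: str           # the geography being tracked
    class_of_trade: str   # the retail channel, e.g. grocery or drug
    time_period: str      # the reporting period
    measures: dict = field(default_factory=dict)  # the facts reported for this cell


# Made-up example values, for illustration only.
record = ScannerRecord(
    product="Brand A cereal 12 oz",
    market="Chicago",
    class_of_trade="Grocery",
    time_period="4 weeks ending 2024-01-28",
    measures={"dollar_sales": 125000.0, "unit_sales": 31250},
)

# Measures can be combined into derived facts such as average price per unit.
avg_price = record.measures["dollar_sales"] / record.measures["unit_sales"]
print(avg_price)
```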

78
Q

The basic measure of audience size is the audience rating.

A

The basic measure of audience size is the audience rating.

79
Q

A rating is defined
as the number of households with their radio/TV sets tuned to a particular station/channel or
program for a specified length of time divided by the total number of households that have
radio/TV.

A

A rating is defined
as the number of households with their radio/TV sets tuned to a particular station/channel or
program for a specified length of time divided by the total number of households that have
radio/TV.
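
The division in this definition can be sketched in a few lines. The function name and the household counts are made up for illustration, and the result is expressed as a percentage, which is an assumption about presentation rather than part of the definition above.

```python
def audience_rating(households_tuned: int, households_with_sets: int) -> float:
    """Rating = households tuned in / all households that have a radio or TV, as a percent."""
    if households_with_sets <= 0:
        raise ValueError("households_with_sets must be positive")
    return 100.0 * households_tuned / households_with_sets


# Hypothetical market: 10,000 TV households, 1,200 of them tuned to the program.
rating = audience_rating(1200, 10000)
print(rating)
```

Note that the denominator is every household that owns a set, whether or not the set is switched on.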

80
Q

The
other fundamental measure is audience share, which is the percentage of TV sets in use that are tuned to a
program or commercial.

A

The
other fundamental measure is audience share, which is the percentage of TV sets in use that are tuned to a
program or commercial.
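
Share is conventionally reported as a percentage of the sets that are in use at that moment, and that convention is assumed in this minimal sketch. The function name and counts are hypothetical.

```python
def audience_share(sets_tuned: int, sets_in_use: int) -> float:
    """Share = sets tuned to the program / sets that are switched on, as a percent."""
    if sets_in_use <= 0:
        raise ValueError("sets_in_use must be positive")
    return 100.0 * sets_tuned / sets_in_use


# Hypothetical evening: 6,000 sets are switched on, 1,200 of them tuned to the program.
share = audience_share(1200, 6000)
print(share)
```

Because only sets in use are counted in the denominator, a program's share is never smaller than its rating for the same time slot.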

81
Q

Nielsen has had to
expand its service to cover this now very fragmented audience and other research companies
have stepped up to offer different versions of so-called “cross-platform” audience measurement.

A

Nielsen has had to
expand its service to cover this now very fragmented audience and other research companies
have stepped up to offer different versions of so-called “cross-platform” audience measurement.

82
Q

Statistically, when a sample cannot or does
not measure something that is real, it is called sampling error. Sampling error happens when a
sample is not representative of a population or is not large enough to accurately measure a
population condition.

A

Statistically, when a sample cannot or does
not measure something that is real, it is called sampling error. Sampling error happens when a
sample is not representative of a population or is not large enough to accurately measure a
population condition.
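
The sample-size half of this statement can be illustrated with the standard margin-of-error formula for a proportion. This sketch assumes a simple random sample and a 95% confidence level (z = 1.96); it says nothing about the other cause named above, a sample that is not representative, which a larger sample does not fix.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion p measured on n respondents."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1 - p) / n)


# The same 50% result measured on a small and on a large sample.
small_sample = margin_of_error(0.5, 100)
large_sample = margin_of_error(0.5, 10_000)
print(round(small_sample, 4), round(large_sample, 4))
```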

83
Q
1. Although the datasets are large, they may not be representative, and therefore not projectable to wider situations.
2. The datasets build so fast they may accumulate incorrect information.
3. Sometimes the underlying process behind the accumulation of data makes the data in the past less predictive of the future. A predictive process that works well enough once may not work in the future.
4. Sometimes the variables of big data are not the things that are closest to what we want to measure directly.
A
1. Although the datasets are large, they may not be representative, and therefore not projectable to wider situations.
2. The datasets build so fast they may accumulate incorrect information.
3. Sometimes the underlying process behind the accumulation of data makes the data in the past less predictive of the future. A predictive process that works well enough once may not work in the future.
4. Sometimes the variables of big data are not the things that are closest to what we want to measure directly.
84
Q

We are going to focus on one
type of this research, Sentiment Analysis, as an example, because of its popularity and promise.
We will also see how big data can supplement and amplify primary research, and at least
speculate on the value of the sensor data from connected devices as a secondary research
application.

A

We are going to focus on one
type of this research, Sentiment Analysis, as an example, because of its popularity and promise.
We will also see how big data can supplement and amplify primary research, and at least
speculate on the value of the sensor data from connected devices as a secondary research
application.

85
Q

Sentiment Analysis tries to analytically determine the attitude of a speaker or writer with respect
to some topic. The attitude may be an opinion, or an intended or actual emotional state.

A

Sentiment Analysis tries to analytically determine the attitude of a speaker or writer with respect
to some topic. The attitude may be an opinion, or an intended or actual emotional state.

86
Q

Computers read words through a process called Natural Language Processing (NLP) and they
infer sentiment by a combination of NLP and machine learning. Both NLP and sentiment analysis
are very complex computer processes to accomplish a task that seems obvious to humans, but
the sheer scale of the data requires a machine. New algorithms are continuously improving our
ability to accurately read this data in a field of computer science that is relatively young.

A

Computers read words through a process called Natural Language Processing (NLP) and they
infer sentiment by a combination of NLP and machine learning. Both NLP and sentiment analysis
are very complex computer processes to accomplish a task that seems obvious to humans, but
the sheer scale of the data requires a machine. New algorithms are continuously improving our
ability to accurately read this data in a field of computer science that is relatively young.

87
Q

Three Sentiment Analysis Approaches
Knowledge Based
Statistically Based
Hybrid

A

Three Sentiment Analysis Approaches
Knowledge Based
Statistically Based
Hybrid

88
Q

Knowledge Based

Pre-classify text by categories based on the
presence of unambiguous words such as
happy, sad, angry, or bored. Assign scores to
more ambiguous terms based on previous
research.

Compare analysis text to
pre-scored reference
database

A

Knowledge Based

Pre-classify text by categories based on the
presence of unambiguous words such as
happy, sad, angry, or bored. Assign scores to
more ambiguous terms based on previous
research.

Compare analysis text to
pre-scored reference
database
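
A minimal sketch of the knowledge-based idea follows, assuming a tiny hand-built reference lexicon. The word scores, the threshold, and the function names are all hypothetical; a real reference database would be far larger.

```python
# Hypothetical pre-scored reference database: unambiguous words get full scores,
# a more ambiguous word ("fine") gets a weaker score.
LEXICON = {"happy": 1.0, "sad": -1.0, "angry": -1.0, "bored": -0.5, "fine": 0.3}


def score_text(text: str, lexicon: dict = LEXICON) -> float:
    """Sum the lexicon scores of the words found in the text."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    return sum(lexicon.get(w, 0.0) for w in words)


def classify(score: float, threshold: float = 0.25) -> str:
    """Map a numeric score to positive, negative, or neutral."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


for sample in ("I am so happy with this phone!",
               "The wait made me angry and bored.",
               "It arrived on Tuesday."):
    print(sample, "->", classify(score_text(sample)))
```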

89
Q

Statistically Based

Use machine learning to train algorithms
against known outcomes. For example, a
machine can sift through reviews and learn
which words in the reviews are associated with
accompanying star ratings

Compare analysis text to a
reference database built
from the statistical analysis

A

Statistically Based

Use machine learning to train algorithms
against known outcomes. For example, a
machine can sift through reviews and learn
which words in the reviews are associated with
accompanying star ratings

Compare analysis text to a
reference database built
from the statistical analysis
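
The star-rating example can be sketched with a deliberately simple learner: each word is scored by the average rating, centered on 3 stars, of the reviews it appears in. This is only one illustrative way to "train against known outcomes", not the method of any particular product, and the four reviews are made up.

```python
from collections import defaultdict


def tokenize(text: str) -> list:
    return [w.strip(".,!?;:").lower() for w in text.split()]


def train_word_scores(reviews: list) -> dict:
    """Learn word -> mean of (stars - 3) over the reviews containing the word."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, stars in reviews:
        for word in set(tokenize(text)):  # count each word once per review
            totals[word] += stars - 3
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


def predict(text: str, word_scores: dict) -> float:
    """Average the learned scores of the known words; 0.0 if none are known."""
    known = [word_scores[w] for w in tokenize(text) if w in word_scores]
    return sum(known) / len(known) if known else 0.0


# Made-up training reviews with their star ratings.
reviews = [
    ("great battery and great screen", 5),
    ("terrible battery", 1),
    ("great value", 4),
    ("terrible support and slow shipping", 2),
]
word_scores = train_word_scores(reviews)
print(predict("great screen", word_scores), predict("terrible shipping", word_scores))
```

Note that "battery" appears in both a 5-star and a 1-star review, so its learned score comes out at zero: the data itself shows the word carries no sentiment.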

90
Q

Hybrid

Both methods combined, for example pre-coding
unambiguous words and modeling
more ambiguous terms against star ratings

Compare analysis text to a
reference database

A

Hybrid

Both methods combined, for example pre-coding
unambiguous words and modeling
more ambiguous terms against star ratings

Compare analysis text to a
reference database
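
One way to picture the combination is a lookup that consults the pre-coded words first and falls back to modeled scores for everything else. Both dictionaries below hold hypothetical values; the second stands in for scores that would have been modeled against star ratings.

```python
# Pre-coded unambiguous words (hypothetical scores).
PRECODED = {"happy": 1.0, "sad": -1.0, "angry": -1.0, "bored": -1.0}

# Stand-in for scores modeled against star ratings (hypothetical values).
LEARNED = {"fast": 0.5, "cheap": -0.25, "loud": -0.5}


def hybrid_score(text: str) -> float:
    """Score each word from the pre-coded list if present, else from the learned scores."""
    total = 0.0
    for raw in text.split():
        word = raw.strip(".,!?;:").lower()
        if word in PRECODED:
            total += PRECODED[word]
        else:
            total += LEARNED.get(word, 0.0)
    return total


print(hybrid_score("Happy with the fast delivery"))
print(hybrid_score("Cheap and loud"))
```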

91
Q

Given a target subject, then, sentiment analysis can analyze an enormous amount of text and
classify whether the text is generally positive, negative, or neutral about the subject. The best
techniques are pretty good at this task relative to human classification. Many studies have shown
that there is generally about 80% agreement between human readers in classifying the same text.
The best machine analysis achieves about 70%. That level is pretty good compared to 80%.

A

Given a target subject, then, sentiment analysis can analyze an enormous amount of text and
classify whether the text is generally positive, negative, or neutral about the subject. The best
techniques are pretty good at this task relative to human classification. Many studies have shown
that there is generally about 80% agreement between human readers in classifying the same text.
The best machine analysis achieves about 70%. That level is pretty good compared to 80%.
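
Agreement figures like these can be computed as simple percent agreement: the share of texts on which two raters assign the same label. The sketch below uses ten made-up labels chosen to mirror the 80% and 70% figures; it is not data from any study, and simple percent agreement does not correct for agreement that would occur by chance.

```python
def percent_agreement(labels_a: list, labels_b: list) -> float:
    """Percent of items on which two raters assigned the same label."""
    if not labels_a or len(labels_a) != len(labels_b):
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Hypothetical labels for the same ten texts.
human_1 = ["pos", "pos", "neg", "neu", "pos", "neg", "neg", "neu", "pos", "neg"]
human_2 = ["pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg", "pos", "neg"]
machine = ["neu", "pos", "neg", "neu", "neu", "neg", "neg", "neu", "pos", "neu"]

print(percent_agreement(human_1, human_2))  # human vs. human
print(percent_agreement(human_1, machine))  # human vs. machine
```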

92
Q

Machines currently are not as accurate when
assessing finer degrees of positivity or negativity, as in the difference between a 4- and a 5-star
rating in a review.

A

Machines currently are not as accurate when
assessing finer degrees of positivity or negativity, as in the difference between a 4- and a 5-star
rating in a review.

93
Q

Probably the biggest limitation of sentiment analysis is that all social listening data, no matter how
much of it exists, is essentially qualitative. Most comments are by definition non-representative.
People passionate enough to comment generate these comments, and they are people with
access to social media. The large number of comments does not change this inherent fact.

A

Probably the biggest limitation of sentiment analysis is that all social listening data, no matter how
much of it exists, is essentially qualitative. Most comments are by definition non-representative.
People passionate enough to comment generate these comments, and they are people with
access to social media. The large number of comments does not change this inherent fact.

94
Q

Internet of Things analysis is not yet a common form of data
for marketing organizations. It is being tested and we are all learning.

A

Internet of Things analysis is not yet a common form of data
for marketing organizations. It is being tested and we are all learning.