Lecture notes Flashcards
Internal data
Usually readily available
External data
Can be obtained in other ways, e.g. web scraping
Structured data
Typically easy to process and analyse
Unstructured data
Might require specialised methods
Tall data
Has a lot of rows, so many observations
Wide data
Has a lot of columns, so many variables
Velocity
Looks at how often data arrives or is updated (one-time, monthly, weekly, hourly) and is often related to the timeliness of variables/outcomes
Variety
Looks at the combination of sources and data types (numbers, text, images). It also reflects whether the data is internal or external, and structured or unstructured
90% confidence interval
If the sampling were repeated many times, on average 9 out of 10 of the resulting intervals would contain the true value
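The "9 out of 10" reading can be checked with a small simulation. This is only a sketch: the sample size, true mean, and standard deviation are made-up values, and the interval assumes a known standard deviation with the normal critical value 1.645.

```python
import random
import statistics

def coverage(n_repeats=2000, n=50, true_mean=10.0, sd=2.0, z=1.645, seed=0):
    """Share of simulated 90% intervals that contain the true mean."""
    rng = random.Random(seed)
    half_width = z * sd / n ** 0.5  # known-sd interval, normal approximation
    hits = 0
    for _ in range(n_repeats):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        if mean - half_width <= true_mean <= mean + half_width:
            hits += 1
    return hits / n_repeats

print(coverage())  # should land close to 0.90
```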
Web scraping
Extracting data from websites. This can be done manually or automatically, but it usually refers to a bot or software doing so. Essentially, you can obtain most of the information that is normally visible on a website. It’s legal but the data needs to be public. There are techniques to limit web crawling, e.g. CAPTCHA
Application Programming Interface (API)
An interface through which firms allow access to their databases. It is easier for the collector and raises no legal concerns
A/B testing
A methodology for comparing two versions of a webpage or app against each other to determine which one performs better. It is simple and usually cheap
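One common way to decide which version "performs better" is a two-proportion z-test on conversion rates. The sketch below assumes that metric and that test; the card itself does not prescribe either, and the visitor counts in the usage line are invented.

```python
import math

def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates between A and B."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return p_a, p_b, z, p_value

# Hypothetical data: 100 of 1000 visitors convert on A, 130 of 1000 on B.
print(ab_test(100, 1000, 130, 1000))
```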
Multivariate testing
Multiple variables are tested at the same time to find the combination that most improves the primary metric. It is useful for relative effects and interactions
Full factorial design
Tests all combinations; looks at how multiple factors influence a specific outcome. Useful if you suspect interactions
Fractional factorial design
Looks at a limited number of combinations. It does not account for interactions, but is simpler and faster
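The difference between the two designs can be sketched for two-level factors coded as -1 and +1. The half fraction below is one standard construction (the last factor is set to the product of the others); it is an illustration, not necessarily the design used in the lecture. Because the last factor is tied to that product, its effect cannot be separated from the interaction of the other factors.

```python
from itertools import product

def full_factorial(n_factors):
    """Every combination of n two-level factors (levels coded -1 and +1)."""
    return list(product((-1, 1), repeat=n_factors))

def half_fraction(n_factors):
    """Half of the full design: the last factor equals the product of the others."""
    runs = []
    for base in product((-1, 1), repeat=n_factors - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs

print(len(full_factorial(3)))  # 8 conditions
print(len(half_fraction(3)))   # 4 conditions
```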
Bleier et al. Study 1 method
Used 13 design elements in 16 variants. Used a factorial design with 256 conditions. Also collected data on experience dimensions
Expert endorsement
Does not impact informativeness
Interaction effects (moderation)
Tests whether the main effect differs across levels of a third variable (the moderator)
Customer journey
The experience customers go through when interacting with a firm. It is not limited to the purchase but also includes the stages before, and sometimes after. Interactions are sometimes referred to as touchpoints
Multi-touch attribution (MTA)
Reflects that consumers can interact with many touchpoints before eventually making a purchase. The problem is identifying which touchpoint is the direct cause (it may also be a combination). As some touchpoints are very costly, you want to identify the consequences of removing one
Honey
A browser extension that is supposed to find you discounts. You can click on it at the checkout page in webshops to apply the discount code
Markov model
Represents the customer journey as a series of paths through touchpoints, with a probability of moving from one touchpoint to the next
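A minimal sketch of how a Markov model can estimate the consequence of removing a touchpoint: estimate transition probabilities from observed paths, then compare the conversion probability with and without that touchpoint (often called the removal effect). The paths, channel names, and function names below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical journeys: (touchpoints in order, did the customer convert?)
paths = [
    (["search", "email"], True),
    (["search"], False),
    (["email"], True),
    (["display", "search"], False),
]

def transition_probs(paths):
    """Estimate P(next state | current state) from observed paths."""
    counts = defaultdict(lambda: defaultdict(int))
    for touchpoints, converted in paths:
        end = "conversion" if converted else "null"
        states = ["start"] + list(touchpoints) + [end]
        for a, b in zip(states, states[1:]):
            counts[a][b] += 1
    return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
            for a, nxt in counts.items()}

def conversion_prob(probs, removed=None, n_iter=200):
    """P(reaching 'conversion' from 'start'); paths through `removed` are lost."""
    value = defaultdict(float)
    value["conversion"] = 1.0
    for _ in range(n_iter):  # repeated updates until the values settle
        for state, nxt in probs.items():
            if state == removed:
                continue
            value[state] = sum(p * value[b] for b, p in nxt.items() if b != removed)
    return value["start"]

def removal_effect(paths, channel):
    """Share of conversions lost if `channel` were removed."""
    probs = transition_probs(paths)
    return 1 - conversion_prob(probs, removed=channel) / conversion_prob(probs)

for channel in ("email", "search", "display"):
    print(channel, round(removal_effect(paths, channel), 3))
```

In this toy data, email has the largest removal effect because every converting path passes through it.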
Social contagion
Occurs when connections among consumers can affect how information and behaviour spread
Bass model
Explains adoption through innovators and imitators. It is based on a coefficient of innovation (p) and a coefficient of imitation (q)
Innovation
Signals that consumers are willing to try new things
Imitation
Signals social contagion. Driving imitation can greatly increase diffusion
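The roles of p and q can be sketched with the usual discrete-time form of the Bass model, where new adopters in a period equal (p + q × share already adopted) × remaining potential. The market size m and the parameter values in the usage lines are illustrative assumptions, not values from the lecture.

```python
def bass_adoption(p, q, m, periods):
    """New adopters per period under a discrete-time Bass model.

    p: coefficient of innovation, q: coefficient of imitation,
    m: market potential (total number of eventual adopters).
    """
    cumulative = 0.0
    new_per_period = []
    for _ in range(periods):
        new = (p + q * cumulative / m) * (m - cumulative)
        cumulative += new
        new_per_period.append(new)
    return new_per_period

# Illustrative values only: raising q speeds up early diffusion.
low_q = bass_adoption(p=0.03, q=0.38, m=1000, periods=5)
high_q = bass_adoption(p=0.03, q=0.50, m=1000, periods=5)
print(round(sum(low_q)), round(sum(high_q)))
```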
Celebrities
Start with followers from outside social media but can use social media to stay connected to their fanbase
Influencers
Famous because of their social media content. They are typically seen as more relatable and down-to-earth, and are more open to bidirectional interactions
Degree centrality
Reflects preexisting popularity through the number of ties a node has. The higher the degree, the more central the node is
In-degree
A count of the number of ties directed to the node
Out-degree
The number of ties that the node directs to others
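In-degree and out-degree can be counted directly from a list of directed ties. In this sketch each tie is a (from, to) pair, read here as "follower, followed"; the names are invented.

```python
from collections import Counter

def degrees(edges):
    """Return (in_degree, out_degree) counters for a list of directed ties."""
    out_degree = Counter(src for src, _ in edges)
    in_degree = Counter(dst for _, dst in edges)
    return in_degree, out_degree

# Hypothetical network: ann and cy follow bo, bo follows ann.
edges = [("ann", "bo"), ("cy", "bo"), ("bo", "ann")]
in_degree, out_degree = degrees(edges)
print(in_degree["bo"], out_degree["bo"])  # 2 1
```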
Item-based framing
You need a lot of data on items. The user needs to provide some input on the items, which can be active or passive. A cold start occurs if the user has provided no input
User-based framing
You need a lot of data on the users. More users need to provide input, but no data on items is needed. A cold start occurs when the user has provided no input or when other users have provided none
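A minimal sketch of the user-based idea: predict a user's rating for an item from the ratings of similar users, weighted by similarity. The ratings, names, and the choice of cosine similarity are assumptions for illustration. The function returns None in the cold-start case, where no similar user with a rating for the item can be found.

```python
import math

# Hypothetical ratings: user -> {item: rating}
ratings = {
    "ann": {"Alien": 5, "Heat": 4},
    "bo": {"Alien": 5, "Heat": 4, "Up": 2},
    "cy": {"Alien": 1, "Up": 5},
    "dee": {},  # new user with no input yet
}

def cosine(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    weighted, total = 0.0, 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        weighted += sim * other_ratings[item]
        total += sim
    return weighted / total if total > 0 else None  # None signals a cold start

print(predict(ratings, "ann", "Up"))  # closer to bo's rating than to cy's
print(predict(ratings, "dee", "Up"))  # None
```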
Item profiles
Consist of features used to classify items. For movies, these could be the director, composer, and actors. They can also include more subjective measures, e.g. script, reviews
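Item profiles can be compared by how many features they share. The sketch below uses Jaccard similarity, which is one possible choice rather than the method from the lecture; the films and feature labels are placeholders.

```python
def jaccard(a, b):
    """Share of features two item profiles have in common (0 to 1)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical item profiles built from placeholder features.
film_a = {"director:d1", "actor:a1", "actor:a2"}
film_b = {"director:d1", "actor:a2", "actor:a3"}
print(jaccard(film_a, film_b))  # 0.5
```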