Boegershausen et al. (2022) Flashcards
Web scraping
The process of developing software to automatically collect the information displayed in a web browser without involving data providers
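As a rough illustration of the definition above, the sketch below pulls product names and prices out of a page's HTML using only Python's standard library. The HTML snippet, the class names (`product`, `name`, `price`) and the helper `scrape_products` are hypothetical and do not come from the paper. A real scraper would first download the page and would need to respect the site's terms.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would first download this HTML.
SAMPLE_HTML = """
<html><body>
  <div class="product"><span class="name">Desk lamp</span><span class="price">19.99</span></div>
  <div class="product"><span class="name">Bookshelf</span><span class="price">54.50</span></div>
</body></html>
"""


class ProductParser(HTMLParser):
    """Collects the text of <span class="name"> and <span class="price"> elements."""

    def __init__(self):
        super().__init__()
        self.current_field = None  # field being read right now, if any
        self.products = []         # one dict per product found

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "price"):
            self.current_field = css_class
            if css_class == "name":
                # A name span marks the start of a new product record.
                self.products.append({})

    def handle_data(self, data):
        # Only keep text that sits inside a span we care about.
        if self.current_field and self.products:
            self.products[-1][self.current_field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self.current_field = None


def scrape_products(html):
    """Return a list of {'name': ..., 'price': ...} dicts found in the HTML."""
    parser = ProductParser()
    parser.feed(html)
    return parser.products
```

Calling `scrape_products(SAMPLE_HTML)` returns one dictionary per product, with the values kept as the strings displayed on the page.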
Application programming interfaces (APIs)
Enable two software components to communicate with each other using a set of definitions and protocols. Data providers offer them to give researchers access to internal databases and, for instance, a wide range of algorithms
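A minimal sketch of what "a set of definitions and protocols" can look like in practice: the client builds a request in the format the provider defines and reads back a structured response. The endpoint `https://api.example.com/v1/reviews`, its parameters and the response fields are all hypothetical, and no request is actually sent here.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; a real API documents its own URL and parameters.
BASE_URL = "https://api.example.com/v1/reviews"

# Hypothetical response body, in the JSON format many APIs return.
SAMPLE_BODY = '{"reviews": [{"id": 1, "rating": 5}, {"id": 2, "rating": 3}]}'


def build_request_url(product_id, page):
    """Encode the query parameters the (assumed) API expects into a URL."""
    query = urlencode({"product_id": product_id, "page": page})
    return f"{BASE_URL}?{query}"


def parse_response(body):
    """Turn the JSON response text into a list of (id, rating) tuples."""
    payload = json.loads(body)
    return [(review["id"], review["rating"]) for review in payload["reviews"]]
```

Unlike the scraping case, the data arrives already structured, so no HTML parsing is needed. The trade-off is that the provider decides which fields and how many records are exposed.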
Web data
Can be used to develop new theories about emerging marketing phenomena and to boost ecological value. It can be collected unobtrusively, even without the data provider’s direct involvement, so it can effectively complement more controlled data collection methods without relying on data suppliers or collaborating firms, which removes the need for a business partnership. It can allow researchers to measure constructs more precisely and obtain more valid inferences. It can create knowledge by allowing researchers to move closer to marketing’s “natural habitat”
Longitudinal data
Offers benefits such as the ability to update data. Raises concerns about technical feasibility and ethical concerns such as a greater likelihood of identifying individuals. It also places a heavier load on servers, which might be a legal concern
Data processing
Needed for all data. Occurs before data sets are cleaned or analysed
Raw data
Using raw data raises technical and ethical concerns: it might require databases that retain the data’s original structure and facilitate processing, and it might raise questions about the right to store the data
Novel web scraping services
Promise to handle technical difficulties efficiently
Paper’s aim
Investigate how researchers can ensure that the data sets generated via web scraping and APIs are valid. Their framework highlights how addressing validity concerns requires the joint consideration of idiosyncratic technical and legal/ethical questions along the three stages of collecting web data: selecting data sources, designing the data collection, and extracting the data