Social Data: Biases And Pitfalls Flashcards

1
Q

What is social data and what type of content does it emphasize?

A

Social data refers to digital traces produced by or about users with content that emphasizes explicit communication and interaction with others.

2
Q

What are some examples of social software platforms that generate social data?

A

Social software platforms are websites, apps, and services that let users connect, communicate, and share content with others. Examples include friend networks, search engines, photo sharing, rating and review websites, professional or fan websites, location networks, video sharing platforms, and crowdsourcing websites.

3
Q

What is internal validity in research studies and what does it address?

A

Internal validity is the extent to which an analysis leads correctly from the measurements taken to the conclusions drawn, including whether it accounts for the relationships between variables and for potential confounding factors.

4
Q

What is construct validity in research studies and what does it address?

A

Construct validity is the extent to which a measure or indicator actually measures what it is intended to measure, that is, whether it accurately captures the underlying construct of interest.

5
Q

What is ecological validity in research studies and what does it address?

A

Ecological validity is the extent to which an experimental setup or methodology properly reflects the real-world phenomenon being studied. It is closely tied to the question of whether the findings can be generalized to other settings and populations.

6
Q

What are the three categories of functional biases that can occur in social media data, and what do they refer to?

A

The three categories of functional biases that can occur in social media data are biases due to platform affordances and algorithms, biases due to community norms, and biases due to phenomena outside of social platforms.

7
Q

What are the three stages of data collection that can introduce biases into social media data?

A

The three stages of data collection that can introduce biases into social media data are acquisition (biases due to API limits), querying (biases due to query formulation), and filtering (biases due to removal of data deemed irrelevant).
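
To make the querying stage concrete, here is a minimal sketch of how query formulation alone can change who ends up in a sample. The posts, the author groups, and the helper names `run_query` and `share_of_group` are all invented for illustration; no real platform API is involved.

```python
# Toy corpus of (post text, author group). All values are invented.
POSTS = [
    ("Flooding downtown, roads closed #flood", "local"),
    ("Water everywhere on my street", "local"),
    ("Thoughts with everyone affected by the #flood", "remote"),
    ("The river burst its banks near the bridge", "local"),
    ("Donate to #flood relief here", "remote"),
]


def run_query(posts, keywords):
    """Return posts whose text contains at least one keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [p for p in posts if any(k in p[0].lower() for k in lowered)]


def share_of_group(posts, group):
    """Fraction of the given posts written by one author group."""
    if not posts:
        return 0.0
    return sum(1 for _, g in posts if g == group) / len(posts)


# A hashtag-only query versus a broader keyword query over the same corpus.
narrow = run_query(POSTS, ["#flood"])
broad = run_query(POSTS, ["flood", "water", "river"])

print(len(narrow), share_of_group(narrow, "local"))
print(len(broad), share_of_group(broad, "local"))
```

In this toy corpus the hashtag-only query under-represents local authors relative to the broader query, even though both queries target the same event.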

8
Q

What are the three stages of data processing that can introduce biases into social media data?

A

The three stages of data processing that can introduce biases into social media data are cleaning (biases due to default values), enrichment (biases from manual or automated annotations), and aggregation (e.g., grouping, organizing, or structuring data).
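
As a rough illustration of the cleaning stage, the sketch below assumes a hypothetical profile field where a missing age is stored as the default value 0. The data and the `DEFAULT_AGE` convention are assumptions made for this example, not taken from any specific platform.

```python
from statistics import mean

# Hypothetical convention: a missing age is stored as 0.
DEFAULT_AGE = 0

# Invented profile ages; three of the seven users never provided one.
ages = [23, 0, 31, 0, 27, 0, 35]

# Treating the default as a real measurement drags the average down.
naive_mean = mean(ages)

# Dropping the defaults gives the mean of the ages that were actually reported.
reported = [a for a in ages if a != DEFAULT_AGE]
cleaned_mean = mean(reported)

# Share of records that carried only the default value.
missing_rate = 1 - len(reported) / len(ages)

print(naive_mean, cleaned_mean, missing_rate)
```

Dropping defaults is not a neutral choice either: if the users who leave the field empty differ systematically from those who fill it in, the cleaned sample is biased in a different way.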

9
Q

What are some types of biases that can occur in data analysis of social media data?

A

Some types of biases that can occur in data analysis of social media data include confounding bias, peer effects, selection bias, ignorability, and obfuscated measurements.
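
Confounding bias can be sketched with a small worked example. The counts below are invented, and the scenario (hashtag use, reshares, account popularity) is only an assumed illustration: account type influences both whether a hashtag is used and how often a post is reshared.

```python
# (account_type, used_hashtag) -> (posts_reshared, posts_total). Invented counts.
COUNTS = {
    ("popular", True): (90, 100),
    ("popular", False): (800, 1000),
    ("regular", True): (300, 1000),
    ("regular", False): (20, 100),
}


def rate(used_hashtag, account_type=None):
    """Reshare rate for one hashtag condition, optionally within one account type."""
    reshared = total = 0
    for (acct, used), (r, n) in COUNTS.items():
        if used == used_hashtag and account_type in (None, acct):
            reshared += r
            total += n
    return reshared / total


# Pooled comparison ignores account type, the confounder.
pooled_diff = rate(True) - rate(False)

# Stratified comparison holds account type fixed.
stratified = {a: rate(True, a) - rate(False, a) for a in ("popular", "regular")}

print(pooled_diff)
print(stratified)
```

With these counts the pooled difference is negative while the difference within each account type is positive, so an analysis that ignores account type would draw the opposite conclusion from one that conditions on it.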

10
Q

What are some considerations for evaluating the quality and validity of social media data, and what are some limitations of these evaluations?

A

Considerations for evaluating the quality and validity of social media data fall into three groups: metrics (their reliability, and the risk of choosing them without domain insight), interpretation (contextual validity), and disclaimers (missing negative results and limited reproducibility). These evaluations may themselves be limited by a lack of generalizability, by interpretive biases, and by variations in data representation and performance.
