Social Computing Flashcards

1
Q

Confusion Matrices

A

a matrix of the probabilities that a user gives each answer for each true label; a more accurate substitute for the basic probability that a user gives the correct answer

2
Q

Consequences of information asymmetry

A

Leads to buyers making adverse selections as the market comes to consist of poor-quality items, as explained in the Lemons Problem (Akerlof's "The Market for Lemons")

3
Q

Moral Hazard

A

Hazards that arise when an individual is incentivised to take greater risks because they are shielded from the negative consequences of their actions.

For example, failing to uphold an agreement after payment because there are no consequences for doing so

4
Q

Purpose of Reputation Systems

A

Mitigate moral hazards and address the information asymmetry that leads to adverse selection in the market by quantifying the trustworthiness of individuals/entities using the wisdom of the crowd

5
Q

Calculate Reputation Value

A

Calculated through averages of user ratings, improved with:
- correction for user bias
- weighted rankings (based on user trustworthiness i.e. past ratings rated helpful by others)
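
A minimal sketch of such a calculation (the bias and weight tables are hypothetical inputs, assumed to be derived from users' rating histories):

    def reputation(ratings, user_bias, user_weight):
        # ratings:     list of (user_id, score) pairs for one entity
        # user_bias:   user_id -> mean offset of that user's past ratings
        #              from the overall mean (positive = overly generous)
        # user_weight: user_id -> trustworthiness weight in [0, 1]
        adjusted = [(user_weight.get(u, 0.5), s - user_bias.get(u, 0.0))
                    for u, s in ratings]
        total_w = sum(w for w, _ in adjusted)
        return sum(w * s for w, s in adjusted) / total_w

    # Rater "a" is known to rate 0.5 stars too high and is weighted more
    print(reputation([("a", 5), ("b", 3)],
                     user_bias={"a": 0.5},
                     user_weight={"a": 0.9, "b": 0.6}))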

6
Q

Confidence Values

A

Convey the certainty in the accuracy of a reputation value

7
Q

Challenges with Reputation Systems

A

Ballot Stuffing
Slander & Self Promotion
Whitewashing
Fear of Retaliation
Individual Bias
Quality Variations
Lack of Incentives

8
Q

Ballot Stuffing

A

Multiple ratings from a single user

To tackle, ensure there is some effort-based (or monetary) cost to providing a rating

9
Q

Slander & Self-promotion

A

Competitors place fake negative reviews to damage others' sales (slander) or fake positive reviews to boost their own (self-promotion)

To tackle, require identity authentication and proof of transaction. Allow ratings of the helpfulness of ratings. Use trust inference based on a user's history

10
Q

Whitewashing

A

Users change their identity to reset their reputation

To tackle, ensure there is an effort-based cost to changing identity e.g. linking phone number

11
Q

Fear of Retaliation

A

Users fear retaliation and give overly positive ratings

To tackle, simultaneously disclose buyer and seller ratings (e.g. Airbnb), or prevent sellers from rating buyers

12
Q

Individual Bias

A

Users rate overly positive or negative

To tackle, adjust score based on history

13
Q

Quality Variations

A

Sellers build a good reputation by selling low-value items, then exploit it by selling low-quality goods at high prices

To tackle, weight ratings based on price and recency

14
Q

Lack of Incentives

A

Users may not want to provide ratings

To tackle, provide non-monetary, or monetary, rewards

15
Q

Motivations to Participate in Crowdsourcing

A

Financial Rewards
Self-development
Enjoyment
Altruism (Selflessness)
Social Aspects (Recognition, Competition, Validation)

16
Q

Social Mobilisation

A

Address complex search problems through referral schemes, e.g. the recursive incentive scheme used to win the DARPA Network (Red Balloon) Challenge
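
A worked sketch of a recursive incentive scheme (assuming the MIT Red Balloon payout structure, where the reward halves at each step up the referral chain; the names are hypothetical):

    def recursive_payouts(reward, chain):
        # chain[0] found the target; chain[1] recruited chain[0]; etc.
        # Each person up the chain receives half the previous payout.
        payouts = {}
        for person in chain:
            payouts[person] = reward
            reward /= 2
        return payouts

    # Finder gets £2000, their recruiter £1000, and so on
    print(recursive_payouts(2000, ["finder", "recruiter", "grand-recruiter"]))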

17
Q

Output-agreement Mechanism

A

Gamification technique in which players are rewarded when their independently produced outputs match
e.g. the ESP Game for image labelling: two players see the same image and score when their typed labels agree

18
Q

Input-agreement Mechanism

A

Gamification technique in which players judge whether they received the same input
e.g. the TagATune game

Players provide descriptions and guess if they are listening to the same song as their peer

19
Q

Benefit of Gamification

A

Increased engagement

20
Q

Anchoring Effect

A

Individuals rely heavily on the first piece of information they receive (e.g. an initial pay offer) when making subsequent judgements or estimations

21
Q

Expert Crowdsourcing Attributes

A

Quality is highly important
Fewer workers are relied on

22
Q

Crowdsourcing Contests

A

Individuals submit solutions and the best submission is rewarded
- Workers behave strategically and pick tasks with less competition
- An early success experience is critical for workers to continue contributing
- Only a small number of workers succeed

23
Q

Multi-armed Bandit Problem

A

Describes the challenge of maximising rewards when balancing the exploration of new options and the exploitation of the best option known

24
Q

ε-greedy

A

ε-greedy explores with probability ε and exploits (pulls the best-known arm) otherwise.
ε-greedy(0.05) gives a 5% chance of EXPLORATION and a 95% chance of EXPLOITATION
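
A minimal sketch of the selection rule, assuming q_values holds each arm's estimated mean reward:

    import random

    def epsilon_greedy(q_values, epsilon=0.05):
        if random.random() < epsilon:                  # EXPLORATION
            return random.randrange(len(q_values))
        return max(range(len(q_values)),               # EXPLOITATION
                   key=q_values.__getitem__)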

25
Q

Average Quality per Cost

A

AKA Reward to Cost Ratio

26
Q

ε-first

A

allocates a percentage of the budget to exploration before repeatedly pulling the arm with the highest average quality per cost.
ε-first(0.1) gives 10% of the budget to exploration

Best performance

27
Q

Thompson sampling

A

keeps a prior for each arm and picks an arm by drawing one sample from each prior and choosing the arm with the best sample; it updates each prior using Bayes' theorem.
Naturally explores promising arms and exploits as uncertainty drops
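
A minimal sketch for Bernoulli (success/failure) rewards, assuming a Beta(1, 1) prior per arm:

    import random

    def thompson_step(successes, failures):
        # Sample once from each arm's Beta posterior; pull the best sample
        samples = [random.betavariate(s + 1, f + 1)
                   for s, f in zip(successes, failures)]
        return max(range(len(samples)), key=samples.__getitem__)

    # After pulling the chosen arm, increment its successes or failures
    # count; this is the Bayes' theorem update for the Beta-Bernoulli model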

28
Q

Plurality Voting

A

top voted candidate wins. It fails to consider voter preferences beyond first choice

29
Q

Borda count

A

candidates receive points based on their positions in each voter's ranking (e.g. with m candidates, m − 1 points for first place down to 0 for last); the candidate with the most points wins
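
A minimal sketch, assuming the common convention of m − 1 points for first place down to 0 for last:

    def borda(ballots):
        # Each ballot is a full ranking; position i earns m - 1 - i points
        m = len(ballots[0])
        scores = {}
        for ballot in ballots:
            for i, c in enumerate(ballot):
                scores[c] = scores.get(c, 0) + (m - 1 - i)
        return max(scores, key=scores.get)

    print(borda([["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]]))  # b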

30
Q

Condorcet Winner

A

a winner determined through pairwise plurality: the candidate who beats every other candidate in head-to-head majority contests

31
Q

Condorcet Paradox

A

there are scenarios in which collective pairwise preferences are cyclic, so the majority of voters will be dissatisfied i.e. no matter the outcome, a majority will prefer another option

32
Q

Condorcet Criterion

A

satisfied by a voting system which always chooses the Condorcet winner whenever one exists, and therefore reflects the collective preferences of the voters

Borda count & plurality voting do not satisfy
Black’s rule & Copeland method satisfy

33
Q

Black’s Rule

A

selects Condorcet winner or uses Borda count as a fallback

34
Q

Copeland Method

A

selects winner by subtracting pairwise losses from pairwise victories
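
A minimal sketch over full preference ballots (a tied pairwise contest leaves both scores unchanged):

    from itertools import combinations

    def copeland(ballots, candidates):
        # Pairwise victories add 1, pairwise losses subtract 1
        score = {c: 0 for c in candidates}
        for a, b in combinations(candidates, 2):
            a_wins = sum(ballot.index(a) < ballot.index(b)
                         for ballot in ballots)
            if 2 * a_wins > len(ballots):
                score[a] += 1; score[b] -= 1
            elif 2 * a_wins < len(ballots):
                score[b] += 1; score[a] -= 1
        return max(score, key=score.get)

    print(copeland([["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]],
                   "abc"))  # a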

35
Q

Spearman’s Footrule distance

A

the sum of the absolute displacements of each candidate between two rankings

36
Q

Kendall-Tau Distance

A

the sum of pairwise disagreements of candidates in two rankings
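
A minimal sketch computing both distances over two rankings of the same candidates:

    from itertools import combinations

    def footrule(r1, r2):
        # Spearman's Footrule: sum of each candidate's |position shift|
        pos1 = {c: i for i, c in enumerate(r1)}
        pos2 = {c: i for i, c in enumerate(r2)}
        return sum(abs(pos1[c] - pos2[c]) for c in r1)

    def kendall_tau(r1, r2):
        # Kendall-Tau: count candidate pairs ranked in opposite orders
        pos1 = {c: i for i, c in enumerate(r1)}
        pos2 = {c: i for i, c in enumerate(r2)}
        return sum((pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
                   for a, b in combinations(r1, 2))

    print(footrule("abc", "cba"), kendall_tau("abc", "cba"))  # 4 3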

37
Q

Finding Optimal rankings

A

compare every possible ranking (enumeration) to the voters' preferences and sum up the distances; the ranking with the minimum total distance is optimal

The Spearman's Footrule optimal ranking can be found in polynomial time

Finding the Kendall-Tau optimal (Kemeny) ranking is NP-hard

38
Q

Collective Intelligence

A

intelligence that emerges from collaboration, collective efforts, or competition of many individuals

39
Q

Social Computing

A

covers methods for building computational systems that harness collective intelligence

40
Q

Human Computation

A

outsources computational microtasks, which are difficult for machines to complete, to humans

41
Q

Requirements for a microtask

A
  • Hard for computers, easy for humans
  • Easily and quickly explainable to non-experts
  • Fast to complete
  • Amenable to automatic quality control and aggregation
  • Robust to some noise
42
Q

MTurk

A

Amazon Mechanical Turk (MTurk) is a crowdsourcing platform enabling the outsourcing of Human Intelligence Tasks (HITs) to remote workers known as Turkers

43
Q

Soylent

A

a word processor powered by human computation with the functions of
- Shortn: Shorten Text
- Crowdproof: Proofreading & Grammatical Corrections
- The Human Macro: Issue tasks i.e. find an image to accompany text

44
Q

Find-Fix-Verify pattern

A
  • Find: Workers highlighted sections that needed attention e.g. mistakes, text that can be shortened
  • Fix: Other workers proposed improvements for only one of the most commonly highlighted sections
  • Verify: Other workers voted on the best improvements
45
Q

Benefits of Find-Fix-Verify

A

Ideal for complex tasks because:

  • Separation of find and fix avoids lazy users who focus on the easiest fixes
  • Highly parallelisable as a document can be split into chunks and distributed among workers
  • Verification ensures quality
  • Each stage has independent decisions
  • Small tasks decrease the likelihood of errors
46
Q

TurKit

A

TurKit is a framework for facilitating development of applications using Mechanical Turk. Its Crash-and-Rerun feature ensures progress is cached when a crash occurs and work resumes from the last successful checkpoint

47
Q

Human Computation Quality Control Methods

A

Majority Voting
Weighted Voting: Assigns a higher weight to users who tend to agree more with others
Weighted Voting with Prior

48
Q

How to get prior for Weighted Voting with Prior

A

use ground truths to identify each user's probability of choosing the correct answer. Helps to deal with uncertainty about the accuracy of users' answers. Prior knowledge comes from
- Previous Interactions
- Reputation System
- Gold Standards (True Labels)
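
A minimal sketch of one such scheme (assuming each worker's vote is weighted by the log-odds of their prior probability of being correct; the accuracy table is hypothetical and would come from gold standards or a reputation system):

    import math
    from collections import defaultdict

    def weighted_vote(answers, accuracy):
        # Weight each worker's answer by log(p / (1 - p)), the log-odds
        # of their prior probability p of answering correctly
        scores = defaultdict(float)
        for worker, answer in answers:
            p = accuracy.get(worker, 0.5)   # 0.5 = uninformative prior
            scores[answer] += math.log(p / (1 - p))
        return max(scores, key=scores.get)

    print(weighted_vote([("w1", "cat"), ("w2", "dog"), ("w3", "dog")],
                        accuracy={"w1": 0.95, "w2": 0.6, "w3": 0.6}))  # cat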

49
Q

Trust

A

The subjective probability with which an individual expects that another individual will perform a given action on which its welfare depends

50
Q

Reputation

A

Considers the collective opinion of multiple individuals about a single individual; an aggregation of personal experience

51
Q

Experimental Design

A

The orchestration of experiments, analysed using a proper scientific approach, to tune a system towards an optimal design

52
Q

A/B Split Testing

A

Compares two versions of content to determine which is better

53
Q

A/B Split Testing Advantages

A
  • Easy to test, implement and analyse
  • Flexible factor changes
54
Q

A/B Split Testing Disadvantages

A
  • Impractical to test all possible combinations, so interdependence between factors is not captured and the optimal solution is unlikely to be found
  • Not possible to determine which factors contribute most, as too many factors change at once
  • Inefficient
  • Will not find the optimum
55
Q

Multivariate Testing

A

Simultaneously tests multiple factors:
- One Factor At a Time
- Full Factorial Design
- Fractional Factorial Design

56
Q

One Factor At a Time (OFAT)

A

Changes only one factor at a time from the base treatment and analyses the outcome

57
Q

One Factor At a Time Advantages

A

Frequently Used

58
Q

One Factor At a Time Disadvantages

A
  • Significant number of trials required per treatment to obtain statistical significance
  • Unbalanced, values have different numbers of occurrences (bias towards base treatment values)
  • Limited number of treatments when changing one factor
59
Q

Full Factorial Design

A

Considers all possible combinations of factor values

Num Treatments = Num Options per Factor ^ Num Factors
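
A quick check of the formula with hypothetical factors (3 factors with 2 options each gives 2^3 = 8 treatments):

    from itertools import product

    factors = {"layout":  ["A", "B"],
               "colour":  ["red", "blue"],
               "heading": ["serif", "sans"]}

    treatments = list(product(*factors.values()))
    print(len(treatments))  # 2 ** 3 = 8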

60
Q

Interaction Effects

A

The impact of a factor is dependent on a second factor e.g.
- The best page content depends on the best page layout

61
Q

Main Effects

A

The effects of individual factors, calculated by averaging the outcome over all variations of the other factors

62
Q

Full Factorial Design Advantages

A
  • Number of trials to achieve statistical significance is less than OFAT
  • Balanced, each value occurs the same number of times
  • Can capture interaction effects
63
Q

Full Factorial Design Disadvantages

A
  • Requires a significant number of recipes
  • Number of treatments increases exponentially with factors
  • Number of trials increases as treatments increase
  • Impractical to test all combinations
64
Q

Fractional Factorial Design

A

Considers a subset of all possible combinations of factor values to reduce the number of treatments required

65
Q

Latin Squares

A

Control two sources of variation simultaneously in a practical way.
- They do not eliminate all confounding of variables but can eliminate some bias from interaction effects by ensuring each value is tested equally often with different combinations of the other factors

66
Q

Latin Squares Advantages

A
  • Requires fewer recipes (m^2) than Full Factorial Design (m^3) with the same benefits
  • Interaction effects can cancel out
67
Q

Latin Squares Disadvantages

A
  • Can only measure main effects
  • Requires the same number of trials as Full Factorial Design
  • Does not necessarily produce the best recipe
  • Limited to 3 factors
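
A minimal sketch of a cyclic construction for m levels (e.g. a 3×3 square that assigns each level of the third factor exactly once to every row and column):

    def latin_square(levels):
        # Cell (i, j) gets level (i + j) mod m, so each level appears
        # exactly once in every row (factor A) and column (factor B)
        m = len(levels)
        return [[levels[(i + j) % m] for j in range(m)] for i in range(m)]

    for row in latin_square(["C1", "C2", "C3"]):
        print(row)
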
68
Q

Online Auction

A

Dynamic pricing based on competition acting as a price discovery mechanism

69
Q

English Auction

A

Auctioneer progressively increases the bid price until no further bids are made

70
Q

Dutch Auction

A

Price starts high and gradually lowers until a bidder accepts the price at which point the item is sold

71
Q

Sealed-bid First Price Auction

A

Each bidder submits a single bid in a sealed envelope and the item is allocated to the highest bidder at the price they bid

72
Q

Sealed-bid Second Price Auction (Vickrey Auction)

A

Each bidder submits a single bid in a sealed envelope and the item is allocated to the highest bidder at the price of the second-highest bid

73
Q

Private Value

A

A bidder’s willingness to pay for an item, independent of what others believe the item is worth

74
Q

Utility

A

The satisfaction for bidder $i$ winning the auction is
$u_i = v_i - p_i$

$v_i$ is the private value
$p_i$ is the payment

If $v_i = p_i$ then the bidder is indifferent to winning or losing, as the utility gained from winning is 0 and losing also gives 0 utility

75
Q

Dominant Strategy

A

Provides the best payoff regardless of the strategies of other players

76
Q

Best Strategies

A

Maximise payoff while considering strategies of other players

77
Q

English Auction Dominant Strategy

A

Bid the smallest amount above the current price (the bid increment) and stop when your private value is reached

78
Q

First-Price Sealed-Bid Auction Best Strategy

A

SHADE BID (bid below your private value)

Bidders speculate about the bids of others and the probability of each being the highest bid, then choose the bid that maximises expected utility. Example with a private value of £300:

There is a 40% chance the highest rival bid is £150
- If you bid £151, u = 0.40(£300 - £151) = £59.60
There is a 40% chance the highest rival bid is £210
- If you bid £211, u = 0.80(£300 - £211) = £71.20
There is a 20% chance the highest rival bid is £240
- If you bid £241, u = 1.00(£300 - £241) = £59

The best strategy to maximise expected utility is to bid £211
There is no dominant strategy
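
A minimal sketch reproducing the numbers above (the rival-bid distribution is the hypothetical one from the example):

    def expected_utility(bid, value, rival_dist):
        # P(win) = probability that every rival bid is below ours
        p_win = sum(p for b, p in rival_dist.items() if b < bid)
        return p_win * (value - bid)

    rivals = {150: 0.40, 210: 0.40, 240: 0.20}
    for bid in (151, 211, 241):
        print(bid, expected_utility(bid, 300, rivals))
    # 211 maximises expected utility (£71.20)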

79
Q

Vickrey Auction Dominant Strategy

A

Bid your private value $v_i$

  • If the highest bid (excluding yours) is $>v_i$ then
    • By bidding less than $v_i$, you always lose and gain $0$ utility (the same as bidding $v_i$)
    • By bidding more than $v_i$, you might win but gain negative utility (overpaying)
  • If the highest bid (excluding yours) is $<v_i$ then
    • By bidding less than $v_i$, you might lose and forgo positive utility
    • By bidding more than $v_i$, you always win but gain the same utility as bidding $v_i$
  • So there is no scenario in which bidding more or less than $v_i$ yields higher utility than bidding $v_i$
80
Q

Strategic Equivalence of Auctions

A

English Auctions and Vickrey Auctions are Strategically Equivalent
Dutch Auctions and First-Price Sealed-Bid Auctions are Strategically Equivalent

81
Q

Vickrey vs English

A

Vickrey is better
It avoids wasteful effort of counter speculation but still ensures the bidder with the highest private value receives the item (efficiency)

82
Q

Revenue Equivalence Theorem

A

All four auction protocols produce the same revenue provided that
- Bidders are rational
- Bidders are risk neutral
- Private value assumption holds
- Bidders are symmetric, with the same beliefs about the probability of bids made by others

83
Q

Winner’s Curse

A

the winning bid exceeds the intrinsic value of the item

84
Q

Bidder Collusion

A

2 or more bidders form bidding rings to manipulate the auction

85
Q

Corruption

A

an auctioneer misuses their position
- They may lie about submitted bids
- Revealing all bids after an auction provides transparency

86
Q

Sniping

A

Last-minute bids. Occur when the closing time is fixed
- Prevents being outbid by others
- Solved with flexible deadlines and sealed-bid auctions

87
Q

Risk of Low Profit

A

Solved with a reserve price

88
Q

Organic Search

A

Unbiased search where results are based on a ranking algorithm

89
Q

Sponsored Search

A

Paid and biased search where ranking is based on auction mechanism

90
Q

Pay-per-Impression

A

Advertiser pays per thousand impressions (an impression is each time the ad is displayed)

91
Q

Pay-per-Click

A

Advertiser pays when ad is clicked

92
Q

Pay-per-Transaction

A

Advertiser pays when an actual purchase is made. Attribution challenges

93
Q

Slot Allocation

A

sorting advertisements in descending order according to bid × quality score

94
Q

GSP

A

Generalised Second-Price Auction

  • Pay the smallest amount you could have bid and still retain the same position
    A1: 10 × 0.6 = 6
    A2: 7 × 0.9 = 6.3
    A3: 5 × 0.4 = 2

Order: A2, A1, A3

A2 Pays: A1 Bid × A1 Quality / A2 Quality = 10 × 0.6 / 0.9 = £6.67
A1 Pays: A3 Bid × A3 Quality / A1 Quality = 5 × 0.4 / 0.6 = £3.33
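
A minimal sketch of the allocation and pricing rule, assuming the lowest slot pays a zero reserve price:

    def gsp(bids, quality):
        # Rank by bid * quality; each ad pays the minimum bid that keeps
        # its slot: the next ad's (bid * quality) / its own quality
        order = sorted(bids, key=lambda a: bids[a] * quality[a],
                       reverse=True)
        prices = {ad: bids[nxt] * quality[nxt] / quality[ad]
                  for ad, nxt in zip(order, order[1:])}
        prices[order[-1]] = 0.0   # lowest slot: reserve price (0 here)
        return order, prices

    print(gsp({"A1": 10, "A2": 7, "A3": 5},
              {"A1": 0.6, "A2": 0.9, "A3": 0.4}))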

95
Q

GSP Review

A

Payment is never more than the bid.
Bidders might prefer lower slots to increase their utility.
The auction does not fully capture preferences: a bid is for one slot, not each slot

96
Q

True Positive

A

A recommended item is relevant to a user

97
Q

False Positive

A

A recommended item is not relevant to a user

98
Q

True Negative

A

An irrelevant item is not recommended to a user

99
Q

False Negative

A

A relevant item is not recommended to a user

100
Q

Precision

A

A measure of exactness

Precision = TP / (TP + FP)

Good Recommendations / All Recommendations

101
Q

Recall

A

A measure of completeness
How many of the relevant items were recalled (recommended)?

Recall = TP / (TP + FN)

Good Recommendations / All Good Items (Even those not recommended)

102
Q

Accuracy

A

The accuracy of all classifications i.e. true positives and true negatives

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Correct Classifications / All Classifications
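
A minimal sketch with hypothetical confusion-matrix counts:

    def metrics(tp, fp, tn, fn):
        precision = tp / (tp + fp)          # exactness
        recall = tp / (tp + fn)             # completeness
        accuracy = (tp + tn) / (tp + tn + fp + fn)
        return precision, recall, accuracy

    print(metrics(tp=8, fp=2, tn=85, fn=5))  # (0.8, 0.615..., 0.93)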

103
Q

Crowdsourcing Study Takeaways

A
  • Financial rewards decrease intrinsic motivation
  • Positive verbal feedback increases intrinsic motivation
  • Performance drops when the financial incentive is lowest
  • Financial rewards can increase the speed of work but not its quality, as workers always aim for the minimum acceptable quality
  • The anchoring effect can be harnessed
  • The payment method affects these factors
104
Q

Content-based Recommender Systems Advantages

A
  • No community required
105
Q

Content-based Recommender Systems Disadvantages

A
  • Content-descriptions required
  • Cold start issue for new users
  • No surprising suggestions
106
Q

Knowledge-based Recommender Systems Advantages

A
  • Deterministic recommendations
  • Guaranteed quality
  • No cold-start issue
  • Can resemble sales dialogue
107
Q

Knowledge-based Recommender Systems Disadvantages

A
  • Knowledge engineering effort required
  • Fundamentally static
  • Cannot react to short-term trends
108
Q

Collaborative Filtering Recommender Systems Advantages

A
  • No knowledge engineering effort required
  • Can produce surprising recommendations
109
Q

Collaborative Filtering Recommender Systems Disadvantages

A
  • Requires feedback through ratings
  • Cold start issue for new users and items
  • Sparsity problems
  • No integration of other knowledge sources
110
Q

Crowdsourcing

A

obtains services, ideas, or content from a large group of people through an open call rather than from the traditional employee or supplier relationship

111
Q

Content-based Recommender System Steps

A
  • Identify Features (Title, Genre, Keywords)
  • Clean Features (Remove Stop-words, Stemming, Phrase Extraction, TF-IDF)
  • Find Similarities
  • Produce Recommendations
112
Q

Dice’s Similarity Coefficient

A

Compares number of shared elements to total number of elements in both sets

D(X, Y) = 2 × |X ∩ Y| / (|X| + |Y|)

Fails to consider element frequency

113
Q

Jaccard’s Similarity Coefficient

A

J(X, Y) = |X ∩ Y| / |X ∪ Y|

Fails to consider element frequency

114
Q

Cosine Similarity Coefficient

A

Measures the cosine of the angle between two vectors: cos θ = (X · Y) / (|X| |Y|). Unlike Dice and Jaccard, it can take element frequency into account when applied to term-frequency vectors
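
A minimal sketch of all three coefficients (sets for Dice and Jaccard, term-frequency vectors for cosine):

    import math

    def dice(x, y):
        # Shared elements vs. total set sizes; ignores frequency
        return 2 * len(x & y) / (len(x) + len(y))

    def jaccard(x, y):
        # Shared elements vs. union size; ignores frequency
        return len(x & y) / len(x | y)

    def cosine(u, v):
        # Cosine of the angle between two vectors; uses frequency
        dot = sum(a * b for a, b in zip(u, v))
        norm = lambda w: math.sqrt(sum(a * a for a in w))
        return dot / (norm(u) * norm(v))

    print(dice({"a", "b"}, {"b", "c"}))      # 0.5
    print(jaccard({"a", "b"}, {"b", "c"}))   # 0.333...
    print(cosine([1, 2, 0], [2, 1, 1]))      # 0.730...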