Final Vocab List Flashcards

1
Q

Adjacency list

A

A list of neighbouring vertices

A collection of unordered lists used to represent a finite graph.

Each unordered list within an adjacency list describes the set of neighbors of a particular vertex in the graph.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Adjacency matrix

A

A matrix where the i,j entry is true if there exists an edge between node i and node j

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Affiliation network

A

A bipartite graph that shows which individuals are affiliated with which groups or activities.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Application layer

A

Top layer of the Internet

  • Takes layers below for granted
  • zoom, chrome, etc.
  • email, file-transfer, DNS
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Best response

A

(In a game) the most optimal response by a player to a strategy played by another. Can be determined by inspecting the payoff matrix.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Betweenness

A

The betweenness of an edge is the total amount of flow it carries, counting flow between all pairs of nodes using this edge. One unit of flow is added for each shortest path the edge is part of.

If multiple (k) shortest paths between two nodes, increment by 1/k

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Bipartite graph

A

A graph whose vertices can be partitioned into groups X and Y where no 2 vertices in the same group share an edge

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Boolean model

A

Model of information retrieval based on boolean logic that identifies documents and user queries as sets to find intersections.

  • Preprocesses data (once)
  • Query -> fast
  • uses a term document incidence matrix
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Bridge

A

An edge is a bridge if its removal would increase the number of connected components

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Clique

A

A group of vertices such that all vertices have edges to all others

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Clustering Coefficient

A

The clustering coefficient of a node A is defined as the probability that two randomly selected friends of A are friends with each other.

Computed as:
(actual number of edges between neighbors of A) / (total number of possible edges between neighbors of A)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Component (of a graph)

A

A subgraph of a graph that is maximally connected

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Connected graph

A

A graph G is connected iff for all x, y in G, there exists a path between x and y

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Cosine similarity

A

Measure of similarity between two vectors by computing the cosine of the angle between them.

Used to measure document similarity by represneting docs as vectors.

cos(\theta) = dot product of D1 and D2 divided by magnitudes mulitplied

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Crawler (search engine)

A

Internet bot that browses the web to index pages.

  • Uses BFS on a list of start URLs,
  • Fetches documents (drop if duplicate, don’t overburden any server)
  • Queues outgoing links
  • When done, sends document to indexer/analyzer
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Cycle

A

A path that begins at a vertex u and ends at u

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Degree (of a vertex)

A

The number of vertices adjacent to u is the degree of u

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Directed graph

A

A graph with directed edges

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Document frequency

A

df_t = number of documents in corpus that contain term t

min = 1 … max = N (size of corpus)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Domain name system (DNS)

A

The way that internet domain names are located and translated into internet protocol (IP) addresses.

Part of the Application layer.

The DNS is reverse hierarchical so it starts from right to left and does this determination:

  • so it would start at .edu and see what domain server is responsible for that domain
  • then from there see what server handles the .upenn domain
  • and so on.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Dominant strategy

A

(Weakly) Strategy that produces equal or higher payoff than any other response given a certain strategy of opponent

(Strongly) Strictly higher payoff

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Dominated strategy

A

A strategy is (strongly) dominated if there exists a strategy available to the same player that yields a strictly higher payoff in response to every other strategy played by other players

(weakly) can be equal

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Edge (of a graph)

A

An ordered pair of vertices (u,v) representing a relationship between u and v

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Focal closure

A

The tendency of two people to form a link when they have a focus in common (forming links to others who share characteristics)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

Frontiers (in BFS)

A

The set of vertices that are of the same distance from the starting vertex

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

Game theory

A

Study of strategic interaction amongst a group of rational agents.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

Graph

A

A set of vertices and a set of edges

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

Homophily

A

The tendency for people to seek out or be attracted to those who are similar to themselves.

When the actual number of cross edges is less than the expected number of cross edges

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

Hypertext transfer protocol (HTTP)

A

Protocol at the application layer, used for WWW, defines how messages are formatted and transmitted.

30
Q

IEDS

A

Iterative elimination of dominated strategies.

Used to solve games by repeatedly simultaneously eliminating dominated strategies.

31
Q

In degree (of a vertex)

A

The number of edges directed toward a vertex u is the in-degree of u

32
Q

Indexer/Analyzer (search engine)

A

Indexes, parses and stores data to facilitate fast and accurate information retrieval.

  • Assigns unique IDs
  • Records outgoing and incoming links
33
Q

Information retrieval

A

Finding material (usually documents) of an unstructured nature (usually text) from a large collection that satisfies an information need.

The corpus of documents is:

  • Curated
  • Homogenous Quality
  • No adversarial docs
  • Sophisticated queries

Searches can be metadata or full-text (or other) indexing

34
Q

Internet layer

A

(2nd from bottom layer)

  1. Host addressing & identification
    • need to differentiate computers w/ unique addresses
  2. Best effort (unreliable) packet delivery

Internet Protocol (IP)

35
Q

Internet protocol (IP)

A

The method or protocol by which data is sent from one computer to another on the Internet (IPv4 - (32 bit) and IPv6 (128 bit)).

Responsibilities are host addressing and best-effort delivery

36
Q

Inverse document frequency

A

Quantifies whether the term is common or rare across N documents.

idf_t = log_{10} (N/df_t)

min = 1 … max = log_{10} (N)

37
Q

Inverted index

A

Index data structure storing a mapping from terms to documents the term is seen in.

(Adjacency list)

e.g. Harry = {1,2,3,4,5,6,7}, Luna = {4,5,6,7}

38
Q

Link layer

A

Physical (bottom) layer of the internet. Deals with physical media: bits

Protocol Suite, the networking architecture of the Internet.

Handles error control via Checksums

- makes sure there is no corruption
- Checks bits w/ checksums
- ensures received === sent
39
Q

Local bridge

A

An edge is a local bridge if its endpoints do not share a common neighbour.

Deleting the edge would increase the distance to a value > 2.

Local bridges are weak ties

40
Q

Membership closure

A

Tendency that a person becomes involved with focus as function of number of friends who are already involved in it

41
Q

Mixed strategy

A

A strategy that randomises over a set of strategies with varying probabilities.

42
Q

Nash equilibrium

A

A strategy (S,T) is a Nash equilibrium if S is a best response to T and T is a best response to S.

  • No individual can change their mind and do better
  • Not unique (can have multiple)
  • Must exist
43
Q

Neighbourhood overlap

A

The neighbourhood overlap of two specific nodes:

The ratio of:
(number of nodes who are neighbours of both A and B) / (number of nodes who are neighbours of at least one of A or B)

44
Q

Out degree (of a vertex)

A

The number of edges directed away from a vertex u is the out-degree of u

45
Q

Scaled Pagerank

A

look at notes

46
Q

Pagerank limitations

A

Easy

  • Sources/sinks (Scaled Pagerank Random surfer)
  • links in different locations of page may have different relevance
  • dynamic pages (site based instead of page based)

Hard

  • A endorse B?
  • New pages
  • Search neutrality
  • Query Independence?
47
Q

Pagerank

A

Mine graph structure independent of user query

Works by getting credible sources to point to other credible sources

Graph of n nodes:

  1. Assign each node a PR of 1/n
  2. For each node
    • Take current PR and divide and send over all the outgoing edges (if none, then keep current PR)
    • Sum up incoming to calculate new PR value
  3. Repeat step 2 until PR values converge
48
Q

Path

A

A sequence of edges between two vertices u and v

49
Q

Payoff

A

Each player receives a payoff that can depend on the strategies selected by everyone.

50
Q

Social network

A

A network of social interactions and personal relationships.

51
Q

Span (of a bridge)

A

The shortest path length between endpoints if bridge wasn’t there

52
Q

Strict best response

A

A strategy S of Player 1 is a strict best response to a strategy T for Player 2 if S produces a strictly higher payoff than any other strategy paired with T.

P1(S, T) > P1(S’, T)

53
Q

Strongly dominant strategy

A

A strategy that always produces strictly higher payoff in response to any other strategy.

54
Q

Strong tie

A

Edges representing strong relationships between adjacent vertices

55
Q

Strong triadic closure

A

Strong Triadic Closure Property is that:

If a node has strong ties to two neighbours, then these neighbours must have at least a weak tie between them.

56
Q

Term (IR)

A

String/word in context of a document

57
Q

Term frequency

A

tf_{t, d} = number of occurrences of term t in document d

58
Q

Term-document incidence matrix

A

Matrix that describes the frequency of terms that occur in a collection of given documents

Rows: Terms
Columns: Documents

59
Q

Topological sorting

A

Topological sorting for a Directed Acyclic Graph (DAG) is a linear ordering of vertices such that for every directed edge uv, vertex u comes before v in the ordering.

Toposort can be found by performing DFS and returning the vertices in reverse order of finish times.

60
Q

Transmission control protocol (TCP)

A

Part of the Transport Layer

  • Reliable Packet delivery: “hears” if arrived
  • Slow
  • Connection-oriented (makes sure that connection has been established between hosts)
  • Uses: Texting, banking, downloads
61
Q

Transport layer

A

3rd Layer of Internet: end-to-end message transfer

  1. Segmentation
    • Can request lost pieces and re-order if out of order
  2. Application Addressing
    • Port number (0 to 65535) so that you can determine which application should receive the data
  3. Congestion/flow control
    • Choose which path
  4. Error Control (only TCP)
    • Redundancy, copying packets & resending until it goes through
62
Q

Triadic closure

A

If two people in a social network have a friend in common, there is an increased likelihood that they will become friends with each other

63
Q

Undirected graph

A

A graph without directed edges

64
Q

User datagram protocol (UDP)

A

Part of the Transport Layer

  • unreliable
  • fast
  • connection-less
  • uses: live-streaming, multiplayer games
65
Q

Vector space model

A

Encodes documents as vectors

Each document is a vector of weights

D1 = (W_{11}, W_{21}, …, W_{n1})

66
Q

Vertex (of a graph)

A

A node in a graph (commonly a member of a network)

67
Q

Weak tie

A

An edge representing a weak relationship between two nodes

68
Q

Weight (of a term in a document)

A

W_{t d} = Weight of term t in document d

W_{t d} = tf_{t,d} \cdot idf_t

69
Q

Weight (of an edge)

A

Numerical value assigned to an edge

70
Q

BFS

A

Breadth First Search (BFS)

> Discover nodes in the order of distance from a source node

  1. Pick a source node + add it to discovered
  2. Find all of its neighbours that are not discovered
  3. Add these neighbours to a queue + discovered
  4. Pick first from queue & repeat steps 2-4 as long as queue is not empty
  5. If the queue is empty, pick another random node from the graph that has not been discovered & restart from step 1 with this node as the source

Dijkstra’s shortest path

71
Q

DFS

A

Depth-First Search (DFS)

  1. Pick a source node + add it to discovered
  2. Find all of its neighbours that are not discovered
  3. Add these neighbours to a stack
  4. Remove from the top of stack, if discovered → ignore node and repeat 4 → else, add to discovered & repeat 2-4, as long as stack is not empty
  5. If the stack is empty, pick another random node from the graph that has not been discovered & restart from step 1 with this node as the source

Topological Sort