Web Architecture Flashcards

1
Q

How can resources be represented?

A

text (plain, html, csv), image, audio, video, application

2
Q

How can resources be identified?

A

With URIs (Uniform Resource Identifiers)

3
Q

How can resources be interacted with?

A

Using network protocols like HTTP

4
Q

What does the bow-tie model represent?

A

The shape of the web
Sections such as LSCC core, IN, OUT, and disconnected components

5
Q

What is the web?

A

A distributed information system that provides access to resources

6
Q

What is hypertext?

A

A way to link information in a non-linear interactive way - cannot be represented conveniently on paper

7
Q

What are 2 disadvantages of hypertext?

A

Disorienting - easy to lose sense of direction
Cognitive overhead - additional effort to maintain several trails at once

8
Q

What is hypermedia?

A

An extension of hypertext that links non-textual media (images, audio, video) as well as text

9
Q

What are nodes, links, anchors, and endpoints?

A

Node - a point in the network e.g. a webpage
Link - a connection between nodes
Anchor - the clickable element that links pages
Endpoint - the destination of a link

<a href="http://www.endpoint.com/">anchor</a>

10
Q

What are embedded links?

A

Links that are encapsulated in a node, and form part of the document content

11
Q

What are first class links?

A

Where links are separated from nodes allowing multiple link overlays/linkbases (links over same node), creating different connections without changing the node
Link bases can be tailored to reader
Has 2 endpoints

12
Q

What are bidirectional links?

A

Links that can be traversed backwards as well as forwards

13
Q

What are N-ary links?

A

Links involving more than 2 nodes, allowing relationships between multiple entities

14
Q

What are generic links?

A

Links where, by using locspecs, all occurrences of a word can be linked to the same endpoint

15
Q

What are functional links?

A

Links that represent predefined relationships

16
Q

What are typed links?

A

Links that define the nature or relationship of the link, such as “friend,” “parent,” or “employee”

17
Q

What is REST?

A

Representational State Transfer
A web architecture style that uses stateless communication to manipulate representations of resources

18
Q

What does REST aim to do (2)?

A

Minimise latency and network communication
Maximise independence and scalability of components

19
Q

What are the 4 components of REST?

A

Origin servers - the ultimate place you get a resource from
Gateways - for integrating legacy servers
Proxies - to filter & cache
User agents

User agent & origin server are end points that communicate using HTTP
If using gateway, origin server & gateway don’t communicate in HTTP

20
Q

What are the 5 constraints of REST?

A

Client-server - separation of concerns (client: user interface, server: data storage)
Stateless - no context stored on server, session state kept on client
Caching - response data labelled as cacheable or non-cacheable
Uniform interface between components - identify what next possible actions could be
Layered - system components have no knowledge of components they don’t directly interact with

21
Q

What are the 3 advantages of client-server constraint?

A
  • Improves portability
  • Improves scalability (as server simplified)
  • Allows components to evolve separately
22
Q

What are the 3 advantages and 1 disadvantage of stateless constraint?

A

Advantages:
* Improves reliability
* Improves scalability
* Improves visibility (of requests)

Disadvantage:
* Increases per-action overhead

23
Q

What are the 2 advantages and 1 disadvantage of caching constraint?

A

Advantages:
* Eliminates some actions
* Reduces latency

Disadvantage:
* Reduces reliability

24
Q

What are the 2 advantages and 1 disadvantage of uniform interface constraint?

A

Advantages:
* Improves visibility (of interactions)
* Implementations decoupled from services they provide

Disadvantage:
* Reduces efficiency

25
Q

What are the 2 advantages and 1 disadvantage of layered constraint?

A

Advantages:
* Limits system complexity
* Improves scalability

Disadvantage:
* Adds latency & overhead

26
Q

What are the 3 principles of address identifiers?

A

Global - addresses should be unambiguous & human readable
Distinct identifiers - using same URI for different resources creates a URI collision
Avoid aliases - don’t use different URIs for same resource

27
Q

How should documents be named?

A

Use logical names rather than physical addresses to avoid issues when documents are moved

28
Q

URL vs URN

A

URL specifies location of resource on internet
URN uniquely identifies resource by name, independent of location - arguably unnecessary, as HTTP URIs can also serve as names

29
Q

What are IRIs (Internationalised Resource Identifiers)?

A

An extension to URIs, allowing Unicode characters
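
As a rough illustration, an IRI path can be mapped to a plain URI path by encoding its non-ASCII characters as UTF-8 and percent-escaping the bytes. The sketch below uses Python's standard `urllib.parse.quote`; the function name and the example path are made up for illustration, and host names (which use a different mechanism) are not handled.

```python
from urllib.parse import quote

def iri_path_to_uri_path(path: str) -> str:
    """Percent-encode non-ASCII characters (as UTF-8 bytes) in an IRI path."""
    # "/" is kept as-is so the path structure survives
    return quote(path, safe="/")

# Hypothetical path containing Unicode characters
ascii_path = iri_path_to_uri_path("/café/menü")
print(ascii_path)  # /caf%C3%A9/men%C3%BC
```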

30
Q

Why shouldn’t you change URIs?

A

Breaks pages linked to old URI -> 404

31
Q

What are the 5 principles of representation?

A
  • W3C representation principles - follow a format to future proof
  • Separate content, presentation & interaction
  • Identify links to other resources
  • Links should be navigable
  • Links should be web-wide
32
Q

Data vs metadata

A

Data is the actual information/content
Metadata is data that describes other data e.g. file size, creation date

33
Q

What are the 5 principles of interaction?

A
  • Provide representations
  • Safe retrieval
  • Reference doesn’t imply dereference - just because you can retrieve a representation doesn’t mean you must
  • Reuse representation formats
  • Representations should be consistent
34
Q

What is a safe method?

A

One that doesn’t change the state of the resource

35
Q

What is an idempotent method?

A

One that has the same effect on the resource whether it is applied once or multiple times
Of the common methods, POST isn’t idempotent (GET, HEAD, PUT and DELETE are)
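
Idempotency can be sketched with a toy in-memory store. The `Collection` class and its `put`/`post` methods are hypothetical stand-ins for a server handling PUT and POST, not a real HTTP implementation.

```python
class Collection:
    """Toy in-memory resource store (hypothetical, for illustration only)."""

    def __init__(self):
        self.items = {}
        self.next_id = 1

    def put(self, item_id, value):
        # PUT-like: set the item at a known identifier; repeating it changes nothing further
        self.items[item_id] = value

    def post(self, value):
        # POST-like: ask the store to create a new item; each call adds another one
        item_id = self.next_id
        self.next_id += 1
        self.items[item_id] = value
        return item_id

c = Collection()
c.put(10, "draft")
c.put(10, "draft")          # second identical PUT leaves the state unchanged
after_put = dict(c.items)

c.post("note")
c.post("note")              # second identical POST creates a second item
after_post = dict(c.items)
```

After the two PUT calls the store holds one item; after the two POST calls it holds two additional items, which is why retrying a POST blindly is risky.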

36
Q

What do the HTTP responses (1xx - 5xx) represent?

A

1xx - informational message
2xx - success
3xx - redirection
4xx - client error
5xx - server error
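
A minimal sketch of the mapping above: the class of a response is given by the first digit of its status code. The function name `status_class` is an assumption made for this example.

```python
def status_class(code: int) -> str:
    """Return the class of an HTTP status code based on its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return classes[code // 100]

print(status_class(404))  # client error
```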

37
Q

What are the 2 styles of content negotiation?

A

Server-driven - server makes final choice of representation
Client-driven - client makes final choice of representation

38
Q

What are the 3 stages of server-driven content negotiation?

A

1) Client tells server what it is able to accept in request header
2) Server chooses appropriate representation to return to client based on “quality” (provided by client)
3) Server tells client its choice in response header
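
The three stages can be sketched as follows. This is a simplified illustration, not a full implementation: it reads the `q` ("quality") values from an `Accept` header and picks the available media type the client rates highest. The function names are assumptions, and wildcards such as `*/*` are deliberately ignored.

```python
def parse_accept(header: str) -> dict:
    """Parse an Accept header into {media_type: quality}; q defaults to 1.0."""
    prefs = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        quality = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                quality = float(param[2:])
        prefs[fields[0]] = quality
    return prefs

def choose_representation(accept_header: str, available: list):
    """Return the available media type with the highest client quality, or None."""
    prefs = parse_accept(accept_header)
    candidates = [m for m in available if prefs.get(m, 0.0) > 0.0]
    if not candidates:
        return None  # a real server would typically answer 406 Not Acceptable
    return max(candidates, key=lambda m: prefs[m])

chosen = choose_representation(
    "text/html;q=0.8, application/json",
    ["text/html", "application/json"],
)
print(chosen)  # application/json
```

The server would then report the chosen type back to the client in the `Content-Type` response header.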

39
Q

Name three properties that can be negotiated

A

Media type
Language
Encoding

40
Q

What are the 3 stages of client-side content negotiation?

A

1) Client requests resource representation
2) Server returns HTTP redirect status (“300 multiple choices”) with list of URIs
3) Client requests a representation of one of the URIs

41
Q

What is “Client Hints”?

A

An HTTP extension that allows browsers to state their capabilities & preferences

42
Q

How can you avoid the lost update problem?

A

When carrying out unsafe methods, check if the state of the resource has changed since it was last retrieved with GET

43
Q

What 2 ways are there for validating if resources are the same?

A

Strong validation - checks if representations are byte-for-byte identical
Weak validation - checks if representations contain “the same content”

44
Q

What are ETags? What headers can be applied to them?

A

Entity tags are identifiers for resource versions

Headers:
* If-Match: <etag>, <etag>, ...
* If-None-Match: <etag>, <etag>, ...
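
A small sketch of how `If-Match` guards against lost updates. The `Resource` class and `conditional_put` function are hypothetical; the ETag here is derived from a hash of the body, which is only one possible way for a server to generate one.

```python
import hashlib

class Resource:
    """Toy resource whose ETag is derived from its current body (an assumption)."""

    def __init__(self, body: str):
        self.body = body

    @property
    def etag(self) -> str:
        digest = hashlib.sha256(self.body.encode("utf-8")).hexdigest()[:16]
        return '"' + digest + '"'

def conditional_put(resource: Resource, new_body: str, if_match: str) -> int:
    """Apply the update only if the client's ETag still matches the current version."""
    if if_match != resource.etag:
        return 412  # Precondition Failed: the resource changed since the client's GET
    resource.body = new_body
    return 204  # No Content: update applied

page = Resource("version 1")
tag_seen_by_clients = page.etag                                     # both clients GET the same version
first = conditional_put(page, "edit by A", tag_seen_by_clients)     # succeeds
second = conditional_put(page, "edit by B", tag_seen_by_clients)    # rejected, A's edit is kept
```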

45
Q

What are cookies?

A

A way for web servers to persist state across HTTP requests (even though HTTP is supposed to be stateless)

46
Q

What are “Secure” and “HttpOnly” cookies?

A

Secure - indicates that cookies should only ever be sent over HTTPS
HttpOnly - cookies should not be visible from within the Document.cookie interface
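
For illustration, Python's standard `http.cookies` module can build a `Set-Cookie` value carrying both flags. The helper name and the cookie name and value below are made up.

```python
from http.cookies import SimpleCookie

def build_set_cookie(name: str, value: str) -> str:
    """Build a Set-Cookie header value with the Secure and HttpOnly flags set."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["secure"] = True     # only send over HTTPS
    cookie[name]["httponly"] = True   # hide from Document.cookie
    return cookie[name].OutputString()

header_value = build_set_cookie("session", "abc123")
print(header_value)
```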

47
Q

Discuss the physical limits on data transmission (3)

A

Sending a message at c (3e8) still takes 0.067s to go halfway round the world
Optical fibres are ~70% of c, coaxial cables are >80% of c
Routers, switches, etc introduce delays
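
The 0.067s figure can be reproduced with a quick calculation. The sketch assumes that "halfway round the world" is about 20,000 km (half of a roughly 40,000 km circumference) and ignores all router and switch delays.

```python
C = 3e8                            # speed of light in m/s, as on the card
HALF_CIRCUMFERENCE_M = 20_000_000  # assumption: about half of Earth's circumference

def one_way_delay(distance_m: float, fraction_of_c: float = 1.0) -> float:
    """Propagation delay in seconds for a signal travelling at a fraction of c."""
    return distance_m / (C * fraction_of_c)

at_light_speed = one_way_delay(HALF_CIRCUMFERENCE_M)     # about 0.067 s
in_fibre = one_way_delay(HALF_CIRCUMFERENCE_M, 0.7)      # about 0.095 s at 70% of c
```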

48
Q

How does TCP delay HTTP?

A

HTTP runs on top of TCP
TCP establishes connections with a three-way handshake (>=0.2s)

49
Q

What 2 methods reduce TCP delay for HTTP?

A

Keep-Alive - TCP connections reused for multiple HTTP requests
Pipelining - multiple requests made without waiting for responses

50
Q

What 4 improvements were made to adhere to data transfer capacity limits?

A

Compressed headers to reduce amount of data sent
Prioritised requests - sends important content first
Multiplex requests - when client requests HTML document with multiple images, stylesheets & scripts, use a single connection for all resources
Server push - when a client requests HTML doc with image, instead of waiting for them to request image, pre-emptively push resource

51
Q

What are tunnels in the context of proxies?

A

CONNECT method establishes tunnel between client & server
With tunnel established, proxy server no longer inspects/modifies data; just forwards

52
Q

How is data secured in HTTPS?

A

Using the TLS (Transport Layer Security) protocol

53
Q

What are the 4 cryptography principles?

A

Confidentiality - no unauthorised reading
Integrity - no unauthorised modification
Authentication - proof of identity
Non-repudiation - data author can’t deny authorship

54
Q

How are digital signatures created (3 steps)?

A

Combines asymmetric encryption & cryptographic hash

1) Generate cryptographic hash of message
2) Encrypt hash with private key
3) Attach encrypted hash to message

55
Q

How are digital signatures verified (3 steps)?

A

1) Generate cryptographic hash of received message
2) Decrypt attached hash with signer’s public key
3) Compare hashes

If hashes match, message has not been altered and signature is valid
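
The sign and verify steps can be sketched with textbook RSA on deliberately tiny numbers. This is a toy for study purposes only: the key (n = 3233 from primes 61 and 53, e = 17, d = 2753) is far too small to be secure, the hash is reduced modulo n so that it fits, and real systems use padding schemes and vetted libraries.

```python
import hashlib

# Toy RSA key pair (assumption: classic textbook values, insecure by design)
N = 3233   # modulus, 61 * 53
E = 17     # public exponent
D = 2753   # private exponent

def digest(message: bytes) -> int:
    """Hash the message and reduce it modulo N so the toy key can handle it."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Steps 1-2 of signing: hash the message, then encrypt the hash with the private key."""
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Hash the received message, decrypt the signature with the public key, compare."""
    return pow(signature, E, N) == digest(message)

msg = b"pay Bob 10 pounds"
sig = sign(msg)                          # step 3: sig travels attached to msg
ok = verify(msg, sig)                    # True: the hashes match
forged = verify(msg, (sig + 1) % N)      # False: an altered signature fails
```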

56
Q

What is the Certificate Authority?

A

A trusted organisation that issues digital certificates

57
Q

How does the Diffie-Hellman Key Exchange work?

A

1) Prime number p and a primitive root modulo p, g, are shared publicly
2) A and B each pick a random large secret integer: a and b
3) A calculates PUa = g^a mod p and B calculates PUb = g^b mod p; they send these results publicly
4) A calculates PUb^a mod p and B calculates PUa^b mod p; both equal g^(ab) mod p, the shared secret key
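
A worked example with small numbers makes the exchange concrete. The values p = 23, g = 5, a = 6 and b = 15 are chosen for illustration only; real exchanges use very large primes.

```python
p, g = 23, 5      # public: prime modulus and primitive root (toy values)
a, b = 6, 15      # private: A's and B's secret integers (toy values)

pu_a = pow(g, a, p)   # A sends g^a mod p = 8
pu_b = pow(g, b, p)   # B sends g^b mod p = 19

secret_a = pow(pu_b, a, p)   # A computes PUb^a mod p
secret_b = pow(pu_a, b, p)   # B computes PUa^b mod p

print(pu_a, pu_b, secret_a, secret_b)  # 8 19 2 2
```

Both sides arrive at the same value (2) although neither secret integer was ever transmitted.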

58
Q

Authentication vs authorisation

A

Authentication - verifying identity of user/system
Authorisation - granting access/permissions to resources

59
Q

What are the 6 steps in granting authorisation in OAuth (the diagram)?

A

1) Client requests authorisation from the authorisation server via the resource owner
2) Resource owner authenticates the request
3) Authorisation server sends an authorisation code to client via the resource owner
4) Client sends the authorisation code to the authorisation server
5) Authorisation server sends access token to client
6) Client accesses resource on resource server

60
Q

What is cross-site request forgery?

A

When a user (unintentionally) allows one origin to talk to a different origin
User clicks on a link/form while authenticated on a site allowing attacker to perform actions on site with user’s authentication

61
Q

How is cross-site request forgery prevented?

A

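Same origin policy (in part)
Restricts how scripts on a page can interact with resources from an origin other than the one that served the page
SOP alone doesn’t block cross-site form submissions, so sites also use anti-CSRF tokens and SameSite cookies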

62
Q

How is “same origin” determined?

A
  • URIs use same protocol
  • URIs have same host
  • URIs have same port
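
The three checks can be sketched with Python's standard `urllib.parse`. The function names are assumptions; when a URI omits the port, the sketch falls back to the scheme's default (80 for HTTP, 443 for HTTPS).

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(uri: str) -> tuple:
    """Return the (protocol, host, port) triple that defines an origin."""
    parts = urlsplit(uri)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def same_origin(a: str, b: str) -> bool:
    """Two URIs share an origin when protocol, host and port all match."""
    return origin(a) == origin(b)

print(same_origin("http://example.com/a", "http://example.com:80/b"))  # True
print(same_origin("http://example.com/", "https://example.com/"))      # False
```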
63
Q

What is the default port for HTTP?

A

80

64
Q

What blocking exceptions are there to same origin policy?

A

Embedded resources (media, stylesheets, scripts, etc)

65
Q

What is CORS (cross-origin resource sharing)?

A

Security feature that relaxes SOP, allowing certain origins to make requests to a domain different to the one that served the web page
Servers indicate which origins may make requests and restrict headers sent & received

66
Q

What criteria must simple requests satisfy for CORS?

A
  • Only methods: GET, HEAD, POST
  • Only headers: Accept:, Accept-Language:, Content-Type:, Content-Language
  • Content-Type: text/plain, (application/…), (multipart/…)
67
Q

What is a CORS preflight?

A

Used for more complex requests (other methods, custom headers)
A preliminary HTTP request to check if actual request is allowed by the server before sending actual request

68
Q

What is SGML (Standard Generalised Markup Language)?

A

An older markup standard that predates HTML (HTML was originally defined as an application of SGML)
A language for defining markup languages

69
Q

What is XML (eXtensible Markup Language)?

A

A general purpose markup language
A W3C-defined subset of SGML
A language for defining domain-specific markup languages

70
Q

What is DTD (Document Type Definition)?

A

A formal definition of the grammar for an XML document
Tells document processor how to parse the document

71
Q

What 2 schema language competitors does DTD have?

A

XML Schema
RELAX NG

72
Q

Document well-formedness vs validity

A

Well-formedness - obeys syntax rules in XML spec
Validity - well-formed and structure is based on a defined schema (e.g. DTD)
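
Well-formedness can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`, which only tests syntax; checking validity against a DTD or schema would need a validating parser, which is outside this example. The function name is an assumption.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML, i.e. it obeys the XML syntax rules."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<a><b/></a>"))   # True
print(is_well_formed("<a><b></a>"))    # False: <b> is never closed
```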

73
Q

What is SVG (Scalable Vector Graphics)?

A

XML-based language for describing 2D graphics
Uses CSS for styling & animation
Integrates with HTML5

74
Q

What is MathML?

A

XML-based language for expressing mathematical expressions
Integrates with HTML5

2 sub-languages:
* Presentational MathML
* Semantic MathML

75
Q

What is EPub (Electronic Publication) (4)?

A
  • A portable & flexible e-book format
  • Organises content in a standardised structure using XML/HTML, CSS, & images
  • XHTML use makes content adaptable
  • Manifest & spine ensure correct order of book’s components
76
Q

What is Office Open XML (OOXML)?

A

File format standard for Microsoft Office documents (.docx, .xlsx, .pptx)
Organises document into separate XML files for content, styles, & media in a ZIP

77
Q

What is PDF (Portable Document Format)?

A

A file format for rendering documents with a fixed layout
Includes interactive features (forms, annotations, links)

78
Q

What is HTML?

A

A markup language for structured documents
The data format for web pages

79
Q

What does HTML5 introduce?

A

Audio, video, & canvas

80
Q

What are the 4 benefits of HTML5?

A

Better support for modern web applications
Replacing Adobe Flash
Handling invalid markup
Improved semantics

81
Q

What does W3C (World Wide Web Consortium) do (3)?

A
  • Standardises technologies (like HTML) used on the web
  • Ensures websites/web apps work across different browsers/devices
  • Makes the web accessible to all users
82
Q

What does WHATWG (Web Hypertext Application Technology Working Group) do?

A

Maintains web standards such as HTML as continuously evolving “living standards” rather than fixed versions

83
Q

Name 2 early technologies behind JavaScript-based web applications

A

XHR (XML HTTP Request)
AJAX (Asynchronous JavaScript and XML)

84
Q

What 3 modes are used by layout engines in web browsers?

A

Quirks mode - doesn’t conform to official standards, to maintain backward compatibility
Almost standards mode - a small number of quirks implemented
Full standards mode - behaviour described by HTML & CSS specs

85
Q

What are the 4 design principles of web technologies?

A
  • Compatibility
  • Interoperability - well-defined behaviour & graceful error handling
  • Utility - separate content & presentation
  • Universal Access
86
Q

What is sectioning, flow, and phrasing content?

A

Sectioning content defines the structure (headings & sections)
Flow content fills sections with main elements (paragraphs, images, lists)
Phrasing content adds inline details (links, bold text, …)

87
Q

In what 3 ways can you style the web?

A

HTML style attribute - <h1 style="color: red;">Hello</h1>
Provide an inline stylesheet in a style element - <style>h1 {color: red;}</style>
Link to an external stylesheet

88
Q

Describe a CSS rule set

A

Consists of a selector e.g. h1 (to format h1)
And a declaration block (containing a series of declarations) e.g. font-size: 2em

89
Q

Explain the basic selectors:
E F,
E>F,
E+F,
E~F

A

E F - an F element that is a descendant of an E element
E > F - an F element that is a direct child of an E element
E + F - an F element that is immediately preceded by an E
E ~ F - an F element that is preceded by an E

90
Q

Explain the attribute selectors:
E[foo],
E[foo="bar"],
E[foo^="bar"],
E[foo$="bar"],
E[foo*="bar"]

A

E[foo] - an E element with a foo attribute
E[foo="bar"] - an E element with a foo attribute whose value is bar
E[foo^="bar"] - an E element with a foo attribute whose value starts with bar
E[foo$="bar"] - an E element with a foo attribute whose value ends with bar
E[foo*="bar"] - an E element with a foo attribute whose value contains bar

91
Q

Give an example of generated content declaration

A
h1:before {
   content: "Chapter " counter(chap) ": ";
   counter-increment: chap;
}
92
Q

What is the hierarchy of the CSS box model?

A

Margin
Border
Padding
Content

93
Q

What are collapsing margins?

A

Where adjoining vertical margins of block boxes combine into a single margin (the larger of the two) instead of adding up

94
Q

Explain the positioning of: static, relative, absolute, fixed

A

Static - default position
Relative - positions relative to default position, without affecting other elements
Absolute - positions relative to nearest positioned ancestor
Fixed - positions relative to browser window (viewport)

95
Q

What problem does XSL (eXtensible Stylesheet Language) address?

A

The problem of maintaining different versions of a document for presentation on systems with varying screen sizes and document formats

96
Q

XSL-FO vs XSLT vs XPath

A

XSL-FO - specifies layout & formatting of XML documents
XSLT - transforms XML documents into different formats
XPath - query language for navigating & selecting nodes from XML document
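
To give a feel for XPath, the sketch below selects nodes from a small made-up document using Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath. The `library`/`book`/`title` vocabulary is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document used only for this example
LIBRARY = ET.fromstring(
    "<library>"
    "<book lang='en'><title>Dune</title></book>"
    "<book lang='fr'><title>Candide</title></book>"
    "</library>"
)

# Path expression: title children of book elements whose lang attribute is 'en'
english_titles = [t.text for t in LIBRARY.findall("./book[@lang='en']/title")]

# Path expression: every title element anywhere below the root
all_titles = [t.text for t in LIBRARY.findall(".//title")]

print(english_titles)  # ['Dune']
print(all_titles)      # ['Dune', 'Candide']
```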

97
Q

XSL vs CSS (pros & cons)

A

XSL
Pros: can modify & transform document structure
Cons: complex, cumbersome, no consideration for differing needs of users vs authors

CSS
Pros: simple, cascading
Cons: unable to modify document structure

98
Q

What are the 3 Web API classes?

A

Document content
Browser services
Hardware access

99
Q

What is DOM (Document Object Model)?

A

An API for accessing & manipulating XML & HTML in web browsers & elsewhere

100
Q

How does DOM work?

A

DOM is a node interface (node types can be documents, elements, text, etc)
Each node has parents, children, and previous & next siblings
Methods allow the manipulation of children
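
The node interface can be tried outside a browser too. This sketch uses Python's standard `xml.dom.minidom` on a made-up list document to walk parent, child and sibling relationships and then append a new child.

```python
from xml.dom.minidom import parseString

# Hypothetical document: a list with two items
doc = parseString("<ul><li>one</li><li>two</li></ul>")
root = doc.documentElement            # the <ul> element node

first = root.firstChild               # first <li>
second = first.nextSibling            # second <li>, reached via the sibling link

# Manipulate children: build a new <li> with a text node and append it
new_item = doc.createElement("li")
new_item.appendChild(doc.createTextNode("three"))
root.appendChild(new_item)

print(root.toxml())  # <ul><li>one</li><li>two</li><li>three</li></ul>
```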

101
Q

What is Canvas?

A

API for drawing graphics via JavaScript

102
Q

What are XMLHttpRequest, Fetch, and Web Sockets?

A
  • APIs for fetching representations of resources
  • Fetch and Web Sockets are modern replacements for XMLHttpRequest (Fetch is one-time; Web Sockets are persistent)
103
Q

Cookies vs Web Storage vs IndexedDB

A

Web storage API is a more structured, modern alternative to Cookies
IndexedDB is a more powerful, asynchronous API for larger structured data

104
Q

What are Web Workers & Service Workers?

A

Web Workers are JavaScript threads that allow long-running tasks to be executed without blocking the main UI thread
Service Workers are a type of Web Worker that act as a proxy between web application & network, allowing offline capabilities

105
Q

What are asm.js and WebAssembly?

A

asm.js - a subset of JavaScript that allows C code to be compiled & run in any browser
WebAssembly - a more efficient alternative that is a bytecode format for virtual machines

106
Q

What is SOA (Service-Oriented Architecture)?

A

A design approach where software components provide services to other components over a network -> flexible, modular, interoperable

107
Q

What is a monolithic design?

A

Where a program is “composed all in one piece”, so individual parts are hard to change or replace independently

108
Q

What are SOAP, WSDL, and UDDI?

A

W3C Web Services

  • SOAP for invoking services
  • WSDL for describing services
  • UDDI for publishing & locating services
109
Q

Explain the 4 levels of the Richardson Maturity Model

A

Level 0:
* Single endpoint to handle all requests
* Minimal use of HTTP standards - relies on single HTTP method
Level 1:
* Introduces resources with unique URIs (so multiple endpoints)
* Templates and tunneling used to represent resources
* Still lacks proper use of HTTP methods - relies on single HTTP method
Level 2:
* Embraces HTTP standards
* CRUD operations mapped to HTTP verbs
Level 3:
* Adds hypermedia links with HATEOAS (Hypermedia As The Engine Of Application State)

110
Q

What are the 2 pseudostates in UML statecharts?

A

Initial state
Final state

111
Q

How are choice pseudostates represented?

A

[In brackets]

112
Q

What is OpenAPI?

A

A specification for defining RESTful APIs
Formerly known as Swagger

113
Q

How does OpenAPI represent API descriptions?

A

In JSON or YAML

114
Q

What is Project Xanadu?

A

The first hypertext project aiming to connect documents bidirectionally
Founded in 1960 by Ted Nelson

115
Q

What are tumblers (Project Xanadu)?

A

The addressing system used in Xanadu
Tumblers are numeric, write-once identifiers used for addressing of document fragments

116
Q

In what 2 ways can links be stored in Project Xanadu?

A

First-class - separately from documents
Within document - in its link-train

117
Q

What are transclusions (Project Xanadu)?

A

Special links where documents can contain parts of other documents
Address of fragment is inserted and referenced content is retrieved

118
Q

How was the Project Xanadu system distributed? What protocols were introduced?

A

Separated front-end clients from back-end servers

Protocols:
* Front End to Back End - facilitated user interaction
* Back End to Back End - manage consistency between back end servers

119
Q

What are these: HES/FRESS (1967), ZOG (1972), KMS (1983), Hyperties (1983), NoteCards (1984), Intermedia (1985), HyperCard (1987)

A

Hypertext systems

120
Q

What are the 4 types of hypertext systems?

A
  • Macro Literary Systems
  • Problem Exploration Tools
  • Structured Browsing Systems
  • General Hypertext Technology
121
Q

What were Halasz’s 7 issues?

A

1) Search and query - link navigation not always best to find things
2) Composites - augmenting the basic model
3) Virtual structures - documents defined by queries
4) Computation in hypermedia networks
5) Versioning
6) Support for collaborative work
7) Extensibility

122
Q

What were Halasz’s 4 amendments to his 7 issues?

A
  • Ending the Tyranny of the Link
  • Very Large Hypertexts
  • Open Systems
  • User Interfaces for Large Information Spaces
123
Q

What do the following mean in graph theory:
* Diameter
* Average-case diameter
* Degree of a vertex
* Density

A

Diameter - longest shortest-path distance between any 2 vertices
Average-case diameter - average distance between any pair of nodes
Degree of a vertex - number of edges connected to it
Density - ratio of the number of edges to the maximum possible number of edges
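
These measures can be computed on a small example. The four-node undirected graph below is made up, distances are found with breadth-first search, and density is taken here as edges divided by the maximum possible number of edges (an assumption where definitions vary).

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected graph as an adjacency list
GRAPH = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B", "D"], "D": ["B", "C"]}

def distances_from(graph, source):
    """Shortest-path distances (in edges) from source, by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

# Shortest distance for every unordered pair of nodes
pair_distances = [distances_from(GRAPH, u)[v] for u, v in combinations(GRAPH, 2)]

diameter = max(pair_distances)                               # 2
average_distance = sum(pair_distances) / len(pair_distances) # 8 / 6
degree = {node: len(nbrs) for node, nbrs in GRAPH.items()}   # B has degree 3
n_edges = sum(degree.values()) // 2                          # each edge counted twice
n = len(GRAPH)
density = n_edges / (n * (n - 1) / 2)                        # 4 / 6
```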

124
Q

What does centrality represent in graph theory?

A

The importance of a node, not “near the centre”
Centrality can be measured in a variety of ways (4)

125
Q

How is degree centrality found?

A

Simply the number of edges connected to a node

126
Q

How is betweenness centrality found?

A

The number of shortest paths going through a node

127
Q

How is closeness centrality found?

A

The average of the shortest distances from a node to all other nodes
i.e. find the shortest path to every other node and average the distances
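
A minimal sketch of that procedure on a made-up four-node graph, following the card's definition (average shortest distance, so a lower value means a more central node). It assumes the graph is connected and undirected.

```python
from collections import deque

# Hypothetical undirected graph as an adjacency list
NETWORK = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B", "D"], "D": ["B", "C"]}

def average_shortest_distance(graph, node):
    """Average of the shortest distances from node to every other node."""
    dist = {node: 0}
    queue = deque([node])
    while queue:                      # breadth-first search from node
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others)

closeness = {node: average_shortest_distance(NETWORK, node) for node in NETWORK}
most_central = min(closeness, key=closeness.get)   # "B": one step from every node
```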

128
Q

What is a small world network?

A

A graph where most nodes are not neighbours but most nodes can be reached by a small number of connections

129
Q

What is the average degree of a node?

A

log(n) where n is the no. nodes

130
Q

What is Barabasi-Albert model?

A

A tool for understanding the growth and structure of networks
Uses preferential attachment - the probability of a new node linking to an existing node is proportional to the degree of that node
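
The growth rule can be simulated in a few lines. This is a simplified sketch in which each new node adds exactly one link (the full model lets each new node add several); the function name and the seed are assumptions.

```python
import random

def grow_network(n_nodes: int, seed: int = 0):
    """Grow a network where each new node links to one existing node,
    chosen with probability proportional to that node's degree."""
    rng = random.Random(seed)
    degree = {0: 1, 1: 1}     # start from two connected nodes
    edges = [(0, 1)]
    for new in range(2, n_nodes):
        existing = list(degree)
        weights = [degree[v] for v in existing]
        target = rng.choices(existing, weights=weights)[0]
        edges.append((new, target))
        degree[target] += 1   # well-connected nodes become even more likely targets
        degree[new] = 1
    return degree, edges

degree, edges = grow_network(200)
hub_degree = max(degree.values())   # a few early nodes tend to collect many links
```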

131
Q

What 3 things does Google’s PageRank Algorithm account for?

A

No. links
Link quality
Link content

132
Q

What is a Web Crawler?

A

An algorithm that systematically crawls the web
Starts at a webpage, follows hyperlinks from that page, and follows links on those pages, etc…
Collects & stores metadata about pages
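
The crawl loop can be sketched without touching the network. Here the "web" is a made-up dictionary mapping each page to the links it contains, and pages are visited breadth-first; a real crawler would fetch pages over HTTP and apply the policies on the next card.

```python
from collections import deque

# Hypothetical web: page -> hyperlinks found on that page
WEB = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": ["/"],
    "/c": [],
}

def crawl(web, start, max_pages=100):
    """Visit pages breadth-first from start, following each link only once."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue and len(visited) < max_pages:
        page = queue.popleft()
        visited.append(page)                 # here a crawler would store page metadata
        for link in web.get(page, []):
            if link not in seen:             # avoid re-queueing known pages
                seen.add(link)
                queue.append(link)
    return visited

order = crawl(WEB, "/")
print(order)  # ['/', '/a', '/b', '/c']
```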

133
Q

What are the 4 policies of Web Crawlers?

A

Selection policy - states which pages to index (breadth-first, PageRank, …)
Re-visit policy - revisit pages since they change (especially frequent changers)
Parallelisation policy - run multiple processes in parallel e.g. assign a crawler to do all hashes divisible by 5
Politeness policy - don’t make parallel calls to same server - spread requests

134
Q

What are 2 ways people boost the search rankings of their pages with Search Engine Optimisation?

A

Legitimate SEO (white hat) - prioritises user experience, good design, valid metadata
Illegitimate SEO (black hat) - manipulates search engine

135
Q

How is SEO combatted?

A

Regularly update algorithms
Favour legitimate SEOs and penalise illegitimate SEOs

136
Q

What is IETF (Internet Engineering Task Force)?

A

Develops Internet standards such as HTTP and TCP (a standards organisation, like W3C)
Consensus-driven decision-making

137
Q

What is ergodic literature?

A

Literature where the reader not only reads but also makes non-trivial decisions to traverse the text (e.g. a game like Minecraft)

138
Q

What is ludic narrative?

A

Storytelling approach where plot unfolds through interactive mechanics, allowing players to influence story through choices & actions

139
Q

How is hyperdrama different to traditional drama?

A

Interactive, non-linear narratives
Can be experienced across different platforms

140
Q

What is a journal’s impact factor?

A

An indicator of its importance within its field
Calculated from the average number of citations received by its recently published articles

141
Q

When did scholarly publishing begin?

A

1665

142
Q

What are the 2 problems with journals having subscription costs?

A

Impedes new discoveries by limiting key information
Low-income countries & independent scholars can’t access them

143
Q

What are the 4 ideals of Open Access?

A

Entirely online
Available 24/7
All papers citation-linked
Fully searchable, navigable, and retrievable

144
Q

Green OA vs Gold OA

A

Gold OA - articles made freely available, with costs covered by author
Green OA - as Gold OA isn’t always possible, offers access to an earlier version of research in public repository

145
Q

What are the 4 problems with OA?

A

50% of new papers still behind a paywall
Green OA relies on publishers’ policies
Gold OA relies on publishers changing their model
Scientific publications are lucrative

146
Q

What 3 things differentiate open hypermedia from traditional hypermedia?

A
  • Dynamic structure
  • Decentralised
  • More interactive
147
Q

What is DHRM (Dexter Hypertext Reference Model) and Hyper-G?

A

DHRM (1988-90)
A model of an open hypertext system
Compares the functionalities of existing systems

Hyper-G (1989-90)
An implementation of an open hypertext system
Aligns with the Dexter model

148
Q

What are Hyper-G’s 2 link categories?

A

Core links - relate documents stored on same server; changes processed by single server
Surface links - relate documents stored on different servers; changes involve multiple servers

149
Q

What are the 2 types of link integrity failure?

A

Dangling link problem - when an endpoint refers to an invalid node
Content reference problem - when an endpoint refers to a valid node, but to an invalid location within that node

150
Q

What is microcosm?

A

A distributed system of interconnected servers
Organises data into self-contained units called microcosms
Microcosms communicate with each other to share data & updates

151
Q

What are the 2 advantages and 2 disadvantages of microcosm?

A

Advantages:
* Flexible, varied linking
* Integrates with 3rd party applications

Disadvantages:
* Poor scalability
* No support for link integrity

152
Q

What does shim do in OHP (Open Hypermedia Protocol)?

A

Translates between OHP and the linkserver’s native protocol

153
Q

What is a LocSpec in OHP?

A

Identifies the position of an anchor (by byte offset / specific string occurrence / named location)
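
The three ways of locating an anchor can be sketched as one small record plus a resolver. The field names (`byte_offset`, `text`, `occurrence`, `name`) and the `resolve` function are assumptions for illustration, not the actual OHP message format.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LocSpec:
    """Hypothetical LocSpec: exactly one of the three locators is expected."""
    byte_offset: Optional[int] = None   # position counted in bytes
    text: Optional[bytes] = None        # string to search for
    occurrence: int = 1                 # which match of `text` (1-based)
    name: Optional[str] = None          # named location inside the document


def resolve(spec: LocSpec, document: bytes, named: Dict[str, int]) -> Optional[int]:
    """Return the byte position the LocSpec points at, or None if it is invalid."""
    if spec.byte_offset is not None:
        return spec.byte_offset if 0 <= spec.byte_offset < len(document) else None
    if spec.text is not None:
        if spec.occurrence < 1:
            return None
        pos = -1
        for _ in range(spec.occurrence):
            pos = document.find(spec.text, pos + 1)
            if pos == -1:
                return None          # fewer matches than requested
        return pos
    if spec.name is not None:
        return named.get(spec.name)
    return None
```

A `None` result corresponds to the content reference problem: the node exists but the location inside it cannot be found.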

154
Q

What is the advantage and disadvantage of OHP?

A

Advantage:
* Commonly used & accepted model

Disadvantage:
* High message overhead

155
Q

How are linkbases automatically created (OH)?

A

1) Extract key phrases from documents based on location & frequency
2) Create generic links to those documents based on phrases
3) Use linkbase editor to delete unwanted links
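
The three steps above can be sketched as follows. This is an illustrative outline only: the scoring (word frequency plus a bonus for words in the title), the `STOPWORDS` list, and all function names are assumptions, and single words stand in for key phrases.

```python
import re
from collections import Counter
from typing import Dict, List, Set

STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "for", "are"}


def key_phrases(title: str, body: str, top_n: int = 3, title_weight: int = 3) -> List[str]:
    """Step 1: score words by frequency, with extra weight for the title (location)."""
    scores: Counter = Counter()
    for word in re.findall(r"[a-z]+", body.lower()):
        if word not in STOPWORDS:
            scores[word] += 1
    for word in re.findall(r"[a-z]+", title.lower()):
        if word not in STOPWORDS:
            scores[word] += title_weight
    return [word for word, _ in scores.most_common(top_n)]


def build_linkbase(docs: Dict[str, Dict[str, str]], top_n: int = 3) -> Dict[str, List[str]]:
    """Step 2: create generic links, phrase -> documents it should link to."""
    linkbase: Dict[str, List[str]] = {}
    for uri, doc in docs.items():
        for phrase in key_phrases(doc["title"], doc["body"], top_n):
            linkbase.setdefault(phrase, []).append(uri)
    return linkbase


def delete_links(linkbase: Dict[str, List[str]], unwanted: Set[str]) -> Dict[str, List[str]]:
    """Step 3: stand-in for the linkbase editor, dropping unwanted phrases."""
    return {p: dests for p, dests in linkbase.items() if p not in unwanted}
```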

156
Q

What 4 places can we inject links?

A

During batch processing
On demand in origin server
Using proxy server
In the user agent

157
Q

What are the 3 limitations of HTML linking?

A

URIs can identify entire resources but not parts of them
Inflexible tags
Links are fixed & one-directional

158
Q

What is XPointer in relation to XPath?

A

XPath can’t identify ranges in a document, only elements
XPointer allows XPath path expressions to be used as fragment identifiers in URIs
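
A rough illustration of a path expression carried in a URI fragment. Python's `xml.etree.ElementTree` supports only a limited XPath subset, so the expression here is relative (`./chapter[2]/title`); the URI, the sample XML, and `resolve_xpointer` are assumptions, and this is not a conforming XPointer processor.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

XML = """<book>
  <chapter><title>Intro</title></chapter>
  <chapter><title>Linking</title></chapter>
</book>"""


def resolve_xpointer(uri: str, xml_text: str) -> List[ET.Element]:
    """Evaluate the path inside an xpointer(...) fragment against the document."""
    _, _, fragment = uri.partition("#")           # text after '#' is the fragment
    match = re.fullmatch(r"xpointer\((.+)\)", fragment)
    if match is None:
        return []                                 # no xpointer() fragment present
    root = ET.fromstring(xml_text)
    return root.findall(match.group(1))           # limited XPath subset only


uri = "http://example.org/book.xml#xpointer(./chapter[2]/title)"
titles = [element.text for element in resolve_xpointer(uri, XML)]
```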

159
Q

What is XLink?

A

A W3C standard for creating hyperlinks in XML documents

160
Q

What is the state of XLink now?

A

Declined use due to shift from XML to JSON
HTML & JavaScript have largely replaced its functionality

161
Q

How is JSON superior to XML (4)?

A

Simpler, more concise, easier to read & write
Better parsing
Smaller file size
No strict schema
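
The conciseness point can be checked on a small example. This sketch serialises one hypothetical record both ways using only the standard library; the size difference holds for this record and will vary with the data.

```python
import json
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical record used for the comparison.
RECORD: Dict[str, str] = {"id": "42", "title": "Hypertext", "year": "1965"}


def to_json(record: Dict[str, str]) -> str:
    """Serialise compactly, without spaces after separators."""
    return json.dumps(record, separators=(",", ":"))


def to_xml(record: Dict[str, str]) -> str:
    """Serialise as one child element per field under a <record> root."""
    root = ET.Element("record")
    for key, value in record.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


json_text = to_json(RECORD)
xml_text = to_xml(RECORD)
json_is_smaller = len(json_text) < len(xml_text)
```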

162
Q

Contextual vs behavioural advertising

A

Contextual - based on content in the page
Behavioural - based on profiling of a specific user

163
Q

AdWords Auctions for contextual ads (4 steps)

A

1) Find all ads whose keywords match search terms
2) Ignore ads in ineligible countries
3) Calculate AdRank
4) Show ads with sufficient AdRank
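
The four steps can be sketched as a filter-and-rank pipeline. The card does not define AdRank, so this sketch assumes the commonly cited simplification of bid multiplied by quality score; the `Ad` fields, `run_auction`, and the threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Ad:
    name: str
    keywords: Set[str]
    countries: Set[str]    # countries the ad is eligible to be shown in
    bid: float
    quality: float


def run_auction(ads: List[Ad], search_terms: Set[str], country: str, min_rank: float) -> List[str]:
    # 1) Keep ads with at least one keyword matching the search terms.
    matching = [ad for ad in ads if ad.keywords & search_terms]
    # 2) Ignore ads that are not eligible in the searcher's country.
    eligible = [ad for ad in matching if country in ad.countries]
    # 3) Calculate AdRank (assumed here to be bid * quality score).
    ranked = sorted(
        ((ad.bid * ad.quality, ad) for ad in eligible),
        key=lambda pair: pair[0],
        reverse=True,
    )
    # 4) Show only ads whose AdRank reaches the threshold, best first.
    return [ad.name for rank, ad in ranked if rank >= min_rank]
```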

164
Q

What is AdSense?

A

Contextual advertising based on keyword matching
Advertisers compete in auctions for ad positions on publishers' pages

165
Q

What are the 3 advertising cost models?

A

Cost per Mille - pay per 1000 impressions (displays of ad)
Cost per Click - pay when someone clicks on ad
Cost per Action - pay when someone does a specific action linked to the ad (e.g. sign up to mailing list, purchase something, …)
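
A small worked example of the arithmetic behind each model. The function names and the rates in the usage note are made up for illustration.

```python
def cpm_cost(impressions: int, rate_per_thousand: float) -> float:
    """Cost per Mille: pay a fixed rate for every 1000 impressions."""
    return impressions / 1000 * rate_per_thousand


def cpc_cost(clicks: int, rate_per_click: float) -> float:
    """Cost per Click: pay only when the ad is clicked."""
    return clicks * rate_per_click


def cpa_cost(actions: int, rate_per_action: float) -> float:
    """Cost per Action: pay only when the linked action is completed."""
    return actions * rate_per_action
```

For instance, 50,000 impressions at a rate of 2.00 per thousand cost 100.00 under CPM; if those impressions produce 400 clicks at 0.25 each, CPC also comes to 100.00.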

166
Q

What is RTB (Real Time Bidding)?

A

Utilises computer algorithms to automatically buy & sell ads in real-time
Each impression is bought & sold individually, using per-impression context

167
Q

What ecosystems is RTB (Real-Time Bidding) built on (4)?

A

Ad exchange - marketplace for advertising space
Supply-Side Platform - for selling advertising
Demand-Side Platform - for buying advertising
Data Management Platform - manage cookie IDs to target ads

168
Q

In what order is data built on through the advertising ecosystem (4)?

A

1) Browser
2) Website adds
3) Ad server adds
4) Advertiser adds

169
Q

What is fingerprinting?

A

A stateless tracking technique (unlike cookies, which store an identifier on the device)
Identifier derived from information taken from device (like a hash)
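
A minimal sketch of deriving an identifier by hashing device information. The attribute names (user agent, screen size, timezone) are assumptions; real fingerprinting scripts gather many more signals.

```python
import hashlib
import json
from typing import Dict


def fingerprint(attributes: Dict[str, str]) -> str:
    """Hash device attributes into a stable identifier; nothing is stored on the device."""
    # Sort keys so the same attributes always produce the same string.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attributes a script might read from the browser.
device = {"user_agent": "ExampleBrowser/1.0", "screen": "1920x1080", "timezone": "UTC+1"}
device_id = fingerprint(device)
```

The same device yields the same identifier on every visit, which is why clearing cookies does not reset it.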

170
Q

Semantic Web vs Annotated Web

A

Semantic Web - extension of current web; enables data to be interconnected & understood by machines

Annotated Web - a web where extra data (metadata/annotations) is added to content to make it easier to understand for humans & machines

171
Q

What are the 2 components of resource description framework?

A

Literals (object) - have value but no identity
Resources (subject) - represent objects with identity

Linked with a predicate (arrow)
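
A subject-predicate-object statement can be modelled with three small classes. This is a sketch using plain dataclasses, not an RDF library; the example.org URIs and `objects_of` are illustrative. In RDF the object position may hold either a literal or another resource, which the second triple shows.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Resource:
    uri: str          # has identity


@dataclass(frozen=True)
class Literal:
    value: str        # has a value but no identity


@dataclass(frozen=True)
class Triple:
    subject: Resource
    predicate: Resource
    obj: Union[Resource, Literal]


def objects_of(triples: List[Triple], subject: Resource, predicate: Resource) -> list:
    """Follow one predicate (arrow) from a subject and collect what it points at."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


PAGE = Resource("http://example.org/page")
ALICE = Resource("http://example.org/alice")
TITLE = Resource("http://purl.org/dc/terms/title")
CREATOR = Resource("http://purl.org/dc/terms/creator")

GRAPH = [
    Triple(PAGE, TITLE, Literal("Web Architecture")),
    Triple(PAGE, CREATOR, ALICE),
]
```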

172
Q

What are the 4 semantic web publishing practices?

A

1) Using URIs to identify things
2) Making URIs accessible via HTTP
3) Providing useful information when URIs are accessed
4) Linking to other URIs for better connectivity

173
Q

What are the 5 quality levels of linked data?

A

Lowest to highest:
* 1: available on the web
* 2: machine-readable
* 3: non-proprietary format
* 4: uses W3C standards
* 5: linked to other people’s data

174
Q

Do hypertext links have to be explicit, static and textual?

A

No:
* May be derived from underlying relationships
* Can be computed on the fly
* May change depending on who’s viewing them

175
Q

What is sculptural hypertext?

A

Instead of adding links between nodes, assumes all nodes are linked and removes links until the desired structure is achieved
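
The idea can be sketched by starting from a fully connected set of nodes and cutting links away. The `sculpt` function, the story scenes, and the removal rule (no links backwards in the story) are assumptions chosen for illustration.

```python
from itertools import permutations
from typing import Callable, Dict, Set, Tuple

Link = Tuple[str, str]


def sculpt(nodes, should_remove: Callable[[str, str], bool]) -> Set[Link]:
    """Start with every node linked to every other, then remove unwanted links."""
    links: Set[Link] = set(permutations(nodes, 2))   # all directed pairs
    for source, destination in list(links):
        if should_remove(source, destination):
            links.discard((source, destination))
    return links


# Hypothetical story scenes with their position in the narrative.
ORDER: Dict[str, int] = {"intro": 1, "middle": 2, "end": 3}

# Rule: remove any link that would send the reader backwards.
forward_only = sculpt(ORDER, lambda src, dst: ORDER[dst] <= ORDER[src])
```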

176
Q

What are the following types of hypermedia:
* Temporal
* Computational
* Conceptual
* Pervasive
* Adaptive

A

Temporal - content changes over time
Computational - content dynamically generated/altered using algorithms/user input
Conceptual - links based on ideas/concepts, not just direct connections
Pervasive - content linked with real world e.g. augmented reality
Adaptive - content & links personalised based on user preferences

177
Q

What is net neutrality?

A

The principle that Internet Service Providers must treat all Internet communications equally

178
Q

What are the 6 arguments for net neutrality?

A

Accessibility
Choice - promotes competition
Experimentation - equal access for all companies
Free Speech
Innovation
Unbiased

179
Q

What are the 3 arguments against net neutrality?

A

Anti-competitive
Increases quality
Regulation - stop illegal content

180
Q

What 3 intellectual properties are there on websites? How are they protected?

A

Text/design/graphics/layout/music/software - protected by copyright
Logos/branding - protected by trademark rights
Underlying database - protected by database rights

181
Q

What 4 rights does a copyright owner have?

A

Copy the work
Make an adaptation of the work
Show the work to the public
Issue copies to the public

(For a limited time)

182
Q

What are the 2 types of copyright infringement?

A

Primary infringement - doing any of the acts reserved to the copyright owner without permission
Secondary infringement - if someone facilitates other people infringing on a copyright

183
Q

What is cyber squatting?

A

Registering a domain name with no legitimate reason other than to benefit from another’s reputation
Makes money by directing users to unrelated material or selling the domain at an inflated price

184
Q

How can trademark holders police the use of their mark?

A

Instead of being responsible for policing, can use the Creative Commons Attribution license

185
Q

What do the following CCAs allow for:
* CCA
* CCA-ShareAlike
* CCA-NonCommercial
* CCA-NoDerivs
* CCA-NonCommercial-ShareAlike
* CCA-NonCommercial-NoDerivs

A

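CCA: can distribute/remix/tweak/build on work as long as credited
CCA-ShareAlike: as CCA, but must license any new creations under identical terms
CCA-NonCommercial: as CCA, but for non-commercial purposes only
CCA-NoDerivs: can redistribute as long as credited, but work must remain unchanged
CCA-NonCommercial-ShareAlike: non-commercial purposes only, and new creations must be licensed under identical terms
CCA-NonCommercial-NoDerivs: can share for non-commercial purposes only, and work must remain unchanged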