General Study Flashcards

1
Q

Explain the Pub/Sub Model (4 Features)

A
  • Decoupling
  • Messaging Server
  • Asynchronous Communication
  • Pub/Sub Model
2
Q

What does the use of a queue model do to a Publish Subscribe Model

A

With a strict queue, each message is delivered to only one consumer, so only one user will be able to read it.
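
The contrast can be sketched in a few lines of Python. This is a minimal in-memory illustration under assumed names (`publish_to_topic`, `publish_to_queue`, `consume` are hypothetical helpers, not a real broker API): with pub/sub every subscriber gets its own copy, while a strict queue hands each message to a single consumer.

```python
from collections import deque

def publish_to_topic(subscribers, message):
    """Pub/sub: every subscriber's inbox receives its own copy."""
    for inbox in subscribers.values():
        inbox.append(message)

def publish_to_queue(queue, message):
    """Strict queue: the message is stored exactly once."""
    queue.append(message)

def consume(queue):
    """Remove and return the next message, or None if the queue is empty."""
    return queue.popleft() if queue else None

# Pub/sub: both subscribers see the message.
subscribers = {"alice": [], "bob": []}
publish_to_topic(subscribers, "new video")

# Queue: the first consumer takes the message, the second finds nothing.
queue = deque()
publish_to_queue(queue, "new video")
first = consume(queue)
second = consume(queue)
```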

3
Q

Give an example of a publish/subscribe model.

A

YouTube subscribers

4
Q

What is a Local Conceptual Schema

A

A fragment of the global conceptual schema stored locally at a site. Each one has its own local physical schema.

5
Q

Advantages of Distribute + Replicate

A

Performance, Fault Tolerance, Scale Up, Application Related Aspects

6
Q

When not to replicate

A
  • When there is Low Replication Transparency
  • When there are Data Consistency Issues
7
Q

What is Update Propagation Protocol?

A

Set of rules for how changes/updates to data are propagated.

8
Q

Synchronous/Primary Copy UPP:

A

Primary Copy
On Read: Read Locally and Return to User
On Write: Write locally, multicast to replicas
On Commit Req: Run 2PC Coordinator
On Abort: Abort + Inform other sites.

Secondary Copy
On Read: Read Locally
On Write from Client: Refuse or forward to primary.
On Write from Prim.: Write locally, multicast to replicas
On Commit Req: Commit locally
Participant of 2PC on primary

9
Q

Synchronous/Primary Copy Advantages

A
  • Updates don’t have to be coordinated
  • No inconsistencies or Deadlocks
10
Q

Synchronous/Primary Copy Disadvantages

A
  • Long response time
  • Useful only with few updates
  • Local copies almost useless
  • Not used in practice
11
Q

Synchronous/Update Everywhere Advantages:

A
  • No inconsistencies
  • Elegant (Updates applied uniformly)
  • Data Consistency
  • Fault Tolerance
12
Q

Asynchronous/Primary Copy

A

Primary Copy
On Read: Read Locally and Return to User
On Write: Write locally, return to user
On Commit/Abort: Terminate locally
After Commit: Multicast changed objects in single array.

Secondary Copy
On Read: Read Locally
On Write from Client: Refuse or forward to primary.
On Message from Prim.: Install changes in order
On Commit/Abort: Commit locally
Only local deadlocks

13
Q

Asynch/Primary Copy Advantages

A
  • No coordination needed
  • Short response time
  • Good Performance
14
Q

Asynch/Primary Copy Disadvantages

A
  • Local copies are not up to date
  • Inconsistencies
  • Limited Fault Tolerance
15
Q

Asynch/Update Everywhere

A

Primary Copy
On Read: Read Locally and Return to User
On Write: Write locally, return to user
On Commit/Abort: Terminate locally
On msg from other site: Detect Conflicts
After Commit: Multicast changed objects in single array.

Secondary Copy
On Read: Read Locally
On Write from Client: Refuse or forward to primary.
On Message from Prim.: Install changes in order
On Commit/Abort: Commit locally
Only local deadlocks

16
Q

Explain ROWA

A

Read One, Write All. Replication strategy where each site keeps local data copy and rules govern how operations are handled.

Concepts:
- Each Site uses 2PL
- Read Ops performed locally
- Write Ops performed at all sites
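
The read/write rules above can be sketched as follows. This is only an illustrative in-memory model: the class name `RowaReplicaSet` is hypothetical, and the 2PL locking and atomic commit that a real ROWA implementation needs are deliberately left out.

```python
class RowaReplicaSet:
    """Toy ROWA model: one dictionary per site stands in for its local copy."""

    def __init__(self, site_names):
        self.sites = {name: {} for name in site_names}

    def read(self, local_site, key):
        """Read One: the read is served from the local copy only."""
        return self.sites[local_site].get(key)

    def write(self, key, value):
        """Write All: the write is applied at every site."""
        for store in self.sites.values():
            store[key] = value

replicas = RowaReplicaSet(["A", "B", "C"])
replicas.write("x", 1)
```

After the write, a read at any single site returns the same value, which is why reads can stay local.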

17
Q

Reconciliation patterns

A

1) Latest Update Wins (Most recent update preferred when conflict occurs)
2) Site Priority (Prioritize Updates from HQ)
3) Largest Value (Prioritize Largest transaction)
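
The three patterns can be sketched as small resolver functions. The `Update` record and the function names are assumptions made for this illustration, and falling back to the latest update when neither site is preferred is a design choice of the sketch, not something the card specifies.

```python
from dataclasses import dataclass

@dataclass
class Update:
    site: str
    timestamp: int
    value: float

def latest_update_wins(a, b):
    """Prefer the most recent update."""
    return a if a.timestamp >= b.timestamp else b

def site_priority(a, b, preferred_site="HQ"):
    """Prefer the update from the preferred site; otherwise use the latest."""
    if a.site == preferred_site and b.site != preferred_site:
        return a
    if b.site == preferred_site and a.site != preferred_site:
        return b
    return latest_update_wins(a, b)

def largest_value(a, b):
    """Prefer the update carrying the largest value."""
    return a if a.value >= b.value else b

hq = Update("HQ", timestamp=10, value=50.0)
branch = Update("Branch", timestamp=12, value=80.0)
```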

18
Q

Ad-hoc Reconciliation strategies

A

1) Identify changes and try to combine them
2) Analyze and eliminate unimportant transactions
3) Create your own priority schemas

19
Q

Replication protocols for dealing with failures

A

Site Failures -> Use ROWAA (Read One, Write All Available)
Network/Comm errors -> Use Quorums

20
Q

Quorum

A

Needs to reach a certain threshold of successful responses before considering a transaction committed.
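
As a rough sketch of the threshold check: the card does not say what the threshold is, so the strict majority used here is an assumption of this example, and both function names are hypothetical.

```python
def majority(n_replicas):
    """Assumed threshold for this sketch: a strict majority of the replicas."""
    return n_replicas // 2 + 1

def quorum_reached(responses, threshold):
    """responses: list of booleans, True for each successful acknowledgement."""
    return sum(1 for ok in responses if ok) >= threshold

threshold = majority(5)
committed = quorum_reached([True, True, False, True, False], threshold)
rejected = quorum_reached([True, False, False, True, False], threshold)
```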

21
Q

ROWAA (Primary Site)

A

Read -> Read any copy, if timed out read another copy

Write -> Send write(x) to all copies. If a site rejects, abort. All sites that don’t respond are recorded as missing writes.

Validation -> Before committing a transaction, check that the sites with missing writes are still down; if not, abort. Also check that the available sites are still available; if not, abort.

22
Q

ROWAA (Update Everywhere)

A

Read -> Read any copy from available sites

Update -> Update any copy from available sites

Modify -> Run a special atomic transaction at all sites: Make sure no concurrent views exist. Make sure sites are of the highest version.

Recovery -> Get missed updates from active nodes.

23
Q

What does NOTIFY do?

A

Raises a notification event on a given channel to the clients subscribed to it. If no session is listening, the notification is lost.

24
Q

Notification Structure

A

Channel, Process ID, Payload

25
Q

Notification Syntax

A

NOTIFY channel [, payload]

OR

SELECT pg_notify('channel', 'payload');

26
Q

What does Listen do

A

Registers the current session as a listener on a notification channel

27
Q

Listen Operations:

A
  • When a notification is raised, registered sessions are notified
  • Sessions can issue UNLISTEN to the server
  • Registrations are automatically removed (as if by UNLISTEN) when the session ends.
28
Q

Listen Syntax

A

LISTEN channel

UNLISTEN channel

or

pg_listen conn notifyName

29
Q

Timing of Producer sending messages

A

Calling Notify

  • If raised during transaction, queue notification until after commit.
  • If transaction rolled back, notification never delivered
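
The two rules can be modelled with a toy Python class. This is a simulation for study purposes only, not how PostgreSQL implements NOTIFY; `ProducerSession` and the shared `delivered` list are hypothetical stand-ins.

```python
class ProducerSession:
    """Toy model of a producer session that raises notifications."""

    def __init__(self, delivered):
        self.delivered = delivered      # shared list standing in for the listeners
        self.pending = []               # notifications queued inside a transaction
        self.in_transaction = False

    def begin(self):
        self.in_transaction = True

    def notify(self, channel, payload=""):
        if self.in_transaction:
            self.pending.append((channel, payload))    # held until commit
        else:
            self.delivered.append((channel, payload))

    def commit(self):
        self.delivered.extend(self.pending)            # released after commit
        self.pending.clear()
        self.in_transaction = False

    def rollback(self):
        self.pending.clear()                           # never delivered
        self.in_transaction = False

delivered = []
session = ProducerSession(delivered)

session.begin()
session.notify("orders", "1")
before_commit = list(delivered)   # snapshot taken while the transaction is open
session.commit()

session.begin()
session.notify("orders", "2")
session.rollback()
```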
30
Q

Timing of Consumer seeing messages

A

Calling Listen

- If the listening session is inside a transaction, it gets no access to the notification until its local transaction commits or rolls back.
- The channel is accessed after the local transaction terminates.

31
Q

What is the Notify Listen Model

A

Implementation mechanism of Pub/Sub Models. NOTIFY and LISTEN are PostgreSQL extensions to SQL (non-standard SQL).

32
Q

Generate Json from a table

A

COPY (SELECT row_to_json(r) FROM (SELECT * FROM scott.dept) r) TO '/path/to/filename.json';

33
Q

Generate XML from data

A

SELECT table_to_xml('scott.dept', true, false, '');

SELECT query_to_xml('SELECT * FROM scott.dept', true, false, '');

34
Q

Show how XPATH is used

A

SELECT data FROM test_table WHERE CAST(xpath('//root/s_node/text()', data) AS text[]) = '{a_value}';

35
Q

Get a resource using CURL

A

curl --insecure --request GET "resource url&key"

36
Q

What is REST (Just definition)

A

Representational State Transfer is an architectural style used for interfacing with web services. Any API following REST principles is considered RESTful.

37
Q

What are the Principles of REST (There are 6)

A
  • Client Server Architecture
  • Stateless Operation (each request carries enough info to be understood in isolation)
  • Resource Caching (Requests may be answered through a cache)
  • Uniform Interface (Server announces available actions and resources for ease)
  • Layered System (Client can’t tell if it’s connected directly or through middleman)
  • Code On Demand (optional; server can send executable code to clients)
38
Q

What is an idempotent operation

A

An operation that can be carried out multiple times and still leaves the server in the same state as carrying it out once, ex. x=1.

This makes retries fault tolerant. Not the same as safe: a safe operation doesn’t alter state at all.
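
A small Python illustration of the x=1 example, using hypothetical helper names: assigning a value is idempotent, while incrementing is not.

```python
def set_x(state, value):
    """Idempotent: repeating the call leaves the same state (x = value)."""
    state["x"] = value
    return state

def increment_x(state):
    """Not idempotent: every repetition changes the state again."""
    state["x"] += 1
    return state

set_once = set_x({"x": 0}, 1)
set_twice = set_x(set_x({"x": 0}, 1), 1)

inc_once = increment_x({"x": 0})
inc_twice = increment_x(increment_x({"x": 0}))
```

Because `set_once` and `set_twice` end in the same state, a client that is unsure whether its request arrived can simply send it again.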

39
Q

What is the function and idempotency of the following:
GET
POST
PUT
DELETE

A

GET - returns a representation of a resource. Idempotent and safe.

POST - creates a new resource; the server decides its URL and returns it. Neither idempotent nor safe.

PUT - creates or replaces a resource at a URL decided by the client. Idempotent but not safe.

DELETE - removes the specified resource (not always physically). Idempotent but not safe.

40
Q

What are the prerequisites of interfacing web services to application functionality.

A
  • Service info transparent to language
  • Service discovered across large collection of services and servers.
  • Data exchange between machines.
  • Dealing with errors, server comm. unavailable/busy.
41
Q

What is SOAP? Is it industry Grade?

A

Simple Object Access Protocol is a protocol for inter-app communication with web services, used to exchange structured messages in XML.

Not industry grade on its own, as it lacks guarantees of features such as transactionality. It can be extended to become industry grade.

42
Q

What is the SOAP messaging structure?

A

Envelope: The root of the message.

Header: Optional. Holds info about how the message is to be processed (e.g. the mustUnderstand attribute). Can contain multiple header entries.

Body: Contains the actual message in XML. Either app-specific data or a fault message.

Fault: Communicates errors. Standardized codes/messages containing: fault code, fault string, fault actor, detail.

43
Q

What is SOAP Binding

A

Can bind to different protocols ex. SMTP
- Put SOAP envelope inside protocol as a payload to that protocol
- With HTTP, use POST, because the envelope is big.

44
Q

What is WSDL

A

Web Services Description Language is a contract between service and consumer. It specifies the location and methods of a service.

45
Q

What is UDDI

A

Universal Description, Discovery and Integration is an online directory to query and search for web services. New services can be published to it.

46
Q

Explain ACID

A

Atomicity: Either all operations of a transaction go through or none of them do

Consistency: Data should remain correct

Isolation: To preserve consistency, conflicting operations of concurrent transactions are not permitted to interfere with each other

Durability: After completion of transaction, impact should persist even if system fails.

47
Q

CAP Theorem and Solution

A
  • Consistency
  • Availability
  • Partition Tolerance

Only two of the three can be guaranteed at the same time. Can be worked around through eventual or causal consistency

48
Q

What is BASE

A
  • Basically Available (Available through replication)
  • Soft State (Values in DB can change over time)
  • Eventually Consistent (DB can become consistent in long run)
49
Q

What is Eventual Consistency?

A

Given some DDB, replicas are guaranteed to reach a consistent state at some point in time once no new writes are present.

Allows us to have availability and partition tolerance; however, there are no guarantees on conflicts or order.

Can suffer from data inconsistency, transaction inconsistency, and integrity invariant violations.

50
Q

What is a Causally Consistent DB?

A

Order of causally related operations is guaranteed; order of concurrent writes is not guaranteed.

  • CAP Free
  • Strongest consistency model in fault tolerant DDB
  • May slow performance, doesn’t guarantee durability
51
Q

Some approaches to a causally consistent DB

A
  • Use RDBMS as prim. storage
  • Deliver DDB as middleware
  • Deliver CC+ with read only transactions/invariant preservation.
52
Q

Define CQS

A

Command Query Separation (a command changes the system's state, a query reads it; an operation should do one or the other)

53
Q

Define CQRS

A

Command Query Responsibility Segregation (Favours separate data models for Read + Write)

54
Q

Define ES

A

Event Sourcing (data changes captured as sequence of events, in a log)
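
A minimal sketch of the idea, assuming a hypothetical bank-account example with made-up event names (`deposited`, `withdrawn`): the log is the source of truth, and the current state is derived by replaying it.

```python
def apply_event(balance, event):
    """Apply one logged event to the current state and return the new state."""
    kind, amount = event
    if kind == "deposited":
        return balance + amount
    if kind == "withdrawn":
        return balance - amount
    raise ValueError(f"unknown event kind: {kind}")

def replay(events, initial=0):
    """Rebuild the state by applying every event in log order."""
    state = initial
    for event in events:
        state = apply_event(state, event)
    return state

# Changes are only ever appended to the log, never overwritten.
log = []
log.append(("deposited", 100))
log.append(("withdrawn", 30))
log.append(("deposited", 5))

current_balance = replay(log)
balance_after_two_events = replay(log[:2])   # any earlier state can be rebuilt
```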

55
Q

What is D-Thespis

A

Middleware that delivers a CCDBMS for data. Accessible via REST API.

Improvements:
- Elastic horizontal scalability
- Improved update visibility latency

56
Q

Features of D-Thespis Model (7)

A
  • Rest Client API (Clients to R/W)
  • Middleware Engine (NO knowledge of model but serves READ requests)
  • Actor Provider (encapsulates logic to be executed)
  • Data Centre Clock (other layers use to obtain physical timestamp)
  • Cluster Clock (Access to read and maintain stable version vector)
  • Data Replication Job (Periodically executes data replication protocol)
  • Data Snapshotting Job (Periodically identify data entities with no more dependencies)
57
Q

What are the two forms of client server communication, define them.

A

TCP/IP: Connection oriented, guarantees data transfer, has checksum.

UDP/IP: Connectionless protocol, single datagram sent and no acknowledgement for delivery.

58
Q

What are the steps for defining the Implementation of IP using TCPIP in OS?

A

Allocate local resources, Specify local+remote endpoints, initiate connection, send or receive data.

59
Q

What command is used to allocate port number

A

bind()

60
Q

Socket

A

API to TCP. Developed on UNIX as a set of OS calls:
- Comms. connection point that can be named and addressed in network
- Data structure
- Set of API functions

61
Q

What are the three steps for Client-Software Communication?

A

1) Client prepares the server's endpoint address in the corresponding data structure and keeps a reference to it

2) Client issues socket() call to create socket

3) TCP client issues connect call, fills socket data and attempts connection. UDP client uses sendmsg().

62
Q

List the steps in the TCP/IP connection flow.

A

1) Server creates socket and waits for remote clients to connect (listen())

2) Client calls connect(), server issues accept(). TCP 3 way handshake occurs (SYN, SYN/ACK, ACK)

3) Clients + Server exchange data over socket using read() or write(). Typically ACK each write.

4) Either party closes (FIN).
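
The four steps can be tried out on the loopback interface with Python's standard `socket` module. This is a sketch under simplifying assumptions: one client, one short message, and a hypothetical helper `serve_once` that upper-cases what it receives.

```python
import socket
import threading

def serve_once(server_sock):
    """Accept one client, send back its message in upper case, then close."""
    conn, _addr = server_sock.accept()   # returns the established connection
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

# Step 1: the server creates a socket, binds it and listens.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))            # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]
worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

# Step 2: the client connects; the 3 way handshake happens inside connect().
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))

# Step 3: data is exchanged over the connected socket.
client.sendall(b"hello")
reply = client.recv(1024)

# Step 4: either party closes (FIN).
client.close()
worker.join()
server.close()
```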

63
Q

What are the differences between TCP and UDP?

A

Reliability: TCP uses ACK, has retransmission and timeouts. UDP has none of those.

Order: TCP is ordered, messages received in order sent. UDP has no order or guarantees.

Overhead: TCP has high overhead due to the 3 way handshake and acknowledgements. UDP has very little overhead (no connection setup).

Method of Transfer: TCP data read in streams of bytes with no message boundaries. UDP reads data as packets with boundaries.

Applications: TCP has HTTP, SMTP, FTP… UDP has Video streaming and VOIP

64
Q

How do you define a new socket in Java?

A

Socket s = new Socket(IP, Port);

65
Q

What is the conceptual server algorithm?

A

Create socket, bind to service port, then repeatedly accept and serve client requests until closed.

66
Q

Define Iterative and Concurrent Servers.

A

Iterative - Serve one client at a time until termination.

Concurrent - Serve several clients, using time slicing to not block clients.

67
Q

Define and mention the differences between HTTPs and Web Sockets

A

HTTPs - protocol above TCP, used by REST. Client requests, server responds. Long polling can be used to emulate server push (the server only responds when a new message is available or a timeout is reached).

Websockets - Protocol, starts with ws(s)://. Allows for sending data similar to UDP but with TCP reliability. HTTPs is used as the initial transport mechanism: the client requests to open a websocket and the server responds if possible. If successful, both parties use the existing TCP connection as a websocket connection.

68
Q

What can unrestricted sharing lead to (3 - and explain them)

A
  • Lost Updates (Two concurrent updates on same data, only one goes through)
  • Inconsistent Read (When a transaction reads data that’s being modified by another transaction)
  • Dirty Read (When data is read that has been modified but not committed; if the modification is rolled back, the reader has seen a value that never existed)
69
Q

What is a transaction and what are its components?

A

Sequence of actions that realize a logical operation. Components:

  • Disk read and write operations
  • Actions supplemented by control instructions (start/commit/rollback)
70
Q

What does a transaction model assume?

A
  • R/W order is unchanged when being processed
  • Transaction assumed to be serial and correct in its totality
71
Q

What does a Transaction processing system do?

A

Applies and processes transactions over a DB. Must push for the highest level of transaction throughput.

72
Q

What makes a good TP system?

A

Generates good interleaving or avoids bad interleaving. We want good results without having to know what each transaction is up to. Each update activity broken down into a primitive R/W operation.

73
Q

Comment on recoverable schedules.

A

Straightforward recovery process:
- Recoverable
- Avoids cascading rollback
- Strict Schedule

74
Q

What is the serializability theory?

A

A schedule that executes transactions in their totality is serial. Assume serial schedules of independent transactions are correct. A schedule of N transactions is serializable if equivalent to a serial schedule of the same N transactions.

75
Q

What is a precedence graph and what does it do?

A

Directed graph in which nodes represent the transactions in the schedule and the directed edges represent conflicting operations between two transactions.

1) Looks at only r(x) or w(x) ops.
2) Constructs precedence/serialization graph
3) Edge Ti -> Tj created if an operation of Ti appears before a conflicting operation of Tj
4) Schedule serializable if and only if precedence graph has no cycles.

Topological sort extracts a serial schedule.
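
The procedure can be sketched in Python. The schedule encoding (a list of `(transaction, op, item)` tuples) and the function names are assumptions of this example; two operations are treated as conflicting when they come from different transactions, touch the same item, and at least one is a write.

```python
def precedence_edges(schedule):
    """Return edges (Ti, Tj): an op of Ti precedes a conflicting op of Tj."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges

def serial_order(schedule):
    """Topological sort of the graph; returns None if the graph has a cycle."""
    nodes = []
    for t, _, _ in schedule:
        if t not in nodes:
            nodes.append(t)
    edges = precedence_edges(schedule)
    indegree = {t: 0 for t in nodes}
    for _, after in edges:
        indegree[after] += 1
    ready = [t for t in nodes if indegree[t] == 0]
    order = []
    while ready:
        t = ready.pop(0)
        order.append(t)
        for before, after in edges:
            if before == t:
                indegree[after] -= 1
                if indegree[after] == 0:
                    ready.append(after)
    return order if len(order) == len(nodes) else None

# T1 finishes with x before T2 touches it: serializable as T1, T2.
good = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
# T2 writes x between T1's read and write: edges in both directions, a cycle.
bad = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x")]

good_order = serial_order(good)
bad_order = serial_order(bad)
```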

76
Q

What are the conditions for view equivalence?

A
  • Corresponding read operations in each schedule return the same values
  • Both schedules must produce the same final DB state (the last write on each item must be the same)
77
Q

What’s the difference between Monolith and Microservices?

A

Monolith is one large service. Microservices are multiple small systems collated into a large one.

78
Q

What can arise when multiple clients work with the same data?

A

Stale Data: Data which has changed since being retrieved by the current process.

Pessimistic Locking: Resource is locked as soon as it is accessed and released as soon as all intended changes are committed. Prevents conflicts.

Optimistic Locking: Resources can be read and changed freely (assumed changes won’t conflict). Then check for conflict when committing changed result and act according to specified conflict resolution protocol. Avoids overhead of locking resource for a long period of time.
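
Optimistic locking is often done with a version number per record; the sketch below assumes that approach. `VersionedStore` and `StaleDataError` are hypothetical names, and rejecting the second writer is just one possible conflict resolution protocol.

```python
class StaleDataError(Exception):
    """Raised when a write is based on data that has changed since it was read."""

class VersionedStore:
    def __init__(self):
        self._rows = {}                        # key -> (value, version)

    def read(self, key):
        """Reads never lock; they return the value together with its version."""
        return self._rows.get(key, (None, 0))

    def write(self, key, new_value, expected_version):
        """Commit only if nobody else changed the row since it was read."""
        _, current_version = self._rows.get(key, (None, 0))
        if current_version != expected_version:
            raise StaleDataError(f"{key} changed since version {expected_version}")
        self._rows[key] = (new_value, current_version + 1)

store = VersionedStore()
store.write("stock", 10, expected_version=0)

# Two clients read the same version of the row.
value_a, version_a = store.read("stock")
value_b, version_b = store.read("stock")

store.write("stock", value_a - 1, version_a)   # first writer succeeds
try:
    store.write("stock", value_b - 1, version_b)
    conflict_detected = False
except StaleDataError:
    conflict_detected = True                   # second writer held stale data
```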

79
Q

Java RMI abstractions

A
  • Remote interfaces (allows remote invocation)
  • Stubs and Skeletons
    • Use proxy for the remote object on the client, and skeleton on the server side to receive incoming method calls, making remote method calls appear as if they were local method calls.
  • Serialization - Remote passing of objects and data without needing to serialize or deserialize it manually.
  • Naming Service - Use a registry to locate the objects, simplify look-ups
80
Q

What is transparency

A

Refers to the separation of the higher level semantics of a system from the lower level implementation issues.

81
Q

Explain 3 different types of transparency.

A

Network transparency - Hide existence of network from end user. Isolation from network artefact’s implementation details.

Location Transparency - The process is independent of the processor that executes it.

Naming Transparency - Each DB object has a unique name.

Fragmentation Transparency - Queries must be broadcast to all fragments and results collated.

Replication Transparency - Should end users be aware that the DB uses replication?

82
Q

How can we provide for transparency?

A
  • Access Layer for data resources
  • Operating system for network resources
  • DDBMS takes role of DBMS, OS etc.
83
Q

Explain the ANSI/SPARC Model.

A

Layers:

  • End User
  • External Level (Interacts with users and represents data)
  • Global Conceptual Schema (To link local conceptual schemas)
  • Conceptual Schema (Defines structure of DB w/o implementation details)
  • Internal Level (Deals with physical storage and devices. Low Level Details)
84
Q

2 Advantages of ANSI/SPARC

A

Data independence
Modularity & Flexibility

85
Q

2 Disadvantages of ANSI/SPARC

A

Complexity & Overhead
Low Performance

86
Q

What are the architectural components of a DDBMS?

A
  • User Processor
  • Data Processor
87
Q

Name some global directory issues.

A

Global vs Local
Central vs Distributed
Single vs Multiple copies

88
Q

RPO

A

Recovery Point Objective (maximum tolerable data loss, measured as the time between the last recoverable point and the start of data loss)

89
Q

RTO

A

Recovery Time Objective (maximum tolerable time from the failure until data is available again/recovered)

90
Q

Consensus Issue

A

When all data servers present agree on a value, we have a consensus

91
Q

Paxos

A

Family of consensus algorithms. Asynchronous. Totally orders transactions across data servers by consensus. Allows selection and appointment of a leader by consensus.

92
Q

Characteristics of Paxos

A

Accommodate a certain level of failures (some units). Messages can be lost, delayed, re-received but never corrupted.

93
Q

Requirements for Paxos

A

Single Replica Semantics, Data Consistency between values, Progress expectation

94
Q

Transactions outcomes

A

Either commit or rollback.

95
Q

Is 2PC fault tolerant and why?

A

No, because all participants are blocked when the coordinator is blocked or fails. It is a consensus broker.
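
A bare-bones sketch of the two phases, with the caveat that it models only the happy path: there are no timeouts, logging or failures, and `two_phase_commit` with its vote callbacks is a hypothetical interface. In a real run, every participant that voted yes has to wait for the coordinator's decision, which is where the blocking comes from.

```python
def two_phase_commit(participants):
    """participants: mapping of name -> vote function returning True (yes) or False (no)."""
    # Phase 1 (prepare): the coordinator collects a vote from every participant.
    votes = {name: vote() for name, vote in participants.items()}
    # Phase 2 (decide): commit only if everyone voted yes, otherwise abort.
    decision = "commit" if all(votes.values()) else "abort"
    return decision, votes

all_yes, _ = two_phase_commit({"site_a": lambda: True, "site_b": lambda: True})
one_no, votes = two_phase_commit({"site_a": lambda: True, "site_b": lambda: False})
```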