exam_2017 Flashcards

1
Q

Using examples from NFS, explain why distribution transparency and failure transparency are impossible or hard to achieve

A

Distribution transparency: a user can’t tell whether the NFS server is down or the network is badly congested, because a dead process is indistinguishable from a slow one. These communication latencies can’t be hidden.
Failure transparency: if the NFS server crashes, a user can’t tell whether it performed the operation before crashing.

2
Q

What are scaling techniques?

A
  • Bigger Machines
  • Virtualisation
  • Asynchronous communication
  • Replication & Caching
  • Partitioning
  • Software Optimisation
3
Q

A mathematical proof states that reliable failure detection is impossible. Why then is it possible to do reliable failure detection in practice?

A
  • By using a consensus protocol, in which agreement is reached once a selected leader gets accept messages from a quorum of servers. Two examples (fail-noisy methods) are Paxos, which is used by Google, and the more understandable Raft, whose safety has been formally proven.
  • By agreeing on a certain amount of time after which an unresponsive server is considered down.
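The timeout agreement in the second point can be sketched as a heartbeat-based detector. This is a minimal illustration, not the implementation of any particular system: the class name `TimeoutFailureDetector`, the `timeout` parameter and the explicit `now` arguments are assumptions made for the example.

```python
class TimeoutFailureDetector:
    """Suspects a server once no heartbeat arrived within `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # server name -> time of last heartbeat

    def heartbeat(self, server, now):
        # Record that `server` was alive at time `now`.
        self.last_seen[server] = now

    def is_suspected(self, server, now):
        # A server that was never heard from, or that has been silent for
        # longer than the agreed timeout, is considered down.
        # It may in fact only be slow, which is why this is a suspicion.
        last = self.last_seen.get(server)
        return last is None or now - last > self.timeout


detector = TimeoutFailureDetector(timeout=5.0)
detector.heartbeat("server-a", now=0.0)
print(detector.is_suspected("server-a", now=3.0))  # False
print(detector.is_suspected("server-a", now=6.0))  # True
```

Passing `now` explicitly keeps the sketch deterministic; a real detector would read a clock instead.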
4
Q

What is consensus?

A

The process by which unreliable machines connected by asynchronous networks reach agreement on system state.

5
Q

What is Paxos trying to solve?

A

How to reach agreement on a single value in a scenario where failures might occur.

6
Q

What are Paxos stages?

A

It is essential to have a multi-stage process.

  • Promise and commit
  • Majority agreement
  • Monotonically increasing numbers
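The three ingredients above can be sketched as a toy, in-memory version of single-value Paxos. It is a simplified illustration under stated assumptions: there is no network, no crashes and no retries, and the names `Acceptor`, `prepare`, `accept` and `propose` are chosen for this example only.

```python
class Acceptor:
    def __init__(self):
        self.promised = 0     # highest proposal number promised so far
        self.accepted = None  # (number, value) of the last accepted proposal

    def prepare(self, n):
        # Promise stage: promise to ignore proposals numbered below n.
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):
        # Commit stage: accept unless a higher number was promised meanwhile.
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors, n, value):
    """Try to get a value chosen with proposal number n.

    Returns the chosen value, or None if no majority was reached.
    """
    majority = len(acceptors) // 2 + 1

    replies = [a.prepare(n) for a in acceptors]
    promises = [previous for ok, previous in replies if ok]
    if len(promises) < majority:
        return None

    # If some acceptor already accepted a value, the proposer must adopt
    # the one with the highest proposal number instead of its own value.
    earlier = [p for p in promises if p is not None]
    if earlier:
        value = max(earlier, key=lambda p: p[0])[1]

    accepted = sum(a.accept(n, value) for a in acceptors)
    return value if accepted >= majority else None


acceptors = [Acceptor() for _ in range(3)]
print(propose(acceptors, 1, "A"))  # A
print(propose(acceptors, 2, "B"))  # A (the earlier choice is preserved)
print(propose(acceptors, 1, "C"))  # None (stale proposal number)
```

The second call shows why proposal numbers must increase monotonically: a later proposer with a higher number still ends up with the value that was already chosen.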
7
Q

Why is reliable failure detection important for consensus in a process group?

A

To achieve overall system reliability in the presence of a number of faulty processes. Without it, a process may wait forever for a response from a process that has failed.

8
Q

How does asynchronous communication help to build large systems? Give an example.

A

Asynchronous communication helps because systems don’t have to wait for each other while bits are sent over the line, so the sender can continue with other work. A start and stop bit let the receiver know that the information is complete. Downloading or sending files and sending emails are examples of asynchronous communication.
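The "not waiting on each other" idea can be illustrated with Python's `asyncio`. This is only a sketch: `fetch` is a hypothetical stand-in for a remote call, and `asyncio.sleep` simulates network latency.

```python
import asyncio


async def fetch(name, delay):
    # Stand-in for a remote call; the sleep simulates network latency.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # Both calls are in flight at the same time, so the total time is
    # roughly that of the slowest call rather than the sum of both.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())
print(results)  # ['a done', 'b done']
```

`asyncio.gather` returns the results in the order the calls were passed in, regardless of which one finished first.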

9
Q

What are the 4 types of servers used for Google Search?

A
  • Root
  • Cache
  • Parent
  • Leaf
10
Q

Which scaling technique is used for each type of server in Google Search?

A

Root: software optimisation
Cache: replication/caching
Parent: partitioning
Leaf: partitioning

11
Q

What are the functions of the root and cache servers in Google Search?

A

Root: handles browser requests, acts as front-end web server
Cache: temporarily stores the results of requests

12
Q

What are the functions of the parent and leaf servers in Google Search?

A

Parent: distributes queries to its children, as in a multi-level tree
Leaf: handles index/doc requests from in-memory data structures
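The parent/leaf split can be sketched as a small scatter-gather tree. This is an illustrative toy, not Google's design: the classes `Leaf` and `Parent`, the `search` method and the sample index are assumptions made for the example.

```python
class Leaf:
    def __init__(self, index):
        # One partition of the index, held in memory: term -> doc ids.
        self.index = index

    def search(self, term):
        return self.index.get(term, [])


class Parent:
    def __init__(self, children):
        self.children = children  # leaves or further parents

    def search(self, term):
        # Forward the query to every child and merge the partial results.
        hits = []
        for child in self.children:
            hits.extend(child.search(term))
        return sorted(hits)


leaf_a = Leaf({"cat": [1, 4]})
leaf_b = Leaf({"cat": [2], "dog": [3]})
root = Parent([Parent([leaf_a]), Parent([leaf_b])])
print(root.search("cat"))  # [1, 2, 4]
```

Each leaf only holds its own partition, so adding leaves (and parents above them) is how such a tree would grow with the index.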

13
Q

What are the pros of in-memory indexing systems?

A

Big increase in throughput.

Big decrease in query latency

14
Q

What are the issues with in-memory indexing systems?

A

Variance: query touches 1000s of machines, not dozens
Availability: 1 or few replicas of each doc’s index data
Queries of death

15
Q

What are canary requests?

A

A request used to check the health status of a machine. You first send the request to one server to check whether it works. If it fails unexpectedly, try another machine (the failure could be a coincidence). If it fails K times, reject the request.
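The retry-then-reject logic can be sketched as follows. This is a minimal illustration: `canary_check`, `flaky_handler` and the server names are hypothetical, and `k` plays the role of K above.

```python
def canary_check(request, servers, handler, k):
    """Return True if `request` succeeds on a single canary server,
    False once it has failed on k different servers."""
    failures = 0
    for server in servers:
        try:
            handler(server, request)
            return True       # the canary survived the request
        except Exception:
            failures += 1     # may be a coincidence: try another machine
            if failures >= k:
                return False  # fails repeatedly: reject the request
    return False              # ran out of servers to try


def flaky_handler(server, request):
    # Hypothetical handler: server "s1" is broken, and the request "bad"
    # crashes every server it reaches.
    if request == "bad" or server == "s1":
        raise RuntimeError("crash")
    return "ok"


servers = ["s1", "s2", "s3"]
print(canary_check("good", servers, flaky_handler, k=2))  # True
print(canary_check("bad", servers, flaky_handler, k=2))   # False
```

The "good" request fails once on the broken server and then succeeds on the next one, while the "bad" request is rejected after two failures without ever reaching the third server.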

16
Q

What does the repository manager do?

A

Coordinates index switching as new shards become available

17
Q

What were the problems with the traditional Google search system?

A

There were more collections to search besides the web, for example Google Maps, and results needed to be more real-time.

18
Q

How was the index created at first?

A

It was a batch process via MapReduce.

  • Store all documents in GFS
  • Run several MapReduce jobs to create index
  • Upload index to Leaf servers
19
Q

What was the problem with the MapReduce index method?

A

New documents would not show up in search results for 2-3 days.

20
Q

What solutions replaced MapReduce?

A

Data storage system: Colossus / BigTable

Event-driven, incremental processing: Caffeine / Percolator

21
Q

What is BigTable?

A

A distributed storage system. A given table is a three-dimensional structure containing cells indexed by a row key, a column key and a timestamp. Each table may consist of many tablets. Data is typically replicated to multiple BigTable clusters in different datacenters.
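The three-dimensional cell addressing can be pictured with a toy in-memory model. This is not the BigTable API: the class `Table`, its `put` and `get` methods and the sample keys are assumptions made for this sketch.

```python
class Table:
    def __init__(self):
        # (row key, column key) -> {timestamp: value}
        self.cells = {}

    def put(self, row, column, timestamp, value):
        self.cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        versions = self.cells.get((row, column), {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)  # default to the newest version
        return versions.get(timestamp)


table = Table()
table.put("com.example/index", "contents", 1, "old page")
table.put("com.example/index", "contents", 2, "new page")
print(table.get("com.example/index", "contents"))               # new page
print(table.get("com.example/index", "contents", timestamp=1))  # old page
```

Because the timestamp is part of the cell address, older versions stay readable next to the newest one.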

22
Q

What makes BigTable scalable?

A
  1. There is no separate versioning (the timestamp is the version)
  2. Automatic resource management (less manual labor and instant resource availability)
  3. Tablets in a table are split when they get too big
  4. Different machines can handle different tablets, so the workload is divided over the available resources
23
Q

What makes caching one of the most efficient scaling techniques?

A

A couple of machines can do the work of a substantial number of machines. Caching reduces network traffic, access latency and server workload, and it makes the service more robust.

24
Q

What is the disadvantage of caching?

A

Big latency spike/capacity drop when the complete index is updated or the cache is flushed.

In some cases the data might be outdated, though there are methods to prevent this.

Cache misses increase lookup time because time has already been spent looking in the cache.

25
Q

What are the benefits of IaC?

A
  1. Disaster recovery (much quicker because complete config is stored as code)
  2. Consistency
  3. Speed (tasks can be done in parallel)
  4. Version control
  5. Risk of bugs can be reduced
  6. Cost (one person can do the work of many)
26
Q

How is IaC vital to CI/CD?

A

CI/CD was originally focused only on applications and made assumptions about the infrastructure. With IaC, the test and production environments are equal because both are built from the same definition files. Because the infrastructure is defined as code, it can be spun up automatically, and therefore every small patch can be pushed to the production environment continuously.

27
Q

What are the powers of two and the names of the matching units?

A
2^10 bytes = 1 Kilobyte
2^20 bytes = 1 Megabyte
2^30 bytes = 1 Gigabyte
2^40 bytes = 1 Terabyte
2^50 bytes = 1 Petabyte
28
Q

How many MB are in 1 petabyte?

A

2^50 / 2^20 = 2^30 = 1073741824.

A 1 followed by 9 digits (roughly 10^9)
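The arithmetic can be checked with a few lines of Python, assuming the binary definitions from the previous card (1 MB = 2^20 bytes, 1 PB = 2^50 bytes). The constant names are chosen for this example.

```python
BYTES_PER_MB = 2 ** 20
BYTES_PER_PB = 2 ** 50

mb_per_pb = BYTES_PER_PB // BYTES_PER_MB
print(mb_per_pb)             # 1073741824
print(mb_per_pb == 2 ** 30)  # True
print(len(str(mb_per_pb)))   # 10 (a leading 1 plus 9 more digits)
```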

29
Q

What are the 5 main stages within the ITIL Framework?

A
  • Service Strategy
  • Service Design
  • Service Transition
  • Service Operation
  • Continual Service Improvement
30
Q

In the DevOps philosophy, operations people should be more involved in the earlier stages of the life cycle of a service as defined by ITIL. For the first 3 stages explain where and how operations people can contribute.

A

Strategy: defining the business model
Design: What are the requirements, and what does operations need? Are we going to use a ticket system, what are the service hours and the SLA, do we need replication, etc.
Transition: Do we agree? Is this what we wanted? Since operations is closer to end users, they may select users to test the product (pre-release testing by operations)

31
Q

Name two assurances that a Change Request Board (CRB) provides.

A
  • Requested changes are assessed, prioritised and approved before they go into the live environment
  • Approved changes are managed in a rational and predictable manner, so that all changes meet the quality requirements, by enforcing change and release policies and procedures
32
Q

Give 3 arguments why DevOps can reduce the need for a Change Request Board.

A

Many aspects are being covered by DevOps:

  • Change-management processes
  • Establishing definitions/standards for rating change risks
  • Rejecting or approving change requests
  • Coordinating post-change activities
33
Q

Give 2 examples of how DevOps can make it easier to meet Launch Readiness Criteria.

A

Make sure the service is monitored, the SLA is defined, backups and restores are tested and working, and user documentation/training is complete.