2.2 Back of the Envelope Flashcards

2
Q

What is the exact value of 2^10?

A

1,024

1,024 bytes is also referred to as 1 KB (kilobyte); it is approximately 1 thousand.

3
Q

What is the exact value of 2^20?

A

1,048,576

1,048,576 bytes is also known as 1 MB (megabyte); it is approximately 1 million.

4
Q

What is the approximate memory size of 2^30?

A

1 GB

1 GB (gigabyte) here is 2^30 = 1,073,741,824 bytes, or approximately 1 billion bytes.
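
The three power-of-two cards above can be checked with a few lines of Python. This is a minimal sketch; the names `POWERS` and `power_of_two_table` are made up for illustration. It shows how far each exact value drifts from its round decimal approximation.

```python
# Exponents from the cards above, with the decimal name each one approximates.
POWERS = {10: "thousand (KB)", 20: "million (MB)", 30: "billion (GB)"}


def power_of_two_table():
    """Return (exponent, exact value, relative error vs. the decimal approximation)."""
    rows = []
    for exp in POWERS:
        exact = 2 ** exp
        approx = 10 ** (3 * exp // 10)  # 2^10 ~ 10^3, 2^20 ~ 10^6, 2^30 ~ 10^9
        rows.append((exp, exact, (exact - approx) / approx))
    return rows


for exp, exact, err in power_of_two_table():
    print(f"2^{exp} = {exact:,} ~ 1 {POWERS[exp]}, off by {err:.1%}")
```

The error grows from about 2% at 2^10 to about 7% at 2^30, which is small enough to ignore in a quick estimate.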

5
Q

What is the latency of an L1 cache reference?

A

0.5 ns

6
Q

True or False: The latency of a mutex lock/unlock is 25 ns.

A

False

The latency of a mutex lock/unlock is actually 100 ns.

7
Q

What is the latency for reading 1 MB sequentially from SSD?

A

1,000,000 ns (1 ms)

This corresponds to a throughput of approximately 1 GB/s for SSD.
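
The conversion from "time to read 1 MB" to a throughput figure is plain arithmetic. The sketch below uses a hypothetical helper, `throughput_mb_per_s`, and treats 1 GB as 1,000 MB, which is accurate enough for an estimate.

```python
def throughput_mb_per_s(megabytes, latency_ns):
    """Convert 'latency_ns to read this many MB' into MB per second."""
    seconds = latency_ns / 1e9
    return megabytes / seconds


# Figure from the card above: 1 MB read sequentially from SSD in 1,000,000 ns.
ssd = throughput_mb_per_s(1, 1_000_000)
print(f"SSD sequential read: about {ssd:,.0f} MB/s")
```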

8
Q

What happens during a typical first year for a new cluster?

A

Various hardware failures and maintenance events

This includes overheating, PDU failures, rack moves, network rewiring, and individual machine failures.

9
Q

What is a key characteristic of Google’s computing environment?

A

Large clusters of commodity PCs

10
Q

What are the challenges faced in the Google engineering environment?

A

Need for distributed systems, high capacity systems, and careful design

This is due to data or request volumes that exceed the capability of a single machine.

11
Q

What is the significance of designing software interfaces carefully?

A

Ensures flexibility and usability for other hypothetical clients

12
Q

What is Protocol Buffers?

A

A protocol description language developed and used by Google

It is self-describing, supports multiple languages, and allows for efficient encoding/decoding.

13
Q

What does ‘self-describing’ refer to in the context of protocols?

A

The ability of a protocol to provide its own structure and meaning

14
Q

What is the estimated time to generate an image results page with 30 thumbnails using a serial read design?

A

560 ms

15
Q

What is the estimated time to generate an image results page with 30 thumbnails using a parallel read design?

A

18 ms

This is a theoretical estimate that ignores variance.
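
Both thumbnail estimates can be reproduced with a short calculation. The cards do not state the inputs, so the values below are assumptions chosen to match the answers: a 10 ms disk seek, 256 KB per thumbnail, and a 30 MB/s sequential read rate. The function names are illustrative only.

```python
# All three constants are assumptions, not figures stated on the cards.
SEEK_MS = 10.0             # assumed time for one disk seek
IMAGE_BYTES = 256_000      # assumed thumbnail size (256 KB)
DISK_BYTES_PER_S = 30e6    # assumed sequential read rate (30 MB/s)
N_IMAGES = 30


def read_ms(n_bytes):
    """Time to stream n_bytes from disk once the head is in position."""
    return n_bytes / DISK_BYTES_PER_S * 1000.0


def serial_ms(n_images=N_IMAGES):
    """Read thumbnails one after another: every image pays a seek plus a read."""
    return n_images * (SEEK_MS + read_ms(IMAGE_BYTES))


def parallel_ms():
    """Issue all reads at once: total time is one seek plus one read."""
    return SEEK_MS + read_ms(IMAGE_BYTES)


print(f"serial:   {serial_ms():.0f} ms")
print(f"parallel: {parallel_ms():.1f} ms")
```

Under these assumptions the serial design comes out near 560 ms and the parallel design near 18 ms. The parallel figure assumes the 30 reads land on different disks and ignores variance between them.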

16
Q

What is the expected time to quicksort 1 GB of 4 byte numbers?

A

Approximately 30 seconds
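
One way to arrive at a figure of this size is to add up branch-misprediction cost and memory-bandwidth cost. The sketch below is a rough model, not a benchmark. It assumes a 5 ns penalty per mispredicted branch, that about half of all comparisons mispredict, and about 4 GB/s of memory bandwidth; none of these inputs appear on the card.

```python
import math

TOTAL_BYTES = 2 ** 30      # 1 GB of data, from the card
ITEM_BYTES = 4             # 4-byte numbers, from the card
MISPREDICT_NS = 5.0        # assumed cost of one branch mispredict
MEM_BYTES_PER_S = 4e9      # assumed memory bandwidth


def quicksort_estimate_s():
    """Return (branch_seconds, memory_seconds) for sorting the whole array."""
    n = TOTAL_BYTES // ITEM_BYTES          # 2^28 numbers
    passes = math.log2(n)                  # about 28 passes over the data
    comparisons = n * passes
    mispredicts = comparisons / 2          # assume half the comparisons mispredict
    branch_s = mispredicts * MISPREDICT_NS / 1e9
    memory_s = passes * TOTAL_BYTES / MEM_BYTES_PER_S
    return branch_s, memory_s


branch_s, memory_s = quicksort_estimate_s()
print(f"branch mispredicts: {branch_s:.0f} s, memory traffic: {memory_s:.0f} s")
```

With these inputs the two terms sum to somewhat under 30 seconds, which is the same order as the card's answer. The split suggests that mispredicted branches, not memory traffic, dominate.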

17
Q

What are microbenchmarks used for?

A

To understand performance and reduce cycle time for performance testing

18
Q

What is the latency for reading 1 MB sequentially from disk?

A

30,000,000 ns (30 ms)

19
Q

What does the term ‘back-of-the-envelope’ calculations refer to?

A

Quick estimates to assess feasibility or performance

20
Q

Fill in the blank: The round-trip latency for sending a packet from CA to the Netherlands and back to CA is _______.

A

150,000,000 ns (150 ms)

21
Q

What basic building blocks are mentioned?

A

Core language libraries, basic data structures, SSTables, protocol buffers, GFS, BigTable, indexing systems, MySQL, MapReduce

These are fundamental components for building software systems.

22
Q

Why is it important to understand the implementations of core libraries?

A

Understanding implementations enables effective back-of-the-envelope calculations

Without this knowledge, one cannot make informed decisions or estimates.

23
Q

What should you do when building infrastructure?

A

Identify common problems and build software systems to address them in a general way

Avoid trying to satisfy every client demand to maintain system simplicity.

24
Q

True or False: Building infrastructure should be done for its own sake.

A

False

Infrastructure should be built to address real needs, not hypothetical ones.

25
Q

What does ‘Design for Growth’ entail?

A

Anticipate how requirements will evolve and ensure design works if scale changes by 10X or 20X

Considerations for larger scales may differ from smaller ones.

26
Q

What is a key goal when designing for low latency?

A

Aim for low average times, and also track 90th- and 99th-percentile latencies

User experience can significantly degrade with high latency.
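
The difference between the average and the tail is easy to see on a small sample. The sketch below uses the nearest-rank definition of a percentile, which is one of several common definitions. The latency values are invented for illustration.

```python
def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


# Invented sample: most requests are fast, a few are very slow.
latencies_ms = [10] * 85 + [50] * 10 + [200] * 4 + [1000]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p90_ms = percentile(latencies_ms, 90)
p99_ms = percentile(latencies_ms, 99)
print(f"mean={mean_ms} ms, p90={p90_ms} ms, p99={p99_ms} ms")
```

Here the mean is 31.5 ms while the 99th percentile is 200 ms, so a service can look healthy on average while some users wait much longer.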

27
Q

What consistency model are most products with mutable state gravitating towards?

A

Eventual consistency

This model offers better availability than strong consistency.

28
Q

What is the significance of using threads in applications?

A

Using threads can help improve both throughput and latency

Modern machines have multiple cores, making parallelization crucial.

29
Q

What should you consider regarding data access?

A

Disks: seeks, sequential reads; Memory: caches, branch predictors

Understanding these aspects is crucial for optimizing performance.

30
Q

Fill in the blank: Compression is a very important aspect of many systems, especially in _______.

A

inverted index posting list formats, storage systems for persistent data

Compression helps reduce storage requirements and improve performance.
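
As an illustration of posting-list compression, the sketch below stores the gaps between sorted document IDs as variable-length bytes. This is one widely used technique, not necessarily the format the card has in mind. The function names are hypothetical, and the code assumes the IDs are non-negative and sorted in increasing order.

```python
def encode_postings(doc_ids):
    """Delta-encode sorted doc IDs, writing each gap as a varint (7 bits per byte)."""
    out = bytearray()
    prev = 0
    for doc_id in doc_ids:
        gap = doc_id - prev
        prev = doc_id
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)  # high bit set: more bytes follow
            gap >>= 7
        out.append(gap)                      # final byte has the high bit clear
    return bytes(out)


def decode_postings(data):
    """Inverse of encode_postings: rebuild the sorted doc IDs."""
    doc_ids, current, gap, shift = [], 0, 0, 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += gap
            doc_ids.append(current)
            gap, shift = 0, 0
    return doc_ids


ids = [3, 7, 300, 301, 70000]
packed = encode_postings(ids)
print(len(packed), "bytes instead of", 4 * len(ids))
```

Small gaps fit in one byte each, so dense posting lists shrink well below the 4 bytes per ID that a fixed-width layout needs.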

31
Q

What are ‘canary requests’ used for?

A

To detect issues in applications and improve robustness against failures

They help identify problems before affecting a large number of users.
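
The card does not spell out the mechanism, so the following is a sketch of one common reading: send the request to a single server first, and fan out to the rest only if that first call succeeds. `fan_out_with_canary` and the `send` callback are hypothetical names, not part of any real RPC library.

```python
def fan_out_with_canary(servers, send):
    """Try the request on one server first; contact the others only if it succeeds.

    `send` is any callable taking a server name and returning a response.
    """
    if not servers:
        return []
    canary, rest = servers[0], servers[1:]
    try:
        first = send(canary)
    except Exception as exc:
        # A request that crashes the canary never reaches the other servers.
        raise RuntimeError(f"canary failed on {canary}; request not fanned out") from exc
    return [first] + [send(server) for server in rest]


responses = fan_out_with_canary(["s1", "s2", "s3"], lambda server: f"ok from {server}")
print(responses)
```

The cost is one extra round trip of latency on every request, in exchange for limiting the damage a single bad request can do.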

32
Q

What is essential for monitoring and debugging applications?

A

Exporting HTML-based status pages and key-value pairs via a standard interface

This allows for easy diagnosis and monitoring of system health.

33
Q

What is the source code philosophy at Google?

A

Google has one large shared source base with core libraries and application-specific code

This promotes code reuse and improvements across applications.

34
Q

What are some benefits of code reviews and design reviews?

A

Improve code quality, catch issues early, and facilitate knowledge sharing

They are part of good software engineering hygiene.

35
Q

What challenges arise from multi-site software engineering?

A

Increased coordination needs, communication difficulties, and trust establishment between teams

These challenges necessitate effective collaboration tools and practices.

36
Q

What motivates Google’s expansion to multiple engineering sites?

A

To hire the best candidates regardless of geographic location

This strategy enhances talent acquisition and diversity.

37
Q

What is a service-based model for software development?

A

A model that allows fluid changes, easy testing, and enables small teams to accomplish a lot

It promotes agility and responsiveness in software development.

38
Q

What impact does work at Google have?

A

Work has a very large impact, affecting hundreds of millions of users every month

This scale of influence can be highly motivating for engineers.