Enterprise Computing Flashcards

(264 cards)

1
Q

Lecture 1 - Introduction

A
2
Q

Define the waterfall model

A

The waterfall model cascades the three fundamental activities
of the software development process so that they happen
sequentially:
Exploration -> Development -> Operation

3
Q

Why can the waterfall model be considered suboptimal?

A

Specifications often change, and a wrong interpretation of the problem may be discovered too late to correct once development is well underway. Modifying requirements late in the process causes significant rework.

4
Q

Define the Iterative/incremental Model

A

The iterative/incremental model is:
∙ iterative because the feed-forward between activities is augmented with feed-back between them, i.e. a waterfall with both forward and backward flow;

∙ incremental because the interleaved activities regularly deliver small additional pieces of functionality

5
Q

What advantage does iterative/incremental development have over waterfall?

A

The iterative model allows going back to revise earlier phases, so mistakes discovered later can still be corrected.

6
Q

Lecture 2 - Lean Cycle Evolution

A
7
Q

What is the purpose of the Lean Cycle in software production?

A

The Lean Cycle is used to apply the scientific method to software production. It emphasizes continuous feedback and learning through iterative cycles of Build-Measure-Learn or Learn-Measure-Build to adapt and improve products efficiently.

8
Q

What are the three main phases of the Lean Cycle?

A

The three main phases are:

Exploration: Identifying and testing hypotheses about the market.

Development: Building products or features based on validated hypotheses.

Operation: Delivering the product to customers and refining based on feedback.

9
Q

What happens in the “Learn” phase of the Build-Measure-Learn cycle?

A

In the Learn phase, an enterprise formulates hypotheses about the market and determines the empirical data required to validate these hypotheses.

10
Q

What is the focus of the “Measure” phase in the Build-Measure-Learn cycle?

A

The Measure phase involves testing the hypothesis by collecting empirical data, often through experiments or feedback from prototypes or early versions of the product.

11
Q

Describe the “Build” phase of the Build-Measure-Learn cycle.

A

In the Build phase, an enterprise creates a Minimum Viable Product (MVP) to test hypotheses. This MVP allows for quick iteration and feedback collection.

12
Q

What is an issue with the Build-Measure-Learn cycle?

A

Building first can incur significant costs; building without prior research into demand is risky.

13
Q

What is a better alternative to the Build-Measure-Learn cycle?

A

Reversing the cycle (Learn-Measure-Build) improves it: first learn about customer demand, then measure the market, and only then build the product.

14
Q

What is a “pivot” in the context of the Lean Cycle?

A

A pivot is a significant change in strategy without changing the vision. It occurs when empirical data suggests that the current approach isn’t working, leading to adjustments like technology changes or shifts in product focus.

15
Q

What are the five types of pivots mentioned in the Lean Cycle?

A

Technology Pivot: Switching to a more efficient technology.

Zoom-In Pivot: Turning a product feature into the main product.

Zoom-Out Pivot: Making a product part of a larger product suite.

Customer Segment Pivot: Targeting a different customer group.

Customer Need Pivot: Addressing a different but more critical problem.

16
Q

Provide an example of a technology pivot.

A

Microsoft shifted from selling standalone Office software to a subscription-based cloud service with Microsoft 365, improving value delivery.

17
Q

Explain a Zoom-In Pivot with an example.

A

A Zoom-In Pivot occurs when a product feature becomes the main product. Example: Flickr started as a multiplayer game but pivoted to focus on its photo-sharing feature, which gained popularity.

18
Q

What is a Zoom-Out Pivot? Provide an example

A

A Zoom-Out Pivot happens when a product becomes part of a larger offering. Example: DotCloud transitioned to manage Docker containers, focusing on application mobility across clouds.

19
Q

What is a Customer Segment Pivot?

A

It occurs when a product addresses a different customer group than initially intended. Example: YouTube started as a dating platform but shifted to a general video-sharing platform.

20
Q

Define a Customer Need Pivot with an example.

A

This pivot addresses a more critical customer problem. Example: Twitter evolved from a podcasting platform to a microblogging SMS-based social network after its initial model was rendered obsolete.

21
Q

What was the pivot that led to the success of Instagram?

A

Instagram started as a location-based app with multiple features. It pivoted to focus solely on photo sharing, simplifying user experience and achieving massive success.

22
Q

How did Netflix pivot to achieve its current model?

A

Netflix transitioned from a mail-order DVD rental service to a streaming platform, allowing instant access to films and TV shows, disrupting traditional rental models.

23
Q

What are the four types of MVPs mentioned?

A

Concierge MVP: Personalized service with customer awareness.

Wizard of Oz MVP: Simulated functionality without customer awareness.

Landing Page MVP: Testing interest via a promotional webpage.

Video MVP: Demonstrating a concept through a video.

24
Q

Describe a Concierge MVP and its use case.

A

A Concierge MVP involves hands-on interaction with customers to refine the product concept. It’s used when the solution hypothesis is unclear and customer feedback is critical.

25
What distinguishes a Wizard of Oz MVP?
In a Wizard of Oz MVP, the product’s functionality is manually simulated without the customer’s knowledge. It’s used to validate a clear solution hypothesis while minimizing development effort.
26
How does a Landing Page MVP test product ideas?
A Landing Page MVP uses a webpage to gauge interest in a product idea. Visitors can sign up or pledge support, providing data on demand before product development.
27
What was the MVP for Dropbox?
Dropbox used a Video MVP, creating a simple video to demonstrate its functionality. This validated interest and attracted early adopters without building a full product.
28
How did Airbnb use an MVP approach?
Airbnb’s MVP was a basic website showcasing their apartment space. They manually managed bookings, testing the concept of renting personal spaces to travelers.
29
Why is iterative learning essential in the Lean Cycle?
Iterative learning allows for continuous improvement by validating assumptions, minimizing waste, and adapting to market needs efficiently, ensuring the product aligns with customer demands.
30
Lecture 3
31
According to Eric Ries, what type of experiment is a startup?
A startup is a human experiment designed to create a new product or service under conditions of extreme uncertainty. The goal is to test hypotheses about customer needs and product-market fit.
32
According to Eric Ries, what is the biggest waste that product development faces today?
The biggest waste in product development is building products that nobody wants. This waste occurs due to a lack of understanding of customer needs before development begins.
33
What does Eric Ries describe as the universal constant of all successful startups?
Continuous learning is the universal constant of all successful startups. Startups must iteratively test and refine their ideas based on customer feedback and market demands, pivoting where necessary while staying grounded in what has already been learned.
34
Is agile development suitable for startups, according to Eric Ries?
Agile development is suitable for startups as it emphasizes iterative progress, flexibility, and adapting to changes, which aligns with the startup’s need to rapidly respond to feedback and refine their product.
35
What is a situation in which agile development is not suitable?
Agile is less suitable for safety-critical systems, where the product must be released in a state with no errors; releasing with errors can cause serious harm.
36
What is validated learning in the context of a startup?
Validated learning is the process of demonstrating empirically that a team has discovered valuable truths about a startup’s present and future business prospects. It uses metrics derived from customer behavior rather than opinions or assumptions.
37
What does Eric Ries suggest should be included in the first version of a product?
The first version of a product, or Minimum Viable Product (MVP), should include only the core functionalities required to test key assumptions about customer needs and product viability. The MVP is designed to gather feedback with minimal resources.
38
39
According to Ries, what is the most important thing for startups in terms of cycle time?
Reducing the cycle time, i.e. the time taken to get around the Build-Measure-Learn loop, is the most important thing.
40
According to Eric Ries, what should the heuristic be for any kind of startup advice?
The heuristic for startup advice is that it should be actionable, testable, and tied to specific customer or market contexts. Generic advice should be avoided in favor of insights that can drive specific experiments.
41
What are actionable metrics, and why are they important for startups?
Actionable metrics are specific, clear, and tied to decision-making. They enable startups to evaluate the effectiveness of their strategies and experiments. Unlike vanity metrics, actionable metrics directly inform whether the business is on the right path.
42
How does Lean Startup methodology reduce waste in development?
By focusing on building MVPs, testing hypotheses, and using validated learning, the Lean Startup methodology ensures resources are allocated to features and products that meet actual customer needs, reducing waste.
43
Why is customer feedback crucial in Lean methodology?
Customer feedback is crucial because it provides real-world insights into user needs, helping startups pivot or persevere based on evidence rather than assumptions.
44
What role does experimentation play in Lean Cycle Evolution? What does it enable?
Experimentation allows startups to test hypotheses about products, customers, and markets under real-world conditions, enabling informed decision-making and minimizing risks associated with uncertainty.
45
Lecture 4 - Monoliths
46
What is a monolithic system?
A monolithic system is a single program run by a single process, formed from a collection of modules that communicate via procedure calls.
47
Why is it easy to develop a monolithic system initially?
It is easy because all the code is in one language and one place, allowing the development team to utilize their existing programming skills, tools, and experience effectively.
48
What makes testing a monolithic system straightforward?
Testing is straightforward because the system is built as a single executable, enabling automatic testing with a suite of tests and easy debugging when issues arise.
49
How is scaling achieved in a monolithic system?
Scaling in a monolithic system is done through vertical scaling, which involves upgrading to a more powerful machine to run the system better.
50
What are the benefits of using a centralized database in a monolithic system?
Centralized databases ensure data accuracy, completeness, and consistency, which simplifies maintenance and management.
51
What are the two possible outcomes of a database transaction?
A transaction can either:
Commit - complete successfully, moving the database to a new consistent state.
Abort - complete unsuccessfully, restoring the database to its previous consistent state.
52
What does the acronym ACID stand for in the context of database transactions?
ACID stands for:
Atomicity
Consistency
Isolation
Durability
53
What is the atomicity property of a transaction?
Atomicity ensures that transactions either fully succeed or fully fail, even in the presence of system failures.
54
What is the consistency property of a transaction?
Consistency ensures that a transaction takes the database from one consistent state to another.
55
What does the isolation property guarantee in database transactions?
Isolation ensures that the effects of concurrent transactions are the same as if the transactions were performed sequentially.
56
What does the durability property guarantee in database transactions?
Durability guarantees that once a transaction is successful, its effects persist, even in the event of system failures.
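The commit/abort behaviour described in the cards above can be sketched with SQLite's transaction support. This is a minimal illustration, not course material; the table and values are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

# A transfer as one atomic transaction: both updates commit, or neither does.
try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 30 WHERE name = 'bob'")
        raise RuntimeError("simulated failure")  # forces an abort
except RuntimeError:
    pass

# The abort restored the previous consistent state (atomicity + consistency).
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # {'alice': 100, 'bob': 0}
```

Removing the simulated failure would let the `with` block commit, and durability would then guarantee the new balances survive a crash.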
57
How does Fred Brooks’ observation to "plan to throw one away" apply to monolithic MVP development?
This observation suggests that teams should anticipate that their first version of a monolithic MVP might need to be discarded and rebuilt to incorporate lessons learned during its development.
58
What is Gall’s Law, and how does it relate to monolithic MVPs?
Gall’s Law states that a complex system that works evolves from a simple system that worked. For monolithic MVPs, it implies starting with a simpler design that can be refined over time.
59
What does the "You Aren’t Gonna Need It" (YAGNI) principle advocate in Extreme Programming?
YAGNI advises against implementing features until they are actually needed, emphasizing simplicity in monolithic MVPs to avoid unnecessary complexity.
60
What is Conway’s Law and how does it apply to monolithic MVP development?
Conway’s Law states that a system’s design mirrors the structure of the organization that created it. For monolithic MVPs, this means the design will reflect the communication structure of the team.
61
Why might it be better to reverse Conway's Law?
Structuring the organization so that there is a team for each desired module makes the organization and the software architecture congruent.
62
What does the phrase "eating your own dogfood" mean in the context of monolithic MVPs?
"Eating your own dogfood" means that development teams should use their own product to identify and address issues, ensuring its quality and usability.
63
What are some advantages of monolithic systems?
Advantages include simplicity in development and testing, centralized data management, and easier debugging due to having all code in one place.
64
What are the primary limitations of monolithic systems?
Limitations include difficulty scaling horizontally, potential for large and complex codebases, and challenges in adapting to changes or integrating new technologies.
65
What is vertical scaling, and why is it a common approach in monolithic systems?
Vertical scaling involves upgrading the hardware of a single machine to improve performance. It’s common in monolithic systems because they run as a single process that benefits from more powerful hardware.
66
Lecture 6 - Microservices
67
What is a microservices system?
A microservices system consists of multiple programs running as independent processes that communicate by sending messages over a network.
68
What is Representational State Transfer (REST)?
REST is a conventional way of using the HyperText Transfer Protocol (HTTP); microservices use it to communicate and to expose resources.
69
What are the different types of resources in RESTful microservices?
Document - File-like resources managed using GET (read), PUT (update), and DELETE (delete).
Controller - External resources that execute tasks using POST.
Collection - Directory-like resources where GET lists items and POST creates a new one with an invented name.
Store - Similar to a collection, but PUT creates new resources with a given name.
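The four resource styles can be sketched as a toy in-memory dispatcher. This is only an illustration of the verb semantics; the function and resource names are invented, and a real service would sit behind an HTTP server:

```python
import itertools

docs = {}                 # backing state for documents, stores, collections
ids = itertools.count(1)  # server-side name generator for collections

def handle(method, kind, name=None, body=None):
    if kind == "collection":
        if method == "GET":            # GET lists the items
            return sorted(docs)
        if method == "POST":           # POST creates with an invented name
            new = f"item{next(ids)}"
            docs[new] = body
            return new
    if kind in ("document", "store"):
        if method == "PUT":            # store: client supplies the name
            docs[name] = body
            return name
        if method == "GET":            # document: read
            return docs[name]
        if method == "DELETE":         # document: delete
            return docs.pop(name)
    if kind == "controller" and method == "POST":
        return f"executed {name}"      # controller executes a task

print(handle("POST", "collection", body={"x": 1}))   # item1
print(handle("PUT", "store", "config", {"y": 2}))    # config
print(handle("GET", "collection"))                   # ['config', 'item1']
```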
70
What is a distributed database in microservices?
A distributed database is a system where data is stored across multiple databases, each accessed by individual microservices. Ensuring accuracy, completeness, and consistency in such a setup is challenging.
71
What are the two main approaches to managing distributed transactions?
Two-phase commit
Sagas
72
How does a two-phase commit work?
The coordinator asks each participant to vote on committing a change. If all vote to commit, the change is committed; otherwise, all participants abort. Each participant holds locks on its data until the decision is made. If a participant does not vote within a certain timeframe, the vote times out and is counted as negative.
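The vote-then-decide protocol above can be sketched as follows. This is a minimal single-process illustration; the class and method names are invented, and a real implementation would deal with network messages, persistence, and locking:

```python
# Hedged sketch of a two-phase commit coordinator.
def two_phase_commit(participants):
    # Phase 1: collect votes; an exception (e.g. a timeout) counts as "no".
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())  # participant locks its data and votes
        except Exception:
            votes.append(False)
    # Phase 2: commit only if every vote was "yes", otherwise abort everywhere.
    decision = all(votes)
    for p in participants:
        p.commit() if decision else p.abort()
    return decision

class Participant:
    def __init__(self, vote=True):
        self.vote, self.state = vote, "idle"
    def prepare(self):
        return self.vote
    def commit(self):
        self.state = "committed"
    def abort(self):
        self.state = "aborted"

a, b = Participant(), Participant(vote=False)    # b votes no
print(two_phase_commit([a, b]), a.state, b.state)  # False aborted aborted
```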
73
How does the saga pattern manage distributed transactions?
A saga executes a sequence of transactions, committing or aborting each individually. If a failure occurs, compensating transactions undo previous commits. This approach sacrifices atomicity and relies on eventual consistency.
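The saga described above can be sketched as a list of (action, compensation) pairs. This is an invented illustration of the pattern, not a production implementation; real sagas would persist progress and run across services:

```python
# Hedged sketch of the saga pattern: run local transactions in order and,
# on failure, run compensating transactions for completed steps in reverse.
log = []

def reserve_stock():  log.append("reserve stock")
def release_stock():  log.append("release stock")
def charge_card():    log.append("charge card")
def refund_card():    log.append("refund card")
def ship_order():     raise RuntimeError("shipping failed")

def run_saga(steps):
    compensations = []
    for action, compensate in steps:
        try:
            action()
            compensations.append(compensate)
        except Exception:
            for undo in reversed(compensations):
                undo()  # undo what already committed (eventual consistency)
            return False
    return True

ok = run_saga([(reserve_stock, release_stock),
               (charge_card, refund_card),
               (ship_order, lambda: None)])
print(ok, log)
# False ['reserve stock', 'charge card', 'refund card', 'release stock']
```

Note that atomicity is sacrificed: the intermediate states were visible before the compensations ran.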
74
Why do we want to use Sagas?
In a two-phase commit, services must hold locks until the coordinator decides; if one service crashes, everyone may be stuck waiting. Sagas avoid this:
Non-blocking - no locks or central coordination; each service finishes its own task independently.
Fault tolerant - easier to recover from failures.
Scales well - ideal for microservices and cloud-native applications.
Better for long-running operations - resources are not held hostage.
75
What is horizontal scaling in microservices?
Horizontal scaling involves adding more machines, each capable of running multiple microservice instances, to improve system scalability. The number of machines allocated to one microservice need not match the number allocated to another, so capacity can follow demand.
76
What is the reliability challenge in microservices?
Since microservices communicate over a network, failures in one service can impact others. Strategies like retries, circuit breakers, and redundancy are used to improve reliability.
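The circuit-breaker strategy mentioned above can be sketched as a small wrapper. This is an invented minimal illustration (threshold and names are not from the lectures); libraries such as resilience4j or Hystrix implement the full pattern:

```python
# Hedged sketch of a circuit breaker: after `threshold` consecutive
# failures the breaker "opens" and fails fast without calling the service.
class CircuitBreaker:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0
    def call(self, fn):
        if self.failures >= self.threshold:
            raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # any success closes the circuit again
        return result

def flaky():
    raise TimeoutError("downstream service unreachable")

breaker = CircuitBreaker(threshold=2)
outcomes = []
for _ in range(4):
    try:
        breaker.call(flaky)
    except RuntimeError:
        outcomes.append("fast-fail")   # breaker refused to call the service
    except TimeoutError:
        outcomes.append("timeout")     # real call failed, counted by breaker
print(outcomes)  # ['timeout', 'timeout', 'fast-fail', 'fast-fail']
```

After the threshold is hit, callers get an immediate error instead of waiting on a dead dependency, which stops failures cascading.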
77
What is the strangler design pattern?
The strangler pattern is a gradual migration approach where monolithic modules are placed behind a facade and replaced one-by-one with microservices, updating the facade as changes are made.
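The facade in the strangler pattern can be sketched as a routing table that is updated as modules migrate. The route and handler names here are invented for illustration:

```python
# Hedged sketch of a strangler facade: migrated routes go to the new
# microservice; everything else still goes to the monolith.
def monolith(path):
    return f"monolith handled {path}"

def orders_service(path):
    return f"orders microservice handled {path}"

MIGRATED = {"/orders": orders_service}  # grows as modules are strangled

def facade(path):
    handler = MIGRATED.get(path, monolith)
    return handler(path)

print(facade("/orders"))   # orders microservice handled /orders
print(facade("/billing"))  # monolith handled /billing
```

Once every route points at a microservice, the monolith behind the facade can be retired.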
78
What is the second-system effect, and how does it relate to microservices migration?
The second-system effect, identified by Fred Brooks, suggests that engineers tend to overcomplicate their second system. In microservices migration, teams must avoid unnecessary complexity when breaking apart a monolith.
79
What is Jeff Bezos’ two-pizza rule, and how does it apply to microservices?
Jeff Bezos suggests that teams should be small enough to be fed with two pizzas. In microservices, small, independent teams are ideal for maintaining and developing individual services efficiently.
80
What is the "Big Ball of Mud" problem in software architecture?
The "Big Ball of Mud" describes an unstructured, poorly designed software system. Microservices help avoid this by enforcing modular design principles.
81
How does Ward Cunningham’s technical debt concept apply to microservices?
Technical debt refers to the cost of shortcuts in development. Poorly designed microservices architectures can accumulate technical debt, requiring costly future refactoring.
82
What is the CAP theorem, and how does it apply to microservices?
The CAP theorem states that distributed systems can only provide two of three guarantees: Consistency, Availability, and Partition Tolerance. Microservices architectures must choose trade-offs based on system needs.
83
What is the role of API gateways in microservices?
API gateways act as intermediaries between clients and microservices, handling request routing, security, rate limiting, and load balancing.
84
What are some common tools used to manage microservices architectures?
Popular tools include Kubernetes (orchestration), Docker (containerization), Consul (service discovery), and Istio (service mesh).
85
What is event-driven architecture in microservices?
Event-driven architecture uses events to trigger and communicate between microservices asynchronously, improving decoupling and scalability.
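The decoupling idea can be sketched with an in-process event bus. Topic and handler names are invented; a real deployment would use a broker such as Kafka or RabbitMQ, but the publish/subscribe shape is the same:

```python
# Hedged sketch of event-driven communication between services.
from collections import defaultdict

subscribers = defaultdict(list)

def subscribe(topic, handler):
    subscribers[topic].append(handler)

def publish(topic, event):
    # The publisher does not know who consumes the event: decoupling.
    for handler in subscribers[topic]:
        handler(event)

seen = []
subscribe("order.created", lambda e: seen.append(("email", e)))  # email service
subscribe("order.created", lambda e: seen.append(("stock", e)))  # stock service
publish("order.created", {"id": 42})
print(seen)  # [('email', {'id': 42}), ('stock', {'id': 42})]
```

New consumers can be added without touching the publisher, which is what improves decoupling and scalability.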
86
What are some key advantages of microservices?
Scalability
Improved fault isolation
Flexibility in using different technologies
Faster development and deployment cycles
87
What are some disadvantages of microservices?
Increased complexity
Network latency issues
Distributed data management challenges
Higher infrastructure costs
88
Lecture 7
89
What are the nine common characteristics of microservices according to Martin Fowler?
The nine common characteristics of microservices according to Martin Fowler are:
Componentization via Services - Microservices are independently deployable services.
Organized Around Business Capabilities - Teams are structured around business functions.
Products Not Projects - Microservices focus on long-lived products, not temporary projects.
Smart Endpoints and Dumb Pipes - Business logic is in the services, not the communication mechanism.
Decentralized Governance - Different teams can use different technologies.
Decentralized Data Management - Each service manages its own database.
Infrastructure Automation - Deployment and monitoring are automated.
Design for Failure - Systems assume failures and handle them gracefully (e.g. Netflix's Chaos Monkey).
Evolutionary Design - Services can be updated or replaced independently.
90
What is a component according to Martin Fowler?
A component is a unit of software that can be replaced or upgraded independently. In microservices, a component is defined by its behavior and exposed via an API, making it easier to manage and scale.
91
Why should one organize around business capabilities in microservices?
Organizing around business capabilities ensures that teams focus on delivering value rather than being restricted by technology stacks. It enables better ownership, faster delivery, and clearer responsibilities. Each microservice aligns with a distinct business function.
92
Should endpoints be smart or dumb in microservices?
Endpoints should be smart, while the communication between them should be simple (dumb pipes). This means that microservices encapsulate business logic within the service, whereas the network simply routes data without complex logic.
93
What is the rule for microservice data management?
Each microservice should have its own dedicated database and not share it with other services. This ensures loose coupling and independence, allowing services to evolve independently.
94
What do you have to assume in any distributed system?
You must not assume the "Fallacies of Distributed Computing", i.e. that:
The network is reliable.
Latency is zero.
Bandwidth is infinite.
The network is secure.
The topology doesn't change.
There is one administrator.
Transport cost is zero.
The network is homogeneous.
Recognizing that these assumptions are false helps design resilient and fault-tolerant microservices.
95
How big is a microservice?
There is no fixed size, but a microservice should be small enough to be developed and managed by a small team (typically 2-5 developers) and should perform a single business function well.
96
What things must be sorted out before adopting microservices?
Before transitioning to microservices, teams must consider:
Deployment automation - Microservices require CI/CD pipelines.
Monitoring and logging - Observability is critical.
Service discovery - Dynamic service registration is needed.
Fault tolerance - Handling failures must be a priority.
Data consistency - Distributed databases need careful management.
Organizational readiness - Teams must be capable of handling service independence.
97
Lecture 9
98
The two common repository models are:
Monorepo - A single large repository that contains all microservices. Any commit triggers the production of multiple microservices.
Multirepo - A separate repository for each service. Any commit only affects a single service.
99
What are the advantages and disadvantages of using a Monorepo?
Advantages:
Simplifies dependency management.
Centralized codebase for better consistency.
Easier refactoring across services.
Disadvantages:
Can become slow and difficult to manage at scale.
Requires robust tooling to handle changes efficiently.
Can cause bottlenecks if too many teams work on the same repository.
100
What are the advantages and disadvantages of using a Multirepo?
Advantages:
Allows independent development and deployment of services.
Teams have full control over their own repositories.
Reduces risk of large-scale merge conflicts.
Disadvantages:
Harder to coordinate cross-service changes.
Dependency management can be more complex.
May lead to duplication of code across repositories.
101
What are the two common branching models?
Feature-Based Development - Developers create long-lived feature branches that may last for weeks or months before merging into the main branch.
Trunk-Based Development - Developers work primarily on a single main branch, with short-lived feature branches that are merged back within minutes or hours.
102
What are the advantages and disadvantages of Feature-Based Development?
Advantages:
Allows isolated development of new features.
Provides stability by keeping unfinished code out of the main branch.
Disadvantages:
Merging long-lived branches can be complex and lead to conflicts.
Delays in integration may cause unexpected failures when merging.
103
What are the advantages and disadvantages of Trunk-Based Development?
Advantages:
Encourages continuous integration.
Reduces merge conflicts by keeping branches short-lived.
Faster feedback and fewer integration problems.
Disadvantages:
Requires disciplined development practices.
Can be difficult for large teams to coordinate without proper tooling.
104
What are the essential practices of version control?
The essential practices of version control include:
Run commit tests locally.
Wait for commit tests to complete before proceeding.
Avoid committing on a broken build.
Never leave work with a broken build.
Be prepared to revert changes if needed.
Avoid commenting out failing tests.
Take responsibility for fixing breakages.
105
Why is it important to run commit tests locally?
Running commit tests locally ensures that the developer's changes do not introduce failures before pushing them to the repository. This prevents unnecessary build failures and broken tests in shared branches.
106
Why should developers wait for commit tests to complete?
Developers should wait for commit tests to complete because:
It ensures that the build remains stable.
Developers can quickly fix failures instead of delaying corrections.
107
Why should developers avoid committing on a broken build?
Committing on a broken build:
Makes debugging more difficult.
Causes further build failures and wastes time.
Leads to a culture where broken builds become common and unresolved.
108
Why should developers never leave work with a broken build?
Developers should never leave a broken build because:
It delays fixes and impacts the entire team.
Developers may forget details of the change, making debugging harder.
Experienced developers commit changes at least an hour before leaving to ensure stability.
109
Why should developers be prepared to revert changes?
Developers should be prepared to revert changes because:
Quick reverts keep the project in a working state.
If a fix takes too long (e.g., over 10 minutes), reverting prevents prolonged issues.
110
Why should developers avoid commenting out tests?
Commenting out failing tests leads to lower code quality. Instead, developers should:
Fix the code if it fails.
Modify the test if assumptions change.
Delete the test if the functionality no longer exists.
111
Why should developers take responsibility for breakages?
Taking responsibility for breakages ensures that:
The codebase remains stable.
Developers collaborate to resolve issues quickly.
No single person is left fixing issues they did not introduce.
112
Lecture 10
113
What are the three benefits of a version control system according to Farley?
Step back to safety – Enables rolling back to previous versions in case of issues.
Share changes easily – Facilitates collaboration by allowing multiple contributors to work on the same project.
Store changes safely – Ensures that all changes are securely saved and can be retrieved when needed.
114
What are the three models of version control according to Farley?
Mono-repo – Stores everything in a single large repository.
Multi-repo – Each independent component has its own repository.
Multi-repo' – Stores interdependent components in separate repositories.
115
Why does a mono-repo provide the three benefits of a version control system?
A mono-repo supports these benefits because:
Step back to safety – Rolling back changes affects all components together.
Share changes easily – Any component can be updated in a centralized manner.
Store changes safely – Everything, including dependencies, is stored in one place.
116
Why might a multi-repo not provide the three benefits of a version control system?
A multi-repo can struggle with these benefits because:
Step back to safety – No centralized rollback mechanism for all components.
Share changes easily – Difficult to coordinate updates across repositories.
Store changes safely – Versioning relationships between components are not inherently stored.
117
What are two solutions to the multi-repo problem according to Farley?
Fixed, well-understood APIs – Components interact through stable interfaces.
Flexible, backward/forward-compatible APIs – Ensures components work together despite version differences.
118
Why do Farley's solutions to the multi-repo problem restore the three benefits of a version control system?
Step back to safety – Individual components can be rolled back independently.
Share changes easily – Updates can be coordinated through APIs (though this remains complex).
Store changes safely – Components are versioned separately but stored reliably.
119
Why is the "multi-repo'" model (interdependent components in separate repositories) considered the worst of all worlds?
Because in the multi-repo' model:
Components cannot be developed independently.
Components cannot be deployed independently.
In a monorepo, you can make a single commit that changes multiple projects at once. In a multi-repo, if your service depends on changes in another repository, you need to:
Make the change in repo A.
Wait for it to be merged and released.
Update repo B to use the new version.
This causes coordination overhead and risks version mismatches, forcing you to treat changes across services as if you were deploying to Mars.
120
Lecture 11 - Continuous Integration
121
What is Continuous Integration (CI)?
Continuous Integration (CI) is the practice of quickly integrating newly developed code with the rest of the application code. This process is usually automated and results in a build artifact at the end. The goal of CI is to detect errors early and streamline the deployment process.
122
What are the key benefits of Continuous Integration?
Key benefits of CI include: Early bug detection through automated testing. Faster development cycles and releases. Reduced integration problems. Improved collaboration among development teams. Higher code quality through continuous testing.
123
What are the four traditional product delivery releases?
The four traditional product delivery releases are: Alpha Release - An early version for internal testing. Beta Release - A more stable version for external user feedback. Release Candidate - A near-final version tested for last-minute issues. Final Release - The official version available to customers.
124
What are the three modern feature delivery environments?
The three modern feature delivery environments are: Development Environment - Where individual teams integrate their work, updated throughout a sprint. Staging Environment - A near-production environment where multiple teams integrate their work, updated at the end of a sprint. Production Environment - The live system where software is deployed for customer use, updated based on business needs.
125
What is Shift Left Testing, and how does it apply to CI?
Shift Left Testing refers to the practice of moving testing earlier in the software development cycle. It ensures that testing is done frequently and early, reducing defects and improving code quality. In CI, Shift Left Testing is crucial as it enables continuous feedback, helping to identify and fix issues before deployment.
126
What is the Test Pyramid, and what are its levels?
The Test Pyramid is a model that categorizes different levels of testing: Unit Tests - Test individual functions or components, performed in milliseconds. Service Tests - Test interactions between services, performed in minutes. End-to-End Tests - Test the entire application workflow, mimicking user interaction, performed in several minutes.
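As a sketch of the pyramid's base, a unit test exercises a single function in isolation, with no network or database, so it runs in milliseconds (the `add_vat` function here is purely illustrative):

```python
# Hypothetical unit test for the base of the Test Pyramid: one function,
# no external dependencies, millisecond execution.
def add_vat(net_price: float, rate: float = 0.2) -> float:
    """Return the gross price after adding VAT at the given rate."""
    return round(net_price * (1 + rate), 2)

def test_add_vat():
    assert add_vat(100.0) == 120.0           # default 20% rate
    assert add_vat(50.0, rate=0.05) == 52.5  # reduced rate

test_add_vat()
print("unit tests passed")
```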
127
What is the Test Snow Cone, and why is it an anti-pattern?
The Test Snow Cone is an anti-pattern where more end-to-end tests exist than unit or service tests. This leads to slow test execution and longer feedback cycles. CI best practices encourage more unit and service tests over end-to-end tests to ensure efficiency.
128
What are Brittle Tests in Continuous Integration?
Brittle Tests are tests that fail because another dependent service fails, even if the functionality being tested is correct. They can cause false negatives, making debugging difficult.
129
What are Flaky Tests, and why are they problematic?
Flaky Tests sometimes fail due to non-deterministic issues such as timeouts or race conditions. They create unreliable feedback and reduce confidence in test automation.
130
What is the Normalization of Deviance, and how does it affect CI?
Normalization of Deviance is a concept where teams gradually accept small failures as normal, leading to degraded quality over time. In CI, failing tests must be addressed immediately to prevent this mindset and ensure reliable software.
131
What are Build Light Indicators, and how are they used in CI?
Build Light Indicators visually represent the status of CI builds. A green light means the build is successful, while a red light indicates a failure. Some teams use lava lamps or monitor screens to display build statuses.
132
What role does automation play in Continuous Integration?
Automation is central to CI as it enables frequent builds, automated testing, and fast feedback. It ensures that code changes do not introduce new errors and maintains software quality at scale.
133
Lecture 12 - Continuous Integration 2
134
What is integration hell in software development?
Integration hell is an anti-pattern in software development where different parts of a software system are integrated too late, leading to complex and time-consuming conflicts.
135
Why should commit tests be run locally before pushing changes (Rule 1)?
Running commit tests locally ensures that the deployment pipeline remains a valuable shared resource that is not blocked by unnecessary test failures.
136
Why should developers wait for test results before moving on (Rule 2)?
Developers should wait for test results to be available so they can immediately fix any issues, ensuring smooth progress.
137
Why must failures be fixed or reverted within 10 minutes (Rule 3)?
Fixing or reverting failures within 10 minutes prevents blocking progress for others and maintains development velocity.
138
What should happen if a teammate breaks the integration rules (Rule 4)?
If a teammate breaks the rules, their changes should be reverted to prevent them from blocking progress.
139
Why is it considered a "build sin" if someone else notices your failure first (Rule 5)?
If someone else notices a failure before you do, it indicates a lack of attentiveness and encourages developers to monitor their changes more closely.
140
What should a developer do once their commit passes (Rule 6)?
Once a commit passes, a developer should move on to their next task, as automated testing ensures that their changes are stable.
141
Who is responsible for fixing a failing test (Rule 7)?
The committer is responsible for fixing a failing test to ensure accountability in the development process.
142
What is the rule about responsibility when multiple people may be responsible for a failure (Rule 8)?
Everyone who may be responsible should agree on who will fix the failure, ensuring that accountability is maintained.
143
Why should developers monitor the progress of their changes (Rule 9)?
Monitoring changes ensures that any issue is detected early, preventing unfit software from being released.
144
Why should any pipeline failure be addressed immediately (Rule 10)?
Immediate attention to pipeline failures ensures that the pipeline remains clear for other changes, maintaining continuous integration efficiency.
145
Lecture 13 - Continuous Delivery
146
What is Continuous Delivery (CD)?
Continuous Delivery (CD) is a software engineering practice where software is automatically moved from a source code repository to a staging environment. At the press of a "release" button, it can be deployed to the production environment for customer use.
147
How does Continuous Deployment differ from Continuous Delivery?
Continuous Deployment (CD) goes a step beyond Continuous Delivery by automatically deploying software to the production environment without manual intervention, making new features immediately available to customers.
148
What are the key principles of Continuous Delivery?
The key principles of Continuous Delivery include: Create a repeatable process Automate almost everything Version control for everything If it hurts, do it more frequently Build quality in Done means released Everyone is responsible Continuous improvement
149
Why is creating a repeatable process important in Continuous Delivery?
A repeatable process ensures consistency, reduces errors, and allows teams to become more efficient. A well-practiced process becomes routine and reliable.
150
Why is automation emphasized in Continuous Delivery?
Automation ensures accuracy, consistency, and efficiency. Manual processes introduce human error and inefficiencies, whereas automation standardizes execution and reduces risks.
151
What is the significance of version control in Continuous Delivery?
Version control allows any team member to build any version of the application on demand. It ensures traceability, facilitates rollback if needed, and supports collaboration.
152
Why should painful processes be done more frequently in Continuous Delivery?
Performing painful processes frequently helps improve efficiency, identify bottlenecks, and streamline workflows. Regular practice leads to familiarity and process refinement.
153
What does "Build Quality In" mean in Continuous Delivery?
This principle emphasizes fixing defects as soon as they are found. Early detection and resolution of defects are cost-effective and ensure higher software quality.
154
What does "Done Means Released" signify in Continuous Delivery?
A feature is only considered "done" once it has been deployed to a production-like environment. This avoids ambiguity in development progress and ensures accountability.
155
Why is team responsibility emphasized in Continuous Delivery?
Continuous Delivery requires collaboration across teams. When everyone is responsible, issues are resolved collectively rather than leading to blame culture and inefficiencies.
156
What is the role of continuous improvement in Continuous Delivery?
Continuous improvement encourages teams to reflect on successes and failures, leading to process optimizations and better software delivery over time.
157
What are the three types of testing in production?
The three types of testing in production are: A/B Testing, Canary Testing, and Blue/Green Testing.
158
What is A/B Testing in production?
A/B Testing involves directing a small percentage of user traffic to a new interface in production. If users respond negatively, traffic is reverted to the old interface.
159
How does Canary Testing work?
Canary Testing directs a small percentage of traffic to a new version of the software. If issues arise, traffic is rolled back to the previous version.
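The routing decision behind a canary can be sketched as a simple probabilistic split; the 5% fraction here is illustrative, and real systems typically route at the load balancer:

```python
import random

# Minimal sketch of canary routing: a small, configurable fraction of
# requests is sent to the new version; the rest stay on the stable one.
CANARY_FRACTION = 0.05  # 5% of traffic

def route_request(rng: random.Random) -> str:
    """Return which version should serve this request."""
    return "canary" if rng.random() < CANARY_FRACTION else "stable"

# With a fixed seed, roughly 5% of 10,000 requests hit the canary.
rng = random.Random(42)
counts = {"stable": 0, "canary": 0}
for _ in range(10_000):
    counts[route_request(rng)] += 1
print(counts)
```

If error rates on the canary rise, `CANARY_FRACTION` is dropped back to zero, which is the rollback described above.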
160
What is Blue/Green Testing?
Blue/Green Testing swaps the production (blue) and staging (green) environments. If the new version performs well, the switch is made permanent; otherwise, it is reversed.
161
Lecture 14 - Continuous Delivery 2
162
According to Jez Humble, how can we achieve continuous delivery?
Continuous delivery is achieved through fast, automated feedback on the production readiness of applications every time there is a change — to code, infrastructure, or configuration.
163
What condition should software always be in, according to Humble?
Software should always be in a production-ready or releasable state.
164
How does continuous delivery help to avoid the biggest source of waste in the software development process?
Continuous delivery helps avoid waste by making it easier to deploy new, experimental features into production quickly and efficiently, reducing delays and unnecessary rework.
165
When should testing be done in continuous delivery?
Testing should be done continuously throughout the development process, not just at the end.
166
Who is responsible for software quality in continuous delivery?
Everyone involved in the development process is responsible for quality, not just a dedicated QA team.
167
What is considered more important than delivering functionality, according to Humble?
Keeping the system in a working and stable state is more important than delivering new functionality.
168
How does continuous delivery reduce the risk of releases?
Continuous delivery reduces risk by enabling small, extensively tested changes to be released frequently, and by making reversion easy in case of issues.
169
What role does automation play in continuous delivery?
Automation is critical for providing fast feedback, reducing human error, and ensuring that every change can be safely and quickly deployed.
170
Why is continuous integration important in the context of continuous delivery?
Continuous integration ensures that changes are merged and tested frequently, reducing integration problems and allowing for quicker releases.
171
What are some common tools used in continuous delivery pipelines?
Common tools include Jenkins, GitHub Actions, GitLab CI/CD, CircleCI, and Travis CI for automation, along with Docker and Kubernetes for containerization and deployment.
172
What are the benefits of frequent, smaller releases in continuous delivery?
Frequent, smaller releases reduce risk, improve feedback loops, enable faster value delivery, and make it easier to pinpoint issues when they arise.
173
How does continuous delivery improve collaboration between development and operations teams?
Continuous delivery encourages DevOps practices, breaking down silos and promoting shared responsibility for deployment, monitoring, and system reliability.
174
What are some challenges organizations face when adopting continuous delivery?
Challenges include cultural resistance to change, legacy system constraints, lack of automation, and the need for robust testing strategies.
175
How does continuous delivery support business agility?
Continuous delivery enables businesses to respond quickly to market changes, customer feedback, and new opportunities by streamlining the software release process.
176
What is the relationship between continuous delivery and DevOps?
Continuous delivery is a key practice within DevOps, aiming to integrate development and operations for seamless, automated software releases.
177
How does monitoring play a role in continuous delivery?
Monitoring provides real-time feedback on system performance, helping teams detect and resolve issues quickly to maintain system reliability.
178
Why is rollback capability important in continuous delivery?
Rollback capability ensures that if an issue arises in production, teams can quickly revert to a previous stable version, minimizing downtime and impact.
179
How does feature flagging complement continuous delivery?
Feature flagging allows teams to deploy changes without exposing them to all users, enabling controlled testing and gradual rollouts.
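A percentage rollout flag can be sketched like this (the flag name, config shape, and helper are hypothetical, not a real library's API); hashing the user ID makes the rollout deterministic per user:

```python
import hashlib

# Illustrative feature-flag store: a flag gates a code path so it can be
# deployed "dark" and enabled for a growing percentage of users.
FLAGS = {"new_checkout": {"enabled": True, "rollout_percent": 10}}

def is_enabled(flag: str, user_id: str) -> bool:
    """Deterministically bucket a user into a percentage rollout."""
    cfg = FLAGS.get(flag)
    if not cfg or not cfg["enabled"]:
        return False
    # Hash the user id so the same user always gets the same answer.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < cfg["rollout_percent"]

print(is_enabled("new_checkout", "user-123"))
print(is_enabled("unknown_flag", "user-123"))  # unknown flags are off
```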
180
What is the difference between continuous delivery and continuous deployment?
Continuous delivery ensures software is always ready for release, while continuous deployment automatically releases every successful change into production without manual intervention.
181
How can organizations measure the success of continuous delivery?
Success can be measured using metrics such as deployment frequency, lead time for changes, mean time to recovery (MTTR), and change failure rate.
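Two of these metrics can be computed directly from deployment records; the data below is made up for illustration:

```python
# Sketch: change failure rate and MTTR from (fabricated) deployment records.
deployments = [
    {"failed": False, "recovery_minutes": 0},
    {"failed": True,  "recovery_minutes": 30},
    {"failed": False, "recovery_minutes": 0},
    {"failed": True,  "recovery_minutes": 10},
]

failures = [d for d in deployments if d["failed"]]
change_failure_rate = len(failures) / len(deployments)
mttr = sum(d["recovery_minutes"] for d in failures) / len(failures)

print(f"change failure rate: {change_failure_rate:.0%}")
print(f"MTTR: {mttr:.0f} minutes")
```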
182
Lecture 15 - Cloud Computing
183
What is cloud computing?
Cloud computing is the delivery of computing services—including servers, storage, databases, networking, software, and more—over the internet ("the cloud"). It allows users to rent rather than own IT infrastructure, offering scalability, flexibility, and cost savings.
184
Why is cloud computing compared to electricity utilities?
Just as electricity utilities provide power from centralized plants, cloud computing providers supply computing resources from centralized data centers. This model allows for economies of scale and expertise that individual users cannot achieve on their own.
185
What is a staging environment in cloud computing?
A staging environment is a pre-production environment where software is tested in conditions similar to production. It allows stakeholders to validate the software before deployment. Many enterprises rent their staging environments in the cloud rather than maintaining physical infrastructure.
186
Why has cloud computing become successful?
Cloud computing has become successful because of five key characteristics: Broad network access (availability over standard networks, including VPNs). On-demand, self-service (users can provision resources as needed). Measured service (pay-per-use billing model). Rapid elasticity (scaling resources up or down as needed). Resource pooling (multi-tenant models for efficiency).
187
What are the key characteristics of broad network access in cloud computing?
Broad network access means cloud services are available over standard network technologies, including the internet and VPNs, ensuring accessibility from various devices and locations.
188
What does on-demand, self-service mean in cloud computing?
On-demand, self-service means customers can provision computing resources automatically without human intervention, typically through a web interface or API.
189
What is meant by measured service in cloud computing?
Measured service refers to the provider's ability to track and optimize resource usage, ensuring customers pay only for what they consume.
190
What is rapid elasticity in cloud computing?
Rapid elasticity allows customers to scale computing resources up or down dynamically based on demand, ensuring efficiency and cost-effectiveness.
191
What is resource pooling in cloud computing?
Resource pooling enables cloud providers to serve multiple customers using shared resources, efficiently distributing computing power among users through multi-tenancy.
192
What are the two phases of cloud computing?
The two main phases are: Serverful computing (traditional model with dedicated infrastructure). Serverless computing (execution-based model where infrastructure management is abstracted).
193
What are the different serverful computing models?
Serverful computing includes: Infrastructure-as-a-Service (IaaS): Access to raw computing resources (e.g., virtual machines). Platform-as-a-Service (PaaS): Managed infrastructure with OS and development tools. Software-as-a-Service (SaaS): Fully managed applications delivered over the cloud.
194
What technologies enable serverful computing?
Serverful computing relies on virtualization, which includes: Virtual Machines (VMs): Software-based simulations of physical computers managed by hypervisors. Containers: Lightweight, OS-level virtualization managed by the operating system.
195
How does the serverful cost model work?
The serverful cost model is based on resource rental, where customers pay for allocated resources, regardless of whether they are fully utilized. This is similar to renting a car.
196
What are the serverless computing models?
Serverless computing includes: Backend-as-a-Service (BaaS): Pre-built backend services (e.g., authentication, databases). Function-as-a-Service (FaaS): Execution of code in response to events without managing infrastructure.
197
How is serverless computing implemented?
Serverless computing uses hidden containers to run function code. Though servers are still used, their management is abstracted, and responsibility shifts to the cloud provider.
198
How does the serverless cost model work?
Serverless computing charges customers based on execution time rather than resource allocation. This model is often compared to hailing a taxi—you pay only for the ride, not for keeping a car.
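The difference between the two cost models can be made concrete with a back-of-envelope comparison; all prices below are invented for illustration, not real provider rates:

```python
# Sketch: rental (serverful) vs execution-time (serverless) billing.
hours_in_month = 730
vm_price_per_hour = 0.04               # serverful: pay while allocated
requests_per_month = 100_000
seconds_per_request = 0.2
serverless_price_per_second = 0.00002  # serverless: pay per execution

serverful_cost = hours_in_month * vm_price_per_hour
serverless_cost = (requests_per_month * seconds_per_request
                   * serverless_price_per_second)

print(f"serverful:  ${serverful_cost:.2f}")   # paid even when idle
print(f"serverless: ${serverless_cost:.2f}")  # paid only per request
```

At low, bursty traffic the serverless model is far cheaper; at sustained high load the comparison can flip, which is why the car-rental vs taxi analogy is apt.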
199
How can microservices be implemented in cloud computing?
Microservices can be implemented using: Virtual machines (serverful approach, more overhead). Containers (lightweight, efficient serverful approach). Function instances (serverless approach, potential maintenance/performance challenges).
200
What are potential issues when mapping microservices to multiple function instances?
Mapping a single microservice to multiple function instances may create: Maintenance issues (tracking instances). Performance issues (cold start problems when instances are inactive).
201
Lecture 16 - Cloud Computing 2
202
How long did it take to get a new server ready for code deployment in an FT data centre vs. an AWS data centre, according to Wells?
FT data centre: Several weeks to months. AWS data centre: A few minutes to hours. This highlights the agility and scalability benefits of cloud computing.
203
Should one worry about vendor lock-in, according to Wells?
Vendor lock-in occurs when it becomes costly or difficult to switch cloud providers. Wells suggests it is not always a major concern because cloud providers offer significant advantages. Mitigation strategies include using multi-cloud approaches and open standards.
204
What was the deployment frequency before and after moving to the cloud?
Before: Infrequent, possibly quarterly or monthly releases. After: Continuous deployment, allowing multiple releases per day. The cloud enables faster development cycles and quicker feedback loops.
205
Do you have to choose between speed and stability in cloud computing?
No, modern DevOps practices enable both. Automation, continuous integration/continuous deployment (CI/CD), and robust monitoring improve stability while maintaining rapid delivery.
206
Why should you use a queue in cloud-native architecture?
Queues decouple system components, enhancing scalability and reliability. They help handle asynchronous processing and load balancing. Example: Message queues (e.g., AWS SQS, RabbitMQ) prevent system overload.
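The decoupling can be sketched with Python's standard-library `queue` module: the producer never talks to the consumer directly, so either side can slow down, scale, or fail independently:

```python
import queue
import threading

# Minimal sketch of queue-based decoupling (in-process stand-in for a
# message broker such as SQS or RabbitMQ).
q: "queue.Queue[str]" = queue.Queue()
results = []

def consumer():
    while True:
        msg = q.get()
        if msg == "STOP":   # sentinel to shut the worker down
            break
        results.append(f"processed {msg}")

t = threading.Thread(target=consumer)
t.start()

for i in range(3):
    q.put(f"order-{i}")     # producer enqueues and moves on immediately
q.put("STOP")
t.join()
print(results)
```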
207
What should you focus on when developing a distributed system?
Resilience and fault tolerance. Network latency and eventual consistency. Observability: logging, monitoring, and tracing. Scalability: designing for auto-scaling and load balancing.
208
Why should one adopt business-focused monitoring?
Traditional monitoring focuses on infrastructure metrics (CPU, memory, etc.). Business-focused monitoring tracks key performance indicators (KPIs) like user engagement, conversion rates, and revenue. Helps align IT efforts with business goals.
209
Why should one test infrastructure recovery plans?
Ensures business continuity in case of failures. Identifies weaknesses in disaster recovery strategies. Techniques include chaos engineering (e.g., Netflix's Chaos Monkey) to simulate failures and test resilience.
210
Lecture 17 - DevOps
211
How does Amazon's approach differ from traditional development/operations models?
Amazon promotes the philosophy "You build it, you run it," which gives developers operational responsibilities. This closes the feedback loop with customers and enhances service quality, unlike traditional models where developers hand off code and disengage.
212
What is the CALMS acronym in DevOps?
CALMS stands for Culture, Automation, Lean, Measurement, and Sharing. These are the five key principles that guide DevOps practices.
213
What does the Culture principle in DevOps emphasize?
Culture in DevOps emphasizes collaboration and shared values among teams. It encourages a blameless environment focused on learning from mistakes and continuously improving.
214
How did Toyota and GM demonstrate the impact of culture in manufacturing?
At the NUMMI plant, Toyota retrained GM workers using a high-trust, continuous improvement culture. Within three months, the plant produced the highest-quality cars in America, highlighting the power of DevOps-aligned culture.
215
Why is automation important in DevOps?
Automation reduces the risk of deployment failures, speeds up processes, ensures repeatability, and increases transparency, allowing teams to focus on higher-value tasks.
216
What is Toyota's concept of Jidoka, and how does it relate to DevOps?
Jidoka means "automation with a human touch." Machines or operators can halt production upon detecting issues. In DevOps, this relates to empowering systems or developers to detect and respond to failures early.
217
What does the Lean principle advocate for in DevOps?
Lean focuses on eliminating waste (e.g., unnecessary processes, handoffs, or rework) to improve efficiency and reduce delays without sacrificing product quality.
218
How can waste be minimized in software development according to Lean practices?
By limiting work in progress and minimizing handoffs, teams can stay focused, avoid being interrupt-driven, and reduce coordination overhead.
219
What are examples of waste defined in Lean manufacturing, and what are their software equivalents?
Waste types include waiting (e.g., delayed deployments), transportation (data transfer inefficiencies), overproduction (building unneeded features), and rework (fixing bugs). Kanban boards help visualize and manage these wastes.
220
What role does Measurement play in DevOps?
Measurement involves continuous monitoring of metrics and logs to quickly detect, diagnose, and fix system issues. It supports data-driven decisions and improvements.
221
What are metrics in the context of DevOps and why are they critical?
Metrics are time-series data points that reflect system behavior. They are key to assessing system performance and comparing against KPIs or benchmarks for improvement.
222
How does Toyota use metrics to improve team performance?
Toyota tracks metrics like floor lengths in tenths to identify bottlenecks or underperforming areas, guiding targeted support through gemba walks (visits to the workplace).
223
What is the Sharing principle in DevOps and why is it important?
Sharing promotes open communication and knowledge exchange between development and operations teams. It fosters collaboration, rapid problem detection, and learning from incidents.
224
How can teams implement the Sharing principle practically?
By including operations in dev meetings, lunches, and team events, organizations build relationships and feedback loops that prevent problems and enhance mutual learning.
225
What is genchi genbutsu, and how does it support DevOps?
Genchi genbutsu means "go and see." Managers visit the worksite to understand issues firsthand, mirroring DevOps practices of engaging with systems and teams directly to solve problems.
226
What are the differing concerns of developers and operators according to Vargo?
According to Vargo, developers prioritize agility, meaning the ability to quickly write and deploy code. Operators prioritize stability, which refers to keeping systems reliable and preventing disruptions. This difference in priorities often leads to tension between the two roles.
227
What is DevOps in its purest form, according to Vargo?
DevOps, in its purest form, is about breaking down the metaphorical wall between developers and operators. It emphasizes collaboration, shared responsibilities, and integrated workflows between traditionally siloed teams to streamline development and operations.
228
Why should organizations reduce silos, according to Vargo?
Reducing organizational silos is essential because success in DevOps comes from cooperation between cross-functional teams. When teams work in isolation, it hinders communication, slows down processes, and reduces the overall efficiency of software delivery.
229
Why is it important to accept failure as normal in DevOps?
Failure should be accepted as normal because all human-created systems are inherently unreliable. Recognizing this encourages organizations to prepare for failure, build resilient systems, and respond to issues more effectively rather than trying to eliminate failure entirely.
230
Why should organizations implement gradual change, according to Vargo?
Gradual change is crucial because large, million-line changes are difficult to debug and verify. Smaller, incremental changes reduce risk, make it easier to detect bugs, and allow for faster feedback cycles and rollbacks if necessary.
231
Why should DevOps leverage tooling and automation?
Tooling and automation are essential because they convert manual work into repeatable, automated patterns. This improves consistency, reduces human error, and increases the speed and reliability of development and operational tasks.
232
Why is it important to measure everything in a DevOps environment?
Measurement is critical because it provides data to justify DevOps investments and sets clear metrics for success. Without data, it's difficult to assess progress or identify areas needing improvement.
233
How does Site Reliability Engineering (SRE) reduce organizational silos?
SRE reduces silos by promoting shared ownership with developers, using common tools, and adopting shared availability metrics. This fosters communication and collaboration between development and operations teams.
234
How does SRE approach the idea of accepting failure as normal?
SRE acknowledges system unreliability through the use of Service Level Objectives (SLOs), which define acceptable levels of performance. It also conducts blameless postmortems after failures to learn and improve without assigning personal blame.
235
How does SRE implement gradual change?
SRE encourages small, fast, and iterative deployments to minimize the cost of failure. These smaller changes reduce complexity and make issues easier to diagnose and fix.
236
How does SRE use tooling and automation to eliminate toil?
SRE aims to automate tasks that were manual in the past. The goal is that work done manually this year should be automated by the next, thereby reducing repetitive, manual work known as "toil."
237
What types of metrics does SRE measure?
SRE measures both system metrics, like reliability and performance, and human metrics, such as the amount of toil. This dual approach ensures technical health and team sustainability.
238
Lecture 19
239
What does the acronym CALMS stand for in DevOps?
CALMS stands for Culture, Automation, Lean, Measurement, and Sharing, representing the five pillars of DevOps principles.
240
How does Site Reliability Engineering (SRE) implement the Culture principle of DevOps?
SRE supports Culture by forming separate SRE teams or embedding SREs in development teams. It promotes a "blameless" culture in incident analysis to encourage openness and learning.
241
What is a blameless postmortem, and why is it important in SRE?
A blameless postmortem avoids blaming individuals or teams for failures. This helps foster trust, openness, and continuous improvement without fear of punishment.
242
What is the goal of organizational learning in SRE?
The goal is to share postmortems across engineering teams to learn from incidents and improve system resilience.
243
How does SRE reduce toil through automation?
SRE uses automation to eliminate "toil," which is manual, repetitive, and low-value operational work. The aim is to free engineers to focus on higher-value engineering tasks.
244
What is the ideal time distribution between engineering work and operations for an SRE team, especially at Google?
Ideally, SREs should spend at least 50% of their time on engineering work and the remainder on on-call duties like support and incident response.
245
Why is limiting incident frequency important for on-call SREs?
To prevent "pager fatigue" and ensure quality incident handling and postmortems, each SRE should face no more than two incidents per 8-12 hour shift.
246
What is an error budget and how is it used in SRE?
An error budget is calculated as the difference between observed reliability and agreed reliability. It helps control feature releases by allowing them when within budget and halting them when exceeded.
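The calculation can be sketched in a few lines; the SLO target and request counts below are illustrative:

```python
# Sketch of an error budget check: budget = observed reliability minus
# agreed reliability. Positive budget -> feature releases may proceed.
slo_target = 0.999          # agreed reliability (99.9%)
total_requests = 1_000_000
failed_requests = 400

observed = 1 - failed_requests / total_requests  # observed reliability
budget_remaining = observed - slo_target         # fraction of requests

if budget_remaining > 0:
    print(f"within budget ({budget_remaining:.4f}); releases may proceed")
else:
    print("budget exhausted; halt releases and focus on reliability")
```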
247
How does polarizing time help reduce handoffs in SRE?
By clearly separating development and operations times, engineers can focus solely on one type of work at a time, reducing context switching and miscommunication.
248
How does SRE apply the Measurement principle of DevOps?
SRE obsessively monitors a small set of key metrics chosen based on intuition, experience, and user needs, ensuring they reflect service health accurately.
249
What are SLIs, SLOs, and SLAs in the context of SRE?
SLI (Service Level Indicator): Quantitative measure of service aspect (e.g., latency). SLO (Service Level Objective): Target value or acceptable range for an SLI. SLA (Service Level Agreement): Contract specifying consequences for meeting or missing an SLO.
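The relationship between an SLI and its SLO can be shown with an availability SLI computed from request outcomes (the status codes and target are made up):

```python
# Sketch: availability SLI = fraction of successful requests,
# compared against an SLO target.
statuses = [200, 200, 500, 200, 200, 200, 503, 200, 200, 200]

good = sum(1 for s in statuses if s < 500)
sli = good / len(statuses)   # SLI: the measured quantity
slo = 0.95                   # SLO: the target for that quantity

print(f"SLI = {sli:.0%}, SLO = {slo:.0%}, met: {sli >= slo}")
```

An SLA would then add a contractual consequence (e.g. service credits) for missing the SLO.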
250
What are the two main aspects of the Sharing principle in SRE?
Knowledge sharing and tool/technique sharing.
251
How is knowledge sharing implemented between development and operations in SRE?
Developers inform ops about upcoming functionality; ops inform developers about performance issues, ensuring both sides are informed.
252
Why is sharing tools and techniques important in SRE?
It standardizes environment management and deployment, allowing any engineer to self-service tasks like deployments, improving efficiency and reducing bottlenecks.
253
Lecture 20
254
What makes a good alert in Site Reliability Engineering, and why is it important?
A good alert is actionable and pertains to an issue that cannot be fixed without human intervention. If automated remediation is feasible, it should be attempted first. This is important for SREs because poor alerts lead to alert fatigue and unnecessary stress, particularly during on-call shifts.
255
What is 'reliability theatre' in the context of SRE, and why is it problematic?
Reliability theatre refers to traditional setups like Network Operations Centres (NOCs) or war rooms that exist mainly to impress stakeholders rather than improve reliability. SREs find this problematic because it detracts from genuine incident response effectiveness.
256
In SRE, what is a 'snowflake' server, and why is it discouraged?
A snowflake is a production server that is unique, manually configured (e.g. through ad-hoc command-line tweaks), and therefore hard to reproduce, replace, or debug. SREs discourage snowflakes because they violate the principle of infrastructure as code and hinder scalability and reliability.
257
What are 'pets', 'cattle', and 'poultry' in SRE, and how do they differ?
'Pets' are individually managed servers with unique configurations, akin to snowflakes. 'Cattle' are standardised servers managed in groups. 'Poultry' refers to ephemeral containers, also managed in bulk. The progression from pets to poultry reflects increasing automation and decreasing management overhead.
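The cattle mindset can be illustrated with a tiny declarative-reconciliation sketch: every server is described by one shared spec, and a drifted server is rebuilt from the spec rather than hand-repaired like a pet. The spec fields and function names are invented for illustration.

```python
# Sketch: "cattle" servers are reconciled against one declarative spec;
# a drifted box is rebuilt, not debugged by hand.

DESIRED_SPEC = {"os": "ubuntu-22.04", "nginx": "1.24", "monitoring": True}

def reconcile(server: dict) -> dict:
    """Return the server, rebuilt from the spec if it has drifted."""
    if server != DESIRED_SPEC:
        return dict(DESIRED_SPEC)  # rebuild from the spec, don't repair
    return server

fleet = [
    dict(DESIRED_SPEC),                                            # healthy
    {"os": "ubuntu-22.04", "nginx": "1.18", "monitoring": False},  # drifted "pet"
]
fleet = [reconcile(s) for s in fleet]
print(all(s == DESIRED_SPEC for s in fleet))  # True: the fleet is uniform again
```

'Poultry' pushes this one step further: containers are so cheap and short-lived that there is nothing to reconcile; a drifted instance is simply killed and rescheduled.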
258
Why is autonomous better than automated in SRE, and how does this impact operations?
Autonomous systems make independent decisions and take actions without human input, whereas automated systems merely follow predefined instructions. Autonomous is better than automated because it reduces human intervention, eases the burden on the on-call rotation, and improves system resilience.
259
What are the benefits of embedding an SRE in a development team?
Embedding SREs fosters trust and collaboration between development and operations. It allows SREs to contribute to system design early on, resulting in more reliable and maintainable software.
260
How is the 'right number of nines' determined in SRE? And what is the tradeoff?
The "right number of nines" (e.g., 99.9%, 99.99% uptime) depends on how much downtime the business can tolerate. It's a trade-off between reliability and cost, based on business requirements and customer expectations.
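The trade-off becomes concrete when the nines are converted into a downtime budget. This small sketch (names illustrative) shows how quickly the allowed downtime shrinks as nines are added, which is what drives the cost side of the trade-off.

```python
# Sketch: yearly downtime budget implied by an availability target.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def allowed_downtime_minutes_per_year(availability: float) -> float:
    """Downtime budget per year for a target such as 0.999 (three nines)."""
    return (1 - availability) * MINUTES_PER_YEAR

for target in (0.99, 0.999, 0.9999):
    minutes = allowed_downtime_minutes_per_year(target)
    print(f"{target:.2%} -> {minutes:.1f} min/year")
```

Roughly: two nines allow about 87 hours of downtime a year, three nines about 8.8 hours, four nines under an hour; each extra nine is a tenfold tightening, usually at far more than tenfold the engineering cost.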
261
Define the property of eventual consistency in microservices/ distributed systems
Eventual consistency is a property of distributed systems, like those often found in microservices and enterprise computing, where the system guarantees that, given enough time without new updates, all parts of the system will eventually reflect the same data (reach consistency).
∙ Not immediate: unlike strong consistency (where updates are visible everywhere instantly), eventual consistency tolerates delays.
∙ Asynchronous replication: changes are copied to other services/databases in the background.
∙ Common in high-availability systems: it allows the system to keep working even if some parts are temporarily unreachable.
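The asynchronous-replication behaviour described above can be sketched in a few lines: a primary accepts writes immediately and replicates them in the background, so replicas lag briefly but converge once the replication queue drains. All class and key names here are illustrative.

```python
# Sketch of eventual consistency: writes land on the primary at once,
# replicas catch up asynchronously and eventually converge.
from collections import deque

class Primary:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.log = deque()  # pending replication events

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))  # replicated later, not immediately

    def replicate_one(self):
        """Apply one pending event to every replica (the 'background' step)."""
        if self.log:
            key, value = self.log.popleft()
            for replica in self.replicas:
                replica[key] = value

replica_a, replica_b = {}, {}
primary = Primary([replica_a, replica_b])

primary.write("user:1", "alice")
print(replica_a.get("user:1"))  # None: the replicas have not caught up yet

primary.replicate_one()         # background replication runs
print(replica_a == replica_b == {"user:1": "alice"})  # True: converged
```
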
263
Define Key Performance Indicators
Key Performance Indicators (KPIs) in software development are specific, measurable metrics used to evaluate the performance and progress of software teams, projects, or products. They help organizations track success, identify areas for improvement, and align development efforts with business goals.
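As a concrete illustration, one widely used delivery KPI is the change failure rate: the fraction of deployments that cause a failure in production. The function name and numbers below are illustrative, not from any standard tooling.

```python
# Sketch: computing one common software-delivery KPI, change failure rate.

def change_failure_rate(failed_deployments: int, total_deployments: int) -> float:
    """KPI: fraction of deployments that caused a production failure."""
    if total_deployments == 0:
        return 0.0  # no deployments, so no failures to attribute
    return failed_deployments / total_deployments

print(f"{change_failure_rate(3, 40):.1%}")  # 7.5%
```
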
264