Main Flashcards
What is structured data
Data stored with high degree of organisation
SQL, CSV
Are tags and elements structured or semi-structured?
Semi-structured
Why use cloud databases?
Ease of access
Faster time to market
Reduce risks
Lower costs
Scalability
Disaster recovery
In a SQL Database each row is identified via a
primary key
Statements for inserting, retrieving, updating and deleting data in relational databases are made by queries, which are written in
SQL
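A minimal sketch of the four statement types, using Python's built-in sqlite3 module with an in-memory database. The table name, column names and sample values are assumptions made up for illustration.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Insert a row
conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Ada"))
# Update it
conn.execute("UPDATE customers SET name = ? WHERE id = ?", ("Ada Lovelace", 1))
# Retrieve it: the primary key identifies exactly one row
row = conn.execute("SELECT id, name FROM customers WHERE id = ?", (1,)).fetchone()
# Delete it
conn.execute("DELETE FROM customers WHERE id = ?", (1,))
remaining = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

Here `row` holds the updated record and `remaining` counts the rows left after the delete.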
Advantages of Relational Databases (3)
Simple model and queries
Data accuracy (non-repetitive)
Data integrity
High security
Limitations of Relational Databases (3)
- Difficult to maintain
- Cost to set up and maintain
- Large physical memory required
- Lack of scalability
- Complexity of structure
- Decrease in performance over time
What use cases widely make use of relational databases? (2)
- Store financial records of the whole industry
- Keep track of inventory
- Hold customer and supplier information
- Keep track of customer orders
- Keep record on employees
NoSQL is what type of database
Non-relational
NoSQL stores data in
Documents
Unique properties of NoSQL vs SQL
Flexible data models (post deployment)
Handle huge volumes of rapidly changing unstructured data
What would this look like in NoSQL?
Advantages of NoSQL databases
Scale-out architecture - handle large volumes of data
Store structured, unstructured, semi-structured
Easily update schemas
Big Data
Disadvantages of NoSQL databases
Lack of standardization
Lack of cross-platform support
Security
Data consistency
Types of NoSQL databases
What is a graph database
Use a graph to define relationships between stored data points
Store and navigate relationships
Give examples of graph databases
Neo4j, GraphDB
Key-value databases
Use a simple key-value method to store data
Store data as a collection of key-value pairs
Key serves as a unique identifier
What database stores content by columns rather than rows?
Wide-column database
Advantages of a wide-column database
Big data
Examples of wide-column databases
Google Cloud Bigtable, HBase
Explain visual difference between row and column store
SQL and NoSQL Scaling
SQL = Vertical
NoSQL = Horizontal
Vertical scaling refers to
Adding more resources to your server as demand increases.
Existing code need not change
Horizontal scaling
Adding more servers as demand increases
Vertical vs Horizontal: Downtime
Longer downtime on Vertical
Less downtime on Horizontal
Vertical has a single point of failure
Vertical vs Horizontal: Message passing
Easy Data sharing and message sharing on Vertical
Complex data sharing and message sharing on horizontal
Vertical vs Horizontal: Complexity
Horizontal increases complexity
2 types of database consistency models
ACID
BASE
ACID stands for
Atomicity
Consistency
Isolation
Durability
ACID is used in ___ databases
SQL
BASE stands for
Basically Available - spread across nodes
Soft state = due to lack of immediate consistency, values may change
Eventually consistent - eventually reach consistent state
ACID v BASE
Difference between strong and eventual consistency
Strong consistency = consistent the moment something is sent
Eventual consistency = consistent after some time, as updates propagate
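The difference can be sketched with a toy store that has one primary copy and one replica. This is only an illustration under simplifying assumptions: the class and method names are hypothetical, and real systems propagate updates over a network rather than through an explicit `propagate()` call.

```python
class ReplicatedStore:
    """Toy model: one primary copy plus one lagging replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []          # updates not yet propagated

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def read_strong(self, key):
        # Strong read: always sees the latest write
        return self.primary.get(key)

    def read_eventual(self, key):
        # Eventual read: may return stale data until propagation happens
        return self.replica.get(key)

    def propagate(self):
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()
```

After a `write`, `read_strong` returns the new value immediately, while `read_eventual` only returns it once `propagate()` has run.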
Eventual Consistency: server to client flows
Strong Consistency: server to client flows
What is CAP theorem / Brewer’s theorem related to?
Impossible for a distributed system to provide certain characteristics all at once
What guarantees does CAP theorem / Brewer’s theorem refer to?
- Consistency - all clients see the same view of the data, even after updates
- Availability - all clients can find a replica of the data set, even in case of partial node failure
- Partition tolerance - system continues to work in case of network failure
Why is CAP theorem called CAP?
Acronym
Consistency
Availability
Partition tolerance
NoSQL schema type?
Dynamic schema (SQL uses a predefined schema)
What database type is preferred for large amounts of data?
NoSQL
Where a traditional web application might use PHP and MySQL, a Google Cloud website using Apps may use
App Engine
Datastore
GQL stands for
Google Query Language
Firebase
- Cloud hosted Real-time database
- Data is stored as JSON and synchronized in realtime to every connected client
- Automatically receive updates with the newest data
- Support: iOS, Android, Web, REST API, C++, Unity and Admin
Setup
Performance v Scalability
Scalability is number of nodes
Performance is how effective each node is
Scaling up/down
Vertical scaling
Scaling out/in
Horizontal scaling
On a simple level the “Horizontal Scaling Compute Pattern” is achieved by
Adding or releasing compute nodes
Is Horizontal Scaling Compute Pattern reversible
Yes
What controls the Horizontal Scaling Compute Pattern?
Cloud Platform Management Systems
Use cases for Horizontal Scaling Compute Pattern
Cost efficient scaling required
Application capacity requirements exceed capacity of largest node
Variable requirements
Minimal downtime
What pattern fits the following use cases?
Cost efficient scaling required
Application capacity requirements exceed capacity of largest node
Variable requirements
Minimal downtime
Horizontal Scaling Pattern
What are some caveats of Horizontal Scaling Compute Pattern
- Efficient utilization of resources
- Operational efficiency
What is:
The measure of how a module depends on other modules?
Coupling
Difference between coupling and cohesion
Cohesion - within the same module
Coupling - between modules
Types of coupling
Tightly - Many dependencies
Loosely - some dependencies
Uncoupled
Limitations of tightly coupled
Increased complexity over time
Reduces scalability, portability
If service fails, entire app fails
Advantages of loosely coupled
A failure in one component does not cascade
System is more resilient
Graceful failure
What design pattern allows you to achieve decoupling?
Queue-centric workflow pattern
Queue provides a good way of ….
front end and backend decoupling
FIFO stands for
First in First Out
In a Queue-centric workflow the name of the bus is
Message Queue
Is a queue synchronous or asynchronous
Asynchronous
Are queues reliable?
Yes (e.g. triplicate nodes, multiple workers etc)
What components are the traditional “back end” components of a queue
Worker role
Can multiple workers read from a service bus queue
Usually yes
Can multiple “front end” instances feed to a single queue
Yes
What pattern would the following use cases suggest?
Work is time consuming
Work requires external service
Work is resource intensive
Work benefits from rate levelling
Queue-centric workflow pattern
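A minimal sketch of the queue-centric workflow, using Python's standard queue and threading modules in place of a cloud message queue. The function name, the squaring "work" and the sentinel convention are assumptions for illustration only.

```python
import queue
import threading

def run_workflow(jobs, num_workers=2):
    """Front end enqueues jobs; worker threads dequeue and process them."""
    q = queue.Queue()              # FIFO message queue
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = q.get()
            if job is None:        # sentinel: no more work for this worker
                q.task_done()
                break
            with lock:
                results.append(job * job)   # stand-in for slow back-end work
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for job in jobs:               # the front end only enqueues and moves on
        q.put(job)
    for _ in threads:
        q.put(None)                # one sentinel per worker
    q.join()                       # wait until every item has been processed
    for t in threads:
        t.join()
    return sorted(results)
```

The front end never calls the workers directly, so the number of workers can change without touching the enqueueing code.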
Queues help to enable tiers to..
Scale independently
What pattern aims to:
Optimize resources AND
Minimize Human Intervention
Auto scaling
How does an auto scaling pattern work
Continuous monitoring of resources
Templates
Deploy new resources automatically
Is vertical scaling part of the auto scaling pattern?
No
Types of auto scaling
Reactive
Predictive - machine learning forecasts load
Scheduled - user defined
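A reactive rule can be sketched as a function that looks at a monitored metric and returns the node count to aim for. The thresholds, limits and parameter names below are hypothetical, not taken from any particular cloud platform.

```python
def desired_nodes(current, cpu_percent, scale_out_at=75, scale_in_at=25,
                  min_nodes=1, max_nodes=10):
    """Reactive rule: add a node when busy, release one when idle."""
    if cpu_percent > scale_out_at:
        return min(current + 1, max_nodes)   # scale out, capped
    if cpu_percent < scale_in_at:
        return max(current - 1, min_nodes)   # scale in, floored
    return current                           # within the target band
```

A scheduled rule would instead pick the node count from the time of day, and a predictive rule would feed a forecast into the same kind of decision.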
Benefits of auto scaling pattern
Lower cost
Automation
Service availability
Reliable performance levels
Improved fault tolerance
What types of cloud patterns are to do with scalability?
Horizontal Scaling Compute Pattern
Queue-centric workflow pattern
Auto-scaling pattern
What types of patterns are to do with Eventual Consistency?
MapReduce Pattern
Database Sharding Pattern
What is the logic for database sharding pattern?
One database can’t handle all of the data
Split data across multiple databases
In database sharding pattern each database node is a
Shard
In a database sharding pattern, which component is responsible for maintaining knowledge of where each piece of data is kept
Shard Map Manager
Caveats of sharding
Complex for implementation
Unbalanced Shard
What is an unbalanced shard?
Shard gets too big compared to peer shards
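A minimal sketch of the idea, assuming hash-based routing: a small class plays the role of the shard map and sends each key to one of several in-memory shards. The class and method names are hypothetical, and real shard maps often use ranges or lookup tables instead of a hash.

```python
import hashlib

class ShardMap:
    """Hypothetical shard map: routes each key to one of N shards."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def shard_for(self, key):
        # Stable hash, so the same key always maps to the same shard
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self.shard_for(key)].get(key)

    def sizes(self):
        # Comparing sizes is one way to spot an unbalanced shard
        return [len(shard) for shard in self.shards]
```

Calling `sizes()` after loading data gives a quick view of whether one shard has grown much larger than its peers.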
MapReduce is a…
Programming model and associated implementation for big data sets
What type of data is MapReduce pattern used for
Big data
Parallel, distributed algorithm
How is MapReduce pattern implemented
Cluster
Many nodes working in parallel on different parts of the data
Visualize MapReduce at a very high level
Splitters
Map functions
Reduce functions
Outputs
Two main phases of MapReduce Pattern
Map
Reduce
What phase is between Map & Reduce in MapReduce pattern
Shuffle & Sort
What is the role of the mapper?
Reads data as key/value pairs
Outputs data as key/value pairs
Describe a mapper on colours
Input - different colour objects
Output - key == colour, value == number of objects
Role of shuffle & sort
Bring values with the same key together
Describe the overall MapReduce pattern process
Input
Splitting
Mapping
Shuffling / Sort
Reducing
Output
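The steps above can be sketched in a single process using the colour-counting example. This is only an illustration: a real MapReduce platform runs the splits on many nodes in parallel, and the function names here are assumptions. The mapper emits one `(colour, 1)` pair per object, and the reducer sums them into the per-colour count.

```python
from collections import defaultdict

def mapper(item):
    # Emit (key, value) pairs: one (colour, 1) per object
    return [(item, 1)]

def shuffle_and_sort(pairs):
    # Group values by key, then order the groups by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def reducer(key, values):
    # Collapse all values for one key into a single result
    return (key, sum(values))

def map_reduce(items, num_splits=2):
    # Splitting: divide the input into chunks (processed in turn here)
    splits = [items[i::num_splits] for i in range(num_splits)]
    # Mapping: apply the mapper to every item of every split
    mapped = [pair for split in splits for item in split for pair in mapper(item)]
    # Shuffling / sorting, then reducing
    return dict(reducer(key, values) for key, values in shuffle_and_sort(mapped))
```

For example, `map_reduce(["red", "blue", "red"])` counts two red objects and one blue.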
What was the MapReduce pattern designed for?
Processing large amounts of data
Example of MapReduce platform
Hadoop
Does MongoDB implement automatic sharding?
Yes
Patterns related to multitenancy and commodity hardware
Busy signal pattern
Node failure pattern
Patterns related to network latency
Colocate pattern
Valet key pattern
CDN pattern
Multisite deployment pattern
Multi-tenant
Same application, different databases
How the different databases are enforced depends **
Adv of Multi Tenancy
Cost savings
Upgrades are easier (but more impact if goes wrong)
Types of multi-tenant models
Single database - shared schema
Single database - separate schema
Separate database
Multiple databases, multiple tenants per database, shared schema
Multi-tenant: Single Database, Shared Schema
One database
ID used to distinguish between tenants in the database
Multi-tenant: Single Database, Shared Schema - Advantages
Maintainability
Multi-tenant: Single Database, Shared Schema - Disadvantages
Security
No tenant isolation
Scalability - limited
Multi-tenant: Single Database, Separate Schema
Separate schema per tenant
Multi-tenant: Single Database, Separate Schema - Advantages & Disadvantages
Advantages:
Maintainability
Disadvantages:
Security & Scalability
Multi-tenant: Separate database + Advantages and Disadvantages
Each tenant has their own database
Highest level of isolation
A hybrid of single database, shared schema and separate database is
Multiple Databases, Multiple Tenants Per Database, Shared Schema
Advantages of Multiple Databases, Multiple Tenants Per Database, Shared Schema
Scalability
Commodity hardware refers to
Cheap, standardized servers
This focuses on how an application should react when a cloud service responds to a programmatic request with a busy signal rather than success
Busy Signal Pattern
Busy Signal Pattern
How application responds to busy signal
Handles transient failures
Types of faults
Transient
Intermittent
Permanent
What is explicit throttling
Protects servers against over-utilization of services
Throttling limit set at app, resource or API level
HTTP codes for throttling
HTTP status code 429 (“Too Many Requests”) or 503 (“Service Unavailable”, i.e. server too busy)
A limit is applied to the number of requests per second that the users from any one tenant can submit.
Explicit throttling
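A per-tenant limit of this kind can be sketched as a fixed-window counter that answers with status 429 once a tenant exceeds its quota for the current second. The class name and the choice of a fixed window are assumptions. The current time is passed in explicitly so the behaviour is easy to follow.

```python
from collections import defaultdict

class TenantThrottle:
    """Fixed-window limiter: at most `limit` requests per tenant per second."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = defaultdict(int)   # (tenant, second) -> request count

    def handle(self, tenant, now):
        window = (tenant, int(now))      # bucket requests by whole second
        if self.counts[window] >= self.limit:
            return 429                   # Too Many Requests
        self.counts[window] += 1
        return 200
```

This sketch never discards old windows, so a long-running service would also need to expire them.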
Two methods of handling transient faults
Retry
Exponential backoff
What is exponential backoff
Client periodically retries a failed request with increasing delays between requests
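A minimal sketch of retry with exponential backoff. `TransientError` is a hypothetical stand-in for a busy signal such as HTTP 429 or 503, and the default delays are arbitrary. The `sleep` parameter is injectable so the delays can be observed without actually waiting.

```python
import time

class TransientError(Exception):
    """Hypothetical stand-in for a busy signal such as HTTP 429 or 503."""

def retry_with_backoff(operation, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise                          # give up after the last attempt
            # Delay doubles each time: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** attempt)
```

In practice a random jitter is often added to each delay so that many clients do not retry at the same instant.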
What pattern:
How application handles a compute node failure
Node Failure Pattern
RAID is related to what pattern
Node Failure Pattern
Node failure pattern works by
Splitting data into blocks - distributing across nodes
RAID defines mechanism and expected levels of availability
What failure level does node failure pattern allow
N+1
Node failure and queue-centric patterns can be combined to
Stop partially completed work being lost
Busy signal pattern can be used to enable …
retries
Causes of latency
Distance
Transmission media
Routers - processing delay
Storage delay
Idea behind co-locate pattern
Distance adds latency
Locate nodes together
Impact of colocate pattern
Cost optimization
Scalability
User Experience
Context for colocate pattern
One node makes frequent use of another node
How do you use colocate pattern
Set region
What is the focus of the valet key pattern?
Efficiently using cloud storage services with untrusted clients
How does a valet key pattern work
Application issues an ephemeral (equivalent) key - limited scope and time
Client uses the key to access some resource directly
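One way to sketch a valet key is an HMAC-signed token that names a single resource and an expiry time. The secret, the token layout and the function names are assumptions for illustration. Cloud storage services offer their own signed-URL or access-signature mechanisms, which should be used in practice.

```python
import hashlib
import hmac

SECRET = b"demo-secret"   # assumption: known only to the application and storage

def issue_key(resource, expires_at):
    """Application side: sign the resource name and expiry time."""
    message = f"{resource}:{expires_at}".encode("utf-8")
    signature = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return f"{resource}:{expires_at}:{signature}"

def check_key(key, resource, now):
    """Storage side: accept only an unexpired key for this exact resource."""
    try:
        key_resource, expires_at, signature = key.rsplit(":", 2)
        expires = int(expires_at)
    except ValueError:
        return False                     # malformed key
    expected = issue_key(key_resource, expires).rsplit(":", 1)[1]
    return (hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
            and key_resource == resource     # limited scope
            and now < expires)               # limited time
```

The client never sees `SECRET`. It only holds a key that works for one resource until the expiry time passes.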
CDN pattern
CDN store cached content on edge servers
Name of location at edge on CDN pattern where users first hit
Point of Presence (PoP)
Pattern - focuses on deploying a single application to more than one data center
Multisite Deployment Pattern
Reason for multisite deployment pattern
Deploying to multiple data centres helps reduce network latency
Routing a client to nearest data centre
In multisite deployment pattern, data centres stay in ____
sync
API
Intermediate software agent that allows dependent applications to communicate with each other
Types of APIs
REST-Based
SOAP-Based
GraphQL-Based
REST stands for
Representational State Transfer
Use JSON for data formatting
REST-Based API
Properties of JSON
Lightweight text based data interchange format
Language independent
Most programming languages can easily read it and instantiate
Easy to understand and manipulate
Is JSON ordered?
No (object members are unordered; arrays keep their order)
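A short sketch using Python's standard json module shows the round trip between an in-memory object and JSON text, and how ordering is treated. The sample record is made up for illustration.

```python
import json

record = {"id": 1, "name": "Ada", "tags": ["admin", "dev"]}   # illustrative data

text = json.dumps(record)        # Python object -> JSON text
restored = json.loads(text)      # JSON text -> Python object

# Object member order carries no meaning, but array order does.
same_object = json.loads('{"a": 1, "b": 2}') == json.loads('{"b": 2, "a": 1}')
same_array = json.loads('[1, 2]') == json.loads('[2, 1]')
```

Here `same_object` is true and `same_array` is false, so only objects are treated as unordered.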
SOAP stands for
Simple Object Access Protocol (SOAP)
SOAP architecture
Function Driven Architecture
Uses XML schema data format
Scalability - SOAP v REST
REST scales more easily and efficiently than SOAP
XML properties
eXtensible Markup Language
Human- and machine-readable (versus JSON, which is more machine-readable)
GraphQL
Query Language
A more flexible approach to data-intensive operations in API management
Who developed GraphQL
Facebook
The image shown for a REST API might translate to what in GraphQL
Target specific fields in the query - more like an SQL query in some ways
Functional programming is a programming _______
paradigm
Functional programming is a way of writing applications using
only pure functions and immutable values
The following terminology is associated with what:
Immutable data
Closure
First-class function
Modularity
Referential Transparency
Functional Programming
Referential transparency
Functional programs should perform operations as if for the first time: the same inputs always give the same result
Closure
Inner function which can access variables of parent functions
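A minimal closure sketch, with hypothetical names: the inner function keeps access to `factor` from the enclosing call even after that call has returned.

```python
def make_multiplier(factor):
    def multiply(x):
        # `factor` is a variable of the parent function, captured by the closure
        return x * factor
    return multiply

double = make_multiplier(2)
triple = make_multiplier(3)
```

Each returned function is also pure: calling `double(5)` always gives the same result and changes nothing outside itself.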
Benefits of functional programming
Easier to test
Parallel processing
Better modularity
Limitations of functional programming
Immutable values & recursion might lead to reduction in performance
Writing pure functions causes a reduction in readability
Functional Programming fits with what cloud computing architecture
Serverless
Reasons for serverless
No wasted resources
Low management overhead
Scalable
Cost effective
Limitations of serverless architecture
Debugging
Security
Vendor limitations
Latency
Serverless architecture components on common cloud platforms
Open source: Apache OpenWhisk
AWS:
API Gateway
Lambda
Google Cloud Functions
Microsoft Azure Functions
FaaS v Serverless
FaaS is a subset of serverless
Micro-services are ..
small, independent and loosely coupled services
Benefits of Micro-services
Agility
Small-focused teams
Fault isolation
Data isolation
Deployment of micro-services
Services can be deployed independently
Challenges of micro-services
Complexity
Lack of governance - different languages and frameworks at each level
Network congestion and latency
Data integrity - each micro-service is responsible for its own data persistence
Types of Google Cloud Functions
Foreground functions (Synchronous) - directly invoked via HTTP
Background functions (Asynchronous) - invoked via an event.
What two Google services can currently invoke background/asynchronous cloud functions
Google Cloud Storage
Pub/Sub events
Storage in Google Cloud generates what type of event
Object Change Notification (OCN)
Service Mesh
How different parts of an application share data with one another
Dedicated infrastructure layer
What type of service architecture is associated with a service mesh
Microservice
What type of testing is related to:
- Performance testing
- Security testing
- Usability testing
- Compatibility testing
Non-functional testing
What type of testing is related to:
- Unit testing
- Integration testing
- System testing
- Acceptance testing
Functional testing
Functional testing
Verifies each function
Types of functional testing
Unit
Integration
System
Acceptance
Advantages of unit tests
Capture early
Reduce bugs
Modular
Integration testing
Tests multiple components in single test
Approaches to integration testing
Big Bang
Bottom Up
Top down
Sandwich
Integration testing - Big bang
Everything tested all at once
Efficient
Good for small systems
Integration testing - Bottom-up
Lower-level modules are tested first
Easier fault location
Critical modules tested last
No prototyping possible
Integration testing - Top-down approach
Higher level modules tested first
early prototype possible
Critical modules tested on priority
Integration testing - divide and conquer method
Sandwich testing approach
What testing evaluates system specification?
System testing
Types of system testing
Load testing
Stress testing
Acceptance testing
Present product to users
Usability testing involves..
Setting up a user-friendly group and following:
1. Plan the test
2. Recruit participants
3. Prepare materials
4. Set up environment
5. Conduct the test
6. Analyze data
7. Report results
Code coverage refers to
How much of the code is exercised by tests
Quantitative
Code Coverage Methods
Statement Coverage
Decision Coverage
Branch Coverage
Condition Coverage
Statement Coverage
Calculation of the number of statements in the source code which have been executed
Formula for statement coverage
(number of executed statements / total statements) * 100
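A small worked example of the formula, using a made-up function with four statements in its body. The statement counts are tallied by hand in the comments rather than measured by a coverage tool.

```python
def classify(x):
    label = "non-positive"    # statement 1
    if x > 0:                 # statement 2
        label = "positive"    # statement 3
    return label              # statement 4

def statement_coverage(executed, total):
    """(number of executed statements / total statements) * 100"""
    return executed / total * 100

# A single test, classify(-1), runs statements 1, 2 and 4 but skips 3.
only_negative = statement_coverage(3, 4)
# Adding a second test, classify(5), also runs statement 3.
both_tests = statement_coverage(4, 4)
```

With only the negative input the coverage is 75%, and adding the positive input brings it to 100%.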
What is covered by statement coverage?
Unused statements
Dead code
Unused branches
Missing statements
Decision Coverage
Ensuring that each branch of every possible decision point is executed at least once
Decision Coverage Formula
Number of decision outcomes exercised / Total number of decision outcomes
Branch Coverage Formula
Number of executed branches/ Total number of branches
Condition Coverage
Check individual outcomes for each logical condition
Python Test Runner
A program that runs the tests
unittest
pytest
Nose
Twisted
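A minimal sketch of a test case and a programmatic run with the built-in unittest runner. The `add` function and the test names are hypothetical.

```python
import unittest

def add(a, b):                      # hypothetical function under test
    return a + b

class AddTests(unittest.TestCase):
    def test_adds_integers(self):
        self.assertEqual(add(2, 3), 5)

    def test_adds_negatives(self):
        self.assertEqual(add(-1, -1), -2)

def run_tests():
    # Load the test case and hand it to the text-based test runner
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

From the command line the same tests are usually discovered and run with `python -m unittest`, and pytest can also collect `unittest.TestCase` classes.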