5. Implement High Availability Flashcards
RPO
Recovery Point Objective
The maximum amount of data loss that is acceptable in the event of a failure, usually measured as a window of time
RTO
Recovery Time Objective
The length of time an application can be unavailable before service must be restored
MTBF
Mean Time Between Failures
MTTR
Mean Time To Recover
Failover Clustering
Used for applications and services such as SQL Server and Exchange
Network Load Balancing
Used for network-based services such as Web, FTP and RDP servers. Allows configuring two or more servers as a single virtual cluster.
NLB Unicast Mode
- Cluster adapters for all nodes are assigned the same MAC address.
- Can cause subnet flooding since all packets are sent to all ports on the switch.
- Node-to-node communication over the cluster adapter is not possible (a second NIC is needed for it)
NLB Multicast Mode
- Cluster adapters for all nodes get their own MAC address
- Nodes share a multicast MAC address (derived from the cluster IP address)
NLB IGMP Multicast
Similar to multicast, but prevents switch flooding because MAC traffic only goes to ports of NLB cluster
IGMP
Internet Group Management Protocol
NLB Stop
Cluster stops immediately, all active connections are killed
NLB Drainstop
Cluster stops after answering all current connections, no new connections are accepted.
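The NLB operations above map to the NetworkLoadBalancingClusters PowerShell module. A minimal sketch; the adapter name, host names, and IP address are placeholders:

```powershell
# Create an NLB cluster on this host's "Ethernet" adapter
New-NlbCluster -InterfaceName "Ethernet" -ClusterName "WebFarm" `
    -ClusterPrimaryIP 192.168.1.100 -OperationMode Multicast

# Join a second host to the cluster
Add-NlbClusterNode -NewNodeName "WEB02" -NewNodeInterface "Ethernet"

# Drainstop: answer current connections, refuse new ones, then stop
Stop-NlbCluster -Drain
```

Omitting `-Drain` stops the cluster immediately and kills all active connections.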
Hyper-V Replica
Asynchronously replicates VMs from a primary site to a secondary site
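As a sketch of enabling Hyper-V Replica with the Hyper-V PowerShell module (server names, port, and paths are examples):

```powershell
# On the Replica server: allow it to receive replication traffic
Set-VMReplicationServer -ReplicationEnabled $true `
    -AllowedAuthenticationType Kerberos `
    -ReplicationAllowedFromAnyServer $true `
    -DefaultStorageLocation "D:\Replicas"

# On the primary server: enable replication for a VM and seed it
Enable-VMReplication -VMName "APP01" -ReplicaServerName "HV02" `
    -ReplicaServerPort 80 -AuthenticationType Kerberos
Start-VMInitialReplication -VMName "APP01"
```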
Extended (Chained) Replication
Host 1 > Host 2 > Host 3. The extended hop (Host 2 > Host 3) does not support application-consistent replication.
NLB Affinity Types
- None
- Single
- Class C
NLB Affinity: None
NLB does not assign clients to a node, all requests can go to any node
NLB Affinity: Single
Single Affinity allows a client to be assigned to a single node. Best intranet performance.
NLB Affinity: Class C
NLB links clients with a specific node based on the Class C part of the client’s IP address. Best internet performance.
NLB Cluster Requirements
- Cluster adapters must support only TCP/IP
- Servers in the cluster must have static IP addresses
Test Failover
Verifies a replica can start in the secondary site
Planned Failover
Used during planned downtime. The primary VM is powered off, remaining changes are replicated, and the replica is powered on. Replication reverses direction so changes sync back to the original primary, which resumes its normal role after failback.
Unplanned Failover
Initiate only if the primary machine is offline; changes since the last replication point may be lost.
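The three failover types above correspond to `Start-VMFailover` switches; a sketch with a hypothetical VM name:

```powershell
# Test failover (on the Replica server): starts a disposable test copy
Start-VMFailover -VMName "APP01" -AsTest
Stop-VMFailover -VMName "APP01"              # discard the test VM when done

# Planned failover: prepare on the primary, then fail over on the replica
Start-VMFailover -VMName "APP01" -Prepare    # on the primary (VM powered off)
Start-VMFailover -VMName "APP01"             # on the replica
Set-VMReplication -VMName "APP01" -Reverse   # reverse replication direction
Start-VM -VMName "APP01"
```

An unplanned failover is the same `Start-VMFailover` on the replica without the `-Prepare` step, since the primary is unreachable.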
DHCP Guard
Drops DHCP server messages from unauthorized VMs pretending to be a DHCP server.
Router Guard
Router Guard drops advertisement and redirection packets from unauthorized VMs pretending to be routers. Similar to DHCP Guard.
Protected Network
Virtual machine will be moved to another cluster node if a network disconnection is detected.
Port Mirroring
Allows VM network traffic to be monitored by copying packets and forwarding to another VM for monitoring
NIC Teaming
Place NICs in a team in the guest operating system to aggregate bandwidth and provide redundancy. Useful if teaming is not configured in management OS
Device Naming
Causes the name of the network adapter to be propagated into supported guest OSes
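The adapter-level protections above (DHCP Guard, Router Guard, Port Mirroring, Device Naming) are all settings on `Set-VMNetworkAdapter`; a sketch with example VM names:

```powershell
# Drop rogue DHCP server and router advertisements from this VM
Set-VMNetworkAdapter -VMName "APP01" -DhcpGuard On -RouterGuard On

# Mirror this VM's traffic to a monitoring VM
Set-VMNetworkAdapter -VMName "APP01" -PortMirroring Source
Set-VMNetworkAdapter -VMName "MONITOR01" -PortMirroring Destination

# Propagate the adapter name into supported guest OSes
Set-VMNetworkAdapter -VMName "APP01" -DeviceNaming On
```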
VM Checkpoints
Point-in-time capture of a VM's state, data, and hardware configuration; not a replacement for backups
Software Load Balancing
Allows having multiple servers hosting same virtual networking workload in a multitenant environment.
Hyper-V Live Migration
Transfers running VM from one host to another with no downtime.
Hyper-V Quick Migration
Requires pausing VM, saving VM, moving VM and starting again
Hyper-V Move VM
Power off VM and copy to another host and then power on.
CredSSP for Live Migration
Requires signing in to the source server (e.g., console or RDP) but no constrained delegation. The default option.
Kerberos for Live Migration
Allows avoiding having to sign into Source server but requires constrained delegation to be set up
Hyper-V Live Migration Performance: TCP/IP
Memory of VM is copied over the network to destination over TCP/IP
Hyper-V Live Migration Performance: Compression
Memory is compressed and then copied to destination over TCP/IP
Hyper-V Live Migration Performance: SMB
Memory is copied to destination over SMB connection. SMB Direct is used if NICs at source and destination have Remote Direct Memory Access enabled
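The three performance options above are selected per host with `Set-VMHost`; a sketch, to be run on each Hyper-V host:

```powershell
# Enable inbound/outbound live migrations on this host
Enable-VMMigration

# Choose the transfer method: TCPIP, Compression, or SMB
Set-VMHost -VirtualMachineMigrationPerformanceOption SMB

# Pick the authentication protocol (CredSSP is the default)
Set-VMHost -VirtualMachineMigrationAuthenticationType Kerberos

# Limit concurrent live migrations
Set-VMHost -MaximumVirtualMachineMigrations 2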
Hyper-V Live Migration Requirements
- Administrator account (Local/Domain)
- Hyper-V role installed; VM configuration version 5 or later
- Same domain
- Management tools installed
Hyper-V Shared Nothing Live Migration
Migration between hosts not in a cluster. Requires Kerberos constrained delegation configuration on each server.
Hyper-V Storage Migration
Allows migrating a running VM from one storage device to another without downtime
Storage Migration Requirements
VMs must use virtual hard disks for storage; pass-through physical disks cannot be migrated
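Both migration types above are driven by two cmdlets; a sketch with hypothetical host and path names:

```powershell
# Shared-nothing live migration: move a running VM and its storage to another host
Move-VM -Name "APP01" -DestinationHost "HV02" `
    -IncludeStorage -DestinationStoragePath "D:\VMs\APP01"

# Storage migration only: relocate a running VM's disks, no host change
Move-VMStorage -VMName "APP01" -DestinationStoragePath "E:\VMs\APP01"
```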
NLB Hardware Requirements
- All hosts on same subnet
- No limit on NICs
- All NICs in the cluster must use the same mode (all unicast or all multicast)
- If using unicast mode, the NIC handling client-to-cluster traffic must allow its MAC address to be changed
What is a failover cluster?
All of the clustered application or service resources are assigned to one node/server in the cluster. If the node/server goes offline, another node/server spins up the resource and all traffic to the cluster is automatically sent to the new live node.
Examples of commonly clustered applications: SQL and Exchange
Examples of commonly clustered services: Hyper-V
In what editions of Windows Server 2016 is failover clustering available?
- Datacenter
- Standard
- Hyper-V Server
Failover Clustering Server Requirements
- All server hardware must be Server 2016 certified
- All of the “Validate a Configuration Wizard” tests must pass
Storage Failover Clustering Requirements
- Disks available must be Fibre Channel, iSCSI or Serial Attached SCSI
- If using iSCSI, each node must have a NIC dedicated to iSCSI connectivity
- Multipath software must be based on Microsoft’s Multi-path I/O (MPIO)
- Storage drivers must be based on Storport.sys
- Drivers and firmware for storage controllers of each node should be identical
- All storage hardware should be certified for Server 2016
Failover Clustering Network Requirements
- Nodes should be connected to multiple networks for redundancy
- NICs should be same make, same drivers & firmware
- Network components should be certified for Server 2016
Failover Cluster Network Connections
- Public - client to cluster
- Private - node to node
Cluster Domain Scenarios
- Single-domain clusters
- Multi-domain clusters
- Workgroup clusters
Site-Aware Clustering
Also known as stretch clustering or geoclustering, this is the practice of having clusters span geographic locations. Server 2016 clusters can be configured to be site-aware, allowing administrators to set up and control cross-site heartbeats for optimal configuration.
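Site awareness is configured through fault domains of type Site; a sketch with example site and node names:

```powershell
# Define sites as fault domains and place nodes in them
New-ClusterFaultDomain -Name "Seattle" -Type Site
New-ClusterFaultDomain -Name "Denver" -Type Site
Set-ClusterFaultDomain -Name "NODE1" -Parent "Seattle"
Set-ClusterFaultDomain -Name "NODE2" -Parent "Denver"

# Prefer one site for placement and failover
(Get-Cluster).PreferredSite = "Seattle"
```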
Cluster Quorum
Consensus of the status of each of the nodes in the cluster. Quorum must be achieved in order for the cluster to come online by getting a majority of the available votes. Best practice is to have total available quorum votes be an odd number.
The Four Quorum Modes
- Node Majority (no witness)
- Node and Disk Majority (disk witness)
- Node and File Share Majority (file share witness)
- No Majority (disk witness only)
Disk Witness
- 512 MB minimum
- Dedicated to cluster
- Must pass validation tests for storage
- NTFS or ReFS formatting
Use when all nodes can see the disk and are using shared storage.
File Share Witness
- 5 MB free space minimum
- File share must be dedicated to the cluster
Use for multi-site disaster recovery; the witness must be an SMB file share on a file server.
Cloud Witness
New to Server 2016, cloud witness leverages the use of Microsoft Azure to have an “always-on in any location” quorum vote.
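Each witness type above maps to a `Set-ClusterQuorum` switch; a sketch in which the disk name, share path, and Azure storage account details are placeholders:

```powershell
Set-ClusterQuorum -NodeMajority
Set-ClusterQuorum -NodeAndDiskMajority "Cluster Disk 1"
Set-ClusterQuorum -NodeAndFileShareMajority "\\FS01\Witness"

# Cloud witness (Server 2016): quorum vote held in an Azure storage account
Set-ClusterQuorum -CloudWitness -AccountName "<StorageAccountName>" -AccessKey "<AccessKey>"
```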
Dynamic Quorum Management
Introduced in Server 2012, quorum votes for nodes are automatically added/removed as nodes join or leave the cluster. This is enabled by default.
“Validate a Configuration Wizard”
Runs four types of tests:
- Software and hardware inventory
- Network tests
- Storage tests (take cluster storage offline)
- System configuration tests
Validates whether or not a cluster is supported by Microsoft. Report is stored in %windir%\Cluster\Reports
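The wizard's validation and cluster creation can also be done from PowerShell; a sketch with hypothetical node and cluster names:

```powershell
# Install the feature, validate the candidate nodes, then create the cluster
Install-WindowsFeature Failover-Clustering -IncludeManagementTools
Test-Cluster -Node "NODE1","NODE2"    # runs the validation tests, writes an HTML report
New-Cluster -Name "CLUSTER1" -Node "NODE1","NODE2" -StaticAddress 192.168.1.50
```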
Actions on Nodes in a Cluster
- Pause node - prevents failing over to the node; useful for maintenance/troubleshooting
- Evict node - irreversible; kicks the node out of the cluster. The node can be re-added from scratch; useful if the node is damaged beyond repair.
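The pause and evict actions above correspond to these FailoverClusters cmdlets; node name is an example:

```powershell
# Pause (drain) a node for maintenance, then resume it
Suspend-ClusterNode -Name "NODE2" -Drain
Resume-ClusterNode -Name "NODE2" -Failback Immediate

# Evict a node from the cluster (irreversible)
Remove-ClusterNode -Name "NODE2"
```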
Built-in Roles and Features that can be Clustered
- DFS Namespace Server
- DHCP Server
- Distributed Transaction Coordinator (DTC)
- File Server
- Generic Application
- Generic Script
- Hyper-V Replica Broker
- iSCSI Target Server
- iSNS Server
- Message Queueing
- Virtual Machine
Cluster Failover Process
- Cluster service takes all of the resources in the role offline, in reverse order of the dependency hierarchy
- Cluster service transfers the role to the node that is listed next on the application's list of preferred host nodes
- Cluster service attempts to bring all of the role's resources online, starting at the bottom of the dependency hierarchy
Steps assume live migration is not being used
Cluster Failback Settings
Determine when, if ever, a role/application should fail back to the primary cluster node when it becomes available. Default behavior is “Prevent Failback” but it can be scheduled and set by the administrator.
Cluster Dependency Viewer
Gives visual report of how the roles/services for the clustered resource are dependent on other roles/services in the hierarchy.
What are Resources in a Cluster?
The smallest configurable parts in a cluster: physical or logical objects such as disks, IP addresses, and file shares. Resource policies can be configured to determine how resources respond when a failure occurs and how resources are monitored for failures.
Resource Policy Options
- If Resource Fails, Do Not Restart
- If Resource Fails, Attempt Restart on Current Node
- If Restart Is Unsuccessful, Fail Over All Resources In This Service or Application
- If All The Restart Attempts Fail, Begin Restarting Again After The Specified Period (hh:mm)
Cluster Shared Volumes
Cluster Shared Volumes (CSV) enable multiple nodes in a failover cluster to simultaneously have read-write access to the same LUN (disk) that is provisioned as an NTFS volume.
Cluster-Aware Updating
Allows system updates to be applied automatically while the cluster remains available during the entire update process.
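Cluster-Aware Updating can be triggered ad hoc or registered as a self-updating schedule; a sketch with a hypothetical cluster name:

```powershell
# One-off updating run against the cluster (requires the ClusterAwareUpdating module)
Invoke-CauRun -ClusterName "CLUSTER1" -Force

# Or register the cluster for scheduled self-updating
Add-CauClusterRole -ClusterName "CLUSTER1" -DaysOfWeek Sunday -WeeksOfMonth 2 -Force
```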
Node Fairness (Virtual Machine Load Balancing)
Prevents any one host/node from being overloaded with too many running VMs by automatically redistributing VMs across hosts/nodes according to the desired balance settings.
Scale-Out File Server for Application Data
Utilizing Storage Spaces you can create a Scale-Out File Server with highly available clustered disks which are useful for Hyper-V VM disk storage as well as SQL Server database file storage.
VM Drain on Node Shutdown
Windows Server will automatically attempt to live migrate VMs on a cluster node to another node during a reboot/shutdown.
Global Update Manager Mode
The Global Update Manager is a component of the cluster that ensures that before a change is marked as being committed for the entire cluster, all nodes have received and committed that change to their local cluster database. The GUM is only as fast as the slowest node in the cluster.
New to Server 2016 is Global Update Manager mode which allows you to configure the GUM read-write modes manually to speed up the processing of changes by the GUM.
Hyper-V Replica Broker
Allows VMs in a cluster to be replicated. The Hyper-V Replica Broker keeps track of which nodes VMs reside on and ensures replication is maintained.
Storage Spaces Direct
Storage Spaces Direct uses locally attached drives on servers to create highly available storage. It is conceptually similar to RAID but done at the software level (Windows). Disks on one node are available for use by the whole cluster and parity is maintained on each node in the cluster for highly available storage.
Multiple physical disks together –> Storage Pool
Storage Spaces = Virtual Disks created from Storage Pools
Storage Spaces Direct Hardware Requirements
- 2-16 servers with locally attached SATA, SAS or NVMe drives
- Must have at least two SSDs on each server and at least four additional drives (2 SSD + 4 HDD)
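A sketch of standing up Storage Spaces Direct on a new cluster; node names, volume name, and size are examples:

```powershell
# Create a cluster without shared storage, then enable S2D on the local drives
New-Cluster -Name "S2D-CLUSTER" -Node "NODE1","NODE2","NODE3","NODE4" -NoStorage
Enable-ClusterStorageSpacesDirect

# Carve a resilient volume out of the auto-created storage pool
New-Volume -FriendlyName "VMStore" -FileSystem CSVFS_ReFS `
    -StoragePoolFriendlyName "S2D*" -Size 1TB
```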
Software Storage Bus
New Storage Spaces Direct feature which allows all of the servers to see all of each other’s local drives by spanning the cluster and establishing a software-defined storage structure.