Dell EMC PowerFlex: Networking Best Practices and Design Considerations Flashcards
How many SDSs can a PowerFlex system support?
512
What are the max number of SDCs in a PowerFlex system?
1024
What happens when a component fails in PowerFlex?
MDM initiates an auto-healing process
What is the recommended networking configuration for front and backend PowerFlex networking paths?
recommended to separate back-end traffic from front-end traffic, since it allows for improved rebuild/rebalance performance
What is SDC to SDS traffic?
front-end
includes all writes/reads arriving at or originating from a client
has a high throughput requirement
What is SDS to SDS traffic?
back-end
includes writes mirrored between SDSs, rebalance traffic, rebuild traffic, volume migration
high throughput requirement
What is MDM to MDM traffic?
back-end traffic
relatively lightweight data exchange
does not require the same level of throughput as SDC/SDS traffic
has a very short timeout for quorum exchanges (which occur every 100ms)
What is the network requirement for MDM to MDM traffic?
stable, reliable low latency network
PowerFlex supports the use of one or more networks dedicated to MDM traffic
min 2 10G uplinks per MDM for prod environments - 25G recommended
What is the latency requirement for peer MDM to MDM traffic in a replication use case?
less than 200ms
What are MDMs responsible for besides mapping?
rebalance/rebuild traffic, RCGs, metadata synchronization for replication purposes
What is MDM to SDC traffic?
front-end
primary MDM must communicate with SDCs in the event that the data layout changes
can happen if a node goes offline or volumes are put into an RCG
connection is lazy/asynchronous but still requires a reliable, low-latency network
What is MDM to SDS traffic?
back-end
primary MDM must communicate w/ SDSs to monitor device health/issue rebalance/rebuilds
What is SDC to SDR traffic?
front-end traffic
in volume replication, SDC to SDS traffic is normally routed through the SDR
if a volume is in an RCG, the MDM adjusts the volume mapping presented to the SDC and directs the SDC to issue IO operations to SDRs, which pass them on to the relevant SDSs
high throughput requirement
What is SDR to SDS traffic?
back-end traffic - throughput is proportionate to number of volumes being replicated
when volumes are being replicated there are two subsequent IOs from the SDR to the SDSs on source system
SDR passes volume IO to associated SDS for processing/committal to disk
SDR applies writes to journaling volume
How does PowerFlex recognize journal volumes?
sees them as a normal volume
SDR sends IO to the SDSs whose disks comprise the SP in which the journal volume resides
What is MDM to SDR traffic?
MDMs must communicate w/ SDRs to issue journal interval closures, collect/report RPO compliance and maintain consistency
commands local SDRs to perform journal operations
What is SDR to SDR traffic?
SDRs within source or within target don’t communicate w/ one another
SDRs across systems communicate w/ one another
not latency-sensitive, but round trip time should be less than 200ms
How do SDCs communicate with one another?
SDCs do not communicate w/ one another
can be enforced using private VLANs and network firewalls
What is the fault tolerance rule for PowerFlex software components?
communication between software components should be assigned to at least two subnets on different physical networks
this provides native link fault tolerance and multipathing for all of these components across the assigned subnets
What happens if a link failure is detected by PowerFlex?
almost immediate awareness of issue
dynamically adjusts communications within 2-3 seconds across the subnets assigned to the software components where the link failure occurred
particularly important for SDS - SDS and SDC - SDS communication
What networking protocol does Dell strongly recommend against using with PowerFlex?
STP (spanning tree protocol)
Dell recommends using a non-blocking network design (use of all switch ports concurrently)
What is a two-tier spine leaf topology?
provides a single switch hop between leaf switches and a large amount of bandwidth between endpoints
eliminates oversubscription of uplink ports
How is a spine-leaf topology connected?
each leaf switch is connected to all spine switches
leaf switches don’t need to be directly connected to each other
spine switches don’t need to be connected to other spine switches
Why does Dell recommend a spine-leaf topology in most instances?
scalability to hundreds of nodes
facilitate scale out deployments without needing to rearchitect the solution
allows use of all network links concurrently
When would Dell recommend a flat network topology over a spine leaf?
if existing flat network is being extended or if network is not expected to scale
How is a flat network topology connected?
all the switches are used to connect to hosts - no spine switches
When might a flat network topology be cost prohibitive compared to a spine-leaf?
if you expand beyond a small number of access switches
the additional cross-link ports required could quickly drive up costs
What are latency best practices for PowerFlex sizing?
latency for all SDS-SDC communication should never exceed 1ms network-only round trip time under normal operations
since the lowest response time of most WANs exceeds this, you should not operate PowerFlex clusters across a WAN
What are latency best practices for PowerFlex replication sizing?
latency between peered cluster components whether MDM-MDM or SDR-SDR should not exceed 200ms round trip time
Outside of performance, what is the importance of right-sizing network throughput for PowerFlex?
reduce rebalance/rebuild times
What is the minimum network link throughput for PowerFlex?
10G - 25G recommended
What do all PowerFlex nodes ship with?
4 ports at 25G minimum
What is the relation between SDS and throughput?
SDSs that make up a PD should reside on hardware with equivalent storage/network performance
since total bandwidth of PD will be limited by the weakest link during IO and rebuild/rebalance operations due to wide striping
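The weakest-link effect above can be illustrated with a toy calculation (the throughput figures and the simple min-based model are hypothetical, for intuition only):

```python
# Hypothetical per-node network throughput in a protection domain, in Gb/s.
# Wide striping spreads every volume across all SDSs, so the slowest node
# gates the whole PD rather than just its own share.
node_gbps = [25, 25, 25, 10]

# A naive sum overstates what the PD can sustain
raw_total = sum(node_gbps)                         # 85 Gb/s on paper

# Simple weakest-link model: every node is held to the slowest node's pace
effective_total = min(node_gbps) * len(node_gbps)  # 40 Gb/s in practice

print(f"raw={raw_total} Gb/s, effective~{effective_total} Gb/s")
```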
Why do node types need to be the same in a PD when it comes to throughput?
HCI nodes have lower performance than bare-metal nodes due to virtualization overhead
What is the networking redundancy recommendation for PowerFlex nodes?
each node has at least 2 separate network connections regardless of throughput requirement
What is the rule of network throughput for the appliance as a whole?
throughput to a node should match or exceed the combined maximum throughput of the storage media hosted on the node
How would you calculate the amount of throughput required on a node?
modern media performance is typically measured in megabytes per second, while modern network links are typically measured in gigabits per second
to translate MB/s to Gb/s, first multiply megabytes by 8 to get megabits, then divide megabits by 1,000 to get gigabits
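The conversion above can be captured in a small helper (the drive count and per-drive rating below are illustrative, not from any specific node model):

```python
def mb_s_to_gb_s(megabytes_per_sec: float) -> float:
    """Convert storage throughput (MB/s) to network throughput (Gb/s).

    Multiply by 8 to translate to megabits, then divide by 1,000
    to find gigabits.
    """
    return megabytes_per_sec * 8 / 1000

# Example: a node hosting six drives rated at 500 MB/s each (hypothetical)
drives_mb_s = [500] * 6
total_mb_s = sum(drives_mb_s)              # 3000 MB/s of raw media throughput
required_gb_s = mb_s_to_gb_s(total_mb_s)   # 24.0 Gb/s
print(f"Node needs >= {required_gb_s} Gb/s of network throughput")
```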
What is the typical measure for bandwidth requirement on a node?
read operations
What is important to consider about the networking components for throughput requirements on a node?
verify RAID controller/HBA on node can meet or exceed maximum throughput of underlying storage media
What is important to know about bandwidth in write heavy environments/nodes?
per SDS, a write operation requires 1.5 times the bandwidth/throughput of a read operation, relative to the throughput of the underlying storage
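Applying the 1.5x rule above, a write-heavy sizing check might look like this sketch (the workload figure is illustrative):

```python
def sds_write_network_gb_s(storage_write_mb_s: float) -> float:
    """Estimate per-SDS network throughput (Gb/s) for a write workload.

    Applies the 1.5x rule of thumb for writes relative to the underlying
    storage throughput, then converts MB/s to Gb/s (x8, then /1000).
    """
    WRITE_FACTOR = 1.5
    return storage_write_mb_s * WRITE_FACTOR * 8 / 1000

# Example: 1,000 MB/s of sustained writes hitting an SDS's local media
print(sds_write_network_gb_s(1000))  # 12.0
```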
For HCI nodes what is the priority for network sizing?
VM Performance
must separate storage from other network traffic
What is IP redundancy connectivity on an HCI node?
recommended for MDMs
good for SDSs
good for SDCs if a 3-second IO interruption on failure is acceptable
What is MLAG connectivity on an HCI node?
good for SDCs when minimum IO interrupt on failure needed
ok for SDSs
not recommended for MDMs
When might it be beneficial to isolate front end and backend traffic for the storage network?
in two layer deployments where storage and virtualization/compute teams each manage their own networks
guarantee performance on storage/compute
Why are jumbo frames recommended?
allows more data to be passed in a single Ethernet frame
decreases total number of frames and number of interrupts to be processed by a node
performance benefit of around 10%
What is the difference between IP redundancy and MLAG/LAG connectivity?
LAG - combines ports between end points
IP - each physical port has its own IP address
Why is LAG/IP redundancy recommended over MLAG for MDM-MDM traffic?
continued availability of one IP address on the MDM helps prevent failovers due to short timeouts between MDMs
Why is IP level redundancy preferred over MLAG for SDC communication links?
improved network failure resistance in v3.5
What is the difference between MLAG (multi-chassis link aggregation groups) and LAG?
like LAG MLAG provides network link redundancy
unlike LAG MLAGs allow a single node to be connected to multiple switches