PowerFlex Whitepapers Flashcards

1
Q

What is the role of the SDR?

A

to proxy the IO of replicated volumes between the SDC and the SDSs where data is ultimately stored

2
Q

How does the SDR work?

A

write IO operations are split - sending one copy to the SDS and another copy to the replication journal volume

3
Q

Where is the SDR in the architecture?

A

sits between the SDC and SDS and is deployed alongside the SDS nodes

4
Q

What does the SDR appear to be from the SDS point of view?

A

like an SDC sending writes

5
Q

What does the SDR appear to be from the SDC point of view?

A

like an SDS to which writes can be sent

6
Q

What method does PowerFlex utilize instead of snapshotting?

A

journaling

7
Q

What is a limitation of snapshot based replication?

A

identifying block change delta is easy but as RPOs get smaller the number of required snapshots increases dramatically

places hard limit on how small RPOs can be

8
Q

What is the advantage of journal based replication?

A

provides possibility of smallest RPO and not constrained by the maximum number of snapshots available in a system/volume

9
Q

How are journals maintained in PowerFlex?

A

live as volumes in an SP in the same PD

journal volume does not need to reside in the same SP as the volume being replicated

10
Q

What is important to know about sizing journal volumes for replication?

A

journal volume must have enough available capacity to keep ingesting replication data even when the WAN is down or the target site is unavailable to receive it

must consider the maximum cumulative writes that might occur in an outage

11
Q

What is the minimum requirement for journal capacity?

A

28GB x # of SDR sessions

SDR sessions = # of SDRs installed + 1

reserve at least 5% of SP for journal volumes
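
The minimum-capacity rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic on this card; the function name and parameters are hypothetical and are not part of any PowerFlex tool.

```python
def min_journal_capacity_gb(sdrs_installed: int, gb_per_session: int = 28) -> int:
    """Minimum journal capacity in GB: 28 GB per SDR session.

    Per the card, the number of SDR sessions is the number of
    installed SDRs plus one.
    """
    if sdrs_installed < 0:
        raise ValueError("sdrs_installed must be non-negative")
    sessions = sdrs_installed + 1
    return gb_per_session * sessions


# Example: 4 installed SDRs give 5 sessions, so 28 GB * 5.
print(min_journal_capacity_gb(4))
```

The separate guideline of reserving at least 5% of the SP for journal volumes is not modeled here.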

12
Q

How can reserve journal capacity be distributed?

A

can be split into several volumes across multiple SPs or can reside all in one SP in a PD

13
Q

What is the performance requirement for journal volumes?

A

performance of any SP where journal volume resides must match or exceed performance of SP where replicated volumes reside

14
Q

What is the single most important consideration when sizing journal capacity?

A

possible WAN outage

15
Q

How would you assess the journal capacity needed per application?

A

need to know the maximum application write bandwidth during the busiest hour

minimum outage allowance is 1hr - strongly recommend using 3 hr allowance

16
Q

What is an example of journal capacity calculation per application?

A

Calculation example:

Our application generates 1 GB/s of writes during peak hours.
Using 3 hours as the supported outage, we calculate from 10,800 seconds.
The journal capacity reservation needed is 1 GB/s * 10800 s = ~10.547 TB.
Because journal capacity is calculated as a percentage of storage pool capacity, we divide the needed space by the storage pool usable capacity. Let us assume that usable capacity is 200 TB.
100 * 10.547 TB / 200 TB = 5.27%.
As a safety margin, we will round up to 6%.

Repeat the calculation for each application being replicated.
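
The same calculation can be expressed as a small Python helper, which makes it easy to repeat per application. This is a sketch under the card's assumptions: a peak write rate in GB/s (1 GB/s, as the arithmetic implies), an outage allowance in hours, 1024 GB per TB, and rounding the percentage up to the next whole number as the safety margin. The function name is hypothetical.

```python
import math


def journal_reservation(peak_write_gb_per_s: float,
                        outage_hours: float,
                        pool_usable_tb: float):
    """Return (needed TB, exact percent of pool, percent rounded up)."""
    outage_seconds = outage_hours * 3600            # 3 h -> 10,800 s
    needed_tb = peak_write_gb_per_s * outage_seconds / 1024
    percent = 100 * needed_tb / pool_usable_tb
    return needed_tb, percent, math.ceil(percent)


needed_tb, percent, rounded = journal_reservation(1, 3, 200)
print(f"{needed_tb:.3f} TB, {percent:.2f}% -> reserve {rounded}%")
```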

17
Q

How does the SDR organize data as it’s being written?

A

assembles journal files that contain checkpoints to preserve write order

18
Q

What happens to duplicate blocks that get sent to the journal volume?

A

consolidated to minimize volume of data being sent over

19
Q

How is data transferred from source to target on PowerFlex?

A

SDR sends data over dedicated local subnets or external WAN networks assigned to replication

20
Q

How is compression affected in replicated volumes?

A

replication data crosses the WAN uncompressed

SDS responsible for compressing writes

21
Q

What is a volume migration limitation related to replication in PowerFlex?

A

migrating replicated volumes from one PD to another is not possible

since replication journals don’t span PDs

22
Q

What are the asynchronous replication topologies available on PowerFlex?

A

one directional

bi-directional

one-to-many

many-to-one

23
Q

What was necessary before PowerFlex 4.x to perform replication?

A

had to manually create the source and target volumes

can now automate it

24
Q

What are the rules w/ RCGs?

A

a volume, whether source or target, can be a member of only one RCG

RCGs can only consist of two PowerFlex systems max

25
Q

What must happen before two PowerFlex systems can talk to one another?

A

must have exchanged certificates and been peered

26
Q

What does an RCG do?

A

establish the attributes and behavior of the replication of one or more volume pairs

27
Q

What are the rules for volume pairs in an RCG?

A

must be identical in size

don’t have to reside in same type of SP

do not have to have same properties (compressed, thin/thick provisioned etc.)

28
Q

What are the RPO limits for RCGs on PowerFlex 4.x?

A

15 sec - 60 minutes

min on 3.5 was 30 seconds

29
Q

How many IP Addresses do SDRs have?

A

each have 2 for redundancy

30
Q

What is the bandwidth rule for PowerFlex replication?

A

write bandwidth to replicated volumes can’t exceed the bandwidth of a single network path between clusters

31
Q

Why do write operations to replicated volumes take up 3x the bandwidth of PowerFlex?

A

journaling adds two IO operations

SDR first writes to relevant SDS backing the journal volume and the SDS sends another copy to the secondary

SDR makes a second read from the journal volume before sending to remote site
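
The 3x figure follows from counting IO operations per replicated write: the original write, plus the journal write and the journal read that journaling adds. A minimal sketch of that count, using hypothetical names and treating an unreplicated write as the 1x baseline:

```python
def io_multiplier(replicated: bool) -> int:
    """IO operations per application write, relative to an unreplicated volume."""
    base_write = 1          # the write to the volume itself
    journal_write = 1       # SDR writes the copy to the journal volume
    journal_read = 1        # SDR reads it back before sending to the remote site
    if replicated:
        return base_write + journal_write + journal_read
    return base_write


def effective_bandwidth(app_write_mb_per_s: float, replicated: bool) -> float:
    """Bandwidth consumed inside the system for a given application write rate."""
    return app_write_mb_per_s * io_multiplier(replicated)


print(effective_bandwidth(500, replicated=True))
```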

32
Q

What is the networking recommendation if using replication on PowerFlex?

A

4 x 25GbE
2 x 100GbE

33
Q

What is the bandwidth recommendation for WAN based replication?

A

sustained write bandwidth of all replicated volumes doesn’t exceed 80% of total available WAN bandwidth
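
A quick way to check this rule for a planned deployment is to sum the sustained write rates of all replicated volumes and compare the total against 80% of the WAN link. The sketch below uses hypothetical names and assumes all figures are in the same unit (for example, Gb/s).

```python
def wan_within_recommendation(volume_write_rates, wan_bandwidth,
                              max_utilization=0.80):
    """True if total sustained writes stay at or below 80% of WAN bandwidth."""
    total_writes = sum(volume_write_rates)
    return total_writes <= wan_bandwidth * max_utilization


# Three replicated volumes on a 10 Gb/s WAN link.
print(wan_within_recommendation([2.0, 3.0, 2.5], 10))   # total 7.5
print(wan_within_recommendation([4.0, 3.0, 2.5], 10))   # total 9.5
```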

34
Q

What is the max number of SDRs per system?

A

128

35
Q

What is the max replicated volume size?

A

64 TB

36
Q

What is the max number of RCGs per system?

A

32000

37
Q

How many snapshots can be put on policy based snapshot schedule?

A

60 out of 126 on a vTree

38
Q

What sector especially is important for secure snapshots?

A

financial sector

39
Q

What is the goal of maintenance modes on PowerFlex?

A

to avoid a rebuild operation and control offline process without throwing an error

40
Q

What happens in an unplanned outage event on PowerFlex?

A

system automatically goes into a rebuild state - redistributes data on remaining nodes until it’s back to being online

41
Q

What is the overall spare capacity rule on PowerFlex?

A

spare capacity in the system must be equal to or greater than the capacity of the smallest fault unit (node)

42
Q

How does a cluster function when a node is put in maintenance mode?

A

cluster still functions with one less node - less performance/capacity

writes are sent to and mirrored on other nodes in the system (one to many rebalance)

once the node is brought back online, a many-to-one rebalance occurs (slow process)

43
Q

How does IMM (instant maintenance mode) work?

A

when a node is put into IMM, its data is not evacuated, but the data on it is not accessible

application read operations are directed to the other nodes that contain the mirror copy of the data

44
Q

What happens to the MDM when a node enters IMM?

A

provides an updated map to the SDCs for IO operations

instructs the SDCs to use another SDS for read/write IO

any changes that would’ve affected the node in IMM are tracked

45
Q

What happens when a node exits IMM?

A

no full rehydration (many-to-one) is needed, unlike taking a node out of the cluster and adding another one after maintenance

only sync back relevant changes that occurred during maintenance

allows fast exit from maintenance and quick return to full capacity/performance

46
Q

What is the primary disadvantage of IMM?

A

having a temporary single available copy since all the data on the node in maintenance is unavailable including any secondary copies

if an operational primary node fails while the secondary is in maintenance mode, data loss could result

47
Q

What happens when a node enters PMM (protective maintenance mode)?

A

initiates a many to many rebalancing process but data on node is preserved

data is unavailable while in PMM but temporary third copy of data is made on other nodes in the system

48
Q

What is the main advantage of PMM?

A

guarantees two available copies and thus avoids the single-copy risk of IMM

49
Q

What happens when a node exits PMM?

A

data does not need full rehydration, just the relevant changes that were tracked by other nodes

third copy is removed once all data has been resynced

50
Q

How does maintenance of an SDR work?

A

does not enter PMM - is a manual process

51
Q

What is the rule for mixing maintenance methodologies?

A

PMM and IMM can’t occur simultaneously in same PD

IMM can be used in a PD while PMM is used in another

52
Q

What is the rule for concurrent operations with maintenance modes?

A

within a PD all SDSs concurrently in/entering PMM must belong to the same fault set

if you don’t use fault sets, only one node in a PD can be in maintenance mode at a time

53
Q

What types of operations are typically recommended for IMM?

A

back-end software component upgrades (SDS, MDM, SDC etc.)

54
Q

What types of operations are typically recommended for PMM?

A

node maintenance activities (such as firmware or driver upgrades)

55
Q

How much spare capacity do you need to build in if a cluster is going to use PMM?

A

must be enough spare capacity in a system to handle at least one node failure

Free + Spare - 5% of the Storage Pool >= capacity of PMM node(s)
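
The inequality above can be checked with a short Python sketch. The names are hypothetical, all capacities are assumed to be in the same unit, and "5% of the Storage Pool" is interpreted here as 5% of the pool's total capacity, which is an assumption rather than something the card states.

```python
def pmm_capacity_ok(free, spare, pool_capacity, pmm_node_capacities):
    """Check: Free + Spare - 5% of the Storage Pool >= capacity of PMM node(s)."""
    available = free + spare - 0.05 * pool_capacity   # assumed: 5% of total pool
    return available >= sum(pmm_node_capacities)


# 100 TB pool with 20 TB free and 10 TB spare leaves 25 TB of headroom.
print(pmm_capacity_ok(20, 10, 100, [20]))   # one 20 TB node fits
print(pmm_capacity_ok(20, 10, 100, [30]))   # one 30 TB node does not
```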