TSM V7.1 : Understanding TSM Deduplication Flashcards

Question

What is the function of a deduplication cache? [client side dedup]

Answer 1

To maintain a local list of already identified duplicate chunks, so that the client does not have to query the server for each one [although such a query generates very little traffic].

Answer 2

No. If the database is on fast disk and the network has high bandwidth and low latency, disabling the cache may be wise.

Answer 3

- encryption
- UNIX HSM client
- subfile backup
- simultaneous storage pool write
- LAN-free / storage agent

Answer 4

- the device class is of type FILE and the deduplication setting is enabled:
tsm>define stgpool MYPOOL FILE ... deduplicate=yes

Answer 5

1) tsm>register node/update node DEDUPNODE...dedup=clientorserver ....
2) client options file (dsm.opt): DEDUPLICATION YES
3) files must be bound to a management class that has a dedup storage pool as its destination.

Answer 6

- client options file: the exclude.dedup option

Answer 7

Both client and server must be at V6.2.0 or later.

Answer 8

With capacity-based pricing, capacity is calculated after deduplication has occurred, which reduces the cost.

Answer 9

The maximum amount of backup data is 400 TB, with a daily ingest of less than 30 TB.
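The two limits above can be checked with a few lines of Python. This is only a sketch: the function and constant names are hypothetical, and the thresholds are taken directly from the answer (400 TB as an inclusive maximum, 30 TB as a strict upper bound).

```python
# Limits quoted in the answer above; the names are illustrative only.
MAX_BACKUP_DATA_TB = 400.0   # maximum total backup data
MAX_DAILY_INGEST_TB = 30.0   # daily ingest must stay below this value


def within_dedup_limits(backup_data_tb: float, daily_ingest_tb: float) -> bool:
    """Return True if both figures fall inside the quoted limits."""
    return (
        backup_data_tb <= MAX_BACKUP_DATA_TB
        and daily_ingest_tb < MAX_DAILY_INGEST_TB
    )


if __name__ == "__main__":
    # Hypothetical environment: 350 TB of backup data, 20 TB ingested per day.
    print(within_dedup_limits(350.0, 20.0))
```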

Answer 10

A configuration with dedup should preferably use a dedup pool only, or a dedup disk pool to dedup disk pool configuration. However, if tape is needed for disaster recovery purposes, make sure that you copy the data out of the primary pool before the dedup takes place (e.g. before reclamation); that way the data does not need to be reconstructed before it is copied to tape.

Answer 11

Low-latency, high-performance disk, typically SAS/FC or SSD/flash.

Answer 12

12 CPU cores (may go up to 32 CPU cores at maximum load)

Answer 13

64 GB of RAM (may go up to 192 GB at maximum load)

Answer 14

For disaster recovery purposes, across various geographically dispersed sites.

Answer 15

1) Configured in the dedup storage pool definition: tsm>define stgpool DEDUPPOOL .... IDENTIFYPROCESS=[0-50] [0: turns off the auto-start of identification processes].
2) Started automatically at server start if so specified in the storage pool definition.
3) Can be controlled manually with the command IDENTIFY DUPLICATES.

Answer 16

Before the physical dedup takes place (before reclamation). Otherwise the data would have to be reconstructed in order to copy it, which would take a long time.

Answer 17

```
option setting: deduprequiresbackup
[can be changed dynamically by using SETOPT]
[default setting: yes]
setting in dsmserv.opt:
deduprequiresbackup yes
```

Answer 18

1) disk to tape (dedup to non-dedup: needs careful planning)
2) disk to disk (dedup to dedup; data remains in the primary pool until it expires)
3) node replication to a second server (incremental, only unique (deduplicated) data)

Answer 19

The amount of disk space needed in the primary disk pool is significantly reduced because of dedup. The data can remain there until it expires (in theory).

Answer 20

unique binary files and encrypted data (exclude them from the mgmt class)

Answer 21

- when VTL or tape is your storage location
- when the backup window is already constrained

Answer 22

to limit the amount of (primary) disk storage needed for backup.

Answer 23

The active log maintains info about transactions that are in progress. Plan for its maximum size: 128 GB.

Answer 24

The archive log maintains info about old transactions. It is emptied after a full database backup. It can grow very large. Plan for 500 GB of filesystem space; for daily ingests of 4 TB, plan for a 1 TB filesystem.

Answer 25

It must have enough free capacity to receive the complete daily ingest of data, plus hold the deduplicated data for the time set in the retention policies, plus some uplift (30% after dedup).
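The sizing rule above can be sketched as a small Python calculation. It assumes that the 30% uplift applies to the retained deduplicated data only, which is one possible reading of the answer. The function name and the sample figures are hypothetical.

```python
def estimate_pool_capacity_tb(
    daily_ingest_tb: float,
    retained_dedup_data_tb: float,
    uplift: float = 0.30,
) -> float:
    """Estimate the capacity of a dedup storage pool in TB.

    daily_ingest_tb: landing space for one full day of incoming data.
    retained_dedup_data_tb: deduplicated data kept for the retention period.
    uplift: safety margin, applied here to the deduplicated data only
            (an assumption; the source just says "30% after dedup").
    """
    if daily_ingest_tb < 0 or retained_dedup_data_tb < 0 or uplift < 0:
        raise ValueError("all inputs must be non-negative")
    return daily_ingest_tb + retained_dedup_data_tb * (1.0 + uplift)


if __name__ == "__main__":
    # Hypothetical figures: 4 TB ingested per day, 100 TB retained after dedup.
    print(f"{estimate_pool_capacity_tb(4.0, 100.0):.1f} TB")
```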

Answer 26

The total amount of data on the clients multiplied by the change rate (because of progressive incremental backup).
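As a rough illustration of this calculation, the Python sketch below multiplies the total client data by the change rate. The function name and the sample figures (200 TB, 5% change per day) are hypothetical and chosen only for the example.

```python
def estimate_daily_ingest_tb(total_client_data_tb: float, change_rate: float) -> float:
    """Estimate daily ingest as total client data multiplied by the change rate.

    change_rate is a fraction between 0 and 1 (0.05 means 5% of the data
    changes per day and is picked up by progressive incremental backup).
    """
    if total_client_data_tb < 0:
        raise ValueError("total_client_data_tb must be non-negative")
    if not 0.0 <= change_rate <= 1.0:
        raise ValueError("change_rate must be between 0 and 1")
    return total_client_data_tb * change_rate


if __name__ == "__main__":
    # Hypothetical environment: 200 TB on the clients, 5% daily change rate.
    print(f"{estimate_daily_ingest_tb(200.0, 0.05):.1f} TB per day")
```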

Answer 27

- separate LUNs for the TSM database and logs, not shared with any TSM storage pools or any other data files
- spread disk I/O across as many disk controllers as possible

Answer 28

Yes. TSM is designed to allow dedup storage pools to handle both types of deduplicated data. TSM is optimized not to perform server-side dedup on data that was already deduplicated on the client side. Objects are recognized by both types of dedup, even if they were deduplicated "by the other side".

Answer 29

use compression.

Answer 30

With many clients simultaneously requesting the processing of duplicate chunks.

Answer 31

Only server-side dedup (client side not supported)

Answer 32

storage pool level

Answer 33

Duplicates are not identified across storage pools.

Answer 34

- high mountlimit
- maxcapacity of volumes: 50 GB
- directories should represent separate filesystems on separate logical volumes, on as many separate disks as possible

Answer 35

use of scratch volumes, use the MAXSCRATCH parameter

Answer 36

scheduled. Therefore set the IDENTIFYPROCESS parameter (in DEFINE STG) to 0.

Answer 37

set the RECLAIM parameter to 100 (in define stgpool)

Answer 38

```
define domain DEDUPDISK
define policy DEDUPDISK POLICY1
define mgmtclass DEDUPDISK POLICY1 STANDARD
assign defmgmtclass DEDUPDISK POLICY1 STANDARD
define copygroup DEDUPDISK POLICY1 STANDARD type=backup destination=DEDUPPOOL VEREXISTS=nolimit VERDELETED=10 RETEXTRA=30 RETONLY=80
define copygroup DEDUPDISK POLICY1 STANDARD type=archive destination=DEDUPPOOL RETVER=30
activate policyset DEDUPDISK POLICY1
```

(63 cards)