Module 10: Data Protection (Data Deduplication + Data Archiving) Flashcards
What are the cons of duplicate data?
lengthens backup windows
increases network bandwidth consumption
makes it difficult to protect data within budget
What is data deduplication?
process of detecting and identifying the unique data segments within a given set of data to eliminate redundancy
What is the deduplication ratio?
ratio of the amount of data before deduplication to the amount of data after deduplication
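The ratio can be worked out with a small helper. This is a minimal sketch; the function names dedup_ratio and space_saved_percent are made up for illustration and do not come from any backup product.

```python
def dedup_ratio(bytes_before: float, bytes_after: float) -> float:
    """Deduplication ratio: data before deduplication divided by data after."""
    if bytes_after <= 0:
        raise ValueError("bytes_after must be positive")
    return bytes_before / bytes_after


def space_saved_percent(bytes_before: float, bytes_after: float) -> float:
    """Share of the original capacity that deduplication removed, in percent."""
    if bytes_before <= 0:
        raise ValueError("bytes_before must be positive")
    return (1 - bytes_after / bytes_before) * 100
```

For example, 10 TB of backup data that occupies 2 TB after deduplication gives a ratio of 5 (usually written 5:1), which corresponds to 80% of the capacity saved.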
What are the key benefits of data deduplication?
reduces infrastructure costs
enables longer retention periods
reduces backup windows
reduces network bandwidth requirements
What is source-based deduplication?
data is deduplicated at the source (backup client) before it is sent over the network
When is source-based deduplication recommended?
remote office/branch office (ROBO) environments
also commonly used by cloud service providers
What are the advantages of source-based deduplication?
reduces storage capacity and network bandwidth requirements
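The bandwidth saving can be illustrated with a toy model of a backup client. This is only a sketch under stated assumptions: the function source_side_backup is hypothetical, the target is modelled as an in-memory dictionary, and SHA-256 is an assumed fingerprint. A real client would query the target over the network instead.

```python
import hashlib


def source_side_backup(segments, target_store):
    """Send only the segments the target does not already hold.

    segments: list of bytes objects produced on the backup client.
    target_store: dict mapping fingerprint -> segment (stands in for the target).
    Returns the number of bytes that had to cross the "network".
    """
    bytes_sent = 0
    for segment in segments:
        fingerprint = hashlib.sha256(segment).hexdigest()
        if fingerprint not in target_store:  # assumption: a cheap lookup on the target
            target_store[fingerprint] = segment
            bytes_sent += len(segment)
    return bytes_sent
```

Backing up the same segments a second time sends nothing, because every fingerprint is already known to the target.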
What is target-based deduplication?
data is deduplicated at the target (backup device), either inline or post-process
What are the advantages and disadvantages of target-based deduplication?
advantage: offloads the deduplication process from the backup client
disadvantage: requires sufficient network bandwidth, since data is sent to the target before it is deduplicated
What is a disadvantage of source-based deduplication?
puts more burden on the host (backup client), since it is responsible for both generating the save set and deduplicating it
What does inline deduplication mean?
data is deduplicated in cache and then sent to disk
What is file based dedupe?
detects identical whole files and stores only one copy of each; if any part of a file changes, the entire file must be backed up again
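A minimal sketch of file-based dedupe, assuming a whole-file SHA-256 hash is used as the fingerprint (the function name file_level_dedupe and the in-memory dictionaries are hypothetical, for illustration only):

```python
import hashlib


def file_level_dedupe(files):
    """Store each distinct file content once.

    files: dict mapping file name -> file content (bytes).
    Returns (store, catalog): store maps fingerprint -> content,
    catalog maps file name -> fingerprint.
    """
    store = {}
    catalog = {}
    for name, content in files.items():
        fingerprint = hashlib.sha256(content).hexdigest()
        store.setdefault(fingerprint, content)  # keep only the first copy
        catalog[name] = fingerprint
    return store, catalog
```

Two identical files share one stored copy, but a file that differs by even one byte gets a new fingerprint and is stored in full.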
What is sub-file based dedupe?
the first time a file is backed up, it is broken down into smaller segments (sub-file objects); on later backups only new or changed segments are stored
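A minimal sketch of sub-file dedupe, assuming fixed-length segments and SHA-256 fingerprints. The names backup_segments and restore and the tiny segment size are illustrative assumptions; real products may use variable-length segments.

```python
import hashlib


def backup_segments(data, store, segment_size=4):
    """Split data into fixed-length segments and store only unseen ones.

    store: dict mapping fingerprint -> segment, shared across backups.
    Returns the ordered list of fingerprints needed to rebuild the data.
    """
    recipe = []
    for offset in range(0, len(data), segment_size):
        segment = data[offset:offset + segment_size]
        fingerprint = hashlib.sha256(segment).hexdigest()
        if fingerprint not in store:
            store[fingerprint] = segment  # new or changed segment
        recipe.append(fingerprint)
    return recipe


def restore(recipe, store):
    """Rebuild the original data from its list of fingerprints."""
    return b"".join(store[fingerprint] for fingerprint in recipe)
```

If a 12-byte file is backed up on day one and one 4-byte segment changes on day two, the second backup adds a single new segment to the store instead of a full copy of the file.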
What is data archiving?
moves fixed content that is no longer actively accessed to a separate low-cost archive storage system
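A minimal sketch of an archiving pass, assuming "no longer actively accessed" is approximated by the file's last-access time and that the archive is simply another directory. The function archive_inactive and its parameters are hypothetical, and last-access times may not be updated on file systems mounted with options such as noatime.

```python
import shutil
import time
from pathlib import Path


def archive_inactive(primary, archive, max_idle_days):
    """Move files not accessed within max_idle_days from primary to archive.

    Returns the sorted list of file names that were moved.
    """
    primary, archive = Path(primary), Path(archive)
    cutoff = time.time() - max_idle_days * 86400  # 86400 seconds per day
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(primary.iterdir()):
        if path.is_file() and path.stat().st_atime < cutoff:
            shutil.move(str(path), str(archive / path.name))
            moved.append(path.name)
    return moved
```

Calling archive_inactive("primary", "archive", 90) would move every file in the primary directory that has not been accessed for 90 days, freeing primary capacity.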
What are the advantages of data archiving?
saves primary storage capacity
reduces backup window and backup storage costs