Chapter 6 - Data Deduplication Flashcards by Raul Pena

Name three scenarios that would be ideal for data deduplication.

General-purpose file servers
VDI deployments
Backup targets

What data deduplication policy specifies which files should be considered for data deduplication?

Optimization policy

What two fields in Get-DedupStatus are relevant to the optimization rate?

OptimizedFilesSavingsRate and SavingsRate

What is a Chunk?

A section of a file that the Data Deduplication chunking algorithm has selected as likely to occur in other, similar files

What is a Chunk Store?

An organized series of container files in the System Volume Information folder that Data Deduplication uses to uniquely store chunks

What is Dedup?

An abbreviation for data deduplication that is commonly used in PowerShell, Windows Server APIs and components, and the Windows Server community

What is File Metadata?

Information that describes properties about the file that are not related to the main content of the file

What is File Stream?

The main content of the file

What is a File system?

The software and on-disk data structure that the operating system uses to store files on storage media

What is a File System Filter?

A plugin that modifies the default behavior of the file system

What is Optimization?

The process of chunking a file and storing its unique chunks in the chunk store
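
The relationship between chunks, the chunk store, and optimization can be sketched with a toy model. This is an illustration only: it splits strings into fixed-size pieces and keys them by SHA-256 hash, which is a simplification and not the chunking algorithm Data Deduplication actually uses. The file names, contents, and variable names are all hypothetical.

```powershell
# Toy model: fixed-size "chunks" keyed by hash. Not the real dedup algorithm.
$chunkSize = 4
$files = [ordered]@{
    'a.txt' = 'AAAABBBBCCCC'   # hypothetical file contents
    'b.txt' = 'AAAABBBBDDDD'
}

$chunkStore      = @{}   # hash -> chunk, each unique chunk stored once
$chunkMaps       = @{}   # file name -> ordered list of chunk hashes
$totalReferences = 0
$sha = [System.Security.Cryptography.SHA256]::Create()

foreach ($name in $files.Keys) {
    $data = $files[$name]
    $map  = @()
    for ($i = 0; $i -lt $data.Length; $i += $chunkSize) {
        $length = [Math]::Min($chunkSize, $data.Length - $i)
        $chunk  = $data.Substring($i, $length)
        $bytes  = [System.Text.Encoding]::UTF8.GetBytes($chunk)
        $hash   = [System.BitConverter]::ToString($sha.ComputeHash($bytes))

        # Store the chunk only if an identical one is not already present.
        if (-not $chunkStore.ContainsKey($hash)) { $chunkStore[$hash] = $chunk }
        $map += $hash
        $totalReferences++
    }
    $chunkMaps[$name] = $map
}
$sha.Dispose()

"Chunk references: $totalReferences"
"Unique chunks stored: $($chunkStore.Count)"
```

In this sketch the two files produce six chunk references but only four stored chunks, because the first two chunks of each file are identical.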

What is Optimization Policy?

A policy which specifies the files that should be considered for data deduplication

What is Reparse Point?

A special tag that notifies the file system to pass off I/O to a specified file system filter. In data deduplication, the file stream of an optimized file is replaced with a reparse point that points to a map of the file's chunks in the chunk store.

What is Volume?

A Windows construct for a logical storage drive that may span multiple physical storage devices across one or more servers

What is Workload?

An application that runs on Windows Server

What space savings are typical for common data deduplication usage scenarios?

User documents: 30% to 50%
Deployment shares: 70% to 80%
Virtualization libraries: 80% to 95%
General file shares: 50% to 60%

How does data deduplication help in general file servers?

General-purpose file servers often hold team shares, user home folders, work folders, and software development shares. These tend to contain many copies or near-copies of the same files, which gives data deduplication plenty of duplicate data to remove.

How does data deduplication help in Virtualized Desktop Infrastructure (VDI) deployments?

Many of the virtual hard disks in a VDI deployment are practically identical, so most of their chunks need to be stored only once.

How does data deduplication help with backup targets?

Much of the data stored in a backup is identical to data that has already been backed up, so the duplicate chunks need to be stored only once.

What is DDPEval?

The Data Deduplication Savings Evaluation Tool (DDPEval.exe), which evaluates the potential for optimization against directly connected volumes and mapped or unmapped network shares.
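
As a rough sketch, DDPEval.exe can be run from a PowerShell prompt against a folder or a share before deduplication is enabled. The paths and the server name below are hypothetical placeholders, and the tool's location assumes a server where the Data Deduplication feature has been installed.

```powershell
# Hypothetical paths; substitute a real volume, folder, or share.
# Assumes DDPEval.exe is in System32 on a server with the feature installed.
& "$env:SystemRoot\System32\DDPEval.exe" "E:\Shares"

# A network share can be evaluated by its UNC path (hypothetical server name).
& "$env:SystemRoot\System32\DDPEval.exe" "\\fileserver\teamshare"
```

The tool only reports an estimate; it does not optimize any files.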

Can data deduplication affect performance negatively?

Yes. Data deduplication runs as a periodic task that can interfere with the performance requirements of your workload. This is of most concern for workloads stored on traditional HDDs as opposed to SSDs.

What are the resource requirements of the workload?

Storage that has “downtime,” such as weekends, is often an excellent candidate for data deduplication since this processing can occur during those times.

What are four ways in which Windows Server 2016 enhances data deduplication?

Support for larger volumes
Support for larger files
Support for Nano Server
Simplified backup support

How does Windows Server 2016 enhance support for data deduplication in larger volumes?

Windows Server 2016 supports volume sizes up to 64 TB.

How does Windows Server 2016 enhance support for data deduplication in larger files?

Files up to 1 TB are fully supported

How does Windows Server 2016 enhance data deduplication in backup support?

A new predefined usage type (Backup) now supports seamless deployment of data deduplication for virtualized backup applications.

What are the three usage types of data deduplication?

Default
Hyper-V
Backup

What is the default type of data deduplication used for?

This is the option to choose for general-purpose file servers. It uses Background optimization.

What is the optimization policy for the default type of data deduplication?

Minimum file age = 3 days
Optimize in-use files = No
Optimize partial files = No

What is the Hyper-V type of data deduplication used for?

This is deduplication tuned specifically for VDI servers. It uses Background optimization and has “under-the-hood” tweaks for Hyper-V interoperability.

What is the optimization policy for the Hyper-V type of data deduplication?

Minimum file age = 3 days
Optimize in-use files = Yes
Optimize partial files = Yes

What is the Backup type of data deduplication?

This is deduplication tuned for virtualized backup applications. It uses Priority optimization and has “under-the-hood” tweaks for interoperability with DPM and DPM-like solutions.

What is the optimization policy for the backup type of data deduplication?

Minimum file age = 0 days
Optimize in-use files = Yes
Optimize partial files = No
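
The three usage types and their optimization policies can be tried out with a short sketch like the one below. It assumes the Data Deduplication feature is already installed, and "E:" is a placeholder volume letter rather than anything taken from these cards.

```powershell
# "E:" is a placeholder volume. Pick one usage type: Default, HyperV, or Backup.
Enable-DedupVolume -Volume "E:" -UsageType Default

# Show the optimization policy settings that the usage type applied.
Get-DedupVolume -Volume "E:" |
    Format-List Volume, UsageType, MinimumFileAgeDays, OptimizeInUseFiles, OptimizePartialFiles

# Optional: adjust one policy setting afterwards (5 is an arbitrary example value).
Set-DedupVolume -Volume "E:" -MinimumFileAgeDays 5
```

Comparing the Get-DedupVolume output for each usage type is one way to check the policy values listed on these cards.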

What four jobs make data deduplication possible?

Optimization
Garbage Collection
Integrity Scrubbing
Unoptimization

What is Garbage Collection?

Reclaims disk space by removing unnecessary chunks that are no longer being referenced by files that have been recently modified or deleted

What is Integrity Scrubbing?

Identifies corruption in the chunk store due to disk failures or bad sectors

What is Unoptimization?

Undoes the optimization done by deduplication and disables data deduplication for that volume
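
Each of the four jobs can also be started manually. The sketch below assumes deduplication is already enabled on the placeholder volume "E:" and uses the -Type parameter of Start-DedupJob to select the job.

```powershell
# "E:" is a placeholder volume with deduplication enabled.
Start-DedupJob -Volume "E:" -Type Optimization
Start-DedupJob -Volume "E:" -Type GarbageCollection
Start-DedupJob -Volume "E:" -Type Scrubbing

# Unoptimization undoes deduplication for the volume, so it is left commented out.
# Start-DedupJob -Volume "E:" -Type Unoptimization
```

Jobs that cannot start right away are queued, so several Start-DedupJob calls in a row do not necessarily run at the same time.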

How do you add the role of data deduplication in PowerShell?

Install-WindowsFeature -Name FS-Data-Deduplication

How do you add the role of data deduplication in Nano Server?

Install-WindowsFeature -ComputerName <ComputerName> -Name FS-Data-Deduplication

What are five cmdlets used to implement data deduplication in PowerShell?

```
Enable-DedupVolume
Start-DedupJob
Stop-DedupJob
Get-DedupJob
Disable-DedupVolume
```

What does the Enable-DedupVolume cmdlet do?

Enables data deduplication on one or more volumes.

What does the Start-DedupJob cmdlet do?

Starts a new data deduplication job

What does the Stop-DedupJob cmdlet do?

Stops a data deduplication job that’s already in progress (or removes it from the queue)

What does the Get-DedupJob cmdlet do?

Shows all the active and queued data deduplication jobs

What does the Disable-DedupVolume cmdlet do?

Disables data deduplication on one or more volumes.
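
Taken together, the five cmdlets cover the life cycle of deduplication on a volume. The following is a minimal sketch, assuming the feature is installed and using "E:" as a placeholder volume.

```powershell
# "E:" is a placeholder volume.
Enable-DedupVolume -Volume "E:"                  # turn deduplication on for the volume

Start-DedupJob -Volume "E:" -Type Optimization   # queue an optimization job
Get-DedupJob                                     # list active and queued jobs

Stop-DedupJob -Volume "E:"                       # cancel running or queued jobs on E:
Disable-DedupVolume -Volume "E:"                 # stop further deduplication on E:
```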

What cmdlet is useful for monitoring data deduplication in PowerShell?

Get-DedupStatus

Which fields are important in the Get-DedupStatus cmdlet for data deduplication monitoring?

```
LastOptimizationResult
LastGarbageCollectionResult
LastScrubbingResult
OptimizedFilesSavingsRate
SavingsRate
```

How do you interpret the monitoring of the LastOptimizationResult field in data deduplication monitoring?

Check that LastOptimizationResult is 0 (success), read LastOptimizationResultMessage for details, and confirm that LastOptimizationTime is recent.

How do you interpret the monitoring of the LastGarbageCollectionResult in data deduplication monitoring?

Check that LastGarbageCollectionResult is 0 (success), read LastGarbageCollectionResultMessage for details, and confirm that LastGarbageCollectionTime is recent.

How do you interpret the monitoring of the LastScrubbingResult field in data deduplication monitoring?

Check that LastScrubbingResult is 0 (success), read LastScrubbingResultMessage for details, and confirm that LastScrubbingTime is recent.
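
The three result checks above can be combined into one health query. This is a sketch under the assumption that at least one volume on the server has deduplication enabled; it uses only the Get-DedupStatus fields named on these cards.

```powershell
# Show the result code, message, and time of the last run of each job type.
Get-DedupStatus |
    Format-List Volume,
        LastOptimizationResult, LastOptimizationResultMessage, LastOptimizationTime,
        LastGarbageCollectionResult, LastGarbageCollectionResultMessage, LastGarbageCollectionTime,
        LastScrubbingResult, LastScrubbingResultMessage, LastScrubbingTime

# List only the volumes where any of the last jobs did not return 0 (success).
Get-DedupStatus |
    Where-Object {
        $_.LastOptimizationResult -ne 0 -or
        $_.LastGarbageCollectionResult -ne 0 -or
        $_.LastScrubbingResult -ne 0
    } |
    Select-Object Volume, LastOptimizationResult, LastGarbageCollectionResult, LastScrubbingResult
```

The timestamps still need a manual look, since a result of 0 from a job that last ran long ago does not show that deduplication is currently healthy.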

How do you interpret OptimizedFilesSavingsRate in data deduplication monitoring?

Applies only to the files that are “in-policy” for optimization (space used by optimized files after optimization / logical size of optimized files).

How do you interpret SavingsRate in data deduplication monitoring?

Applies to the entire volume (space used by optimized files after optimization / total logical size of the optimization).
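
Both savings fields can be read side by side for a single volume. In this sketch "E:" is a placeholder volume with deduplication enabled.

```powershell
# "E:" is a placeholder volume.
Get-DedupStatus -Volume "E:" |
    Format-List Volume, OptimizedFilesSavingsRate, SavingsRate
```

Reading the two values together helps separate how well the in-policy files deduplicate from how much the volume as a whole benefits.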

(52 cards)