Chapter 6 - Data Deduplication Flashcards

1
Q

Name three scenarios that would be ideal for data deduplication.

A

General-purpose file servers
VDI deployments
Backup targets

2
Q

Which data deduplication policy specifies which files should be considered for data deduplication?

A

Optimization policy

3
Q

What two fields in Get-DedupStatus are relevant to the optimization rate?

A

OptimizedFilesSavingsRate and SavingsRate

4
Q

What is a Chunk?

A

A section of a file that the Data Deduplication chunking algorithm has selected as likely to occur in other, similar files

5
Q

What is a Chunk Store?

A

An organized series of container files in the System Volume Information folder that Data Deduplication uses to uniquely store chunks

6
Q

What is Dedup?

A

An abbreviation for data deduplication that is commonly used in PowerShell, Windows Server APIs and components, and the Windows Server community

7
Q

What is File Metadata?

A

Information that describes properties about the file that are not related to the main content of the file

8
Q

What is a File Stream?

A

The main content of the file

9
Q

What is a File system?

A

The software and on-disk data structure that the operating system uses to store files on storage media

10
Q

What is a File System Filter?

A

A plugin that modifies the default behavior of the file system

11
Q

What is Optimization?

A

The process of chunking a file and storing its unique chunks in the chunk store
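
The idea can be sketched as a toy model in Python. This is only an illustration, not how Windows Server implements it: the fixed chunk size, the use of SHA-256 as the chunk identifier, and the names `optimize` and `CHUNK_SIZE` are all assumptions made for brevity.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and deliberately tiny so the example is easy to follow


def optimize(data: bytes, chunk_store: dict) -> list:
    """Split data into chunks, store each unique chunk once, return the chunk map."""
    chunk_map = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()  # assumed chunk identifier
        chunk_store.setdefault(digest, chunk)       # keep only the first copy
        chunk_map.append(digest)
    return chunk_map


store = {}
map_a = optimize(b"AAAABBBBCCCC", store)
map_b = optimize(b"AAAABBBBDDDD", store)
print(len(map_a) + len(map_b), "chunks referenced,", len(store), "chunks stored")
```

The two files reference six chunks in total, but only four unique chunks end up in the store, which is where the space saving comes from.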

12
Q

What is Optimization Policy?

A

A policy that specifies which files should be considered for data deduplication

13
Q

What is a Reparse Point?

A

A special tag that notifies the file system to pass off I/O to a specified file system filter; in data deduplication, it is the way optimized files are stored (pointers to a chunk map)

14
Q

What is a Volume?

A

A Windows construct for a logical storage drive that may span multiple physical storage devices across one or more servers

15
Q

What is a Workload?

A

An application that runs on Windows Server

16
Q

What are some usage scenarios for data deduplication, and what space savings are typical for each?

A

User documents: 30% to 50%
Deployment shares: 70% to 80%
Virtualization libraries: 80% to 95%
General file shares: 50% to 60%
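
As a rough worked example, the ranges above can be turned into an estimate of reclaimed space. This is a back-of-the-envelope sketch: the function name `estimate_saved_gb` and the 1000 GB input are hypothetical, and real savings depend on the actual data.

```python
# Typical savings ranges from the card, expressed as fractions of logical size.
SAVINGS_RANGES = {
    "User documents": (0.30, 0.50),
    "Deployment shares": (0.70, 0.80),
    "Virtualization libraries": (0.80, 0.95),
    "General file shares": (0.50, 0.60),
}


def estimate_saved_gb(scenario: str, logical_size_gb: float) -> tuple:
    """Return the (low, high) estimate of space saved, in GB."""
    low, high = SAVINGS_RANGES[scenario]
    return logical_size_gb * low, logical_size_gb * high


low_gb, high_gb = estimate_saved_gb("User documents", 1000)
print(f"{low_gb:.0f} to {high_gb:.0f} GB saved")
```

DDPEval gives a measured estimate for a specific volume or share and is the better guide when it is available.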

17
Q

How does data deduplication help in general file servers?

A

General-purpose file servers often host team shares, user home folders, work folders, and software development shares, where multiple users tend to keep many copies or versions of the same files. That gives data deduplication plenty of opportunity to save space.

18
Q

How does data deduplication help in Virtual Desktop Infrastructure (VDI) deployments?

A

Many virtual hard disks are practically identical.

19
Q

How does data deduplication help with backup targets?

A

Much of the data stored in a backup is identical to data that has already been backed up.

20
Q

What is DDPEval?

A

The Data Deduplication Savings Evaluation tool. It can evaluate the potential for optimization against directly connected volumes and mapped or unmapped network shares.

21
Q

Can data deduplication affect performance negatively?

A

Yes. Data deduplication runs as a periodic task that could interfere with the performance requirements of your workload. This is of most concern for workloads stored on traditional HDDs as opposed to SSDs.

22
Q

How do the resource requirements of the workload affect its suitability for data deduplication?

A

Storage that has “downtime,” such as weekends, is often an excellent candidate for data deduplication since this processing can occur during those times.

23
Q

What are four ways in which Windows Server 2016 enhances data deduplication?

A

Support for larger volumes
Support for larger files
Support for Nano Server
Simplified backup support

24
Q

How does Windows Server 2016 enhance support for data deduplication in larger volumes?

A

Windows Server 2016 supports volume sizes up to 64 TB.

25
Q

How does Windows Server 2016 enhance support for data deduplication in larger files?

A

Files up to 1 TB are fully supported.

26
Q

How does Windows Server 2016 enhance data deduplication in backup support?

A

A new predefined usage type, Backup, supports seamless deployment of data deduplication for virtualized backup applications.

27
Q

What are the three usage types of data deduplication?

A

Default
Hyper-V
Backup

28
Q

What is the default type of data deduplication used for?

A

This is the option to choose for general-purpose file servers. It uses Background optimization.

29
Q

What is the optimization policy for the default type of data deduplication?

A

Minimum file age = 3 days
Optimize in-use files = No
Optimize partial files = No

30
Q

What is the Hyper-V type of data deduplication used for?

A

This usage type is tuned specifically for VDI servers. It uses background optimization and has “under the hood” tweaks for Hyper-V interoperability.

31
Q

What is the optimization policy for the Hyper-V type of data deduplication?

A

Minimum file age = 3 days
Optimize in-use files = Yes
Optimize partial files = Yes

32
Q

What is the Backup type of data deduplication?

A

This usage type is tuned for virtualized backup applications. It uses priority optimization and has “under the hood” tweaks for interoperability with DPM and DPM-like solutions.

33
Q

What is the optimization policy for the backup type of data deduplication?

A

Minimum file age = 0 days
Optimize in-use files = Yes
Optimize partial files = No
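
The three usage-type policies (Default, Hyper-V, Backup) can be collected into one lookup table. The sketch below is a simplified model: the names `OptimizationPolicy` and `is_in_policy` are hypothetical, and the partial-files setting is recorded but not modeled, since its exact effect is not described in these cards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OptimizationPolicy:
    minimum_file_age_days: int
    optimize_in_use_files: bool
    optimize_partial_files: bool  # stored for reference only in this sketch


# Values taken from the Default, Hyper-V, and Backup policy cards.
POLICIES = {
    "Default": OptimizationPolicy(3, False, False),
    "Hyper-V": OptimizationPolicy(3, True, True),
    "Backup": OptimizationPolicy(0, True, False),
}


def is_in_policy(usage_type: str, file_age_days: int, in_use: bool) -> bool:
    """Simplified check: is the file old enough, and is in-use access allowed?"""
    policy = POLICIES[usage_type]
    if file_age_days < policy.minimum_file_age_days:
        return False
    if in_use and not policy.optimize_in_use_files:
        return False
    return True


print(is_in_policy("Default", file_age_days=1, in_use=False))  # False: younger than 3 days
print(is_in_policy("Backup", file_age_days=0, in_use=True))    # True: no age limit, in-use allowed
```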

34
Q

What four jobs make data deduplication possible?

A

Optimization
Garbage Collection
Integrity Scrubbing
Unoptimization

35
Q

What is Garbage Collection?

A

Reclaims disk space by removing unnecessary chunks that are no longer being referenced by files that have been recently modified or deleted
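
A toy version of this job, assuming the chunk store is a dictionary of chunk identifiers and each file is a list of the identifiers it references (both hypothetical structures chosen for illustration):

```python
def garbage_collect(chunk_store: dict, chunk_maps: dict) -> int:
    """Delete chunks that no file references any more; return how many were removed."""
    referenced = {digest for chunk_map in chunk_maps.values() for digest in chunk_map}
    unreferenced = [digest for digest in chunk_store if digest not in referenced]
    for digest in unreferenced:
        del chunk_store[digest]
    return len(unreferenced)


chunk_store = {"c1": b"AAAA", "c2": b"BBBB", "c3": b"CCCC"}
chunk_maps = {"report.docx": ["c1", "c2"], "old.docx": ["c2", "c3"]}

del chunk_maps["old.docx"]  # the file is deleted, but its chunks are still stored
removed = garbage_collect(chunk_store, chunk_maps)
print(removed, sorted(chunk_store))  # 1 ['c1', 'c2']
```

Chunk `c2` survives because `report.docx` still references it; only `c3` is reclaimed.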

36
Q

What is Integrity Scrubbing?

A

Identifies corruption in the chunk store due to disk failures or bad sectors
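
One way to picture this job is as a checksum pass over the chunk store. The sketch below assumes each chunk is stored under the SHA-256 hash of its content; that keying scheme and the name `scrub` are assumptions for illustration, not a description of the actual on-disk format.

```python
import hashlib


def scrub(chunk_store: dict) -> list:
    """Return the identifiers of chunks whose content no longer matches their hash."""
    return [
        digest
        for digest, chunk in chunk_store.items()
        if hashlib.sha256(chunk).hexdigest() != digest
    ]


good_id = hashlib.sha256(b"AAAA").hexdigest()
bad_id = hashlib.sha256(b"BBBB").hexdigest()
chunk_store = {good_id: b"AAAA", bad_id: b"BBXB"}  # second chunk simulates a bad sector

corrupted = scrub(chunk_store)
print(len(corrupted), "corrupted chunk(s) found")
```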

37
Q

What is Unoptimization?

A

Undoes the optimization done by deduplication and disables data deduplication for that volume

38
Q

How do you add the role of data deduplication in PowerShell?

A

Install-WindowsFeature -Name FS-Data-Deduplication

39
Q

How do you add the role of data deduplication in Nano Server?

A

Install-WindowsFeature -ComputerName <NanoServerName> -Name FS-Data-Deduplication

40
Q

What are five cmdlets used to implement data deduplication in PowerShell?

A

Enable-DedupVolume
Start-DedupJob
Stop-DedupJob
Get-DedupJob
Disable-DedupVolume

41
Q

What does the Enable-DedupVolume cmdlet do?

A

Enables data deduplication on one or more volumes.

42
Q

What does the Start-DedupJob cmdlet do?

A

Starts a new data deduplication job

43
Q

What does the Stop-DedupJob cmdlet do?

A

Stops a data deduplication job that’s already in progress (or removes it from the queue)

44
Q

What does the Get-DedupJob cmdlet do?

A

Shows all the active and queued data deduplication jobs

45
Q

What does the Disable-DedupVolume cmdlet do?

A

Disables data deduplication on one or more volumes

46
Q

What cmdlet is useful for PowerShell data deduplication monitoring?

A

Get-DedupStatus

47
Q

Which fields in the Get-DedupStatus output are important for data deduplication monitoring?

A

LastOptimizationResult
LastGarbageCollectionResult
LastScrubbingResult
OptimizedFilesSavingsRate
SavingsRate
48
Q

How do you interpret the monitoring of the LastOptimizationResult field in data deduplication monitoring?

A

Check LastOptimizationResult (0 = success), LastOptimizationResultMessage, and LastOptimizationTime (should be recent).

49
Q

How do you interpret the monitoring of the LastGarbageCollectionResult in data deduplication monitoring?

A

Check LastGarbageCollectionResult (0 = success), LastGarbageCollectionResultMessage, and LastGarbageCollectionTime (should be recent).

50
Q

How do you interpret the monitoring of the LastScrubbingResult field in data deduplication monitoring?

A

Check LastScrubbingResult (0 = success), LastScrubbingResultMessage, and LastScrubbingTime (should be recent).
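
The same rule applies to all three job results: the code should be 0 and the last run should be recent. A minimal sketch of that check follows. The function name `job_looks_healthy` and the seven-day threshold are assumptions, since the cards do not say how recent is recent enough.

```python
from datetime import datetime, timedelta


def job_looks_healthy(result_code: int, last_run: datetime, now: datetime,
                      max_age: timedelta = timedelta(days=7)) -> bool:
    """True when the last job succeeded (code 0) and ran within max_age of now."""
    return result_code == 0 and (now - last_run) <= max_age


now = datetime(2024, 1, 8)
print(job_looks_healthy(0, datetime(2024, 1, 7), now))   # True: succeeded yesterday
print(job_looks_healthy(0, datetime(2023, 12, 1), now))  # False: last run is too old
print(job_looks_healthy(5, datetime(2024, 1, 7), now))   # False: non-zero result code
```

A non-zero code is a cue to read the matching result message field for details.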

51
Q

How do you interpret OptimizedFilesSavingsRate in data deduplication monitoring?

A

Applies only to the files that are “in-policy” for optimization (space used by optimized files after optimization / logical size of optimized files).

52
Q

How do you interpret SavingsRate in data deduplication monitoring?

A

Applies to the entire volume (space used by optimized files after optimization / total logical size of the optimization).