40. BACKUP AND RECOVERY METHODS Flashcards

1
Q

Learning Objectives

By the end of this chapter, you should be able to:

A
  • Identify and prioritize data that needs backup.
  • Employ different kinds of backup methods, depending on the situation.
  • Establish efficient backup and restore strategies.
  • Use different backup utilities, such as cpio, tar, gzip, bzip2, xz, dd, and rsync.
  • Describe the two most well-known backup programs, Amanda and Bacula.
2
Q

Why Backups?

Name a few reasons to back up data?

A
  1. Data is valuable
  2. Hardware fails
  3. Software fails
  4. People make mistakes
  5. Malicious people can cause deliberate damage
  6. Unexplained events happen
  7. Rewinds can be useful
3
Q

What Needs Backup?

Some data is critical for backup, some less critical, and some never needs saving.

What data should be backed up?

A

Definitely yes

  • The following data should always be backed up:
    • Business-related data
    • System configuration files
    • User files (usually under /home)

Maybe

  • Spooling directories (for printing, mail, etc.)
  • Logging files (found in /var/log, and elsewhere)

Probably not

  • Software that can easily be re-installed; on a well-managed system, this should be almost everything
  • The /tmp directory, because its contents are indeed supposed to be only temporary

Definitely not

  • Pseudo-filesystems such as /proc, /dev and /sys
  • Any swap partitions or files

Obviously, files essential to your organization require backup. Configuration files may change frequently and, along with individual users' files, require backup as well.

Logging files can be important if you have to investigate your system’s history, which can be particularly important for detecting intrusions and other security violations.

You don’t have to back up anything that can easily be re-installed. Also, the swap partitions (or files) and /proc filesystems are generally not useful or necessary to back up, since the data in these areas is basically temporary (just like in the /tmp directory).
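
As a minimal sketch of leaving temporary data out of a backup, the commands below build a small throwaway tree and archive it with tar while skipping its tmp directory. The directory layout and file names are hypothetical stand-ins for a real system; on a real machine the same idea would apply to /tmp, /proc, /sys and /dev.

```shell
# Hypothetical sandbox that stands in for a real root filesystem.
work=$(mktemp -d)
mkdir -p "$work/root/home/alice" "$work/root/etc" "$work/root/tmp"
echo "quarterly report" > "$work/root/home/alice/report.txt"
echo "setting=1"        > "$work/root/etc/app.conf"
echo "scratch data"     > "$work/root/tmp/scratch"

# Archive everything under the sandbox root except its tmp directory.
# --exclude is given before the "." argument it should apply to.
tar -cf "$work/backup.tar" -C "$work/root" --exclude=./tmp .

# Show what was actually saved.
listing=$(tar -tf "$work/backup.tar")
echo "$listing"
```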

4
Q

Backup vs. Archive

All backup media have a finite lifetime before becoming unreadable. The conventional estimates are listed below:

  • Magnetic Tapes: 10-30 years
  • CDs and DVDs: 3-10 years
  • Hard Disks: 2-5 years

Lifetime is very sensitive to:

  • Environmental conditions (temperature, humidity, etc.)
  • Quality of media
  • Having working software that can read data on current operating systems and hardware

Lifetime is sufficient for backup, but not for permanent digital archiving.

For lifetimes longer than the usual backup timescales, data can be preserved using multiple copies, plus copying over to newer media from time to time.

For very long times (e.g., many decades or centuries), standard methods do not work easily, as everything can become obsolete: hardware, software, document formats, and media.

None of the inexpensive digital formats can actually compete with paper and film for long periods of time (if they are properly stored and continuously cared for - like wine).

This is a problem serious people think about and there should be good solutions available before all is lost.

A
5
Q

Tape Drives

Tape drives are not as common as they used to be. They are relatively slow and permit only sequential access. On any modern setup, they are rarely used for primary backup. They are sometimes used for off-site storage for archival purposes, for long-term reference. However, magnetic tapes have only a finite lifetime before physical degradation leads to loss of data.

Modern tape drives are usually of the LTO (Linear Tape Open) variety, whose first versions appeared in the late 1990s as an open standards alternative; early formats were mostly proprietary. Early versions held up to 100 GB; newer versions can hold 2.5 TB or more in a cartridge of the same size.

Day-to-day backups are usually done with some form of NAS (Network Attached Storage) or with cloud-based solutions, making new tape-based installations less and less attractive. However, they can still be found, and system administrators may be required to deal with them.

In what follows, we will try not to focus on particular physical forms for the backup media, and will speak more abstractly.

A
6
Q

Backup Methods

What are the different types of backups?

You should never have all backups residing in the same physical location as the systems being protected. Otherwise, fire or other physical damage could lead to a total loss. In the past, this usually meant physically transporting magnetic tapes to a secure location. Today, this is more likely to mean transferring backup files over the Internet to alternative physical locations. Obviously, this has to be done in a secure way, using encryption and other security precautions as is appropriate.

A
  • Full
    • Backup for all files on the system.
  • Incremental
    • Backup for all files that have changed since the last incremental or full backup.
  • Differential
    • Backup for all files that have changed since the last full backup.
  • Multiple level incremental
    • Backup for all files that have changed since the previous backup at the same or a previous level.
  • User
    • Backup only for files in a specific user’s directory.
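
The difference between incremental and differential backups can be sketched with find and two timestamp files. This is only an illustration: the marker files full.stamp and incr.stamp are hypothetical names that record when the last full and the last incremental backup were taken.

```shell
# Hypothetical sandbox with two data files.
work=$(mktemp -d)
mkdir "$work/data"
echo one > "$work/data/a.txt"
echo two > "$work/data/b.txt"

touch "$work/full.stamp"              # moment of the full backup
sleep 1
echo changed > "$work/data/a.txt"     # modified after the full backup
touch "$work/incr.stamp"              # moment of the first incremental backup
sleep 1
echo changed > "$work/data/b.txt"     # modified after the incremental backup

# Differential: everything changed since the last FULL backup.
differential=$(find "$work/data" -type f -newer "$work/full.stamp" | sort)

# Incremental: everything changed since the last backup of ANY kind.
incremental=$(find "$work/data" -type f -newer "$work/incr.stamp" | sort)

echo "differential set:"; echo "$differential"
echo "incremental set:";  echo "$incremental"
```

The differential set keeps growing until the next full backup, while each incremental set stays small; that is why a restore from incrementals needs every set since the last full backup.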
7
Q

Backup Methods

Explain the full backup method

A

Backup for all files on the system.

8
Q

Backup Methods

Explain the Incremental backup method

A

Backup for all files that have changed since the last incremental or full backup.

9
Q

Backup Methods

Explain the Differential backup method

A

Backup for all files that have changed since the last full backup.

10
Q

Backup Methods

Explain the Multiple level incremental backup method

A

Backup for all files that have changed since the previous backup at the same or a previous level.

11
Q

Backup Methods

Explain the User backup method

A

Backup only for files in a specific user’s directory.

12
Q

Backup Strategies

We should note that backup methods are useless without associated restore methods. You have to take into account the robustness, clarity and ease of both directions when selecting strategies.

The simplest backup scheme is to do a full backup of everything once, and then perform incremental backups of everything that subsequently changes. While full backups can take a lot of time, restoring from incremental backups can be more difficult and time consuming. Thus, you can use a mix of both to optimize time and effort.

An example of one useful strategy involving tapes (you can easily substitute other media in the description):

  • Use tape 1 for a full backup on Friday.
  • Use tapes 2-5 for incremental backups on Monday-Thursday.
  • Use tape 6 for full backup on second Friday.
  • Use tapes 2-5 for incremental backups on second Monday-Thursday.
  • Do not overwrite tape 1 until completion of full backup on tape 6.
  • After full backup to tape 6, move tape 1 to external location for disaster recovery.
  • For next full backup (next Friday) get tape 1 and exchange for tape 6.

A good rule of thumb is to have at least two weeks of backups available.
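
The weekly rhythm of the tape scheme above (full backup on Friday, incrementals Monday through Thursday) could be sketched as a small shell function. The function name backup_kind is hypothetical and the sketch only decides which kind of backup is due; it does not run one.

```shell
# backup_kind DAY: DAY is the ISO weekday number (1=Monday ... 7=Sunday),
# as printed by `date +%u`. Hypothetical helper, not part of any tool.
backup_kind() {
    case "$1" in
        5)       echo full ;;          # Friday: full backup (tape 1 or 6)
        1|2|3|4) echo incremental ;;   # Monday-Thursday: tapes 2-5
        *)       echo none ;;          # weekend: no backup in this scheme
    esac
}

# What is due today?
backup_kind "$(date +%u)"
```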

A
13
Q

Some Backup Related Utilities

Name a few of the backup utilities

A
  • cpio and tar
    • cpio and tar create and extract archives of files.
  • gzip, bzip2, and xz
    • The archives are often compressed with gzip, bzip2, or xz. The archive file may be written to disk, magnetic tape, or any other device which can hold files. Archives are very useful for transferring files from one filesystem or machine to another.
  • dd
    • This powerful utility is often used to transfer raw data between media. It can copy entire partitions or entire disks.
  • rsync
    • This powerful utility can synchronize directory subtrees or entire filesystems across a network, or between different filesystem locations on a local machine.
  • dump and restore
    • These ancient utilities were designed specifically for backups. They read from the filesystem directly (which is more efficient). However, they must be restored only on the same filesystem type that they came from. There are newer alternatives.
  • mt
    • This utility is useful for querying and positioning tapes before performing backups and restores.
14
Q

Some Backup Related Utilities

What are the cpio and tar backup utilities used for?

A

cpio and tar create and extract archives of files.

15
Q

Some Backup Related Utilities

What are the gzip, bzip2, and xz backup utilities used for?

A

The archives are often compressed with gzip, bzip2, or xz. The archive file may be written to disk, magnetic tape, or any other device which can hold files. Archives are very useful for transferring files from one filesystem or machine to another.

16
Q

Some Backup Related Utilities

What is the dd backup utility used for?

A

This powerful utility is often used to transfer raw data between media. It can copy entire partitions or entire disks.

17
Q

Some Backup Related Utilities

What is the rsync backup utility used for?

A

This powerful utility can synchronize directory subtrees or entire filesystems across a network, or between different filesystem locations on a local machine.

18
Q

Some Backup Related Utilities

What are the dump and restore backup utilities used for?

A

These ancient utilities were designed specifically for backups. They read from the filesystem directly (which is more efficient). However, they must be restored only on the same filesystem type that they came from. There are newer alternatives.

19
Q

Some Backup Related Utilities

What is the mt backup utility used for?

A

This utility is useful for querying and positioning tapes before performing backups and restores.

20
Q

Using tar for Backups

tar is easy to use:

When creating a tar archive, for each directory given as an argument, ___.

A

all files and subdirectories will be included in the archive

21
Q

Using tar for Backups

tar is easy to use:

When restoring, __.

A

it reconstitutes directories as necessary

22
Q

Using tar for Backups

tar is easy to use:

It even has a --newer option that lets you do ___.

A
  • incremental backups
23
Q

Using tar for Backups

tar is easy to use:

The version of tar used in Linux can also handle backups that do not fit on one ___ or ___.

A
  • tape
  • whatever device you use
24
Q

Using tar for Backups

What command option with tar is used to create an archive?

A
  • -c or just c

Examples:

  • $ tar cvf /dev/st0 /root
  • $ tar -cvf /dev/st0 /root
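
For trying this out without a tape drive, here is a minimal sketch that writes the archive to an ordinary file instead of /dev/st0. The directory and file names are hypothetical.

```shell
# Hypothetical sandbox with one nested subdirectory.
work=$(mktemp -d)
mkdir -p "$work/project/sub"
echo hello > "$work/project/file1.txt"
echo world > "$work/project/sub/file2.txt"

# -c create, -v verbose, -f archive file. -C changes directory first,
# so member names are stored relative to $work.
tar -cvf "$work/backup.tar" -C "$work" project

# List the archive: the subdirectory was included without being named.
listing=$(tar -tf "$work/backup.tar")
echo "$listing"
```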
25
Q

Using tar for Backups

What is the tar command option for creating a multi-volume archive?

You will be prompted to put the next tape when needed.

A
  • -M

Examples:

  • $ tar -cMf /dev/st0 /root
26
Q

Using tar for Backups

What command option with tar verifies files with the compare option?

After you make a backup, you can make sure that it is complete and correct using the above verification option.

A
  • -d or --compare

Examples:

  • $ tar --compare --verbose --file /dev/st0
  • $ tar -dvf /dev/st0
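
A minimal sketch of the verification step, assuming GNU tar and using a regular file in place of the tape device. The file names are hypothetical. The exit status is captured so a script could act on it.

```shell
# Hypothetical sandbox with a single file to back up.
work=$(mktemp -d)
mkdir "$work/data"
echo original > "$work/data/notes.txt"
tar -cf "$work/backup.tar" -C "$work" data

# Right after the backup, archive and filesystem agree: exit status 0.
( cd "$work" && tar -df backup.tar )
status_clean=$?

# Change the file, then compare again: GNU tar reports the difference
# and exits with a non-zero status.
echo "edited after the backup" > "$work/data/notes.txt"
( cd "$work" && tar -df backup.tar )
status_changed=$?

echo "clean: $status_clean, changed: $status_changed"
```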
27
Q

Using tar for Backups

What command option with tar allows you to specify a device or file?

A
  • -f or --file option
28
Q

Using tar for Backups

By default, tar will ___ include all subdirectories in the archive.

A
  • recursively
29
Q

Using tar for Backups

When you create an archive, tar prints a message about removing ___ from the absolute path name. While this allows you to restore the files anywhere, the default behavior can be modified.

A
  • leading slashes
30
Q

Using tar for Backups

Most tar options can be given in ___ with one dash, or ___ with two: -c is completely equivalent to --create. Also note that you can ___ (when using the short notation), so that you don’t have to type every dash.

A
  • short form
  • long form
  • combine options
31
Q

Using tar for Backups

Furthermore, single-dashed tar options can be used with or ___ dashes.

A
  • without

Examples:

$ tar cvf file.tar dir1

has the same result as

$ tar -cvf file.tar dir1

32
Q

Using tar for Restoring Files

The ___ or ___ option extracts files from an archive, all by default. You can narrow the file extraction list by specifying ___. If a directory is specified, all included files and subdirectories are also extracted.

A
  • -x
  • --extract
  • only particular files

Examples:

Extract from an archive:

$ tar -xpvf /dev/st0
$ tar xpvf /dev/st0

Specify only specific files to restore:

$ tar xvf /dev/st0 somefile
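
The selective restore can be sketched against an ordinary archive file. The paths are hypothetical, and -p is included so the restored file keeps its original permissions.

```shell
# Hypothetical sandbox: a source tree and an empty restore area.
work=$(mktemp -d)
mkdir -p "$work/src/docs" "$work/restore"
echo keep > "$work/src/docs/keep.txt"
echo skip > "$work/src/skip.txt"
chmod 640 "$work/src/docs/keep.txt"
tar -cf "$work/backup.tar" -C "$work" src

# Restore only one member into the restore area; tar recreates the
# src/docs/ directories as needed and leaves skip.txt in the archive.
tar -xpvf "$work/backup.tar" -C "$work/restore" src/docs/keep.txt

ls -lR "$work/restore"
```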

33
Q

Using tar for Restoring Files

The ___ or ___ option ensures files are restored with their original permissions.

A
  • -p
  • --same-permissions

Examples:

Extract from an archive:

$ tar --extract --same-permissions --verbose --file /dev/st0

34
Q

Using tar for Restoring Files

The ___ or ___ option lists, but does not extract, the files in the archive.

A
  • -t
  • --list

Examples:

List the contents of a tar backup:

  • $ tar --list --file /dev/st0
  • $ tar -tf /dev/st0
35
Q

Incremental Backups with tar

You can do an incremental backup with tar using the ___ option (or the equivalent ___), or the --after-date option. Each option requires specifying either ___ or a ___.

A
  • -N
  • --newer
  • a date
  • qualified (reference) file name

Example:

$ tar --create --newer '2011-12-1' -vzf backup1.tgz /var/tmp
$ tar --create --after-date '2011-12-1' -vzf backup1.tgz /var/tmp

Either form creates a backup archive of all files in /var/tmp which were modified after December 1, 2011.

Note: when short options such as -vzf follow a long option like --newer, you must include the dash, or tar will get confused. This kind of option specification confusion sometimes occurs with old UNIX utilities like ps and tar, which have complicated histories involving different families of UNIX.
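
The reference-file form can be sketched as follows, assuming GNU tar. The marker file stamp is a hypothetical name for a file touched at the time of the previous backup.

```shell
# Hypothetical sandbox: one file older than the marker, one newer.
work=$(mktemp -d)
mkdir "$work/tree"
echo old > "$work/tree/old.txt"
sleep 1
touch "$work/stamp"            # hypothetical marker: time of the last backup
sleep 1
echo new > "$work/tree/new.txt"

# A --newer argument starting with / or . is treated as a reference
# file, and its modification time is used as the cutoff date.
tar --create --newer "$work/stamp" -f "$work/incr.tar" -C "$work" tree

listing=$(tar -tf "$work/incr.tar")
echo "$listing"
```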

36
Q

Incremental Backups with tar

Because tar only looks at a ___, it does not consider any other changes to the file, such as ___ or ___. To include files with these changes in the incremental backup, use ___ and create a list of files to be backed up.

A
  • file’s date
  • permissions
  • file name
  • find
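
One way to build such a file list is sketched below, assuming GNU tar and a find that supports -cnewer. The marker file stamp is hypothetical. The -cnewer test matches on status-change time, so a permissions change is picked up even though the file contents are untouched.

```shell
# Hypothetical sandbox with two files created before the marker.
work=$(mktemp -d)
mkdir "$work/tree"
echo a > "$work/tree/a.txt"
echo b > "$work/tree/b.txt"
sleep 1
touch "$work/stamp"            # hypothetical marker: time of the last backup
sleep 1
chmod 600 "$work/tree/a.txt"   # status change only; contents are untouched

# find selects files whose status changed after the marker and hands
# the NUL-separated list to tar on standard input.
( cd "$work" && find tree -type f -cnewer stamp -print0 \
      | tar --null --files-from=- -cf changed.tar )

listing=$(tar -tf "$work/changed.tar")
echo "$listing"
```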
37
Q

Archive Compression Methods

It is often desired to compress files to save disk space and/or network transmission time, especially since modern machines will often find the ___ cycle faster than just transmitting (or copying) an uncompressed file.

A
  • compress -> transmit -> decompress
38
Q

Archive Compression Methods

What are the 3 common archive compression tools?

A

In order of increasing compression efficiency (which comes at the cost of longer compression times):

  • gzip
    • Uses Lempel-Ziv Coding (LZ77) and produces .gz files.
  • bzip2
    • Uses Burrows-Wheeler block sorting text compression algorithm and Huffman coding, and produces .bz2 files.
  • xz
    • Produces .xz files and also supports legacy .lzma format.
39
Q

Archive Compression Methods

What is the tar command option to archive and compress using the gzip compression tool?

A
  • -z

Example:

$ tar zcvf source.tar.gz source

40
Q

Archive Compression Methods

What is the tar command option to archive and compress using the bzip2 compression tool?

A
  • -j

Example:

$ tar jcvf source.tar.bz2 source

41
Q

Archive Compression Methods

What is the tar command option to archive and compress using the xz compression tool?

A
  • -J

Example:

$ tar Jcvf source.tar.xz source

42
Q

Archive Compression Methods

When archiving and compressing at the same time with tar:

Example using xz compression:

$ tar Jcvf source.tar.xz source

What are the commands that do the same but perform archiving and compression in separate steps?

Why prefer one over the other?

A

$ tar cvf source.tar source ; xz -v source.tar

It is better to do compression and archiving at the same time because:

  • There is no intermediate file storage.
  • Archiving and compression happen simultaneously in the pipeline.
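
The pipeline idea can be made explicit as in the sketch below. It substitutes gzip for xz purely as an assumption that gzip is more widely installed; the directory name source matches the card, the other names are hypothetical.

```shell
# Hypothetical sandbox with a small directory to archive.
work=$(mktemp -d)
mkdir "$work/source"
echo data > "$work/source/file.txt"

# tar writes the archive to standard output (-f -) and the compressor
# reads it from the pipe, so no uncompressed source.tar is ever stored.
( cd "$work" && tar -cf - source | gzip > source.tar.gz )

# The one-step form produces an equivalent archive.
( cd "$work" && tar -czf onestep.tar.gz source )

piped=$(gzip -dc "$work/source.tar.gz" | tar -tf -)
onestep=$(tar -tzf "$work/onestep.tar.gz")
echo "$piped"
```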
43
Q

Archive Compression Methods

What is the command option used with tar to decompress a compressed archive (.tar package)?

Example:

  1. source.tar.gz source
  2. source.tar.bz2 source
  3. source.tar.xz source
A
  1. $ tar xzvf source.tar.gz
  2. $ tar xjvf source.tar.bz2
  3. $ tar xJvf source.tar.xz

or even simpler:

  1. $ tar xvf source.tar.gz
  2. $ tar xvf source.tar.bz2
  3. $ tar xvf source.tar.xz

as modern versions of tar can sense the method of compression and take care of it automatically.

Obviously, it is not worth using these methods on archives whose component files are already compressed, such as .jpg images, or .pdf files, etc.

44
Q

dd

dd is a common UNIX-based program whose primary purpose is? (2)

dd is used to copy a specified number of ___ or ___, performing on-the-fly byte order conversions, as well as being able to ___ data from one form to another.

dd can also be used to copy regions of raw device files, for example backing up the boot sector of a hard disk, or to read fixed amounts of data from special files like ___ or ___. The basic syntax is?

A
  • the low-level copying
  • conversion of raw data
  • bytes
  • blocks
  • convert
  • /dev/zero
  • /dev/random
  • $ dd if=input-file of=output-file options
45
Q

dd

The basic syntax of dd is?

A

$ dd if=input-file of=output-file options

46
Q

dd

What is the dd command to create an output file named outfile containing 10 megabytes of null character data?

A

$ dd if=/dev/zero of=outfile bs=1M count=10
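
A small sketch that runs this command in a scratch directory and checks the result. It assumes a dd that accepts the 1M block-size suffix (as GNU dd does); the second dd call uses a regular file as a stand-in for a device node, and copy.img is a hypothetical name.

```shell
work=$(mktemp -d)

# 10 blocks of 1 MiB each, read from /dev/zero.
dd if=/dev/zero of="$work/outfile" bs=1M count=10
size=$(wc -c < "$work/outfile")
echo "outfile is $size bytes"

# The same if=/of= pattern copies any readable source; here a regular
# file stands in for a device node such as /dev/sda1.
dd if="$work/outfile" of="$work/copy.img" bs=1M

# cmp is silent and returns 0 when the two files are byte-identical.
cmp "$work/outfile" "$work/copy.img" && echo "image matches source"
```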

47
Q

dd

What is the dd command to back up an entire hard drive (with device node name sda) to another hard drive (with device node name sdb)?

A

$ dd if=/dev/sda of=/dev/sdb

48
Q

dd

What is the dd command to create an image of a device node hard disk sda called sdadisk.img?

A

$ dd if=/dev/sda of=sdadisk.img

49
Q

dd

What is the dd command to create a backup of the partition with device node sda1, called partition1.img?

A

$ dd if=/dev/sda1 of=partition1.img

50
Q

dd

What is the dd command to create a backup image of a CD-ROM named tgsservice.iso?

A

$ dd if=/dev/cdrom of=tgsservice.iso bs=2048

51
Q

Using rsync for Backups

rsync (remote synchronize) is used to transfer files across ___ or ___?

The basic syntax of the rsync command is?

A
  • a network
  • or between different locations on the same machine
  • $ rsync [options] sourcefile destinationfile
52
Q

Using rsync for Backups

$ rsync [options] sourcefile destinationfile

The source and destination can take the form of target:path, where target can be in the form of ___. The ___ part is optional and used if the remote user is different from the local user. Thus, these are all possible rsync commands.

  • $ rsync file.tar someone@backup.mydomain:/usr/local
  • $ rsync -r --dry-run /usr/local /BACKUP/usr
A
  • [user@]host
  • user@
53
Q

Using rsync for Backups

You have to be very careful with rsync about exact location specifications (especially if you use the ___ option), so it is highly recommended to use the ___ option first, and then repeat if the projected action looks correct.

A
  • --delete
  • --dry-run
54
Q

Using rsync for Backups

rsync is very clever; it checks local files against remote files in small chunks, and it is very efficient in that when copying one directory to a similar directory, only the ___ are copied over the network. This synchronizes the second directory with the first directory. You may often use the ___ option, which causes rsync to recursively walk down the directory tree copying all files and directories below the one listed as the sourcefile. Thus, a very useful way to back up a project directory might be similar to:

$ rsync -r project-X archive-machine:archives/project-X

A
  • differences
  • -r
55
Q

Using rsync for Backups

A simple (and very effective and very fast) backup strategy is to simply ___ across a network with rsync commands and to do so ___.

A
  • duplicate directories or partitions
  • frequently
56
Q

Using cpio for Backups

cpio (copy in and out) is a general file archiver utility that has been around since the earliest days of UNIX and was originally designed for tape backups. Even though newer archiving programs (like tar, which is not exactly young) have been deployed to do many of the tasks that were once in the domain of cpio, it still survives.

For example, we have already seen the use of rpm2cpio to convert RPM packages into cpio archives and then extract them. Also, the Linux kernel uses a version of cpio internally to deal with initramfs and initrd initial ram filesystems and disks during boot.

What is 1 reason why cpio is still used today?

A

One reason cpio lives on is that it is lighter than tar and other successors, even if it is somewhat less robust.

57
Q

Using cpio for Backups

How do you use cpio?

A

You can specify the input (-I device) or output (-O device) or use redirection on the command line.

The -o or --create option tells cpio to copy files out to an archive. cpio reads a list of file names (one per line) from standard input and writes the archive to standard output.

The -i or --extract option tells cpio to copy files from an archive, reading the archive from standard input. If you list file names as patterns (such as *.c) on the command line, only files in the archive that match the patterns are copied from the archive. If no patterns are given, all files are extracted.

The -t or --list option tells cpio to list the archive contents.

Adding the -v or --verbose option generates a long listing.

You can see some examples of using cpio below.

Create an archive, use -o or --create:

$ ls | cpio --create -O /dev/st0

Extract from an archive, use -i or --extract:

$ cpio -i somefile -I /dev/st0

List contents of an archive, use -t or --list:

$ cpio -t -I /dev/st0

58
Q

Backup Programs

There is no shortage of backup program suites available for Linux, including proprietary applications or those supplied by storage vendors, as well as open-source applications.

Explain and describe 3 backup programs in Linux.

A

Backup Programs

  • Amanda
    • Amanda (Advanced Maryland Automatic Network Disk Archiver) uses native utilities (including tar and dump) but is far more robust and controllable. Amanda is generally available on Enterprise Linux systems through the usual repositories.
  • Bacula
    • Bacula is designed for automatic backup on heterogeneous networks. It can be rather complicated to use and is recommended (by its authors) only to experienced administrators. Bacula is generally available on Enterprise Linux systems through the usual repositories.
  • Clonezilla
    • Clonezilla is a very robust disk cloning program, which can make images of disks and deploy them, either to restore a backup, or to be used for ghosting, to provide an image that can be used to install many machines.
    • The program comes in two versions: Clonezilla Live, which is good for single machine backup and recovery, and Clonezilla SE, server edition, which can clone to many computers at the same time. Clonezilla is not very hard to use and is extremely flexible, supporting many operating systems (not just Linux), filesystem types, and boot loaders.