Chapter 4 Flashcards
Searching and Analyzing Text
Summarize the various utilities used in processing text files.
Filtering text file data can be made much easier with utilities such as grep, egrep, and cut. Once that data is filtered, you may want to format it for viewing using sort, pr, printf, or even the cat utility. If you need some statistical information on your text file, such as the number of lines it contains, the wc command is handy.
Explain both the structures and commands for redirection.
Employing STDOUT, STDERR, and STDIN redirection allows rather complex filtering and processing of text. The echo command can assist in this process as well as here documents. You can also use pipelines of commands to perform redirection and produce excellent data for review.
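The redirection structures named above can be sketched as follows. This is a minimal illustration, not taken from the chapter: the scratch directory, file names, and sample text are all invented for the example.

```shell
# Hypothetical scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)

# STDOUT redirection: > creates or overwrites a file, >> appends to it.
echo "first line" > "$workdir/out.txt"
echo "second line" >> "$workdir/out.txt"

# STDERR redirection: 2> captures the error message of a failing command.
ls "$workdir/no-such-file" 2> "$workdir/err.txt" || true

# Here document: the lines up to EOF are fed to wc on STDIN.
heredoc_lines=$(wc -l <<EOF
alpha
beta
gamma
EOF
)

# Pipeline: one command's STDOUT becomes the next command's STDIN.
out_lines=$(cat "$workdir/out.txt" | wc -l)

echo "here document lines: $heredoc_lines, out.txt lines: $out_lines"
```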
Describe the various methods used for editing text files.
Editing text files is part of a system administrator’s life. You can use full-screen editors such as the rather complicated vim text editor or the simple and easy-to-use nano editor. For fast and powerful text stream editing, employ sed and its scripts or the gawk programming language.
The cat -E MyFile.txt command is entered, and at the end of every line displayed is a $. What does this indicate?
A. The text file has been corrupted somehow.
B. The text file records end in the ASCII character NUL.
C. The text file records end in the ASCII character LF.
D. The text file records end in the ASCII character $.
E. The text file records contain a $ at their end.
C. A text file record is considered to be a single file line that ends in a newline, which is the ASCII linefeed (LF) character. You can see whether your text file uses this end-of-line character by issuing the cat -E command, which displays a $ wherever an LF occurs. Therefore, option C is the correct answer. The text file may have been corrupted, but this command does not indicate it, so option A is an incorrect choice. The text file records end in the ASCII character LF and not NUL or $. Therefore, options B and D are incorrect. The text file records may very well contain a $ at their end, but you cannot tell from the situation description, so option E is a wrong answer.
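A small sketch of the behavior described above, assuming GNU coreutils cat and a made-up two-record sample file:

```shell
# Hypothetical sample file; printf writes an explicit LF (\n) after each record.
sample=$(mktemp)
printf 'first record\nsecond record\n' > "$sample"

# cat -E (GNU coreutils) shows a $ at each position where an LF ends a line.
marked=$(cat -E "$sample")
echo "$marked"
```

Each displayed line ends in $ even though the file itself contains no $ characters.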
The cut utility often needs delimiters to process text records. Which of the following best describes a delimiter?
A. One or more characters that designate the beginning of a line in a record
B. One or more characters that designate the end of a line in a record
C. One or more characters that designate the end of a text file to a command-line text processing utility
D. A single space or a colon (:) that creates a boundary between different data items in a record
E. One or more characters that create a boundary between different data items in a record
E. To properly use some of the cut command options, fields must exist within each text file record. These fields are data that is separated by a delimiter, which is one or more characters that create a boundary between different data items within a record. Therefore, option E best describes a delimiter and is the correct answer. Option A is made up and is a wrong answer. Option B describes an end-of-line character, such as the ASCII LF. Option C is made up and is a wrong answer. While a single space and a colon can be used as a delimiter, option D is not the best answer and is therefore a wrong choice.
Which of the following utilities change text within a file? (Choose all that apply.)
A. cut
B. sort
C. vim
D. nano
E. sed
C, D. Recall that many utilities that process text do not change the text within a file unless redirection is employed to do so. The only utilities in this list that modify a file's text directly are the text editors vim and nano. Therefore, options C and D are the correct answers. The cut, sort, and sed utilities read the data from the designated text file(s), modify it according to the options used, and display the modified text to standard output. When they are used this way (without redirection or an option that writes to a file, such as GNU sed's -i), the text in the file is not modified. Therefore, options A, B, and E are incorrect choices.
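To illustrate the point about standard output, here is a minimal sketch with sed. The file and its contents are invented for the example.

```shell
# Hypothetical one-line file to edit.
notes=$(mktemp)
echo "old text" > "$notes"

# sed writes the edited stream to STDOUT; the file itself is left as it was.
edited=$(sed 's/old/new/' "$notes")
unchanged=$(cat "$notes")

# Redirecting STDOUT to a second file (not the input file) saves the result.
sed 's/old/new/' "$notes" > "$notes.new"
saved=$(cat "$notes.new")

echo "sed output: $edited | original file: $unchanged | saved copy: $saved"
```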
You have a text file, monitor.txt, which contains information concerning the monitors used within the data center. Each record ends with the ASCII LF character and fields are delimited by a comma (,). A text record has the monitor ID, manufacturer, serial number, and location. To display each data center monitor’s monitor ID, serial number, and location, you’d use which cut command?
A. cut -d "," -f 1,3,4 monitor.txt
B. cut -z -d "," -f 1,3,4 monitor.txt
C. cut -f "," -d 1,3,4 monitor.txt
D. cut monitor.txt -d "," -f 1,3,4
E. cut monitor.txt -f "," -d 1,3,4
A. The cut command gathers data from the text file, listed as its last argument, and displays it according to the options used. To define field delimiters as a comma and display each data center monitor’s monitor ID, serial number, and location, the options to use are -d "," -f 1,3,4. Also, since the text file’s records end with an ASCII LF character, no special options, such as the -z option, are needed to process these records. Therefore, option A is the correct choice. Option B uses the unneeded -z option and is therefore a wrong answer. Option C is an incorrect choice because it reverses the -f and -d options. Options D and E are wrong answers because they put the filename before the command switches, contrary to the command's documented syntax.
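The command in option A can be tried on sample data. This is only a sketch: the records below are invented, and a temporary file stands in for monitor.txt.

```shell
# Hypothetical stand-in for monitor.txt: ID, manufacturer, serial number, location.
monitors=$(mktemp)
cat > "$monitors" <<'EOF'
M001,Acme,SN1001,Rack A
M002,Globex,SN2002,Rack B
EOF

# -d sets the field delimiter; -f selects fields 1, 3, and 4.
selected=$(cut -d "," -f 1,3,4 "$monitors")
echo "$selected"
```

The manufacturer field (field 2) is dropped from each record, and the selected fields stay separated by the same delimiter.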
The grep utility can employ regular expressions in its PATTERN. Which of the following best describes a regular expression?
A. A series of characters you define for a utility, which uses the characters to match the same characters in text files
B. ASCII characters, such as LF and NUL, that a utility uses to filter text
C. Wildcard characters, such as * and ?, that a utility uses to filter text
D. A pattern template you define for a utility, which uses the pattern to filter text
E. Quotation marks (single or double) used around characters to prevent unexpected results
D. Option D is the best answer because a regular expression is a pattern template you define for a utility, such as grep, which uses the pattern to filter text. While you may use a series of characters in a grep PATTERN, they are not called regular expressions, so option A is a wrong answer. Option B is describing end-of-line characters, and not regular expression characters, so it also is an incorrect answer. Option C describes the shell's file globbing wildcards. The * and ? characters do appear in regular expressions, but there they act as quantifiers on the preceding item (the ? only in extended regular expressions) rather than as wildcards. Therefore, option C is a wrong choice. Quotation marks may be employed around a PATTERN, but they are not considered regular expression characters, and therefore option E is an incorrect answer.
You are a system administrator on a Red Hat Linux server. You need to view records in the /var/log/messages file that start with the date May 30 and end with the IPv4 address 192.168.10.42. Which of the following is the best grep command to use?
A. grep "May 30?192.168.10.42" /var/log/messages
B. grep "May 30.*192.168.10.42" /var/log/messages
C. grep -i "May 30.*192.168.10.42" /var/log/messages
D. grep -i "May 30?192.168.10.42" /var/log/messages
E. grep -v "May 30.*192.168.10.42" /var/log/messages
B. Option B is the best command because this grep command employs the correct syntax. It uses the quotation marks around the pattern to avoid unexpected results and uses the .* regular expression characters to indicate that anything can be between May 30 and the IPv4 address. No additional switches are necessary. Option A is not the best grep command because it uses ? where .* is needed. In a basic regular expression the ? matches only a literal question mark, so the pattern does not allow arbitrary text between May 30 and the IPv4 address. Options C and D are not the best grep commands because they employ the -i switch to ignore case, which is not needed in this case. The grep command in option E is an incorrect choice, because it uses the -v switch, which will display text records that do not match the PATTERN.
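The difference between these patterns can be checked on a few invented log records. This is a sketch only: the temporary file stands in for /var/log/messages, and the log lines are made up.

```shell
# Hypothetical stand-in for /var/log/messages.
log=$(mktemp)
cat > "$log" <<'EOF'
May 30 10:15:01 host sshd[101]: Accepted connection from 192.168.10.42
May 30 10:16:44 host sshd[102]: Accepted connection from 192.168.10.7
May 31 09:00:12 host sshd[103]: Accepted connection from 192.168.10.42
EOF

# .* allows any text between the date and the address.
matches=$(grep "May 30.*192.168.10.42" "$log")
match_count=$(grep -c "May 30.*192.168.10.42" "$log")

# With ? instead of .* no record matches; grep exits non-zero, hence || true.
question_count=$(grep -c "May 30?192.168.10.42" "$log" || true)

# -v inverts the match and counts the records that do NOT fit the pattern.
inverted_count=$(grep -v -c "May 30.*192.168.10.42" "$log")

echo "$matches"
```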
Which of the following is a BRE pattern that could be used with the grep command? (Choose all that apply.)
A. Sp?ce
B. "Space, the .*frontier"
C. ^Space
D. (lasting | final)
E. frontier$
A, B, C, E. A BRE is a basic regular expression that describes certain patterns you can use with the grep command. An ERE is an extended regular expression, and it requires the use of grep -E or the egrep command. Options A, B, C, and E are all BRE patterns that can be used with the grep command, so they are correct choices. The only ERE is in option D, and therefore, it is an incorrect choice.
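A brief sketch of the BRE and ERE distinction, using two invented lines of text. The ERE below drops the spaces around | from option D so that the alternation matches the sample phrases; that adjustment is an assumption made for the demonstration.

```shell
# Hypothetical sample text.
phrases=$(mktemp)
cat > "$phrases" <<'EOF'
Space, the final frontier
the lasting frontier
EOF

# BRE with plain grep: ^ anchors to the line start, $ to the line end,
# and .* matches any run of characters.
bre_start=$(grep -c '^Space' "$phrases")
bre_end=$(grep -c 'frontier$' "$phrases")
bre_any=$(grep -c 'Space, the .*frontier' "$phrases")

# ERE with grep -E: parentheses group and | alternates.
ere_alt=$(grep -E -c '(lasting|final) frontier' "$phrases")

# The same pattern as a BRE treats ( | ) as literal characters and matches nothing.
bre_alt=$(grep -c '(lasting|final) frontier' "$phrases" || true)

echo "BRE: $bre_start $bre_end $bre_any, ERE: $ere_alt, ERE pattern as BRE: $bre_alt"
```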
You need to search through a large text file and find any record that contains either Luke or Laura at the record’s beginning. Also, the phrase Father is must be located somewhere in the record’s middle. Which of the following is an ERE pattern that could be used with the egrep command to find this record?
A. "Luke$|Laura$.*Father is"
B. "^Luke|^Laura.Father is"
C. "(^Luke|^Laura).Father is"
D. "(Luke$|Laura$).* Father is$"
E. "(^Luke|^Laura).*Father is.*"
E. To meet the search requirements, option E is the ERE to use with the egrep command. Therefore, option E is the correct answer. Option A will return either a record that ends with Luke or a record that ends with Laura. Thus, option A is the wrong answer. Option B is an incorrect choice because it will return either a record that begins with Luke or a record that begins with Laura and has one character between Laura and the Father is phrase. Option C has the Luke and Laura portion of the ERE correct, but it only allows one character between the names and the Father is phrase, which will not meet the search requirements. Thus, option C is a wrong choice. Option D will try to return either a record that ends with Luke or a record that ends with Laura and contains the Father is phrase, so the egrep command will display nothing. Thus, option D is an incorrect choice.
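Options E and C can be compared on a few invented records. The sketch uses grep -E, which accepts the same extended regular expressions as egrep; the sample records are made up.

```shell
# Hypothetical records to search.
records=$(mktemp)
cat > "$records" <<'EOF'
Luke, I know who your Father is, he said
Laura asked where your Father is today
Leia said Luke, your Father is here
Luke went home
EOF

# Option E: Luke or Laura at the start, then any text, then "Father is".
found=$(grep -E "(^Luke|^Laura).*Father is.*" "$records")
found_count=$(grep -E -c "(^Luke|^Laura).*Father is.*" "$records")

# Option C: allows exactly one character before "Father is", so nothing matches here.
one_char_count=$(grep -E -c "(^Luke|^Laura).Father is" "$records" || true)

echo "$found"
```

Only the first two records are displayed: the third starts with Leia, and the fourth lacks the Father is phrase.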
A file data.txt needs to be sorted numerically and its output saved to a new file newdata.txt. Which of the following commands can accomplish this task? (Choose all that apply.)
A. sort -n -o newdata.txt data.txt
B. sort -n data.txt > newdata.txt
C. sort -n -o data.txt newdata.txt
D. sort -o newdata.txt data.txt
E. sort data.txt > newdata.txt
A, B. To sort the data.txt file numerically and save its output to the new file, newdata.txt, you can either use the -o switch to save the file or employ standard output redirection with the > symbol. In both cases, however, you need to use the -n switch to properly enact a numerical sort. Therefore, both options A and B are correct. Option C is a wrong answer because the command has the newdata.txt and data.txt flipped in the command’s syntax. Options D and E do not employ the -n switch, so they are incorrect answers as well.
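Both correct commands can be sketched on invented numbers, with a default sort shown for contrast. The scratch directory and file contents are assumptions for the example, and newdata2.txt is a hypothetical second output file used only to compare the two approaches.

```shell
# Hypothetical scratch directory and sample data.
workdir=$(mktemp -d)
printf '10\n9\n100\n2\n' > "$workdir/data.txt"

# Option A: -n sorts numerically, -o names the output file.
sort -n -o "$workdir/newdata.txt" "$workdir/data.txt"

# Option B: the same sort, saved with STDOUT redirection instead.
sort -n "$workdir/data.txt" > "$workdir/newdata2.txt"

# Without -n the lines are compared as text, not as numbers.
sort "$workdir/data.txt" > "$workdir/lexical.txt"

with_o=$(cat "$workdir/newdata.txt")
with_redirect=$(cat "$workdir/newdata2.txt")
lexical=$(cat "$workdir/lexical.txt")
echo "$with_o"
```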
Which of the following commands can display the data.txt and datatoo.txt files’ content one after the other to STDOUT? (Choose all that apply.)
A. ls data.txt datatoo.txt
B. sort -n data.txt > datatoo.txt
C. cat -n data.txt datatoo.txt
D. ls -l data.txt datatoo.txt
E. sort data.txt datatoo.txt
C, E. The commands in both options C and E read both the data.txt and datatoo.txt files and write their combined content to STDOUT. The cat -n command will also prepend a line number to each line, but it will still concatenate the files’ content to standard output. The sort command also displays the content of both files, although it orders the combined lines, so lines from the two files may be interleaved. Therefore, options C and E are correct. Option A will just display the files’ names to STDOUT, so it is a wrong answer. Option B will numerically sort the data.txt file, wipe out the datatoo.txt file’s contents, and replace them with the numerically sorted contents from the data.txt file. Therefore, option B is an incorrect answer. Option D will show the two files’ metadata to STDOUT instead of their contents, so it also is a wrong choice.
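The behavior of cat and sort on two files can be sketched as follows. The scratch directory and single-letter contents are invented for the example.

```shell
# Hypothetical scratch directory with two small files.
workdir=$(mktemp -d)
printf 'b\nd\n' > "$workdir/data.txt"
printf 'a\nc\n' > "$workdir/datatoo.txt"

# cat keeps the files in the order given on the command line.
concatenated=$(cat "$workdir/data.txt" "$workdir/datatoo.txt")

# cat -n does the same and puts a line number in front of each line.
numbered=$(cat -n "$workdir/data.txt" "$workdir/datatoo.txt")

# sort reads both files and writes all of their lines in sorted order.
sorted=$(sort "$workdir/data.txt" "$workdir/datatoo.txt")

echo "$numbered"
```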
A text file, StarGateAttacks.txt, needs to be specially formatted for review. Which of the following commands is the best command to accomplish this task quickly?
A. printf
B. wc
C. pr
D. paste
E. nano
C. The pr command’s primary purpose in life is to specially format a text file for printing, and it can accomplish the required task fairly quickly. Therefore, option C is the best choice. While the pr utility can handle formatting entire text files, the printf command is geared toward formatting the output of a single text line. While you could write a shell script to read and format each text file’s line via the printf command, it would not be the quickest method to employ. Therefore, option A is a wrong answer. Option B’s wc command will perform counts on a text file and does not format text file contents, so it is also an incorrect answer. The paste command will “sloppily” put together two or more text files side by side. Thus, option D is a wrong answer. Option E is an incorrect choice because the nano text editor would force you to manually format the text file, which is not the desired action.
You need to format the string 42.777 as a floating-point number with two decimal places. Which of the following printf command FORMAT settings is the correct one to use?
A. "%s\n"
B. "%.2s\n"
C. "%d\n"
D. "%.2c\n"
E. "%.2f\n"
E. The printf FORMAT "%.2f\n" will produce the desired result of 42.78, and therefore option E is the correct answer. The FORMAT in option A will simply output 42.777, so it is an incorrect choice. The FORMAT in option B will output 42 and therefore is a wrong answer. The printf FORMAT setting in option C will produce an error, and therefore, it is an incorrect choice. Option D’s printf FORMAT "%.2c\n" will display 42 and thus is also an incorrect answer.
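A short sketch of the formats from options A, B, and E, assuming a locale that uses a period as the decimal separator. The variable names are invented for the example.

```shell
# %.2f rounds the value to two decimal places.
two_places=$(printf "%.2f\n" 42.777)

# %s prints the argument unchanged as a string.
as_string=$(printf "%s\n" 42.777)

# %.2s prints only the first two characters of the string.
first_two=$(printf "%.2s\n" 42.777)

echo "$two_places $as_string $first_two"
```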