103.2 Process text streams using filters Flashcards

1
Q

cat

A

The cat (short for “concatenate”) command is one of the most frequently used commands on Linux and other Unix-like operating systems. It can create files, display the contents of one or more files, concatenate files, and have its output redirected to the terminal or to another file.
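A minimal sketch of concatenation, assuming two scratch files in a temporary directory (the file names a.txt and b.txt are made up for illustration):

```shell
# Work in a throwaway directory; the file names are illustrative only.
tmpdir=$(mktemp -d)
printf 'alpha\n' > "$tmpdir/a.txt"
printf 'beta\n'  > "$tmpdir/b.txt"

# Concatenate both files in the order given and capture the result.
cat_out=$(cat "$tmpdir/a.txt" "$tmpdir/b.txt")
printf '%s\n' "$cat_out"

# Redirect the concatenated output into a third file instead of the terminal.
cat "$tmpdir/a.txt" "$tmpdir/b.txt" > "$tmpdir/both.txt"
cat_lines=$(wc -l < "$tmpdir/both.txt" | tr -d ' ')
rm -r "$tmpdir"
```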

2
Q

cut

A

The cut command extracts sections from each line of its input files and writes the result to standard output. It can select parts of a line by byte position (-b), character position (-c), or delimited field (-f). One of these options must be given; otherwise cut exits with an error. If more than one file name is provided, the output from each file is not preceded by its file name.
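As a rough illustration of field and character selection, the sketch below runs cut on a single made-up record in the style of /etc/passwd:

```shell
# Sample colon-delimited record (invented data, not a real account).
line='alice:x:1001:1001:Alice Example:/home/alice:/bin/bash'

# -d sets the field delimiter, -f selects fields 1 and 7.
cut_fields=$(printf '%s\n' "$line" | cut -d: -f1,7)

# -c selects character positions 1 through 5.
cut_chars=$(printf '%s\n' "$line" | cut -c1-5)

printf '%s\n%s\n' "$cut_fields" "$cut_chars"
```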

3
Q

expand

A

The expand command converts tabs into spaces and writes the result to standard output. When no file is specified, it reads from standard input.
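A small sketch of the conversion, assuming tab stops every 4 columns (set with -t) so the effect is easy to see:

```shell
# printf emits a literal tab between the two letters.
# With tab stops every 4 columns, the tab after "a" becomes 3 spaces.
expanded=$(printf 'a\tb\n' | expand -t 4)
printf '[%s]\n' "$expanded"
```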

4
Q

fmt

A

The fmt command is a simple text formatter: it reflows paragraphs so that lines fit a target width, joining short lines and breaking long ones. Text files can also be reformatted by hand, but that is very time consuming for large files, which is where fmt comes to the rescue.
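The following sketch reflows one long line to a maximum width of 20 columns with -w. Exactly where the breaks fall can differ between fmt implementations, so no expected output is shown:

```shell
# A single 48-character line used as sample input.
text='one two three four five six seven eight nine ten'

# -w 20 asks fmt to keep every output line within 20 columns.
fmt_out=$(printf '%s\n' "$text" | fmt -w 20)
printf '%s\n' "$fmt_out"
```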

5
Q

head

A

It is the complement of the tail command. The head command, as the name implies, prints the first N lines of the given input. By default, it prints the first 10 lines of each specified file. If more than one file name is provided, the data from each file is preceded by a header showing its file name.
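A short sketch of the default and of an explicit line count, using twelve numbered lines in place of a real file:

```shell
# Twelve numbered lines stand in for a real file.
numbers=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12)

# With no options, head keeps the first 10 lines.
head_default=$(printf '%s\n' "$numbers" | head)

# -n 3 keeps only the first 3 lines.
head_three=$(printf '%s\n' "$numbers" | head -n 3)
printf '%s\n' "$head_three"
```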

6
Q

od

A

The od command dumps its input in various formats, with octal as the default. It is especially useful when debugging scripts or data files for unwanted or invisible characters. If more than one file is specified, od concatenates them in the listed order to form the input. It can also display output in other formats, including hexadecimal, decimal, and ASCII characters. This makes it useful for inspecting data that is not human-readable, such as the executable code of a program.

https://www.geeksforgeeks.org/od-command-linux-example/
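As a sketch, the commands below dump the three bytes of "AB" plus a newline in hexadecimal and as characters. -An suppresses the offset column; the column spacing varies between od implementations, so the spaces are stripped before printing:

```shell
# -t x1 prints each byte as two hex digits.
od_hex=$(printf 'AB\n' | od -An -t x1 | tr -d ' ')

# -c prints printable characters as themselves and the newline as \n.
od_chars=$(printf 'AB\n' | od -An -c | tr -d ' ')

printf '%s\n%s\n' "$od_hex" "$od_chars"
```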

7
Q

join

A

The join command in UNIX is a command line utility for joining lines of two files on a common field.

Suppose you have two files that need to be combined so that related data ends up together. For example, one file might contain names and the other IDs, and the requirement is to combine them so that each name and its corresponding ID appear on the same line. join is the tool for this: it joins two files based on a key field present in both. Both files should be sorted on the key field. Fields may be separated by white space (the default) or by any delimiter given with -t.
Example:

# Display the contents of the first file
$ cat file1.txt
1 AAYUSH
2 APAAR
3 HEMANT
4 KARTIK
# Display the contents of the second file
$ cat file2.txt
1 101
2 102
3 103
4 104
# Join the two files
$ join file1.txt file2.txt
1 AAYUSH 101
2 APAAR 102
3 HEMANT 103
4 KARTIK 104
# By default, join uses the first field of each file as the key, as in the case above
8
Q

nl

A

nl is a Linux command that numbers the lines of files: it copies its input files to standard output, prepending line numbers. It is more flexible than cat -n or cat -b, providing an almost bizarre amount of control over the numbering.
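A brief sketch of two numbering styles. The options -w 1 (number width) and -s ' ' (separator) are used here only to keep the output compact:

```shell
# By default (-b t) nl numbers only non-empty lines,
# so "third" receives number 2.
nl_default=$(printf 'first\n\nthird\n' | nl -w 1 -s ' ')

# -b a numbers every line, including the empty one,
# so "third" receives number 3.
nl_all=$(printf 'first\n\nthird\n' | nl -b a -w 1 -s ' ')

printf '%s\n---\n%s\n' "$nl_default" "$nl_all"
```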

9
Q

paste

A

paste joins files horizontally (parallel merging): each output line consists of the corresponding lines from each specified file, separated by a tab by default, and is written to standard output.
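A minimal sketch of parallel merging, assuming two scratch files (names.txt and ids.txt are invented for illustration) and a comma as delimiter instead of the default tab:

```shell
# Work in a throwaway directory; the file names are illustrative only.
tmpdir=$(mktemp -d)
printf 'AAYUSH\nAPAAR\n' > "$tmpdir/names.txt"
printf '101\n102\n'      > "$tmpdir/ids.txt"

# -d, replaces the default tab delimiter with a comma.
paste_out=$(paste -d, "$tmpdir/names.txt" "$tmpdir/ids.txt")
printf '%s\n' "$paste_out"
rm -r "$tmpdir"
```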

10
Q

pr

A

The pr command writes the specified file or files to standard output. If you specify the - (minus sign) parameter instead of the File parameter, or if you specify neither, the pr command reads standard input. A heading that contains the page number, date, time, and name of the file separates the output into pages.

11
Q

sed

A

sed stands for stream editor. It can perform many operations on text, such as searching, find and replace, insertion, and deletion. Its most common use is substitution (find and replace). With sed you can edit files without opening them, which is a much quicker way to find and replace text than opening the file in the vi editor and changing it by hand.

sed also supports regular expressions, which allow it to perform complex pattern matching.
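The two most common operations, substitution and deletion, can be sketched as follows on made-up input:

```shell
# s/old/new/g replaces every occurrence on each line;
# without the trailing g only the first match per line is replaced.
sed_sub=$(printf 'unix is great. unix is free.\n' | sed 's/unix/linux/g')

# /pattern/d deletes every line that matches the pattern.
sed_del=$(printf 'keep\ndrop me\nkeep\n' | sed '/drop/d')

printf '%s\n%s\n' "$sed_sub" "$sed_del"
```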

12
Q

sort

A

The sort command sorts the lines of a text file, arranging the records in a particular order. It prints the lines of its input, or the concatenation of all files listed as arguments, in sorted order. By default the comparison is textual and follows the collating rules of the current locale; with options, sort can also sort numerically.

It supports sorting alphabetically, in reverse order, by number, and by month, and can also remove duplicates. It can sort on fields other than the start of the line, ignore case, and report whether a file is already sorted. Sorting is based on one or more sort keys extracted from each line of input. By default, the entire line is taken as the sort key, and blank space is the default field separator.

In a typical non-C locale such as en_US.UTF-8, the default ordering has these features:

Lines starting with a number will appear before lines starting with a letter.
Lines starting with a letter that appears earlier in the alphabet will appear before lines starting with a letter that appears later in the alphabet.
Lines starting with a lowercase letter will appear before lines starting with the same letter in uppercase.

In the C (POSIX) locale, by contrast, comparison is by byte value, so all uppercase letters sort before all lowercase letters.

Command :
$ cat > mix.txt
abc
apple
BALL
Abc
bat
Now use the sort command
Command :
$ sort mix.txt
Output :
abc
Abc
apple
bat
BALL
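The ordering above depends on the locale. As a sketch, forcing the C locale with LC_ALL=C gives a byte-wise order for the same input, and -n and -u illustrate numeric sorting and duplicate removal:

```shell
# In the C locale, uppercase letters sort before lowercase ones.
sort_c=$(printf 'abc\napple\nBALL\nAbc\nbat\n' | LC_ALL=C sort)

# -n compares by numeric value instead of character by character.
sort_num=$(printf '10\n9\n100\n' | sort -n)

# -u drops duplicate lines from the sorted output.
sort_uniq=$(printf 'b\na\nb\n' | sort -u)

printf '%s\n---\n%s\n---\n%s\n' "$sort_c" "$sort_num" "$sort_uniq"
```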
13
Q

split

A

The split command is used to split a file into pieces on Linux and UNIX systems. By default, each output file holds 1000 lines and the output file names begin with the prefix ‘x’ (xaa, xab, and so on).
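A small sketch of splitting by line count, assuming a five-line scratch file and a made-up prefix part_ in place of the default x:

```shell
# Work in a throwaway directory; input.txt and part_ are illustrative names.
tmpdir=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 > "$tmpdir/input.txt"

# -l 2 puts at most 2 lines in each piece; part_ is the output prefix.
(cd "$tmpdir" && split -l 2 input.txt part_)

# Five lines in pieces of two yield three files; the last holds one line.
split_names=$(cd "$tmpdir" && echo part_*)
split_last=$(cat "$tmpdir/part_ac")
printf '%s\n' "$split_names"
rm -r "$tmpdir"
```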

14
Q

tail

A

It is the complement of the head command. The tail command, as the name implies, prints the last N lines of the given input. By default, it prints the last 10 lines of each specified file. If more than one file name is provided, the data from each file is preceded by a header showing its file name.
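A short sketch of two ways to select trailing lines, using twelve numbered lines in place of a real file:

```shell
# Twelve numbered lines stand in for a real file.
numbers=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12)

# -n 3 keeps the last 3 lines.
tail_three=$(printf '%s\n' "$numbers" | tail -n 3)

# -n +11 starts output at line 11 and continues to the end.
tail_from=$(printf '%s\n' "$numbers" | tail -n +11)

printf '%s\n---\n%s\n' "$tail_three" "$tail_from"
```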

15
Q

tr

A

The tr command in UNIX is a command line utility for translating or deleting characters. It supports a range of transformations including uppercase to lowercase, squeezing repeating characters, deleting specific characters and basic find and replace. It can be used with UNIX pipes to support more complex translation. tr stands for translate.
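The three transformations mentioned above can be sketched like this on made-up input (tr reads only standard input, so it is fed through a pipe):

```shell
# Translate lowercase to uppercase using character classes.
tr_upper=$(printf 'hello world\n' | tr '[:lower:]' '[:upper:]')

# -s squeezes each run of repeated spaces into a single space.
tr_squeeze=$(printf 'a    b   c\n' | tr -s ' ')

# -d deletes every character in the given set, here the digits.
tr_delete=$(printf 'h3ll0\n' | tr -d '0-9')

printf '%s\n%s\n%s\n' "$tr_upper" "$tr_squeeze" "$tr_delete"
```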

16
Q

unexpand

A

The unexpand command is a command line utility that converts spaces into tabs.

By default, unexpand converts only the leading spaces and tabs of each line into tabs and writes the result to standard output; with the -a option it converts all runs of blanks that end at a tab stop, not just the leading ones. Its general syntax is: unexpand [OPTION]... [FILE]...
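A minimal sketch of both modes, assuming the default tab stops every 8 columns:

```shell
# Eight leading spaces reach the first tab stop and become one tab.
unexp_lead=$(printf '        indented\n' | unexpand)

# Without -a, blanks after the first non-blank character are left alone.
unexp_plain=$(printf 'a       b\n' | unexpand)

# With -a, the 7 spaces after "a" end at a tab stop and become one tab.
unexp_all=$(printf 'a       b\n' | unexpand -a)

# od -c makes the tabs visible as \t.
printf '%s\n' "$unexp_lead" "$unexp_all" | od -c
```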

17
Q

uniq

A

The uniq command in Linux is a command line utility that reports or filters out repeated lines in a file.
In simple words, uniq detects adjacent duplicate lines and removes them. It filters adjacent matching lines from the input file (or from standard input if no file is given) and writes the filtered data to standard output or to an optional output file. Because only adjacent duplicates are detected, the input is usually sorted first.
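The sketch below shows why adjacency matters: the same made-up input is filtered once as-is and once after sorting.

```shell
# Only the two adjacent "a" lines collapse; the later "a" survives.
uniq_adjacent=$(printf 'a\na\nb\na\n' | uniq)

# Sorting first makes all duplicates adjacent, so each line appears once.
uniq_sorted=$(printf 'a\na\nb\na\n' | sort | uniq)

printf '%s\n---\n%s\n' "$uniq_adjacent" "$uniq_sorted"
```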

18
Q

wc

A

wc stands for word count. As the name implies, it is mainly used for counting.

It reports the number of lines, words, bytes, and characters in the files given as arguments.
By default it displays four columns of output.
The first column shows the number of lines in the file, the second the number of words, the third the number of bytes (use -m to count characters instead), and the fourth the file name given as an argument.
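A short sketch of the individual counters on two made-up lines read from standard input. Some implementations pad the numbers with leading spaces, so those are stripped here:

```shell
# Sample input: 2 lines, 3 words, 14 bytes (including both newlines).
sample=$(printf 'one two\nthree')

wc_lines=$(printf '%s\n' "$sample" | wc -l | tr -d ' ')   # -l counts lines
wc_words=$(printf '%s\n' "$sample" | wc -w | tr -d ' ')   # -w counts words
wc_bytes=$(printf '%s\n' "$sample" | wc -c | tr -d ' ')   # -c counts bytes

printf '%s %s %s\n' "$wc_lines" "$wc_words" "$wc_bytes"
```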