Text processing Flashcards
What is the ‘cut’ command used for?
Used for extracting sections of each line by delimiter-separated field, byte position, or character position
How can you get the nth character from each line with ‘cut’?
cut -cn filename
How can you get multiple characters from each line with ‘cut’?
cut -c1,3,4 filename (returns the 1st, 3rd, and 4th characters)
How do you get a range of characters from each line using ‘cut’?
cut -c1-3,6-10 filename
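The character-selection forms above can be tried on a single line of input. This is a minimal sketch; the sample string and variable names are made up for illustration.

```shell
# Hypothetical sample line, used only for illustration.
line='abcdefghij'

# 3rd character only.
third=$(printf '%s\n' "$line" | cut -c3)

# 1st, 3rd, and 4th characters.
picked=$(printf '%s\n' "$line" | cut -c1,3,4)

# Characters 1-3 followed by characters 6-10.
ranges=$(printf '%s\n' "$line" | cut -c1-3,6-10)

printf '%s\n' "$third" "$picked" "$ranges"
# prints:
# c
# acd
# abcfghij
```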
How do you get a range of bytes from each line using ‘cut’?
cut -b1-8 filename (first 8 bytes)
How can you split lines into columns on a colon (:) delimiter and select the 5th column using ‘cut’?
cut -d: -f 5 filename
How can you use ‘cut’ on the output of another command?
command [OPTIONS] | cut [OPTIONS] (no filename; ‘cut’ reads standard input)
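As a sketch of field selection on piped input, the example below feeds passwd-style records to ‘cut’. The records, names, and paths are invented for illustration and do not come from a real system.

```shell
# Hypothetical passwd-style records; the names and paths are made up.
records='alice:x:1001:1001:Alice Example:/home/alice:/bin/bash
bob:x:1002:1002:Bob Example:/home/bob:/bin/sh'

# Split each line on ':' and keep field 5.
names=$(printf '%s\n' "$records" | cut -d: -f5)

# Fields 1 and 7; cut rejoins the selected fields with the same delimiter.
shells=$(printf '%s\n' "$records" | cut -d: -f1,7)

printf '%s\n' "$names" "$shells"
# prints:
# Alice Example
# Bob Example
# alice:/bin/bash
# bob:/bin/sh
```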
What is the ‘awk’ command used for?
Text processing utility/language used to extract text from files or command output
How do you output a single space-separated column from lines of text using ‘awk’?
awk '{print $1}' filename ($n is the nth column)
How can you get multiple columns using ‘awk’?
awk '{print $1,$2,$3}' filename
How can you get the last column using ‘awk’?
awk '{print $NF}' filename
How can you search for text using ‘awk’?
awk '/<text_search>/ {print}' filename
How can you split text on a delimiter using ‘awk’?
awk -F<delimiter> '{print $1}' filename
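The column, pattern, and delimiter cards above can be exercised together on small inputs. A minimal sketch, with made-up sample text and variable names:

```shell
# Hypothetical input lines, used only for illustration.
first=$(printf 'one two three\n' | awk '{print $1}')    # first column
last=$(printf 'one two three\n' | awk '{print $NF}')    # last column
pair=$(printf 'one two three\n' | awk '{print $1,$3}')  # columns 1 and 3

# Print only the lines that match a pattern.
matched=$(printf 'apple pie\nbanana split\n' | awk '/banana/ {print}')

# Split on ':' instead of whitespace and take the 2nd field.
field=$(printf 'a:b:c\n' | awk -F: '{print $2}')

printf '%s\n' "$first" "$last" "$pair" "$matched" "$field"
# prints:
# one
# three
# one three
# banana split
# b
```

The comma in `print $1,$3` joins the fields with awk's output field separator, which is a single space by default.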
How can you replace text using ‘awk’?
echo "One" | awk '{$1="Two"; print $0}' (replaces One with Two)
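One detail worth seeing in action: assigning to a field makes awk rebuild the line with single-space separators, while `gsub` (a standard awk function not covered in the cards) substitutes in place. The sample text below is invented for illustration.

```shell
# Field assignment: $0 is rebuilt, so the double space collapses to one.
replaced=$(printf 'One  fish\n' | awk '{$1="Two"; print $0}')

# gsub replaces every match in the line and leaves the spacing untouched.
subbed=$(printf 'One  fish, One pond\n' | awk '{gsub(/One/, "Two"); print}')

printf '%s\n' "$replaced" "$subbed"
# prints:
# Two fish
# Two  fish, Two pond
```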
How can you get all file lines longer than n bytes with ‘awk’?
cat filename | awk 'length($0) > n' (replace n with a number)
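A minimal sketch of the length filter, passing the threshold in with `-v` rather than editing the program text. The threshold of 5 and the sample lines are arbitrary choices for illustration.

```shell
# Keep only lines whose length exceeds n; n is supplied with -v.
long_lines=$(printf 'short\nmuch longer line\nabc\n' | awk -v n=5 'length($0) > n')

printf '%s\n' "$long_lines"
# prints:
# much longer line
```

"short" is exactly 5 characters, so it does not pass the strict `>` comparison.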
Detail the syntax of an ‘if’ statement in ‘awk’
ls -l | awk '{if ($3 == "username") print $0;}' (the owner is the 3rd column of ‘ls -l’) - general form: '{if (condition) action;}'
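To show the ‘if’ form without depending on a real directory listing, the sketch below uses invented "owner filename" records. It also shows the equivalent pattern-action form, where the condition sits outside the braces.

```shell
# Hypothetical "owner filename" records, made up for illustration.
listing='root passwd
alice notes.txt
alice todo.txt'

# Explicit if statement inside the action block.
owned=$(printf '%s\n' "$listing" | awk '{ if ($1 == "alice") print $2 }')

# Same filter written as a pattern followed by an action.
owned_alt=$(printf '%s\n' "$listing" | awk '$1 == "alice" { print $2 }')

printf '%s\n' "$owned"
# prints:
# notes.txt
# todo.txt
```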
How do you print the number of fields per line using ‘awk’?
awk '{print NF}' filename
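A quick sketch of the field count on made-up input, including an empty line, which has zero fields:

```shell
# Three lines: 3 fields, 2 fields, and an empty line.
field_counts=$(printf 'a b c\nd e\n\n' | awk '{print NF}')

printf '%s\n' "$field_counts"
# prints:
# 3
# 2
# 0
```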
What is the command ‘grep’ used for?
Advanced pattern matching tool for finding text within files and output
What does ‘grep’ stand for?
Global Regular Expression Print
What is the basic syntax of ‘grep’?
grep search_term filename
How can you count the lines that match a search term in ‘grep’?
grep -c search_term filename
How can you ignore case in the search term in ‘grep’?
grep -i search_term filename
How can you get matched line numbers in ‘grep’?
grep -n search_term filename
How can you get all non-matched lines in ‘grep’?
grep -v search_term filename
How can you use ‘grep’ on the output of another command?
command [OPTIONS] | grep search_term
How can you search for multiple terms with ‘egrep’?
egrep -i "keyword1|keyword2" filename (-i is optional; ‘egrep’ is equivalent to ‘grep -E’)
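The ‘grep’ options above can be compared side by side on one small file. This is a sketch only; the log-style lines and the temporary file are invented for illustration.

```shell
# Hypothetical sample file, created in a temporary location.
sample=$(mktemp)
printf 'Error: disk full\nwarning: low memory\nerror: timeout\nok\n' > "$sample"

count=$(grep -c 'error' "$sample")            # matching lines, case-sensitive
count_i=$(grep -ci 'error' "$sample")         # matching lines, ignoring case
numbered=$(grep -n 'warning' "$sample")       # match prefixed with its line number
inverted=$(grep -vi 'error' "$sample")        # lines that do NOT match
either=$(grep -Ei 'error|warning' "$sample")  # either term (same as egrep -i)

rm -f "$sample"

printf '%s\n' "$count" "$count_i" "$numbered"
# prints:
# 1
# 2
# 2:warning: low memory
```

Note that `-c` counts matching lines, so a line containing the term twice still counts once.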
What is the ‘sort’ command used for?
Sorting lines of text (in alphabetical order by default)
What is the ‘uniq’ command used for?
Filtering out adjacent repeated lines
What is the basic syntax of ‘sort’?
sort filename or command [OPTIONS] | sort
How do you list in reverse order using ‘sort’?
sort -r filename
How can you order by space-separated column number using ‘sort’?
sort -k2 filename
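The sketch below contrasts the ‘sort’ forms above on invented data. It also uses `-k2,2n`, a standard numeric-sort variant that the cards do not cover, to show why a plain `-k2` can surprise when the column holds numbers.

```shell
# Hypothetical "name count" records, made up for illustration.
data='pear 2
apple 10
fig 3
kiwi 1'

alpha=$(printf '%s\n' "$data" | sort)          # by whole line
reverse=$(printf '%s\n' "$data" | sort -r)     # reversed
by_col2=$(printf '%s\n' "$data" | sort -k2)    # column 2 compared as text
by_num=$(printf '%s\n' "$data" | sort -k2,2n)  # column 2 compared as a number

printf '%s\n' "$by_col2"
# prints (as text, "10" sorts before "2"):
# kiwi 1
# apple 10
# pear 2
# fig 3
```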
Detail a limitation of ‘uniq’ that means it will still sometimes display duplicate lines.
Input to ‘uniq’ must be sorted. ‘uniq’ does not guarantee that duplicates are removed unless they are adjacent
How are ‘sort’ and ‘uniq’ used together?
sort filename | uniq
How can you display duplicate counts with ‘uniq’?
sort filename | uniq -c
How can you display only duplicated text with ‘uniq’?
sort filename | uniq -d
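A small sketch of why ‘uniq’ needs sorted input, using made-up lines. Without ‘sort’, only the adjacent pair of "b" lines collapses; after ‘sort’, every duplicate is adjacent and gets removed.

```shell
# Hypothetical input with both adjacent and non-adjacent duplicates.
input='b
b
a
c
a
b'

raw=$(printf '%s\n' "$input" | uniq)             # only adjacent repeats removed
deduped=$(printf '%s\n' "$input" | sort | uniq)  # all repeats removed
dups=$(printf '%s\n' "$input" | sort | uniq -d)  # only lines that occurred more than once
counts=$(printf '%s\n' "$input" | sort | uniq -c)  # count before each line; padding varies

printf '%s\n' "$raw"
# prints:
# b
# a
# c
# a
# b
```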
What is the ‘wc’ command used for?
Counting lines, words, or bytes of an input stream
What is the standard output of ‘wc’ when a file is passed?
lines words bytes filename
How can you print line count using ‘wc’?
wc -l filename
How can you print word count using ‘wc’?
wc -w filename
How can you print byte count using ‘wc’?
wc -c filename
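As a sketch, the three counts can be checked against a file with known contents. The file is a temporary one created for illustration. Input is redirected so that ‘wc’ prints no filename, and `tr` strips the leading padding that some implementations add.

```shell
# Hypothetical sample file: 2 lines, 5 words, 24 bytes.
sample=$(mktemp)
printf 'one two\nthree four five\n' > "$sample"

line_count=$(wc -l < "$sample" | tr -d ' ')
word_count=$(wc -w < "$sample" | tr -d ' ')
byte_count=$(wc -c < "$sample" | tr -d ' ')

rm -f "$sample"

printf '%s %s %s\n' "$line_count" "$word_count" "$byte_count"
# prints:
# 2 5 24
```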