awk Flashcards
AWK
Print first field in every line
awk '{print $1}' file
AWK
Print first field in every line where fields are separated by a ":"
awk -F: '{print $1}' file
or
awk -F ":" '{print $1}' file
-F is a command-line option that sets the field separator (the FS variable)
AWK
Change the second field in every line to "Hi"
awk '{$2="Hi"; print $0}' file
Multiple statements in an action are separated by ";"
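A small sketch of what happens to the rest of the line when a field is assigned. Assigning to $2 makes awk rebuild $0, joining the fields with OFS (a space by default), so the original separator is not preserved unless OFS is set to match. The sample input "a:b:c" and the shell variable names are made up for illustration.

```shell
# Default OFS is a single space, so the ":" separators are lost on rebuild.
default_ofs=$(printf 'a:b:c\n' | awk -F: '{ $2 = "Hi"; print $0 }')
echo "$default_ofs"   # a Hi c

# Setting OFS to ":" keeps the output in the same shape as the input.
colon_ofs=$(printf 'a:b:c\n' | awk -F: 'BEGIN { OFS = ":" } { $2 = "Hi"; print $0 }')
echo "$colon_ofs"     # a:Hi:c
```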
AWK
Apply the awk script in "awkscript" to the file named "testfile" using ":" as the field delimiter
awk -F: -f awkscript testfile
The -f option names the file containing the awk script
AWK
Print a line "Header" followed by all of the lines in the file
awk 'BEGIN {print "Header"} {print $0}' file
BEGIN is a special pattern that matches once, before the first line is read. BEGIN blocks can be used for initialization of variables etc.
Note that the following would also work, because BEGIN actions run before any input is read regardless of where they appear in the program:
awk '{print $0} BEGIN {print "Header"}' file
AWK
How do you specify a command that should run after all lines have been processed?
awk 'END { command }' file
Note that you can have other pattern/action pairs in the same program.
An awk program consists of pattern/action pairs. The pattern can be a regex, an expression, or one of the special patterns BEGIN and END. If no pattern is supplied for an action, the action applies to every record.
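A minimal sketch combining the three kinds of pattern in one program: BEGIN, a regex, and END. The fruit data and the variable name "total" are invented for the example.

```shell
out=$(printf 'apple 3\nbanana 5\navocado 2\n' | awk '
  BEGIN { print "start" }          # runs once, before any input
  /^a/  { total += $2 }            # runs only for lines starting with "a"
  END   { print "total=" total }   # runs once, after the last line
')
echo "$out"
# start
# total=5
```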
AWK
There are two ways to set variables. What are they?
As options on the awk command line.
awk -F ":" '{print $0}' file
Or in the BEGIN block.
awk 'BEGIN {FS = ":"} {print $0}' file
Note that the names of the command-line option and the variable do not necessarily match. In the above, the -F option and the FS variable both set the field separator.
Also, not all variables have a dedicated command-line option.
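For variables without a dedicated option, the standard -v option assigns any variable from the command line before the program starts, including BEGIN. A short sketch, where "greeting" is a made-up user variable and the input is sample data:

```shell
# -v sets both a built-in variable (OFS) and a user variable (greeting).
out=$(printf 'x y\n' | awk -v OFS='-' -v greeting='Hello' '{ print greeting, $1, $2 }')
echo "$out"   # Hello-x-y
```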
AWK
Each line of the input is by default a "record". How do you change this to use a different delimiter for records?
The RS variable sets the Record Separator, which is the newline character by default. Here is a script which uses records separated by ";" and fields separated by ",". Note that NR holds the number of the record currently being processed.
awk 'BEGIN {RS = ";" ; FS = ","} {print NR ") " $1 "=" $2}' file
This script will process a single line, where records are separated by ";". Each record consists of two fields, separated by ",". It will print, for example, "1) a=b" for each record.
Example:
If file contains "a,b;c,d;e,f"
then the output will be:
1) a=b
2) c=d
3) e=f
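The same example can be run without a file by piping the sample text in. One thing to watch for in this sketch: the input is written without a trailing newline. Once RS is ";", a newline at the end of the file is no longer a separator, so it would become part of the last field.

```shell
# printf without "\n" at the end, so the last record is exactly "e,f".
out=$(printf 'a,b;c,d;e,f' | awk 'BEGIN { RS = ";"; FS = "," } { print NR ") " $1 "=" $2 }')
echo "$out"
# 1) a=b
# 2) c=d
# 3) e=f
```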
AWK
Which variable controls what is output between comma-separated arguments passed to "print"?
OFS
Output Field Separator (a single space by default)
AWK
Which variable controls what is output between records?
ORS
Output Record Separator
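A quick sketch showing OFS and ORS together, on made-up input. OFS is only inserted where the arguments to print are separated by commas; ORS replaces the newline that print normally appends.

```shell
out=$(printf 'a b\nc d\n' | awk 'BEGIN { OFS = "-"; ORS = ";" } { print $1, $2 }')
echo "$out"   # a-b;c-d;
```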
AWK
Which variable indicates how many fields are in the current record?
NF
Number of Fields
AWK
Which variable indicates the number of the current record?
NR
Number of Records (read so far)
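A small illustration of NF and NR on invented input. Because NF is the field count, $NF refers to the last field of the current record.

```shell
out=$(printf 'one two three\nfour five\n' | awk '{ print NR ": " NF " fields, last=" $NF }')
echo "$out"
# 1: 3 fields, last=three
# 2: 2 fields, last=five
```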
AWK
How would you process data where each record consists of several fields, each on its own line, and the records are separated by a blank line?
e.g.
Joe Blow
554-456-3422

Bob Skulk
643-644-6789
awk 'BEGIN {FS="\n"; RS=""} {print $1 " ph=" $2}' file
RS="" says records are separated by one or more blank lines.
FS="\n" says fields are separated by a newline.
AWK
How do you create your own variables?
Assign to them by name. Initialize them in the BEGIN block when they need a starting value; they can then be read and updated in other blocks.
e.g.
awk 'BEGIN {myvar = "Hi there"} {print myvar, $0}' file
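A sketch of a variable being updated across blocks. A variable that has never been assigned behaves as 0 in numeric context (and as an empty string otherwise), so a counter works without explicit initialization. The names "label" and "count" are made up for this example.

```shell
out=$(printf 'a\nb\nc\n' | awk '
  BEGIN { label = "lines" }        # set once before input
        { count++ }                # incremented for every record
  END   { print label "=" count }  # both variables still visible here
')
echo "$out"   # lines=3
```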
AWK
Print the second field of each line if the first field is > 30
awk '{if ($1 > 30) print $2}' testfile
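The same test can also be written as a pattern in front of the action, which is usually the more idiomatic form. A short comparison on made-up data, showing that both versions select the same lines:

```shell
data='10 low\n45 high\n31 mid\n'   # printf expands the \n sequences

# Condition inside the action, as on the card.
with_if=$(printf "$data" | awk '{ if ($1 > 30) print $2 }')

# Condition as the pattern of a pattern/action pair.
with_pattern=$(printf "$data" | awk '$1 > 30 { print $2 }')

echo "$with_pattern"
# high
# mid
```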