Week 4 Flashcards
What is the POSIX standard?
POSIX is a family of standards that, among other things, defines two types of regular expressions: 1. Basic Regular Expression (BRE) 2. Extended Regular Expression (ERE)
What is regex?
Regex is a pattern template to filter text
How is the grep command used?
- grep 'pattern' filename
- command | grep 'pattern'
- egrep 'pattern' filename
- grep -E 'pattern' filename
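The invocation forms above can be tried on a small sample file. This is a minimal sketch: the file contents and the temporary directory are made up for the demo, and grep -E is used in place of egrep since the notes list the two as equivalent.

```shell
# Hypothetical sample data, created only for this demo.
tmp=$(mktemp -d)
printf '%s\n' 'Sam' 'Mary' 'Sunil' > "$tmp/names.txt"

# 1. Pattern and file name given as arguments.
grep 'S' "$tmp/names.txt"

# 2. Input arriving through a pipe instead of a file argument.
cat "$tmp/names.txt" | grep 'S'

# 3. Extended regular expression syntax (what egrep provides).
grep -E 'S(am|unil)' "$tmp/names.txt"
```

With this sample data, each of the three commands prints the lines Sam and Sunil.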
What are the special characters used in a grep pattern?
. - any single character
* - zero or more of the preceding character/expression
[] - any one of the enclosed characters; a hyphen (-) indicates a character range
^ - anchor for the beginning of a line, or negation when it is the first character inside []
$ - anchor for the end of a line
\ - escapes a special character
\{n,m\} - the preceding pattern occurs at least n and at most m times (in BRE the braces are escaped)
\(\) - grouping of regular expressions (in BRE the parentheses are escaped)
What are some of the special characters in ERE?
{n,m} - the preceding pattern occurs at least n and at most m times
() - grouping of regular expressions
+ - one or more of the preceding character/expression
? - zero or one of the preceding character/expression
| - logical OR over the patterns
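A short sketch of the ERE operators listed above, run with grep -E. The word list is hypothetical and exists only to make the difference between the operators visible.

```shell
# Hypothetical word list for the demo.
tmp=$(mktemp -d)
printf '%s\n' 'color' 'colour' 'colouur' 'cat' 'dog' > "$tmp/words.txt"

# ? : zero or one 'u'  -> color, colour
grep -E '^colou?r$' "$tmp/words.txt"

# + : one or more 'u'  -> colour, colouur
grep -E '^colou+r$' "$tmp/words.txt"

# {n,m} : between 2 and 3 'u' -> colouur
grep -E '^colou{2,3}r$' "$tmp/words.txt"

# | with () : either alternative -> cat, dog
grep -E '^(cat|dog)$' "$tmp/words.txt"
```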
What are character classes?
[[:print:]] - Printable
[[:alnum:]] - Alphanumeric
[[:alpha:]] - Alphabetic
[[:lower:]] - Lower case
[[:upper:]] - Upper case
[[:digit:]] - Decimal digits
[[:blank:]] - Space / Tab
[[:space:]] - Whitespace
[[:punct:]] - Punctuation
[[:xdigit:]] - Hexadecimal digits
[[:graph:]] - Non-space printable characters
[[:cntrl:]] - Control characters
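The classes above are written inside a bracket expression, hence the double brackets. A minimal sketch follows; the file name chartypes.txt is borrowed from later cards, but its contents here are invented for the demo.

```shell
# Hypothetical contents: lower case, upper case, digits, mixed, blanks only.
tmp=$(mktemp -d)
printf '%s\n' 'abc' 'ABC' '123' 'a1!' '   ' > "$tmp/chartypes.txt"

# Count lines containing at least one decimal digit -> 2
grep -c '[[:digit:]]' "$tmp/chartypes.txt"

# Lines consisting only of upper-case letters -> ABC
grep '^[[:upper:]]*$' "$tmp/chartypes.txt"

# Lines containing a punctuation character -> a1!
grep '[[:punct:]]' "$tmp/chartypes.txt"

# Count lines made up of blanks only -> 1
grep -c '^[[:blank:]]*$' "$tmp/chartypes.txt"
```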
What are backreferences?
There are 9 backreferences:
\1 through \9
\n matches whatever was matched by the nth earlier parenthesized subexpression
A line with two occurrences of hello will be matched using
'\(hello\).*\1' in BRE, or '(hello).*\1' with grep -E
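To see the backreference at work, here is a small sketch on invented sample lines. It assumes GNU grep: backreferences are part of BRE, while their use with -E is an extension rather than something POSIX ERE defines.

```shell
# Hypothetical sample lines for the demo.
tmp=$(mktemp -d)
printf '%s\n' 'hello world' 'hello and hello again' 'say hello' > "$tmp/greet.txt"

# BRE: the grouping parentheses are backslash-escaped.
grep '\(hello\).*\1' "$tmp/greet.txt"

# ERE: plain parentheses (backreference support here is an extension).
grep -E '(hello).*\1' "$tmp/greet.txt"
```

Both commands print only the line that contains hello twice.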
What is the BRE / ERE operator precedence?
How to search for a string?
grep <pattern> <filename>
cat <filename> | grep <pattern>
How to use '.' in a grep command?
cat names.txt | grep 'S.n' - matches 'S', then any single character, then 'n' (for example San or Sun); '.' stands for exactly one character.
cat names.txt | grep '.am$' - matches any single character followed by 'am' at the end of the line.
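How to check for a string containing a literal '.'?
cat names.txt | grep '\.' - the backslash escapes the dot; an unescaped '.' matches any character, so grep '.' selects every non-empty line.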
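The difference between '.' as a wildcard and a literal dot is easy to miss, so here is a small sketch. The names are invented for the demo; -F (fixed-string matching) is shown as an alternative to escaping.

```shell
# Hypothetical names; the last line is empty on purpose.
tmp=$(mktemp -d)
printf '%s\n' 'Sam' 'S.N. Rao' 'Sun' '' > "$tmp/names.txt"

# Unescaped dot: any single character, so every non-empty line counts -> 3
grep -c '.' "$tmp/names.txt"

# Escaped dot: only lines containing a literal '.' -> S.N. Rao
grep '\.' "$tmp/names.txt"

# -F treats the whole pattern as a fixed string -> S.N. Rao
grep -F '.' "$tmp/names.txt"

# 'S.n': S, any one character, then n -> Sun
grep 'S.n' "$tmp/names.txt"
```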
How to use anchors in grep?
cat names.txt | grep '^M' - lines which begin with M.
cat names.txt | grep '^e' - does not return lines which start with a capital E. To ignore case, use
cat names.txt | grep -i '^e'
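A brief sketch of the anchors and the -i option on invented names (the file contents are hypothetical):

```shell
# Hypothetical names for the demo.
tmp=$(mktemp -d)
printf '%s\n' 'Mary' 'Emma' 'emil' 'Tom' > "$tmp/names.txt"

# Lines beginning with M -> Mary
grep '^M' "$tmp/names.txt"

# Case-sensitive: only lower-case e at line start -> emil
grep '^e' "$tmp/names.txt"

# -i ignores case -> Emma, emil
grep -i '^e' "$tmp/names.txt"

# $ anchors at the end of the line -> Tom
grep 'm$' "$tmp/names.txt"
```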
How to use word boundaries?
cat names.txt | grep 'am\b' - 'am' at the end of a word
cat names.txt | grep 'am$' - 'am' at the end of the line
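The two boundaries select different lines, as this sketch on invented names shows. It assumes GNU grep, where \b is available as an extension.

```shell
# Hypothetical names for the demo.
tmp=$(mktemp -d)
printf '%s\n' 'Sam Kumar' 'Ram' 'Amy Samson' > "$tmp/names.txt"

# 'am' at the end of any word -> Sam Kumar, Ram
# (the 'am' in Samson is followed by 's', so it is not at a word end)
grep 'am\b' "$tmp/names.txt"

# 'am' at the end of the line only -> Ram
grep 'am$' "$tmp/names.txt"
```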
How to use [] in grep?
[] matches any one of the enclosed characters.
cat names.txt | grep 'M[ME]' - matches MM or ME
cat names.txt | grep '[aeiou][aeiou]' - matches names which have two vowels side by side
cat names.txt | grep 'B90[1-4]' - matches B901 to B904
cat names.txt | grep 'B90[^5-7]' - matches B90 followed by any character other than 5, 6 or 7, so B905 to B907 are excluded. A caret (^) at the start of a bracket expression acts as negation.
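The bracket expressions above can be checked against a few invented entries. The file contents are hypothetical; note that a negated bracket still needs one character to match, so a bare B90 would not be selected by 'B90[^5-7]'.

```shell
# Hypothetical roll numbers and one name for the demo.
tmp=$(mktemp -d)
printf '%s\n' 'B901' 'B903' 'B905' 'B908' 'MM01' 'ME02' 'Sheela' > "$tmp/names.txt"

# Range inside brackets -> B901, B903
grep 'B90[1-4]' "$tmp/names.txt"

# Negated range -> B901, B903, B908
grep 'B90[^5-7]' "$tmp/names.txt"

# One of the listed characters after M -> MM01, ME02
grep 'M[ME]' "$tmp/names.txt"

# Two lower-case vowels side by side -> Sheela
grep '[aeiou][aeiou]' "$tmp/names.txt"
```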
How to check the frequency of occurrence?
cat names.txt | grep 'M\{2\}' - matches MM
'M\{1,2\}' - M once or twice
How to group patterns?
cat names.txt | grep '\(ma\)'
'\(ma\).*\1' - matches 'ma', followed by any number of characters, followed by another 'ma'
'\(a.\)\{3\}' - matches three consecutive repetitions of 'a' followed by any one character
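With plain grep (BRE), the interval braces and grouping parentheses are backslash-escaped, while grep -E takes them unescaped. A minimal sketch on invented names follows; -x, which requires the whole line to match, is an addition used here to make the {1,2} interval visible.

```shell
# Hypothetical names for the demo.
tmp=$(mktemp -d)
printf '%s\n' 'M' 'MM' 'mama' 'bananas' 'amar' > "$tmp/names.txt"

# Exactly two M in a row (BRE) -> MM
grep 'M\{2\}' "$tmp/names.txt"

# Whole line is one or two M (BRE, -x matches the full line) -> M, MM
grep -x 'M\{1,2\}' "$tmp/names.txt"

# Group plus backreference: 'ma' ... 'ma' (BRE) -> mama
grep '\(ma\).*\1' "$tmp/names.txt"

# Three consecutive 'a + any character' units (BRE) -> bananas
grep '\(a.\)\{3\}' "$tmp/names.txt"

# The same interval on a group in ERE syntax -> bananas
grep -E '(a.){3}' "$tmp/names.txt"
```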
How to use egrep?
cat names.txt | egrep 'M+' - lines containing one or more M
'^M+' - lines which start with the letter M
'[ED][ME]' - E or D followed by M or E (EM, EE, DM or DE); to match either ED or ME, use 'ED|ME'
How to filter the output of another command using egrep?
dpkg-query -W -f'${Section} ${Binary:Package}\n' | egrep '^math'
lists all packages whose section starts with math
How to use the character classes in practice?
cat chartypes.txt | grep '[[:alnum:]]'
How to exclude a particular output using grep?
cat chartypes.txt | grep -v '[[:cntrl:]]'
selects all lines which do not have a control character
How to select all the non-empty lines using egrep?
cat chartypes.txt | egrep -v '^$'
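The -v option inverts the selection, which the following sketch illustrates. The file contents are invented: one empty line and one line containing a tab, which counts as a control character.

```shell
# Hypothetical contents: a plain line, an empty line, a line with a tab, a plain line.
tmp=$(mktemp -d)
printf 'first\n\nhas\ttab\nthird\n' > "$tmp/chartypes.txt"

# Drop empty lines: three lines remain (first, the tab line, third).
grep -E -v '^$' "$tmp/chartypes.txt"

# Drop lines containing a control character: first, the empty line, third.
grep -v '[[:cntrl:]]' "$tmp/chartypes.txt"

# -c combined with -v counts the lines that do NOT match -> 3
grep -E -v -c '^$' "$tmp/chartypes.txt"
```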
How to select a pincode (which has exactly 6 digits) using the egrep command?
egrep '\b[[:digit:]]{6}\b' patterns.txt
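The word boundaries on both sides are what make the match "exactly 6 digits". A sketch on invented lines follows, assuming GNU grep for \b; the -o option, which prints only the matched text, is an addition for illustration.

```shell
# Hypothetical lines: a pincode, a 10-digit number, a 5-digit number, digits glued to letters.
tmp=$(mktemp -d)
printf '%s\n' 'PIN 600036 Chennai' 'phone 9876543210' 'code 12345' 'id600036x' > "$tmp/patterns.txt"

# Exactly six digits as a separate word -> PIN 600036 Chennai
grep -E '\b[[:digit:]]{6}\b' "$tmp/patterns.txt"

# Without the boundaries, any run of six digits qualifies -> 3 lines
grep -E -c '[[:digit:]]{6}' "$tmp/patterns.txt"

# -o prints only the matched part -> 600036
grep -E -o '\b[[:digit:]]{6}\b' "$tmp/patterns.txt"
```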
How to select email addresses using egrep?
egrep '\b[[:alnum:]]+@[[:alnum:]]+\b' patterns.txt - a simplified pattern: alphanumeric characters, '@', then alphanumeric characters
What does the cut command do?
It extracts selected sections (characters or fields) from each line.
cut -c 1-4 fields.txt
displays the first 4 characters of each line of the file fields.txt
cat fields.txt | cut -d " " -f 1
splits each line on spaces and displays the first field (column)
How is a composition of cut operations used?
cat fields.txt | cut -d ";" -f 1 | cut -d "," -f 1
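The composition works because each cut narrows the output of the previous one. A minimal sketch follows; the file name fields.txt comes from the cards, but the records in it are invented for the demo.

```shell
# Hypothetical records: "first,last;age;city".
tmp=$(mktemp -d)
printf '%s\n' 'Ram,Kumar;42;Chennai' 'Sita,Devi;37;Mumbai' > "$tmp/fields.txt"

# First four characters of each line -> "Ram," and "Sita"
cut -c 1-4 "$tmp/fields.txt"

# First ';'-separated field -> Ram,Kumar and Sita,Devi
cut -d ';' -f 1 "$tmp/fields.txt"

# Composition: first ';' field, then its first ',' field -> Ram and Sita
cut -d ';' -f 1 "$tmp/fields.txt" | cut -d ',' -f 1

# Several fields can be selected at once -> 42;Chennai and 37;Mumbai
cut -d ';' -f 2,3 "$tmp/fields.txt"
```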