TextProcessing Flashcards

1
Q

How can you pass code as an argument to the ruby interpreter?

A

Use the -e flag when invoking Ruby:

ruby -e 'puts "Hello world"'

2
Q

Implement cat

A

ruby -ne 'puts $_' file.txt

The -n switch acts as though the code you pass to Ruby was wrapped in the following:

while gets
  # code here
end

In short, this means that the code you pass in the -e argument is executed once for each line in your input. So, imagining that you had a file called file.txt, with the following content:

foo
bar
baz

Then invoking Ruby like so:

$ ruby -ne 'puts $_' file.txt

Will output:

foo
bar
baz

Congratulations! You’ve just implemented cat in Ruby.

3
Q

The -n switch

A

The -n switch acts as though the code you pass to Ruby was wrapped in the following:

while gets
  # code here
end

In short, this means that the code you pass in the -e argument is executed once for each line in your input.
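As a rough illustration of that wrapping, the sketch below spells the loop out in plain Ruby. The helper name each_line_like_n and the StringIO sample input are assumptions made for this example; the real switch reads from the files named on the command line (or standard input) and sets $_ for you.

```ruby
require "stringio"

# Hypothetical helper: run the given block once per input line,
# roughly what `ruby -n` does with the code passed via -e.
def each_line_like_n(io)
  while (line = io.gets)   # gets returns nil at end of input, which ends the loop
    yield line             # stands in for "# code here"
  end
end

input = StringIO.new("foo\nbar\nbaz\n")
seen = []
each_line_like_n(input) { |line| seen << line }
puts seen
```

Each line arrives with its trailing newline still attached, just as $_ does under -n.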

4
Q

$_: what is it?

A

Throughout these examples, you’ll perhaps have noticed the use of the special global variable $_. When you invoke Ruby this way, it sets $_ to the current line that’s being processed; so if you wanted to do something like only print lines that start with “f”, that would be very easy:

ruby -ne 'puts $_ if $_ =~ /^f/' file.txt
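For comparison, here is a minimal sketch of the same filter as an ordinary script, with an explicit block variable taking the place of $_. The helper name lines_starting_with_f and the sample input are assumptions for illustration only.

```ruby
require "stringio"

# Hypothetical helper: return the lines of io that start with "f",
# mirroring `puts $_ if $_ =~ /^f/` without relying on $_.
def lines_starting_with_f(io)
  io.each_line.select { |line| line =~ /^f/ }
end

input = StringIO.new("foo\nbar\nbaz\nfig\n")
matches = lines_starting_with_f(input)
puts matches
```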

5
Q

Print all the lines in a file which begin with the letter f?

A

ruby -ne 'puts $_ if $_ =~ /^f/' file.txt

Explain this one-liner.

6
Q

The -p switch

A

The -p switch acts similarly to -n, in that it loops over each of the lines in the input. However, it goes a bit further: after your code has finished, it always prints the value of $_. So, you can imagine it as:

while gets
  # code here
  print $_
end

It’s really useful, then, for doing transformations on the input. If you wanted to take every line you were given, but replace every instance of the letter e you found with the letter a, you could do:

echo "eats, shoots, and leaves" | ruby -pe '$_.gsub!("e", "a")'

aats, shoots, and laavas

Here, we modify the value of $_, and this modified value is what’s printed to the screen.
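The same transform-then-print loop can be sketched as a plain script. The helper name each_line_like_p and the StringIO objects standing in for input and output are assumptions made for this example, not part of Ruby itself.

```ruby
require "stringio"

# Hypothetical helper: transform every line with the block and write
# the result to out, roughly what `ruby -p` does with $_.
def each_line_like_p(io, out)
  while (line = io.gets)
    line = yield(line)   # the -e code may change the line
    out.print line       # -p prints the (possibly modified) line
  end
end

input  = StringIO.new("eats, shoots, and leaves\n")
output = StringIO.new
each_line_like_p(input, output) { |line| line.gsub("e", "a") }
puts output.string
```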

7
Q

BEGIN block

A

Of course, our code here runs in a loop; what if we wanted to run something just once, before our loop starts? We might want to initialise a variable, for example.

In Ruby, we can use BEGIN blocks to do this. They allow us to execute code just once, at the start of the program.

So, to output line numbers from your input, you could do:

printf "foo\nbar\nbaz\n" | ruby -ne 'BEGIN { i = 1 }; puts "#{i} #{$_}"; i += 1'

Here, we initialise i to 1 at the start of the script. The BEGIN block executes only once, so it is ignored on subsequent loops; we can then increment i on each line, producing the following output:

1 foo
2 bar
3 baz
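Written out as a regular script, the one-time setup simply moves above the loop. This is a minimal sketch; the helper name number_lines and the sample input are assumptions for illustration.

```ruby
require "stringio"

# Hypothetical helper: prefix each line with a running counter.
def number_lines(io)
  i = 1                          # plays the role of BEGIN { i = 1 }
  numbered = []
  while (line = io.gets)         # the loop that -n would supply
    numbered << "#{i} #{line}"
    i += 1
  end
  numbered
end

numbered = number_lines(StringIO.new("foo\nbar\nbaz\n"))
puts numbered
```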

8
Q

Double-space only non-blank lines.

A

ruby -ne 'print; puts unless ~/^$/'

Explain your code.

9
Q

Precede each line by its file-specific line number (left-aligned)

A

ruby -pe 'print $<.file.lineno, "\t"'

How is the -p switch helping?

10
Q

Count the number of lines in a file

A

ruby -ne 'END{printf "%8d %s\n", $., $FILENAME}'

How can we solve this problem using a built-in Unix utility?

11
Q

Print the sums of the fields of every line (expects fields to be integers).

A

ruby -ane 'puts $F.reduce(0){|sum,x| sum+x.to_i}'
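The -a switch used here turns on autosplit mode: each line is split on whitespace into the array $F before your code runs. A minimal sketch of the same sum as a plain script follows; the helper name field_sums and the sample input are assumptions for illustration.

```ruby
require "stringio"

# Hypothetical helper: sum the whitespace-separated integer fields
# of every line, splitting each line the way -a fills $F.
def field_sums(io)
  io.each_line.map do |line|
    fields = line.split                        # what -a would store in $F
    fields.reduce(0) { |sum, x| sum + x.to_i } # same reduce as the one-liner
  end
end

sums = field_sums(StringIO.new("1 2 3\n10 20\n\n"))
puts sums
```

An empty line has no fields, so its sum is the initial value 0.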

12
Q

Print the number of fields on each line, followed by the line.

A

ruby -ane 'BEGIN{$,="\t"}; print $F.size, $_'

ruby -ane 'printf "%3d %s", $F.size, $_'

Explain what is going on in the code above.

13
Q

Print the last field of the last line.

A

ruby -ane 'END{puts $F.last}'

What will happen if you run this code on a CSV file?

14
Q

Print every line with more than 4 fields.

A

ruby -ane 'print if $F.size > 4'

15
Q

Print every line for which the value of the last field is > 4.

A

ruby -ane 'print if $F.last.to_i > 4'

Explain every word of your code.

16
Q

Insert 49 spaces after column 6 of each input line.

A

ruby -pe 'sub(/^.{6}/, "\\&" + " "*49)'

Explain your code.

17
Q

Delete BOTH leading and trailing whitespace from each line

A

ruby -pe '$_ = $_.gsub(/^\s+/, "").gsub(/\s+$/, $/)' < file.txt

Explain how this works.

18
Q

Substitute “foo” with “bar” ONLY for lines which contain “baz”

A

ruby -pe 'gsub(/foo/, "bar") if $_ =~ /baz/'

How does this work?

19
Q

join pairs of lines side-by-side (like ‘paste’)

A

ruby -pe '$_ = $_.chomp + " " + gets if $. % 2' < file.txt

How does this work? What is the mysterious reference to ‘paste’?
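As a point of comparison, pairing lines can also be sketched with each_slice, which is similar in spirit to what the Unix paste utility does when it joins lines. The helper name paste_pairs and the sample lines are assumptions for this example; unlike the one-liner, it copes with an odd number of lines.

```ruby
# Hypothetical helper: join consecutive pairs of lines with a space.
def paste_pairs(lines)
  lines.each_slice(2).map do |pair|
    pair.map(&:chomp).join(" ")   # a trailing unpaired line is kept as-is
  end
end

pairs = paste_pairs(["a\n", "b\n", "c\n", "d\n", "e\n"])
puts pairs
```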

20
Q

Add a blank line every 5 lines (after lines 5, 10, 15, etc)

A

ruby -ne 'print; puts if $. % 5 == 0' < file.txt

How does this work?
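A minimal sketch of the same idea as a plain script, with an explicit counter in place of $. and the blank line emitted after every fifth line. The helper name blank_every_five and the generated sample input are assumptions for illustration.

```ruby
require "stringio"

# Hypothetical helper: copy the input, adding a blank line after
# lines 5, 10, 15, and so on.
def blank_every_five(io)
  out = []
  io.each_line.with_index(1) do |line, lineno|   # lineno plays the role of $.
    out << line
    out << "\n" if (lineno % 5).zero?
  end
  out.join
end

sample = (1..11).map { |n| "line #{n}\n" }.join
result = blank_every_five(StringIO.new(sample))
puts result
```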

21
Q

Blank line manipulation

  1. Print file except for blank lines
  2. Delete all consecutive blank lines from a file except the first
  3. Delete all consecutive blank lines from a file except for the first 2
  4. Delete all leading blank lines at top of file
A
  1. Print file except for blank lines

ruby -pe 'next if $_ =~ /^\s*$/' < file.txt

  2. Delete all consecutive blank lines from a file except the first

ruby -e 'puts STDIN.read.gsub(/\n(\n)+/, "\n\n")' < file.txt

  3. Delete all consecutive blank lines from a file except for the first 2

ruby -e 'puts STDIN.read.gsub(/\n{4,}/, "\n\n\n")' < file.txt

  4. Delete all leading blank lines at top of file

ruby -pe '@lineFound = true if $_ !~ /^\s*$/; next if !@lineFound' < file.txt
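The blank-line tasks above can also be sketched as small string functions, which makes the newline counting easier to see. All three helper names are hypothetical, and the sketch assumes blank lines are truly empty when squeezing (a run of N blank lines in the middle of a file is N + 1 consecutive newlines).

```ruby
# Hypothetical helper: drop every blank (or whitespace-only) line.
def drop_blank_lines(text)
  text.each_line.reject { |line| line.strip.empty? }.join
end

# Hypothetical helper: keep at most `keep` blank lines in any run.
def squeeze_blank_lines(text, keep = 1)
  text.gsub(/\n{#{keep + 2},}/, "\n" * (keep + 1))
end

# Hypothetical helper: remove blank lines before the first real line.
def drop_leading_blank_lines(text)
  text.sub(/\A(?:[ \t]*\n)+/, "")
end

puts squeeze_blank_lines("a\n\n\n\nb\n")
```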