Lecture Eleven
Regular Expressions (Part 2) - vi; sed; awk
vi
- vi searching and substitution use regular expressions, not just character strings
- examples, using the file cars
/000$ -
find the next line that has "000" at the end of the line
:s/chevy/GM / -
substitute "GM " for the first occurrence of "chevy" in the current line
:3,8 s/chevy/GM / -
substitute "GM " for the first occurrence of "chevy" on each of lines 3 to 8
:% s/chevy/GM / -
substitute "GM " for the first occurrence of "chevy" on every line
:% s/chevy/GM /i -
substitute "GM " for the first occurrence of "chevy" on every line, ignoring case (the i flag is a Vim extension; traditional vi may not accept it)
:3,8 s/[0-9]/*/ -
substitute an asterisk for the first digit on each of lines 3 to 8
:3,8 s/[0-9]/*/g -
substitute EVERY occurrence of a digit on lines 3 to 8 with an asterisk (g means global)
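vi is interactive, so the substitutions above are awkward to show in a script. As a rough sketch, the s command of sed (covered in the next section) takes the same pattern/replacement/flag form, so the difference between first-occurrence and global substitution can be tried from the shell. The three sample lines are made up for illustration; they are not the real cars file.

```shell
# Hypothetical sample data; the real cars file is not shown in these notes.
sample='chevy nova 79 60 3000
ford mustang 65 45 17000
chevy impala 65 85 1550'

# Like :% s/chevy/GM /   (first "chevy" on every line)
first=$(printf '%s\n' "$sample" | sed 's/chevy/GM /')

# Like :2,3 s/[0-9]/*/   (first digit only, on lines 2 to 3)
one_digit=$(printf '%s\n' "$sample" | sed '2,3 s/[0-9]/*/')

# Like :2,3 s/[0-9]/*/g  (every digit, on lines 2 to 3)
all_digits=$(printf '%s\n' "$sample" | sed '2,3 s/[0-9]/*/g')

printf '%s\n\n%s\n\n%s\n' "$first" "$one_digit" "$all_digits"
```

Line 1 is untouched by the two range commands, and the g flag is the only difference between the last two results.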
sed
- stream editor
sed 'address instruction' filename
- reads the input one line at a time and performs the instruction on each line that matches the address
- prints lines to standard output by default (suppressed by -n option)
- addresses
- can use a line number, to select a specific line (for example: 5)
- can specify a range of line numbers (for example: 5,7)
- can specify a regular expression to select all lines that match
- default address (if none is specified) will match every line
- instructions
p -
print line(s) that match the address (usually used with -n option)
d -
delete line(s) that match the address
q -
quit processing at the first line that matches the address
s -
substitute text to replace a matched regular expression, similar to vi substitution
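The address forms and instructions above can be sketched on a tiny input where the matching lines are easy to spot. This is only an illustration: the six numbered lines and the variable names are invented for the example.

```shell
# Six numbered lines make it easy to see which addresses matched.
input=$(printf 'line %s\n' 1 2 3 4 5 6)

# Line-number address with p; -n suppresses the default printing.
one=$(printf '%s\n' "$input" | sed -n '5 p')

# Range address with d: only lines outside 2 through 5 survive.
kept=$(printf '%s\n' "$input" | sed '2,5 d')

# Regular-expression address with q: stop after the first matching line.
upto=$(printf '%s\n' "$input" | sed '/3/ q')

# No address: the s instruction is applied to every line.
subst=$(printf '%s\n' "$input" | sed 's/line/row/')

printf '%s\n--\n%s\n--\n%s\n--\n%s\n' "$one" "$kept" "$upto" "$subst"
```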
- examples, using the file cars
sed '3,6 p' cars -
display lines 3 through 6 (these lines will be doubled, since all lines are printed by default)
sed -n '3,6 p' cars -
display only lines 3 through 6
sed '5 d' cars -
display all lines except the 5th
sed '5,8 d' cars -
display all lines except the 5th through 8th
sed '5 q' cars -
display first 5 lines then quit, same as head -5 cars
sed -n '/chevy/ p' cars -
display only lines matching regular expression, same as grep 'chevy' cars
sed '/chevy/ d' cars -
delete all matching lines, same as grep -v 'chevy' cars
sed '/chevy/ q' cars -
display up to and including the first line matching the regular expression, then quit
sed 's/[0-9]/*/' cars -
substitute first occurrence of a digit on each line with an asterisk
sed 's/[0-9]/*/g' cars -
substitute every occurrence of a digit on each line with an asterisk
sed '5,8 s/[0-9]/*/' cars -
substitute only on lines 5 to 8
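The equivalences claimed in the examples above (sed versus head, grep and grep -v) can be checked mechanically. The sketch below assumes an eight-line stand-in for cars, since the real file is not reproduced in these notes. It uses head -n 5, the current spelling of head -5.

```shell
# Hypothetical stand-in for the cars file (the real contents are not shown here).
cars=$(mktemp)
cat > "$cars" <<'EOF'
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
volvo gl 78 102 9850
ford ltd 83 15 10500
chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
EOF

# Without -n, lines 3 to 6 appear twice: 8 lines plus 4 extra copies.
doubled=$(sed '3,6 p' "$cars" | wc -l | tr -d ' ')
only=$(sed -n '3,6 p' "$cars" | wc -l | tr -d ' ')

# Each pair should give identical output; "same" becomes no otherwise.
same=yes
[ "$(sed '5 q' "$cars")" = "$(head -n 5 "$cars")" ] || same=no
[ "$(sed -n '/chevy/ p' "$cars")" = "$(grep 'chevy' "$cars")" ] || same=no
[ "$(sed '/chevy/ d' "$cars")" = "$(grep -v 'chevy' "$cars")" ] || same=no

echo "doubled=$doubled only=$only same=$same"
rm -f "$cars"
```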
awk
- pattern matching and processing
awk 'pattern {action}' filename
- reads the input one line at a time and performs the action on each line that matches the pattern
- pattern
NR
is a special awk variable meaning the line number of the current record
- can use a line number, to select a specific line, by comparing it to NR (for example: NR == 2)
- can specify a range of line numbers (for example: NR == 2, NR == 4)
- can specify a regular expression, to select all lines that match
$n
are special awk variables, meaning the value of the nth field (field delimiter is space or tab)
- can use field values, by comparing to $n (for example: $3 == 65)
- every line is selected if no pattern is specified
- actions (instructions)
- print - print line(s) that match the pattern, or print fields within matching lines
- print is default if no action is specified
- there are many, many instructions, including just about all C statements with similar syntax
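A minimal sketch of the three ways to combine pattern and action: pattern only, action only, and both, with a C-like if/else and printf in the last one. The four sample records and the "pricey"/"cheap" labels are invented for the illustration.

```shell
# Hypothetical sample records; the real cars file is not shown in these notes.
sample='plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
fiat 600 65 115 450'

# Pattern only: the default action prints the whole line.
second=$(printf '%s\n' "$sample" | awk 'NR == 2')

# Action only: runs for every line; NR and $n can be used inside it.
numbered=$(printf '%s\n' "$sample" | awk '{print NR, $1}')

# Pattern plus an action written with C-like statements.
labels=$(printf '%s\n' "$sample" | awk '$3 == 65 {
    if ($5 > 1000) {
        printf "%s is pricey\n", $1
    } else {
        printf "%s is cheap\n", $1
    }
}')

printf '%s\n--\n%s\n--\n%s\n' "$second" "$numbered" "$labels"
```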
- examples, using the file cars
awk 'NR == 2, NR == 4' cars -
display the 2nd through 4th lines (default action is to print entire line)
awk '/chevy/' cars -
display only lines matching regular expression, same as grep 'chevy' cars
awk '{print $3, $1}' cars -
display the third and first fields of every line; the comma in print inserts the output field separator (variable OFS, default is space)
awk -F':' '{print $6}' /etc/passwd -
display the sixth field (the home directory) of every line, using : as the input field separator (default is space or tab)
awk '/chevy/ {print $3, $1}' cars -
display the third and first fields of lines matching the regular expression
awk '$3 == 65' cars -
display only lines with a third field value of 65
awk '$5 <= 3000' cars -
display only lines with a fifth field value that is less than or equal to 3000
awk '$2 ~ /[0-9]/' cars -
display only lines whose second field contains a digit (the regular expression is matched against that field alone)
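The separator and field-matching examples above can be sketched without depending on the contents of cars or /etc/passwd. The input lines below, including the passwd-style entry for a user named alice, are made up for the illustration. Setting OFS with -v is one common way to change the output separator.

```shell
# -F sets the input field separator; this invented line mimics the /etc/passwd layout.
entry='alice:x:1001:1001:Alice Example:/home/alice:/bin/sh'
home=$(printf '%s\n' "$entry" | awk -F':' '{print $6}')

# The comma in print inserts OFS; without it the fields are simply concatenated.
with_ofs=$(echo 'chevy nova 79' | awk '{print $3, $1}')
no_ofs=$(echo 'chevy nova 79' | awk '{print $3 $1}')
custom=$(echo 'chevy nova 79' | awk -v OFS='-' '{print $3, $1}')

# $2 ~ /[0-9]/ tests only the second field; /[0-9]/ alone tests the whole line.
field_only=$(printf 'fiat 600 65\nford ltd 83\n' | awk '$2 ~ /[0-9]/')
whole_line=$(printf 'fiat 600 65\nford ltd 83\n' | awk '/[0-9]/')

printf '%s\n%s\n%s\n%s\n--\n%s\n--\n%s\n' \
    "$home" "$with_ofs" "$no_ofs" "$custom" "$field_only" "$whole_line"
```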