Lecture Eleven

Regular Expressions (Part 2) - vi; sed; awk

vi

  • vi searching and substitution use regular expressions, not just character strings
  • examples, using the file cars
    • /000$ - find the next line that has "000" at the end of the line
    • :s/chevy/GM / - substitute "GM " for the first occurrence of "chevy" in the current line
    • :3,8 s/chevy/GM / - substitute "GM " for the first occurrence of "chevy" in lines 3 to 8
    • :% s/chevy/GM / - substitute "GM " for the first occurrence of "chevy" in every line
    • :% s/chevy/GM /i - substitute "GM " for the first occurrence of "chevy" in every line, ignoring case (the i flag is supported by Vim; traditional vi may not accept it)
    • :3,8 s/[0-9]/*/ - substitute first occurrence of a digit on lines 3 to 8 with an asterisk
    • :3,8 s/[0-9]/*/g - substitute EVERY occurrence of a digit on lines 3 to 8 with an asterisk (g means global)

sed

  • stream editor
    • sed 'address instruction' filename
    • checks for address match, one line at a time, and performs instruction if address matched
    • prints every line to standard output by default (suppressed by the -n option)
  • addresses
    • can use a line number, to select a specific line (for example: 5)
    • can specify a range of line numbers (for example: 5,7)
    • can specify a regular expression to select all lines that match
    • default address (if none is specified) will match every line
  • instructions
    • p - print line(s) that match the address (usually used with -n option)
    • d - delete line(s) that match the address
    • q - quit processing at the first line that matches the address
    • s - substitute text to replace a matched regular expression, similar to vi substitution
  • examples, using the file cars
    • sed '3,6 p' cars - display all lines, with lines 3 through 6 doubled (every line is printed by default, and p prints the addressed lines again)
    • sed -n '3,6 p' cars - display only lines 3 through 6
    • sed '5 d' cars - display all lines except the 5th
    • sed '5,8 d' cars - display all lines except the 5th through 8th
    • sed '5 q' cars - display first 5 lines then quit, same as head -5 cars
    • sed -n '/chevy/ p' cars - display only lines matching regular expression, same as grep 'chevy' cars
    • sed '/chevy/ d' cars - delete all matching lines, same as grep -v 'chevy' cars
    • sed '/chevy/ q' cars - display lines up to and including the first line matching the regular expression, then quit
    • sed 's/[0-9]/*/' cars - substitute first occurrence of a digit on each line with an asterisk
    • sed 's/[0-9]/*/g' cars - substitute every occurrence of a digit on each line with an asterisk
    • sed '5,8 s/[0-9]/*/' cars - substitute only on lines 5 to 8
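The sed examples above can be tried out with a short script. This is only a sketch: the notes do not show the contents of cars, so the six-line file created here is hypothetical sample data (make, model, year, mileage, price), and the temporary directory is just a convenient place to put it.

```shell
# Hypothetical stand-in for the "cars" file (assumed layout:
# make, model, year, mileage, price, separated by spaces).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > cars <<'EOF'
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
volvo gl 78 102 9850
ford ltd 83 15 10500
fiat 600 65 115 450
EOF

# Line-number addresses: print only lines 2 and 3 (-n suppresses
# the default printing of every line).
sed -n '2,3 p' cars

# Regular-expression address with d: everything except "chevy"
# lines, the same lines grep -v 'chevy' cars would show.
sed '/chevy/ d' cars

# q stops after the addressed line: line 3 is the first "ford"
# line, so lines 1 through 3 are displayed.
sed '/ford/ q' cars

# Substitution without g changes only the first digit on each line;
# line 1 becomes "plym fury *7 73 2500".
sed 's/[0-9]/*/' cars

# With g every digit is replaced; line 1 becomes
# "plym fury ** ** ****".
sed 's/[0-9]/*/g' cars
```

None of these commands modify cars itself; sed writes the edited text to standard output and leaves the file unchanged.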

awk

  • pattern matching and processing
    • awk 'pattern {action}' filename
    • checks for pattern match, one line at a time, and performs action if pattern matched
  • pattern
    • NR is a special awk variable meaning the line number of the current record
    • can use a line number, to select a specific line, by comparing it to NR (for example: NR == 2)
    • can specify a range of line numbers (for example: NR == 2, NR == 4)
    • can specify a regular expression, to select all lines that match
    • $n are special awk variables, meaning the value of the nth field (field delimiter is space or tab)
    • can use field values, by comparing to $n (for example: $3 == 65)
    • every line is selected if no pattern is specified
  • instructions
    • print - print line(s) that match the pattern, or print fields within matching lines
    • print is default if no action is specified
    • there are many other instructions, including most C-style statements (if, while, for) with similar syntax
  • examples, using the file cars
    • awk 'NR == 2, NR == 4' cars - display the 2nd through 4th lines (default action is to print entire line)
    • awk '/chevy/' cars - display only lines matching regular expression, same as grep 'chevy' cars
    • awk '{print $3, $1}' cars - display the third and first fields of every line; the comma inserts the output field separator (variable OFS, default is space)
    • awk -F':' '{print $6}' /etc/passwd - specifies that : is input field separator, default is space or tab
    • awk '/chevy/ {print $3, $1}' cars - display third and first field of lines matching regular expression
    • awk '$3 == 65' cars - display only lines with a third field value of 65
    • awk '$5 <= 3000' cars - display only lines with a fifth field value that is less than or equal to 3000
    • awk '$2 ~ /[0-9]/' cars - display only lines whose second field contains a digit (the regular expression is matched against the second field only)
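The awk patterns and actions above can be exercised the same way. Again this is a sketch under assumptions: the cars file below is hypothetical sample data (make, model, year, mileage, price), and the passwd-style line fed to the -F example is made up so the script does not depend on the local /etc/passwd.

```shell
# Hypothetical stand-in for the "cars" file (assumed layout:
# make, model, year, mileage, price, separated by spaces).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > cars <<'EOF'
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
volvo gl 78 102 9850
ford ltd 83 15 10500
fiat 600 65 115 450
EOF

# Range pattern on NR with no action: prints lines 2 through 4 whole.
awk 'NR == 2, NR == 4' cars

# Regular-expression pattern plus an action: prints "79 chevy".
awk '/chevy/ {print $3, $1}' cars

# Field comparisons: year 65 matches the ford mustang and fiat lines;
# price <= 3000 matches the plym, chevy and fiat lines.
awk '$3 == 65' cars
awk '$5 <= 3000' cars

# Match the regular expression against field 2 only: just the fiat
# line, whose model is "600".
awk '$2 ~ /[0-9]/' cars

# Changing OFS changes what the comma in print inserts;
# line 1 is displayed as "77:plym".
awk 'BEGIN {OFS = ":"} {print $3, $1}' cars

# -F sets the input field separator. The input line is a made-up
# passwd-style entry; field 6 is the home directory.
printf 'alice:x:1001:1001:Alice:/home/alice:/bin/sh\n' |
    awk -F':' '{print $6}'
```

Note that digits elsewhere on a line do not matter to the $2 ~ /[0-9]/ pattern: every line here contains digits in fields 3 to 5, yet only the line with a digit in its second field is selected.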