Lecture Nine
Regular Expressions
- many Unix utilities use regular expressions: grep, sed, awk, vi, perl, Tcl
- shell filename matches are not regular expressions (eg. *.c)
- examples in this lecture will use the grep utility and the file cars
- regular expressions are used to search for or match text:
- literal text can be used to search for that text
- . matches any single character (similar to ? wildcard)
grep ".c" cars
grep "5..." cars
[ ] matches any single character within the square brackets (similar to [ ] wildcard)
grep "[cC]hevy" cars
grep "[0-9][0-9][0-9][0-9][0-9]" cars
[^ ] matches any single character not within the square brackets (similar to [! ] wildcard)
^ matches beginning of line
$ matches end of line
grep " [0-9][0-9][0-9]$" cars
* following any character denotes zero or more occurrences of that character
grep "ford.*83" cars
grep "^ 65" cars
\ inhibits meaning of special characters
grep ' [0-9][0-9][0-9]\$' cars
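The metacharacters above can be tried against a small data file. The lecture's cars file is not reproduced in these notes, so the sketch below writes a hypothetical stand-in (columns assumed to be make, model, year, mileage, price, separated by single spaces); the counts in the comments apply only to this sample.

```shell
# Hypothetical stand-in for the lecture's cars file.
# Written to ./cars, so run this in a scratch directory.
cat > cars <<'EOF'
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
volvo gl 78 102 9850
ford ltd 83 15 10500
Chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
ford thundbd 84 10 17000
toyota tercel 82 180 750
chevy impala 65 85 1550
ford bronco 83 25 9525
EOF

grep -c '.c' cars                   # any character followed by c: 3 lines
grep '[cC]hevy' cars                # chevy or Chevy: 3 lines
grep -c '^[^f]' cars                # first character is not f: 7 lines
grep ' [0-9][0-9][0-9]$' cars       # ends in a space and three digits: fiat, toyota
grep 'ford.*83' cars                # ford, anything, then 83: ltd and bronco
printf '450$\n450\n' | grep '450\$' # \$ is a literal dollar sign: prints 450$
```

The patterns are single-quoted so the shell passes characters such as $ and * to grep unchanged.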
- extended regular expressions are not recognized by plain grep; use egrep or grep -E instead:
( reg-exp ) parentheses used for grouping
egrep "^( + +){2}65" cars
egrep "ford" cars
| means OR, matches reg-exp on either side of the vertical bar
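A minimal sketch of grouping and alternation with grep -E follows. It again uses a hypothetical single-space cars file (the lecture's own file is not shown), and the {2} repeat pattern is adapted to that layout rather than copied from the lecture.

```shell
# Hypothetical stand-in for the lecture's cars file.
# Written to ./cars, so run this in a scratch directory.
cat > cars <<'EOF'
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
volvo gl 78 102 9850
ford ltd 83 15 10500
Chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
ford thundbd 84 10 17000
toyota tercel 82 180 750
chevy impala 65 85 1550
ford bronco 83 25 9525
EOF

grep -Ec '^(ford|fiat) ' cars      # make is ford or fiat: 5 lines
grep -E '(chevy|Chevy) nova' cars  # either spelling followed by nova: 2 lines
grep -E '^([a-z]+ ){2}65 ' cars    # two lowercase words, then year 65: mustang, impala
grep -c 'ford|fiat' cars || true   # plain grep takes | literally: prints 0
```

The last command ends in || true only because grep returns a non-zero exit status when nothing matches.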
- regular expression characters may or may not need to be escaped - varies from program to program
- eg. egrep and awk use ( and ) for grouping, sed uses \( and \) unless -r option is used
- regular expressions may or may not need delimiters - varies from program to program
- eg. grep and egrep don't use delimiters, sed and awk use delimiters, eg. /string/
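The differences in escaping and delimiters can be seen by running the same idea through each program. This is a sketch on a made-up input line held in the hypothetical variable sample. It uses sed -E, which GNU and BSD sed both accept for extended syntax; -r is the older GNU spelling mentioned above.

```shell
# Made-up input line, not taken from the lecture's cars file.
sample='ford ltd 83'

# sed, basic syntax: grouping needs \( and \); the pattern sits between / delimiters
printf '%s\n' "$sample" | sed 's/\(ford\) \(ltd\)/\2 \1/'     # prints: ltd ford 83

# sed, extended syntax: plain ( and ) group
printf '%s\n' "$sample" | sed -E 's/(ford) (ltd)/\2 \1/'      # prints: ltd ford 83

# awk: extended syntax, pattern between / delimiters
printf '%s\n' "$sample" | awk '/^(ford|volvo) / { print $2 }' # prints: ltd

# grep -E: same pattern, no delimiters
printf '%s\n' "$sample" | grep -E '^(ford|volvo) '            # prints: ford ltd 83
```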
- other examples of regular expressions
(Mr|Mrs) Smith - match either "Mr Smith" or "Mrs Smith"
[a-zA-Z]+ - match one or more letters
^[a-zA-Z]*$ - match lines with only letters
[^0-9]+ - match a string of one or more non-digit characters
[+-]?([0-9]+[.]?[0-9]*|[.][0-9]+)([eE][+-]?[0-9]+)? - match valid "C" programming numbers
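These patterns can be checked with grep -E on a few made-up test strings. In the sketch below the variable name num and all input strings are assumptions for illustration. The -x option makes grep accept a line only when the whole line matches, which is how the number pattern would normally be used for validation.

```shell
# (Mr|Mrs) Smith: "Ms Smith" is rejected
printf 'Mr Smith\nMrs Smith\nMs Smith\n' | grep -Ec '(Mr|Mrs) Smith'   # prints 2

# ^[a-zA-Z]*$ also matches an empty line, because * allows zero letters
printf 'abc\nabc1\n\n' | grep -Ec '^[a-zA-Z]*$'                        # prints 2

# the "C" number pattern from the notes, stored in a hypothetical variable
num='[+-]?([0-9]+[.]?[0-9]*|[.][0-9]+)([eE][+-]?[0-9]+)?'

# prints 42, -3.14, .5e-3 and 6.02E23; "1e" and "abc" are rejected
printf '%s\n' 42 -3.14 .5e-3 6.02E23 1e abc | grep -Ex -e "$num"
```

Without -x or ^ and $ anchors, the number pattern would also accept a line such as "1e", since the leading 1 matches on its own.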
grep
- uses regular expression for pattern, eg. grep 'reg-exp' filename, then prints matched lines
- gives 0 exit status if pattern matched, 1 if no line matched, greater than 1 on error
- options:
-c - counts matched lines instead of printing them
-i - ignores case
-n - precedes each line with a line number
-v - reverses sense of test, eg. finds lines not matching pattern
- examples, using the file cars
grep 'chevy' cars - display only lines containing the string "chevy"
grep -c 'chevy' cars - display count of lines containing the string "chevy"
grep -i 'chevy' cars - display only lines containing the string "chevy", ignoring case
grep -ic 'chevy' cars - display count of lines containing the string "chevy", ignoring case
grep -v 'chevy' cars - display only lines not containing the string "chevy"
grep -ivc 'chevy' cars - display count of lines not containing the string "chevy", ignoring case
grep -n 'chevy' cars - display only lines containing the string "chevy", with line numbers
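The options and the exit status can be tried together. The sketch below uses a hypothetical cars file, since the lecture's own file is not shown, so the counts in the comments hold only for this sample.

```shell
# Hypothetical stand-in for the lecture's cars file.
# Written to ./cars, so run this in a scratch directory.
cat > cars <<'EOF'
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
volvo gl 78 102 9850
ford ltd 83 15 10500
Chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
ford thundbd 84 10 17000
toyota tercel 82 180 750
chevy impala 65 85 1550
ford bronco 83 25 9525
EOF

grep -c 'chevy' cars     # prints 2 (the line with "Chevy" is not counted)
grep -ic 'chevy' cars    # prints 3
grep -ivc 'chevy' cars   # prints 9 (12 lines minus the 3 matches)
grep -n 'chevy' cars     # prints lines 2 and 11, each prefixed by its number and a colon

# the exit status lets grep act as a test in scripts
if grep 'volvo' cars > /dev/null; then echo "volvo is listed"; fi
grep 'saab' cars > /dev/null || echo "no saab (exit status $?)"
```

Output is sent to /dev/null in the last two commands because only the exit status is of interest there.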