Lecture Nine
Regular Expressions
Regular Expressions
- many Unix utilities use regular expressions: grep, sed, awk, vi, perl, Tcl
- shell filename wildcards (eg. *.c) are not regular expressions
- examples in this lecture will use the grep utility and the file cars
- regular expressions are used to search for or match text:
- literal text can be used to search for that text
- . matches any single character (similar to ? wildcard)
grep ".c" cars
grep "5..." cars
- [ ] matches any single character within the square brackets (similar to [ ] wildcard)
grep "[cC]hevy" cars
grep "[0-9][0-9][0-9][0-9][0-9]" cars
- [^ ] matches any single character not within the square brackets (similar to [! ] wildcard)
- ^ matches beginning of line
- $ matches end of line
grep " [0-9][0-9][0-9]$" cars
- * following any character denotes zero or more occurrences of that character
grep "ford.*83" cars
grep "^ 65" cars
- \ inhibits the special meaning of the character that follows it
grep ' [0-9][0-9][0-9]\$' cars
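The basic metacharacters above can be tried out in a short shell session. This is only a sketch: the eight-line cars file it creates is hypothetical sample data invented for illustration (the lecture's real cars file is not reproduced in these notes), and the variable names are assumptions as well. Each variable holds the number of matching lines as reported by grep -c, an option described later in this lecture.

```shell
# Work in a scratch directory so no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: make, model, year, mileage, price.
cat > cars <<'EOF'
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
volvo gl 78 102 9850
ford ltd 83 15 10500
Chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
EOF

any_then_c=$(grep -c ".c" cars)                           # any character followed by c
either_case=$(grep -c "[cC]hevy" cars)                    # chevy or Chevy
five_digits=$(grep -c "[0-9][0-9][0-9][0-9][0-9]" cars)   # five digits in a row
starts_ford=$(grep -c "^ford" cars)                       # ford at beginning of line
three_digit_price=$(grep -c ' [0-9][0-9][0-9]$' cars)     # space and three digits at end of line
ford_83=$(grep -c "ford.*83" cars)                        # ford, then anything, then 83
not_f=$(grep -c "^[^f]" cars)                             # first character is not f

# A backslash makes . literal, so only the line "a.c" matches the second pattern.
dot_any=$(printf 'a.c\nabc\n' | grep -c 'a.c')
dot_literal=$(printf 'a.c\nabc\n' | grep -c 'a\.c')

echo "$any_then_c $either_case $five_digits $starts_ford $three_digit_price $ford_83 $not_f $dot_any $dot_literal"
```

With this sample data the script prints 1 2 2 2 1 1 5 2 1. Only the "accord" line has a character followed by a lowercase c, because grep is case sensitive unless told otherwise.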
- extended regular expressions are not recognized by grep; use egrep or grep -E:
- ( reg-exp ) parentheses are used for grouping
egrep "^( + +){2}65" cars
egrep "ford" cars
- | means OR, matches reg-exp on either side of the vertical bar
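Grouping and alternation can be sketched the same way. The cars file below is hypothetical sample data, not the lecture's file, and the patterns and variable names are illustrative assumptions. The script uses grep -E throughout, and the last command shows what plain grep does with the same pattern.

```shell
# Work in a scratch directory so no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: make, model, year, mileage, price.
cat > cars <<'EOF'
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
volvo gl 78 102 9850
ford ltd 83 15 10500
Chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
EOF

# | matches the reg-exp on either side of the bar.
ford_or_chevy=$(grep -E -c "ford|chevy" cars)

# Parentheses limit the alternation to the make at the start of the line.
ford_or_honda=$(grep -E -c "^(ford|honda) " cars)

# A group can be repeated: two fields, each followed by a space, then 65.
year_65=$(grep -E -c "^([a-z0-9]+ ){2}65" cars)

# Plain grep treats | as an ordinary character, so no line matches.
# "|| true" hides grep's non-zero exit status for "no match".
plain=$(grep -c "ford|chevy" cars || true)

echo "$ford_or_chevy $ford_or_honda $year_65 $plain"
```

With this sample data the script prints 3 3 2 0.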
- regular expression characters may or may not need to be escaped - varies from program to program
- eg. egrep and awk use ( and ) for grouping, sed uses \( and \) unless the -r option is used
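The difference in escaping can be seen by running the same grouping through several tools. This is a sketch with made-up input strings and variable names. It assumes a sed that accepts -E for extended syntax (GNU sed treats -E the same as -r, and BSD sed accepts -E as well).

```shell
# sed, basic syntax: grouping parentheses must be escaped.
swapped_basic=$(echo "ford ltd" | sed 's/\(ford\) \(ltd\)/\2 \1/')

# sed, extended syntax: plain parentheses group.
swapped_extended=$(echo "ford ltd" | sed -E 's/(ford) (ltd)/\2 \1/')

# awk always uses extended syntax, so ( ) and | need no escaping.
awk_model=$(echo "ford ltd" | awk '/^(ford|chevy) / { print $2 }')

# In plain grep, unescaped parentheses are ordinary characters.
literal_parens=$(echo "f(x)" | grep -c "f(x)")

echo "$swapped_basic / $swapped_extended / $awk_model / $literal_parens"
```

Both sed commands produce "ltd ford", awk prints "ltd", and grep counts one line containing the literal text f(x).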
- regular expressions may or may not need delimiters - varies from program to program
- eg. grep and egrep don't use delimiters, sed and awk use delimiters, eg. /string/
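A small sketch of the delimiter difference: the same search written for grep, sed and awk. The two input lines and the variable names are made up for illustration.

```shell
# Two hypothetical input lines.
sample=$(printf 'ford ltd 83\nvolvo gl 78\n')

# grep takes the pattern as a plain argument, with no delimiters.
grep_out=$(printf '%s\n' "$sample" | grep 'volvo')

# sed and awk expect the pattern between slashes.
sed_out=$(printf '%s\n' "$sample" | sed -n '/volvo/p')
awk_out=$(printf '%s\n' "$sample" | awk '/volvo/ { print }')

echo "$grep_out"
echo "$sed_out"
echo "$awk_out"
```

All three commands print the line "volvo gl 78".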
- other examples of regular expressions
(Mr|Mrs) Smith - match either "Mr Smith" or "Mrs Smith"
[a-zA-Z]+ - match one or more letters
^[a-zA-Z]*$ - match lines with only letters (including empty lines)
[^0-9]+ - match string not containing digits
[+-]?([0-9]+[.]?[0-9]*|[.][0-9]+)([eE][+-]?[0-9]+)? - match decimal numbers as written in "C" (optional sign, fraction and exponent)
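Some of these patterns can be checked against test strings with grep -E. The helper function matches and the test strings are hypothetical, introduced only for this sketch. The number pattern is wrapped in ^ and $ here as an added assumption, so that the whole string has to be a number rather than merely contain one.

```shell
# matches PATTERN STRING: print "yes" if grep -E finds PATTERN in STRING.
matches() {
    if printf '%s\n' "$2" | grep -E -e "$1" > /dev/null; then
        echo yes
    else
        echo no
    fi
}

number='^[+-]?([0-9]+[.]?[0-9]*|[.][0-9]+)([eE][+-]?[0-9]+)?$'

smith=$(matches '(Mr|Mrs) Smith' 'Dear Mrs Smith')
letters_some=$(matches '[a-zA-Z]+' 'route 66')
letters_none=$(matches '[a-zA-Z]+' '12345')
only_letters=$(matches '^[a-zA-Z]*$' 'abcDEF')
mixed=$(matches '^[a-zA-Z]*$' 'abc123')
num_exp=$(matches "$number" '-1.5e+10')
num_frac=$(matches "$number" '.5')
num_bad=$(matches "$number" '1.2.3')

echo "$smith $letters_some $letters_none $only_letters $mixed $num_exp $num_frac $num_bad"
```

The script prints yes yes no yes no yes yes no.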
grep
- uses regular expression for pattern, eg. grep 'reg-exp' filename, then prints matched lines
- gives 0 exit status if pattern matched, 1 if no line matched
- options:
-c - counts matched lines instead of printing them
-i - ignores case
-n - precedes each matched line with its line number
-v - reverses sense of test, ie. finds lines not matching pattern
- examples, using the file cars
grep 'chevy' cars - display only lines containing the string "chevy"
grep -c 'chevy' cars - display count of lines containing the string "chevy"
grep -i 'chevy' cars - display only lines containing the string "chevy", ignoring case
grep -ic 'chevy' cars - display count of lines containing the string "chevy", ignoring case
grep -v 'chevy' cars - display only lines not containing the string "chevy"
grep -ivc 'chevy' cars - display count of lines not containing the string "chevy", ignoring case
grep -n 'chevy' cars - display only lines containing the string "chevy", with line numbers
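The options and the exit status can be tried together in one script. As before, this is a sketch: the cars file it creates is hypothetical sample data standing in for the lecture's file, and the variable names are assumptions.

```shell
# Work in a scratch directory so no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: make, model, year, mileage, price.
cat > cars <<'EOF'
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
volvo gl 78 102 9850
ford ltd 83 15 10500
Chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
EOF

count=$(grep -c 'chevy' cars)              # lines containing chevy
count_nocase=$(grep -ic 'chevy' cars)      # same, ignoring case
count_inverted=$(grep -ivc 'chevy' cars)   # lines without chevy, ignoring case
numbered=$(grep -n 'chevy' cars)           # matched lines with line numbers

# The exit status lets grep act as a test; its output is discarded.
if grep 'volvo' cars > /dev/null; then volvo=found; else volvo=missing; fi
if grep 'saab' cars > /dev/null; then saab=found; else saab=missing; fi

echo "$count $count_nocase $count_inverted"
echo "$numbered"
echo "volvo $volvo, saab $saab"
```

With this sample data the counts are 1, 2 and 6, the numbered line is "2:chevy nova 79 60 3000", and the last line reads "volvo found, saab missing".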