Here are Christophe's notes from the session:
We went over a few basic things on regular expressions.
Special "character classes":
\d to detect a digit 0-9
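\w to detect an alphanumeric character 0-9, a-z, A-Z (the underscore _ also counts)
\s to detect a whitespace character (space, tab, etc.)
. to detect any character (except a newline, by default, in most regex flavours)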
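As a quick check of these classes, here is a minimal sketch in Python (the session did not name a language, so Python's re module is an assumption, and the helper name matches is made up for illustration). Note that in Python 3, \d and \w also accept non-ASCII digits and letters unless the re.ASCII flag is passed.

```python
import re

def matches(pattern, text):
    # Hypothetical helper: True only if the WHOLE string matches the pattern.
    return re.fullmatch(pattern, text) is not None

print(matches(r"\d", "7"))    # True: a digit
print(matches(r"\d", "x"))    # False: not a digit
print(matches(r"\w", "x"))    # True: a letter
print(matches(r"\w", "_"))    # True: the underscore counts as \w
print(matches(r"\w", "-"))    # False: punctuation is not \w
print(matches(r"\s", "\t"))   # True: a tab is whitespace
print(matches(r".", "-"))     # True: . accepts any character...
print(matches(r".", "\n"))    # False: ...except a newline, by default
```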
And usual quantifiers
? : match zero or one
+ : match one or more
* : match zero or more
{n} : match exactly n times (e.g. {12})
{n,m} : match n to m times (inclusive)
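The quantifiers above can be tried out with a small sketch like the following (again assuming Python's re module; the helper matches and the sample strings are only illustrative).

```python
import re

def matches(pattern, text):
    # Hypothetical helper: True only if the WHOLE string matches the pattern.
    return re.fullmatch(pattern, text) is not None

# ? : zero or one "u"
print(matches(r"colou?r", "color"))    # True
print(matches(r"colou?r", "colour"))   # True
print(matches(r"colou?r", "colouur"))  # False

# + : one or more digits, * : zero or more digits
print(matches(r"\d+", "2008"))  # True
print(matches(r"\d+", ""))      # False: + needs at least one digit
print(matches(r"\d*", ""))      # True: * accepts zero digits

# {n} : exactly n times, {n,m} : between n and m times
print(matches(r"\w{3}", "DEC"))    # True
print(matches(r"\w{3}", "DECE"))   # False: four characters
print(matches(r"\d{1,2}", "5"))    # True
print(matches(r"\d{1,2}", "12"))   # True
print(matches(r"\d{1,2}", "123"))  # False: three digits
```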
One can also define a new 'character class' with [ ].
E.g. [aeiouy] matches ONE character, but it has to be a lowercase vowel
[0-9a-z] matches ONE character that is any digit or lowercase letter (but cannot be an uppercase letter).
[0-9][a-z][A-Z] matches 1aX or 0xA but not 2ax etc.
A special feature of the [ ] character class is the caret (^): placed right after the opening bracket, it negates the class:
[^aeiouy] means match one character as long as it is not a lowercase vowel
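The bracket examples above can be checked with a short sketch (Python's re module is an assumption, and matches is a made-up helper name).

```python
import re

def matches(pattern, text):
    # Hypothetical helper: True only if the WHOLE string matches the pattern.
    return re.fullmatch(pattern, text) is not None

# A class matches exactly ONE character taken from the set.
print(matches(r"[aeiouy]", "e"))   # True
print(matches(r"[aeiouy]", "E"))   # False: uppercase is not in the set
print(matches(r"[aeiouy]", "ae"))  # False: two characters

print(matches(r"[0-9a-z]", "7"))   # True
print(matches(r"[0-9a-z]", "Q"))   # False

# Three classes in a row: digit, lowercase letter, uppercase letter.
print(matches(r"[0-9][a-z][A-Z]", "1aX"))  # True
print(matches(r"[0-9][a-z][A-Z]", "0xA"))  # True
print(matches(r"[0-9][a-z][A-Z]", "2ax"))  # False: last one is lowercase

# A leading caret negates the class.
print(matches(r"[^aeiouy]", "b"))  # True
print(matches(r"[^aeiouy]", "a"))  # False
print(matches(r"[^aeiouy]", "E"))  # True: only LOWERCASE vowels are excluded
```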
Exercise
To match
20-DEC-2008 5:59pm
We proposed:
\d\d-\w{3}-\d{4}\s\d:\d\d\w\w
Because \d means any digit and \w means any alphanumeric, the expression above would also match
99-DEC-0001 5:59pm
Or
12-ABX-4512 5:59pm
(and also 20-DEC-2008 5:59ab)
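To see this over-matching in practice, here is a minimal sketch, assuming Python's re module (the names PATTERN and matches are illustrative only). It also shows that this first version does not yet accept a two-digit hour.

```python
import re

# The pattern proposed in the session, unchanged.
PATTERN = r"\d\d-\w{3}-\d{4}\s\d:\d\d\w\w"

def matches(text):
    # Hypothetical helper: True only if the WHOLE string matches PATTERN.
    return re.fullmatch(PATTERN, text) is not None

print(matches("20-DEC-2008 5:59pm"))   # True: the intended target
print(matches("99-DEC-0001 5:59pm"))   # True: \d accepts any digits
print(matches("12-ABX-4512 5:59pm"))   # True: \w{3} accepts any 3 alphanumerics
print(matches("20-DEC-2008 5:59ab"))   # True: \w\w accepts any 2 alphanumerics
print(matches("20-DEC-2008 12:44pm"))  # False: only one digit allowed for the hour
```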
We refined the time part (5:59pm) to also match 12:44pm or 10:51am:
\d{1,2}:\d\d\w\w
And then even
\d{1,2}:\d\d[ap]m
(to match only am or pm...)
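A sketch comparing the two time patterns, assuming Python's re module (LOOSE, STRICT and matches are made-up names). The strict version is written with the class [ap] followed by a literal m, so that only am or pm is accepted. Neither version validates the digits, so a time such as 99:99pm still passes.

```python
import re

LOOSE = r"\d{1,2}:\d\d\w\w"     # any two alphanumerics after the minutes
STRICT = r"\d{1,2}:\d\d[ap]m"   # "a" or "p", then a literal "m"

def matches(pattern, text):
    # Hypothetical helper: True only if the WHOLE string matches the pattern.
    return re.fullmatch(pattern, text) is not None

for time in ["5:59pm", "12:44pm", "10:51am"]:
    print(time, matches(LOOSE, time), matches(STRICT, time))  # True True each time

print(matches(LOOSE, "5:59ab"))    # True: \w\w is too permissive
print(matches(STRICT, "5:59ab"))   # False: only am or pm is allowed
print(matches(STRICT, "99:99pm"))  # True: the digits are not range-checked
```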
That was about it this time...
Tof'