Friday, December 5, 2008

More notes from the December 2nd Session

Here are Christophe's notes from the session:

We went over a few basic things on regular expressions.

Special "character classes":
\d to detect a digit 0-9
\w to detect an alphanumeric character (0-9, a-z, A-Z) or an underscore
\s to detect a whitespace character (space, tab, etc.)
. to detect any character (in most flavors, any character except a newline)
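
As a quick illustration of these classes, here is a minimal sketch using Python's re module (the choice of Python and the sample strings are assumptions, not something from the session):

```python
import re

# Each shorthand class matches exactly ONE character at a time.
digits = re.findall(r"\d", "a1 b2")       # digit characters
spaces = re.findall(r"\s", "a1\tb2 c")    # whitespace: tab, space, ...
words = re.findall(r"\w", "a_1 !")        # letters, digits and underscore
anychar = re.findall(r".", "a\n!")        # anything except a newline by default

print(digits)   # ['1', '2']
print(spaces)   # ['\t', ' ']
print(words)    # ['a', '_', '1']
print(anychar)  # ['a', '!']
```

Note that in Python \w also accepts the underscore, and . skips the newline unless the re.DOTALL flag is given.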

And the usual quantifiers:
? : match zero or one
+ : match one or more
* : match zero or more
{n} : match exactly n times (e.g. {12} for 12 times)
{n,m} : match n to m times (inclusive)
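
The quantifiers can be tried out one by one. The sketch below assumes Python's re module; the helper name full is made up for the example and simply checks that the whole string matches:

```python
import re

def full(pattern, text):
    """Return True when the WHOLE of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# ? : zero or one 'b'
print(full(r"ab?", "a"), full(r"ab?", "ab"), full(r"ab?", "abb"))   # True True False
# + : one or more 'b'
print(full(r"ab+", "a"), full(r"ab+", "abbb"))                      # False True
# * : zero or more 'b'
print(full(r"ab*", "a"), full(r"ab*", "abbb"))                      # True True
# {n} : exactly n times
print(full(r"\d{4}", "2008"), full(r"\d{4}", "208"))                # True False
# {n,m} : between n and m times, both ends included
print(full(r"\d{1,2}", "5"), full(r"\d{1,2}", "12"), full(r"\d{1,2}", "123"))  # True True False
```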

One can also define a new 'character class' with [ ].
E.g. [aeiouy] matches ONE character, which has to be a lowercase vowel.
[0-9a-z] matches ONE character that is any digit or lowercase letter (but cannot be an uppercase letter).
[0-9][a-z][A-Z] is three classes in a row (a digit, then a lowercase letter, then an uppercase letter): it matches 1aX or 0xA but not 2ax, etc.
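
The examples above can be checked with a short sketch (Python's re module is assumed here; the variable names are only illustrative):

```python
import re

vowel = re.compile(r"[aeiouy]")           # one lowercase vowel
digit_or_lower = re.compile(r"[0-9a-z]")  # one digit or lowercase letter
three = re.compile(r"[0-9][a-z][A-Z]")    # digit, lowercase, uppercase

print(bool(vowel.fullmatch("e")), bool(vowel.fullmatch("E")))                    # True False
print(bool(digit_or_lower.fullmatch("7")), bool(digit_or_lower.fullmatch("Q")))  # True False

# Keep only the candidates that match the three-class pattern entirely.
kept = [s for s in ["1aX", "0xA", "2ax"] if three.fullmatch(s)]
print(kept)  # ['1aX', '0xA']
```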

A special feature of the [ ] character class is the caret (^), placed right after the opening bracket to negate the class:
[^aeiouy] means match one character as long as it is not a lowercase vowel
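
A small sketch of the negated class, again assuming Python's re module. It also shows that "not a lowercase vowel" includes uppercase vowels, digits and spaces:

```python
import re

not_vowel = re.compile(r"[^aeiouy]")

consonants = not_vowel.findall("regex")  # the two 'e' characters are skipped
others = not_vowel.findall("A e1")       # 'A', the space and '1' all qualify

print(consonants)  # ['r', 'g', 'x']
print(others)      # ['A', ' ', '1']
```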

Exercise

To match

20-DEC-2008 5:59pm

We proposed:

\d\d-\w{3}-\d{4}\s\d:\d\d\w\w

Because \d matches any digit and \w matches any alphanumeric character, the expression above would also match
99-DEC-0001 5:59pm
Or
12-ABX-4512 5:59pm
(and also 20-DEC-2008 5:59ab)
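
To see the over-matching for yourself, here is a minimal sketch that runs the proposed expression against those strings (Python's re module is assumed; the names first_try and results are made up for the example):

```python
import re

# First proposal from the session, copied as-is.
first_try = re.compile(r"\d\d-\w{3}-\d{4}\s\d:\d\d\w\w")

samples = [
    "20-DEC-2008 5:59pm",   # the intended target
    "99-DEC-0001 5:59pm",   # nonsense day and year, still all digits
    "12-ABX-4512 5:59pm",   # ABX is not a month, but it is three \w
    "20-DEC-2008 5:59ab",   # 'ab' is two \w characters
    "20-DEC-2008 12:44pm",  # two-digit hour: the single \d cannot cover it
]

results = {s: first_try.fullmatch(s) is not None for s in samples}
for s, ok in results.items():
    print(f"{s!r}: {ok}")
```

The first four strings match and only the last one is rejected, because the pattern allows exactly one digit for the hour.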

We refined the time part (5:59pm) so that it also matches 12:44pm or 10:51am:
\d{1,2}:\d\d\w\w

And then even
\d{1,2}:\d\d[ap]m
(to match only am or pm...)
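
A quick check of the time pattern restricted to am/pm, written here as \d{1,2}:\d\d[ap]m (the class [ap] followed by a literal m). This is only a sketch, assuming Python's re module; time_re and the sample list are illustrative:

```python
import re

# One or two digits, a colon, two digits, then 'am' or 'pm'.
time_re = re.compile(r"\d{1,2}:\d\d[ap]m")

samples = ["5:59pm", "12:44pm", "10:51am", "5:59ab", "99:99pm"]
accepted = [s for s in samples if time_re.fullmatch(s)]

print(accepted)  # ['5:59pm', '12:44pm', '10:51am', '99:99pm']
```

5:59ab is now rejected, but 99:99pm still gets through, since \d accepts any digit. Restricting the hour and minute ranges would need a further refinement.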

That was about it this time...

Tof'
