Monday, 10 December 2007

Regular expressions again

Right. Now that I understand regular expressions a little better, here's an update to the previous post. Rather than use the rather clunky .NET string manipulation from the previous post, this is a little neater:

Dim dateRegEx As New RegularExpressions.Regex("\d{4}[-]\d{2}[-]\d{2}[-]\d{2}.\d{2}.\d{2}.\d{6}")

If dateRegEx.IsMatch(StringToFix) Then

outString = dateRegEx.Replace(StringToFix, _
"(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})-(?<hour>\d{2}).(?<min>\d{2}).(?<sec>\d{2}).(?<milli>\d{3})\d{3}", _
"${year}-${month}-${day} ${hour}:${min}:${sec}.${milli}")

End If

This does everything the previous code fragment does but with less than half the code. The regular expressions work as follows:

\d - matches a single digit. In this example it is always followed by a repetition count (the number in the {} brackets), so \d{4} matches exactly four digits. This was probably where I was going wrong in the previous post. So that tidies up the match.
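To make the repetition count concrete, here is a minimal sketch. The module name and sample strings are made up for illustration, and the ^ and $ anchors are my addition so that the whole string has to be exactly four digits:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RepetitionDemo
    Sub Main()
        ' ^ and $ anchor the pattern so the whole string must be exactly four digits.
        Dim fourDigits As New Regex("^\d{4}$")

        Console.WriteLine(fourDigits.IsMatch("2007"))   ' True
        Console.WriteLine(fourDigits.IsMatch("207"))    ' False: only three digits
        Console.WriteLine(fourDigits.IsMatch("20071"))  ' False: five digits
    End Sub
End Module
```

Without the anchors, \d{4} would also match inside a longer run of digits, which is why IsMatch on the unanchored date pattern accepts any string that merely contains a timestamp.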

The Replace function has quite a large number of overloads, but the one being used here takes three strings: the input, the pattern and the replacement.

The ?<year> notation in the second Replace parameter names a regular expression group (in this case 'year'), essentially splitting the string being evaluated into named pieces for manipulation in the third parameter.

The ${} notation is explained here; in this example I'm just using it to stitch together the pieces of the string captured by the ?<group name> groups.
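Putting the named groups and the ${} substitutions together, the following is a self-contained sketch rather than the exact code from my project. The module name and the sample timestamp are assumptions for illustration. I've also escaped the dots (\.) so they only match a literal full stop, and left the last three digits as an unnamed \d{3} because the replacement never uses them:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module TimestampDemo
    Sub Main()
        ' Hypothetical sample input, not taken from the original data.
        Dim stringToFix As String = "2007-12-10-14.30.45.123456"

        ' Each (?<name>...) group captures one field of the timestamp.
        ' The trailing \d{3} consumes the last three digits without naming them.
        Dim pattern As String = _
            "(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})-" & _
            "(?<hour>\d{2})\.(?<min>\d{2})\.(?<sec>\d{2})\.(?<milli>\d{3})\d{3}"

        ' ${name} inserts whatever the group with that name captured.
        Dim replacement As String = _
            "${year}-${month}-${day} ${hour}:${min}:${sec}.${milli}"

        ' Shared overload: Regex.Replace(input, pattern, replacement).
        Dim outString As String = Regex.Replace(stringToFix, pattern, replacement)

        Console.WriteLine(outString)
    End Sub
End Module
```

With the sample input above this prints 2007-12-10 14:30:45.123. If the pattern doesn't match, Replace simply returns the input unchanged, so the IsMatch check is optional here.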
