Solid Fluid System Solutions  

Individual Characters

This page describes how to specify which characters should match at an arbitrary position within a search text.

Character Description Example

The wildcard

. "dot"
    The dot token matches any character except return and newline. To let the dot match return and newline as well, switch that mode on with (?s) and off again with (?-s), as described on the Modes page.
    Example: . matches x
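
As a rough illustration of the dot token, the sketch below uses Python's built-in re module. That choice is an assumption made for the example and is not necessarily the engine this page documents. Note one difference: Python's dot excludes only newline by default, whereas this page says return is excluded too.

```python
import re

# Default mode: the dot matches an ordinary character but not a newline.
plain_dot = re.compile(r".")
dot_matches_x = plain_dot.fullmatch("x") is not None
dot_matches_newline = plain_dot.fullmatch("\n") is not None

# (?s) at the start of the pattern switches on "dot matches everything" mode.
dotall = re.compile(r"(?s).")
dotall_matches_newline = dotall.fullmatch("\n") is not None

print(dot_matches_x, dot_matches_newline, dotall_matches_newline)
```

In Python the (?s) option must appear at the very start of the pattern, or be scoped to a group as in (?s:.), so the on/off switching described above may need adapting to the engine in use.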

Ways to specify a particular test character

Any character except [\^$.|?*+(){
    Apart from the "special characters" listed, any character may be used as a token that represents itself.
    Example: a matches a

\ "backslash" followed by any of [^$.|?*+(){
    A backslash placed before a "special character" makes that character a token that represents itself. This scheme is referred to as an "escape".
    Example: \+ matches +

\xFF where FF are 2 hexadecimal digits
    Backslash x followed by two hexadecimal digits is a token that can specify any 8-bit character.
    Example: \x21 matches !

\r , \n and \t
    Backslash r, n and t are tokens that match the ASCII return, newline and tab characters respectively.
    Example: \r\n matches a DOS/Windows CRLF line break

\\
    Backslash backslash is a token that matches a single backslash.
    Example: \\ matches \
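
The single-character tokens above can be tried out with a short sketch. It assumes Python's re module as a stand-in engine, and the names cases and results are made up for the example. Raw strings (the r prefix) are used so that the regex engine, rather than Python itself, interprets each backslash.

```python
import re

# (pattern, sample text) pairs, one for each way of naming a single character.
cases = [
    (r"a", "a"),         # an ordinary character represents itself
    (r"\+", "+"),        # an escaped special character
    (r"\x21", "!"),      # two hexadecimal digits: 0x21 is "!"
    (r"\r\n", "\r\n"),   # return followed by newline (CRLF)
    (r"\t", "\t"),       # tab
    (r"\\", "\\"),       # a single literal backslash
]
results = [re.fullmatch(pattern, text) is not None for pattern, text in cases]

# Without the escape, + is a special character (a repeat), not a literal plus,
# so the pattern a+ does not match the two-character text "a+".
unescaped_plus_is_literal = re.fullmatch(r"a+", "a+") is not None

print(results, unescaped_plus_is_literal)
```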

Ways to specify a predefined range of test characters, valid in a specific position

\d , \w and \s
    Backslash d, w and s are tokens that match any digit, word character or whitespace character respectively.
    Example: [\d\s] matches a character that is a digit or whitespace

\D , \W and \S
    Backslash D, W and S are tokens that match any character that is not a digit, not a word character or not a whitespace character respectively.
    Example: \D matches a character that is not a digit
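
The predefined ranges can be explored with the sketch below, again assuming Python's re module rather than the engine this page describes. The re.ASCII flag is passed so that \d, \w and \s are limited to ASCII characters. In Python, \w also counts the underscore as a word character, and other engines may differ. The sample text is invented for the example.

```python
import re

sample = "id 42\tok_7"

# Each call collects every single character that the token matches.
digits = re.findall(r"\d", sample, flags=re.ASCII)
spaces = re.findall(r"\s", sample, flags=re.ASCII)
word_chars = re.findall(r"\w", sample, flags=re.ASCII)
non_digits = re.findall(r"\D", sample, flags=re.ASCII)

# The shorthand tokens may also be combined inside a custom range.
digit_or_space = re.findall(r"[\d\s]", sample, flags=re.ASCII)

print(digits, spaces, word_chars, non_digits, digit_or_space)
```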

Ways to specify a custom range of test characters, valid in a specific position

[ "opening square bracket"
    Begins the description of a range of characters that acts as a single token and matches at a single location in the search text. Because a range can only be composed of characters, a different syntax applies inside the range delimiters.

Any character except ^-]\
    Inside the range delimiters, each additional character is added as an alternative that may match at that single location in the search text.
    Example: [xyz] matches x , y or z

\ "backslash" followed by any of ^-]\
    The escape mechanism may be used inside the range delimiters, wherever the escape forms a token.
    Example: [\^\]] matches ^ or ]

- "minus" except immediately after the opening [
    Minus specifies an implicit subrange, which begins at the ASCII value of the character that precedes the minus sign and ends at the ASCII value of the character that follows it. Where the minus sign immediately follows the opening range delimiter, it represents a literal minus sign.
    Example: [a-zA-Z0-9] matches any letter or digit

^ "caret" immediately after the opening [
    The caret inverts the range, so that the range as a whole matches any character not specified within it. Where the caret is not placed immediately after the opening range delimiter, it represents a literal caret.
    Example: [^x-z] matches m (or any other character except x, y or z)
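
The custom range rules above can be checked with a small sketch. It assumes Python's re module as the engine, and matches is a hypothetical helper defined only for this example.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of text is matched by pattern."""
    return re.fullmatch(pattern, text) is not None

checks = {
    "alternatives": matches(r"[xyz]", "y") and not matches(r"[xyz]", "a"),
    "escapes": matches(r"[\^\]]", "^") and matches(r"[\^\]]", "]"),
    "subranges": matches(r"[a-zA-Z0-9]", "Q") and not matches(r"[a-zA-Z0-9]", "_"),
    "inverted": matches(r"[^x-z]", "m") and not matches(r"[^x-z]", "y"),
    # A minus directly after the opening bracket is a literal minus sign.
    "leading_minus": matches(r"[-a]", "-") and matches(r"[-a]", "a"),
    # A caret that is not directly after the opening bracket is a literal caret.
    "later_caret": matches(r"[a^]", "^") and matches(r"[a^]", "a"),
}

print(checks)
```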
Copyright © Solid Fluid 2007-2022
Last modified: SolFlu  Sat, 20 Jun 2009 21:02:29 GMT