

Introduction to using Regular Expressions

Question/Issue:
When using some of the more advanced features of the Raptor Firewall or Symantec Enterprise Firewall, you should have at least a passing knowledge of regular expressions.


Solution:
Regular expressions (often called "regex" or "regexp") define a pattern against which other text is matched. They use special metacharacters to define character groups, sets, or alternations.

These expressions are useful in shell scripting, Perl programming, or in any other application that supports this type of pattern matching. They can be used to pull strings out of text files, for substitution, for reporting, and for many other tasks.

This document covers the use of regular expressions as they relate to certain advanced functions of Raptor Firewall or Symantec Enterprise Firewall.

In Symantec Enterprise Firewall products, regular expressions are used only for pattern matching. There is no substitution functionality available in these applications. Regexes are most often used to match URL patterns and to match email senders that should be denied (see the specific documentation regarding the HTTP.URLPATTERN setting or the advanced SMTP "Enable bad sender check" setting).

Character sets and grouping
A character set is a group of characters to match against and is defined in brackets, "[ ]". The functionality of a character set is to provide the ability to match one (or, when quantified, more) character out of a choice or series. An example would be "d[eiu]sk", which matches against "desk", "disk", and "dusk". The character set allows a limited sort of "wildcard" functionality.
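
As a quick illustration of the character set above, the following is a minimal sketch that uses Python's "re" module. Python is an assumption made for demonstration only; the firewall's own regular expression engine is not documented here and may differ in details. The list of candidate words is made up for the example.

```python
import re

# Character set from the text: exactly one of e, i, or u between "d" and "sk".
set_pattern = re.compile(r"d[eiu]sk")

# Hypothetical test words; fullmatch() requires the whole word to fit the pattern.
candidates = ["desk", "disk", "dusk", "dask", "dsk"]
matched = [word for word in candidates if set_pattern.fullmatch(word)]

print(matched)  # ['desk', 'disk', 'dusk']
```

"dask" is rejected because "a" is not in the set, and "dsk" is rejected because an unquantified set must match exactly one character.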

Another important "wildcard" is the period ("."), also known as the "dot." When used in a regular expression, a period matches any single character, including characters that would otherwise be special; in many implementations the only exception is the newline character. Thus, one could also use "d.sk" to match the previous examples; however, this would also match "dask", "dbsk", "dcsk", and so on.
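
The difference between the dot and a character set can be seen side by side. This is a sketch under the same assumption as before: Python's "re" module stands in for the firewall's matcher, and the word list is invented for the example.

```python
import re

dot_pattern = re.compile(r"d.sk")          # any one character in the middle
narrow_pattern = re.compile(r"d[eiu]sk")   # only e, i, or u in the middle

words = ["desk", "disk", "dusk", "dask", "dbsk", "dsk"]
for word in words:
    by_dot = bool(dot_pattern.fullmatch(word))
    by_set = bool(narrow_pattern.fullmatch(word))
    print(f"{word}: dot={by_dot} set={by_set}")
```

The dot accepts every five-letter candidate here, while the set accepts only the three intended words. Neither accepts "dsk", since both require exactly one character between "d" and "sk".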

Parentheses, "()", are used as grouping characters: whatever is within the parentheses is evaluated as a single unit. This becomes important when using operators such as alternation or repetition. The following example uses the "or" alternation metacharacter, the pipe ("|"), to demonstrate grouping. The example "My (computer|desk|firewall)" matches "My computer", "My desk", or "My firewall", but does not match "My computer desk" as a whole, because only one of the alternatives can follow "My ".
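
The grouping example can be sketched as follows, again assuming Python's "re" module purely for illustration. The ungrouped pattern is included only to show why the parentheses matter; it is not taken from the firewall documentation.

```python
import re

grouped = re.compile(r"My (computer|desk|firewall)")
# Without parentheses the alternation splits the whole expression into
# three alternatives: "My computer", "desk", and "firewall".
ungrouped = re.compile(r"My computer|desk|firewall")

phrases = ["My computer", "My desk", "My firewall", "My computer desk", "My chair"]
whole_matches = [p for p in phrases if grouped.fullmatch(p)]
print(whole_matches)  # ['My computer', 'My desk', 'My firewall']

# The ungrouped form accepts the bare word "desk"; the grouped form does not.
print(bool(ungrouped.fullmatch("desk")), bool(grouped.fullmatch("desk")))  # True False
```

Note that whether "My computer desk" is reported as a match depends on how the application applies the pattern. A search for the pattern anywhere in the text would still find "My computer" inside the longer phrase, whereas a whole-string match, as used above, would not.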

Repetition, alternation, and other modifiers
Some of the special characters available operate on groups and sets to perform the functions of alternation or repetition.

The alternation operator, known as the pipe symbol (|) or bar, allows you to specify multiple characters or character groups to match (as shown above in the demonstration of parentheses). It matches either of the expressions that it separates.

The question mark (?), plus sign (+), and asterisk (*) denote quantities to match.

The question mark indicates that zero or one of the specified characters may be present. For example, the expression "b[oa]?b" matches "bab", "bob", or "bb".

The plus sign (+) indicates that one or more of the specified characters must be present. Changing the previous expression to "b[oa]+b" returns a match for "bab", "bob", "baaaaab", and so forth, but not for "bb", because no member of the character set appears between the two "b" characters.

The asterisk specifies zero or more of the preceding character or set. "b[oa]*b" matches "bab", "bob", "baaaaab", and "bb".
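
The three quantifiers can be compared against the same sample strings. This is a minimal sketch assuming Python's "re" module; the dictionary and sample list are names introduced only for this example.

```python
import re

# One pattern per quantifier, all built on the character set [oa].
quantified = {
    "?": re.compile(r"b[oa]?b"),  # zero or one
    "+": re.compile(r"b[oa]+b"),  # one or more
    "*": re.compile(r"b[oa]*b"),  # zero or more
}

samples = ["bb", "bab", "bob", "baaaaab"]
results = {
    symbol: [s for s in samples if pattern.fullmatch(s)]
    for symbol, pattern in quantified.items()
}

for symbol, accepted in results.items():
    print(symbol, accepted)
# ? ['bb', 'bab', 'bob']
# + ['bab', 'bob', 'baaaaab']
# * ['bb', 'bab', 'bob', 'baaaaab']
```

Only "?" and "*" accept "bb", and only "+" and "*" accept "baaaaab".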

While the differences are subtle, they can have a great effect on what is matched. For instance, suppose you want to block email addresses in which one character is randomly generated (bob@fakedomain.com, b1b@fakedomain.com, bZb@fakedomain.com) and that character is always present. The + operator is more appropriate here than the ? operator, because the ? operator would also match "bb@fakedomain.com", which may be an address that you want to receive mail from.
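
The email scenario can be sketched in the same way. Everything here is hypothetical: the two patterns, the character range [A-Za-z0-9] chosen for the "random" character, and the use of Python's "re" module are assumptions for illustration, and the exact syntax accepted by the firewall's bad sender check may differ. The dot in the domain is escaped with a backslash so that it matches a literal period.

```python
import re

# Hypothetical patterns for the scenario described above.
plus_pattern = re.compile(r"b[A-Za-z0-9]+b@fakedomain\.com")
optional_pattern = re.compile(r"b[A-Za-z0-9]?b@fakedomain\.com")

senders = [
    "bob@fakedomain.com",
    "b1b@fakedomain.com",
    "bZb@fakedomain.com",
    "bb@fakedomain.com",  # the address we may want to keep receiving mail from
]

blocked_with_plus = [s for s in senders if plus_pattern.fullmatch(s)]
blocked_with_optional = [s for s in senders if optional_pattern.fullmatch(s)]

print("bb@fakedomain.com" in blocked_with_plus)      # False
print("bb@fakedomain.com" in blocked_with_optional)  # True
```

One trade-off to keep in mind: the + form also matches addresses with several characters between the two "b" characters, so it is broader than "exactly one random character".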

More information on regular expressions and how they are used may be found on the Internet or in the regexp man pages.



Document ID: 2001102607572054
Last Modified: 01/06/2005
Date Created: 10/26/2001
Product(s): Symantec Enterprise Firewall 6.5, Symantec Enterprise Firewall 7.x, Symantec Enterprise VPN (Server) 6.5, Symantec Enterprise VPN (Server) 7.x, Symantec Gateway Security Appliance 1.0, Symantec VelociRaptor 1.1
Release(s): Symantec Enterprise Firewall 6.5.2, Symantec Enterprise Firewall 7.0, Symantec Enterprise VPN (Server) 6.5, Symantec Enterprise VPN (Server) 6.5.2, Symantec Enterprise VPN (Server) 7.0, Symantec Gateway Security Appliance 1.0, Symantec VelociRaptor 1.1


©1995 - 2009 Symantec Corporation