From dattier@mcs.net Tue Mar 21 19:18:11 2000
Date: Thu, 9 Mar 2000 12:36:23 -0600 (CST)
From: David W. Tamkin <dattier@mcs.net>
To: Martin Mokrejs <mmokrejs@natur.cuni.cz>
Subject: Re: Need solution with procmail

Martin,

My personal life is swamped right now.  May I suggest that you post your
questions to the Procmail Mailing List?  Someone there will surely be able
to answer some or all of them before I can, so you won't have to wait for me.

|   [^-a-d]   Any character which is not either a dash, a, b, c, d or newline.

That seems clear enough to me.  That expression matches any single character
that isn't a, b, c, d, hyphen, or newline.
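
As a minimal sketch of how that bracket expression might be used in a
recipe (the X-Category: header and the folder name "misc" are made up for
illustration, and the D flag is there only so that upper and lower case are
kept distinct):

```procmail
# Hypothetical recipe: file the message in "misc" when the first
# character of the X-Category: value is anything other than a hyphen,
# a, b, c, d, or a newline.
:0 D:
* ^X-Category: [^-a-d]
misc
```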

|   ^ or $    Match a newline (for multiline matches).
| 
| So matches empty lines or any lines, looking for the begin or end either?

No, it matches the newline character itself, the one that separates any two
consecutive lines of text.
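
A small sketch of a multiline match may help here.  The scenario and the
folder name "bulk-mail" are assumptions for illustration only:

```procmail
# Hypothetical recipe: match only when an X-Loop: line is immediately
# followed by a "Precedence: bulk" line.  The $ in the middle matches
# the newline character that separates the two header lines; the .*
# cannot cross it, because . never matches a newline.
:0:
* ^X-Loop: .*$Precedence: bulk
bulk-mail
```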

|   ^^        Anchor the expression at the very start of the search area, or if
|             encountered  at  the end of the expression, anchor it at the very
|             end of the search area.
| 
| Probably what the word "anchor" means, sorry. ;)

To anchor means to tie down, to fasten, the way a ship is kept in one place
by its anchor.  Here it means that if the regexp begins with ^^, a match
counts only if the matching text is at the very beginning of the search area.
If the regexp ends with ^^, the match counts only if the matching text is at
the very end of the search area.  If the regexp has ^^ at both ends, the
entire search area must match the regexp.
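
To illustrate the three cases, here is a sketch that tests a variable.
RESULT and the three flag variables are hypothetical names chosen for the
example:

```procmail
# RESULT is a made-up variable; in practice it might hold the output
# of some command.
RESULT=yes

# ^^ at both ends: true only if RESULT is exactly "yes".
:0
* RESULT ?? ^^yes^^
{ EXACT=true }

# ^^ at the start only: true for "yes", "yesterday", and so on.
:0
* RESULT ?? ^^yes
{ STARTS=true }

# ^^ at the end only: true for "yes", "eyes", and so on.
:0
* RESULT ?? yes^^
{ ENDS=true }
```

With RESULT set to "yes" as above, all three conditions hold; with
"yesterday" only the second one would.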

|   \< or \>  Match the character before or after a word.  They  are  merely  a
|             shorthand  for  `[^a-zA-Z0-9_]',  but  can  also  match newlines.
|             Since they match actual characters, they  are  only  suitable  to
|             delimit words, not to delimit inter-word space.
| 
| I'm totally lost. I `[^a-zA-Z0-9_]' means .... doesn't look for chars a-z
| or A-Z or 0-9 or underscore, then it just looks for non-alphanumeric chars.
| Like space and nonprintable characters, right?

Right (punctuation marks other than the underscore count too).  The
difference is that, unlike an expression in brackets, \< and \> can also
match a newline character.  So really they are shorthand for
([^a-zA-Z0-9_]|$).
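
A sketch of the usual use, delimiting a word (the word "free" and the folder
name "offers" are only examples):

```procmail
# Hypothetical recipe: catch "free" as a whole word in the Subject.
# \< has to match one real character before the word (a space or a
# punctuation mark), and \> one real character after it, which may be
# the newline that ends the header line.  "freedom" and "carefree"
# therefore do not match.
:0:
* ^Subject:.*\<free\>
offers
```

Because each of them consumes a character, \< and \> would not work for
matching two words separated by a single space with something like
\<one\>\<two\>; the one space cannot be matched twice.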

| > To test for contents of a return value,
| >
| >  some_variable=`command`
| >
| >  :0 flags
| >  * some_variable ?? regexp
| >  action

| Why do we have to dig out the result from $some_variable by regexp. Isn't
| there a value (return value already)?

The person whose question I was answering wasn't testing the exit code of
the command; he was trying to see whether the standard output of the command
matched a certain regexp.
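
The two kinds of test look different in a recipe.  The following is only a
sketch: SUBJ, the word "urgent", and the folder name are assumptions made up
for the example.

```procmail
# Test the standard output: capture it in a variable, then match the
# variable against a regexp.
SUBJ=`formail -xSubject:`

:0:
* SUBJ ?? urgent
urgent

# Test the exit code instead: the condition holds if the command
# exits with status 0 (grep does so when it finds a match).
:0:
* ? formail -xSubject: | grep -qi urgent
urgent
```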

| How should the regexp look like?

That depends on what text he was hoping to find (or hoping not to find).