trinity-users@lists.pearsoncomputing.net

Re: [trinity-users] Hopeing I can find a regex expert here

From: Gene Heskett <gheskett@...>
Date: Wed, 23 Mar 2016 01:19:20 -0400

On Wednesday 23 March 2016 00:32:17 Michele Calgaro wrote:

> On 2016/03/23 12:44 PM, Gene Heskett wrote:
> > Greetings;
> >
> > I use mailfilter as a prefilter in front of fetchmail to nuke some
> > spam while it's still on the server.
> >
> > But it's missing hits on what I suspect are From: or Return-Path:
> > strings that have quotation marks in them, because the sender name
> > is being spec'd by being surrounded by "show this name" bs.
> >
> > I've added the character < as part of the string it's to search for,
> > so the search string now looks like "From:.*<*\.unwanted-tld".  Does
> > this stand that famous snowball's chance in hell of working well
> > with or without a quoted "some funkity name" in front of the real
> > address with the <> around it?
> >
> > I just love the lack of documentation on how this string comparison
> > stuff works as shown by the man pages for grep and regex.  All sorts
> > of control options are well covered, but figuring out how to write
> > a search expression must be one of the world's better guarded
> > secrets.
> >
> > So if someone could show me, or give a url that actually has the
> > full docs, I'd be grateful.
> >
> > Thanks.
> >
> > Cheers, Gene Heskett
>
> Hi Gene,
> "From:.*<*\.unwanted-tld" will match a string like this (I have put
> one section per line to be clearer):
> From:
> any characters (zero or more)
> 0 or more <
> .unwanted-tld
>
I thought I wanted 1 only, but the way these lowlifes change addresses
and names hourly, they may remove the <> surrounding the real source
address and screw me up.  But the fact that they often put double quotes
around the throwaway part of the address is, I think, screwing me
regardless.
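
A rough way to check this without waiting for spam is to try the pattern against a few sample headers. The sketch below uses Python's re module as a stand-in, which is an assumption: mailfilter uses the C library's POSIX regex, so edge cases may differ, though for a pattern this simple the two should agree. The header lines and addresses are made up for illustration, with "bid" substituted for the placeholder TLD.

```python
import re

# Pattern from the thread, with the placeholder TLD replaced by "bid".
pattern = re.compile(r"From:.*<*\.bid")

# Hypothetical header lines, invented purely for illustration.
headers = [
    'From: "Some Funky Name" <junk@example.bid>',  # quoted name plus <>
    'From: junk@example.bid',                      # bare address, no <>
    'From: Some Name <friend@example.org>',        # a TLD we do not block
]

# True where the pattern matches somewhere in the header line.
results = [bool(pattern.search(h)) for h in headers]
for header, hit in zip(headers, results):
    print(hit, header)

# Capture what "<*" actually consumed in the first header.
# The greedy ".*" swallows everything up to ".bid", so "<*" matches nothing.
consumed = re.search(r"From:.*(<*)\.bid", headers[0]).group(1)
print(repr(consumed))
```

At the regex level the quoted display name and the <> make no difference here, since ".*" matches any character, including double quotes. That says nothing about how mailfilter itself parses quotes inside its rc file, which is a separate question.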

What we need is the ability to specify the quote character by the first 
non-space character after the DENY =, which is currently a "^ or a <> 
which apparently inverts the logic.  So a typical line would be

DENY = "^From:.*<*\.bid"

Substitute any of the new TLDs that get obnoxious for bid, like xyz or
pro.  Heck, that new list is several dozen TLDs.
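
With several dozen TLDs, typing each rule by hand gets tedious. A minimal sketch of generating the lines instead, assuming only the DENY syntax shown above. The function name deny_line and the list are hypothetical; the list contains just the three TLDs named in this message.

```python
# TLDs named in the message; extend the list as new ones get obnoxious.
tlds = ["bid", "xyz", "pro"]

def deny_line(tld):
    # Same shape as the DENY line above; only the TLD changes.
    return 'DENY = "^From:.*<*\\.' + tld + '"'

lines = [deny_line(t) for t in tlds]
print("\n".join(lines))
```

The output can be pasted into the filter file. Note that a pattern ending in \.bid also matches longer names that merely start that way after a dot, so it is on the broad side.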

But AFAIK, we're stuck with the dblquote wrapper around the string to 
match.  Grrrr.

> It is greedy, so it will scan until the last < if there is more than
> one. Not sure if this is what you need or not. If you can post an
> example of what you need to match, I can work out another regex if
> required.
>
Try this:
 
"-Bed Bugs-" <-BedBugs-@...>

with Return-Path.* or From.* in front of it.  Or does that - sign, 4 of 
them, need escaping with a \ ? IDK.
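
On the escaping question, a hyphen is normally only special inside a bracket expression such as [a-z], where it denotes a range. Outside brackets it matches itself. A small sketch to illustrate, again assuming Python's re module as a stand-in for the POSIX regex that mailfilter uses; the sample header is the one above with "From: " put in front, and the elided address is left exactly as it appears.

```python
import re

# Sample header from the message; the "..." is the archive's own elision.
header = 'From: "-Bed Bugs-" <-BedBugs-@...>'

# Unescaped hyphens outside brackets are plain literal characters.
unescaped = re.search(r"From:.*<-BedBugs-@", header)

# Python also accepts a backslash before the hyphen; it changes nothing.
escaped = re.search(r"From:.*<\-BedBugs\-@", header)

# Inside brackets a hyphen between two characters means a range...
in_range = re.findall(r"[a-c]", "a-b-c")
# ...while a hyphen placed first in the brackets is taken literally.
literal_dash = re.findall(r"[-ac]", "a-b-c")

print(unescaped is not None, escaped is not None)
print(in_range, literal_dash)
```

So the four hyphens should not need a backslash. Whether a POSIX regex library tolerates the redundant backslash is not something this sketch tests, so leaving it out is the safer choice.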

Thanks, Michele.

> Cheers
>   Michele

I converted about 3 lines of the filterdata file that way, and I'm now
waiting for the next blast of spam to serve as test data.  mailfilter is
a picky twit, but that hasn't given it a tummy ache either, so I am
hopeful.

> PS: by the way, the internet is full of excellent documentation about
> regex ;-) For example "http://www.regular-expressions.info/"


Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>