BTW, have you tried "kregexpeditor"?

Nik

On Wednesday, 23 March 2016, E. Liddell wrote:
> On Wed, 23 Mar 2016 15:58:39 +0900
> Michele Calgaro <michele.calgaro@...> wrote:
>
> > On 2016/03/23 02:19 PM, Gene Heskett wrote:
> > > On Wednesday 23 March 2016 00:32:17 Michele Calgaro wrote:
> > >
> > >> On 2016/03/23 12:44 PM, Gene Heskett wrote:
> > >>> Greetings;
> > >>>
> > >>> I use mailfilter as a prefilter in front of fetchmail to nuke some
> > >>> spam while it's still on the server.
> > >>>
> > >>> But it's missing hits on what I suspect is the From: or Return-Path:
> > >>> strings that have quotation marks in the string because the string
> > >>> is being spec'd by being surrounded by "show this name" bs.
> > >>>
> > >>> I've added the character < as part of the string it's to search for,
> > >>> so the search string now looks like "From:.*<*\.unwanted-tld". Does
> > >>> this stand that famous snowball's chance in hell of working well
> > >>> with or without a quoted "some funkity name" in front of the real
> > >>> URL with the <> around it?
> > >>>
> > >>> I just love the lack of documentation on how this string comparison
> > >>> stuff works as shown by the man pages for grep and regex. All sorts
> > >>> of control options are well covered, but figuring out how to write
> > >>> a search expression must be one of the world's better-guarded
> > >>> secrets.
> > >>>
> > >>> So if someone could show me, or give a URL that actually has the
> > >>> full docs, I'd be grateful.
> > >>>
> > >>> Thanks.
> > >>>
> > >>> Cheers, Gene Heskett
> > >>
> > >> Hi Gene,
> > >> "From:.*<*\.unwanted-tld" will match a string like this (I have put
> > >> one section per line to be clearer):
> > >> From:
> > >> any characters
> > >> 0 or more <
> > >> .unwanted-tld
> > >>
> > > I thought I wanted 1 only, but the way these lowlifes change addresses
> > > and names hourly, they may remove the <> surrounding the real source
> > > address and screw me up.
> > > But the fact that they often put double quotes
> > > around the throwaway part of the URL is, I think, screwing me regardless.
> > >
> > > What we need is the ability to specify the quote character by the first
> > > non-space character after the DENY =, which is currently a "^ or a <>
> > > which apparently inverts the logic. So a typical line would be
> > >
> > > DENY = "^From:.*<*\.bid"
> > >
> > > Substitute any of the new TLDs for bid that gets obnoxious. Like xyz,
> > > or .pro, heck that new list is several dozen TLDs.
> > >
> > > But AFAIK, we're stuck with the double-quote wrapper around the string to
> > > match. Grrrr.
> > >
> > >> It is greedy, so it will scan until the last < if there are more than
> > >> one. Not sure if this is what you need or not. If you can post an
> > >> example of what you need to match, I can work out another regex if
> > >> required.
> > >>
> > > Try this:
> > >
> > > "-Bed Bugs-" <-BedBugs-@...>
> > >
> > > with Return-Path.* or From.* in front of it. Or does that - sign, 4 of
> > > them, need escaping with a \ ? IDK.
>
> Hyphens should only need an escape if within a character class, denoted by
> square brackets.
>
> > > I converted about 3 lines of the filterdata file that way, and I'm now
> > > waiting for the next blast of spam to serve as test data. mailfilter is
> > > a picky twit, but that hasn't given it a tummy ache either, so I am
> > > hopeful.
> > >
> > >> PS: by the way, the internet is full of excellent documentation about
> > >> regex ;-) For example "http://www.regular-expressions.info/"
> > >
> > >
> > > Cheers, Gene Heskett
> > >
> > Hi Gene,
> > so if I understand correctly, you already had a set of rules like
> > DENY = "^From:.*\.bid" (bid stands for any TLD of your choice)
> > but it was missing some entries because of the "..." entry before the domain.
> > So you put the < in the string as well.
> > Right?
> >
> > Assuming so, it surprises me that the original version missed some entries,
> > since the additional "..." field would have already been matched by the .*
> > part of the pattern.
> > I think there is a different reason for missing entries. Perhaps a blank
> > character before "From:"? Could it be?
> > You could try this other version:
> > DENY = "^\s*From:.*\.bid" which ignores any separator before From:
>
> That would also sweep up, say, fred@..., or
> "I.bid" <ibid@...>
>
> > or
> > DENY = "^\s*From:.*\.bid>" which also makes explicit that the TLD is followed by a >.
>
> I'd cover the example as
>
> ^\W*((From:)|(Return-Path:)).*\.bid\W*$
>
> which works out to zero or more non-word characters at the beginning of the string,
> followed by "From:" or "Return-Path:", followed by zero or more unknowns, followed
> by ".bid", followed by zero or more non-word characters, followed by the end of the
> string. "Word" characters are alphanumerics, the underscore (but not the hyphen), and
> possibly some non-ASCII depending on the implementation, so "non-word" covers stuff like
> punctuation and whitespace. Marking the end of the string makes it more likely you're
> getting the TLD and not some random bit in the middle that was designed as a parser
> torture-test.
>
> If you want to get really silly,
>
> ^\W*((From:)|(Return-Path:)).*\.[^cCoOnN][a-zA-Z][a-zA-Z]+\W*$
>
> ought to catch the majority of TLDs with a 3+ ASCII character extension
> that isn't .com, .org, or .net, but without a larger sample of "good" and
> "bad" addresses, I can't guarantee no false positives. Note that the first
> character class also skips every other TLD beginning with c, o, or n.
>
> I write a lot of regexes in my day job (which is not to say that I get them right the
> first time, every time!). Assuming a Perl-compatible implementation (which most
> of them are, more or less), "man perlre" is a decent reference for the complicated
> bits. Just scroll past the section on modifiers.
>
> E. Liddell
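
For anyone who wants to try the patterns from this thread before putting them in a filter file, here is a rough sketch in Python. Two assumptions to flag loudly: it uses Python's `re` module, which may not behave the same as whatever regex engine mailfilter is built against (in particular for `\s` and `\W`), and every header line and address in `SAMPLES` is made up for illustration. The helper name `hits` is likewise hypothetical.

```python
import re

# Patterns quoted from the thread. Python's re module is an assumption here;
# mailfilter's own regex engine may treat \s and \W differently.
ORIGINAL = r"^From:.*<*\.bid"
ANCHORED = r"^\W*((From:)|(Return-Path:)).*\.bid\W*$"
NOT_COM_ORG_NET = r"^\W*((From:)|(Return-Path:)).*\.[^cCoOnN][a-zA-Z][a-zA-Z]+\W*$"
# Hyphens outside square brackets are literal, so no backslash is needed.
HYPHEN_LITERAL = r'^From:.*"-Bed Bugs-"'

# Hypothetical header lines; every address below is invented for this test.
SAMPLES = {
    "quoted_name": 'From: "-Bed Bugs-" <-BedBugs-@spam.example.bid>',
    "bare_address": "From: someone@spam.example.bid",
    "bid_in_display_name": 'From: "I.bid" <ibid@example.com>',
    "return_path": "Return-Path: <bounce@spam.example.bid>",
    "club_tld": "From: <member@example.club>",
    "info_tld": "From: <news@example.info>",
}


def hits(pattern):
    """Return the names of the sample header lines that the pattern matches."""
    rx = re.compile(pattern)
    return {name for name, line in SAMPLES.items() if rx.search(line)}


if __name__ == "__main__":
    for label, pattern in [
        ("original", ORIGINAL),
        ("anchored", ANCHORED),
        ("not com/org/net", NOT_COM_ORG_NET),
        ("hyphen literal", HYPHEN_LITERAL),
    ]:
        print(f"{label:16} {sorted(hits(pattern))}")
```

With these samples, the unanchored pattern also fires on the `"I.bid"` display name, the end-anchored one does not, and the "really silly" pattern catches `.info` but lets `.club` through because its first character class rejects any TLD starting with c, o, or n.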