On Wednesday 23 March 2016 02:58:39 Michele Calgaro wrote:

> On 2016/03/23 02:19 PM, Gene Heskett wrote:
> > On Wednesday 23 March 2016 00:32:17 Michele Calgaro wrote:
> >> On 2016/03/23 12:44 PM, Gene Heskett wrote:
> >>> Greetings;
> >>>
> >>> I use mailfilter as a prefilter in front of fetchmail to nuke some
> >>> spam while it's still on the server.
> >>>
> >>> But it's missing hits on what I suspect is the From: or
> >>> Return-Path: strings that have quotation marks in the string
> >>> because the string is being spec'd by being surrounded by "show
> >>> this name" bs.
> >>>
> >>> I've added the character < as part of the string it's to search
> >>> for, so the search string now looks like
> >>> "From:.*<*\.unwanted-tld". Does this stand that famous snowball's
> >>> chance in hell of working well with or without a quoted "some
> >>> funkity name" in front of the real url with the <> around it?
> >>>
> >>> I just love the lack of documentation on how this string
> >>> comparison stuff works as shown by the man pages for grep and
> >>> regex. All sorts of control options are well covered, but
> >>> figuring out how to write a search expression must be one of the
> >>> world's better guarded secrets.
> >>>
> >>> So if someone could show me, or give a url that actually has the
> >>> full docs, I'd be grateful.
> >>>
> >>> Thanks.
> >>>
> >>> Cheers, Gene Heskett
> >>
> >> Hi Gene,
> >> "From:.*<*\.unwanted-tld" will match a string like this (I have put
> >> one section per line to be clearer):
> >> From:
> >> whatever character
> >> 0 or more <
> >> .unwanted-tld
> >
> > I thought I wanted 1 only, but the way these lowlifes change
> > addresses and names hourly, they may remove the <> surrounding the
> > real source address and screw me up. But the fact that they often
> > put dbl-quotes around the throwaway part of the url, is I think
> > screwing me regardless.
> >
> > What we need is the ability to specify the quote character by the
> > first non-space character after the DENY =, which is currently a "^
> > or a <> which apparently inverts the logic. So a typical line would
> > be
> >
> > DENY = "^From:.*<*\.bid"
> >
> > Substitute any of the new tld's for bid that gets obnoxious. Like
> > xyz, or .pro, heck that new list is several dozen tld's.
> >
> > But AFAIK, we're stuck with the dblquote wrapper around the string
> > to match. Grrrr.
> >
> >> It is greedy, so it will scan until the last < if there are more
> >> than one. Not sure if this is what you need or not. If you can post
> >> an example of what you need to match, I can work out another regex
> >> if required.
> >
> > Try this:
> >
> > "-Bed Bugs-" <-BedBugs-@...>
> >
> > with Return-Path.* or From.* in front of it. Or does that - sign, 4
> > of them, need escaping with a \ ? IDK.
> >
> > Thanks Michele.
> >
> >> Cheers
> >> Michele
> >
> > I converted about 3 lines of the filterdata file that way, and I'm
> > now waiting for the next blast of spam to serve as test data.
> > mailfilter is a picky twit, but that hasn't given it a tummy ache
> > either, so I am hopeful.
> >
> >> PS: by the way, the internet is full of excellent documentation
> >> about regex ;-) For example "http://www.regular-expressions.info/"
> >
> > Cheers, Gene Heskett
>
> Hi Gene,
> so if I understand correctly, you already had a set of rules like
> DENY = "^From:.*\.bid" (bid stands for any tld of your choice)
> but it was missing some entries because of the "..." entry before the
> domain. So you put the < in the string as well.
> Right?
>
> Assuming so, it surprises me that the original version missed some
> entries, since the additional "..." field would have already been
> matched by the .* part of the pattern.
> I think there is a different reason for missing entries. Perhaps a
> blank character before "From:"? Could it be?
> You could try this other version:
> DENY = "^\s*From:.*\.bid" which ignores any separator before From:
> or
> DENY = "^\s*From:.*\.bid>" which also makes explicit that the tld is
> followed by a >.

I'll do that for the top 4 or 5 entries to see what effect it has.

> By the way, by "missing some entries" you mean that it is not
> filtering all the spam or that it is filtering some good emails as
> well?

Two consecutive spams ending in the desired hit, it nukes one and passes
the other, so I was looking for what the diff was. Kmail shows the raw
message with a tap on the v key, and I can't see any trash characters in
front of the From etc lines.

But I see in the logs that I am nuking posts from a valuable
contributor. Seems Dr. Klepp is coming in from a .biz address, so I'll
have to remove that filter line, and my apologies Nick if I have seemed
to have ignored you. I'd druther have the spam than miss your helpful
msgs.

> Final note, your current modified version is no different from the
> original, since <* (0 or more <) is preceded by .* (any sequence of
> characters). Perhaps you wanted to make <.*, but it would make no
> difference either, except for being more restrictive (i.e. there must
> be a < somewhere before the forbidden tld).
>
> Cheers
> Michele

Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>
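
For anyone who wants to check the patterns from this thread without waiting for the next blast of spam, here is a minimal sketch. It uses Python's re module as a stand-in for mailfilter's regex engine, which is an assumption; for patterns this simple the two should agree, but that is not verified here. The header lines and the example.bid / example.biz domains are made up for illustration, since the real address in the thread is elided.

```python
import re

# Hypothetical header lines; example.bid and example.biz are made-up domains.
HEADERS = [
    'From: "-Bed Bugs-" <-BedBugs-@example.bid>',  # quoted display name
    'From: <someone@example.bid>',                 # angle brackets only
    'From: someone@example.bid',                   # no angle brackets
    'From: "Nick" <nick@example.biz>',             # a different tld
]


def hits(pattern, lines=HEADERS):
    """Return one bool per line: True where the pattern matches."""
    rx = re.compile(pattern)
    return [rx.search(line) is not None for line in lines]


# The original rule, and the version with <* added.
original = hits(r'^From:.*\.bid')
with_lt = hits(r'^From:.*<*\.bid')

# The stricter suggestion: the tld must be followed by >.
strict = hits(r'^\s*From:.*\.bid>')

# A hyphen outside a [...] bracket expression is an ordinary character.
hyphen = hits(r'-Bed Bugs-')

for name, result in [
    ("original", original),
    ("with <*", with_lt),
    ("with >", strict),
    ("hyphen", hyphen),
]:
    print(f"{name:10} {result}")
```

Under these assumptions the first two patterns hit exactly the same lines, because .* already absorbs any quoted display name and any <. Only the version ending in > skips the address without angle brackets, and the unescaped hyphens match literally.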