[leafnode-list] Re: Filtering articles

clemens fischer ino-news at spotteswoode.dnsalias.org
Sun Mar 27 23:24:51 CEST 2011


Paul Brooks wrote:

> I see that I can remove articles using applyfilter by filtering on the
> contents of the article headers before and/or after fetching the
> article. But is there a way of filtering the articles with reference
> to the contents of the article body itself? Even if it means using
> grep and then deleting, if possible, the offending article in some
> way. 

The Lua version has this feature amongst others, but it is (still)
incompatible with any old-school filtering in leafnode.

If you don't want Lua, you could issue a "find" command to get a list of
recently pulled articles, "grep" them and remove according to "grep"
return status.

You could make a small shell script like this (untested):

#+v

#!/bin/sh
#
# filter-and-delete.sh
#
# mind your step and all pathnames and patterns here!
#
patterns="${1:?need patterns to check against!}"
bad_articles="${2:?need directory to put bad articles into!}"
check_this="${3:?need a file to check!}"
grep="/bin/egrep"
copy="/bin/cp"
sed="/bin/sed -r -n"
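# print the first Message-ID header only, then quit, so that a
# "Message-ID:" line quoted in the body cannot end up in ${bad_mid}
sed_mid="/^Message-ID:/{s/^Message-ID:[[:space:]]+(.*)$/\1/p;q}"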
delete="/usr/local/sbin/texpire -C"
bad_mid=""
# exit status to use when nothing matched
ex=0
${grep} "${patterns}" "${check_this}" &&
bad_mid=$(${sed} -e "${sed_mid}" < "${check_this}")
[ -n "${bad_mid}" ] && {
    ${copy} "${check_this}" "${bad_articles}/${bad_mid}" &&
    ${delete} "${bad_mid}"
    ex=$?
}
exit ${ex}

#-v
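
Note that the script above greps the whole file, headers included.  If
you only want to match against the article body, something like the
following might do as a starting point.  It is only a sketch: the
function name "body_matches" is made up here, and it assumes the usual
article layout where the first empty line ends the headers.

```shell
#!/bin/sh
# body_matches PATTERN FILE   (hypothetical helper, not part of leafnode)
#
# Succeeds if the extended regex PATTERN matches somewhere in the body
# of FILE, i.e. in the part after the first empty line.
body_matches() {
    # drop everything up to and including the first empty line,
    # then let grep's exit status decide
    sed -e '1,/^$/d' "$2" | grep -E -q -e "$1"
}
```

In the script, the "${grep}" line could then be swapped for a call like
'body_matches "${patterns}" "${check_this}"'.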

Use it like this (equally well tested :)

- "true > /tmp/bad-articles/some-file" before the start of "fetchnews",
- "fetchnews ..."
- "find /var/spool/news/<top-hierarchy> -type f \
    -newer /tmp/bad-articles/some-file -print0 |
    xargs -0 -n1 filter-and-delete.sh 'patterns' /tmp/bad-articles"
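
The three steps could be wrapped up roughly as follows.  Again just a
sketch: the function name and the FETCH and FILTER variables are
placeholders of mine, and the argument order assumes the script takes
patterns, bad-article directory and file, in that order.

```shell
#!/bin/sh
# FETCH and FILTER are placeholders; both are word-split on purpose,
# so they may carry options.
: "${FETCH:=fetchnews}"
: "${FILTER:=filter-and-delete.sh}"

# filter_new_articles SPOOL PATTERNS BAD_DIR   (hypothetical wrapper)
filter_new_articles() {
    spool="${1:?need spool directory}"
    patterns="${2:?need patterns}"
    bad_dir="${3:?need directory for bad articles}"
    stamp="${bad_dir}/some-file"

    mkdir -p "${bad_dir}" || return 1
    # timestamp file: anything newer than this was pulled by this run
    true > "${stamp}" || return 1
    ${FETCH} || return 1
    # hand each new article to the filter script, one file per call
    find "${spool}" -type f -newer "${stamp}" -print0 |
        xargs -0 -n1 ${FILTER} "${patterns}" "${bad_dir}"
}
```

Call it like 'filter_new_articles /var/spool/news/<top-hierarchy>
"patterns" /tmp/bad-articles'.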

Afterwards and with luck you'll find bad articles in the directory
mentioned.  They will also have been expired from any overview files.

You'd clean out the directory with bad articles from time to time, of
course.
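
For that cleanup, something along these lines might be enough.  The
function name and the 30-day default are arbitrary choices of mine, so
adjust to taste.

```shell
#!/bin/sh
# clean_bad_articles DIR [DAYS]   (hypothetical helper)
#
# Removes regular files under DIR that were last modified more than
# DAYS days ago (30 if DAYS is not given).
clean_bad_articles() {
    find "${1:?need directory to clean}" -type f \
        -mtime +"${2:-30}" -exec rm -f {} +
}
```

E.g. 'clean_bad_articles /tmp/bad-articles 14' from cron.  An old
timestamp file in that directory goes too, but it is recreated before
the next "fetchnews" run anyway.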


clemens



