[leafnode-list] duplicate articles

Matthias Andree matthias.andree at gmx.de
Mon May 17 11:05:39 CEST 2004


Rick Pasotto wrote on 2004-05-16:

> When I brought this up before I really didn't try to fix it but today it
> happened again. I got 15000+ new (actually old) articles in a newsgroup
> that might have 100 on a heavy day. The excess are all old and are
> duplicates of articles already in the spool.

Well, the machine is, as you show below, suffering from spool
corruption. You wrote on April 18th that you were using leafnode 1.9.52;
I presume this is still true?

If it is, please check
http://www.projectcolo.org.uk/~broonie/leafnode/
which has 1.9.54.rc2 debs for Debian Linux on i386 and powerpc, built by
Mark Brown. Note that they ship without guarantee, as usual, but bug
reports will be processed by Mark and by me, of course. Feedback is
solicited!

texpire in leafnode 1.9.52 and earlier would not repair the spool of a
group you had set to "never expire" (groupexpire this.group = -1; note
that the leafnode-2 syntax differs slightly).

> So... I grepped for all 'Message-ID:' lines in
> /var/spool/news/triangle/general and then sorted the list to find
> duplicates. As an example, for one Message-ID: there were four article
> numbers (22541, 48144, 64954, and 90861). Running diff on those four
> showed that the only difference was the X-Ref: line which had the
> appropriate article id. However, the link in
> /var/spool/news/message-id/xxx was to the most recent article only. So,
> it looks like instead of rejecting a duplicate it is simply being
> overridden.
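
For anyone who wants to repeat that check, here is a minimal sketch in
Python rather than grep and sort. It assumes that article files in a
group directory are named by their article number and that the
Message-ID header is not folded across lines; the function names
read_message_id and find_duplicates are made up for this example and are
not part of leafnode.

```python
import os
from collections import defaultdict


def read_message_id(path):
    """Return the Message-ID header value of an article file, or None."""
    with open(path, "rb") as f:
        for raw in f:
            line = raw.rstrip(b"\r\n")
            if not line:
                # A blank line ends the header section.
                return None
            if line.lower().startswith(b"message-id:"):
                return line.split(b":", 1)[1].strip().decode("latin-1")
    return None


def find_duplicates(group_dir):
    """Map each Message-ID seen more than once to its sorted article numbers."""
    by_mid = defaultdict(list)
    for name in os.listdir(group_dir):
        path = os.path.join(group_dir, name)
        # Assumption: only all-digit file names are articles.
        if not name.isdecimal() or not os.path.isfile(path):
            continue
        mid = read_message_id(path)
        if mid is not None:
            by_mid[mid].append(int(name))
    return {mid: sorted(nums) for mid, nums in by_mid.items() if len(nums) > 1}
```

Calling find_duplicates("/var/spool/news/triangle/general") would then
list every Message-ID that occurs under more than one article number.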

In its first pass, texpire iterates over all groups by name (for
example, triangle, triangle/general, ...) and determines whether any
article in them needs to be re-linked. It decides this by looking at the
link count only: if the count is 1, it re-links the article. In either
case, it marks the article as seen and stores the Message-ID in a
message.id/*/mids file (since 1.9.52). In the second pass, it removes
all links from message.id/* that have not been marked in the mids files
or that have a link count of 1. (Any write error on the mids files
switches off mids-file-based expiry and reverts to link-count-based
expiry only, which was the only functionality available in 1.9.51 and
older.)
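
To make the two passes concrete, here is a rough Python model of the
logic described above. It is not texpire's code. The flat store layout
in link_path is hypothetical (leafnode's real message.id hashing is not
reproduced), the mids files are replaced by an in-memory set, and
duplicate handling is left out.

```python
import os
from urllib.parse import quote, unquote


def read_message_id(path):
    """Return the Message-ID header value of an article file, or None."""
    with open(path, "rb") as f:
        for raw in f:
            line = raw.rstrip(b"\r\n")
            if not line:
                return None  # end of header
            if line.lower().startswith(b"message-id:"):
                return line.split(b":", 1)[1].strip().decode("latin-1")
    return None


def link_path(store_dir, mid):
    # Hypothetical flat layout: one file per Message-ID, percent-encoded.
    return os.path.join(store_dir, quote(mid, safe=""))


def first_pass(group_dirs, store_dir):
    """Re-link articles whose link count is 1; return the set of seen IDs."""
    seen = set()  # stands in for the mids files
    for group_dir in group_dirs:
        for name in os.listdir(group_dir):
            path = os.path.join(group_dir, name)
            if not name.isdecimal() or not os.path.isfile(path):
                continue
            mid = read_message_id(path)
            if mid is None:
                continue
            target = link_path(store_dir, mid)
            # Link count 1 means the message.id link is missing.
            if os.stat(path).st_nlink == 1 and not os.path.exists(target):
                os.link(path, target)
            seen.add(mid)  # marked as seen in either case
    return seen


def second_pass(store_dir, seen):
    """Remove store links that were not marked or have a link count of 1."""
    removed = []
    for name in os.listdir(store_dir):
        path = os.path.join(store_dir, name)
        mid = unquote(name)
        if mid not in seen or os.stat(path).st_nlink == 1:
            os.unlink(path)
            removed.append(mid)
    return removed
```

Running first_pass and then second_pass on a copy of a spool shows which
links would be recreated and which would be dropped under these
assumptions.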

I do not know why your message.id/* links are missing, but that is the
cause of the duplicate downloads and of your slrn problems (fetch
parent, fetch thread and the like).

> Now I could go through triangle/general and delete all files
> with a link count of one but that would leave the one with the most
> recent time stamp and therefore the highest article number and that
> would not be correct and would leave my .newsrc file screwed.

Indeed. texpire 1.9.53 and newer will keep the first file to have that
Message-ID, without lowering the "high article" count.
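
As an illustration of that policy only, the following Python sketch
plans which copies to drop. It assumes that "first" means the lowest
article number, which is a guess for this example, and plan_dedup is a
made-up name, not a leafnode function.

```python
def plan_dedup(duplicates, high_mark):
    """Decide which duplicate articles to keep and which to remove.

    duplicates maps a Message-ID to the article numbers that carry it.
    high_mark is the group's current "high article" number; it is
    returned unchanged so that readers' .newsrc files stay valid.
    """
    keep = {}
    remove = []
    for mid, numbers in duplicates.items():
        ordered = sorted(numbers)
        keep[mid] = ordered[0]       # assumed: lowest number is "first"
        remove.extend(ordered[1:])   # later copies are dropped
    return keep, sorted(remove), high_mark
```

With the four article numbers from the example above (22541, 48144,
64954 and 90861), this keeps 22541, schedules the other three for
removal, and leaves the high mark where it was.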

-- 
Matthias Andree

Encrypted mail welcome: my GnuPG key ID is 0x052E7D95


