I think pretty much every OpenGuide has been hit by the casino spammers -
basically they edit Wiki Etiquette and Wiki Info to include their URL
as many times as possible, trying to capitalise on our googlejuice.
We implemented admin delete to deal with this, but it does seem to be
the same spam over and over again, so guide admins are having to
delete it over and over again. Luckily we have a tool to deal with
automating repetitive tasks.
I would like the spam filtering to be optional, extensible and
configurable. Here is a suggested config file for it (see Config::Tiny
for syntax).
----------------------------------------------------------------------
[discard_edit]
locale = www.casino-online-online.com
category = www.casino-online-online.com
[discard_edit_regex]
locale = www\..*
category = www\..*
[notify_edit]
notify_to_email = kake(a)earth.li
username = Anonymous
----------------------------------------------------------------------
(I don't plan to implement notify_edit yet, just stuck it in there to
show that this form of config file is quite extensible. Dare I say it,
this might be a good task to be implemented as a plugin.)
When someone/thing tries to commit an edit that matches a discard
rule, they'll get an error message explaining what they did that was
naughty. So you could use this to block anonymous edits too, by
adding the line "username = Anonymous" to the discard_edit section.
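
A minimal sketch of how the discard rules above might be checked when an edit is committed. Everything named here is hypothetical: the subs `parse_config` and `discard_reason`, the layout of the edit metadata (field name to list of values), and the choice to anchor each regex to the whole value. The small inline parser is only a stand-in so the sketch needs no non-core modules; a real implementation would presumably read the file with Config::Tiny.

```perl
use strict;
use warnings;

# Stand-in for Config::Tiny: parse "[section]" / "key = value" lines
# into a hash of hashes, using core Perl only.
sub parse_config {
    my ($text) = @_;
    my ( %config, $section );
    for my $line ( split /\n/, $text ) {
        next if $line =~ /^\s*(?:[#;].*)?$/;    # skip blanks and comments
        if ( $line =~ /^\s*\[\s*(.+?)\s*\]\s*$/ ) {
            $section = $1;
        }
        elsif ( defined $section && $line =~ /^\s*([^=]+?)\s*=\s*(.*?)\s*$/ ) {
            $config{$section}{$1} = $2;
        }
    }
    return \%config;
}

# Return a human-readable reason if the edit should be discarded,
# or undef if it is acceptable.
# ASSUMPTION: $edit maps each metadata field to an arrayref of values.
sub discard_reason {
    my ( $config, $edit ) = @_;
    for my $field ( sort keys %$edit ) {
        my $banned  = $config->{discard_edit}{$field};
        my $pattern = $config->{discard_edit_regex}{$field};
        for my $value ( @{ $edit->{$field} } ) {
            return "$field '$value' is not allowed"
                if defined $banned && $value eq $banned;
            # ASSUMPTION: the regex must match the whole value.
            return "$field '$value' matches a banned pattern"
                if defined $pattern && $value =~ /\A(?:$pattern)\z/;
        }
    }
    return;
}

my $config = parse_config(<<'END');
[discard_edit]
username = Anonymous

[discard_edit_regex]
locale = www\..*
category = www\..*
END

my $spam = { locale => ['www.casino-online-online.com'], username => ['Kake'] };
my $ok   = { locale => ['Hammersmith'], category => ['Pubs'], username => ['Kake'] };

print discard_reason( $config, $spam ) // 'accepted', "\n";
print discard_reason( $config, $ok )   // 'accepted', "\n";
```

The returned string could serve as the error message shown to whoever tried to commit the edit.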
Comments? Is there a better way to structure the file without having
to add any more prereqs? (I was thinking YAML, but that's yet another
prereq and people already complain about how many there are, plus it's
another syntax for people to learn.)
In particular, this syntax doesn't let us say things like 'Disallow
edits if EITHER the locale is "foo" OR (the locale is "Hammersmith"
AND the username is "Anonymous")'. I *don't* plan to implement
anything that complex right now, because it's not *needed*, but I
don't want to tie myself into a format that precludes doing it later.
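
To make the EITHER/OR example concrete, here is one way such a rule could be represented and evaluated once it has been read in from some richer format. This is only a sketch under assumptions: the `any`/`all` keys, the sub name `rule_matches` and the flat edit hash (one value per field) are all invented for illustration, and fields literally called "any" or "all" would clash with it.

```perl
use strict;
use warnings;

# Hypothetical rule tree: { any => [...] }, { all => [...] },
# or a leaf { field => value } that tests metadata fields for equality.
sub rule_matches {
    my ( $rule, $edit ) = @_;
    if ( my $subrules = $rule->{any} ) {
        for my $sub (@$subrules) {
            return 1 if rule_matches( $sub, $edit );
        }
        return 0;
    }
    if ( my $subrules = $rule->{all} ) {
        for my $sub (@$subrules) {
            return 0 unless rule_matches( $sub, $edit );
        }
        return 1;
    }
    # Leaf: every field named in the rule must equal the edit's value.
    for my $field ( keys %$rule ) {
        return 0
            unless defined $edit->{$field} && $edit->{$field} eq $rule->{$field};
    }
    return 1;
}

# 'Disallow if EITHER locale is "foo" OR (locale is "Hammersmith"
# AND username is "Anonymous")'
my $rule = {
    any => [
        { locale => 'foo' },
        { all => [ { locale => 'Hammersmith' }, { username => 'Anonymous' } ] },
    ],
};

for my $edit (
    { locale => 'foo',         username => 'Kake' },
    { locale => 'Hammersmith', username => 'Anonymous' },
    { locale => 'Hammersmith', username => 'Kake' },
  )
{
    printf "%-12s %-10s %s\n", $edit->{locale}, $edit->{username},
        rule_matches( $rule, $edit ) ? 'discard' : 'accept';
}
```

A flat section/key/value file cannot hold this nesting directly, which is the limitation described above.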
Kake
Hiya.
The Nottingham OpenGuide[1] takes a considerable time to load the front
page; if the machine it's on is using its CPU for anything else at the
time, it takes even longer. While it's loading, mysqld is taking up ~90%
of the CPU time according to 'top', running (as far as I can tell) the
query that generates the list of recent changes.
I had just put it down as One Of Those Things, but I noticed that the
other, larger guides such as Oxford and London *don't* have this
problem. So, I'm left thinking it's something I've not configured
properly -- has anyone got any clues what that might be? People have
mentioned MySQL indexes to me, but my understanding of what's going on
with them is roughly nil.
Any suggestions are very welcome indeed...
Cheers,
James.
[1] http://nottingham.openguides.org/
Haven't looked at the module yet, don't know where he's getting the
data from.
Kake
----- Forwarded message from william ross <will(a)spanner.org> -----
From: william ross <will(a)spanner.org>
Date: Tue, 31 Aug 2004 19:39:06 +0100
To: CDBI <cdbi-talk(a)groups.kasei.com>
Subject: Geo::Postcode
Hello,
I've just finished (for now) and uploaded the module that I use to work
with UK postcodes. It handles validation, parsing, location and the
common distance-to-here calculation in a fairly economical way, and it
can deal transparently with postcodes at different resolutions (i.e. it
will tell you how far it is from 'SW1' to 'EC1Y 8PQ'), to allow for the
various data sets that are available.
At the moment it's up as Geo::Postcode, but that may well change in
order to sit nicely next to similar modules from other countries.
The accuracy of its measurements depends on the data set that it's
working with. It comes with a basic set of nearly 3000 points that
pretty much covers all the postcode areas (like 'E1' or 'LA23'), but no
more. That's enough to get started and rough out an application, but
for serious public use you will need to license the use of a proper
postcode set. The module is designed to accept all sorts of data
sources without too much trouble.
I mention it here because I use it mostly with CDBI apps, so it is
designed to inflate and deflate cleanly as a has_a. If a class already
has a separate postcode field, it can just be dropped in:

    My::Person->has_a( postcode => 'Geo::Postcode' );
    My::Cinema->has_a( postcode => 'Geo::Postcode' );

    my $distance = $person->postcode->distance_from( $cinema->postcode );

...and much more.
I hope it turns out to be useful, and I'd be very glad of feedback and
improvements.
best
will
ps. if it hasn't reached your mirror, try www.spanner.org/postcodes/
----- End forwarded message -----