On 25/07/07, Paul Makepeace <paulm(a)paulm.com> wrote:
> I shut off Pg and the load stayed the same. Taking apache-perl down it
> went to 0%us. Bringing apache-perl & Pg back up it's ok again. So it
> suggests something's getting wedged. Of course, it's possible it's not
> actually OpenGuides, but OG gets by far the most number of hits, and
> has the most index.cgi processes hanging around, and it is the only Pg
> user.
Okay, a quick perusal of the latest logfile shows a couple of notable things.
1) Somebody at 80.68.93.162 has been slurping the whole site, indexes,
pages, format varieties (rdf, raw) and all with WWW::Mechanize at the
rate of one request per second.
2) Spam. Lots and lots of spam.
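For anyone wanting to repeat the check on the slurper, here is a rough sketch of how the busiest clients can be tallied from an access log. It assumes the log is in Apache's Combined Log Format; the sample lines, the user-agent strings and the function name top_clients are made up for illustration and are not taken from the real logfile.

```python
import re
from collections import Counter

# Combined Log Format (assumed):
# ip ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def top_clients(lines, n=5):
    """Return the n busiest (ip, agent) pairs with their request counts."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m:  # silently skip lines that are not in the assumed format
            counts[(m.group("ip"), m.group("agent"))] += 1
    return counts.most_common(n)

# Invented sample lines, only to show the shape of the input.
sample = [
    '80.68.93.162 - - [25/Jul/2007:10:00:00 +0100] "GET /index.cgi?id=Home;format=rdf HTTP/1.1" 200 512 "-" "WWW-Mechanize/1.20"',
    '80.68.93.162 - - [25/Jul/2007:10:00:01 +0100] "GET /index.cgi?id=Home;format=raw HTTP/1.1" 200 498 "-" "WWW-Mechanize/1.20"',
    '192.0.2.7 - - [25/Jul/2007:10:00:01 +0100] "GET /index.cgi?id=Home HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

for (ip, agent), hits in top_clients(sample):
    print(hits, ip, agent)
```

On a real logfile the lines would come from open(path) rather than a list, and anything near the top with a scripted user agent is worth a closer look.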
So what I'm going to do is this:
a) Shut down the site immediately and put it in "maintenance mode"
(i.e. the "out of order sign")
b) Bring it back up as read-only
c) Install a totally clean new version from the latest release, with
spam prevention features
d) Try to find as many different kinds of spam as I can in the
database and get them into the filters
e) Clean the database of spam and vacuum it
f) Bring the site back up - warning Paul first - and monitor for load levels.
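To give an idea of what step d) might amount to, here is a minimal sketch of a content check. It is not how the OpenGuides spam prevention works; the blocked terms, the link threshold and the name looks_like_spam are all assumptions for illustration, and the real filter list would come from whatever turns up in the database.

```python
import re

# Matches http/https URLs up to the next whitespace or closing delimiter.
URL = re.compile(r'https?://[^\s\]\)"<>]+', re.IGNORECASE)

# Hypothetical filter terms; real ones would be collected from actual spam.
BLOCKED_TERMS = ("cheap pills", "online casino")

def looks_like_spam(text, max_links=5):
    """Flag text that contains a blocked term or more than max_links URLs."""
    lowered = text.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return True
    return len(URL.findall(text)) > max_links

print(looks_like_spam("A pub near the station. See http://example.com/"))
print(looks_like_spam("Visit our online casino today"))
```

A check like this could be run over existing page content to find candidates for step e) as well as on new edits, but anything it flags would still want a human look before deletion.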
a) and b) are going to happen this afternoon; c), d) and e) hopefully
within a day or two. f) as soon as e) is complete. If load *still*
goes too high after the spamwall has been raised, it may be indicative
of a deeper problem meriting heavy investigation, and I'll put the
site back into maintenance mode again.
Incidental thought: since the original robots.txt standard doesn't allow
wildcards in paths, perhaps we should put 'rel="nofollow"' on all our
links to resources useless to search engines. I believe the
better-behaved robots should respect this.
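Roughly what I have in mind for the links, as a sketch only: the helper name link and the example URLs are hypothetical, and this is not the OpenGuides template code.

```python
from html import escape

def link(href, label, nofollow=False):
    """Build an anchor tag, optionally marked rel="nofollow"."""
    rel = ' rel="nofollow"' if nofollow else ""
    return '<a href="%s"%s>%s</a>' % (escape(href, quote=True), rel, escape(label))

# Ordinary page link: left alone.
print(link("index.cgi?id=Home", "Home"))
# Alternate-format link of no use to a search engine: marked nofollow.
print(link("index.cgi?id=Home;format=rdf", "RDF", nofollow=True))
```

Whether a given crawler then actually skips those links is up to the crawler, so this would be a hint rather than a barrier.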
Thanks,
Earle.
--
Earle Martin
http://downlode.org/
http://purl.org/net/earlemartin/