On 25/07/07, Paul Makepeace <paulm@paulm.com> wrote:
> I shut off Pg and the load stayed the same. Taking apache-perl down it went to 0%us. Bringing apache-perl & Pg back up it's ok again. So it suggests something's getting wedged. Of course, it's possible it's not actually OpenGuides, but OG gets by far the most number of hits, and has the most index.cgi processes hanging around, and it is the only Pg user.
Okay, a quick perusal of the latest logfile shows a couple of notable things.
1) Somebody at 80.68.93.162 has been slurping the whole site, indexes, pages, format varieties (rdf, raw) and all with WWW::Mechanize at the rate of one request per second.
2) Spam. Lots and lots of spam.
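For anyone who wants to repeat the check on their own logs, here is a rough sketch of the kind of tally that makes a slurping client stand out. It assumes the server writes Apache's "combined" log format, which may not match the actual LogFormat in use; the function name top_clients is made up for illustration.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern if the
# server's LogFormat differs.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def top_clients(lines, limit=10):
    """Return the most frequent (ip, user agent) pairs in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that do not fit the assumed format
            counts[(match.group("ip"), match.group("agent"))] += 1
    return counts.most_common(limit)

# Usage (the path is hypothetical):
#   with open("access.log") as fh:
#       for (ip, agent), hits in top_clients(fh):
#           print(hits, ip, agent)
```

A single address with a script-like user agent at the top of that list, far ahead of everyone else, is the pattern described in point 1.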
So what I'm going to do is this:
a) Shut down the site immediately and put it in "maintenance mode" (i.e. the "out of order sign")
b) Bring it back up as read-only
c) Install a totally clean new version from the latest release, with spam prevention features
d) Try and find as many different kinds of spam as I can from the database and get them into the filters
e) Clean the database of spam and vacuum it
f) Bring the site back up - warning Paul first - and monitor for load levels.
a) and b) are going to happen this afternoon. c) d) and e) hopefully within a day or two. f) as soon as e) is complete. If load *still* goes too high after the spamwall has been raised, it may be indicative of a deeper problem meriting heavy investigation, and I'll put the site back into maintenance mode again.
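To illustrate what step d) amounts to, here is a minimal sketch of a pattern-based check on submitted content. It is not OpenGuides' own spam prevention mechanism; the patterns, the link threshold, and the name looks_like_spam are all assumptions, and real patterns would have to come from the spam actually found in the database.

```python
import re

# Hypothetical patterns: markup that legitimate wiki edits are assumed
# not to contain. Replace with patterns harvested from real spam.
SPAM_PATTERNS = [
    re.compile(r"\[url=", re.IGNORECASE),      # BBCode-style links
    re.compile(r"<a\s+href=", re.IGNORECASE),  # raw HTML anchors
]

# Assumed threshold: more external links than this in one edit is suspicious.
MAX_LINKS = 5

URL = re.compile(r"https?://", re.IGNORECASE)

def looks_like_spam(content):
    """Return True if the content trips the link limit or any known pattern."""
    if len(URL.findall(content)) > MAX_LINKS:
        return True
    return any(pattern.search(content) for pattern in SPAM_PATTERNS)
```

The same function could be run over existing page contents to shortlist candidates for step e), though anything it flags would still want a human look before deletion.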
Incidental thought: since robots.txt doesn't allow wildcards, perhaps we should put 'rel="nofollow"' on all our links to resources useless to search engines. I believe the better-behaved robots should respect this.
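As a sketch of the nofollow idea, the link-rendering code could add the attribute whenever a URL points at an alternate format. This assumes the format is selected by a format=rdf or format=raw query parameter; the helper render_link and the pattern are hypothetical, not part of any existing code.

```python
import re
from html import escape

# Assumption: alternate-format views carry a format=rdf or format=raw
# parameter, separated by ?, ; or & in the URL.
NOFOLLOW_PATTERN = re.compile(r"[?;&]format=(rdf|raw)\b")

def render_link(href, text):
    """Build an HTML link, adding rel="nofollow" for alternate-format URLs."""
    rel = ' rel="nofollow"' if NOFOLLOW_PATTERN.search(href) else ""
    return f'<a href="{escape(href, quote=True)}"{rel}>{escape(text)}</a>'
```

Ordinary page links come out unchanged, so only the rdf and raw variants are marked.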
Thanks,
Earle.