On Wed, Apr 26, 2006 at 06:55:32PM +0100, Tom Heath wrote:
> Is one solution to this to strip inline styles from user input?
I think a more general solution would be to implement a banned content list, checked after an edit is submitted and before it is written. If a match is found, the user gets a message saying the edit was rejected and, optionally, that their IP address has been banned (for those people who thrash our servers). That said, these days spam attacks come from large numbers of IPs (probably compromised machines) rather than just one, so IP bans only go so far.
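To make the check-before-write idea concrete, here is a rough sketch. It is in Python purely for brevity and is not Wiki::Toolkit's API; the function names, the message wording, and the `write` callback are all made up for illustration, and I am assuming the banned list is a set of regular expressions.

```python
import re


def find_banned_match(text, banned_patterns):
    """Return the first banned pattern that matches text, or None.

    banned_patterns is assumed to be an iterable of regex strings.
    """
    for pattern in banned_patterns:
        if re.search(pattern, text, re.IGNORECASE):
            return pattern
    return None


def submit_edit(content, banned_patterns, write):
    """Check an edit after submission and before it is written.

    write is a hypothetical callback standing in for whatever
    actually commits the node to the wiki's store.
    """
    if find_banned_match(content, banned_patterns) is not None:
        # Nothing is written; the caller shows this message to the user.
        return "Edit rejected: it contains banned content."
    write(content)
    return "Edit saved."
```

The point of returning before `write` is called is that a rejected edit never touches the database, so there is nothing to clean up afterwards.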
The banned content list should probably be kept in a local file and be read in by Wiki::Toolkit. The advantage of keeping it in a file is being able to update it with a cron job that retrieves it from some central location.
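Something like the following could do the reading, as a sketch only. The file format (one regular expression per line, with blank lines and `#` comments ignored) is my assumption, and `load_banned_list` is a made-up name, not an existing Wiki::Toolkit method. It is written in Python just to keep it short.

```python
import re


def load_banned_list(path):
    """Read a banned content file into a list of compiled patterns.

    Assumed format: one regex per line; blank lines and lines
    starting with '#' are skipped.
    """
    patterns = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            patterns.append(re.compile(line, re.IGNORECASE))
    return patterns
```

Because the list is just a file on disk, the cron job only needs to fetch a new copy and replace the old one; the wiki picks up the change the next time it reads the file.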
Blacklists of spam URLs already exist, e.g. http://blacklist.chongqed.org/ , but are gigantic as you can see. Rather than comparing against these every time an edit is made and using scads of memory, I can see two options:
1) Maintain our own list of known OG-targeting spammers somewhere on openguides.org. The disadvantage is that it goes out of date every time a new spammer turns up: someone has to add the new entry manually, and then the spam in question has to be removed. The advantage is that the list will stay quite small.
2) Use a blacklist such as the Chongqed one, but only check against it if the body text or metadata contains a URL. That way the expensive blacklist check should happen fairly rarely.
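Option 2 could look roughly like this. Again it is a Python sketch under assumptions: `blacklist_lookup` is a hypothetical callable that answers "is this host blacklisted?", and I am not claiming anything about the actual format of the Chongqed list. The URL regex is deliberately simple and will miss links without an `http://` or `https://` prefix.

```python
import re
from urllib.parse import urlsplit

# Deliberately simple: only catches explicit http:// and https:// links.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_hosts(*fields):
    """Collect the hostnames of all URLs found in the given text fields."""
    hosts = set()
    for field in fields:
        for url in URL_RE.findall(field or ""):
            url = url.rstrip(".,;:!?)")  # drop trailing sentence punctuation
            try:
                host = urlsplit(url).hostname
            except ValueError:
                continue  # malformed URL, nothing to look up
            if host:
                hosts.add(host)
    return hosts


def is_spam(body, metadata_values, blacklist_lookup):
    """Consult the blacklist only if the edit contains at least one URL."""
    hosts = extract_hosts(body, *metadata_values)
    if not hosts:
        return False  # no URLs, so the big blacklist is never touched
    return any(blacklist_lookup(host) for host in hosts)
```

An edit with no links returns early without calling `blacklist_lookup` at all, which is where the saving comes from.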
There's also Ivor's Wiki::Toolkit plugin for Simon Cozens' SpamMonkey, which I don't mean to ignore in this post; I just don't know anything about how it works under the hood.