Hi,
The Oxford guide is having problems with some clients mangling non-ASCII characters (the é, in this particular case) in Category links:
http://www.ox.compsoc.net/oxfordguide/
This is a particular problem because a GET request for a category that does not yet exist will auto-create it, so each mangled link adds a junk category to the guide. The culprit appears, in the most recent case, to be this:
msnbot64124.search.msn.com - - [24/May/2004:10:00:57 +0100] "GET /oxfordguide/?id=Category%20Caf%C3%83%C6%92%C3%86%E2%80%99f%C3%83%C6%92%C3%A2%E2%82%AC%C5%A1%C3%83%E2%80%9A%C3%82%C2%A9s;format=rdf HTTP/1.0" 200 1506 "-" "msnbot/0.11 (+http://search.msn.com/msnbot.htm)"
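For reference, the escapes in that request look like what you get when UTF-8 bytes are repeatedly misread as a single-byte charset and re-encoded. Below is a minimal Python sketch of that effect, assuming Windows-1252 is the charset involved (a guess on my part, not something established); `misdecode_once` and `show_rounds` are made-up names for illustration. It does not claim to reproduce the logged URL exactly, only the way the escape sequence grows with each pass.

```python
from urllib.parse import quote


def misdecode_once(text: str) -> str:
    """Encode as UTF-8, then wrongly decode those bytes as Windows-1252.

    Assumption: cp1252 is the charset being confused with UTF-8.
    """
    return text.encode("utf-8").decode("cp1252")


def show_rounds(text: str, rounds: int) -> list:
    """Return the percent-encoded form after 0..rounds mis-decoding passes."""
    results = [quote(text)]
    for _ in range(rounds):
        text = misdecode_once(text)
        results.append(quote(text))
    return results


if __name__ == "__main__":
    # Each pass roughly doubles the number of escapes standing in for the é.
    for i, escaped in enumerate(show_rounds("Category Cafés", 3)):
        print(i, escaped)
```

Pass 0 is the correct link; each later pass is what a client or server would emit if it made the same mistake once more.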
I haven't yet looked closely into whether the bot or our side is at fault, but has anyone else seen this problem? This needs fixing in two ways:
- the site should not create categories on the basis of a GET request (RFC 2616 sec 9.1.1, which says GET should be a safe method), so categories need to be created at the time that they are entered into a node, rather than on first access.
- it would be nice to locate the source of this charset confusion, but if the above is fixed this is a secondary consideration (it may turn out that the problem is entirely brokenness on the client side, in which case it can be safely ignored).
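
A rough sketch of the first point, in Python rather than the guide's own code. `Guide`, `get` and `save` are made-up names and the in-memory dict stands in for the real storage, so treat it as an illustration of the intended behaviour only: reads never write, and category nodes come into existence when a node naming them is saved.

```python
class Guide:
    """Toy in-memory wiki; a hypothetical stand-in, not the real guide code."""

    def __init__(self):
        # node name -> {"content": str, "categories": list of str}
        self.nodes = {}

    def get(self, name):
        """Handle a read (GET): never writes; returns None for a missing node."""
        return self.nodes.get(name)

    def save(self, name, content, categories=()):
        """Handle a write (POST): store the node and create its categories."""
        self.nodes[name] = {"content": content, "categories": list(categories)}
        for category in categories:
            # Category nodes are created here, at edit time, not on first access.
            self.nodes.setdefault(
                "Category " + category, {"content": "", "categories": []}
            )


if __name__ == "__main__":
    guide = Guide()
    guide.save("Example Cafe", "Placeholder text", ["Cafés"])
    print(guide.get("Category Cafés") is not None)  # category exists after the save
    print(guide.get("Category CafÃ©s"))  # mangled name: nothing found, nothing created
    print(sorted(guide.nodes))
```

With this arrangement a mangled link from a bot just gets a "not found" response and leaves the guide unchanged.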
Cheers,
Dominic.