hello,
i've been crawling the OG pages' rdf to use in our wireless portal application. i have a longish list of pages that gave me back toxic RDF/XML; it follows here.
most of this i think is one of two things: URIs included as regular string data that have '&' characters in them, or the odd windows or unicode special character in the URI name.
better detox on the RDF pages would be hella cool; i would love to have summaries or full page content attached too, and detoxifying those is maybe a bit harder...
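
to make the '&' problem concrete, here is a rough sketch in python of the kind of detox i mean. it is not the openguides code; detox_text, is_well_formed, the template and the example.org URI are all made-up names for illustration, and it assumes the text has already been decoded to unicode (so it does nothing about wrongly-encoded windows characters).

```python
import re
from xml.dom import minidom
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import escape

# control characters that XML 1.0 does not allow anywhere in a document
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def detox_text(value):
    """Drop characters XML forbids, then escape &, < and > for use as element text."""
    return escape(_ILLEGAL_XML_CHARS.sub("", value))


def is_well_formed(xml_text):
    """Return True if an XML parser accepts the text, False otherwise."""
    try:
        minidom.parseString(xml_text)
        return True
    except ExpatError:
        return False


# hypothetical URI with a raw '&' in it, used as plain string data
url = "http://example.org/index.cgi?id=Some_Page&format=rdf"

# hypothetical minimal RDF/XML wrapper, just enough to parse
template = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<rdf:Description><dc:source>%s</dc:source></rdf:Description>"
    "</rdf:RDF>"
)

raw_ok = is_well_formed(template % url)                # raw '&' breaks the parse
clean_ok = is_well_formed(template % detox_text(url))  # '&amp;' parses fine
print(raw_ok, clean_ok)
```

running every string through something like detox_text before it goes into the RDF/XML, and then checking the whole page with a real parser, would catch most of the pages listed below; full page content would need the same treatment plus whatever the wiki markup throws in.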
zx
--
http://london.openguides.org/index.cgi?id=Boathouse,_SW15_2JX;format=rdf#obj
http://london.openguides.org/index.cgi?id=Britannia,_W8_6UX;format=rdf#obj
http://london.openguides.org/index.cgi?id=British_Museum;format=rdf#obj
http://london.openguides.org/index.cgi?id=Chinese_Dinner,_SW16_5JF;format=rd...
http://london.openguides.org/index.cgi?id=Coach_And_Horses,_SW13_9LW;format=...
http://london.openguides.org/index.cgi?id=Common_Cafe,_SW16 5NP;format=rdf#obj
http://london.openguides.org/index.cgi?id=Dirty_Dick%27s,_EC2M_4NR;format=rd...
http://london.openguides.org/index.cgi?id=Eel_Pie,_TW1_3NJ;format=rdf#obj
http://london.openguides.org/index.cgi?id=Food_Bazaar,_WC1X_8TL;format=rdf#o...
http://london.openguides.org/index.cgi?id=Gili_Gulu,_WC2H_9EP;format=rdf#obj
http://london.openguides.org/index.cgi?id=Honglee,_CR0_1NG;format=rdf#obj
http://london.openguides.org/index.cgi?id=Jerusalem_Tavern,_EC1M_5NA;format=...
http://london.openguides.org/index.cgi?id=John_Lewis,_Oxford_Street;format=r...
http://london.openguides.org/index.cgi?id=Kelong,_CR0_4RF;format=rdf#obj
http://london.openguides.org/index.cgi?id=Market_Porter,_SE1_9AA;format=rdf#...
http://london.openguides.org/index.cgi?id=Morgan_M;format=rdf#obj
http://london.openguides.org/index.cgi?id=National_Gallery;format=rdf#obj
http://london.openguides.org/index.cgi?id=Okawari,_W5_5AP;format=rdf#obj
http://london.openguides.org/index.cgi?id=Osushi,_CRO_1BF;format=rdf#obj
http://london.openguides.org/index.cgi?id=Peter_Jones;format=rdf#obj
http://london.openguides.org/index.cgi?id=Pied_Bull,_SW16_3QB;format=rdf#obj
http://london.openguides.org/index.cgi?id=Rudi%27s_Sandwich_Bar,_WC1X_8TP;fo...
http://london.openguides.org/index.cgi?id=Sesame,_WC1N_3NG;format=rdf#obj
http://london.openguides.org/index.cgi?id=St_John,_EC1M_4AY;format=rdf#obj
http://london.openguides.org/index.cgi?id=Surrey_Street_Market,_Croydon;form...
http://london.openguides.org/index.cgi?id=Tagine,_SW12_9RT;format=rdf#obj
http://london.openguides.org/index.cgi?id=Tandoori_Garden,_SW6_7SR;format=rd...
http://london.openguides.org/index.cgi?id=Tesco,_E11_1HT;format=rdf#obj
http://london.openguides.org/index.cgi?id=Tesco,_E9_6ND;format=rdf#obj
http://london.openguides.org/index.cgi?id=The_Place_Below,_EC2V_6AU;format=r...
http://london.openguides.org/index.cgi?id=Victoria_Station;format=rdf#obj
http://london.openguides.org/index.cgi?id=Wonderful,_CR0_1AE;format=rdf#obj