From jkg@earth.li Sun Aug 29 23:49:38 2004
From: James Green
To: openguides-dev@lists.openguides.org
Subject: [OpenGuides-Dev] Search engines and edit/delete pages
Date: Sun, 29 Aug 2004 23:50:05 +0100
Message-ID: <20040829225005.GJ14747@jimbo.org.uk>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============3970576088439407304=="

--===============3970576088439407304==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Hiya.

It occurs to me that there's very little point in search engines
crawling certain pages of an OpenGuide - anything with "action=edit" or
"action=delete" in the URL, at the very least, has no real value to
someone searching for information. Unfortunately, unless I'm missing
something, robots.txt syntax doesn't allow for matching on anything
other than the start of the path component of a URL, which doesn't help
us here.

It's been suggested to me that ``<meta ...>'' tags in the
<head> of a page are effective here, but I'm not sure of the best way to
implement this for edit pages without implementing it for *all* pages,
which would be careless. I'm thinking of something in header.tt
conditional upon the requested URI containing "action=edit" or
"action=delete", but before I wander off learning how to talk
template::toolkit, I'd be interested to hear better suggestions -- and
other values of action we might care about, I guess.

The other, more complex but possibly "better" alternative, would be for
someone to run with the idea mooted in this thread:

Then edit pages could be /edit/Node_Name, deletes /delete/Node_Name, and
so on. This makes setting up a suitable robots.txt very simple indeed,
though it makes setting up the Apache rewrite rules a) a requirement,
instead of just a nice thing, and b) more complex than at present.

Thoughts, comments, suggestions and the like all invited.

Cheers,
James.

-- 
PGP fingerprint 3E85 0C7A FE11 42E9 A599 094D AE16 90F0 81AE 16FF, ID 81AE16FF
Fremen add life to spice!
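The robots.txt limitation described above can be checked with a short sketch using Python's standard urllib.robotparser. The host example.org, the agent name and the rewritten /edit/ and /delete/ paths are assumptions made for illustration; only the "action=edit" query-string form comes from the message itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a guide whose edit and delete pages have been
# moved to path-based URLs. This layout is an assumption, not a description
# of how any OpenGuides site is actually configured.
ROBOTS_TXT = """\
User-agent: *
Disallow: /edit/
Disallow: /delete/
"""


def build_parser(robots_txt):
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def crawlable(parser, url, agent="ExampleBot"):
    """Return True if the (hypothetical) crawler may fetch the URL."""
    return parser.can_fetch(agent, url)


parser = build_parser(ROBOTS_TXT)

# Path-based URL: the Disallow rule is a prefix of the path, so it matches.
edit_path_ok = crawlable(parser, "http://example.org/edit/Node_Name")

# Query-string URL: the path is /index.cgi, so neither prefix rule matches
# and the edit page stays crawlable.
edit_query_ok = crawlable(
    parser, "http://example.org/index.cgi?id=Node_Name;action=edit"
)

# An ordinary page view is unaffected either way.
view_ok = crawlable(parser, "http://example.org/index.cgi?id=Node_Name")
```

With prefix-only matching, a rule broad enough to catch the query-string edit URL (such as "Disallow: /index.cgi") would also block every ordinary page served by the same script, which is why the rewritten paths make robots.txt simpler.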
--===============3970576088439407304==--

From simon@rumble.net Thu Sep 9 16:02:24 2004
From: Rev Simon Rumble
To: openguides-dev@lists.openguides.org
Subject: Re: [OpenGuides-Dev] Search engines and edit/delete pages
Date: Thu, 09 Sep 2004 16:02:44 +0100
Message-ID: <20040909150244.GC11480@rumble.net>
In-Reply-To: <20040829225005.GJ14747@jimbo.org.uk>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============7929231380291622527=="

--===============7929231380291622527==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

This one time, at band camp, James Green wrote:

> It occurs to me that there's very little point in search engines
> crawling certain pages of an OpenGuide - anything with "action=edit" or
> "action=delete" in the URL, at the very least, has no real value to
> someone searching for information. Unfortunately, unless I'm missing
> something, robots.txt syntax doesn't allow for matching on anything
> other than the start of the path component of a URL, which doesn't help
> us here.

The solution here would be to change the syntax of the action= URLs to
something like:

  /action=edit/MyWikiPage

Then you could put /action=edit/ in your robots.txt file. This would
probably involve Apache rewriting which, if you're not already doing it,
would be a bit more effort than is really warranted.

Using the meta tag doesn't remove the problem of the bandwidth used.

-- 
Rev Simon Rumble
www.rumble.net

"Those who know nothing of foreign languages
 know nothing of their own."
- Johann Wolfgang von Goethe

--===============7929231380291622527==--

From kake@earth.li Sun Sep 19 01:22:21 2004
From: Kake L Pugh
To: openguides-dev@lists.openguides.org
Subject: Re: [OpenGuides-Dev] Search engines and edit/delete pages
Date: Sun, 19 Sep 2004 01:22:22 +0100
Message-ID: <20040919002222.GA15143@the.earth.li>
In-Reply-To: <20040829225005.GJ14747@jimbo.org.uk>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============9006551470261963987=="

--===============9006551470261963987==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

On Sun 29 Aug 2004, James Green wrote:
> It occurs to me that there's very little point in search engines
> crawling certain pages of an OpenGuide - anything with "action=edit" or
> "action=delete" in the URL, at the very least, has no real value to
> someone searching for information. [...]
>
> It's been suggested to me that ``<meta ...>'' tags in the
> <head> of a page are effective here, [...]

I made a start on this; see for example
http://london.openguides.org/kakemirror/index.cgi?id=Locale_Fulham;action=edit

Questions: (a) where else on http://london.openguides.org/kakemirror/index.cgi
do we need these tags (please do check they're not already there before
spitting out suggestions), (b) am I doing the tags right.
Kake

--===============9006551470261963987==--

From jkg@earth.li Sun Sep 19 16:26:47 2004
From: James Green
To: openguides-dev@lists.openguides.org
Subject: Re: [OpenGuides-Dev] Search engines and edit/delete pages
Date: Sun, 19 Sep 2004 16:26:48 +0100
Message-ID: <20040919152648.GE3940@jimbo.org.uk>
In-Reply-To: <20040919002222.GA15143@the.earth.li>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============2507418274404747856=="

--===============2507418274404747856==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

On Sun, Sep 19, 2004 at 01:22:22AM +0100, Kake L Pugh wrote:
> On Sun 29 Aug 2004, James Green wrote:
> > It's been suggested to me that ``<meta ...>'' tags in the
> > <head> of a page are effective here, [...]
>
> I made a start on this [...]

Cool!

> Questions: (a) where else on http://london.openguides.org/kakemirror/index.cgi
> do we need these tags (please do check they're not already there before
> spitting out suggestions),

My main concerns were the edit and delete pages, which you've covered.
The only other thing that springs to mind is newpage.cgi.

> (b) am I doing the tags right.

No, apparently I goofed on that one. The correct form is:
`<meta ...>', at least according to
http://www.robotstxt.org/wc/meta-user.html

Mea culpa.

James.

-- 
In India, "cold weather" is merely a conventional phrase and has come
into use through the necessity of having some way to distinguish between
weather which will melt a brass door-knob and weather which will only
make it mushy.
-- Mark Twain

--===============2507418274404747856==--

From kake@earth.li Mon Sep 20 13:31:58 2004
From: Kake L Pugh
To: openguides-dev@lists.openguides.org
Subject: Re: [OpenGuides-Dev] Search engines and edit/delete pages
Date: Mon, 20 Sep 2004 13:32:02 +0100
Message-ID: <20040920123202.GC10114@the.earth.li>
In-Reply-To: <20040919152648.GE3940@jimbo.org.uk>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1408307672414965751=="

--===============1408307672414965751==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

On Sun 19 Sep 2004, James Green wrote:
> My main concerns were the edit and delete pages, which you've covered.
> The only other thing that springs to mind is newpage.cgi.

Done.

> No, apparently I goofed on that one. The correct form is:
> `<meta ...>', at least according to
> http://www.robotstxt.org/wc/meta-user.html

Fixed. These changes will be in the next release.

Kake

--===============1408307672414965751==--
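A minimal sketch of the rule the thread settles on: emit a robots meta tag on edit and delete pages and on newpage.cgi, and nothing on ordinary pages. It is written in Python purely for illustration and is not the OpenGuides code. The function names are hypothetical, and the tag text is an assumption that follows the form documented on the robotstxt.org page cited in the thread, not a quotation from any of the messages.

```python
import re
from urllib.parse import unquote_plus

# Assumed tag text, following the robots META form documented at
# robotstxt.org; the exact tag used in the thread is not preserved.
ROBOTS_META = '<meta name="robots" content="noindex,nofollow">'

# Actions named in the thread as having no value to search engines.
NOINDEX_ACTIONS = {"edit", "delete"}

# Whole scripts named in the thread as needing the tag.
NOINDEX_SCRIPTS = {"newpage.cgi"}


def query_params(query_string):
    """Split a query string on ';' or '&' into a dict of decoded values."""
    params = {}
    for pair in re.split(r"[;&]", query_string):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        params[unquote_plus(key)] = unquote_plus(value)
    return params


def robots_meta_for(script_name, query_string):
    """Return the meta tag for pages robots should skip, else ''."""
    if script_name in NOINDEX_SCRIPTS:
        return ROBOTS_META
    action = query_params(query_string).get("action")
    return ROBOTS_META if action in NOINDEX_ACTIONS else ""


# Example: the edit URL quoted in the thread gets the tag, a plain view does not.
edit_tag = robots_meta_for("index.cgi", "id=Locale_Fulham;action=edit")
view_tag = robots_meta_for("index.cgi", "id=Locale_Fulham")
```

As noted earlier in the thread, a tag like this only asks robots not to index the page or follow its links; the page is still fetched, so it does not save bandwidth.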