Re: [OGDev] Robot deterrence

25 Jun 2008


Christopher Schmidt wrote:
...
> On Wed, Jun 25, 2008 at 04:06:28PM +0100, David Cantrell wrote:
...
>> Yes, what you're missing is that google don't pay attention to
>> robots.txt or the meta thingy.  I expect that they cache it and then
>> ignore changes for some time.
>> Yahoo do the same.
> Er, you seem to be misunderstanding how meta tags work: they have to
> *Crawl* the page to see the tags... and there is no tag that says "never
> crawl this page again."
You seem to be misunderstanding the concept of a cache.  If they read 
the meta tag once, they should remember what it said for a while, AND 
OBEY IT without asking for that page again.  Likewise robots.txt.
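
The caching behaviour described here can be sketched in Python. This is only an illustration under stated assumptions, not how any real crawler is known to work: the class name `CachedRobots`, the `fetch_lines` callback, and the one-day lifetime are all hypothetical. Only `urllib.robotparser.RobotFileParser` and its `parse` and `can_fetch` methods are real standard-library calls.

```python
import time
from urllib.robotparser import RobotFileParser


class CachedRobots:
    """Hypothetical helper: remember robots.txt for a while and obey it."""

    def __init__(self, fetch_lines, ttl_seconds=86400, clock=time.time):
        # fetch_lines is an assumed callback returning robots.txt as a
        # list of lines; a real crawler would do an HTTP request here.
        self._fetch_lines = fetch_lines
        self._ttl = ttl_seconds
        self._clock = clock
        self._parser = None
        self._fetched_at = None

    def _refresh_if_stale(self):
        now = self._clock()
        if self._parser is None or now - self._fetched_at >= self._ttl:
            parser = RobotFileParser()
            parser.parse(self._fetch_lines())
            self._parser = parser
            self._fetched_at = now

    def allowed(self, user_agent, url):
        # Answer from the cached rules; robots.txt is only re-read
        # once the cached copy is older than the lifetime.
        self._refresh_if_stale()
        return self._parser.can_fetch(user_agent, url)
```

With a lifetime of a day, a newly added robots.txt would be picked up within a day at the latest, which is the sort of delay the reply below calls understandable.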
...
> I've never had Google violate robots.txt.
Lucky you.  I had them start crawling one of my sites, so I added a 
robots.txt, but they kept coming.  I can understand them keeping going 
for a day or so cos they cached the fact that I didn't have a robots.txt 
file, but they were still requesting files other than robots.txt well 
over a month later.  I hope all their programmers' children die in a fire.
...
> the key thing to point to would be an
> instance of Google search results containing a piece of HTML that is
> blocked by noindex.  If you can find one of those, I bet that Google
> would be interested in seeing it. (Cheap tricks like modifying the HTML
> after Google crawls by don't count.)
I have no interest in helping google.  My time is better spent by taking 
a few seconds to block their abusive bot than it is in figuring out how 
to contact the right person and convincing him that he's fucked up and 
then waiting months while he gets manglement approval to deploy a 
bugfix, while all the time the bot continues to prevent real users from 
having access to the service.
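
The message does not say how the bot was blocked. One common approach is to refuse requests by their User-Agent header, and the following is a minimal sketch of that idea as WSGI middleware. The function name `block_user_agents` and the substring list are hypothetical, and since a User-Agent header is trivially spoofed this only stops bots that identify themselves.

```python
def block_user_agents(app, blocked_substrings):
    """Wrap a WSGI app so matching User-Agents get a 403 response."""
    blocked = [s.lower() for s in blocked_substrings]

    def middleware(environ, start_response):
        # WSGI exposes the User-Agent request header under this key.
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(s in agent for s in blocked):
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Everyone else is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware
```

Matching is a case-insensitive substring test, so a single entry covers version changes in the bot's User-Agent string.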
-- 
header   FROM_DAVID_CANTRELL    From =~ /david.cantrell/i
describe FROM_DAVID_CANTRELL    Message is from David Cantrell
score    FROM_DAVID_CANTRELL    15.72 # This figure from experimentation
