From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: robots.txt on git.postgresql.org
Date: 2013-07-09 19:19:03
Message-ID: CABUevExS74oUkc3+RcwVqzaUqvSLqsfXGhPG2GZNsYo6o5-JtQ@mail.gmail.com
Lists: pgsql-hackers
On Tue, Jul 9, 2013 at 5:30 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-07-09 16:24:42 +0100, Greg Stark wrote:
>> I note that git.postgresql.org's robot.txt refuses permission to crawl
>> the git repository:
>>
>> http://git.postgresql.org/robots.txt
>>
>> User-agent: *
>> Disallow: /
>>
>>
>> I'm curious what motivates this. It's certainly useful to be able to
>> search for commits.
>
> Gitweb is horribly slow. I don't think anybody with a bigger git repo
> using gitweb can afford to let all the crawlers go through it.
Yes, this is the reason it's been blocked. That machine basically died
every time Google or Bing or Baidu or the like hit it, giving horrible
response times and timeouts for actual users.
We might be able to do something better about that now that we can do
better rate limiting, but it's like playing whack-a-mole. The basic
software is just fantastically slow.
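
The thread doesn't spell out what that rate limiting would look like, so purely as an illustration (the names and numbers below are made up, not what git.postgresql.org actually runs), a per-crawler token bucket sitting in front of the gitweb CGI could be sketched in Python roughly like this:

# Hypothetical sketch only: throttle requests per User-Agent before
# they reach a slow gitweb backend. Not the actual setup discussed here.
import time
from collections import defaultdict

class TokenBucket:
    """Allow `rate` requests per second per key, with bursts up to `burst`."""
    def __init__(self, rate=1.0, burst=5):
        self.rate = rate
        self.burst = burst
        self.tokens = defaultdict(lambda: float(burst))
        self.last = defaultdict(time.monotonic)

    def allow(self, key):
        now = time.monotonic()
        elapsed = now - self.last[key]
        self.last[key] = now
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens[key] = min(self.burst, self.tokens[key] + elapsed * self.rate)
        if self.tokens[key] >= 1.0:
            self.tokens[key] -= 1.0
            return True
        return False

limiter = TokenBucket(rate=0.5, burst=3)  # at most ~one request every two seconds per agent
if not limiter.allow("Googlebot"):
    pass  # e.g. answer with HTTP 429 instead of spawning gitweb
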
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Next Message: Magnus Hagander, 2013-07-09 19:21:42, Re: robots.txt on git.postgresql.org
Previous Message: Markus Wanner, 2013-07-09 19:12:59, Re: Review: extension template