Re: B-Tree support function number 3 (strxfrm() optimization)

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: B-Tree support function number 3 (strxfrm() optimization)
Date: 2014-12-10 02:03:33
Message-ID: CAM3SWZTijoBPpqFF7mN3021Vvtu+5Fd1ymABQ8tLoV4zhfAqxA@mail.gmail.com
Lists: pgsql-hackers

There is an interesting thread about strcoll() overhead over on -general:

http://www.postgresql.org/message-id/CAB25XEXNONdRmC1_cy3jvmB0TMyDm38eF9q2D7xLa0rbnCJ5pQ@mail.gmail.com

My guess was that this person experienced a rather unexpected downside
of spilling to disk when sorting on a text attribute: system
throughput becomes very CPU-bound, because tapesort tends to result in
more comparisons [1], and each of those comparisons goes through
strcoll(). With abbreviated keys, tapesort can actually compete with
quicksort in certain cases [2]. Tapesorts of text attributes are
especially bad on released versions of Postgres, and will perform very
little actual I/O.
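
For readers unfamiliar with the idea, here is a rough sketch of why
abbreviated keys cut the comparison cost. It is not the PostgreSQL
implementation; SortItem, item_init(), item_cmp() and ABBREV_LEN are
hypothetical names made up for illustration. It assumes only the
standard C strxfrm()/strcoll() contract: the first few bytes of the
strxfrm() blob are stored next to each string, compared with a cheap
memcmp(), and strcoll() is called only when those prefixes tie.

```c
#include <stdlib.h>
#include <string.h>

#define ABBREV_LEN 8            /* hypothetical: one 64-bit word of key */

typedef struct
{
    const char   *str;                  /* full original string */
    unsigned char abbrev[ABBREV_LEN];   /* zero-padded strxfrm() prefix */
} SortItem;

/* Build the abbreviated key once per item. Returns 0 on success. */
static int
item_init(SortItem *item, const char *s)
{
    size_t  len = strxfrm(NULL, s, 0);  /* length of the full blob */
    char   *blob = malloc(len + 1);

    if (blob == NULL)
        return -1;
    strxfrm(blob, s, len + 1);

    item->str = s;
    memset(item->abbrev, 0, ABBREV_LEN);
    memcpy(item->abbrev, blob, len < ABBREV_LEN ? len : ABBREV_LEN);
    free(blob);
    return 0;
}

/* qsort()-style comparator: cheap memcmp() first, strcoll() on ties. */
static int
item_cmp(const void *a, const void *b)
{
    const SortItem *x = a;
    const SortItem *y = b;
    int             r = memcmp(x->abbrev, y->abbrev, ABBREV_LEN);

    if (r != 0)
        return r;               /* resolved without calling strcoll() */
    return strcoll(x->str, y->str);     /* authoritative tie-breaker */
}
```

An array of initialized items can be sorted with
qsort(items, n, sizeof(SortItem), item_cmp). The fewer comparisons
that reach strcoll(), the less CPU time the sort spends per
comparison, which is what lets I/O become the bottleneck instead.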

In all seriousness, I wonder if we should add a release note item
stating that in Postgres 9.5, because of the abbreviated key
optimization, external sorts can be much more I/O-bound than in
previous releases...

[1] http://www.postgresql.org/message-id/20140806035512.GA91137@tornado.leadboat.com
[2] http://www.postgresql.org/message-id/CAM3SWZQiGvGhMB4TMbEWoNjO17=ySB5b5Y5MGqJsaNq4uWTryA@mail.gmail.com
--
Peter Geoghegan
