Re: Processing long AND/OR lists

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Christopher Browne <cbbrowne(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Mailing Lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Processing long AND/OR lists
Date: 2013-06-16 13:36:14
Message-ID: CACMqXCLmQUKuT0oRP_WOzKWHgAQ2jYNvOA5PMc30gT6RZGMaYw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, May 27, 2013 at 5:59 AM, Christopher Browne <cbbrowne(at)gmail(dot)com> wrote:
> This situation falls from a problem that we noticed a mighty long time ago
> in Slony, where the set of XIDs outstanding gets very large, and, attendant
> to that, the set of "action id" values by which tuples are being filtered,
> gets correspondingly large.
>
> It happens when there is a long pause in application of replication data,
> and is commonly the consequence of setting up replication on a very large
> data table that takes a long time for the initial data copy.
>
> At the time, Neil Conway observed this query breakage with a query that was
> roughly 640K in size, from whence fell jokes to the effect, "who would ever
> need a query larger than 640K?"
>
> The resolution that I introduced at the time was to write a little parser
> that would recognize sequences of adjacent values and merge them into
> "BETWEEN A and B" clauses, which would bring the query size back to a
> reasonable size.

PgQ uses simpler optimization to keep IN list size down -
it aggressively enlarges the main xid range and later processes
rows with txid_is_visible_in_snapshot():

https://github.com/markokr/skytools/blob/master/sql/pgq/functions/pgq.batch_event_sql.sql

IOW - it assumes the open-xid distribution is not uniformly random.

This additional optimization was ignored when pgq long-tx
approach was first imported to slony:

http://lists.slony.info/pipermail/slony1-general/2007-July/006238.html

I guess the reason was to have minimal patch.
You might want to play with that now.

--
marko

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2013-06-16 14:27:56 Re: Improvement of checkpoint IO scheduler for stable transaction responses
Previous Message Joshua D. Drake 2013-06-16 13:10:42 Making submitting a patch FAQ (was Re: [HACKERS] request a new feature in fuzzystrmatch)