Re: Fixing GIN for empty/null/full-scan cases

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Abe Ingersoll <abe(at)abe(dot)us>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: Fixing GIN for empty/null/full-scan cases
Date: 2011-01-18 21:39:17
Message-ID: 2883.1295386757@sss.pgh.pa.us
Lists: pgsql-hackers

"David E. Wheeler" <david(at)kineticode(dot)com> writes:
> One of the reasons our client wants GIN for the integer[] column so bad is because recreating the GiST integer[] index is quite painful. Before I duped the table, I was just dropping and recreating the index on the original table. It was great to create the GIN index; for 400K rows, it took 1300 ms. After my initial round of testing, I dropped it and created the GiST index. That ran for, well, *hours*. Four at least, maybe six or seven (I forgot \timing and was letting it run on screen while I did some iOS hacking). I think something might be really wrong with GiST index creation for integer arrays, because the difference was just appalling.

I poked into this a bit, although it's really off-topic for what I was
doing in GIN, and I agree that there is something fundamentally rotten
in the gist__int_ops logic. oprofile shows that the code is spending
all its time in g_int_compress and g_int_union during index build:

samples  %        app name         symbol name
52747    40.2486  _int.so          g_int_compress
27092    20.6726  libc-2.12.2.so   _wordcopy_fwd_aligned
18938    14.4506  _int.so          compASC
18256    13.9302  postgres         pg_qsort
5741      4.3807  no-vmlinux       /no-vmlinux
3297      2.5158  postgres         swapfunc
864       0.6593  _int.so          g_int_decompress
826       0.6303  _int.so          _int_unique
700       0.5341  _int.so          inner_int_union
617       0.4708  postgres         med3
588       0.4487  libc-2.12.2.so   memset
371       0.2831  _int.so          g_int_same
143       0.1091  libc-2.12.2.so   memcpy
123       0.0939  postgres         ArrayGetNItems
67        0.0511  _int.so          inner_int_inter
48        0.0366  postgres         AllocSetAlloc
47        0.0359  libc-2.12.2.so   memmove
40        0.0305  postgres         MemoryContextAllocZero

The time in g_int_compress is all on this loop:

    while (len > MAXNUMRANGE * 2)
    {
        min = 0x7fffffff;
        for (i = 2; i < len; i += 2)
            if (min > (dr[i] - dr[i - 1]))
            {
                min = (dr[i] - dr[i - 1]);
                cand = i;
            }
        memmove((void *) &dr[cand - 1], (void *) &dr[cand + 1],
                (len - cand - 1) * sizeof(int32));
        len -= 2;
    }

which is not too surprising since that's obviously O(N^2). The other
high-percentage functions are qsort and subsidiaries, and those calls
are coming from g_int_union. That part could probably be sped up, since
most of the calls are unioning just two inputs, which could be done a lot
more cheaply than a full qsort (the inputs are always presorted, no?). But
there is
something really fundamentally wrong in the logic, I think, because
while poking at it with gdb I saw it union-ing fifty-thousand-element
arrays. Surely it should get lossy before it gets to that. I'm also
wondering how it tells the difference between lossy and non-lossy keys
--- although it's hard to tell what such spectacularly uncommented code
is supposed to be doing, it looks like the lossy case is supposed to be
pairs of values representing ranges, in which case g_int_compress is
surely doing the Wrong Thing with already-lossy inputs, and union'ing
lossy inputs is an entirely insane operation as well.
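For what it's worth, the greedy loop above is computing something simple:
the gaps that survive the repeated smallest-gap removal are just the
widest ones, so the same result falls out of a single qsort over the gaps
instead of O(N^2) rescans. Here is a from-scratch sketch of that idea,
simplified to plain sorted values rather than _int.so's range pairs; all
the names are mine, not the module's:

```c
#include <stdlib.h>

/* One entry per gap between adjacent sorted values. */
typedef struct { int idx; int gap; } Gap;

static int cmp_gap_desc(const void *pa, const void *pb)
{
    const Gap *a = pa, *b = pb;
    return (b->gap > a->gap) - (b->gap < a->gap);
}

static int cmp_int_asc(const void *pa, const void *pb)
{
    int a = *(const int *) pa, b = *(const int *) pb;
    return (a > b) - (a < b);
}

/*
 * Collapse a sorted, de-duplicated array dr[0..len-1] into at most
 * max_ranges [lo,hi] pairs by cutting at the max_ranges-1 widest gaps.
 * Writes pairs into out[] and returns the number of ints written.
 * O(N log N) overall, versus the quoted loop's O(N^2).
 */
static int
compress_ranges(const int *dr, int len, int max_ranges, int *out)
{
    int nout = 0;

    if (len <= max_ranges)      /* nothing to collapse: singleton ranges */
    {
        for (int i = 0; i < len; i++)
        {
            out[nout++] = dr[i];
            out[nout++] = dr[i];
        }
        return nout;
    }

    Gap *gaps = malloc((len - 1) * sizeof(Gap));
    int *cuts = malloc((max_ranges - 1) * sizeof(int));

    for (int i = 0; i < len - 1; i++)
    {
        gaps[i].idx = i;
        gaps[i].gap = dr[i + 1] - dr[i];
    }
    /* keep the max_ranges-1 widest gaps as cut points, in index order */
    qsort(gaps, len - 1, sizeof(Gap), cmp_gap_desc);
    for (int i = 0; i < max_ranges - 1; i++)
        cuts[i] = gaps[i].idx;
    qsort(cuts, max_ranges - 1, sizeof(int), cmp_int_asc);

    int lo = 0;
    for (int i = 0; i < max_ranges - 1; i++)
    {
        out[nout++] = dr[lo];
        out[nout++] = dr[cuts[i]];
        lo = cuts[i] + 1;
    }
    out[nout++] = dr[lo];
    out[nout++] = dr[len - 1];

    free(gaps);
    free(cuts);
    return nout;
}
```

This isn't a drop-in replacement --- g_int_compress also has to cope with
already-lossy inputs, which is exactly the part that looks broken --- but
it shows the collapse step itself needn't be quadratic.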

At the moment my opinion is that gist__int_ops is too broken to be
usable, and it's also too uncommented to be fixable by anyone other
than the original author.
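As a footnote on the union point above: when both inputs are presorted and
de-duplicated, their union is a single linear merge pass, no qsort needed.
A hedged sketch (sorted_union is an illustrative name, not anything in
_int.so):

```c
/*
 * Merge two sorted, de-duplicated int arrays into out[], emitting each
 * value once.  O(alen + blen), versus concatenate-qsort-uniquify.
 * out must have room for alen + blen elements; returns elements written.
 */
static int
sorted_union(const int *a, int alen, const int *b, int blen, int *out)
{
    int i = 0, j = 0, n = 0;

    while (i < alen && j < blen)
    {
        if (a[i] < b[j])
            out[n++] = a[i++];
        else if (a[i] > b[j])
            out[n++] = b[j++];
        else                    /* equal: emit once, advance both */
        {
            out[n++] = a[i++];
            j++;
        }
    }
    while (i < alen)
        out[n++] = a[i++];
    while (j < blen)
        out[n++] = b[j++];
    return n;
}
```

For more than two presorted inputs one could merge pairwise or use a small
heap; either way the qsort that dominates the profile is avoidable for the
common two-input case.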

regards, tom lane
