Re: Incorrect FTS result with GIN index

Lists: pgsql-general, pgsql-hackers
From: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>
To: pgsql-general(at)postgresql(dot)org
Subject: Incorrect FTS result with GIN index
Date: 2010-07-15 13:09:30
Message-ID: 29172750.post@talk.nabble.com


Hello,

I was trying to use a GIN index, but the results seem to be incorrect.

1. QUERY WITHOUT INDEX
select count(*) from search_tab where
(to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:*')) and
(to_tsvector('german', keywords ) @@ to_tsquery('german', 'dd:*'));

count
-------
123
(1 row)

2. CREATING INDEX
create index idx_keywords_ger on search_tab
using gin(to_tsvector('german', keywords));

3. QUERY WITH INDEX
select count(*) from search_tab where
(to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:*')) and
(to_tsvector('german', keywords ) @@ to_tsquery('german', 'dd:*'));

count
-------
116
(1 row)

The number of rows is different. To make things stranger, and to rule out
dictionary normalisation as the cause:

4. EQUIVALENT QUERY WITH INDEX
select count(*) from search_tab where
(to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:* & dd:*'));

count
-------
123
(1 row)

I tried the same with a simple-based dictionary. The problem is always
reproducible.

The total count of records in my database is 1 006 300, if it matters.
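
One way to list exactly which rows the index scan misses is to capture the
ids returned by both plans and compare them. The following is only a sketch:
it relies on the id primary key of search_tab, and the temporary table names
seq_ids and idx_ids are made up for illustration. Turning off
enable_bitmapscan and enable_indexscan discourages the planner from using
idx_keywords_ger, so the first query should fall back to a sequential scan;
it is worth confirming that with EXPLAIN.

```sql
BEGIN;
-- Discourage index use for this transaction only.
SET LOCAL enable_bitmapscan = off;
SET LOCAL enable_indexscan = off;
CREATE TEMP TABLE seq_ids AS
  SELECT id FROM search_tab
  WHERE (to_tsvector('german', keywords) @@ to_tsquery('german', 'ee:*'))
    AND (to_tsvector('german', keywords) @@ to_tsquery('german', 'dd:*'));
COMMIT;

-- The planner settings are back to their previous values here,
-- so the GIN index can be used again.
CREATE TEMP TABLE idx_ids AS
  SELECT id FROM search_tab
  WHERE (to_tsvector('german', keywords) @@ to_tsquery('german', 'ee:*'))
    AND (to_tsvector('german', keywords) @@ to_tsquery('german', 'dd:*'));

-- Rows returned by the sequential scan but missing from the index scan.
SELECT t.id, t.keywords
FROM search_tab t
JOIN seq_ids s ON s.id = t.id
WHERE t.id NOT IN (SELECT id FROM idx_ids);
```

If both plans agree, the last query returns no rows.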

One of the missing results is the following: "lSWN eeIf hInEI IN
SIL3WugEOANcEGVWL1L LBAGAeLlGS ttfL DDhuDEIni9 ce". If the query is targeted
more specifically at this row, then it finds it:

5. MORE DETAILED QUERY WITH INDEX
select keywords from search_tab where
(to_tsvector('german', keywords ) @@ to_tsquery('german', 'eeI:* & dd:*'));

keywords
--------------------------------------------------------------------------------
lSWN eeIf hInEI IN SIL3WugEOANcEGVWL1L LBAGAeLlGS ttfL DDhuDEIni9 ce
tSALWIEEIn-3WNecGAINfLuLAV DDLIWNG E Lt h c8 BiIfgGl1 EeIhulSLenS6LDe5O
hGn DDlhIgGEAcS1O eeiEEI WnILWELS68VBLL AGNIAfINt6 lLuWuNeDc ItLfe SL
hGe WIiI EeItnLLuA1efOh3ALWc uGINEltcIBE LnegLDNA3 DD SVNG LSSIlWfE
eeIW ItueS W39LnELg-GuDLEhAn8BeFG IVi DDNEfLG1SI 1tNIOA lAhNLLccfWISE l
6em on.0nsRH nehSA2l1HAsauncu0I65l7 ddnsn1SAS i u0eLAnlr t70gaains w gzsH eeiog
rfiwgso0g364l1 1wU eei1n 5lL dDA 0
DDInNcEfSWAEAtcL1IeSuAG5LE Lilh8tEGeDg f3B eEIOL7h uWV-L1IGN LINWeIn l S
ils eeiru00ewH.6sgAeHoSlLhglso0 asn0u2a atisA0 ddcngAnzRA Se Au2 nm8ns0 uS8snH
DDD EWlE1GShhLe8L NENI tuL cgGGInfcBAlLfIO L1S eeIWeAEnILStu AViWNI
n IOLLt 0Alih tuWNE L nAGlVSNSDI DDeW BIegfG EeIhL9ELeScELWGAIfN1uIc
DnSE eeIWLu9tLNhNEuAt I1BelhGGfLWLS nSWINI eiELgAIG DDLEclV7 IO c Af
EeIElfN L4I lE2G cSOLniAWgSVItc ILDN L57BuDfALtSIe-WnGhGIW DDA NE1Lhuee
hNILN DD L6flSEeW1gthfI L1WAlENE eEIGIAt VGBDO uGLeLccAeSuLWIn Ii nS
(14 rows)

Did I misunderstand something, or is it a bug?

Best regards
Artur
--
View this message in context: http://old.nabble.com/Incorrect-FTS-result-with-GIN-index-tp29172750p29172750.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


From: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS results with GIN index
Date: 2010-07-15 13:29:51
Message-ID: 29172950.post@talk.nabble.com


I pasted the wrong query in point 5. It should be:

5. MORE DETAILED QUERY WITH INDEX
select keywords from search_tab where
(to_tsvector('german', keywords ) @@ to_tsquery('german', 'eeI:*')) and
(to_tsvector('german', keywords ) @@ to_tsquery('german', 'dd:*'));

keywords
--------------------------------------------------------------------------------
lSWN eeIf hInEI IN SIL3WugEOANcEGVWL1L LBAGAeLlGS ttfL DDhuDEIni9 ce
tSALWIEEIn-3WNecGAINfLuLAV DDLIWNG E Lt h c8 BiIfgGl1 EeIhulSLenS6LDe5O
hGn DDlhIgGEAcS1O eeiEEI WnILWELS68VBLL AGNIAfINt6 lLuWuNeDc ItLfe SL
hGe WIiI EeItnLLuA1efOh3ALWc uGINEltcIBE LnegLDNA3 DD SVNG LSSIlWfE
eeIW ItueS W39LnELg-GuDLEhAn8BeFG IVi DDNEfLG1SI 1tNIOA lAhNLLccfWISE l
6em on.0nsRH nehSA2l1HAsauncu0I65l7 ddnsn1SAS i u0eLAnlr t70gaains w gzsH eeiog
rfiwgso0g364l1 1wU eei1n 5lL dDA 0
DDInNcEfSWAEAtcL1IeSuAG5LE Lilh8tEGeDg f3B eEIOL7h uWV-L1IGN LINWeIn l S
ils eeiru00ewH.6sgAeHoSlLhglso0 asn0u2a atisA0 ddcngAnzRA Se Au2 nm8ns0 uS8snH
DDD EWlE1GShhLe8L NENI tuL cgGGInfcBAlLfIO L1S eeIWeAEnILStu AViWNI
n IOLLt 0Alih tuWNE L nAGlVSNSDI DDeW BIegfG EeIhL9ELeScELWGAIfN1uIc
DnSE eeIWLu9tLNhNEuAt I1BelhGGfLWLS nSWINI eiELgAIG DDLEclV7 IO c Af
EeIElfN L4I lE2G cSOLniAWgSVItc ILDN L57BuDfALtSIe-WnGhGIW DDA NE1Lhuee
hNILN DD L6flSEeW1gthfI L1WAlENE eEIGIAt VGBDO uGLeLccAeSuLWIn Ii nS
(14 rows)


From: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS results with GIN index
Date: 2010-07-15 14:28:56
Message-ID: 29173652.post@talk.nabble.com


My version of PostgreSQL is 8.4.3.


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS result with GIN index
Date: 2010-07-17 10:53:09
Message-ID: Pine.LNX.4.64.1007171450210.32129@sn.sai.msu.ru

Artur,

I downloaded your dump and tried your queries with the index; I see no
problem so far.

                                Table "public.search_tab"
     Column     |  Type   |                        Modifiers
----------------+---------+----------------------------------------------------------
 id             | integer | not null default nextval('search_tab_id_seq1'::regclass)
 keywords       | text    |
 collection_urn | text    |
 bbox           | text    |
 object_urn     | text    | not null
 description    | text    |
 category       | text    |
 summary        | text    |
 priority       | integer |
Indexes:
    "search_tab_pkey1" PRIMARY KEY, btree (id)
    "idx_keywords_ger" gin (to_tsvector('german'::regconfig, keywords))

test=# explain analyze select count(*) from search_tab
  where (to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:*'))
  and (to_tsvector('german', keywords ) @@ to_tsquery('german', 'dd:*'));
                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=103.87..103.88 rows=1 width=0) (actual time=24.784..24.784 rows=1 loops=1)
   ->  Bitmap Heap Scan on search_tab  (cost=5.21..103.80 rows=25 width=0) (actual time=24.642..24.769 rows=123 loops=1)
         Recheck Cond: ((to_tsvector('german'::regconfig, keywords) @@ '''ee'':*'::tsquery) AND (to_tsvector('german'::regconfig, keywords) @@ '''dd'':*'::tsquery))
         ->  Bitmap Index Scan on idx_keywords_ger  (cost=0.00..5.21 rows=25 width=0) (actual time=24.620..24.620 rows=123 loops=1)
               Index Cond: ((to_tsvector('german'::regconfig, keywords) @@ '''ee'':*'::tsquery) AND (to_tsvector('german'::regconfig, keywords) @@ '''dd'':*'::tsquery))
 Total runtime: 24.830 ms
(6 rows)

see rows=123

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


From: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS result with GIN index
Date: 2010-07-19 09:50:50
Message-ID: 29203020.post@talk.nabble.com


Hello Oleg,

my results are different. The analysis looks like this (please note the
different numbers of rows):

Aggregate  (cost=104.05..104.06 rows=1 width=0) (actual time=152.133..152.135 rows=1 loops=1)
  ->  Bitmap Heap Scan on search_tab  (cost=5.39..103.98 rows=25 width=0) (actual time=76.546..151.834 rows=116 loops=1)
        Recheck Cond: ((to_tsvector('german'::regconfig, keywords) @@ '''ee'':*'::tsquery) AND (to_tsvector('german'::regconfig, keywords) @@ '''dd'':*'::tsquery))
        ->  Bitmap Index Scan on idx_keywords_ger  (cost=0.00..5.38 rows=25 width=0) (actual time=76.292..76.292 rows=506 loops=1)
              Index Cond: ((to_tsvector('german'::regconfig, keywords) @@ '''ee'':*'::tsquery) AND (to_tsvector('german'::regconfig, keywords) @@ '''dd'':*'::tsquery))
Total runtime: 152.389 ms

I have no idea what could cause the different behaviour on your machine and
mine (Windows XP, PostgreSQL 8.4.3).
I reproduced the same wrong behaviour on my co-worker's machine (Windows XP,
PostgreSQL 8.4.4).



From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS result with GIN index
Date: 2010-07-19 13:26:47
Message-ID: Pine.LNX.4.64.1007191726180.32129@sn.sai.msu.ru

Artur,

I don't know, but could you try a Linux machine?

Oleg


From: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS result with GIN index
Date: 2010-07-20 07:29:50
Message-ID: 29212116.post@talk.nabble.com


I tested the same backup on our CentOS 5.4 virtual machine (running on a Xen
server), and the results are really weird (118 rows, compared to 116 on
Windows XP and 123 expected):

Aggregate  (cost=104.00..104.01 rows=1 width=0) (actual time=120.373..120.374 rows=1 loops=1)
  ->  Bitmap Heap Scan on search_tab  (cost=5.35..103.93 rows=25 width=0) (actual time=59.418..120.137 rows=118 loops=1)
        Recheck Cond: ((to_tsvector('german'::regconfig, keywords) @@ '''ee'':*'::tsquery) AND (to_tsvector('german'::regconfig, keywords) @@ '''dd'':*'::tsquery))
        ->  Bitmap Index Scan on idx_keywords_ger  (cost=0.00..5.34 rows=25 width=0) (actual time=59.229..59.229 rows=495 loops=1)
              Index Cond: ((to_tsvector('german'::regconfig, keywords) @@ '''ee'':*'::tsquery) AND (to_tsvector('german'::regconfig, keywords) @@ '''dd'':*'::tsquery))
Total runtime: 120.670 ms

And here are the configuration details:

PostgreSQL:
postgresql84-server-8.4.4-1.el5_5.1

# uname -r
2.6.18-164.15.1.el5xen

# cat /etc/redhat-release
CentOS release 5.4 (Final)

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU 5140 @ 2.33GHz
stepping : 6
cpu MHz : 2333.416
cache size : 4096 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu de tsc msr pae cx8 apic sep cmov pat clflush acpi mmx
fxsr sse sse2 ss ht syscall lm constant_tsc pni cx16 lahf_lm
bogomips : 5835.83
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

Oleg Bartunov wrote:
>
> Artur,
>
> I don't know, but could you try linux machine ?
>
> Oleg
>



From: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS result with GIN index
Date: 2010-07-20 07:37:16
Message-ID: 29212162.post@talk.nabble.com


The CentOS installation used for testing is a 64-bit version.

Artur Dabrowski wrote:
>
> I tested the same backup on our CentOS 5.4 virtual machine (running on xen
> server) and the results are really weird (118 rows, comparing to 116 on
> win xp and 123 expected):
>
>
>



From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS result with GIN index
Date: 2010-07-20 09:09:06
Message-ID: Pine.LNX.4.64.1007201303160.32129@sn.sai.msu.ru

Artur,

I recommend posting your problem to the -hackers mailing list. I have no
idea what the problem could be.

My machine is:
uname -a
Linux mira 2.6.33-020633-generic #020633 SMP Thu Feb 25 10:10:03 UTC 2010 x86_64 GNU/Linux

PostgreSQL 8.4.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (Ubuntu 4.4.1-4ubuntu9) 4.4.1, 64-bit

As a last resort, I recommend that you compile pg yourself and see whether
the problem persists.

Oleg



From: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS result with GIN index
Date: 2010-07-20 15:46:20
Message-ID: 29215929.post@talk.nabble.com


Oleg,

thanks for your help.

I sent a post to the pgsql-hackers list:
http://old.nabble.com/Query-results-differ-depending-on-operating-system-%28using-GIN%29-ts29213082.html

As for compiling pg: I will not do this, since I do not really feel
comfortable doing it and cannot dedicate much time to this problem.

Artur



From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS result with GIN index
Date: 2010-07-24 16:22:02
Message-ID: Pine.LNX.4.64.1007242020350.32129@sn.sai.msu.ru

Artur,

you could run into many more problems in the future. The full text search
problem may be a symptom of a more general problem with your postgres setup.
So I'd recommend finding the source of the problem.

Oleg


From: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS result with GIN index
Date: 2010-07-26 10:20:16
Message-ID: 1280139616679-2227845.post@n5.nabble.com


Hello Oleg,

I totally agree that the problem should be fixed. That said, I need to
add that:
- I have no knowledge of postgres development,
- I cannot dedicate any significant time to this problem,
- I am no longer working for the project where the problem occurred,
- In the mentioned project the problem is not considered business-critical
at the moment (although it may be in the future).

Nevertheless, I think it should still be of interest to the postgres
developer community to fix it. The point is that I have neither the knowledge
nor the time to fix it myself.

As to my postgres setup - it's nothing special, it's just a regular version
from postgres' webpage.

Best regards
Artur

Oleg Bartunov wrote:
>
> Artur,
>
> you could get much more problems in future. Full text search problem may
> be
> signature of more general problem with your postgres setup. So, I'd
> recommend
> to find a source of the problem
>
>
> Oleg
>



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>, pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS result with GIN index
Date: 2010-07-28 00:27:48
Message-ID: 19427.1280276868@sss.pgh.pa.us

Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> writes:
> I recommend post your problem to -hackers mailing list. I have no idea,
> what could be a problem.

I wonder whether the problem is not windows versus non windows but
original database versus copies. If it is a GIN bug it seems quite
possible that it would depend on the order of insertion of the index
entries, which a simple dump-and-reload probably wouldn't duplicate.

If you were working from a dump it'd be easy to try creating the index
before populating the table to see if the bug can be reproduced then,
but there's no certainty that would provoke the bug.
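
The idea of building the index before loading the data could be tried
roughly as follows. This is a sketch under assumptions: search_tab_copy and
idx_keywords_ger_copy are hypothetical names, and the restored search_tab is
assumed to be available in the same database as the source of the rows. As
said above, there is no certainty that this provokes the bug.

```sql
-- Empty copy with the same columns and NOT NULL constraints.
CREATE TABLE search_tab_copy (LIKE search_tab);

-- Build the GIN index while the table is still empty ...
CREATE INDEX idx_keywords_ger_copy ON search_tab_copy
  USING gin (to_tsvector('german', keywords));

-- ... so that every row is added to the index by incremental insertion
-- instead of by a bulk index build.
INSERT INTO search_tab_copy SELECT * FROM search_tab;

-- Re-run the problem query against the copy and compare with the expected 123.
SELECT count(*) FROM search_tab_copy
WHERE (to_tsvector('german', keywords) @@ to_tsquery('german', 'ee:*'))
  AND (to_tsvector('german', keywords) @@ to_tsquery('german', 'dd:*'));
```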

The rest of us have not seen the dump data, so we have no hope of
doing anything with this report anyway.

regards, tom lane


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Artur Dabrowski <ad(at)astec(dot)com(dot)pl>, pgsql-general(at)postgresql(dot)org
Subject: Re: Incorrect FTS result with GIN index
Date: 2010-07-28 18:00:35
Message-ID: Pine.LNX.4.64.1007282200070.32129@sn.sai.msu.ru

Tom,

you can download the dump from http://mira.sai.msu.su/~megera/tmp/search_tab.dump

Oleg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [GENERAL] Incorrect FTS result with GIN index
Date: 2010-07-28 23:33:01
Message-ID: 26432.1280359981@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers

Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> writes:
> you can download the dump from http://mira.sai.msu.su/~megera/tmp/search_tab.dump

Hmm ... I'm not sure why you're failing to reproduce it, because it's
falling over pretty easily for me. After poking at it for a while,
I am of the opinion that scanGetItem's handling of multiple keys is
fundamentally broken and needs to be rewritten completely. The
particular case I'm seeing here is that one key returns this sequence of
TIDs/lossy flags:

...
1085/4 0
1086/65535 1
1087/4 0
...

while the other one returns this:

...
1083/11 0
1086/6 0
1086/10 0
1087/10 0
...

and what comes out of scanGetItem is just

...
1086/6 1
...

because after returning that, on the next call it advances both input
keystreams. So 1086/10 should be visited and is not.
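
A toy model of this failure, written in Python rather than taken from the GIN C code. It assumes each keystream is a sorted list of (block, offset) pairs and that offset 65535 marks a lossy whole-page entry, as in the listings above. The name `naive_and_merge` and its structure are hypothetical; the only behaviour borrowed from the report is that both keystreams are advanced after every returned match.

```python
LOSSY = 0xFFFF  # assumption: offset 65535 marks a lossy "whole page" entry


def naive_and_merge(a, b):
    """AND two sorted keystreams of (block, offset) TIDs.

    Returns a list of ((block, offset), needs_recheck) pairs.
    Deliberately flawed: after each match BOTH streams are advanced.
    """
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        (ablk, aoff), (bblk, boff) = a[i], b[j]
        if ablk < bblk:
            i += 1
        elif bblk < ablk:
            j += 1
        elif aoff == LOSSY or boff == LOSSY or aoff == boff:
            # A lossy entry matches any TID on its page.
            off = boff if aoff == LOSSY else aoff
            out.append(((ablk, off), aoff == LOSSY or boff == LOSSY))
            i += 1  # the flaw: a lossy entry is consumed here even though
            j += 1  # the other stream may hold more TIDs on the same page
        elif aoff < boff:
            i += 1
        else:
            j += 1
    return out


key_a = [(1085, 4), (1086, LOSSY), (1087, 4)]
key_b = [(1083, 11), (1086, 6), (1086, 10), (1087, 10)]
print(naive_and_merge(key_a, key_b))
```

With the two sequences above, the model returns only 1086/6 (flagged for recheck) and never reaches 1086/10.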

I think that depending on the previous entryRes state to determine what
to do is basically unworkable, and what should probably be done instead
is to remember the last-returned TID and advance keystreams with TIDs <=
that. I haven't quite thought through how that should interact with
lossy-page TIDs but it seems more robust than what we've got.
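
One way the last-returned-TID idea could look, as a Python sketch and not the actual fix. It assumes sorted keystreams of (block, offset) pairs, with offset 65535 standing for a lossy whole-page entry and each page appearing in a stream either as exact TIDs or as one lossy entry. `and_merge` is a hypothetical name. Representing a lossy entry as (block, 65535), so that plain tuple comparison keeps it in place while exact TIDs on its page are still being returned, is a choice made for this sketch only.

```python
LOSSY = 0xFFFF  # assumption: offset 65535 marks a lossy "whole page" entry


def and_merge(streams):
    """AND sorted keystreams of (block, offset) TIDs.

    Returns a list of ((block, offset), needs_recheck) pairs.
    """
    pos = [0] * len(streams)
    last = None  # last-returned TID
    results = []

    def current(i):
        return streams[i][pos[i]] if pos[i] < len(streams[i]) else None

    while True:
        # Advance only the keystreams whose current TID is <= the last result.
        # (blk, LOSSY) compares greater than any exact TID on page blk, so a
        # lossy entry survives until the other streams leave that page.
        if last is not None:
            for i in range(len(streams)):
                while current(i) is not None and current(i) <= last:
                    pos[i] += 1
        cur = [current(i) for i in range(len(streams))]
        if any(c is None for c in cur):
            return results

        # Bring every stream up to the highest page currently in view.
        block = max(c[0] for c in cur)
        behind = [i for i, c in enumerate(cur) if c[0] < block]
        if not behind:
            exact = [c[1] for c in cur if c[1] != LOSSY]
            if not exact:
                # Every stream is lossy here: return the page itself.
                results.append(((block, LOSSY), True))
                last = (block, LOSSY)
                continue
            offset = max(exact)
            behind = [i for i, c in enumerate(cur)
                      if c[1] != LOSSY and c[1] < offset]
            if not behind:
                # All exact streams agree; lossy streams force a recheck.
                results.append(((block, offset), len(exact) < len(cur)))
                last = (block, offset)
                continue
        for i in behind:
            pos[i] += 1


key_a = [(1085, 4), (1086, LOSSY), (1087, 4)]
key_b = [(1083, 11), (1086, 6), (1086, 10), (1087, 10)]
print(and_merge([key_a, key_b]))
```

On the two sequences shown earlier in this message, the sketch returns both 1086/6 and 1086/10, each flagged for recheck because one side is lossy.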

I'm also noticing that the ANDing behavior for the "ee:* & dd:*" query
style seems very much stupider than it needs to be --- it's returning
lossy pages that very obviously don't need to be examined because the
other keystream has no match at all on that page. But I haven't had
time to probe into the reason why.

I'm out of time for today, do you want to work on it?

regards, tom lane


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [GENERAL] Incorrect FTS result with GIN index
Date: 2010-07-29 11:03:32
Message-ID: Pine.LNX.4.64.1007291459270.32129@sn.sai.msu.ru
Lists: pgsql-general pgsql-hackers

Tom,

we're not able to work on this right now, so go ahead if you have time.
I also wonder why I got the "right" result :) I just repeated the query:

test=# select count(*) from search_tab where (to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:* & dd:*'));
count
-------
123
(1 row)

Time: 26.185 ms

Oleg

Regards,
Oleg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [GENERAL] Incorrect FTS result with GIN index
Date: 2010-07-29 14:03:07
Message-ID: 11087.1280412187@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers

Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> writes:
> I also wonder why I got the "right" result :) I just repeated the query:

> test=# select count(*) from search_tab where (to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:* & dd:*'));
> count
> -------
> 123
> (1 row)

Yeah, that case works (though I think it's unnecessarily slow). The one
that gives the wrong answer is the equivalent form with two AND'ed @@
operators.

regards, tom lane


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [GENERAL] Incorrect FTS result with GIN index
Date: 2010-07-29 16:13:32
Message-ID: Pine.LNX.4.64.1007292012350.32129@sn.sai.msu.ru
Lists: pgsql-general pgsql-hackers

On Thu, 29 Jul 2010, Tom Lane wrote:

> Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> writes:
>> I also wonder why I got the "right" result :) I just repeated the query:
>
>> test=# select count(*) from search_tab where (to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:* & dd:*'));
>> count
>> -------
>> 123
>> (1 row)
>
> Yeah, that case works (though I think it's unnecessarily slow). The one
> that gives the wrong answer is the equivalent form with two AND'ed @@
> operators.

hmm, that query works too :)

test=# select count(*) from search_tab where (to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:*')) and (to_tsvector('german', keywords ) @@ to_tsquery('german', 'dd:*'));
count
-------
123
(1 row)

Time: 26.155 ms

test=# explain analyze select count(*) from search_tab where (to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:*')) and (to_tsvector('german', keywords ) @@ to_tsquery('german', 'dd:*'));
                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=103.87..103.88 rows=1 width=0) (actual time=22.819..22.820 rows=1 loops=1)
   ->  Bitmap Heap Scan on search_tab  (cost=5.21..103.80 rows=25 width=0) (actual time=22.677..22.799 rows=123 loops=1)
         Recheck Cond: ((to_tsvector('german'::regconfig, keywords) @@ '''ee'':*'::tsquery) AND (to_tsvector('german'::regconfig, keywords) @@ '''dd'':*'::tsquery))
         ->  Bitmap Index Scan on idx_keywords_ger  (cost=0.00..5.21 rows=25 width=0) (actual time=22.655..22.655 rows=123 loops=1)
               Index Cond: ((to_tsvector('german'::regconfig, keywords) @@ '''ee'':*'::tsquery) AND (to_tsvector('german'::regconfig, keywords) @@ '''dd'':*'::tsquery))
 Total runtime: 22.865 ms

Regards,
Oleg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [GENERAL] Incorrect FTS result with GIN index
Date: 2010-07-29 16:28:38
Message-ID: 14531.1280420918@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers

Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> writes:
> On Thu, 29 Jul 2010, Tom Lane wrote:
>> Yeah, that case works (though I think it's unnecessarily slow). The one
>> that gives the wrong answer is the equivalent form with two AND'ed @@
>> operators.

> hmm, that query works too :)

There may be some platform dependency involved --- in particular, you
wouldn't see the issue unless one keystream has two nonlossy TIDs on the
same page where the other one has a lossy TID, so it's going to depend on
the placement of heap rows. Anyway, I can reproduce it just by loading
the given dump, on both 8.4 and HEAD. Will work on a fix.
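
The triggering condition described here can be written down as a small check. This is a Python sketch under stated assumptions: keystreams are lists of (block, offset) pairs, offset 65535 marks a lossy whole-page entry, and `trigger_pages` is a hypothetical helper, not something that exists in PostgreSQL.

```python
from collections import Counter

LOSSY = 0xFFFF  # assumption: offset 65535 marks a lossy "whole page" entry


def trigger_pages(stream_a, stream_b):
    """Return the pages where one keystream has two or more nonlossy TIDs
    and the other keystream has a lossy entry."""

    def lossy_pages(stream):
        return {blk for blk, off in stream if off == LOSSY}

    def multi_exact_pages(stream):
        counts = Counter(blk for blk, off in stream if off != LOSSY)
        return {blk for blk, n in counts.items() if n >= 2}

    return sorted(
        (lossy_pages(stream_a) & multi_exact_pages(stream_b))
        | (lossy_pages(stream_b) & multi_exact_pages(stream_a))
    )


key_a = [(1085, 4), (1086, LOSSY), (1087, 4)]
key_b = [(1083, 11), (1086, 6), (1086, 10), (1087, 10)]
print(trigger_pages(key_a, key_b))
```

For the two sequences from the earlier message this reports page 1086. A copy of the table whose heap rows land on different pages may report nothing, which would fit the differing results between machines.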

regards, tom lane