From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Kenneth Marshall <ktm(at)rice(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: unique index violation after pg_upgrade to PG10 |
Date: | 2017-10-25 05:39:26 |
Message-ID: | CAH2-Wz=0s6zxoj4JEL+2dvXo4QyE4gif3OHYVQ-TBPhyR0BMCQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Oct 24, 2017 at 10:20 PM, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> I think you must have compared these:
Yes, I did. My mistake.
> On Tue, Oct 24, 2017 at 03:11:44PM -0500, Justin Pryzby wrote:
>> ts=# SELECT * FROM bt_page_items(get_raw_page('sites_idx', 1));
>>
>> itemoffset | 48
>> ctid | (1,37)
>> itemlen | 32
>> nulls | f
>> vars | t
>> data | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 0b 31 31 31 31 00 00 00 00 00 00
> ...
>> itemoffset | 37
>> ctid | (0,97)
>> itemlen | 24
>> nulls | f
>> vars | t
>> data | 1b 43 52 43 4c 4d 54 2d 43 45 4d 53 30 03 00 00
>
> ..but note those are both items in sites_idx (48 and 37, which I seem to have
> pasted out of order).. I included both because I'm not confident I know what
> the "index tid=(1,37)" referred to, but I gather it means item at offset=37
> (and not item with ctid=(1,37).)
>
> | [pryzbyj(at)database amcheck]$ psql --port 5678 ts -c "SELECT bt_index_check('sites_idx'::regclass::oid, heapallindexed=>True)"
> | ERROR: high key invariant violated for index "sites_idx"
> | DETAIL: Index tid=(1,37) points to heap tid=(0,97) page lsn=0/0.
This means that the item at (1,37) in the index is not <= the high
key, which is located at (1,1). (The high key comes first, despite
being an upper bound rather than a lower bound, per pageinspect docs.)
I find it suspicious that the page lsn is 0 here, btw.
> ts=# SELECT * FROM page_header(get_raw_page('sites_idx', 1));
> lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
> -----+----------+-------+-------+-------+---------+----------+---------+-----------
> 0/0 | 0 | 0 | 872 | 1696 | 8176 | 8192 | 4 | 0
>
> Here is its heap page:
>
> ts=# SELECT * FROM heap_page_items(get_raw_page('sites', 0)) WHERE lp=97;
> lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
> ----+--------+----------+--------+--------+----------+----------+--------+-------------+------------+--------+------------------+-------+--------------------------------------------
> 97 | 968 | 1 | 52 | 21269 | 33567444 | 0 | (3,27) | 8204 | 2307 | 32 | 1110100000010000 | | \x700000001b4352434c4d542d43454d5330030303
>
> Which I see ends with 0303 vs 0000..
Looks like I was accidentally right, then -- the heap and index do differ.
You might try this tool I published recently, to get a better sense of
details like this:
https://github.com/petergeoghegan/pg_hexedit
(Don't do so with a data directory that you cannot afford to corrupt
again, though!)
> Maybe this is relevant ?
> ts=# SELECT * FROM heap_page_items(get_raw_page('sites', 3)) WHERE lp=27;
> lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
> ----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------
> 27 | 0 | 0 | 0 | | | | | | | | | |
This looks like an LP_DEAD item.
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2017-10-25 05:53:49 | Re: Current int & float overflow checking is slow. |
Previous Message | Robert Haas | 2017-10-25 05:33:46 | Re: Current int & float overflow checking is slow. |