From: | Kadri Raudsepp <kadri(dot)raudsepp(at)nordicgaming(dot)com> |
---|---|
To: | pgsql-admin(at)postgresql(dot)org |
Subject: | Column missing from pg_statistics |
Date: | 2014-01-10 14:31:48 |
Message-ID: | CAJ_Sy+yzsbj6QNCQMyuh+PsnnHYUMJ+1_9La4=U9XXryeZsmbg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin pgsql-hackers |
Hello,
I'm DBA for several quite large postgres databases on debian servers.
Current version:
PostgreSQL 9.2.6 on x86_64-unknown-linux-gnu, compiled by gcc (Debian
4.7.2-5) 4.7.2, 64-bit
but the problem I'm about to describe was discovered before upgrade, while
running v9.2.4.
I have set up a cron-job that sends me daily reports on bloat amount in
tables and indices, which I calculate using pg_stats, not pgstattuple, for
performance and I/O reasons. If the bloat amount or percentage are big
enough, I use pg_repack to get rid of it. At some point I noticed, that
some tables keep showing up in the reports with the same amount of bloat,
which pg_repack was seemingly unable to remove. Investigation showed that
pgstattuple gave very different results than my bloat-finding query.
Reason - for some tables there are some columns that never show up in
pg_statistics.
I'm going to give an example of a table that has this issue. I have
vacuumed it, analyzed it, used "alter column set statistic" specifically on
the missing column - no results. I hope someone has encountered this issue
before or at least could give me some new ideas, why this is or how to fix
it.
Here's the example:
# \d+ d3s_log
Table
"public.d3s_log"
Column | Type |
Modifiers | Storage | Stats target | Description
---------------+-----------------------------+------------------------------------------------------+----------+--------------+-------------
id | bigint | not null default
nextval('d3s_log_id_seq'::regclass) | plain | |
users_id | integer | not
null | plain |
|
bet_id | integer | not
null | plain |
|
bet_state | integer
| | plain
| |
d3s_trans_id | character varying(255) | not
null | extended |
|
operation | d3s_operation_enum | not
null | plain |
|
created | timestamp without time zone | not null default
now() | plain | |
success | boolean | not
null | plain |
|
response_time | timestamp without time zone
| | plain
| |
path | character varying(255) | default NULL::character
varying | extended | |
params | text
| | extended
| |
request | text
| | extended |
1000 |
response | text
| | extended
| |
Indexes:
"d3s_log_pkey" PRIMARY KEY, btree (id)
"d3s_log_bet_id_operation_success" btree (bet_id, operation, success)
"d3s_log_bets_id_index" btree (bet_id)
"d3s_log_created" btree (created)
"d3s_log_users_id_index" btree (users_id)
Has OIDs: no
# SELECT s.staattnum, a.attname FROM pg_statistic s INNER JOIN pg_attribute
a ON (a.attrelid,a.attnum)=(s.starelid,s.staattnum) WHERE starelid =
'd3s_log'::regclass;
staattnum | attname
-----------+---------------
1 | id
2 | users_id
3 | bet_id
4 | bet_state
5 | d3s_trans_id
6 | operation
7 | created
8 | success
9 | response_time
10 | path
11 | params
13 | response
(12 rows)
## As can be seen, column 12, "request", is missing.
# select tablename, attname, null_frac, avg_width from pg_stats where
tablename = 'd3s_log';
tablename | attname | null_frac | avg_width
-----------+---------------+------------+-----------
d3s_log | id | 0 | 8
d3s_log | users_id | 0 | 4
d3s_log | bet_id | 0 | 4
d3s_log | bet_state | 1 | 4
d3s_log | d3s_trans_id | 0 | 23
d3s_log | operation | 0 | 4
d3s_log | created | 0 | 8
d3s_log | success | 0 | 1
d3s_log | response_time | 0 | 8
d3s_log | path | 0.00303333 | 48
d3s_log | params | 0 | 180
d3s_log | response | 0.00123333 | 456
(12 rows)
## Again, "request" is missing, but there were small changes in avg_width
for other columns after analyze, so some statistics calculation seems to be
working.
The difference between pgstattuple and calculated bloat is 6.5 GB out of 15
GB total table size, which is approximately the amount of data stored in
the extended "request" column.
I also did pg_dump of this table and restored it on another server, running
v9.2.4 - no errors, analyze calculates statistics, but "request" is still
missing there, too.
I have also checked other tables with the same problem - the common thing
seems to be that all the missing columns are extended, type text or
varchar, but I'm not 100% sure, haven't checked them all yet. The size of
the missing column is not a common thing - on some tables the missing
column is almost empty.
Looking forward to your answers and ideas,
Katie
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2014-01-10 16:15:14 | Re: [ADMIN] Column missing from pg_statistics |
Previous Message | Renato Forti | 2014-01-09 16:30:38 | RES: RES: Remote Backup (pg_dump_all) Windows/Linux. |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2014-01-10 14:49:35 | Re: [PATCH] Negative Transition Aggregate Functions (WIP) |
Previous Message | Merlin Moncure | 2014-01-10 14:10:00 | Re: array_length(anyarray) |