Re: One source of constant annoyance identified

Lists: pgsql-general
From: "Markus Wollny" <Markus(dot)Wollny(at)computec(dot)de>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: One source of constant annoyance identified
Date: 2002-07-03 08:34:53
Message-ID: 2266D0630E43BB4290742247C8910575014CE2C6@dozer.computec.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hello!

Thank you very much for your efforts - we appreciate that very much :)

> -----Ursprüngliche Nachricht-----
> Von: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Gesendet: Mittwoch, 3. Juli 2002 00:28
> An: Markus Wollny
> Cc: pgsql-general(at)postgresql(dot)org
> Betreff: Re: [GENERAL] One source of constant annoyance identified
>
>
> "Markus Wollny" <Markus(dot)Wollny(at)computec(dot)de> writes:
> > Sorry I took so long - I attached the schema as asked.
>
> Thanks. But I'm still unable to reproduce the memory bloat you see on
> SELECTs. This seems very peculiar. You said you were
> running SuSE 7.3
> --- how recent is that? Which glibc version is it running?
> (I've been
> reduced to speculating about memory leakage inside libc, which is a
> pretty long shot but...)

I agree - it is a long shot: SuSE 7.3 has got Kernel: 2.4.10 and glibc:
2.2.4; it was released in October 2001. I tried using SuSE 8.0, released
in late April this year, but I was more than unhappy with some of the
new "features" - and it seems that hardly anybody has switched to 8.0
for server-usage as yet. Generally SuSE 7.3 (which is probably by far
the most popular distro in Germany) is considered quite stable and
current enough for server-usage. Would it really be worth the hassle
updating glibc to 2.2.5?

> > I took a quick look at top: Even this humble query causes
> memory- and
> > processor-load like a giant: 266M RAM, 38.3% processor time, 26.4%
> > memory usage. Okay, it's calling the trigger for each row
> which in turn
> > inserts some new tuples into ct_com_board_fti, but is it expected to
> > cause so much load?
>
> Wouldn't surprise me. Since you're using an AFTER trigger,
> the pending
> trigger events have to be saved up for commit time, so the list of
> pending events is going to grow quite large.

Okay, so there are indeed situations when this kind of backend size can
be considered normal. That's some sort of relief :)

> (How many rows do you have in ct_com_board_message, anyway?
> How many did that query try to
> update?) This however does not explain your problem with
> SELECT, since
> selects don't fire triggers.

Currently there are 362,154 rows in ct_com_board_message and 85,101 rows
in ct_com_user. I don't know if this can be considered a lot; we will
expect that to grow at an accelerating rate during the next months, so
more than a million can be expected within the next 12 months or so. We
scarcely use any database-specific features like triggers as yet, it's
99.99% pure SQL, tables, indexes, data, selects, inserts, updates,
deletes, no "fancy stuff" as yet. We'll have to get to know PostgreSQL
better before using anything specific; we never needed to get too deep
into Oracle either, which was an advantage when porting the whole thing
over to PostgreSQL, but it seems that the latter calls for a bit more
attention and know-how...

> Could I see the output of EXPLAIN for that problem SELECT on your
> machine?

We are currently working on a new version for the search I gave you
before, but here's another one that's causing similar symptoms - extra
long running time and most probably huge backends (it's 10:20 am, far
from peak time, and we already have ~280MB swap and six backends
>100MB):

This one generates an overview over all the forum-threads in one board
which has 41,624 messages, 2971 of them are FATHER_ID=0, so
thread-starters:

select MESSAGE.BOARD_ID
, MESSAGE.THREAD_ID
, MESSAGE.FATHER_ID
, MESSAGE.MESSAGE_ID
, MESSAGE.USER_ID
, MESSAGE.USER_LOGIN
as LOGIN
, MESSAGE.USER_STATUS
as STATUS
, MESSAGE.USER_RIGHTS
as RIGHTS
, MESSAGE.TITLE
, MESSAGE.COUNT_REPLY
as COUNT_REPLY
,
to_char(MESSAGE.LAST_REPLY,'DD.MM.YY hh24:mi') as LAST_REPLY
,
round((date_part('epoch',CURRENT_TIMESTAMP)-date_part('epoch',MESSAGE.LA
ST_REPLY))/60) as diff_posting
,
to_char(MESSAGE.CREATED,'DD.MM.YY hh24:mi') as DATUM

, (select count(*) from
CT_COM_USER_THREAD_FOLLOW where USER_ID= '295798' and
thread_id=MESSAGE.THREAD_ID) as TFUID

from CT_COM_BOARD_MESSAGE MESSAGE
where (0=0)

and MESSAGE.BOARD_ID = 10
and MESSAGE.FATHER_ID = 0
and MESSAGE.STATE_ID = 0
order by MESSAGE.LAST_REPLY desc

Sort (cost=30695.27..30695.27 rows=7693 width=154) (actual
time=9745.94..9751.58 rows=4663 loops=1)
-> Index Scan using idx_bm_show_topics on ct_com_board_message
message (cost=0.00..30198.72 rows=7693 width=154) (actual
time=111.56..9549.99 rows=4663 loops=1)
SubPlan
-> Aggregate (cost=5.83..5.83 rows=1 width=0) (actual
time=0.91..0.91 rows=1 loops=4663)
-> Index Scan using idx_user_thread_follow on
ct_com_user_thread_follow (cost=0.00..5.83 rows=1 width=0) (actual
time=0.88..0.88 rows=0 loops=4663)
Total runtime: 9835.57 msec

I do hope you can make anything of this...

Regards,

Markus


From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: "Markus Wollny" <Markus(dot)Wollny(at)computec(dot)de>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: One source of constant annoyance identified
Date: 2002-07-03 19:09:24
Message-ID: 4gh6iugpqp26o91pbrd0kjks4n7415ui5a@4ax.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, 3 Jul 2002 10:34:53 +0200, "Markus Wollny"
<Markus(dot)Wollny(at)computec(dot)de> wrote:
>This one generates an overview over all the forum-threads in one board
>which has 41,624 messages, 2971 of them are FATHER_ID=0, so
^^^^
>thread-starters: [long SQL statement follows]

Markus, that's strange. Your explain says:
>Sort (cost=30695.27..30695.27 rows=7693 width=154) (actual
>time=9745.94..9751.58 rows=4663 loops=1)
^^^^
> -> Index Scan using idx_bm_show_topics on ct_com_board_message
>message (cost=0.00..30198.72 rows=7693 width=154) (actual
>time=111.56..9549.99 rows=4663 loops=1)
> SubPlan
> -> Aggregate (cost=5.83..5.83 rows=1 width=0) (actual
>time=0.91..0.91 rows=1 loops=4663)
> -> Index Scan using idx_user_thread_follow on
>ct_com_user_thread_follow (cost=0.00..5.83 rows=1 width=0) (actual
>time=0.88..0.88 rows=0 loops=4663)
>Total runtime: 9835.57 msec

Anyway, that's not my point here. I'd want you to do an EXPLAIN
ANALYZE of another SQL statement which does the same IMHO.

From the schema you posted I see that (user_id, thread_id) is the
primary key, or at least a unique index, in CT_COM_USER_THREAD_FOLLOW,
so the sub-select
>(select count(*)
> from CT_COM_USER_THREAD_FOLLOW
> where USER_ID= '295798' and thread_id=MESSAGE.THREAD_ID) as TFUID
can only give 0 or 1. So following my first rule of thumb "Avoid
subselects; use joins wherever possible" I'd write:

select MESSAGE.BOARD_ID
, MESSAGE.THREAD_ID
, MESSAGE.FATHER_ID
, MESSAGE.MESSAGE_ID
, MESSAGE.USER_ID
, MESSAGE.USER_LOGIN as LOGIN
, MESSAGE.USER_STATUS as STATUS
, MESSAGE.USER_RIGHTS as RIGHTS
, MESSAGE.TITLE
, MESSAGE.COUNT_REPLY as COUNT_REPLY
, to_char(MESSAGE.LAST_REPLY,'DD.MM.YY hh24:mi') as
LAST_REPLY
, round((date_part('epoch',CURRENT_TIMESTAMP) -
date_part('epoch',MESSAGE.LAST_REPLY))/60) as
diff_posting
, to_char(MESSAGE.CREATED,'DD.MM.YY hh24:mi') as DATUM
, count(TH.THREAD_ID) as TFUID
from CT_COM_BOARD_MESSAGE MESSAGE
left join CT_COM_USER_THREAD_FOLLOW TH
ON (TH.USER_ID='295798' and
TH.thread_id=MESSAGE.THREAD_ID)
where (0=0)
and MESSAGE.BOARD_ID = 10
and MESSAGE.FATHER_ID = 0
and MESSAGE.STATE_ID = 0
order by MESSAGE.LAST_REPLY desc;

Could you try to EXPLAIN ANALYZE this and post the result?
And please cross-check whether it returns the same result set as your
original query.

Servus
Manfred


From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: "Markus Wollny" <Markus(dot)Wollny(at)computec(dot)de>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: One source of constant annoyance identified
Date: 2002-07-04 06:56:04
Message-ID: m5s7iu8o63q84jrd5pm3r6qsaddnca12am@4ax.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, 03 Jul 2002 21:09:24 +0200, I wrote:
>select MESSAGE.BOARD_ID
> , [...]
> , count(TH.THREAD_ID) as TFUID

Oops! COUNT won't work unless you add GRAOUP BY <all other fields> at
the end of the query. I had
CASE WHEN th.thread_id IS NULL THEN 0 ELSE 1 END
here and thoughtlessly replaced it by COUNT(...) for brevity. How
foolish!

Servus
Manfred