Re: Excessive memory used for INSERT

Lists: pgsql-performance
From: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
To: pgsql-performance(at)postgresql(dot)org
Subject: Excessive memory used for INSERT
Date: 2014-12-17 15:14:09
Message-ID: 2163931.4eVnXNjHa3@snow.oma.be

Hi,

Software and hardware running postgresql are:
- postgresql92-9.2.3-1.1.1.x86_64
- openSuSE 12.3 x86_64
- 16 GB of RAM
- 2 GB of swap
- 8-core Intel(R) Xeon(R) CPU E5-2407 0 @ 2.20GHz
- ext4 filesystem held on a hardware Dell PERC H710 RAID10 with 4x4TB SATA HDs.
- 2 GB of RAM are reserved for a virtual machine.

The single database used was created with:
CREATE FUNCTION msg_function() RETURNS trigger
    LANGUAGE plpgsql
    AS $_$
DECLARE
    _tablename text;
    _date text;
    _slot timestamp;
BEGIN
    _slot := NEW.slot;
    _date := to_char(_slot, 'YYYY-MM-DD');
    _tablename := 'MSG_' || _date;
    PERFORM 1 FROM pg_catalog.pg_class c
        JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
        WHERE c.relkind = 'r' AND c.relname = _tablename AND n.nspname = 'public';
    IF NOT FOUND THEN
        EXECUTE 'CREATE TABLE public.' || quote_ident(_tablename) || ' ( ) INHERITS (public.MSG)';
        EXECUTE 'ALTER TABLE public.' || quote_ident(_tablename) || ' OWNER TO seviri';
        EXECUTE 'GRANT ALL ON TABLE public.' || quote_ident(_tablename) || ' TO seviri';
        EXECUTE 'ALTER TABLE ONLY public.' || quote_ident(_tablename) || ' ADD CONSTRAINT '
            || quote_ident(_tablename || '_pkey') || ' PRIMARY KEY (slot,msg)';
    END IF;
    EXECUTE 'INSERT INTO public.' || quote_ident(_tablename) || ' VALUES ($1.*)' USING NEW;
    RETURN NULL;
END;
$_$;

CREATE TABLE msg (
slot timestamp(0) without time zone NOT NULL,
msg integer NOT NULL,
hrv bytea,
vis006 bytea,
vis008 bytea,
ir_016 bytea,
ir_039 bytea,
wv_062 bytea,
wv_073 bytea,
ir_087 bytea,
ir_097 bytea,
ir_108 bytea,
ir_120 bytea,
ir_134 bytea,
pro bytea,
epi bytea,
clm bytea,
tape character varying(10)
);

Basically, this database consists of daily tables with the date stamp appended to their names, i.e. MSG_YYYY-MM-DD, plus a global table MSG linked to them by inheritance, which allows listing all the records.

A cron script performing a single insert (an upsert, see the log excerpt below) runs every 15 minutes and has never had any issue.
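The upsert issued every 15 minutes follows the writable-CTE pattern visible in the log excerpt below; reconstructed here and abridged to a few of the columns for readability, it looks like:

```sql
-- Try the UPDATE first; if it matched nothing, fall through to the INSERT.
WITH upsert AS (
    UPDATE msg SET (slot, msg, hrv, tape)
         = (to_timestamp('201212032145', 'YYYYMMDDHH24MI'), 2, '\xffffff', 'LTO5_020')
     WHERE slot = to_timestamp('201212032145', 'YYYYMMDDHH24MI') AND msg = 2
    RETURNING *
)
INSERT INTO msg (slot, msg, hrv, tape)
SELECT to_timestamp('201212032145', 'YYYYMMDDHH24MI'), 2, '\xffffff', 'LTO5_020'
WHERE NOT EXISTS (SELECT * FROM upsert);
```

Note that both halves address the parent table msg; the BEFORE trigger then redirects the row to the proper daily child table.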

However, I also need to submit historical records. This is achieved by a bash script that parses a text file and builds INSERT commands, which are submitted 10 at a time to the database using psql through a temp file wrapped in a BEGIN; ...; COMMIT; block. When running this script, I noticed that the INSERT subprocess can reach around 4 GB of memory in htop (see attached screenshot). After a while, the script inevitably crashes with the following messages:
psql:/tmp/tmp.a0ZrivBZhD:10: connection to server was lost
Could not submit SQL request file /tmp/tmp.a0ZrivBZhD to database

and the associated entries in the log:
2014-12-15 17:54:07 GMT LOG: server process (PID 21897) was terminated by signal 9: Killed
2014-12-15 17:54:07 GMT DETAIL: Failed process was running: WITH upsert AS (update MSG set (slot,MSG,HRV,VIS006,VIS008,IR_016,IR_039,WV_062,WV_073,IR_087,IR_097,IR_108,IR_120,IR_134,PRO,EPI,CLM,TAPE) = (to_timestamp('201212032145', 'YYYYMMDDHH24MI'),2,'\xffffff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\x01','\x01','\x7f','LTO5_020') where slot=to_timestamp('201212032145', 'YYYYMMDDHH24MI') and MSG=2 RETURNING *) insert into MSG (slot,MSG,HRV,VIS006,VIS008,IR_016,IR_039,WV_062,WV_073,IR_087,IR_097,IR_108,IR_120,IR_134,PRO,EPI,CLM,TAPE) select to_timestamp('201212032145', 'YYYYMMDDHH24MI'),2,'\xffffff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\x01','\x01','\x7f','LTO5_020' WHERE NOT EXISTS (SELECT * FROM upsert);
2014-12-15 17:54:07 GMT LOG: terminating any other active server processes
2014-12-15 17:54:07 GMT WARNING: terminating connection because of crash of another server process
2014-12-15 17:54:07 GMT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2014-12-15 17:54:07 GMT HINT: In a moment you should be able to reconnect to the database and repeat your command.
2014-12-15 17:54:07 GMT seviri seviri WARNING: terminating connection because of crash of another server process
2014-12-15 17:54:07 GMT seviri seviri DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2014-12-15 17:54:07 GMT seviri seviri HINT: In a moment you should be able to reconnect to the database and repeat your command.
2014-12-15 17:54:07 GMT LOG: all server processes terminated; reinitializing
2014-12-15 17:54:08 GMT LOG: database system was interrupted; last known up at 2014-12-15 17:49:38 GMT
2014-12-15 17:54:08 GMT LOG: database system was not properly shut down; automatic recovery in progress
2014-12-15 17:54:08 GMT LOG: redo starts at 0/58C1C060
2014-12-15 17:54:08 GMT LOG: record with zero length at 0/58C27950
2014-12-15 17:54:08 GMT LOG: redo done at 0/58C27920
2014-12-15 17:54:08 GMT LOG: last completed transaction was at log time 2014-12-15 17:53:33.898086+00
2014-12-15 17:54:08 GMT LOG: autovacuum launcher started
2014-12-15 17:54:08 GMT LOG: database system is ready to accept connections

My postgresql.conf contains the following modified parameters:
listen_addresses = '*'
max_connections = 100
shared_buffers = 96MB # increased from the default of 24MB, because the script was failing in the beginning

Attachment Content-Type Size
image/png 26.9 KB

From: Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>
To: pgsql-performance(at)postgresql(dot)org
Cc: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-17 15:26:32
Message-ID: 5491A0A8.7010506@toco-domains.de

Hello Alessandro,

> 2014-12-15 17:54:07 GMT DETAIL: Failed process was running: WITH upsert
> AS (update MSG set
> (slot,MSG,HRV,VIS006,VIS008,IR_016,IR_039,WV_062,WV_073,IR_087,IR_097,IR_108,IR_120,IR_134,PRO,EPI,CLM,TAPE)
> = (to_timestamp('201212032145',
> 'YYYYMMDDHH24MI'),2,'\xffffff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\x01','\x01','\x7f','LTO5_020')
> where slot=to_timestamp('201212032145', 'YYYYMMDDHH24MI') and MSG=2
> RETURNING *) insert into MSG
> (slot,MSG,HRV,VIS006,VIS008,IR_016,IR_039,WV_062,WV_073,IR_087,IR_097,IR_108,IR_120,IR_134,PRO,EPI,CLM,TAPE)
> select to_timestamp('201212032145',
> 'YYYYMMDDHH24MI'),2,'\xffffff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\xff','\x01','\x01','\x7f','LTO5_020'
> WHERE NOT EXISTS (SELECT * FROM upsert);

How many rows does "(SELECT * FROM upsert)" return? Without knowing
more, I would guess that the result set is very big and that this could be
the reason for the memory usage.

I would add a WHERE clause to reduce the result set (a correct index
can speed this method up even more).

Greetings,
Torsten


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>
Cc: pgsql-performance(at)postgresql(dot)org, Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-17 15:41:31
Message-ID: 7359.1418830891@sss.pgh.pa.us

Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de> writes:
> How many rows is "(SELECT * FROM upsert)" returning? Without knowing
> more i would guess, that the result-set is very big and that could be
> the reason for the memory usage.

Result sets are not ordinarily accumulated on the server side.

Alessandro didn't show the trigger definition, but my guess is that it's
an AFTER trigger, which means that a trigger event record is accumulated
in server memory for each inserted/updated row. If you're trying to
update a huge number of rows in one command (or one transaction, if it's
a DEFERRED trigger) you'll eventually run out of memory for the event
queue.

An easy workaround is to make it a BEFORE trigger instead. This isn't
really nice from a theoretical standpoint; but as long as you make sure
there are no other BEFORE triggers that might fire after it, it'll work
well enough.
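In trigger-definition terms the difference is just the timing keyword; a sketch, assuming the msg_function() from the original post:

```sql
-- An AFTER row trigger queues one event per affected row in backend
-- memory until end of statement (or end of transaction, if DEFERRED):
--   CREATE TRIGGER msg_trigger AFTER INSERT ON msg
--       FOR EACH ROW EXECUTE PROCEDURE msg_function();

-- A BEFORE row trigger fires immediately and queues nothing:
CREATE TRIGGER msg_trigger BEFORE INSERT ON msg
    FOR EACH ROW EXECUTE PROCEDURE msg_function();
```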

Alternatively, you might want to reconsider the concept of updating
hundreds of millions of rows in a single operation ...

regards, tom lane


From: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
To: Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-17 16:57:51
Message-ID: 35486133.28l5a59Zd7@snow.oma.be

Hi Torsten,

Thanks for your answer.

I have modified
(SELECT * FROM upsert)
to
(SELECT * FROM upsert WHERE slot=to_timestamp('201212032145', 'YYYYMMDDHH24MI') and MSG=2)
according to your suggestion to reduce the result-set to a single row. However, the INSERT process is still consuming the same amount of RAM.

Regards,

Alessandro.

On Wednesday 17 December 2014 16:26:32 Torsten Zuehlsdorff wrote:
> Hello Alessandro,
>
> > 2014-12-15 17:54:07 GMT DETAIL: Failed process was running: WITH upsert
> > AS (update MSG set
> > (slot,MSG,HRV,VIS006,VIS008,IR_016,IR_039,WV_062,WV_073,IR_087,IR_097,IR_1
> > 08,IR_120,IR_134,PRO,EPI,CLM,TAPE) = (to_timestamp('201212032145',
> > 'YYYYMMDDHH24MI'),2,'\xffffff','\xff','\xff','\xff','\xff','\xff','\xff','
> > \xff','\xff','\xff','\xff','\xff','\x01','\x01','\x7f','LTO5_020') where
> > slot=to_timestamp('201212032145', 'YYYYMMDDHH24MI') and MSG=2 RETURNING
> > *) insert into MSG
> > (slot,MSG,HRV,VIS006,VIS008,IR_016,IR_039,WV_062,WV_073,IR_087,IR_097,IR_1
> > 08,IR_120,IR_134,PRO,EPI,CLM,TAPE) select to_timestamp('201212032145',
> > 'YYYYMMDDHH24MI'),2,'\xffffff','\xff','\xff','\xff','\xff','\xff','\xff','
> > \xff','\xff','\xff','\xff','\xff','\x01','\x01','\x7f','LTO5_020' WHERE
> > NOT EXISTS (SELECT * FROM upsert);
>
> How many rows is "(SELECT * FROM upsert)" returning? Without knowing
> more i would guess, that the result-set is very big and that could be
> the reason for the memory usage.
>
> I would add an WHERE clause to reduce the result-set (an correct index
> can fasten this method even more).
>
> Greetings,
> Torsten


From: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-17 17:04:42
Message-ID: 3672040.FVo2HxYOvt@snow.oma.be

Hi,

My trigger definition is
CREATE TRIGGER msg_trigger BEFORE INSERT ON msg FOR EACH ROW EXECUTE PROCEDURE msg_function();
so it seems that it is a BEFORE trigger.

To be totally honest, I have "really" limited knowledge of SQL and PostgreSQL, and all of this was gathered from recipes found on the web...

Regards,

Alessandro.

On Wednesday 17 December 2014 10:41:31 Tom Lane wrote:
> Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de> writes:
> > How many rows is "(SELECT * FROM upsert)" returning? Without knowing
> > more i would guess, that the result-set is very big and that could be
> > the reason for the memory usage.
>
> Result sets are not ordinarily accumulated on the server side.
>
> Alessandro didn't show the trigger definition, but my guess is that it's
> an AFTER trigger, which means that a trigger event record is accumulated
> in server memory for each inserted/updated row. If you're trying to
> update a huge number of rows in one command (or one transaction, if it's
> a DEFERRED trigger) you'll eventually run out of memory for the event
> queue.
>
> An easy workaround is to make it a BEFORE trigger instead. This isn't
> really nice from a theoretical standpoint; but as long as you make sure
> there are no other BEFORE triggers that might fire after it, it'll work
> well enough.
>
> Alternatively, you might want to reconsider the concept of updating
> hundreds of millions of rows in a single operation ...
>
> regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
Cc: Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-17 17:49:03
Message-ID: 4171.1418838543@sss.pgh.pa.us

Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be> writes:
> My dtrigger definition is
> CREATE TRIGGER msg_trigger BEFORE INSERT ON msg FOR EACH ROW EXECUTE PROCEDURE msg_function();
> so it seems that it is a BEFORE trigger.

Hm, no AFTER triggers anywhere? Are there foreign keys, perhaps?

regards, tom lane


From: Torsten Förtsch <torsten(dot)foertsch(at)gmx(dot)net>
To: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-18 07:51:47
Message-ID: 54928793.5000907@gmx.net

On 17/12/14 16:14, Alessandro Ipe wrote:
> 2014-12-15 17:54:07 GMT LOG: server process (PID 21897) was terminated
> by signal 9: Killed

since it was killed by SIGKILL, maybe it's the kernel's OOM killer?

Torsten


From: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-18 11:16:49
Message-ID: 6645627.5dTte5MPLk@snow.oma.be

Hi,

A grep in a nightly dump of this database did not return any AFTER trigger.
The only keys are primary keys: on each daily table, added through
ADD CONSTRAINT "MSG_YYYY-MM-DD_pkey" PRIMARY KEY (slot, msg);
and on the global table
ADD CONSTRAINT msg_pkey PRIMARY KEY (slot, msg);

Regards,

A.

On Wednesday 17 December 2014 12:49:03 Tom Lane wrote:
> Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be> writes:
> > My dtrigger definition is
> > CREATE TRIGGER msg_trigger BEFORE INSERT ON msg FOR EACH ROW EXECUTE
> > PROCEDURE msg_function(); so it seems that it is a BEFORE trigger.
>
> Hm, no AFTER triggers anywhere? Are there foreign keys, perhaps?
>
> regards, tom lane


From: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
To: Torsten Förtsch <torsten(dot)foertsch(at)gmx(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-18 15:42:23
Message-ID: 6002240.FouBPN158i@snow.oma.be

On Thursday 18 December 2014 08:51:47 Torsten Förtsch wrote:
> On 17/12/14 16:14, Alessandro Ipe wrote:
> > 2014-12-15 17:54:07 GMT LOG: server process (PID 21897) was terminated by signal 9: Killed
>
> since it was killed by SIGKILL, maybe it's the kernel's OOM killer?

Indeed, and this hopefully prevented postgresql from crashing my whole system through
RAM exhaustion. But the problem remains: why does an INSERT require that huge an
amount of memory?

Regards,

A.


From: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
To: pgsql-performance(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-18 16:31:46
Message-ID: 6053205.R7VxckQEVY@snow.oma.be

Hi,

I tried also with an upsert function:
CREATE FUNCTION upsert_func(sql_insert text, sql_update text) RETURNS void
    LANGUAGE plpgsql
    AS $$
BEGIN
    EXECUTE sql_update;
    IF FOUND THEN
        RETURN;
    END IF;
    BEGIN
        EXECUTE sql_insert;
    EXCEPTION WHEN OTHERS THEN
        EXECUTE sql_update;
    END;
    RETURN;
END;
$$;
with the same result on the memory used...
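It is called with the full INSERT and UPDATE statements passed as text, for instance (hypothetical values, shortened to a few columns):

```sql
SELECT upsert_func(
    $$INSERT INTO msg (slot, msg, tape)
      VALUES (to_timestamp('201212032145', 'YYYYMMDDHH24MI'), 2, 'LTO5_020')$$,
    $$UPDATE msg SET tape = 'LTO5_020'
      WHERE slot = to_timestamp('201212032145', 'YYYYMMDDHH24MI') AND msg = 2$$
);
```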

The tables hold 355000 rows in total.

Regards,

A.

On Thursday 18 December 2014 12:16:49 Alessandro Ipe wrote:
> Hi,
>
>
> A grep in a nightly dump of this database did not return any AFTER trigger.
> The only keys are primary on each daily table, through
> ADD CONSTRAINT "MSG_YYYY-MM-DD_pkey" PRIMARY KEY (slot, msg);
> and on the global table
> ADD CONSTRAINT msg_pkey PRIMARY KEY (slot, msg);
>
>
> Regards,
>
>
> A.
>
> On Wednesday 17 December 2014 12:49:03 Tom Lane wrote:
> > Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be> writes:
> > > My dtrigger definition is
> > > CREATE TRIGGER msg_trigger BEFORE INSERT ON msg FOR EACH ROW EXECUTE
> > > PROCEDURE msg_function(); so it seems that it is a BEFORE trigger.
> >
> > Hm, no AFTER triggers anywhere? Are there foreign keys, perhaps?
> >
> > regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
Cc: pgsql-performance(at)postgresql(dot)org, Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-18 17:05:45
Message-ID: 17790.1418922345@sss.pgh.pa.us

Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be> writes:
> Hi,
> I tried also with an upsert function
> CREATE FUNCTION upsert_func(sql_insert text, sql_update text) RETURNS void
> LANGUAGE plpgsql
> AS $$
> BEGIN
> EXECUTE sql_update;
> IF FOUND THEN
> RETURN;
> END IF;
> BEGIN
> EXECUTE sql_insert;
> EXCEPTION WHEN OTHERS THEN
> EXECUTE sql_update;
> END;
> RETURN;
> END;
> $$;
> with the same result on the memory used...

If you want to provide a self-contained test case, possibly we could look
into it, but these fragmentary bits of what you're doing don't really
constitute an investigatable problem statement.

I will note that EXCEPTION blocks aren't terribly cheap, so if you're
reaching the "EXECUTE sql_insert" a lot of times that might have something
to do with it.

regards, tom lane


From: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org, Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-18 17:31:42
Message-ID: 2870224.f0CdzmR1Qa@snow.oma.be

Hi,

I can send a full dump of my database (< 2MB) if that is OK with you.

Thanks,

A.

On Thursday 18 December 2014 12:05:45 Tom Lane wrote:
> Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be> writes:
> > Hi,
> > I tried also with an upsert function
> > CREATE FUNCTION upsert_func(sql_insert text, sql_update text) RETURNS void
> > LANGUAGE plpgsql
> > AS $$
> > BEGIN
> > EXECUTE sql_update;
> > IF FOUND THEN
> > RETURN;
> > END IF;
> > BEGIN
> > EXECUTE sql_insert;
> > EXCEPTION WHEN OTHERS THEN
> > EXECUTE sql_update;
> > END;
> > RETURN;
> > END;
> > $$;
> > with the same result on the memory used...
>
> If you want to provide a self-contained test case, possibly we could look
> into it, but these fragmentary bits of what you're doing don't really
> constitute an investigatable problem statement.
>
> I will note that EXCEPTION blocks aren't terribly cheap, so if you're
> reaching the "EXECUTE sql_insert" a lot of times that might have something
> to do with it.
>
> regards, tom lane


From: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org, Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-23 11:56:03
Message-ID: 6006964.iXUoRzG9pV@snow.oma.be

Hi,

I guess the memory consumption depends on the size of my database, so
providing only a reduced version of it won't allow the issue to be reproduced.

The pg_dumpall file of my database can be found at the address
https://gerb.oma.be/owncloud/public.php?service=files&t=5e0e9e1bb06dce1d12c95662a9ee1c03

The queries causing the issue are given in files
- tmp.OqOavPYbHa (with the new upsert_func function)
- tmp.f60wlgEDWB (with WITH .. AS statement)

I hope it will help. Thanks.

Regards,

A.

On Thursday 18 December 2014 12:05:45 Tom Lane wrote:
> Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be> writes:
> > Hi,
> > I tried also with an upsert function
> > CREATE FUNCTION upsert_func(sql_insert text, sql_update text) RETURNS void
> > LANGUAGE plpgsql
> > AS $$
> > BEGIN
> > EXECUTE sql_update;
> > IF FOUND THEN
> > RETURN;
> > END IF;
> > BEGIN
> > EXECUTE sql_insert;
> > EXCEPTION WHEN OTHERS THEN
> > EXECUTE sql_update;
> > END;
> > RETURN;
> > END;
> > $$;
> > with the same result on the memory used...
>
> If you want to provide a self-contained test case, possibly we could look
> into it, but these fragmentary bits of what you're doing don't really
> constitute an investigatable problem statement.
>
> I will note that EXCEPTION blocks aren't terribly cheap, so if you're
> reaching the "EXECUTE sql_insert" a lot of times that might have something
> to do with it.
>
> regards, tom lane

Attachment Content-Type Size
tmp.OqOavPYbHa text/plain 6.9 KB
tmp.f60wlgEDWB text/plain 7.9 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
Cc: pgsql-performance(at)postgresql(dot)org, Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>
Subject: Re: Excessive memory used for INSERT
Date: 2014-12-23 20:27:41
Message-ID: 5141.1419366461@sss.pgh.pa.us

Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be> writes:
> I guess the memory consumption is depending on the size of my database, so
> only giving a reduced version of it won't allow to hit the issue.

> The pg_dumpall file of my database can be found at the address
> https://gerb.oma.be/owncloud/public.php?service=files&t=5e0e9e1bb06dce1d12c95662a9ee1c03

> The queries causing the issue are given in files
> - tmp.OqOavPYbHa (with the new upsert_func function)
> - tmp.f60wlgEDWB (with WITH .. AS statement)

Well, the core of the problem here is that you've chosen to partition the
MSG table at an unreasonably small grain: it's got 3711 child tables and
it looks like you plan to add another one every day. For forty-some
megabytes worth of data, I'd have said you shouldn't be partitioning at
all; for sure you shouldn't be partitioning like this. PG's inheritance
mechanisms are only meant to cope with order-of-a-hundred child tables at
most. Moreover, the only good reason to partition is if you want to do
bulk data management by, say, dropping the oldest partition every so
often. It doesn't look like you're planning to do that at all, and I'm
sure if you do, you don't need 1-day granularity of the drop.

I'd recommend you either dispense with partitioning entirely (which would
simplify your life a great deal, since you'd not need all this hacky
partition management code), or scale it back to something like one
partition per year.

Having said that, it looks like the reason for the memory bloat is O(N^2)
space consumption in inheritance_planner() while trying to plan the
"UPDATE msg SET" commands. We got rid of a leading term in that
function's space consumption for many children awhile ago, but it looks
like you've found the next largest term :-(. I might be able to do
something about that. In the meantime, if you want to stick with this
partitioning design, couldn't you improve that code so the UPDATE is
only applied to the one child table it's needed for?
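
That is, since the slot timestamp determines the daily table, the loading script could target that child directly instead of the parent; a sketch with a hypothetical date, shortened to a few columns:

```sql
-- Addressing the single daily partition means the planner touches one
-- table rather than considering all of the thousands of children of msg:
WITH upsert AS (
    UPDATE public."MSG_2012-12-03" SET tape = 'LTO5_020'
     WHERE slot = to_timestamp('201212032145', 'YYYYMMDDHH24MI') AND msg = 2
    RETURNING *
)
INSERT INTO public."MSG_2012-12-03" (slot, msg, tape)
SELECT to_timestamp('201212032145', 'YYYYMMDDHH24MI'), 2, 'LTO5_020'
WHERE NOT EXISTS (SELECT * FROM upsert);
```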

regards, tom lane


From: Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org, Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>
Subject: Re: Excessive memory used for INSERT
Date: 2015-01-06 17:40:05
Message-ID: 2913727.pH4N6BDu2g@snow.oma.be

Hi,

Doing the UPDATE on the child table (provided that the table already exists), as
you recommended, solved my memory consumption issue entirely.

Thanks a lot,

Alessandro.

On Tuesday 23 December 2014 15:27:41 Tom Lane wrote:
> Alessandro Ipe <Alessandro(dot)Ipe(at)meteo(dot)be> writes:
> > I guess the memory consumption is depending on the size of my database, so
> > only giving a reduced version of it won't allow to hit the issue.
> >
> > The pg_dumpall file of my database can be found at the address
> > https://gerb.oma.be/owncloud/public.php?service=files&t=5e0e9e1bb06dce1d12c95662a9ee1c03
> >
> > The queries causing the issue are given in files
> > - tmp.OqOavPYbHa (with the new upsert_func function)
> > - tmp.f60wlgEDWB (with WITH .. AS statement)
>
> Well, the core of the problem here is that you've chosen to partition the
> MSG table at an unreasonably small grain: it's got 3711 child tables and
> it looks like you plan to add another one every day. For forty-some
> megabytes worth of data, I'd have said you shouldn't be partitioning at
> all; for sure you shouldn't be partitioning like this. PG's inheritance
> mechanisms are only meant to cope with order-of-a-hundred child tables at
> most. Moreover, the only good reason to partition is if you want to do
> bulk data management by, say, dropping the oldest partition every so
> often. It doesn't look like you're planning to do that at all, and I'm
> sure if you do, you don't need 1-day granularity of the drop.
>
> I'd recommend you either dispense with partitioning entirely (which would
> simplify your life a great deal, since you'd not need all this hacky
> partition management code), or scale it back to something like one
> partition per year.
>
> Having said that, it looks like the reason for the memory bloat is O(N^2)
> space consumption in inheritance_planner() while trying to plan the
> "UPDATE msg SET" commands. We got rid of a leading term in that
> function's space consumption for many children awhile ago, but it looks
> like you've found the next largest term :-(. I might be able to do
> something about that. In the meantime, if you want to stick with this
> partitioning design, couldn't you improve that code so the UPDATE is
> only applied to the one child table it's needed for?
>
> regards, tom lane