Re: Deleted WAL files held open by backends in Linux

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: Deleted WAL files held open by backends in Linux
Date: 2009-11-25 22:54:29
Message-ID: 4B0D6145020000250002CD53@gw.wicourts.gov
Lists: pgsql-hackers

I ran across the subject issue and spent some time puzzling over it.
It seemed strange that the only backends which were holding open
deleted WAL files were ones where the connection was established
with a login which has no write permissions. Eventually I vaguely
recalled that such an issue had been discussed in recent months and
found it in the archives. It was this:

http://archives.postgresql.org/pgsql-bugs/2009-09/msg00144.php

As a DBA I find this behavior irritating and a potential time waster.
I guess it is a stretch to imagine a database with enough read-only
connections to exhaust resources by holding open one deleted WAL
file each; they would need, say, 200 such connections, cutting
things so close that a wasted 3.2GB (200 x 16MB) of disk space at
the WAL file location would run them out.

I'm not sure whether Tom's comment that "There is zero hope of
making that work" referred to the idea that we could close deleted
WAL files or to something else. Is a fix feasible?

This was observed on:

PostgreSQL 8.3.7 on x86_64-unknown-linux-gnu,
compiled by GCC gcc (GCC) 4.1.2 20070115 (SUSE Linux)

SUSE Linux Enterprise Server 10 (x86_64)
VERSION = 10
PATCHLEVEL = 2

Linux version 2.6.16.60-0.39.3-smp (geeko(at)buildhost)
(gcc version 4.1.2 20070115 (SUSE Linux))
#1 SMP Mon May 11 11:46:34 UTC 2009

-Kevin


From: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-11-26 03:20:11
Message-ID: 20091126122011.5BA3.52131E4D@oss.ntt.co.jp
Lists: pgsql-hackers


"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

> I guess it is a stretch to imagine a database with enough read-only
> connections to exhaust resources by holding open one deleted WAL
> file each; they would need, say, 200 such connections, cutting
> things so close that a wasted 3.2GB (200 x 16MB) of disk space at
> the WAL file location would run them out.

AFAIK, we rename and reuse old WAL segment files, so we don't
waste disk space unless checkpoint_segments is set too small.

Also, if you are worried about disk space, how about adding an
ftruncate() to zero bytes before the unlink()?
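
For illustration, a minimal sketch of that idea, with a hypothetical
removal helper (this is not the actual xlog.c code): truncating first
lets the kernel reclaim the segment's disk blocks immediately, so a
backend that still holds the file open pins only an empty inode until
its last close.

#include <unistd.h>

/* Hypothetical helper, not actual PostgreSQL code. */
static void
remove_wal_segment(const char *path)
{
    /* Free the 16MB of blocks at once, even if the file is still open;
     * whoever holds a descriptor keeps only a zero-length inode alive. */
    (void) truncate(path, 0);

    /* Then remove the directory entry, as happens today. */
    (void) unlink(path);
}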

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-11-26 04:16:51
Message-ID: 21317.1259209011@sss.pgh.pa.us
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> I ran across the subject issue and spent some time puzzling over it.
> ...
> I'm not sure whether Tom's comment that "There is zero hope of
> making that work." referred to the idea that we could close deleted
> WAL files or to something else. Is a fix feasible?

The reason I was dismissive of that is this scenario: a backend is told
to execute a data-modifying transaction (so it has to write and commit
some WAL records) and then the client promptly goes to sleep and stays
that way for a while. The backend has nothing to do either, so it's just
sitting waiting on the client socket. There is no reasonable place here
for it to decide to close its open WAL file. We could possibly do a
pre-emptive close() after transaction commit, but that would result in a
useless distributed overhead of opening and closing the current WAL file
when a backend is engaged in a series of transactions.
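
To make the trade-off concrete, here is a hypothetical sketch of such a
commit-time close (openLogFile stands in for the static descriptor that
xlog.c actually keeps; the hook name is invented):

#include <unistd.h>

static int openLogFile = -1;    /* stand-in for xlog.c's cached WAL fd */

/* Hypothetical hook run at transaction commit. */
static void
AtEOXact_CloseWALFile(void)
{
    if (openLogFile >= 0)
    {
        close(openLogFile);
        openLogFile = -1;
        /* The distributed overhead: the backend's next WAL write must
         * re-open the segment, so a busy backend pays an extra
         * open()/close() pair per transaction for no benefit. */
    }
}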

On Linux this simply isn't an issue, anyway --- if you can't afford one
open file descriptor per backend for WAL, you're going to have a far
bigger problem with the backend's habit of caching descriptors for data
files. On Windows it might be a problem, but I think we have got
workarounds for all the important cases. The important point is that
we want any overhead for this to be pushed into non-performance-critical
code paths.

> It seemed strange that the only backends which were holding open
> deleted WAL files were ones where the connection was established
> with a login which has no write permissions.

A backend would never open a WAL file unless it had to write a WAL
record, so I'm having a hard time believing that these were totally
read-only transactions. Can you give specifics?

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-11-30 18:20:12
Message-ID: 4B13B87C020000250002CDC2@gw.wicourts.gov
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>> It seemed strange that the only backends which were holding open
>> deleted WAL files were ones where the connection was established
>> with a login which has no write permissions.
>
> A backend would never open a WAL file unless it had to write a WAL
> record, so I'm having a hard time believing that these were
> totally read-only transactions. Can you give specifics?

Here's some information captured before the original post:

MILWAUKEE-PG:~ # lsof | grep deleted
COMMAND    PID USER FD  TYPE DEVICE     SIZE      NODE NAME
postgres 13105 ccsa 10u REG  8,4   16777216 268442394 /var/pgsql/data/cc/pg_xlog/00000001000000E0000000BB (deleted)
postgres 30500 ccsa  4u REG  8,4   16777216 268442626 /var/pgsql/data/cc/pg_xlog/00000001000000E20000003F (deleted)
postgres 30501 ccsa  5u REG  8,4   16777216 283509014 /var/pgsql/data/cc/pg_xlog/00000001000000E200000016 (deleted)
MILWAUKEE-PG:~ # ps auxf|grep ^ccsa
USER    PID %CPU %MEM    VSZ    RSS TTY STAT START  TIME COMMAND
ccsa  30490  0.0  0.0 290988   8608 ?   S    Oct24  0:13 /usr/local/pgsql-8.3.7/bin/postgres -D /var/pgsql/data/cc
ccsa  30491  0.0  0.0  14524   1056 ?   Ss   Oct24  0:10  \_ postgres: logger process
ccsa  30493  0.0  1.0 291876 268760 ?   Ss   Oct24  1:36  \_ postgres: writer process
ccsa  30494  0.0  0.0 291120   1452 ?   Ss   Oct24  0:01  \_ postgres: wal writer process
ccsa  30495  0.0  0.0 291552   1564 ?   Ss   Oct24  0:00  \_ postgres: autovacuum launcher process
ccsa  30496  0.0  0.0  14612    952 ?   Ss   Oct24  0:01  \_ postgres: archiver process   last was 00000001000000E20000005B
ccsa  30497  0.0  0.0  15280   1576 ?   Ss   Oct24  6:37  \_ postgres: stats collector process
ccsa  30500  0.0  0.9 292448 231844 ?   Ss   Oct24 10:42  \_ postgres: viewer cc 127.0.0.1(36846) idle
ccsa  30501  0.0  0.8 292496 209428 ?   Ss   Oct24 10:28  \_ postgres: viewer cc 165.219.78.11(53940) idle
ccsa   3107  0.0  0.0 292508   5836 ?   Ss   Nov09  0:32  \_ postgres: cc cc 127.0.0.1(40442) idle
ccsa   3113  0.0  1.1 305304 286280 ?   Ss   Nov09  6:51  \_ postgres: cc cc 127.0.0.1(49969) idle
ccsa   3119  0.0  0.0 292508   5836 ?   Ss   Nov09  0:29  \_ postgres: cc cc 127.0.0.1(54127) idle
ccsa  13105  0.0  0.0 293396  23852 ?   Ss   Nov23  0:08  \_ postgres: viewer cc 165.219.90.178(51481) idle
ccsa   1485  0.8  1.1 312400 291508 ?   Ss   Nov24 14:18  \_ postgres: cc cc 127.0.0.1(42692) idle
ccsa  10752  0.3  1.1 312712 290856 ?   Ss   Nov24  4:54  \_ postgres: cc cc 127.0.0.1(45119) idle
ccsa  10908  1.3  1.1 313120 292836 ?   Ss   Nov24 20:03  \_ postgres: cc cc 127.0.0.1(42065) idle
ccsa  25099  0.7  1.1 312864 285048 ?   Ss   14:02  1:03  \_ postgres: cc cc 127.0.0.1(50361) idle
ccsa  26571  0.0  0.3 298912  89528 ?   Ss   14:29  0:06  \_ postgres: cc cc 127.0.0.1(52213) idle
ccsa  26809  0.1  0.3 298940  88816 ?   Ss   14:33  0:06  \_ postgres: cc cc 127.0.0.1(59145) idle
ccsa  26812  0.6  1.1 310060 274192 ?   Ss   14:33  0:44  \_ postgres: cc cc 127.0.0.1(59962) idle
MILWAUKEE-PG:~ # ll /var/pgsql/data/cc/pg_xlog/
total 1671172
-rw------- 1 ccsa users      253 2009-11-20 21:29 00000001000000E00000000A.00001810.backup
-rw------- 1 ccsa users 16777216 2009-11-25 16:10 00000001000000E20000005B
-rw------- 1 ccsa users 16777216 2009-11-25 16:22 00000001000000E20000005C
-rw------- 1 ccsa users 16777216 2009-11-25 04:15 00000001000000E20000005D
-rw------- 1 ccsa users 16777216 2009-11-25 05:15 00000001000000E20000005E
-rw------- 1 ccsa users 16777216 2009-11-25 06:15 00000001000000E20000005F
-rw------- 1 ccsa users 16777216 2009-11-25 07:15 00000001000000E200000060
-rw------- 1 ccsa users 16777216 2009-11-25 07:42 00000001000000E200000061
[...]
-rw------- 1 ccsa users 16777216 2009-11-25 15:34 00000001000000E2000000BB
-rw------- 1 ccsa users 16777216 2009-11-25 15:37 00000001000000E2000000BC
-rw------- 1 ccsa users 16777216 2009-11-25 15:44 00000001000000E2000000BD
-rw------- 1 ccsa users 16777216 2009-11-25 15:50 00000001000000E2000000BE
-rw------- 1 ccsa users 16777216 2009-11-25 15:57 00000001000000E2000000BF
-rw------- 1 ccsa users 16777216 2009-11-25 16:04 00000001000000E2000000C0
drwx------ 2 ccsa users       94 2009-11-25 16:12 archive_status

You will note that the connections logged in as "viewer" (and only
those) are holding open a deleted WAL file. This user has not been
granted anything except SELECT permission on any table. In
addition, immediately after creating the database, we ran:

REVOKE CREATE ON DATABASE cc FROM public;
REVOKE CREATE ON SCHEMA public FROM public;

Two of these connections are from software publishers, which only
issue SELECT statements against three tables (to which they have
only SELECT permission). The other is from monitoring software,
which only runs SELECT against C functions from our fsutil software
(available on pgFoundry). All are JDBC connections. I suspect that
autoCommit is on; if you think it matters, I can go review the code
to confirm.

On one of our development machines, with a number of PostgreSQL
databases, I found about 60 such WAL files deleted and held open
by backends logged in under the read-only user. It wasted my time
primarily because I was looking for situations where we might have
deleted a directory tree without first stopping the process using
it; while 60 file handles and 1GB of disk space is no big deal for
us, it caused me to investigate the issue to confirm there weren't
zombie PostgreSQL processes out there.

At a minimum, we should add the extra 16MB per connection that
might be held on the WAL file system to the calculations people do
when sizing it, just in case someone is planning on a lot of
connections and trying to cut that space thin.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-11-30 18:48:10
Message-ID: 7910.1259606890@sss.pgh.pa.us
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> A backend would never open a WAL file unless it had to write a WAL
>> record, so I'm having a hard time believing that these were
>> totally read-only transactions. Can you give specifics?

> You will note that the connections logged in as "viewer" (and only
> those) are holding open a deleted WAL file. This user has not been
> granted anything except SELECT permission on any table.

You sure it's not creating any temp tables? You didn't mention
revoking TEMP privilege.

I can think of one code path that could result in a genuinely read-only
session having to write WAL: if it's forced to flush dirty buffers in
order to read in other pages, and such a buffer was dirtied by
as-yet-uncommitted transactions, it might have to flush WAL to be
allowed to write the dirty buffer. But I think you'd have had to dial
back the bgwriter to the point of uselessness before this would be a
common occurrence.
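
A simplified sketch of that path (the extern declarations are stubs
standing in for the real bufmgr.c/xlog.c routines): the WAL-before-data
rule is what can drag a genuinely read-only backend into opening and
flushing the current WAL segment.

typedef unsigned long long XLogRecPtr;       /* stub for the real type */

extern XLogRecPtr BufferGetLSN(void *page);  /* stub: page's last-change LSN */
extern void XLogFlush(XLogRecPtr recptr);    /* may open and fsync WAL */
extern void WritePageToDisk(void *page);     /* stub for the smgr write */

/* Simplified eviction path, not the actual FlushBuffer() code. */
void
EvictDirtyBuffer(void *page)
{
    /* The WAL describing this page's changes must be durable before
     * the page itself may be written -- even if our own transaction
     * never wrote a WAL record. */
    XLogFlush(BufferGetLSN(page));

    WritePageToDisk(page);
}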

> At a minimum, we should add the extra 16MB per connection that
> might be held on the WAL file system to the calculations people do
> when sizing it, just in case someone is planning on a lot of
> connections and trying to cut that space thin.

In the first place, this is a complete non-issue except on Windows ---
on other platforms we can rename and recycle the files even if they're
being held open. I rather doubt anyone would think they could predict a
Windows machine's disk usage that accurately anyway. In the second
place, for each backend to be holding open a different dead WAL file
strains the limits of credulity. Telling people to assume 16MB *
max_connections would be a gross overestimate.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-11-30 19:37:40
Message-ID: 4B13CAA4020000250002CDDF@gw.wicourts.gov
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> You sure it's not creating any temp tables? You didn't mention
> revoking TEMP privilege.

They have not been revoked, but I am sure the software publisher
doesn't explicitly create any, and I'd be very surprised if the
monitoring software did. The tables are small enough that it's hard
to believe that the 50MB work_mem would spill to disk, either (if
that matters).

> I can think of one code path that could result in a genuinely
> read-only session having to write WAL: if it's forced to flush
> dirty buffers in order to read in other pages, and such a buffer
> was dirtied by as-yet-uncommitted transactions, it might have to
> flush WAL to be allowed to write the dirty buffer. But I think
> you'd have had to dial back the bgwriter to the point of
> uselessness before this would be a common occurrence.

#bgwriter_delay = 200ms
bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4.0

>> At a minimum, we should add the extra 16MB per connection that
>> might be held on the WAL file system to the calculations people
>> do when sizing it, just in case someone is planning on a lot of
>> connections and trying to cut that space thin.
>
> In the first place, this is a complete non-issue except on Windows
> --- on other platforms we can rename and recycle the files even
> if they're being held open. I rather doubt anyone would think
> they could predict a Windows machine's disk usage that accurately
> anyway. In the second place, for each backend to be holding open
> a different dead WAL file strains the limits of credulity.
> Telling people to assume 16MB * max_connections would be a gross
> overestimate.

Pretty much every read-only JDBC connection seems to be holding open
a deleted WAL file on my Linux box, but it would take pretty
pessimal timing for each connection to be holding open a different
one -- I see that many connections share a deleted WAL file.

project-db:~ # cat /proc/version
Linux version 2.6.16.60-0.39.3-smp (geeko(at)buildhost) (gcc version 4.1.2 20070115 (SUSE Linux)) #1 SMP Mon May 11 11:46:34 UTC 2009
project-db:~ # lsof | grep deleted
postgres 2189 ccefcirsa 20u REG 8,3 16777216 1610613340 /var/pgsql/data/ccefcir/pg_xlog/000000010000000A00000010 (deleted)
postgres 2195 ccefcirsa 43u REG 8,3 16777216 1610613340 /var/pgsql/data/ccefcir/pg_xlog/000000010000000A00000010 (deleted)
postgres 2511 jdashcirsa 1u CHR 136,5 7 /dev/pts/5 (deleted)
postgres 2511 jdashcirsa 2u CHR 136,5 7 /dev/pts/5 (deleted)
postgres 2514 jdashcirsa 3u REG 8,3 16777216 1610631538 /var/pgsql/data/jdashcir/pg_xlog/000000010000001100000076 (deleted)
postgres 2812 ccsa 3u REG 8,3 16777216 3763808807 /var/pgsql/data/epayment/pg_xlog/000000020000009B00000094 (deleted)
postgres 3647 ccefsa 1u CHR 136,5 7 /dev/pts/5 (deleted)
postgres 3647 ccefsa 2u CHR 136,5 7 /dev/pts/5 (deleted)
postgres 3650 ccefsa 3u REG 8,3 16777216 669441097 /var/pgsql/data/ccef/pg_xlog/000000020000004B000000C5 (deleted)
postgres 4266 milwaukeeifsa 3u REG 8,3 16777216 2708846433 /var/pgsql/data/milwaukeeif/pg_xlog/000000020000006B00000014 (deleted)
postgres 4270 juryscansa 3u REG 8,3 16777216 1073742165 /var/pgsql/data/juryscan/pg_xlog/000000020000001A000000DB (deleted)
postgres 6100 juryscansa 41u REG 8,3 16777216 1073742165 /var/pgsql/data/juryscan/pg_xlog/000000020000001A000000DB (deleted)
postgres 6105 juryscansa 35u REG 8,3 16777216 1073742167 /var/pgsql/data/juryscan/pg_xlog/000000020000001A0000007E (deleted)
postgres 6113 juryscansa 43u REG 8,3 16777216 1073742166 /var/pgsql/data/juryscan/pg_xlog/000000020000001A0000001D (deleted)
postgres 6119 juryscansa 59u REG 8,3 16777216 1073742165 /var/pgsql/data/juryscan/pg_xlog/000000020000001A000000DB (deleted)
postgres 6121 juryscansa 62u REG 8,3 16777216 1073742165 /var/pgsql/data/juryscan/pg_xlog/000000020000001A000000DB (deleted)
postgres 6126 juryscansa 33u REG 8,3 16777216 1073742167 /var/pgsql/data/juryscan/pg_xlog/000000020000001A0000007E (deleted)
postgres 7997 washrptcirsa 13u REG 8,3 16777216 2688967689 /var/pgsql/data/washrptcir/pg_xlog/000000010000002B000000E4 (deleted)
postgres 8170 washrptcirsa 23u REG 8,3 16777216 2688967689 /var/pgsql/data/washrptcir/pg_xlog/000000010000002B000000E4 (deleted)
postgres 8687 washrptcirsa 11u REG 8,3 16777216 2688967689 /var/pgsql/data/washrptcir/pg_xlog/000000010000002B000000E4 (deleted)
postgres 8720 washrptcirsa 3u REG 8,3 16777216 2688967689 /var/pgsql/data/washrptcir/pg_xlog/000000010000002B000000E4 (deleted)
postgres 8769 washrptcirsa 6u REG 8,3 16777216 2688967689 /var/pgsql/data/washrptcir/pg_xlog/000000010000002B000000E4 (deleted)
postgres 10093 washrptcirsa 18u REG 8,3 16777216 2688967689 /var/pgsql/data/washrptcir/pg_xlog/000000010000002B000000E4 (deleted)
postgres 10313 ccsagsa 3u REG 8,3 16777216 3758753714 /var/pgsql/data/ccsag/pg_xlog/0000000300000035000000EE (deleted)
postgres 10660 datasourcesa 50u REG 8,3 16777216 3250939657 /var/pgsql/data/datasource/pg_xlog/0000000100000004000000C9 (deleted)
postgres 10661 datasourcesa 23u REG 8,3 16777216 3250939657 /var/pgsql/data/datasource/pg_xlog/0000000100000004000000C9 (deleted)
postgres 10666 datasourcesa 20u REG 8,3 16777216 3250939679 /var/pgsql/data/datasource/pg_xlog/0000000100000004000000B3 (deleted)
postgres 10667 datasourcesa 20u REG 8,3 16777216 3250939679 /var/pgsql/data/datasource/pg_xlog/0000000100000004000000B3 (deleted)
postgres 11049 formssa 3u REG 8,3 16777216 1879057748 /var/pgsql/data/forms/pg_xlog/000000010000000B000000C1 (deleted)
postgres 11209 milwaukeeifsa 19u REG 8,3 16777216 2690371279 /var/pgsql/data/milwaukeeif/pg_xlog/000000020000006900000018 (deleted)
postgres 11217 milwaukeeifsa 131u REG 8,3 16777216 2690371279 /var/pgsql/data/milwaukeeif/pg_xlog/0000000200000069000000DE (deleted)
postgres 11221 milwaukeeifsa 34u REG 8,3 16777216 2690371282 /var/pgsql/data/milwaukeeif/pg_xlog/000000020000006A00000051 (deleted)
postgres 11223 milwaukeeifsa 16u REG 8,3 16777216 2690371282 /var/pgsql/data/milwaukeeif/pg_xlog/000000020000006A00000051 (deleted)
postgres 11659 jdashccsa 6u REG 8,3 16777216 401111458 /var/pgsql/data/jdashcc/pg_xlog/000000020000006400000080 (deleted)
postgres 12949 jdashccsa 22u REG 8,3 16777216 268437280 /var/pgsql/data/jdashcc/pg_xlog/0000000200000065000000BE (deleted)
postgres 12976 formssa 31u REG 8,3 16777216 1879057738 /var/pgsql/data/forms/pg_xlog/000000010000000F00000082 (deleted)
postgres 12978 formssa 23u REG 8,3 16777216 1879057738 /var/pgsql/data/forms/pg_xlog/000000010000000F00000082 (deleted)
postgres 12980 formssa 34u REG 8,3 16777216 1879057738 /var/pgsql/data/forms/pg_xlog/000000010000000F00000082 (deleted)
postgres 12992 formssa 35u REG 8,3 16777216 1879057738 /var/pgsql/data/forms/pg_xlog/000000010000000F00000082 (deleted)
postgres 12993 formssa 25u REG 8,3 16777216 1879057748 /var/pgsql/data/forms/pg_xlog/000000010000000E000000F4 (deleted)
postgres 12994 formssa 24u REG 8,3 16777216 1879057738 /var/pgsql/data/forms/pg_xlog/000000010000000F00000082 (deleted)
postgres 14359 jdashccsa 22u REG 8,3 16777216 401111459 /var/pgsql/data/jdashcc/pg_xlog/0000000200000065000000BD (deleted)
postgres 15397 washrptcirsa 26u REG 8,3 16777216 2688967664 /var/pgsql/data/washrptcir/pg_xlog/000000010000002900000079 (deleted)
postgres 16422 ccsagsa 11u REG 8,3 16777216 3758531497 /var/pgsql/data/ccsag/pg_xlog/0000000300000030000000ED (deleted)
postgres 16490 q409sa 3u REG 8,3 16777216 3490994896 /var/pgsql/data/codesq409/pg_xlog/000000010000000B00000056 (deleted)
postgres 16533 q309sa 3u REG 8,3 16777216 1073749053 /var/pgsql/data/codesq309/pg_xlog/0000000100000007000000A4 (deleted)
postgres 16721 ccsa 27u REG 8,3 16777216 3763808825 /var/pgsql/data/epayment/pg_xlog/000000020000009B00000068 (deleted)
postgres 16725 ccsa 25u REG 8,3 16777216 3763808807 /var/pgsql/data/epayment/pg_xlog/000000020000009B00000094 (deleted)
postgres 16729 ccsa 17u REG 8,3 16777216 3763808824 /var/pgsql/data/epayment/pg_xlog/000000020000009A000000F0 (deleted)
postgres 16734 ccsa 34u REG 8,3 16777216 3763808807 /var/pgsql/data/epayment/pg_xlog/000000020000009B000000FA (deleted)
postgres 16741 ccsa 18u REG 8,3 16777216 3763808825 /var/pgsql/data/epayment/pg_xlog/000000020000009B00000068 (deleted)
postgres 16747 ccsa 38u REG 8,3 16777216 3763808807 /var/pgsql/data/epayment/pg_xlog/000000020000009B000000FA (deleted)
postgres 18316 ccefsa 13u REG 8,3 16777216 669441097 /var/pgsql/data/ccef/pg_xlog/000000020000004B000000C5 (deleted)
postgres 20568 washrptsa 3u REG 8,3 16777216 1463 /var/pgsql/data/washrpt/pg_xlog/000000020000004700000011 (deleted)
postgres 23040 milwaukeeifsa 47u REG 8,3 16777216 2690371282 /var/pgsql/data/milwaukeeif/pg_xlog/000000020000006A00000051 (deleted)
nscd 23041 root 7u REG 8,2 217016 1459569 /var/run/nscd/dbnP5QN9 (deleted)
nscd 23041 root 8r REG 8,2 217016 1459569 /var/run/nscd/dbnP5QN9 (deleted)
postgres 23322 ccefcirsa 1u CHR 136,0 2 /dev/pts/0 (deleted)
postgres 23322 ccefcirsa 2u CHR 136,0 2 /dev/pts/0 (deleted)
postgres 23339 milwaukeeifsa 136u REG 8,3 16777216 2708846433 /var/pgsql/data/milwaukeeif/pg_xlog/000000020000006B00000014 (deleted)
postgres 23603 jawebsa 1u CHR 136,5 7 /dev/pts/5 (deleted)
postgres 23603 jawebsa 2u CHR 136,5 7 /dev/pts/5 (deleted)
postgres 24367 mssa 34u REG 8,3 16777216 805314300 /var/pgsql/data/ms/pg_xlog/000000010000000A00000058 (deleted)
postgres 24368 mssa 37u REG 8,3 16777216 805314303 /var/pgsql/data/ms/pg_xlog/000000010000000A00000059 (deleted)
postgres 24923 encryptsa 335u REG 8,3 16777216 805310179 /var/pgsql/data/encrypt/pg_xlog/000000020000003E00000023 (deleted)
postgres 24925 encryptsa 27u REG 8,3 16777216 805310179 /var/pgsql/data/encrypt/pg_xlog/000000020000003E00000023 (deleted)
postgres 24931 encryptsa 16u REG 8,3 16777216 805310179 /var/pgsql/data/encrypt/pg_xlog/000000020000003E00000023 (deleted)
postgres 24933 encryptsa 26u REG 8,3 16777216 805310179 /var/pgsql/data/encrypt/pg_xlog/000000020000003E00000023 (deleted)
postgres 24942 encryptsa 16u REG 8,3 16777216 805310179 /var/pgsql/data/encrypt/pg_xlog/000000020000003E00000023 (deleted)
postgres 26267 washrptcirsa 1u CHR 136,3 5 /dev/pts/3 (deleted)
postgres 26267 washrptcirsa 2u CHR 136,3 5 /dev/pts/3 (deleted)
postgres 26270 washrptcirsa 3u REG 8,3 16777216 2688967703 /var/pgsql/data/washrptcir/pg_xlog/000000010000002B00000070 (deleted)
postgres 27403 milwaukeeifsa 30u REG 8,3 16777216 2708846433 /var/pgsql/data/milwaukeeif/pg_xlog/000000020000006B00000014 (deleted)
postgres 28916 encryptsa 3u REG 8,3 16777216 805314462 /var/pgsql/data/encrypt/pg_xlog/000000020000003C000000E8 (deleted)
postgres 29217 mssa 3u REG 8,3 16777216 805314300 /var/pgsql/data/ms/pg_xlog/000000010000000A00000010 (deleted)
postgres 30151 datasourcesa 3u REG 8,3 16777216 3250939659 /var/pgsql/data/datasource/pg_xlog/000000010000000400000094 (deleted)

I'll test a bit to see what it takes to get a JDBC connection to
open a WAL file.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-11-30 19:46:36
Message-ID: 8934.1259610396@sss.pgh.pa.us
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> You sure it's not creating any temp tables? You didn't mention
>> revoking TEMP privilege.

> They have not been revoked, but I am sure the software publisher
> doesn't explicitly create any, and I'd be very surprised if the
> monitoring software did. The tables are small enough that it's hard
> to believe that the 50MB work_mem would spill to disk, either (if
> that matters).

It's not about the size of a temp table, because writes to the temp
table itself aren't WAL-logged. However, the system catalog entries for
a temp table *are* WAL-logged.

> Pretty much every read-only JDBC connection seems to be holding open
> a deleted WAL file on my Linux box, but it would take pretty
> pessimal timing for each connection to be holding open a different
> one -- I see that many connections share a deleted WAL file.

This still seems a bit improbable to me. There has to be something
causing those sessions to touch WAL, and the dirty-buffer scenario
doesn't seem reliable enough.

[ thinks... ] How about SELECT FOR UPDATE or SELECT FOR SHARE?
Those cause WAL writes.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-11-30 19:53:28
Message-ID: 4B13CE58020000250002CDEB@gw.wicourts.gov
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> It's not about the size of a temp table, because writes to the
> temp table itself aren't WAL-logged. However, the system catalog
> entries for a temp table *are* WAL-logged.

Definitely not issuing any CREATE TEMP statements of any kind,
unless the JDBC driver is doing that under the covers.

>> Pretty much every read-only JDBC connection seems to be holding
>> open a deleted WAL file on my Linux box, but it would take pretty
>> pessimal timing for each connection to be holding open a
>> different one -- I see that many connections share a deleted WAL
>> file.
>
> This still seems a bit improbable to me. There has to be
> something causing those sessions to touch WAL, and the
> dirty-buffer scenario doesn't seem reliable enough.
>
> [ thinks... ] How about SELECT FOR UPDATE or SELECT FOR SHARE?
> Those cause WAL writes.

Definitely not.

Probably best not to worry about it until I can play around with
some Java test code to see what it takes to cause the connection to
open the WAL. I'll post when I've had a chance to try that.

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-11-30 21:24:53
Message-ID: 4B13E3C5020000250002CDFD@gw.wicourts.gov
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> There has to be something causing those sessions to touch WAL, and
> the dirty-buffer scenario doesn't seem reliable enough.

This is seeming fairly likely to be the cause, though. It may be a
combination of the nightly VACUUM FREEZE ANALYZE we typically do on
every database and the monitoring software, which would directly run
a couple of queries every five seconds or so, and would also exercise
the software publishers on a similar schedule. My "smoking gun" is
that the monitoring software was restarted last night, after the time
that the nightly vacuums would have completed, and its connections
are not showing as having a deleted WAL file open today.

I was unable to directly cause a backend to open a WAL file by
connecting through JDBC and running the same kind of queries used by
the software publisher, which does tend to point to some indirect
mechanism, like the dirty buffer flushing.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-12-01 01:18:24
Message-ID: 15412.1259630304@sss.pgh.pa.us
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> There has to be something causing those sessions to touch WAL, and
>> the dirty-buffer scenario doesn't seem reliable enough.

> This is seeming fairly likely to be the cause, though. It may be a
> combination of the nightly VACUUM FREEZE ANALYZE we typically do on
> every database and the monitoring software, which would directly run
> a couple of queries every five seconds or so, and would also exercise
> the software publishers on a similar schedule. My "smoking gun" is
> that the monitoring software was restarted last night, after the time
> that the nightly vacuums would have completed, and its connections
> are not showing as having a deleted WAL file open today.

Hmm. If the read-only sessions are long-lived then you could expect
that the probability of having flushed a dirty block (and hence had to
write some WAL) increases over time and eventually approaches 1.
How old were the sessions you were looking at?

If we think this is worth doing something about (I'm not convinced yet)
then the answer would be to forcibly close a backend's open WAL file
in some reasonably seldom-used code path. One possibility that comes
to mind is to do it in ProcessCatchupEvent(), which will be invoked in
approximately the right circumstances: a backend that is sitting idle
for a long time within an otherwise busy system. That wouldn't be a
100% solution, because if the backend is handling a steady stream of
queries it will likely never run ProcessCatchupEvent(). But the places
that would be 100% (like transaction commit) would probably entail too
much of a performance hit from repeatedly opening and closing WAL files.
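
A sketch of where that would land (ProcessCatchupEvent() is real, in
the sinval code; XLogDropCachedFile() is a hypothetical helper that
would close xlog.c's cached descriptor):

extern void AcceptInvalidationMessages(void);   /* real catch-up work */
extern void XLogDropCachedFile(void);           /* hypothetical helper */

/* Sketch of the tail of ProcessCatchupEvent(), not the actual code. */
void
ProcessCatchupEvent_sketch(void)
{
    AcceptInvalidationMessages();

    /* We only get here in a backend that sat idle while the rest of
     * the system was busy -- a cheap, seldom-used spot to let go of a
     * possibly-recycled WAL segment. */
    XLogDropCachedFile();
}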

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-12-01 15:03:32
Message-ID: 4B14DBE4020000250002CE6E@gw.wicourts.gov
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> How old were the sessions you were looking at?

Days to months old.

> If we think this is worth doing something about
> (I'm not convinced yet)

Once one knows about the issue, it's only a minor annoyance, and
that only infrequently, so it's not worth anything that would take
significant effort or would cause any measurable performance hit.

> then the answer would be to forcibly close a backend's open WAL
> file in some reasonably seldom-used code path. One possibility
> that comes to mind is to do it in ProcessCatchupEvent(), which
> will be invoked in approximately the right circumstances: a
> backend that is sitting idle for a long time within an otherwise
> busy system. That wouldn't be a 100% solution, because if the
> backend is handling a steady stream of queries it will likely
> never run ProcessCatchupEvent(). But the places that would be
> 100% (like transaction commit) would probably entail too much of a
> performance hit from repeatedly opening and closing WAL files.

In our case there would tend to be bursts of activity on some of
these connections, and they would usually have a dribble of activity
from the monitoring system. This dribble would consist of a couple
sub-millisecond SELECT statements once every five or ten seconds.
(I have no clue whether that's frequent enough to suppress the
ProcessCatchupEvent() call.)

Is there a reasonably cheap way to check whether the backend has a
WAL file open and whether that one is the current append target? As
you point out, the need to actually close such a file would be
infrequent, and it seems as though when the need exists, it wouldn't
be a matter of *whether* it would need to be closed, just *when*.
Currently it sounds like we would often wind up doing it the next
time we try to run a query which has to flush dirty buffers, or when
the connection is being closed. It seems like those might not be
the ideal time to add the overhead of the close, so we would just
need to find someplace which isn't worse.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-12-01 15:12:46
Message-ID: 964.1259680366@sss.pgh.pa.us
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Is there a reasonably cheap way to check whether the backend has a
> WAL file open and whether that one is the current append target?

Detecting whether we have a WAL file open is trivial (just look at
the static variable holding the file descriptor). Determining whether
it's still the current append target is not so cheap though; it would
require examining shared-memory status which means taking a lock on
that status (and it's a high-traffic lock already).

We could have the open WAL file dropped if stale as a side-effect
anytime we have occasion to examine that shared state anyway. But
in a nearly-read-only session such as your example I'm not sure that
would happen often enough to fix the problem.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-12-01 18:10:03
Message-ID: 4B15079B020000250002CE8A@gw.wicourts.gov
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Determining whether it's still the current append target is not so
> cheap though; it would require examining shared-memory status
> which means taking a lock on that status (and it's a high-traffic
> lock already).

I haven't reviewed the internal locking techniques, so this may well
be a dumb question, but... Since we only care whether the value is
equal, and an occasional false report of equality wouldn't hurt
anything, couldn't we bypass the lock in this particular case?
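
Something like this sketch, perhaps (the shared field name is invented
for illustration): an unlocked read that can only err by postponing a
close, which a later check would perform anyway.

#include <unistd.h>

typedef unsigned int uint32;

/* Hypothetical shared-memory field naming the current append segment. */
extern volatile uint32 *SharedCurrentLogSeg;

static int    openLogFile = -1;   /* stand-in for xlog.c's cached WAL fd */
static uint32 openLogSeg  = 0;    /* segment number that fd refers to */

static void
MaybeCloseStaleWALFile(void)
{
    /* Unlocked read: a stale value can only make us keep the file a
     * little longer (today's behavior); it never closes the wrong fd. */
    if (openLogFile >= 0 && *SharedCurrentLogSeg != openLogSeg)
    {
        close(openLogFile);
        openLogFile = -1;
    }
}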

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-12-01 19:05:19
Message-ID: 5278.1259694319@sss.pgh.pa.us
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Determining whether it's still the current append target is not so
>> cheap though; it would require examining shared-memory status
>> which means taking a lock on that status (and it's a high-traffic
>> lock already).

> I haven't reviewed the internal locking techniques, so this may well
> be a dumb question, but... Since we only care whether the value is
> equal, and an occasional false report of equality wouldn't hurt
> anything, couldn't we bypass the lock in this particular case?

Perhaps, if you didn't mind sometimes getting a wrong answer.
I guess the cost of that would be pretty small in this particular
usage.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-12-01 19:14:54
Message-ID: 4B1516CE020000250002CE95@gw.wicourts.gov
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Perhaps, if you didn't mind sometimes getting a wrong answer.

Well, it would be a heuristic which would close the deleted file
*almost* all the time. When it didn't, the next check would
probably catch it. Assuming that you would never get an indication
that it was different when it hadn't changed, the worst that could
happen is that you infrequently get the current behavior.

That all *seems* sane to me, if low priority. Any objections to my
putting it on the TODO list?

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deleted WAL files held open by backends in Linux
Date: 2009-12-03 17:37:57
Message-ID: 4B17A315020000250002D055@gw.wicourts.gov
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

> Any objections to my putting it on the TODO list?

Hearing none, added. (Apologies for missing the box where I should
have commented on what the change did.)

-Kevin