Re: What's left?

From: Claudio Natoli <claudio(dot)natoli(at)memetrics(dot)com>
To: Claudio Natoli <claudio(dot)natoli(at)memetrics(dot)com>, ''Merlin Moncure' ' <merlin(dot)moncure(at)rcsonline(dot)com>, 'pgsql-hackers-win32 ' <pgsql-hackers-win32(at)postgresql(dot)org>
Cc: 'PostgreSQL-development ' <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: What's left?
Date: 2004-01-23 01:30:23
Message-ID: A02DEC4D1073D611BAE8525405FCCE2B55F291@harris.memetrics.local
Lists: pgsql-hackers pgsql-hackers-win32


Some fool wrote:
> It will then be a matter of fixing things like:
> * installation directory issues (/usr/local/pgsql/bin won't work too
> well outside of the MingW environment :-)
> * general directory handling (ie. whitespaces in directory names;
> forward/backslash path canonicalization)
> * sync issues
> * any missing structs/items in shared memory
> * generally, running the test suite, and fixing whatever is busted (I'm
> at 41 tests passing now :-)

One important thing I forgot, that someone could start looking at now:
* backends keeping files open when other backends are trying to
delete/rename them

The port I wrote for here at work simply modifies the functions in dirmod.c
to attempt the delete (or rename) and, on a failure that is presumably due
to another process holding the file open, schedules the file for deletion
at system start time using the Win32 API for doing so (hey, it is Windows,
it is going to reboot sooner or later :-). In the case of rename, it copies
the existing file and schedules the original for deletion.
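
A minimal sketch of the delete-or-defer idea described above, not the code from that port. The helper name `unlink_or_defer` is hypothetical, and treating `ERROR_SHARING_VIOLATION` and `ERROR_ACCESS_DENIED` as "another process has the file open" is an assumption. `MoveFileExA` with `MOVEFILE_DELAY_UNTIL_REBOOT` is assumed to be the Win32 call meant by "the Win32 API for doing so"; it needs administrator rights. On non-Windows builds the helper falls back to a plain `unlink()` so the sketch still compiles.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

/*
 * Hypothetical helper (not taken from dirmod.c): delete a file now if
 * possible, otherwise ask Windows to delete it at the next reboot.
 *
 * Returns 0 if the file was deleted, 1 if deletion was deferred to
 * reboot, and -1 on any other failure.
 */
int
unlink_or_defer(const char *path)
{
    assert(path != NULL);

#ifdef _WIN32
    if (DeleteFileA(path))
        return 0;

    /* Assumption: these two codes mean "someone else has it open". */
    DWORD err = GetLastError();

    if (err != ERROR_SHARING_VIOLATION && err != ERROR_ACCESS_DENIED)
        return -1;

    /* A NULL destination plus this flag schedules deletion at reboot. */
    if (MoveFileExA(path, NULL, MOVEFILE_DELAY_UNTIL_REBOOT))
        return 1;
    return -1;
#else
    /* POSIX lets us unlink open files, so there is nothing to defer. */
    return (unlink(path) == 0) ? 0 : -1;
#endif
}
```

A caller would treat a return value of 1 as "logically gone, physically still on disk until the machine restarts".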

Ugly, and sometimes slow where we'd rather not be, but it gets us by.

We must do better for the official port, and whilst better solutions are
obviously conceivable, AFAICS they will require some amount of backend
changes and therefore consent from main list. Someone might want to start
looking at a nice, clean solution to this.

Cheers,
Claudio



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Claudio Natoli <claudio(dot)natoli(at)memetrics(dot)com>
Cc: "''Merlin Moncure' '" <merlin(dot)moncure(at)rcsonline(dot)com>, "'pgsql-hackers-win32 '" <pgsql-hackers-win32(at)postgresql(dot)org>, "'PostgreSQL-development '" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: What's left?
Date: 2004-01-23 03:47:14
Message-ID: 23641.1074829634@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-hackers-win32

Claudio Natoli <claudio(dot)natoli(at)memetrics(dot)com> writes:
> One important thing I forgot, that someone could start looking at now:
> * backends keeping files open when other backends are trying to
> delete/rename them

> We must do better for the official port,

Why? The procedure you mentioned seems perfectly adequate to me,
seeing that it's a bit of a corner case to start with.

I cannot think of any way of "doing better" that wouldn't be far too
invasive to be acceptable.

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Claudio Natoli <claudio(dot)natoli(at)memetrics(dot)com>, "''Merlin Moncure' '" <merlin(dot)moncure(at)rcsonline(dot)com>, "'pgsql-hackers-win32 '" <pgsql-hackers-win32(at)postgresql(dot)org>, "'PostgreSQL-development '" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: What's left?
Date: 2004-01-26 05:13:53
Message-ID: 200401260513.i0Q5Dra23724@candle.pha.pa.us
Lists: pgsql-hackers pgsql-hackers-win32

Tom Lane wrote:
> Claudio Natoli <claudio(dot)natoli(at)memetrics(dot)com> writes:
> > One important thing I forgot, that someone could start looking at now:
> > * backends keeping files open when other backends are trying to
> > delete/rename them
>
> > We must do better for the official port,
>
> Why? The procedure you mentioned seems perfectly adequate to me,
> seeing that it's a bit of a corner case to start with.

I don't see this as a corner case, except it being a corner case
operating system. :-)

I think it is very likely that rename/unlink will fail because of the file
descriptor cache kept by each backend.

I am attaching dir.c from the PeerDirect port. It handles unlink
failures by appending the failed file name to a file that is later read
and another unlink attempted. Perhaps this is something we can do,
retrying the unlinks after each checkpoint.
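
The queue-and-retry scheme described above could look roughly like the sketch below. It is not the attached PeerDirect dir.c. The log file name `pending_unlinks.txt` and both function names are hypothetical, and a real backend would also need locking around the log, which is omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define PENDING_LOG "pending_unlinks.txt"   /* hypothetical file name */

/*
 * Try to remove a file; if that fails, append its name to the pending log.
 * Returns 0 if removed (or already gone), 1 if queued, -1 if queuing failed.
 */
int
unlink_or_queue(const char *path)
{
    FILE *log;

    assert(path != NULL);
    if (remove(path) == 0 || errno == ENOENT)
        return 0;

    log = fopen(PENDING_LOG, "a");
    if (log == NULL)
        return -1;
    if (fprintf(log, "%s\n", path) < 0)
    {
        fclose(log);
        return -1;
    }
    return (fclose(log) == 0) ? 1 : -1;
}

/*
 * Retry every queued name, e.g. after a checkpoint.  Names that still
 * cannot be removed are written back.  Returns the number of names still
 * pending, or -1 if the replacement log could not be created.
 */
int
retry_pending_unlinks(void)
{
    FILE *in;
    FILE *out;
    char  line[1024];
    int   still_pending = 0;

    in = fopen(PENDING_LOG, "r");
    if (in == NULL)
        return 0;               /* nothing has been queued */
    out = fopen(PENDING_LOG ".new", "w");
    if (out == NULL)
    {
        fclose(in);
        return -1;
    }

    while (fgets(line, sizeof line, in) != NULL)
    {
        line[strcspn(line, "\n")] = '\0';   /* strip the newline */
        if (line[0] == '\0')
            continue;
        if (remove(line) != 0 && errno != ENOENT)
        {
            fprintf(out, "%s\n", line);     /* keep it for next time */
            still_pending++;
        }
    }
    fclose(in);
    fclose(out);

    /* Remove the old log first: rename() on Windows will not overwrite. */
    remove(PENDING_LOG);
    if (still_pending > 0)
        rename(PENDING_LOG ".new", PENDING_LOG);
    else
        remove(PENDING_LOG ".new");
    return still_pending;
}
```

Treating a missing file as success keeps a name from sitting in the log forever once something else has removed it.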

PeerDirect handles rename by just looping. We really can't delay a
rename. There is discussion in the Win32 TODO detail that goes over
some options, I think.
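
For illustration, a bounded variant of "rename by just looping" might look like this. It is a sketch, not the PeerDirect code: the function name, the attempt limit, and the delay are all hypothetical, and it retries on any failure rather than only on a sharing conflict.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/* Sleep for roughly the given number of milliseconds. */
static void
pause_ms(int ms)
{
#ifdef _WIN32
    Sleep((DWORD) ms);
#else
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long) (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
#endif
}

/*
 * Hypothetical helper: keep retrying rename() until it succeeds or
 * max_attempts is used up.  Returns 0 on success, -1 if every attempt
 * failed.  Note that the Windows C runtime's rename() also fails when
 * the target already exists, which this sketch does not handle.
 */
int
rename_with_retry(const char *from, const char *to,
                  int max_attempts, int delay_ms)
{
    int attempt;

    assert(from != NULL && to != NULL && max_attempts > 0);

    for (attempt = 1; attempt <= max_attempts; attempt++)
    {
        if (rename(from, to) == 0)
            return 0;
        if (attempt < max_attempts)
            pause_ms(delay_ms);     /* give the other process time to close */
    }
    return -1;
}
```

Capping the attempts means the caller gets an error back instead of spinning forever while holding locks, at the cost of having to decide what to do when the rename still has not gone through.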

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

Attachment Content-Type Size
unknown_filename text/plain 7.0 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Claudio Natoli <claudio(dot)natoli(at)memetrics(dot)com>, "''Merlin Moncure' '" <merlin(dot)moncure(at)rcsonline(dot)com>, "'pgsql-hackers-win32 '" <pgsql-hackers-win32(at)postgresql(dot)org>, "'PostgreSQL-development '" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: What's left?
Date: 2004-01-26 05:26:24
Message-ID: 1038.1075094784@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-hackers-win32

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> I think it is very likely that rename/unlink will fail because of the file
> descriptor cache kept by each backend.

Hmm ... you're probably right. Okay, it's a more significant issue than
I thought.

> I am attaching dir.c from the PeerDirect port. It handles unlink
> failures by appending the failed file name to a file that is later read
> and another unlink attempted. Perhaps this is something we can do,
> retrying the unlinks after each checkpoint.

That seems like a possibility. The open files will get closed very soon
after the delete occurs (as a byproduct of relcache flush), so we don't
need very much of a delay. Next checkpoint sounds reasonable.

> PeerDirect handles rename by just looping. We really can't delay a
> rename. There is discussion in the Win32 TODO detail that goes over
> some options, I think.

Do we really have any problem with rename? We don't rename table files.
The renames I can think of are renaming temp files into place as
permanent ones, and there would be no open references to such a file.

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Claudio Natoli <claudio(dot)natoli(at)memetrics(dot)com>, "''Merlin Moncure' '" <merlin(dot)moncure(at)rcsonline(dot)com>, "'pgsql-hackers-win32 '" <pgsql-hackers-win32(at)postgresql(dot)org>, "'PostgreSQL-development '" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: What's left?
Date: 2004-01-26 05:49:07
Message-ID: 200401260549.i0Q5n8M10662@candle.pha.pa.us
Lists: pgsql-hackers pgsql-hackers-win32

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > I think it is very likely that rename/unlink will fail because of the file
> > descriptor cache kept by each backend.
>
> Hmm ... you're probably right. Okay, it's a more significant issue than
> I thought.
>
> > I am attaching dir.c from the PeerDirect port. It handles unlink
> > failures by appending the failed file name to a file that is later read
> > and another unlink attempted. Perhaps this is something we can do,
> > retrying the unlinks after each checkpoint.
>
> That seems like a possibility. The open files will get closed very soon
> after the delete occurs (as a byproduct of relcache flush), so we don't
> need very much of a delay. Next checkpoint sounds reasonable.

Good. I am glad for the relcache closing because we were going to need
something like that.

> > PeerDirect handles rename by just looping. We really can't delay a
> > rename. There is discussion in the Win32 TODO detail that goes over
> > some options, I think.
>
> Do we really have any problem with rename? We don't rename table files.
> The renames I can think of are renaming temp files into place as
> permanent ones, and there would be no open references to such a file.

We do have a problem. It is with cache files read on startup, like
pg_pwd. We can generate the file as temp, but we have to slide it in
while a backend is not reading it. On a busy system, I am not sure how
large a window we will get for the rename. The rename is all
centralized in port/dirmod.c, so we can deal with it there, whatever the
solution.
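
The "generate the file as temp, then slide it in" step can be sketched as follows. This is only an illustration of the pattern, not the pg_pwd code: the function name and the ".tmp" suffix are hypothetical. On POSIX the final rename() replaces the old file atomically; on Windows it is exactly the step that can fail while a backend has the old file open.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write new contents to "<path>.tmp", then rename
 * the temp file over the real one so that readers never see a partially
 * written file.  Returns 0 on success, -1 on failure.
 */
int
replace_file_contents(const char *path, const char *contents)
{
    char   tmp[1024];
    FILE  *f;
    size_t len;
    int    n;

    assert(path != NULL && contents != NULL);

    n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || n >= (int) sizeof tmp)
        return -1;              /* path too long for the buffer */

    f = fopen(tmp, "w");
    if (f == NULL)
        return -1;

    len = strlen(contents);
    if (fwrite(contents, 1, len, f) != len)
    {
        fclose(f);
        remove(tmp);
        return -1;
    }
    if (fclose(f) != 0)
    {
        remove(tmp);
        return -1;
    }

    /* The window discussed above: this fails on Windows if path is open. */
    if (rename(tmp, path) != 0)
    {
        remove(tmp);
        return -1;
    }
    return 0;
}
```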

We also have to do the rename during xact close, because we need to hold
locks to be sure the files are written in the same order that they
modify pg_shadow. Waiting a long time for the rename is therefore a
serious problem.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073