Re: In pg_upgrade, copy fsm, vm, and extent files by checking for fi

Lists: pgsql-committerspgsql-hackers
From: Bruce Momjian <bruce(at)momjian(dot)us>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: In pg_upgrade, copy fsm, vm, and extent files by checking for fi
Date: 2012-11-14 22:32:12
Message-ID: E1TYlVE-0005b7-TW@gemulon.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

In pg_upgrade, copy fsm, vm, and extent files by checking for file
existence via open(), rather than collecting a directory listing and
looking up matching relfilenode files with sequential scans of the
array. This speeds up pg_upgrade by 2x for a large number of tables,
e.g. 16k.

Per observation by Ants Aasma.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/29add0de4920e4f448a30bfc35798b939c211d97

Modified Files
--------------
contrib/pg_upgrade/file.c | 55 ----------
contrib/pg_upgrade/pg_upgrade.h | 2 -
contrib/pg_upgrade/relfilenode.c | 208 +++++++++++++++-----------------------
3 files changed, 82 insertions(+), 183 deletions(-)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: In pg_upgrade, copy fsm, vm, and extent files by checking for fi
Date: 2012-11-14 22:39:29
Message-ID: 27789.1352932769@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> In pg_upgrade, copy fsm, vm, and extent files by checking for file
> existence via open(), rather than collecting a directory listing and
> looking up matching relfilenode files with sequential scans of the
> array. This speeds up pg_upgrade by 2x for a large number of tables,
> e.g. 16k.

Uh ... you replaced a strcmp() with an open()?

I'm prepared to believe that's a win for sufficiently large N, if you
assume that the filesystem is smart enough to have O(1) lookup time
regardless of the directory size ... but that doesn't seem like a very
good assumption, and in any case surely this loses badly for a smaller
number of files.

You would have been better off keeping the array and sorting it so you
could use binary search, instead of passing the problem off to the
filesystem.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: In pg_upgrade, copy fsm, vm, and extent files by checking for fi
Date: 2012-11-14 22:58:14
Message-ID: 20121114225814.GB6753@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Wed, Nov 14, 2012 at 05:39:29PM -0500, Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > In pg_upgrade, copy fsm, vm, and extent files by checking for file
> > existence via open(), rather than collecting a directory listing and
> > looking up matching relfilenode files with sequential scans of the
> > array. This speeds up pg_upgrade by 2x for a large number of tables,
> > e.g. 16k.
>
> Uh ... you replaced a strcmp() with an open()?

Yes, strcmp() on all elements of an array.

> I'm prepared to believe that's a win for sufficiently large N, if you
> assume that the filesystem is smart enough to have O(1) lookup time
> regardless of the directory size ... but that doesn't seem like a very
> good assumption, and in any case surely this loses badly for a smaller
> number of files.
>
> You would have been better off keeping the array and sorting it so you
> could use binary search, instead of passing the problem off to the
> filesystem.

Well, testing showed using open() was a big win. To do this with the
directory listing, as I mentioned, you need to pull listings from all
tablespaces, sort them, then do a binary search. I thought the removal
of the directory array code itself was a win, and I figured the
directory code was already doing a binary search in the kernel.

Do you want me to code up a test?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: In pg_upgrade, copy fsm, vm, and extent files by checking for fi
Date: 2012-11-14 23:15:30
Message-ID: 28536.1352934930@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> On Wed, Nov 14, 2012 at 05:39:29PM -0500, Tom Lane wrote:
>> You would have been better off keeping the array and sorting it so you
>> could use binary search, instead of passing the problem off to the
>> filesystem.

> Well, testing showed using open() was a big win.

... on the filesystem you tested on. I'm concerned that it might not
look so good on other platforms.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: In pg_upgrade, copy fsm, vm, and extent files by checking for fi
Date: 2012-11-14 23:28:41
Message-ID: 20121114232841.GC6753@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Wed, Nov 14, 2012 at 06:15:30PM -0500, Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > On Wed, Nov 14, 2012 at 05:39:29PM -0500, Tom Lane wrote:
> >> You would have been better off keeping the array and sorting it so you
> >> could use binary search, instead of passing the problem off to the
> >> filesystem.
>
> > Well, testing showed using open() was a big win.
>
> ... on the filesystem you tested on. I'm concerned that it might not
> look so good on other platforms.

True. I am on ext3. So I need to generate a proof-of-concept patch and
have others test it?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: In pg_upgrade, copy fsm, vm, and extent files by checking for fi
Date: 2012-11-14 23:57:13
Message-ID: 29287.1352937433@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

BTW, this patch isn't looking so good on Windows. Buildfarm member
chough says

"C:\prog\bf\root\HEAD\pgsql.11252\pgsql.sln" (default target) (1) ->
(contrib\pg_upgrade target) ->
.\contrib\pg_upgrade\relfilenode.c(202): warning C4003: not enough actual parameters for macro 'open'

"C:\prog\bf\root\HEAD\pgsql.11252\pgsql.sln" (default target) (1) ->
(contrib\pg_upgrade target) ->
.\contrib\pg_upgrade\relfilenode.c(202): error C2059: syntax error : ')'
.\contrib\pg_upgrade\relfilenode.c(207): error C2181: illegal else without matching if
.\contrib\pg_upgrade\relfilenode.c(242): error C2059: syntax error : 'return'
.\contrib\pg_upgrade\relfilenode.c(243): error C2059: syntax error : '}'

1 Warning(s)
4 Error(s)

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: In pg_upgrade, copy fsm, vm, and extent files by checking for fi
Date: 2012-11-15 00:03:50
Message-ID: 20121115000350.GE6753@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Wed, Nov 14, 2012 at 06:57:13PM -0500, Tom Lane wrote:
> BTW, this patch isn't looking so good on Windows. Buildfarm member
> chough says
>
> "C:\prog\bf\root\HEAD\pgsql.11252\pgsql.sln" (default target) (1) ->
> (contrib\pg_upgrade target) ->
> .\contrib\pg_upgrade\relfilenode.c(202): warning C4003: not enough actual parameters for macro 'open'
>
>
> "C:\prog\bf\root\HEAD\pgsql.11252\pgsql.sln" (default target) (1) ->
> (contrib\pg_upgrade target) ->
> .\contrib\pg_upgrade\relfilenode.c(202): error C2059: syntax error : ')'
> .\contrib\pg_upgrade\relfilenode.c(207): error C2181: illegal else without matching if
> .\contrib\pg_upgrade\relfilenode.c(242): error C2059: syntax error : 'return'
> .\contrib\pg_upgrade\relfilenode.c(243): error C2059: syntax error : '}'
>
> 1 Warning(s)
> 4 Error(s)

OK, fixed by adding a third open() parameter of 0. I see our other code
does this too. I am not sure what those syntax errors are but I am
guessing the failed macro messed them up.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: In pg_upgrade, copy fsm, vm, and extent files by checking for fi
Date: 2012-11-24 03:31:13
Message-ID: 20121124033113.GC9382@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Wed, Nov 14, 2012 at 06:28:41PM -0500, Bruce Momjian wrote:
> On Wed, Nov 14, 2012 at 06:15:30PM -0500, Tom Lane wrote:
> > Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > > On Wed, Nov 14, 2012 at 05:39:29PM -0500, Tom Lane wrote:
> > >> You would have been better off keeping the array and sorting it so you
> > >> could use binary search, instead of passing the problem off to the
> > >> filesystem.
> >
> > > Well, testing showed using open() was a big win.
> >
> > ... on the filesystem you tested on. I'm concerned that it might not
> > look so good on other platforms.
>
> True. I am on ext3. So I need to generate a proof-of-concept patch and
> have others test it?

OK, test patch attached. The patch will only work if your database uses
a single tablespace. Doing multiple tablespaces seemed too complex for
the test patch.

Here are my results:

# tables git bsearch patch
1 11.16 10.99
1000 19.13 19.26
2000 26.78 27.11
4000 43.81 42.15
8000 79.96 77.38
16000 165.26 162.24
32000 378.18 368.49
64000 1083.35 1086.77

As you can see, the bsearch code doesn't see to make much difference.
Sometimes it is faster, other times, slower --- seem to be just
measurement noise. This code uses the same method of file lookup as git
head, meaning it looks for specific files rather than patterns --- this
simplified the bsearch code.

Can anyone see a consistent improvement with the bsearch patch?
Attached is my test shell script. There is no reason we can't use
bsearch(), except not using it allows us to remove the pg_upgrade
directory listing function.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment Content-Type Size
pg_upgrade.diff text/x-diff 8.3 KB
test.sh application/x-sh 2.3 KB