Re: Faster StrNCpy

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	Faster StrNCpy
Date:	2006-09-26 20:24:51
Message-ID:	29452.1159302291@sss.pgh.pa.us
Lists:	pgsql-hackers pgsql-patches

David Strong points out here
http://archives.postgresql.org/pgsql-hackers/2006-09/msg02071.php
that some popular implementations of strncpy(dst,src,n) are quite
inefficient when strlen(src) is much less than n, because they don't
optimize the zero-pad step that is required by the standard.

It looks to me like we have a good number of places that are using
either StrNCpy or strncpy directly to copy into large buffers that
we do not need full zero-padding in, only a single guaranteed null
byte. While not all of these places are in performance-critical
paths, some are. David identified set_ps_display, and the other
thing that's probably significant is unnecessary use of strncpy
for keys of string-keyed hash tables. (We used to actually need
zero padding for string-keyed hash keys, but that was a long time ago.)
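As a minimal sketch of the zero-padding cost described above: the helper below (a hypothetical name, not PostgreSQL code) fills a buffer with a sentinel byte, calls strncpy(), and counts how many bytes were overwritten. It assumes the source string does not itself contain the sentinel byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SENTINEL 0x7f           /* assumption: never appears in src */

/*
 * Hypothetical helper: report how many of the n destination bytes
 * strncpy() wrote when copying src into dst.
 */
static size_t
bytes_touched_by_strncpy(char *dst, const char *src, size_t n)
{
    size_t      i;
    size_t      touched = 0;

    assert(dst != NULL && src != NULL);
    memset(dst, SENTINEL, n);
    strncpy(dst, src, n);       /* pads the rest of dst with '\0' */
    for (i = 0; i < n; i++)
    {
        if (dst[i] != SENTINEL)
            touched++;
    }
    return touched;
}
```

For a 3-character source and a 1024-byte buffer the count is the full 1024: three copied characters plus 1021 padding bytes, where a single terminator would have been enough.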

I propose adding an additional macro in c.h, along the lines of

#define StrNCopy(dst,src,len) \
    do \
    { \
        char * _dst = (dst); \
        Size _len = (len); \
\
        if (_len > 0) \
        { \
            const char * _src = (src); \
            Size _src_len = strlen(_src); \
\
            if (_src_len < _len) \
                memcpy(_dst, _src, _src_len + 1); \
            else \
            { \
                memcpy(_dst, _src, _len - 1); \
                _dst[_len-1] = '\0'; \
            } \
        } \
    } while (0)

Unlike StrNCpy, this requires that the source string be null-terminated,
so it would not be a drop-in replacement everywhere. Also, it could be
a performance loss if strlen(src) is much larger than len ... but that
is not usually the case for the places we'd want to apply it.
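For comparison, the same logic can be sketched as an ordinary function rather than a macro. The name str_ncopy is hypothetical and size_t stands in for PostgreSQL's Size; this only illustrates the proposal and is not code from the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst (capacity len bytes), always leaving dst
 * null-terminated when len > 0, without zero-padding the remainder.
 * src must be null-terminated.
 */
static void
str_ncopy(char *dst, const char *src, size_t len)
{
    size_t      src_len;

    assert(dst != NULL && src != NULL);
    if (len == 0)
        return;

    src_len = strlen(src);
    if (src_len < len)
        memcpy(dst, src, src_len + 1);  /* fits, terminator included */
    else
    {
        memcpy(dst, src, len - 1);      /* truncate */
        dst[len - 1] = '\0';
    }
}
```

Bytes of dst past the terminator are left untouched, which is the difference from strncpy() that matters here.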

Thoughts, objections? In particular, is the name OK, or do we need
something a bit further away from StrNCpy?

regards, tom lane

From:	Martijn van Oosterhout <kleptog(at)svana(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-26 20:40:18
Message-ID:	20060926204018.GF19913@svana.org
Lists:	pgsql-hackers pgsql-patches

On Tue, Sep 26, 2006 at 04:24:51PM -0400, Tom Lane wrote:
> David Strong points out here
> http://archives.postgresql.org/pgsql-hackers/2006-09/msg02071.php
> that some popular implementations of strncpy(dst,src,n) are quite
> inefficient when strlen(src) is much less than n, because they don't
> optimize the zero-pad step that is required by the standard.

I think that's why strlcpy was invented, to deal with the issues with
strncpy.

http://www.gratisoft.us/todd/papers/strlcpy.html

There's an implementation here (used in glib), though you could
probably find more.

http://mail.gnome.org/archives/gtk-devel-list/2000-May/msg00029.html

Do you really think it's worth making a macro rather than just a normal
function?

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

From:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To:	Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-26 20:49:37
Message-ID:	20060926204937.GA22101@alvh.no-ip.org
Lists:	pgsql-hackers pgsql-patches

Martijn van Oosterhout wrote:
> On Tue, Sep 26, 2006 at 04:24:51PM -0400, Tom Lane wrote:
> > David Strong points out here
> > http://archives.postgresql.org/pgsql-hackers/2006-09/msg02071.php
> > that some popular implementations of strncpy(dst,src,n) are quite
> > inefficient when strlen(src) is much less than n, because they don't
> > optimize the zero-pad step that is required by the standard.
>
> I think that's why strlcpy was invented, to deal with the issues with
> strncpy.
>
> http://www.gratisoft.us/todd/papers/strlcpy.html
>
> There's an implementation here (used in glib), though you could
> probably find more.
>
> http://mail.gnome.org/archives/gtk-devel-list/2000-May/msg00029.html

That one would be LGPL (glib's license). Here is OpenBSD's version,
linked from that one:

ftp://ftp.openbsd.org/pub/OpenBSD/src/lib/libc/string/strlcpy.c

You'll notice that it iterates once per char. Between that and the
strlen() call in Tom's version, not sure which is the lesser evil.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-26 20:53:59
Message-ID:	29714.1159304039@sss.pgh.pa.us
Lists:	pgsql-hackers pgsql-patches

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> I think that's why strlcpy was invented, to deal with the issues with
> strncpy.
> http://www.gratisoft.us/todd/papers/strlcpy.html

strlcpy does more than we need (note that none of the existing uses care
about counting the overflowed bytes). Not sure if it's worth adopting
those semantics when they're not really standard, but if you think a lot
of people would be familiar with strlcpy, maybe we should.
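For readers unfamiliar with those semantics, here is a minimal sketch written from strlcpy's documented behaviour: copy at most size-1 bytes, always terminate when size > 0, and return strlen(src) so the caller can detect truncation. The name sketch_strlcpy is hypothetical; this is neither the OpenBSD source nor the code that would go into src/port.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Byte-at-a-time bounded copy with strlcpy-style return value.
 * Returns the length of src; a result >= size means dst was truncated.
 */
static size_t
sketch_strlcpy(char *dst, const char *src, size_t size)
{
    const char *s = src;
    size_t      left = size;

    assert(dst != NULL && src != NULL);
    if (left != 0)
    {
        /* copy up to size-1 bytes, stopping at the terminator */
        while (--left != 0)
        {
            if ((*dst++ = *s++) == '\0')
                return (size_t) (s - src - 1);
        }
        *dst = '\0';            /* out of room: terminate dst */
    }
    /* walk the rest of src only to compute its length */
    while (*s != '\0')
        s++;
    return (size_t) (s - src);
}
```

The final loop is the "counting the overflowed bytes" part that none of the existing callers need.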

> Do you really think it's worth making a macro rather than just a normal
> function?

Only in that a macro in c.h is less work than a configure test plus a
replacement file in src/port. But if we want to consider this a
standard function that just doesn't happen to exist everywhere, I
suppose we should use configure.

regards, tom lane

From:	Neil Conway <neilc(at)samurai(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-26 21:03:50
Message-ID:	1159304630.1462.15.camel@localhost.localdomain
Lists:	pgsql-hackers pgsql-patches

On Tue, 2006-09-26 at 16:53 -0400, Tom Lane wrote:
> strlcpy does more than we need (note that none of the existing uses care
> about counting the overflowed bytes). Not sure if it's worth adopting
> those semantics when they're not really standard, but if you think a lot
> of people would be familiar with strlcpy, maybe we should.

I think we should -- while strlcpy() is not standardized, it is widely
used (in libc on all the BSDs, Solaris and OS X, as well as private
copies in Linux, glib, etc.).

A wholesale replacement of strncpy() calls is probably worth doing --
replacing them with strlcpy() if the source string is NUL-terminated,
and I suppose memcpy() otherwise.

-Neil

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc:	Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-26 21:04:14
Message-ID:	29846.1159304654@sss.pgh.pa.us
Lists:	pgsql-hackers pgsql-patches

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> You'll notice that it iterates once per char. Between that and the
> strlen() call in Tom's version, not sure which is the lesser evil.

Yeah, I was wondering that too. My code would require two scans of the
source string (one inside strlen and one in memcpy), but in much of our
usage the source and dest should be reasonably well aligned and one
could expect memcpy to be using word rather than byte operations, so you
might possibly make it back on the strength of fewer write cycles. And
on the third hand, for short source strings none of this matters and
the extra function call involved for strlen/memcpy probably dominates.

I'm happy to just use the OpenBSD version as a src/port module.
Any objections?

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Neil Conway <neilc(at)samurai(dot)com>
Cc:	Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-26 21:12:25
Message-ID:	29984.1159305145@sss.pgh.pa.us
Lists:	pgsql-hackers pgsql-patches

Neil Conway <neilc(at)samurai(dot)com> writes:
> A wholesale replacement of strncpy() calls is probably worth doing --
> replacing them with strlcpy() if the source string is NUL-terminated,
> and I suppose memcpy() otherwise.

What I'd like to do immediately is put in strlcpy() and hit the two or
three places I think are performance-relevant. I agree with trying to
get rid of StrNCpy/strncpy calls over the long run, but it's just code
beautification and probably not appropriate for beta.

regards, tom lane

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Neil Conway <neilc(at)samurai(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject:	Re: Faster StrNCpy
Date:	2006-09-27 00:42:20
Message-ID:	200609261742.21745.josh@agliodbs.com
Lists:	pgsql-hackers pgsql-patches

Tom,

> What I'd like to do immediately is put in strlcpy() and hit the two or
> three places I think are performance-relevant. I agree with trying to
> get rid of StrNCpy/strncpy calls over the long run, but it's just code
> beautification and probably not appropriate for beta.

Immediately? Presumably you mean for 8.3?

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	josh(at)agliodbs(dot)com
Cc:	pgsql-hackers(at)postgresql(dot)org, Neil Conway <neilc(at)samurai(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject:	Re: Faster StrNCpy
Date:	2006-09-27 02:25:48
Message-ID:	2810.1159323948@sss.pgh.pa.us
Lists:	pgsql-hackers pgsql-patches

Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> What I'd like to do immediately is put in strlcpy() and hit the two or
>> three places I think are performance-relevant.

> Immediately? Presumably you mean for 8.3?

No, I mean now. This is a performance bug and it's still open season on
bugs. If we were close to having a release-candidate version, I'd hold
off, but the above proposal seems sufficiently low-risk for the current
stage of the cycle.

regards, tom lane

From:	"Strong, David" <david(dot)strong(at)unisys(dot)com>
To:	<pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Faster StrNCpy
Date:	2006-09-27 13:02:46
Message-ID:	B6419AF36AC8524082E1BC17DA2506E802579E0C@USMV-EXCH2.na.uis.unisys.com
Lists:	pgsql-hackers pgsql-patches

Tom,

Let us know when you've added strlcpy () and we'll be happy to run some tests on the new code.

David

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, Neil Conway <neilc(at)samurai(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject:	Re: Faster StrNCpy
Date:	2006-09-27 13:22:30
Message-ID:	451A7B16.5040702@dunslane.net
Lists:	pgsql-hackers pgsql-patches

Tom Lane wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>
>>> What I'd like to do immediately is put in strlcpy() and hit the two or
>>> three places I think are performance-relevant.
>>>
>
>
>> Immediately? Presumably you mean for 8.3?
>>
>
> No, I mean now. This is a performance bug and it's still open season on
> bugs. If we were close to having a release-candidate version, I'd hold
> off, but the above proposal seems sufficiently low-risk for the current
> stage of the cycle.
>
>

What are the other hotspots?

cheers

andrew

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, Neil Conway <neilc(at)samurai(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject:	Re: Faster StrNCpy
Date:	2006-09-27 13:49:26
Message-ID:	8425.1159364966@sss.pgh.pa.us
Lists:	pgsql-hackers pgsql-patches

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>> What I'd like to do immediately is put in strlcpy() and hit the two or
>> three places I think are performance-relevant.

> What are the other hotspots?

The ones I can think of offhand are set_ps_display and use of strncpy as
a HashCopyFunc.

regards, tom lane

From:	"Strong, David" <david(dot)strong(at)unisys(dot)com>
To:	<pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Faster StrNCpy
Date:	2006-09-27 14:08:05
Message-ID:	B6419AF36AC8524082E1BC17DA2506E802579E0F@USMV-EXCH2.na.uis.unisys.com
Lists:	pgsql-hackers pgsql-patches

We sometimes see TupleDescInitEntry () taking high CPU times via OProfile. This does include, amongst a lot of other code, a call to namestrcpy () which in turn calls StrNCpy (). Perhaps this is not a good candidate right now as a name string is only 64 bytes.

David

From:	mark(at)mark(dot)mielke(dot)cc
To:	"Strong, David" <david(dot)strong(at)unisys(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-27 23:26:40
Message-ID:	20060927232639.GA15401@mark.mielke.cc
Lists:	pgsql-hackers pgsql-patches

On Wed, Sep 27, 2006 at 07:08:05AM -0700, Strong, David wrote:
> We sometimes see TupleDescInitEntry () taking high CPU times via
> OProfile. This does include, amongst a lot of other code, a call to
> namestrcpy () which in turn calls StrNCpy (). Perhaps this is not a
> good candidate right now as a name string is only 64 bytes.

Just wondering - are any of these cases where a memcpy() would work
just as well? Or are you not sure that the source string is at least
64 bytes in length?

memcpy(&target, &source, sizeof(target));
target[sizeof(target)-1] = '\0';

I imagine any extra checking causes processor stalls, or at least for
the branch prediction to fill up? Straight copies might allow for
maximum parallelism? If it's only 64 bytes, on processors such as
Pentium or Athlon, that's 2 or 4 cache lines, and writes are always
performed as cache lines.

I haven't seen the code that you and Tom are looking at to tell
whether it is safe to do this or not.

Cheers,
mark

--
mark(at)mielke(dot)cc / markm(at)ncf(dot)ca / markm(at)nortel(dot)com
Neighbourhood Coder, Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

From:	"Adnan DURSUN" <a_dursun(at)hotmail(dot)com>
To:	<pgsql-hackers(at)postgresql(dot)org>
Subject:	Can i see server SQL commands ?
Date:	2006-09-28 01:27:36
Message-ID:	BAY106-DAV2345898C65DDEFF9066DC2FA1B0@phx.gbl
Lists:	pgsql-hackers pgsql-patches

Hi all

I want to know what is going on while a DML command runs. For example:
which commands are executed by the core when we send an "UPDATE tab
SET col = val1..." and there is a foreign key or a unique constraint on
table "tab"?

How can I see that?

Best regards

Adnan DURSUN
ASRIN Bilişim Ltd.

From:	tomas(at)tuxteam(dot)de
To:	Adnan DURSUN <a_dursun(at)hotmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Can i see server SQL commands ?
Date:	2006-09-28 07:05:22
Message-ID:	20060928070522.GA31054@www.trapp.net
Lists:	pgsql-hackers pgsql-patches

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Sep 28, 2006 at 04:27:36AM +0300, Adnan DURSUN wrote:
>
> Hi all
>
> I wanna know what is going on while a DML command works. For example
> ;
> Which commands are executed by the core when we send an "UPDATE tab
> SET col = val1..."

Adnan,

this mailing list is not the right one for such questions. More
appropriate would be <pgsql-novice(at)postgresql(dot)org> or maybe
<pgsql-general(at)postgresql(dot)org>.

Having said that, you may set the log level of the server in the
configuration file (whose location depends on your OS and PostgreSQL
version). Look there for a line log_statement = XXX and set XXX to
'all'. Don't forget to restart your server afterwards.

HTH
- -- tomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFFG3QyBcgs9XrR2kYRAgdqAJ0VnUw5+Q79HiIwHocHIw4TWHePaQCffBBK
ASn3Z6XpKG91NTrmEaBtz08=
=Ibh3
-----END PGP SIGNATURE-----

From:	"Strong, David" <david(dot)strong(at)unisys(dot)com>
To:	<pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Faster StrNCpy
Date:	2006-09-28 14:51:36
Message-ID:	B6419AF36AC8524082E1BC17DA2506E80310217B@USMV-EXCH2.na.uis.unisys.com
Lists:	pgsql-hackers pgsql-patches

Mark,

In the specific case of the namestrcpy () function, it will copy a
maximum of 64 bytes, but the length of the source string is unknown. I
would have to think that memcpy () would certainly win if you knew the
source and destination sizes etc. Perhaps there are some places like
that in the code that don't use memcpy () currently?

David

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Strong, David" <david(dot)strong(at)unisys(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-28 15:56:04
Message-ID:	26197.1159458964@sss.pgh.pa.us
Lists:	pgsql-hackers pgsql-patches

"Strong, David" <david(dot)strong(at)unisys(dot)com> writes:
> Just wondering - are any of these cases where a memcpy() would work
> just as well? Or are you not sure that the source string is at least
> 64 bytes in length?

In most cases, we're pretty sure that it's *not* --- it'll just be a
palloc'd C string.

I'm disinclined to fool with the restriction that namestrcpy zero-pad
Name values, because they might end up on disk, and allowing random
memory contents to get written out is ungood from a security point of
view. However, it's entirely possible that it'd be a bit faster to do
a MemSet followed by strlcpy than to use strncpy for zero-padding.
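The MemSet-then-bounded-copy idea can be sketched as below. All names are hypothetical, SKETCH_NAMEDATALEN assumes the 64-byte name size mentioned earlier in the thread, plain memset() stands in for MemSet, and strlen() plus memcpy() stands in for strlcpy() because not every libc provides it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NAMEDATALEN 64   /* assumption: 64-byte Name values */

/*
 * Zero the whole destination first, then copy at most
 * SKETCH_NAMEDATALEN - 1 bytes. The result is null-terminated and
 * fully zero-padded, so no stale memory could reach disk.
 */
static void
sketch_namestrcpy(char *name, const char *str)
{
    size_t      len;

    assert(name != NULL && str != NULL);
    memset(name, 0, SKETCH_NAMEDATALEN);
    len = strlen(str);
    if (len > SKETCH_NAMEDATALEN - 1)
        len = SKETCH_NAMEDATALEN - 1;   /* truncate over-long names */
    memcpy(name, str, len);
}
```

Whether this beats strncpy() on a given platform would still need to be measured.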

regards, tom lane

From:	Markus Schaber <schabi(at)logix-tt(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-29 09:21:21
Message-ID:	451CE591.9030108@logix-tt.com
Lists:	pgsql-hackers pgsql-patches

Hi, Tom,

Tom Lane wrote:
> "Strong, David" <david(dot)strong(at)unisys(dot)com> writes:
>> Just wondering - are any of these cases where a memcpy() would work
>> just as well? Or are you not sure that the source string is at least
>> 64 bytes in length?
>
> In most cases, we're pretty sure that it's *not* --- it'll just be a
> palloc'd C string.
>
> I'm disinclined to fool with the restriction that namestrcpy zero-pad
> Name values, because they might end up on disk, and allowing random
> memory contents to get written out is ungood from a security point of
> view.

There's another disadvantage of always copying 64 bytes:

It may be that the 64-byte range crosses a page boundary. Now guess what
happens when this next page is not mapped -> segfault.

Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Markus Schaber <schabi(at)logix-tt(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-29 14:59:22
Message-ID:	16166.1159541962@sss.pgh.pa.us
Lists:	pgsql-hackers pgsql-patches

Markus Schaber <schabi(at)logix-tt(dot)com> writes:
> There's another disadvantage of always copying 64 bytes:
> It may be that the 64-byte range crosses a page boundary. Now guess what
> happens when this next page is not mapped -> segfault.

Irrelevant, because in all interesting cases the Name field is part of a
larger record that would stretch into that other page anyway.

regards, tom lane

From:	mark(at)mark(dot)mielke(dot)cc
To:	Markus Schaber <schabi(at)logix-tt(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-29 20:42:39
Message-ID:	20060929204239.GA30048@mark.mielke.cc
Lists:	pgsql-hackers pgsql-patches

On Fri, Sep 29, 2006 at 11:21:21AM +0200, Markus Schaber wrote:
> Tom Lane wrote:
> >> Just wondering - are any of these cases where a memcpy() would work
> >> just as well? Or are you not sure that the source string is at least
> >> 64 bytes in length?
> >
> > In most cases, we're pretty sure that it's *not* --- it'll just be a
> > palloc'd C string.
> >
> > I'm disinclined to fool with the restriction that namestrcpy zero-pad
> > Name values, because they might end up on disk, and allowing random
> > memory contents to get written out is ungood from a security point of
> > view.
>
> There's another disadvantage of always copying 64 bytes:
>
> It may be that the 64-byte range crosses a page boundary. Now guess what
> happens when this next page is not mapped -> segfault.

With strncpy(), this possibility already exists. If it is a real problem,
that stand-alone 64-byte allocations are crossing page boundaries, the
fault is with the memory allocator, not with the user of the memory.

For strlcpy(), my suggestion that Tom quotes was that modern processors
do best when instructions can be fully parallelized. It is a lot
easier to parallelize a 64-byte copy, than a tight loop looking for
'\0' or n >= 64. 64 bytes easily fits into cache memory, and modern
processors write cache memory in blocks of 16, 32, or 64 bytes anyways,
meaning that any savings in terms of not writing are minimal.

But it's only safe if you know that the source string allocation is
>= 64 bytes. Often you don't, therefore it isn't safe, and the suggestion
is unworkable.

Cheers,
mark

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

From:	mark(at)mark(dot)mielke(dot)cc
To:	pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: Faster StrNCpy
Date:	2006-09-29 21:23:31
Message-ID:	20060929212331.GB30048@mark.mielke.cc
Lists:	pgsql-hackers pgsql-patches

If anybody is curious, here are my numbers for an AMD X2 3800+:

$ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to be slow."' -o x x.c y.c strlcpy.c ; ./x
NONE: 620268 us
MEMCPY: 683135 us
STRNCPY: 7952930 us
STRLCPY: 10042364 us

$ gcc -O3 -std=c99 -DSTRING='"Short sentence."' -o x x.c y.c strlcpy.c ; ./x
NONE: 554694 us
MEMCPY: 691390 us
STRNCPY: 7759933 us
STRLCPY: 3710627 us

$ gcc -O3 -std=c99 -DSTRING='""' -o x x.c y.c strlcpy.c ; ./x
NONE: 631266 us
MEMCPY: 775340 us
STRNCPY: 7789267 us
STRLCPY: 550430 us

Each invocation represents 100 million calls to each of the functions.
Each function accepts a 'dst' and 'src' argument, and assumes that it
is copying 64 bytes from 'src' to 'dst'. The none function does
nothing. The memcpy calls memcpy(), the strncpy calls strncpy(), and
the strlcpy calls the strlcpy() that was posted from the BSD sources.
(GLIBC doesn't have strlcpy() on my machine).

This makes it clear what the overhead of the additional logic involves.
memcpy() is approximately equal to nothing at all. strncpy() is always
expensive. strlcpy() is often more expensive than memcpy(), except in
the empty string case.

These tests do not properly model the effects of real memory, however,
they do model the effects of cache memory. I would suggest that the
results are exaggerated, but not invalid.

For anybody doubting the none vs memcpy, I've included the generated
assembly code. I chalk it entirely up to fully utilizing the
parallelization capability of the CPU. Although 16 movq instructions
are executed, they can be executed fully in parallel.

It almost makes it clear to me that all of these instructions are
pretty fast. Are we sure this is a real bottleneck? Even the slowest
operation above, strlcpy() on a very long string, appears to execute
10 per microsecond? Perhaps my tests are too easy for my CPU and I
need to make it access many different 64-byte blocks? :-)
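The per-call arithmetic behind that estimate can be written out as a small sketch. The function names are hypothetical; the inputs in the note after the code are the long-string figures reported above.

```c
#include <assert.h>

/* Calls completed per microsecond of elapsed time. */
static double
calls_per_us(double calls, double elapsed_us)
{
    assert(elapsed_us > 0.0);
    return calls / elapsed_us;
}

/*
 * Nanoseconds per call after subtracting the do-nothing (NONE)
 * baseline, which covers loop and call overhead.
 */
static double
net_ns_per_call(double total_us, double baseline_us, double calls)
{
    assert(calls > 0.0);
    return (total_us - baseline_us) * 1000.0 / calls;
}
```

With the long-string STRLCPY figure (10042364 us for 100 million calls) this gives just under 10 calls per microsecond. Subtracting the NONE baseline of 620268 us leaves roughly 94 ns per call.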

Cheers,
mark

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

Attachment	Content-Type	Size
x.s	text/plain	1.7 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	mark(at)mark(dot)mielke(dot)cc
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-29 21:34:30
Message-ID:	3776.1159565670@sss.pgh.pa.us
Lists:	pgsql-hackers pgsql-patches

mark(at)mark(dot)mielke(dot)cc writes:
> If anybody is curious, here are my numbers for an AMD X2 3800+:

You did not show your C code, so no one else can reproduce the test on
other hardware. However, it looks like your compiler has unrolled the
memcpy into straight-line 8-byte moves, which makes it pretty hard for
anything operating byte-wise to compete, and is a bit dubious for the
general case anyway (since it requires assuming that the size and
alignment are known at compile time).

This does make me wonder about whether we shouldn't try the
strlen+memcpy implementation I proposed earlier ...

regards, tom lane

From:	mark(at)mark(dot)mielke(dot)cc
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Faster StrNCpy
Date:	2006-09-29 21:59:17
Message-ID:	20060929215917.GC30048@mark.mielke.cc
Lists:	pgsql-hackers pgsql-patches

On Fri, Sep 29, 2006 at 05:34:30PM -0400, Tom Lane wrote:
> mark(at)mark(dot)mielke(dot)cc writes:
> > If anybody is curious, here are my numbers for an AMD X2 3800+:
> You did not show your C code, so no one else can reproduce the test on
> other hardware. However, it looks like your compiler has unrolled the
> memcpy into straight-line 8-byte moves, which makes it pretty hard for
> anything operating byte-wise to compete, and is a bit dubious for the
> general case anyway (since it requires assuming that the size and
> alignment are known at compile time).

I did show the .s code. I call into x_memcpy(a, b), meaning that the
compiler can't assume anything. It may happen to be aligned.

Here are results over 64 Mbytes of memory, to ensure that every call is
a cache miss:

$ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to be very slow."' -DN="(1024*1024)" -o x x.c y.c strlcpy.c ; ./x
NONE: 767243 us
MEMCPY: 6044137 us
STRNCPY: 10741759 us
STRLCPY: 12061630 us
LENCPY: 9459099 us

$ gcc -O3 -std=c99 -DSTRING='"Short sentence."' -DN="(1024*1024)" -o x x.c y.c strlcpy.c ; ./x
NONE: 712193 us
MEMCPY: 6072312 us
STRNCPY: 9982983 us
STRLCPY: 6605052 us
LENCPY: 7128258 us

$ gcc -O3 -std=c99 -DSTRING='""' -DN="(1024*1024)" -o x x.c y.c strlcpy.c ; ./x
NONE: 708164 us
MEMCPY: 6042817 us
STRNCPY: 8885791 us
STRLCPY: 5592477 us
LENCPY: 6135550 us

At least on my machine, memcpy() still comes out on top. Yes, assuming that
it is aligned correctly for the machine. Here is unaligned (all arrays are
stored +1 offset in memory):

$ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to be very slow."' -DN="(1024*1024)" -DALIGN=1 -o x x.c y.c strlcpy.c ; ./x
NONE: 790932 us
MEMCPY: 6591559 us
STRNCPY: 10622291 us
STRLCPY: 12070007 us
LENCPY: 10322541 us

$ gcc -O3 -std=c99 -DSTRING='"Short sentence."' -DN="(1024*1024)" -DALIGN=1 -o x x.c y.c strlcpy.c ; ./x
NONE: 764577 us
MEMCPY: 6631731 us
STRNCPY: 9513540 us
STRLCPY: 6615345 us
LENCPY: 7263392 us

$ gcc -O3 -std=c99 -DSTRING='""' -DN="(1024*1024)" -DALIGN=1 -o x x.c y.c strlcpy.c ; ./x
NONE: 825689 us
MEMCPY: 6607777 us
STRNCPY: 8976487 us
STRLCPY: 5878088 us
LENCPY: 6180358 us

Alignment looks like it does impact the results for memcpy(). memcpy()
changes from around 6.0 seconds to 6.6 seconds. Overall, though, it is
still the winner in all cases except for strlcpy(), which beats it on
very short strings ("").

Here is the cache hit case including your strlen+memcpy as 'LENCPY':

$ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to be very slow."' -DN=1 -o x x.c y.c strlcpy.c ; ./x
NONE: 696157 us
MEMCPY: 825118 us
STRNCPY: 7983159 us
STRLCPY: 10787462 us
LENCPY: 6048339 us

$ gcc -O3 -std=c99 -DSTRING='"Short sentence."' -DN=1 -o x x.c y.c strlcpy.c ; ./x
NONE: 700201 us
MEMCPY: 593701 us
STRNCPY: 7577380 us
STRLCPY: 3727801 us
LENCPY: 3169783 us

$ gcc -O3 -std=c99 -DSTRING='""' -DN=1 -o x x.c y.c strlcpy.c ; ./x
NONE: 706283 us
MEMCPY: 792719 us
STRNCPY: 7870425 us
STRLCPY: 681334 us
LENCPY: 2062983 us

In the N=1 runs, every call is a cache hit. In the N=1024*1024 runs, every
call is a cache miss, and the 64-byte blocks are spread equally over 64 Mbytes
of memory. I've attached the code for your consideration. x.c is the routines
I used to perform the tests. y.c is the main program. strlcpy.c is copied
from the online reference as is without change. The compilation steps
are described above. STRING is the string to try out. N is the number
of 64-byte blocks to allocate. ALIGN is the number of bytes to offset
the array by when storing / reading / writing. ALIGN should be >= 0.

At N=1, it's all in cache. At N=1024*1024 it is taking up 64 Mbytes of
RAM.
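
For readers without the attachments, here is a rough sketch of the layout described above: N blocks of 64 bytes, each addressed at an ALIGN-byte offset, with a strlen+memcpy copy standing in for LENCPY. It is not the attached x.c or y.c; the names lencpy, alloc_blocks, block_at and fill_blocks are assumptions made up for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 64			/* block size stated in the message */

/*
 * strlen + memcpy copy, the approach the results label LENCPY.
 * src must be NUL-terminated; dst is always terminated if dstsize > 0.
 */
static void
lencpy(char *dst, const char *src, size_t dstsize)
{
	size_t		len;

	if (dstsize == 0)
		return;
	len = strlen(src);
	if (len < dstsize)
		memcpy(dst, src, len + 1);		/* string plus its terminator */
	else
	{
		memcpy(dst, src, dstsize - 1);	/* truncate */
		dst[dstsize - 1] = '\0';
	}
}

/* Hypothetical helper: room for nblocks blocks plus the alignment offset. */
static char *
alloc_blocks(size_t nblocks, size_t align)
{
	return malloc(nblocks * BLOCK_SIZE + align);
}

/* Hypothetical helper: address of block i, shifted by align bytes. */
static char *
block_at(char *base, size_t i, size_t align)
{
	return base + align + i * BLOCK_SIZE;
}

/*
 * Copy src into every block once. With nblocks = 1 the same 64 bytes are
 * touched on every pass; with nblocks = 1024*1024 the pass walks 64 Mbytes.
 */
static void
fill_blocks(char *base, size_t nblocks, size_t align, const char *src)
{
	for (size_t i = 0; i < nblocks; i++)
		lencpy(block_at(base, i, align), src, BLOCK_SIZE);
}
```

A timing loop would call fill_blocks() repeatedly and swap lencpy() for each of the other copy methods in turn.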

Cheers,
mark

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

Attachment	Content-Type	Size
x.c	text/plain	580 bytes
y.c	text/plain	2.1 KB
strlcpy.c	text/plain	1.8 KB

From:	"Strong, David" <david(dot)strong(at)unisys(dot)com>
To:	<pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Faster StrNCpy
Date:	2006-10-02 16:06:35
Message-ID:	B6419AF36AC8524082E1BC17DA2506E802579E2C@USMV-EXCH2.na.uis.unisys.com
Lists:	pgsql-hackers pgsql-patches

Mark,

Thanks for attaching the C code for your test. I ran a few tests on a 3 GHz Intel Xeon Paxville (dual core) system. I hope the formatting of this table survives:

Method    Size    N=1024*1024    N=1

MEMCPY    63       6964927 us      582494 us
MEMCPY    32       7102497 us      582467 us
MEMCPY    16       7116358 us      582538 us
MEMCPY     8       6965239 us      582796 us
MEMCPY     4       6964722 us      583183 us

STRNCPY   63      10131174 us     8843010 us
STRNCPY   32      10648202 us     9563868 us
STRNCPY   16       9187398 us     7969947 us
STRNCPY    8       9275353 us     8042777 us
STRNCPY    4       9067570 us     8058532 us

STRLCPY   63      15045507 us    14379702 us
STRLCPY   32       8960303 us     8120471 us
STRLCPY   16       7393607 us     4915457 us
STRLCPY    8       7222983 us     3211931 us
STRLCPY    4       7181267 us     1725546 us

LENCPY    63       7608932 us     4416602 us
LENCPY    32       7252849 us     3807535 us
LENCPY    16      11680927 us    10331487 us
LENCPY     8      10409685 us     9660616 us
LENCPY     4      10824632 us     9525082 us

The first column is the copy method, the second column is the source string size (size of -DSTRING), the 3rd and 4th columns are the different -DN settings.

The memcpy () call is the clear winner, at all source string sizes. The strncpy () call is better than strlcpy (), until the size of the string decreases. This is probably due to the zero padding effect of strncpy. The lencpy () call starts out strong, but degrades as the size of the string decreases. This was a little surprising and I don't have an explanation for this behavior at this time.
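
To make the zero-padding effect concrete, here is a small sketch that counts how many bytes past the string and its terminator each approach writes. The helper names are hypothetical and this is not part of Mark's test code; it only illustrates that strncpy () always fills the destination out to n bytes, while a strlen+memcpy copy stops after the terminator.

```c
#include <stddef.h>
#include <string.h>

/*
 * Fill buf with 'x', copy src with strncpy, then count the zero bytes
 * that landed after the string's own terminator. Those bytes are the
 * padding the C standard requires strncpy to write.
 */
static size_t
padding_after_strncpy(char *buf, size_t n, const char *src)
{
	size_t		len = strlen(src);
	size_t		padded = 0;

	memset(buf, 'x', n);
	strncpy(buf, src, n);
	for (size_t i = len + 1; i < n; i++)
		if (buf[i] == '\0')
			padded++;
	return padded;
}

/*
 * Same measurement for a strlen + memcpy copy. This sketch assumes
 * strlen(src) < n, so there is no truncation case to handle.
 */
static size_t
padding_after_lencpy(char *buf, size_t n, const char *src)
{
	size_t		len = strlen(src);
	size_t		padded = 0;

	memset(buf, 'x', n);
	memcpy(buf, src, len + 1);
	for (size_t i = len + 1; i < n; i++)
		if (buf[i] == '\0')
			padded++;
	return padded;
}
```

With a 64-byte buffer and a 3-character source, the strncpy version writes 60 extra zero bytes and the strlen+memcpy version writes none.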

The AMD optimization manuals have some interesting examples for optimizations for memcpy, along the lines of cache line copies and prefetching:

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF#search=%22amd%20optimization%20manual%22

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf#search=%22amd%20optimization%20manual%22

There also used to be an interesting article on the SGI web site called "Optimizing CPU to Memory Accesses on the SGI Visual Workstations 320 and 540", but this seems to have been pulled. I did find a copy of the article here:

http://eunchul.com/database/board/cat.php?data=Win32_API&board_group=D42a8ff5c3a9b9

Obviously, different copy mechanisms suit different data sizes. So, I added a little debug to the strlcpy () function that was added to Postgres the other day. I ran a test against Postgres for ~15 minutes that used 2 client backends and the BG writer - 8330804 calls to strlcpy () were generated by the test.

Out of the 8330804 calls, 6226616 calls used a maximum copy size of 2213 bytes e.g. strlcpy (dest, src, 2213) and 2104074 calls used a maximum copy size of 64 bytes.

I know the 2213 size calls come from the set_ps_display () function. I don't know where the 64 size calls come from, yet.

In the 64 size case, all but 35 calls copy only 1 byte - I would assume this is the terminating NUL.

In the 2213 size case, 1933027 calls copy 20 bytes; 2189415 calls copy 5 bytes; 85550 calls copy 6 bytes and 2018482 calls copy 7 bytes.
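
For anyone wanting to collect similar numbers, counting of this kind could look roughly like the sketch below. It is not the debug code I actually added to Postgres; counted_strlcpy, the two counter arrays and MAX_TRACKED are hypothetical names, and the copy itself is a plain strlen+memcpy stand-in with strlcpy-style semantics. Bytes copied include the terminator, which is why an empty source counts as 1 byte.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TRACKED 4096		/* hypothetical cap; larger sizes share a bucket */

static unsigned long calls_by_dstsize[MAX_TRACKED + 1];
static unsigned long calls_by_copied[MAX_TRACKED + 1];

/*
 * Copy with strlcpy-style semantics (always terminates, returns
 * strlen(src)) and record the destination size and bytes written.
 */
static size_t
counted_strlcpy(char *dst, const char *src, size_t dstsize)
{
	size_t		srclen = strlen(src);
	size_t		copied;			/* bytes written, terminator included */

	if (dstsize == 0)
		copied = 0;
	else if (srclen < dstsize)
	{
		memcpy(dst, src, srclen + 1);
		copied = srclen + 1;
	}
	else
	{
		memcpy(dst, src, dstsize - 1);
		dst[dstsize - 1] = '\0';
		copied = dstsize;
	}

	calls_by_dstsize[dstsize <= MAX_TRACKED ? dstsize : MAX_TRACKED]++;
	calls_by_copied[copied <= MAX_TRACKED ? copied : MAX_TRACKED]++;
	return srclen;
}
```

Dumping the two arrays at backend exit would give a per-size breakdown like the one above, though one counter array per destination size would be needed to split the copied-bytes counts by call site.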

Based on this data, it would seem that either memcpy () or strlcpy () calls would be better due to the source string size.

Calls originating from the set_ps_display () function might be able to use the memcpy () call, as the length of the source string should be known. The other calls probably need something like strlcpy (), as the length of the source string might not be known, although using memcpy () to copy in XX byte blocks might be interesting.
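
As a rough illustration of the known-length case, a bounded copy that skips the strlen () scan could look like this. The function name copy_known_len is hypothetical and this is not the actual set_ps_display () code; it assumes the caller already has the source length in hand.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy srclen bytes of src into dst and terminate the result.
 * The caller supplies srclen, so no strlen() pass and no zero padding
 * are needed. Truncates if the source does not fit in dstsize bytes.
 */
static void
copy_known_len(char *dst, size_t dstsize, const char *src, size_t srclen)
{
	if (dstsize == 0)
		return;
	if (srclen >= dstsize)
		srclen = dstsize - 1;	/* leave room for the terminator */
	memcpy(dst, src, srclen);
	dst[srclen] = '\0';
}
```

For a 2213-byte destination and a 5 to 20 byte source, this touches only the source bytes plus one terminator.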

David

________________________________

From: pgsql-hackers-owner(at)postgresql(dot)org on behalf of mark(at)mark(dot)mielke(dot)cc
Sent: Fri 9/29/2006 2:59 PM
To: Tom Lane
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Faster StrNCpy