Re: strncmp->memcmp when we know the shorter length

From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: strncmp->memcmp when we know the shorter length
Date: 2010-12-22 02:30:14
Message-ID: AANLkTim2cBy16PJdyCEkUACE0gypgrjSx3VJd28V+OiE@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 21, 2010 at 9:01 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Dec 21, 2010 at 8:29 PM, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> wrote:
> > On Tue, Dec 21, 2010 at 6:24 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >>
> >> On Mon, Dec 20, 2010 at 1:10 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> >> > When the caller knows the smaller string length, memcmp and strncmp are
> >> > functionally equivalent. Since memcmp need not watch each byte for a NULL
> >> > terminator, it often compares a CPU word at a time for better performance.
> >> > The attached patch changes use of strncmp to memcmp where we have the length
> >> > of the shorter string. I was most interested in the varlena.c instances, but
> >> > I tried to find all applicable call sites. To benchmark it, I used the
> >> > attached "bench-texteq.sql". This patch improved my 5-run average timing of
> >> > the SELECT from 65.8s to 56.9s, a 13% improvement. I can't think of a case
> >> > where the change should be pessimal.
> >>
> >> This is a good idea. I will check this over and commit it.
> >
> > Doesn't this risk accessing bytes beyond the shorter string?
>
> If it's done properly, I don't see how this would be a risk.
>
> > Look at the
> > warning above the StrNCpy(), for example.
>
> If you're talking about this comment:
>
> * BTW: when you need to copy a non-null-terminated string (like a text
> * datum) and add a null, do not do it with StrNCpy(..., len+1). That
> * might seem to work, but it fetches one byte more than there is in the
> * text object.
>
> ...then that's not applicable here. It's perfectly safe to compare two
> strings of length n using an n-byte memcmp(). The bytes being
> compared are 0 through n - 1; the terminating null is in byte n, or
> else it isn't, but memcmp() certainly isn't going to look at it.
>
>
I missed the part where Noah said "... where we have the length of the
*_shorter_* string". I agree we are safe here.
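
To make the point concrete, here is a minimal sketch of the pattern being discussed. It is not taken from Noah's patch: counted_cmp() is a hypothetical helper name, and the sketch only assumes that both inputs are length-counted byte strings that may lack a terminating null, like a text datum.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only (not from the patch).
 * Compares two length-counted byte strings that need not be
 * null-terminated. memcmp() reads only bytes 0 .. n-1 of each input,
 * where n is the shorter length, so it never touches a byte past the
 * end of either string.
 */
static int
counted_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
	size_t		n = (alen < blen) ? alen : blen;	/* shorter length */
	int			cmp = memcmp(a, b, n);

	if (cmp != 0)
		return cmp;				/* differ within the common prefix */
	if (alen == blen)
		return 0;				/* same length, same bytes */
	return (alen < blen) ? -1 : 1;	/* the shorter string sorts first */
}
```

Called as counted_cmp(a, 3, b, 5) on the byte arrays "abc" and "abcde" with no terminators, this inspects only the first three bytes of each and then falls back to the length comparison. That is the sense in which knowing the shorter length keeps the access in bounds.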

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device
