Re: [PATCHES] Eliminate more detoast copies for packed varlenas

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] Eliminate more detoast copies for packed varlenas
Date: 2007-09-23 12:44:04
Message-ID: 87ejgpv9ez.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>> Ok, this removes what should be most if not all of the call sites where we're
>> detoasting text or byteas. In particular it gets all the regexp/like functions
>> and all the trim/pad functions. It also gets hashtext and hash_any.
>
> Applied with some fixes --- you'd missed like_match.c, which doubtless
> explains Guillame's complaint that it didn't work ...

Strange. It passed all regression tests for me and it seems like this is
something that would have been caught even in single-byte mode by a simple
test. It seems to me that like_match.c only used for SIMILAR is that right?
That would explain it as there don't appear to be any tests of SIMILAR.

Separately:

> Although I'd misdiagnosed the reason for the recent failures on buildfarm
> member grebe, I see no reason to revert the 1-byte-header-friendly changes I
> made in varlena.c. Instead, tweak the code a little bit to get more
> advantage out of that.

This may have been a misdiagnosis of the buildfarm failures but it looks like
a correct bug fix to me. That is, it looks like regexp_split_to_array() was
capable of passing a packed varlena to setup_regexp_matches which wasn't
expecting it. But this I don't understand why it wouldn't cause regression
failures and indeed when I tested it with my build it didn't cause any
problems.

This all brings up the question of what other files should be considered for
fixing. There is the possibility that some of the other sites could turn out
to be performance critical for a given application. In particular I'm
primarily concerned with calls which could be invoked during a data load or
index build.

Of the following list it seems to me the calls in adt/acl.c are probably
interesting to fix. Especially since it seems we could just replace all those
text* parameters with Datum parameters since they're only going to be passed
to textin anyways.

After that the tsearch and xml data types are probably interesting, and the
various data type input functions and contrib gist/gin operator classes.

I'm fine doing the drudge work but wanted to check before I do it that I'm not
doing something we don't think is worth doing at this point in time.

src/backend/access/transam/xlog.c:3
src/backend/catalog/pg_conversion.c:2
src/backend/commands/sequence.c:1
src/backend/libpq/be-fsstubs.c:2
src/backend/tsearch/dict.c:3
src/backend/tsearch/to_tsany.c:6
src/backend/tsearch/wparser.c:6
src/backend/utils/adt/acl.c:61
src/backend/utils/adt/char.c:1
src/backend/utils/adt/date.c:3
src/backend/utils/adt/dbsize.c:2
src/backend/utils/adt/encode.c:1
src/backend/utils/adt/formatting.c:14
src/backend/utils/adt/genfile.c:3
src/backend/utils/adt/not_in.c:2
src/backend/utils/adt/quote.c:2
src/backend/utils/adt/regproc.c:1
src/backend/utils/adt/ruleutils.c:6
src/backend/utils/adt/tid.c:1
src/backend/utils/adt/timestamp.c:8
src/backend/utils/adt/tsquery_rewrite.c:1
src/backend/utils/adt/tsvector_op.c:3
src/backend/utils/adt/xml.c:24
src/backend/utils/mb/mbutils.c:1
src/tutorial/funcs_new.c:3

contrib/adminpack/adminpack.c:6
contrib/chkpass/chkpass.c:2
contrib/dblink/dblink.c:41
contrib/fuzzystrmatch/dmetaphone.c:2
contrib/fuzzystrmatch/fuzzystrmatch.c:6
contrib/hstore/hstore_gin.c:1
contrib/hstore/hstore_gist.c:1
contrib/hstore/hstore_op.c:6
contrib/intarray/_int_op.c:1
contrib/ltree/ltree_op.c:3
contrib/pageinspect/btreefuncs.c:3
contrib/pageinspect/rawpage.c:1
contrib/pg_trgm/trgm_gin.c:2
contrib/pg_trgm/trgm_gist.c:1
contrib/pg_trgm/trgm_op.c:3
contrib/pgcrypto/pgcrypto.c:10
contrib/pgcrypto/pgp-pgsql.c:1
contrib/pgrowlocks/pgrowlocks.c:1
contrib/pgstattuple/pgstatindex.c:2
contrib/pgstattuple/pgstattuple.c:1
contrib/sslinfo/sslinfo.c:2
contrib/tablefunc/tablefunc.c:14
contrib/tsearch2/dict.c:3
contrib/tsearch2/dict_ex.c:1
contrib/tsearch2/dict_ispell.c:1
contrib/tsearch2/dict_snowball.c:3
contrib/tsearch2/dict_syn.c:1
contrib/tsearch2/dict_thesaurus.c:1
contrib/tsearch2/query.c:4
contrib/tsearch2/query_rewrite.c:1
contrib/tsearch2/ts_cfg.c:1
contrib/tsearch2/ts_stat.c:2
contrib/tsearch2/tsvector.c:2
contrib/tsearch2/wparser.c:9
contrib/uuid-ossp/uuid-ossp.c:2
contrib/xml2/xpath.c:25
contrib/xml2/xslt_proc.c:3

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2007-09-23 13:09:50 Re: [PATCHES] Eliminate more detoast copies for packed varlenas
Previous Message Tom Lane 2007-09-22 23:45:24 Re: Text <-> C string

Browse pgsql-patches by date

  From Date Subject
Next Message Andrew Dunstan 2007-09-23 13:09:50 Re: [PATCHES] Eliminate more detoast copies for packed varlenas
Previous Message Jaime Casanova 2007-09-23 04:49:28 Re: [HACKERS] 'Waiting on lock'