Re: string function - "format" function proposal

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: string function - "format" function proposal
Date: 2010-09-29 07:59:51
Message-ID: AANLkTik=WV-Zn1-KXBF6yoDsJbGS4roJY4jE3QSaHqo7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

2010/9/29 Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>:
> On Thu, Sep 9, 2010 at 8:57 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> I am sending a updated version.
>>
>> changes:
>> * tag %v removed from format function,
>> * proprietary tags %lq a iq removed from sprintf
>> * code cleaned
>>
>> patch divided to two parts - format function and stringfunc (contains
>> sprintf function and substitute function)
>
> === Discussions about the spec ===
> Two patches add format() into the core, and substitute() and sprintf() into
> stringfunc contrib module. But will we have 3 versions of string formatters?
>
> IMHO, substitute() is the best choice that we will have in the core because
> functionalities in format() and sprintf() can be achieved by combination of
> substitute() and quote_nullable(), quote_ident(), or to_char(). I think the
> core will provide only simple and non-overlapped features. Users can write
> wrapper functions by themselves if they think the description is redundant.

I think we need a three variants of formating functions - "format" in
core, fo simply creating and building a messages, a SQL strings,
"sprintf" for traditionalist in contrib - this functions isn't well
joined to SQL environment and it's too heavy - more it overwrite a
some functionality of "to_char" function. "substitute" function
provide just positional unformatted parameters - that isn't typical
ucase - so must not be in core too.

>
> === format.diff ===
> * It has a reject in doc, but the hunk can be fixed easily.
>    1 out of 2 hunks FAILED -- saving rejects to file doc/src/sgml/func.sgml.rej
>  COMMENT: We have the function list in alphabetical order,

fixed

>  so format() should be inserted after encode().
> * It can be built without compile warnings.
> * Enough documentation and regression tests are included.
>
> === stringfunc.diff ===
> * It can be applied cleanly and built without compile warnings.
> * Documentation is included, but not enough.
>  COMMENT: According to existing docs, function list are described with
>  <variablelist> or <table>.

fixed

> * Enough regression tests are included.
> * COMMENT: stringfunc directory should be added to contrib/Makefile.
>
> * BUG: stringfunc_substitute_nv() calls text_format().
>  I think we don't need stringfunc_substitute_nv at all.
>  It can be replaced by stringfunc_substitute(). _nv version is only
>  required if it is in the core because of sanity regression test.

you have a true - but I am not sure about coding patters for contribs,
so I designed it with respect to core sanity check.

>
> * BUG?: The doc says sprintf() doesn't support length modifiers,
>  but it is actually supported in broken state:

I was wrong in documentation - length modifiers are supported -
positional modifiers are not supported. fixed.

> postgres=# SELECT sprintf('%*s', 2, 'ABC');
>  sprintf
> ---------
>  ABC      <= should be ERROR if unsupported, or AB if supported.
> (1 row)

it works well - "with" modifier doesn't reduce string. String is
stripped by "precision" modifiers.

SELECT sprintf('%*.s', 2, ABC) --> AB

checked via gcc

please, try
printf(">>%s<<\n", "12345678");
printf(">>%3s<<\n", "12345678");
printf(">>%.3s<<\n", "12345678");
printf(">>%10.3s<<\n", "12345678");

do you understand me, why I "dislike" "printf"? How much people knows
well these formatting rules?

>
> * BUG?: ereport(ERROR,
>         (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>          errmsg("unsupported tag \"%%%c\"", tag)));
>  Is the code ok if the tag (a char) is a partial byte of multi-byte character?

it's bug - the supported tags are only single byte, but unsupported
tag can be multibyte character and must by showed correctly - fixed.

>  My machine prints ? in the case, but it might be platform-dependent.
>
> === Both patches ===
> * Performance: I don't think those functions are not performance-critical,
>  but we could cache typoutput functions in fn_extra if needed.
>  record_out would be a reference.

I though about it too and I checked it now - there is 0.4% performance
on 10000000 rows on my PC (format function) - so I don't do any
changes - caching of oids means a few lines more - but here isn't
expected effect.

>
> * Coding: Whitespace and tabs are mixed in some places. They are not so
>  important because we will run pgindent, but careful choice will be
>  preferred even of a patch.
>

checked, fixed

Thank you very much for review

regards

Pavel Stehule

> --
> Itagaki Takahiro
>

Attachment Content-Type Size
format.diff text/x-patch 9.7 KB
stringfunc.diff text/x-patch 26.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2010-09-29 08:08:35 Re: Stalled post to pgsql-committers
Previous Message Fujii Masao 2010-09-29 07:56:57 Re: is sync rep stalled?