Re: Unicode UTF-8 table formatting for psql text output

From: Roger Leigh <rleigh(at)codelibre(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unicode UTF-8 table formatting for psql text output
Date: 2009-11-14 17:40:24
Message-ID: 20091114174023.GB23762@codelibre.net
Lists: pgsql-hackers

On Mon, Nov 09, 2009 at 05:40:54PM -0500, Bruce Momjian wrote:
> Tom Lane wrote:
> > Greg Stark <gsstark(at)mit(dot)edu> writes:
> > > While I agree this looks nicer I wonder what it does to things like
> > > excel/gnumeric/ooffice auto-recognizing table layouts and importing
> > > files. I'm not sure our old format was so great for this so maybe this
> > > is actually an improvement I'm asking for.
> >
> > Yeah. We can do what we like with the UTF8 format but I'm considerably
> > more worried about the aspect of making random changes to the
> > plain-ASCII output. On the other hand, we changed that just a release
> > or so ago (to put in the multiline output in the first place) and
> > I didn't hear complaints about it that time.
>
> Sorry for the delayed reply:
>
> The line continuation characters were chosen in 8.4 for clarity --- if
> you have found something clearer for 8.5, let's make the improvement. I
> think clarity is more important in this area than consistency with the
> previous psql output format.

The attached patch is proposed for the upcoming commitfest, and
hopefully takes into account the comments made in this thread.
To summarise the changes:

- it makes the handling of newlines and wrapped lines consistent
  between column header and data lines.
- it includes additional logic such that both the "old" and "new"
  styles are representable using the format structures, so we
  can retain backward compatibility if so desired (it's easy to
  remove if not).
- an "ascii-old" linestyle is added which is identical to the old
  style for those who need guaranteed reproducibility of output,
  but this is not the default.
- The Unicode format uses "↵" in the right-hand margin to indicate
  newlines. Wrapped lines use "…" in the right-hand margin before,
  and left-hand margin after, a break (so you can visually follow
  the wrap).
- The ASCII format is the same but uses "+" and "." in place of
  carriage return and ellipsis symbols.
- All the above is documented in the SGML documentation, including
  the old style, which I always found confusing.
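
As a rough illustration of the margin markers listed above, here is a
minimal standalone sketch. It is not code from the patch: the
margin_markers struct and render_cell() are hypothetical names invented
for this example, not the format structures in print.c, and it assumes
single-byte cell content and a single column for simplicity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical marker table; these names are NOT taken from print.c. */
typedef struct
{
    const char *name;
    const char *newline_right; /* right margin: cell continues after a newline */
    const char *wrap_right;    /* right margin: line was wrapped here */
    const char *wrap_left;     /* left margin: continuation of a wrapped line */
} margin_markers;

/* ASCII style: "+" marks a newline, "." marks a wrap. */
const margin_markers ascii_markers = {"ascii", "+", ".", "."};

/* Unicode style: U+21B5 and U+2026, spelled out as UTF-8 bytes. */
const margin_markers unicode_markers = {
    "unicode", "\xE2\x86\xB5", "\xE2\x80\xA6", "\xE2\x80\xA6"
};

/*
 * Render one cell as margin-marked lines into out.
 * Assumes one byte per character (no display-width handling).
 * Returns the number of lines written, or -1 if out is too small.
 */
int
render_cell(const margin_markers *m, const char *cell, int width,
            char *out, size_t outsz)
{
    int    lines = 0;
    int    wrapped = 0;     /* was the previous line cut by a wrap? */
    size_t used = 0;

    assert(m != NULL && cell != NULL && width > 0 && outsz > 0);
    out[0] = '\0';

    for (;;)
    {
        size_t      seg = strcspn(cell, "\n"); /* bytes before newline/end */
        const char *right = " ";
        int         now_wrapped = 0;
        int         n;

        if (seg > (size_t) width)
        {
            seg = (size_t) width;       /* too long: wrap at column width */
            right = m->wrap_right;
            now_wrapped = 1;
        }
        else if (cell[seg] == '\n')
            right = m->newline_right;   /* an embedded newline follows */

        n = snprintf(out + used, outsz - used, "%s%-*.*s%s\n",
                     wrapped ? m->wrap_left : " ",
                     width, (int) seg, cell, right);
        if (n < 0 || (size_t) n >= outsz - used)
            return -1;
        used += (size_t) n;
        lines++;

        cell += seg;
        wrapped = now_wrapped;
        if (!wrapped)
        {
            if (*cell == '\n')
                cell++;                 /* step over the newline */
            else
                break;                  /* end of cell */
        }
    }
    return lines;
}
```

With the ASCII markers, a width of 4 and the cell "abcdef\ngh", this
sketch produces three lines: " abcd." (wrapped), ".ef  +" (continuation
of the wrap, then a newline) and " gh   " (last line, no marker).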

For comparison, I've included a transcript of all three linestyles
with a test script (also attached).

Any changes to the symbols used and/or their placement are trivially
made by just altering the format in print.c.

Regards,
Roger

--
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.

Attachment                    Content-Type    Size
psql-wrap-formatting.patch    text/x-diff     12.2 KB
typescript                    text/plain      12.8 KB
