Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 07:06:52
Message-ID: E1c6uJ2-0006E7-MY@gemulon.postgresql.org
Lists: pgsql-committers pgsql-hackers

Build HTML documentation using XSLT stylesheets by default

The old DSSSL build is still available for a while using the make target
"oldhtml".

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/e36ddab11735052841b4eff96642187ec9a8a7bc

Modified Files
--------------
doc/src/sgml/Makefile | 8 ++++----
doc/src/sgml/stylesheet.css | 50 +++++++++++++++++----------------------------
2 files changed, 23 insertions(+), 35 deletions(-)


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 09:38:19
Message-ID: CABUevEz-mw=TXA7Yfv98J1T8AyskpEn7VafQoS+ttchXgAKPwg@mail.gmail.com

This seems to have broken our website build a bit. If you check
https://www.postgresql.org/docs/devel/static/index.html, you'll notice a
bunch of bad characters.

AFAICT this is because the output is now UTF8 and it used to be LATIN1. The
current output actually has it in the html tags that it's utf8, but since
the old one had no tags specifying its encoding we hardcoded it to LATIN1.

I assume we shall expect it to always be UTF8 from now on, and just find a
way for the docs loader script for the website to properly detect when we
switched over? Probably by just looking for that specific <?xml tag on the
first line.
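
A minimal sketch of the detection idea described above, assuming the loader
has the raw bytes of each page available. The function name
declared_encoding and the LATIN1 fallback are illustrative assumptions, not
the website's actual loader code. It looks for an encoding in the <?xml
declaration first, then in a <meta ... charset=...> declaration, and
otherwise falls back to the default.

```python
import re

# Hypothetical helper, not the real docs loader: guess the encoding that an
# HTML/XHTML page declares for itself.
XML_DECL = re.compile(rb'^\s*<\?xml[^>]*\bencoding=["\']([A-Za-z0-9._-]+)["\']')
META_CHARSET = re.compile(
    rb'<meta[^>]*\bcharset=["\']?([A-Za-z0-9._-]+)', re.IGNORECASE
)


def declared_encoding(raw: bytes, default: str = "ISO-8859-1") -> str:
    """Return the encoding named in the page, or `default` if none is found."""
    head = raw[:2048]  # assumption: declarations sit near the top of the file
    for pattern in (XML_DECL, META_CHARSET):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii")
    return default
```

Because the meta pattern does not stop at line breaks, it also finds a
declaration that is split across several lines.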

Is this change something that might break something else, though?

//Magnus

On Wed, Nov 16, 2016 at 8:06 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:

> Build HTML documentation using XSLT stylesheets by default
>
> The old DSSSL build is still available for a while using the make target
> "oldhtml".
>
> Branch
> ------
> master
>
> Details
> -------
> http://git.postgresql.org/pg/commitdiff/e36ddab11735052841b4eff96642187ec9a8a7bc
>
> Modified Files
> --------------
> doc/src/sgml/Makefile | 8 ++++----
> doc/src/sgml/stylesheet.css | 50 +++++++++++++++++-------------
> ---------------
> 2 files changed, 23 insertions(+), 35 deletions(-)


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 14:02:11
Message-ID: 77752011-5a08-afbb-f5e7-bb5e43dcd017@2ndquadrant.com

On 11/16/16 1:38 AM, Magnus Hagander wrote:
> AFAICT this is because the output is now UTF8 and it used to be LATIN1.
> The current output actually has it in the html tags that it's utf8, but
> since the old one had no tags specifying its encoding we hardcoded it
> to LATIN1.

The old output has this:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">

This has always been the case, AFAICT.

Btw., shouldn't the output web site pages have encoding declarations?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 14:09:05
Message-ID: CABUevEx3n0AsjTJW5P0DxqO7c0T_n272rCW+BTV1EdMtXQc7Cw@mail.gmail.com

On Wed, Nov 16, 2016 at 3:02 PM, Peter Eisentraut <
peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:

> On 11/16/16 1:38 AM, Magnus Hagander wrote:
> > AFAICT this is because the output is now UTF8 and it used to be LATIN1.
> > The current output actually has it in the html tags that it's utf8, but
> > since the old one had no tags specifying its encoding we hardcoded it
> > to LATIN1.
>
> The old output has this:
>
> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
>
> This has always been the case, AFAICT.
>

Oh, it's there. It's just not on one line and not at the beginning, so I
missed it :)

> Btw., shouldn't the output web site pages have encoding declarations?
>

That gets sent in the http header, doesn't it?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Erik Rijkers <er(at)xs4all(dot)nl>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-committers(at)postgresql(dot)org, pgsql-committers-owner(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 14:29:53
Message-ID: e58cea7733a628cc28dacea2bc2c79e2@xs4all.nl

On 2016-11-16 08:06, Peter Eisentraut wrote:
> Build HTML documentation using XSLT stylesheets by default
>
> The old DSSSL build is still available for a while using the make
> target
> "oldhtml".

This xslt build takes 8+ minutes, compared to barely 1 minute for
'oldhtml'.

I'd say that is a strong disadvantage.

I hope 'for a while' will mean 'for a long time to come' or even
'forever.'

Erik Rijkers


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 14:46:16
Message-ID: 2820.1479307576@sss.pgh.pa.us

Erik Rijkers <er(at)xs4all(dot)nl> writes:
> This xslt build takes 8+ minutes, compared to barely 1 minute for
> 'oldhtml'.

I'm just discovering the same.

> I'd say that is a strong disadvantage.

I'd say that is flat out unacceptable. I won't ever use this toolchain
if it's that much slower than the old way. What was the improvement
we were hoping for, again?

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 15:16:31
Message-ID: 3246f247-c850-75cb-64ca-d57fdbd119f1@dunslane.net

On 11/16/2016 09:46 AM, Tom Lane wrote:
> Erik Rijkers <er(at)xs4all(dot)nl> writes:
>> This xslt build takes 8+ minutes, compared to barely 1 minute for
>> 'oldhtml'.
> I'm just discovering the same.
>
>> I'd say that is a strong disadvantage.
> I'd say that is flat out unacceptable. I won't ever use this toolchain
> if it's that much slower than the old way. What was the improvement
> we were hoping for, again?
>
>

On the buildfarm crake has gone from about 2 minutes to about 3.5
minutes to run "make doc". That's not good but it's not an eight-fold
increase either.

cheers

andrew


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 15:25:12
Message-ID: CA+TgmoYN+rLQoQzpWfLGYcNw7JCvO4P1RNjYYCux7cY-H3CZKw@mail.gmail.com

On Wed, Nov 16, 2016 at 9:46 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Erik Rijkers <er(at)xs4all(dot)nl> writes:
>> This xslt build takes 8+ minutes, compared to barely 1 minute for
>> 'oldhtml'.
>
> I'm just discovering the same.
>
>> I'd say that is a strong disadvantage.
>
> I'd say that is flat out unacceptable. I won't ever use this toolchain
> if it's that much slower than the old way. What was the improvement
> we were hoping for, again?

Gosh, and I thought the existing toolchain was already ridiculously
slow. Couldn't somebody write a Perl script that generated the HTML
documentation from the SGML in, like, a second? I mean, we're
basically just mapping one set of markup tags to another set of markup
tags. And splitting up some files for the HTML version. And adding
some boilerplate. But none of that sounds like it should be all that
hard.

I am reminded of the saying that XML is like violence -- if it doesn't
solve your problem, you're not using enough of it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Erik Rijkers <er(at)xs4all(dot)nl>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 15:33:56
Message-ID: CA+TgmoZGz5+Yz7MZf1sMwg7iOBa4=9PeWiaWimv+MLxE7GvDqg@mail.gmail.com

On Wed, Nov 16, 2016 at 10:16 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> On the buildfarm crake has gone from about 2 minutes to about 3.5 minutes to
> run "make doc". That's not good but it's not an eight-fold increase either.

On my MacBook, "time make docs" as of e36ddab11735052841b4eff96642187ec9a8a7bc:

real 2m17.871s
user 2m15.505s
sys 0m2.238s

And as of 4ecd1974377ffb4d6d72874ba14fcd23965b1792:

real 1m47.696s
user 1m47.085s
sys 0m1.145s

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 20:38:51
Message-ID: 20161116203851.xg7llcb5i4bj7w6s@alvherre.pgsql

Peter Eisentraut wrote:
> Build HTML documentation using XSLT stylesheets by default

"make check" still uses DSSSL though. Is that intentional? Is it going
to be changed?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 20:59:28
Message-ID: 66fd8ad3-ed49-1c13-9d09-005145f78699@2ndquadrant.com

On 11/16/16 6:29 AM, Erik Rijkers wrote:
> On 2016-11-16 08:06, Peter Eisentraut wrote:
>> Build HTML documentation using XSLT stylesheets by default
>>
>> The old DSSSL build is still available for a while using the make
>> target
>> "oldhtml".
>
> This xslt build takes 8+ minutes, compared to barely 1 minute for
> 'oldhtml'.

I have committed another patch to improve the build performance a bit.
Could you check again?

On my machine and on the build farm, the performance now almost matches
the DSSSL build.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 21:00:37
Message-ID: 277a3c8d-2865-c57a-cd7b-977d54d1bf06@2ndquadrant.com

On 11/16/16 6:09 AM, Magnus Hagander wrote:
> Btw., shouldn't the output web site pages have encoding declarations?
>
> That gets sent in the http header, doesn't it?

That's probably alright, but it would be nicer if the documents were
self-contained.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 21:10:24
Message-ID: 69288636-d9b7-d74e-3c0f-5a8ea6a8d25d@2ndquadrant.com

On 11/16/16 12:38 PM, Alvaro Herrera wrote:
> "make check" still uses DSSSL though. Is that intentional? Is it going
> to be changed?

It doesn't use DSSSL. It uses nsgmls to parse the SGML, which is a
different thing that will be addressed in a separate step.

So, yes, but later.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Erik Rijkers <er(at)xs4all(dot)nl>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 21:11:27
Message-ID: ea22ecfb-a454-a0e6-35e1-5be9f299616e@2ndquadrant.com

On 11/16/16 6:46 AM, Tom Lane wrote:
> What was the improvement we were hoping for, again?

Get off an ancient and unmaintained tool chain.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Erik Rijkers <er(at)xs4all(dot)nl>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 21:14:57
Message-ID: 12c5ebd515ad1164233cdc2eed384465@xs4all.nl

On 2016-11-16 21:59, Peter Eisentraut wrote:
> On 11/16/16 6:29 AM, Erik Rijkers wrote:
>>
>> This xslt build takes 8+ minutes, compared to barely 1 minute for
>> 'oldhtml'.
>
> I have committed another patch to improve the build performance a bit.
> Could you check again?

It is indeed better (three minutes off, nice) but still:
real 5m21.348s -- for 'make -j 8 html'
versus
real 1m8.502s -- for 'make -j 8 oldhtml'

CentOS 6.6 - I suppose it's getting a bit old, I don't know if that's
the cause of the discrepancy with others' measurements.

Obviously as long as 'oldhtml' is possible I won't complain.

thanks,

Erik Rijkers


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 21:23:52
Message-ID: 20161116212352.4tarihdfg56pz3xf@alvherre.pgsql

Peter Eisentraut wrote:

> > This xslt build takes 8+ minutes, compared to barely 1 minute for
> > 'oldhtml'.
>
> I have committed another patch to improve the build performance a bit.
> Could you check again?

After the optimization, on my laptop it takes 2:31 with the new system
and 1:58 with the old one. If it can be made faster, all the better,
but at this level I'm okay.

Now admittedly this conversion didn't do one bit towards the goal I
wanted to achieve: that each doc source file ended up as a valid XML
file that could be processed separately with tools like xml2po. They
are still SGML only -- in particular no doctype declaration and
incomplete closing tags.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-16 21:48:19
Message-ID: 25761.1479332899@sss.pgh.pa.us

Erik Rijkers <er(at)xs4all(dot)nl> writes:
> On 2016-11-16 21:59, Peter Eisentraut wrote:
>> I have committed another patch to improve the build performance a bit.
>> Could you check again?

> It is indeed better (three minutes off, nice) but still:
> real 5m21.348s -- for 'make -j 8 html'
> versus
> real 1m8.502s -- for 'make -j 8 oldhtml'

Yeah, I get about the same.

> CentOS 6.6 - I suppose it's getting a bit old, I don't know if that's
> the cause of the discrepancy with others' measurements.

... and on the same toolchain. Probably the answer is "install a newer
toolchain", but from what I understand, there's a whole lot of work there
if your platform vendor doesn't supply it already packaged.

regards, tom lane


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-17 01:10:26
Message-ID: 477a2596-294a-438d-ee8f-c494e2f0bd63@2ndquadrant.com

On 11/16/16 1:23 PM, Alvaro Herrera wrote:
> Now admittedly this conversion didn't do one bit towards the goal I
> wanted to achieve: that each doc source file ended up as a valid XML
> file that could be processed separately with tools like xml2po. They
> are still SGML only -- in particular no doctype declaration and
> incomplete closing tags.

Yes, that is one of the upcoming steps. But we need to do the current
thing first.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-17 01:15:25
Message-ID: 05500248-54a0-7bfa-3c94-539e79ce8c1f@2ndquadrant.com

On 11/16/16 1:14 PM, Erik Rijkers wrote:
> real 5m21.348s -- for 'make -j 8 html'
> versus
> real 1m8.502s -- for 'make -j 8 oldhtml'
>
> CentOS 6.6 - I suppose it's getting a bit old, I don't know if that's
> the cause of the discrepancy with others' measurements.

I tested the build on a variety of operating systems, including that
one, with different tool chain versions and I am getting consistent
performance. So the above is unclear to me at the moment.

For the heck of it, run this

    xsltproc --nonet --stringparam pg.version '10devel' stylesheet.xsl postgres.xml

to make sure it's not downloading something from the network.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Erik Rijkers <er(at)xs4all(dot)nl>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-17 06:53:33
Message-ID: ccab1cb0ffc1d7efd1dad16cbdd11d47@xs4all.nl

On 2016-11-17 02:15, Peter Eisentraut wrote:
> On 11/16/16 1:14 PM, Erik Rijkers wrote:
>> real 5m21.348s -- for 'make -j 8 html'
>> versus
>> real 1m8.502s -- for 'make -j 8 oldhtml'
>>
>> CentOS 6.6 - I suppose it's getting a bit old, I don't know if that's
>> the cause of the discrepancy with others' measurements.
>
> I tested the build on a variety of operating systems, including that
> one, with different tool chain versions and I am getting consistent
> performance. So the above is unclear to me at the moment.
>
> For the heck of it, run this
>
>     xsltproc --nonet --stringparam pg.version '10devel' stylesheet.xsl postgres.xml
>
> to make sure it's not downloading something from the network.

$ time xsltproc --nonet --stringparam pg.version '10devel' stylesheet.xsl postgres.xml
real 5m43.776s

$ ( cd /home/aardvark/pg_stuff/pg_sandbox/pgsql.HEAD/doc/src/sgml; time make oldhtml )
real 1m14.152s

(I cleaned out the build tree in between.)


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-30 15:44:46
Message-ID: ed1319fe-9be7-c9d0-3881-d9fd70e10830@2ndquadrant.com

On 11/16/16 3:59 PM, Peter Eisentraut wrote:
>>> Build HTML documentation using XSLT stylesheets by default
>>>
>>> The old DSSSL build is still available for a while using the make
>>> target
>>> "oldhtml".
>>
>> This xslt build takes 8+ minutes, compared to barely 1 minute for
>> 'oldhtml'.
>
> I have committed another patch to improve the build performance a bit.
> Could you check again?
>
> On my machine and on the build farm, the performance now almost matches
> the DSSSL build.

Anyone who is still getting terrible performance (>2x slower) from the
html build, please send me the output of

    xsltproc --profile --timing --stringparam pg.version '10devel' stylesheet.xsl postgres.xml 2>profile.txt

so I can look into it.

(It will be big, so feel free to paste it somewhere or send privately.)
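
To check a pair of timings against the ">2x slower" threshold, here is a
small sketch that parses the "real" values printed by the shell's time
builtin and computes the ratio. The helper names to_seconds and slowdown are
made up for illustration; the durations in the note below are the ones
reported earlier in this thread.

```python
import re

# Matches durations such as "5m21.348s", "1m5s" or "42.5s".
TIME_RE = re.compile(r"^(?:(\d+)m)?(\d+(?:\.\d+)?)s$")


def to_seconds(text: str) -> float:
    """Convert a 'real' value from time(1)-style output into seconds."""
    match = TIME_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


def slowdown(new: str, old: str) -> float:
    """Return how many times longer the new build took than the old one."""
    return to_seconds(new) / to_seconds(old)
```

With Erik's numbers, slowdown("5m21.348s", "1m8.502s") is well above 2,
while Robert's MacBook timings, slowdown("2m17.871s", "1m47.696s"), stay
below it.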

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-30 17:30:09
Message-ID: 2709.1480527009@sss.pgh.pa.us

Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
> On 11/16/16 3:59 PM, Peter Eisentraut wrote:
>> On my machine and on the build farm, the performance now almost matches
>> the DSSSL build.

Still sucks for me on an up-to-date RHEL6 box: about 1m5s to build oldhtml,
about 4m50s to build html, both starting after "make maintainer-clean" in
the doc/src/sgml/ subdirectory.

BTW, I notice the "make check-tabs" step isn't getting run with the new
target; is that intentional?

> Anyone who is still getting terrible performance (>2x slower) from the
> html build, please send me the output of
>     xsltproc --profile --timing --stringparam pg.version '10devel' stylesheet.xsl postgres.xml 2>profile.txt
> so I can look into it.

It wasn't that big, so attached.

regards, tom lane

Attachment Content-Type Size
profile.txt text/plain 56.1 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-11-30 18:52:53
Message-ID: 12888.1480531973@sss.pgh.pa.us

I wrote:
> Still sucks for me on an up-to-date RHEL6 box: about 1m5s to build oldhtml,
> about 4m50s to build html, both starting after "make maintainer-clean" in
> the doc/src/sgml/ subdirectory.

However, speed may be the least of its problems. I just noticed that it's
inserting commas at random places in syntax summaries :-(. For instance,
the "overlay" entry in table 9.8 looks like

overlay(string, placing
string, from int [for int])

Neither comma belongs there according to the SGML source, and I don't see
them in guaibasaurus' rendering of the page:
https://www.postgresql.org/docs/devel/static/functions-string.html

So I'm forced to the conclusion that I need a newer version of the
toolchain and/or style sheets. If you've got any idea of just what
needs to be updated, that would be real helpful. xsltproc itself
is from "libxslt-1.1.26-2.el6_3.1.x86_64" but I'm unsure what packages
contain relevant style sheets.

regards, tom lane


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-12-01 02:32:06
Message-ID: 9874f754-a3c7-71ed-c13f-d09a09f0bfd3@2ndquadrant.com

On 11/30/16 1:52 PM, Tom Lane wrote:
> However, speed may be the least of its problems. I just noticed that it's
> inserting commas at random places in syntax summaries :-(. For instance,
> the "overlay" entry in table 9.8 looks like
>
> overlay(string, placing
> string, from int [for int])
>
> Neither comma belongs there according to the SGML source, and I don't see
> them in guaibasaurus' rendering of the page:
> https://www.postgresql.org/docs/devel/static/functions-string.html
>
> So I'm forced to the conclusion that I need a newer version of the
> toolchain and/or style sheets. If you've got any idea of just what
> needs to be updated, that would be real helpful. xsltproc itself
> is from "libxslt-1.1.26-2.el6_3.1.x86_64" but I'm unsure what packages
> contain relevant style sheets.

OK, I got it. The component of concern is the DocBook XSL stylesheets,
called docbook-style-xsl on RH-like systems (docbook-xsl on Debian). If
it runs too slow, it's probably too old.

Here you can see a list of available versions:
http://docbook.sourceforge.net/release/xsl/

I noticed a significant slow-down with versions older than 1.76.1. And
indeed CentOS/RHEL 6 comes with 1.75.2.

Also, the issue with the extra commas mentioned above goes away with 1.78.0.
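
As a rough sketch, the two version thresholds above can be turned into a
quick check of an installed stylesheet version. The function names and the
issue labels are illustrative only, and the sketch assumes that every
version below a threshold shows the corresponding problem, which the
observations above suggest but do not prove.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '1.75.2' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


# Thresholds taken from the observations above (assumed to be exact cutoffs).
SLOW_BEFORE = parse_version("1.76.1")      # older versions built much slower
COMMAS_FIXED_IN = parse_version("1.78.0")  # extra commas gone from here on


def known_issues(version: str) -> list:
    """List the problems expected for a given DocBook XSL version."""
    parsed = parse_version(version)
    issues = []
    if parsed < SLOW_BEFORE:
        issues.append("slow build")
    if parsed < COMMAS_FIXED_IN:
        issues.append("extra commas")
    return issues
```

For the 1.75.2 shipped with CentOS/RHEL 6, known_issues("1.75.2") reports
both problems.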

Here is why this isn't reproducible for some:

The local stylesheet file stylesheet.xsl references
http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl. If
you have the docbook-style-xsl package installed, then this URL gets
redirected to your local installation through the XML catalog mechanism.
If you don't have the package installed locally, then xsltproc will
download the stylesheet files from that actual URL and cache them
locally. So if you have an old docbook-style-xsl version, you get old
and slow behavior. If you uninstall it or just never installed it, you
get the latest from the internet.

If you don't want to mess with your local packages, you can also prevent
the use of the XML catalog by setting the environment variable
XML_CATALOG_FILES to empty (e.g., XML_CATALOG_FILES='' make html).

There is some work to be done here to document this and make sure we
wrap releases with appropriate versions and so on, but I hope this
information can keep everyone moving for now.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-12-01 04:37:35
Message-ID: 14994.1480567055@sss.pgh.pa.us

Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
> OK, I got it. The component of concern is the DocBook XSL stylesheets,
> called docbook-style-xsl on RH-like systems (docbook-xsl on Debian). If
> it runs too slow, it's probably too old.

OK, I updated docbook-style-xsl to 1.79.1 from Fedora rawhide (building
and installing that was quite painless btw, didn't need a pile of build
dependencies like I'd feared it would take). The extraneous commas are
gone, and the speed is better but still not really up to DSSSL speed:
1m44s (vs 1m5s with old toolchain). So there's some additional piece
that needs fixing, but that's certainly the worst of it.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-12-01 04:49:56
Message-ID: CAFj8pRAmLG=F+FrYZ7pK4QL7cn6um_3dEi8sLUX3sXa6UQ7xnw@mail.gmail.com

2016-12-01 5:37 GMT+01:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
> > OK, I got it. The component of concern is the DocBook XSL stylesheets,
> > called docbook-style-xsl on RH-like systems (docbook-xsl on Debian). If
> > it runs too slow, it's probably too old.
>
> OK, I updated docbook-style-xsl to 1.79.1 from Fedora rawhide (building
> and installing that was quite painless btw, didn't need a pile of build
> dependencies like I'd feared it would take). The extraneous commas are
> gone, and the speed is better but still not really up to DSSSL speed:
> 1m44s (vs 1m5s with old toolchain). So there's some additional piece
> that needs fixing, but that's certainly the worst of it.
>

It does much more intensive I/O work - I have a feeling that there is
a lot of fsync activity.

Regards

Pavel

>
> regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-12-01 16:49:25
Message-ID: 20161201164925.poqw4zrnlbj5ex7a@alvherre.pgsql
Lists: pgsql-committers pgsql-hackers

Pavel Stehule wrote:

> It does much more intensive I/O work; I have a feeling there is a lot of
> fsync activity.

You could prove that, by running "make html" under "strace -f -e
trace=fsync" etc. I just tried that, and I don't see any fsync. I
guess you could try other syscalls, or simply "-e trace=file". Doing
the latter I noticed an absolutely stupid number of attempts to open
file
/usr/lib/libxslt-plugins/nwalsh_com_xslt_ext_com_nwalsh_saxon_UnwrapLinks.so
which deserves a WTF.
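
A minimal sketch of how such a trace could be summarized, assuming the
strace output was saved to a file (for example with "-o") and contains
open()/openat() lines in strace's usual format. The names OPEN_RE and
count_failed_opens, the pid numbers, and the sample lines are
hypothetical and only for illustration; only the plugin path comes from
the trace described above.

```python
import re
from collections import Counter

# Matches open()/openat() lines as strace prints them. With "strace -f"
# each line may start with "[pid N]" (or with a bare pid when "-o" is used).
OPEN_RE = re.compile(
    r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?'      # optional pid prefix
    r'(?:open|openat)\('                  # syscall name
    r'(?:[^,"]+,\s*)?'                    # optional dirfd argument of openat
    r'"([^"]+)"'                          # path argument
    r'.*=\s*(-?\d+)'                      # return value
)

def count_failed_opens(lines):
    """Count open/openat calls that returned -1, grouped by path."""
    failures = Counter()
    for line in lines:
        match = OPEN_RE.match(line)
        if match and match.group(2) == "-1":
            failures[match.group(1)] += 1
    return failures

# Hypothetical sample lines, shaped like "strace -f -e trace=file" output.
PLUGIN = ("/usr/lib/libxslt-plugins/"
          "nwalsh_com_xslt_ext_com_nwalsh_saxon_UnwrapLinks.so")
sample = [
    f'[pid 4242] open("{PLUGIN}", O_RDONLY) = -1 ENOENT (No such file or directory)',
    f'[pid 4242] open("{PLUGIN}", O_RDONLY) = -1 ENOENT (No such file or directory)',
    '[pid 4242] openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
]

# Print the paths with the most failed open attempts first.
for path, count in count_failed_opens(sample).most_common(5):
    print(count, path)
```

Feeding it the lines of a real trace file instead of the sample list
would show which paths account for most of the failed opens.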

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Alexander Law <exclusion(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-12-15 09:30:17
Message-ID: bfce8c4e-e200-9617-791a-4e05a054e698@gmail.com
Lists: pgsql-committers pgsql-hackers

Hello Alvaro,

It's caused by the condition
<xsl:when test="function-available('suwl:unwrapLinks')">...
in the simple.xlink template
(docbook/stylesheet/docbook-xsl/xhtml/inline.xsl). (This test is executed
for each xlink, about 90000 times.)
Yes, it's inefficient, but it doesn't affect the build time (for me).
You can try applying the attached patch and measuring the time with it.
So if the performance is acceptable now, I'd continue the switch to
XML and get back to the performance issues after the switch.
(EPUB generation is much slower, and I have developed a patch to
speed it up too.)

Best regards,
Alexander

01.12.2016 19:49, Alvaro Herrera wrote:
> Pavel Stehule wrote:
>
>> It does much more intensive I/O work; I have a feeling there is a lot of
>> fsync activity.
> You could prove that, by running "make html" under "strace -f -e
> trace=fsync" etc. I just tried that, and I don't see any fsync. I
> guess you could try other syscalls, or simply "-e trace=file". Doing
> the latter I noticed an absolutely stupid number of attempts to open
> file
> /usr/lib/libxslt-plugins/nwalsh_com_xslt_ext_com_nwalsh_saxon_UnwrapLinks.so
> which deserves a WTF.
>

Attachment: make-html-wo-nwalsh-search.patch (text/x-patch, 7.4 KB)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexander Law <exclusion(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-12-15 16:51:24
Message-ID: 32425.1481820684@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

Alexander Law <exclusion(at)gmail(dot)com> writes:
> Hello Alvaro,
> It's caused by the condition
> <xsl:when test="function-available('suwl:unwrapLinks')">...
> in the simple.xlink template
> (docbook/stylesheet/docbook-xsl/xhtml/inline.xsl). (This test is executed
> for each xlink, about 90000 times.)
> Yes, it's inefficient, but it doesn't affect the build time (for me).
> You can try applying the attached patch and measuring the time with it.

For me, that reduces the "make html" time from 1m44s to 1m43s.

regards, tom lane


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-12-22 22:55:34
Message-ID: a3143f0c-e658-dac3-652a-489b6903798d@2ndquadrant.com
Lists: pgsql-committers pgsql-hackers

On 11/30/16 11:37 PM, Tom Lane wrote:
> Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
>> OK, I got it. The component of concern is the DocBook XSL stylesheets,
>> called docbook-style-xsl on RH-like systems (docbook-xsl on Debian). If
>> it runs too slow, it's probably too old.
>
> OK, I updated docbook-style-xsl to 1.79.1 from Fedora rawhide (building
> and installing that was quite painless btw, didn't need a pile of build
> dependencies like I'd feared it would take). The extraneous commas are
> gone, and the speed is better but still not really up to DSSSL speed:
> 1m44s (vs 1m5s with old toolchain). So there's some additional piece
> that needs fixing, but that's certainly the worst of it.

I've done a few more tweaks, and now it actually runs faster for me than
the old build.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [COMMITTERS] pgsql: Build HTML documentation using XSLT stylesheets by default
Date: 2016-12-22 23:35:01
Message-ID: 30098.1482449701@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
> On 11/30/16 11:37 PM, Tom Lane wrote:
>> OK, I updated docbook-style-xsl to 1.79.1 from Fedora rawhide (building
>> and installing that was quite painless btw, didn't need a pile of build
>> dependencies like I'd feared it would take). The extraneous commas are
>> gone, and the speed is better but still not really up to DSSSL speed:
>> 1m44s (vs 1m5s with old toolchain). So there's some additional piece
>> that needs fixing, but that's certainly the worst of it.

> I've done a few more tweaks, and now it actually runs faster for me than
> the old build.

For me it's now 1m35s, which is better than the last round but not
quite up to the old speed. It's tolerable though. Thanks for hacking
on it.
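
For reference, a small sketch that puts the "make html" timings quoted
in this thread side by side, relative to the 1m5s of the old DSSSL
toolchain. The helper names to_seconds and slowdown_percent and the
labels are hypothetical; the timings themselves are the ones reported
above, all from one machine.

```python
import re

def to_seconds(text):
    """Convert a duration such as '1m44s' or '45s' into seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + int(match.group(2))

def slowdown_percent(new, old):
    """Return how much slower 'new' is than 'old', in percent."""
    old_s = to_seconds(old)
    return 100.0 * (to_seconds(new) - old_s) / old_s

DSSSL = "1m5s"  # old toolchain, as reported in the thread
runs = [
    ("docbook-style-xsl 1.79.1", "1m44s"),
    ("with the function-available patch", "1m43s"),
    ("after the later tweaks", "1m35s"),
]

for label, timing in runs:
    print(f"{label}: {to_seconds(timing)}s, "
          f"{slowdown_percent(timing, DSSSL):.0f}% slower than DSSSL")
```

By this arithmetic the XSLT build went from about 60% slower than the
old toolchain to about 46% slower on this machine.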

regards, tom lane