Re: Slowness of extended protocol

From: Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Shay Rojansky <roji(at)roji(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Slowness of extended protocol
Date: 2016-08-10 16:18:38
Message-ID: CAB=Je-FGvHNYapBT4wCBd1E1NPTkbnoaqJ38+4TieNZ8_3bJCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert>But that makes it the job of every driver to implement some
sort of cache, which IMHO isn't a very reasonable position

Let's wait what Shay decides on implementing query cache in npgsql ?
Here's the issue: https://github.com/npgsql/npgsql/issues/1237

He could change his mind when it comes to the need of "new
ParseBindExecuteDeallocate message".

Robert>I think you should consider the possibility that those people
know what they are talking about.

I do appreciate all the inputs, however, all the performance-related
complaints in this thread I've seen can trivially be solved by adding a
query cache at the driver level + fixing pgbouncer's issue.
Unfortunately, client-side SQL parsing is inevitable evil (see below on
'?'), so any sane driver would cache queries any way. The ones that do not
have query cache will perform badly anyway.

As far as I can understand, the alternative solution is "to add
ParseBindExecDeallocateInOneGo
message to the protocol". This new message would require changes from all
the drivers, and all the pgbouncers. This new message would be slower than
proper query cache, so why should we all spend time on a half-baked
solution?

Of course, I might miss some use cases, that is why keep asking: please
provide close to production scenario that does require the new protocol
message we are talking about.
Note: examples (by Robert and Shay) like "typical web application that
fetches a single row from DB and prints it to the browser" were already
presented, and they are easily covered by the query cache.

To be fair, implementing a cache is a trivial thing when compared with
hard-coding binary/text formats for all the datatypes in each and every
language.
Remember, each driver has to implement its own set of procedures to
input/output values in text/binary format, and that is a way harder than
implementing the cache we are talking about.

If only there was some ability to have language-independent binary transfer
format (protobuf, avro, kryo, whatever)....

Regarding query cache: each language has its own notion how bind variables
are represented in SQL text.
For instance, in Go language (as well as in Java), bind placeholders are
represented as '?' character.
Of course, PostgreSQL does not support that (it wants $1, $2, etc), so Go
driver has to parse SQL text at the driver side in order to convert it into
PostgreSQL-compatible flavor. This parser has to support comments, string
literals, etc, etc. It is just natural thing to have a cache there so the
driver does not repeat the same parsing logic again and again (typical
applications are known to use the same query text multiple times).

Robert>When
there's a common need that affects users of many different programming
languages

You are right. No questions here.
Ok, what is a need? What problem does "new message" solve?

From what I see there is no performance need to introduce
"ParseBindExecDeallocateInOneGo" message. The thread is
performance-related, so I naturally object spending everybody's time on
implementing a useless feature.

Vladimir>Do you agree that the major part would be some hot queries, the
rest will be
Vladimir> much less frequently used ones (e.g. one time queries)?
Robert>Sure, but I don't want the application to have to know about that

Application does not need to know that. It is like "branch prediction in
the CPU". Application does not need to know there is a branch predictor in
the CPU. The same goes for query cache. Application should just continue to
execute SQLs in a sane manner, and the query cache would pick up the
pattern (server-prepare hot-used queries).

Vladimir

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2016-08-10 16:24:22 Re: Proposal for CSN based snapshots
Previous Message Tom Lane 2016-08-10 16:07:43 Re: Is there a way around function search_path killing SQL function inlining?