Re: WIP: URI connection string support for libpq

Lists: pgsql-hackers
From: Alexander Shulgin <ash(at)commandprompt(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: WIP: URI connection string support for libpq
Date: 2011-12-12 22:06:31
Message-ID: 1323723673-sup-8978@moon

Hello Hackers,

Attached is a work-in-progress patch for URI connection string syntax support in libpq. The recent discussion (also pointing to the original one) is here:

http://archives.postgresql.org/message-id/1321899990-sup-1235@moon

The patch adds support for the following syntax in psql, by adding special handling of dbname parameter, when it starts with "postgresql://", e.g:

psql -d postgresql://user@pw:host:port/dbname?param1=value1&param2=value2...

Virtually every component of the above syntax is optional, ultimately allowing for, e.g:

psql -d postgresql:///

to specify local connection via Unix socket, with default port, user name, dbname, etc.

URI percent-encoding is handled, which in particular allows special symbols in the embedded password string, or a non-standard Unix socket location, like the following:

psql -d postgresql://%2Fvar%2Fpgsql%2Ftmp/mydb
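
For illustration only, the percent-decoding step described above can be sketched with Python's standard urllib.parse. This is not the patch's C code; it merely shows how the encoded authority component maps back to a socket directory and a database name.

```python
from urllib.parse import unquote, urlsplit

# Illustration only: the patch does its own parsing in C inside libpq.
uri = "postgresql://%2Fvar%2Fpgsql%2Ftmp/mydb"
parts = urlsplit(uri)

# The authority component is still percent-encoded at this point,
# so "%2F" has not been mistaken for a path separator.
host = unquote(parts.netloc)              # decoded socket directory
dbname = unquote(parts.path.lstrip("/"))  # decoded database name

print(host, dbname)
```

Decoding only after the URI has been split into components is what lets an encoded slash appear in the host position.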

The patch applies cleanly against the master branch and compiles w/o errors or warnings. No tests were broken by this patch on my box, as far as I can tell.

The patch follows the design initially proposed and tries to address feedback gathered from the recent discussion. Special provision was made to improve compatibility with JDBC's connection URIs, by treating the "ssl=true" parameter as equivalent to "sslmode=require".
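
A minimal sketch of the JDBC compatibility rule, written in Python rather than the patch's C. The function name translate_params is hypothetical, and passing any other "ssl" value through unchanged is an assumption, not something the patch is known to do.

```python
from urllib.parse import parse_qsl

def translate_params(query):
    """Turn a URI query string into a dict of connection parameters.

    Hypothetical helper: only the ssl=true rule comes from the thread.
    """
    params = {}
    for key, value in parse_qsl(query, keep_blank_values=True):
        if key == "ssl" and value == "true":
            # JDBC-style spelling mapped onto the libpq parameter.
            params["sslmode"] = "require"
        else:
            # Assumption: everything else is passed through as-is.
            params[key] = value
    return params
```

For example, translate_params("ssl=true&connect_timeout=10") yields sslmode=require plus the untouched connect_timeout.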

The patch intentionally omits documentation changes, to focus on the desired behavior and new code design.

I've put reasonable effort into testing the new code by feeding various parameters to "psql -d". However, if there's a facility for writing formal regression tests against psql, I'd be happy to use that.

I'm also adding this to the next open CommitFest: 2012-01.

--
Regards,
Alex

Attachment: libpq-uri-v3.patch (application/octet-stream, 18.3 KB)

From: Peter van Hardenberg <pvh(at)pvh(dot)ca>
To: Alexander Shulgin <ash(at)commandprompt(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2011-12-12 23:55:08
Message-ID: CAAcg=kWxsUeQ7Rz=to4nvuwHJ+Vj6ADrNHEcqFrGHnYmMNPznQ@mail.gmail.com

On Mon, Dec 12, 2011 at 2:06 PM, Alexander Shulgin
<ash(at)commandprompt(dot)com> wrote:
>
>
>  psql -d postgresql://user@pw:host:port/dbname?param1=value1&param2=value2...
>

I'd like to make the controversial proposal that the URL prefix should
be "postgres:" instead of "postgresql:". Postgres is a widely accepted
nickname for the project, and is eminently more pronounceable. Once
the url is established it will be essentially impossible to change
later, but right now only a nearly insurmountable mailing list thread
prevents it.

Excluding references to the "postgresql.org" domain, there are already
5x as many references in the source code to "postgres" (2583 lines)
than to "postgresql" (539 lines). Taking into account that the name of
the binary and the usual Unix user are already postgres, having one
less place which would eventually need changing seems like a good plan
overall.

Here is, for those who have understandably blocked this argument from
their memory, a link to the existing wiki document on the pros and
cons of the two names:
http://wiki.postgresql.org/wiki/Postgres

Over at Heroku we decided to side with Tom's assessment that "arguably,
the 1996 decision to call it PostgreSQL instead of reverting to plain
Postgres was the single worst mistake this project ever made." (And we
at Heroku have also frustratingly split our references and
occasionally used the "SQL" form.)

Although I do not have the stomach to push for a full renaming blitz,
I felt I must at least make a case for not making the situation any
worse.

My apologies in advance for re-opening this can of worms.

Best regards,
-pvh

PS: It is not in any way shape or form relevant to my argument, nor do
I claim that anyone else should care, but in the spirit of full
disclosure, and depending on how you count, we currently have
somewhere between 250,000 and 500,000 URLs which begin with
postgres:// in our care.

--
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt." -- Kurt Vonnegut


From: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
To: Peter van Hardenberg <pvh(at)pvh(dot)ca>
Cc: Alexander Shulgin <ash(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2011-12-13 01:05:21
Message-ID: BAA80D31-5F8C-4E31-A4F1-3494C2934CC2@justatheory.com

On Dec 12, 2011, at 3:55 PM, Peter van Hardenberg wrote:

> I'd like to make the controversial proposal that the URL prefix should
> be "postgres:" instead of "postgresql:". Postgres is a widely accepted
> nickname for the project, and is eminently more pronounceable. Once
> the url is established it will be essentially impossible to change
> later, but right now only a nearly insurmountable mailing list thread
> prevents it.

What happened to SexQL?

David


From: Peter van Hardenberg <pvh(at)pvh(dot)ca>
To: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
Cc: Alexander Shulgin <ash(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2011-12-13 08:00:43
Message-ID: CAAcg=kVSw1fnAsSrvL4K4EoCQsy3WXAxkRYCFShHokpgDYeU7A@mail.gmail.com

On Mon, Dec 12, 2011 at 5:05 PM, David E. Wheeler <david(at)justatheory(dot)com> wrote:
> On Dec 12, 2011, at 3:55 PM, Peter van Hardenberg wrote:
>> only a nearly insurmountable mailing list thread
>> prevents it.
>
> What happened to SexQL?
>

Case in point.

--
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt." -- Kurt Vonnegut


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter van Hardenberg <pvh(at)pvh(dot)ca>
Cc: Alexander Shulgin <ash(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2011-12-13 21:31:32
Message-ID: CA+TgmoYC4dBRJuqrcKCLgsrvCCsVUX9EQH4BAKw6ZmGA3+6-ZQ@mail.gmail.com

On Mon, Dec 12, 2011 at 6:55 PM, Peter van Hardenberg <pvh(at)pvh(dot)ca> wrote:
> I'd like to make the controversial proposal that the URL prefix should
> be "postgres:" instead of "postgresql:". Postgres is a widely accepted
> nickname for the project, and is eminently more pronounceable. Once
> the url is established it will be essentially impossible to change
> later, but right now only a nearly insurmountable mailing list thread
> prevents it.

That, and the fact that JDBC is already doing it the other way. A
reasonable compromise might be to accept either one. AIUI, part of
what Alexander was aiming for here was to "unite the clans", so to
speak, and it would seem a bit unfriendly (and certainly
counter-productive as regards that goal) to pull the rug out from him
by refusing to support that syntax over what is basically a
supermassive bikeshed. However, being generous in what we accept
won't cost anything, so why not?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alexander Shulgin <ash(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: URI connection string support for libpq
Date: 2011-12-13 22:45:08
Message-ID: 1323816239-sup-2005@moon


Excerpts from Robert Haas's message of Tue Dec 13 23:31:32 +0200 2011:
>
> On Mon, Dec 12, 2011 at 6:55 PM, Peter van Hardenberg <pvh(at)pvh(dot)ca> wrote:
> > I'd like to make the controversial proposal that the URL prefix should
> > be "postgres:" instead of "postgresql:". Postgres is a widely accepted
> > nickname for the project, and is eminently more pronounceable. Once
> > the url is established it will be essentially impossible to change
> > later, but right now only a nearly insurmountable mailing list thread
> > prevents it.
>
> That, and the fact the JDBC is already doing it the other way. A
> reasonable compromise might be to accept either one. AIUI, part of
> what Alexander was aiming for here was to "unite the clans", so to
> speak, and it would seem a bit unfriendly (and certainly
> counter-productive as regards that goal) to pull the rug out from him
> by refusing to support that syntax over what is basically a
> supermassive bikeshed. However, being generous in what we accept
> won't cost anything, so why not?

(oops, misfired... now sending to the list)

I was going to put a remark about "adding to the soup" here, but realized that if this is actually committed, "the soup" is gonna be like this: libpq-supported syntax vs. everything else (think JDBC, or is there any other driver in the wild not using libpq?) This is in the ideal world, where every binding is updated to embrace the new syntax and users have updated all of their systems, etc.

Before that, why don't we also accept "psql://", "pgsql://", "postgre://" and anything else? Or wait, aren't we adding to the soup again (or rather putting the soup right into libpq?)


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: URI connection string support for libpq
Date: 2011-12-14 00:54:14
Message-ID: 4EE7F3B6.7010805@2ndQuadrant.com

On 12/13/2011 05:45 PM, Alexander Shulgin wrote:
> Before that, why don't we also accept "psql://", "pgsql://", "postgre://"
> and anything else? Or wait, aren't we adding to the soup again (or
> rather putting the soup right into libpq?)

There are multiple URI samples within PostgreSQL drivers in the field;
here are two I know of, out of what I believe to be a larger number of
samples that all match in this regard:

http://sequel.rubyforge.org/rdoc/files/doc/opening_databases_rdoc.html
http://www.rmunn.com/sqlalchemy-tutorial/tutorial.html

These two are using "postgres". One of the hopes in adding URI support
was to make it possible for the libpq spec to look similar to the ones
already floating around, so that they'd all converge. Using a different
prefix than the most popular ones have already adopted isn't a good way
to start that. Now, whenever the URI discussion wanders off into
copying the JDBC driver I wonder again why that's relevant. But making
the implementation look like what people have already deployed surely
is, particularly if there's no downside to doing that.

Initial quick review of your patch: you suggested this as the general form:

psql -d postgresql://user@pw:host:port/dbname?param1=value1&param2=value2...

That's presumably supposed to be:

psql -d postgresql://user:pw@host:port/dbname?param1=value1&param2=value2...

This variation worked here:

$ psql -d postgresql://gsmith@localhost:5432/gsmith

If we had to pick one URI prefix, it should be "postgres". But given
the general name dysfunction around this project, I can't see how anyone
would complain if we squat on "postgresql" too. Attached patch modifies
yours to prove we can trivially support both, in hopes of detonating
this argument before it rages on further. Tested like this:

$ psql -d postgres://gsmith@localhost:5432/gsmith

And that works too now. I doubt either of us like what I did to the
handoff between conninfo_uri_parse and conninfo_uri_parse_options to
achieve that, but this feature is still young.

After this bit of tinkering with the code, it feels to me like this
really wants a split() function to break out the two sides of a string
across a delimiter, eating it in the process. Adding the level of
paranoia I'd like around every bit of code I see that does that type of
operation right now would take a while. Refactoring in terms of split
and perhaps a few similarly higher-level string parsing operations,
targeted for this job, might make it easier to focus on fortifying those
library routines instead. For example, instead of the gunk I just added
that moves past either type of protocol prefix, I'd like to just say
split(buf, "://", &left, &right) and then move on with processing the
right side.

I agree with your comment that we need to add some sort of regression
tests for this. Given how the parsing is done right now, we'd want to
come up with some interesting invalid strings too. Making sure this
fails gracefully (and not in a buffer overflow way) might even use
something like fuzz testing too. Around here we've just been building
some Python scripting to do that sort of thing, tests that aren't
practical to do with pg_regress. Probably be better from the project's
perspective if such things were in Perl instead; so far no one has ever
paid me enough to stomach writing non-trivial things in Perl. Perhaps
you are more diverse.
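
As a rough idea of what such Python fuzz scripting could look like, here is a small, seeded generator of malformed URI strings. It is a sketch only: the names NOISE, mutate and fuzz_cases are hypothetical, and the harness that would feed each string to "psql -d" and check for a clean error is deliberately left out.

```python
import random

# Characters that carry meaning in a URI, plus a space (an assumption
# about what is likely to confuse a hand-written parser).
NOISE = "%/:@?&=#[] "

def mutate(uri, rng, n_edits=3):
    """Apply a few random single-character edits to a URI string."""
    chars = list(uri)
    for _ in range(n_edits):
        op = rng.choice(("insert", "delete", "replace"))
        if op == "insert" or not chars:
            chars.insert(rng.randrange(len(chars) + 1), rng.choice(NOISE))
        elif op == "delete":
            del chars[rng.randrange(len(chars))]
        else:
            chars[rng.randrange(len(chars))] = rng.choice(NOISE)
    return "".join(chars)

def fuzz_cases(seed_uri, count, seed=0):
    """Return a reproducible list of mutated URIs."""
    rng = random.Random(seed)
    return [mutate(seed_uri, rng) for _ in range(count)]
```

Seeding the generator keeps failures reproducible, which matters more here than the particular mutation strategy.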

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

Attachment: libpq-uri-v3a.patch (text/x-patch, 18.7 KB)

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: URI connection string support for libpq
Date: 2011-12-14 01:11:38
Message-ID: 4EE7F7CA.9080803@commandprompt.com


On 12/13/2011 04:54 PM, Greg Smith wrote:
> On 12/13/2011 05:45 PM, Alexander Shulgin wrote:
>> Before that, why don't we also accept "psql://", "pgsql://", "postgre://"
>> and anything else? Or wait, aren't we adding to the soup again (or
>> rather putting the soup right into libpq?)
>
> There are multiple URI samples within PostgreSQL drivers in the field,
> here are two I know of what I believe to be a larger number of samples
> that all match in this regard:
>
> http://sequel.rubyforge.org/rdoc/files/doc/opening_databases_rdoc.html
> http://www.rmunn.com/sqlalchemy-tutorial/tutorial.html
>
> These two are using "postgres". One of the hopes in adding URI support
> was to make it possible for the libpq spec to look similar to the ones
> already floating around, so that they'd all converge. Using a different
> prefix than the most popular ones have already adopted isn't a good way
> to start that. Now, whenever the URI discussion wanders off into copying
> the JDBC driver I wonder again why that's relevant.

Because the use of Java/JDBC dwarfs both of your examples combined.
Don't get me wrong, I love Python (everyone knows this) but in terms of
where the work is being done it is still in Java for the most part, by
far. That said, I am not really arguing against your other points except
to answer your question.

Sincerely,

Joshua D. Drake

--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
The PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: URI connection string support for libpq
Date: 2011-12-14 04:23:35
Message-ID: 4EE824C7.50306@2ndQuadrant.com

On 12/13/2011 08:11 PM, Joshua D. Drake wrote:
> Because the use of Java/JDBC dwarfs both of your examples combined.
> Don't get me wrong, I love Python (everyone knows this) but in terms
> of where the work is being done it is still in Java for the most part,
> by far.

I was talking about better targeting a new userbase, and I think that
one is quite a bit larger than the current PostgreSQL+JDBC one. I just
don't see any value in feeding them any Java inspired cruft. As for
total size, Peter's comment mentioned having >250,000 installations
using URIs already. While they support other platforms now, I suspect
the majority of those are still running Heroku's original Ruby product
offering. The first link I pointed at was one of the Ruby URI examples.

While I do still have more Java-based customers here, there's enough
Rails ones mixed in that I wouldn't say JDBC dwarfs them anymore, even
for me. As for the rest of the world, I direct you toward
https://github.com/erh/mongo-jdbc as a sign of the times: it is
"experimental" because there's not much demand for JDBC in web app land
anymore.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: URI connection string support for libpq
Date: 2011-12-14 08:17:54
Message-ID: 20111214081753.GB23644@svana.org

On Tue, Dec 13, 2011 at 07:54:14PM -0500, Greg Smith wrote:
> After this bit of tinkering with the code, it feels to me like this
> really wants a split() function to break out the two sides of a
> string across a delimiter, eating it in the process. Adding the
> level of paranoia I'd like around every bit of code I see that does
> that type of operation right now would take a while. Refactoring in
> terms of split and perhaps a few similarly higher-level string
> parsing operations, targeted for this job, might make it easier to
> focus on fortifying those library routines instead. For example,
> instead of the gunk I just added that moves past either type of
> protocol prefix, I'd like to just say "split(buf,"://",&left,&right)
> and then move on with processing the right side.

FWIW, python calls this operation "partition", as in:

left, delim, right = buf.partition("://")
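
To spell out how that call would serve the prefix handling discussed in the quoted text, here is a small Python sketch. The names ACCEPTED_SCHEMES and strip_uri_prefix are hypothetical, and accepting exactly these two prefixes is the compromise proposed in this thread, not settled behavior.

```python
# Assumption: both spellings are accepted, as proposed in this thread.
ACCEPTED_SCHEMES = ("postgresql", "postgres")

def strip_uri_prefix(buf):
    """Return the text after "scheme://", or None if buf is not a URI."""
    left, delim, right = buf.partition("://")
    # partition() returns an empty delimiter when "://" is absent,
    # so a missing separator is detected without a separate search.
    if not delim or left not in ACCEPTED_SCHEMES:
        return None
    return right
```

The caller then continues with the right-hand side only, which is the "eat the delimiter and move on" behavior asked for above.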

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
-- Arthur Schopenhauer


From: Alexander Shulgin <ash(at)commandprompt(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2011-12-14 14:52:29
Message-ID: 1323852045-sup-7596@moon


Excerpts from Greg Smith's message of Wed Dec 14 02:54:14 +0200 2011:
>
> Initial quick review of your patch: you suggested this as the general form:
>
> psql -d postgresql://user@pw:host:port/dbname?param1=value1&param2=value2...
>
> That's presumably supposed to be:
>
> psql -d postgresql://user:pw@host:port/dbname?param1=value1&param2=value2...

Yes, that was clearly a typo, so "user:pw@host:port".

> If we had to pick one URI prefix, it should be "postgres". But given
> the general name dysfunction around this project, I can't see how anyone
> would complain if we squat on "postgresql" too.

That'd be true if we were starting afresh, in the absence of any existing URI implementations.

IMO, what makes a connection URI useful is:
a) it keeps all the connection parameters in a single string, so you can easily send it to other people to use, and
b) it works everywhere, so the people who've got the URI can use it and expect to get the same results as you do.

(Well, not without some quirks, like effects of locally-set environment variables or presence of .pgpass or service files, or different nameserver opinions about which hostname resolves to which IP address, but that is pretty much the case with any sort of URIs.)

This is not in objection to what you say, but rather an important thing to keep in mind for the purpose of this discussion.

Whatever decision we make here, the libpq-binding connectors are going to be compatible with each other automatically if they just pass the URI to libpq. However, should we stick to using "postgresql://" URI prefix exclusively, these might need to massage the URI a bit before passing further (like replacing "postgres://" with "postgresql://", also accepting the latter should be reasonable.) With proper recommendations from our side, the new client code will use the longer prefix, thus achieving compatibility with the only(?) driver not based on libpq (that is, JDBC) in the long run.

> Attached patch modifies
> yours to prove we can trivially support both, in hopes of detonating
> this argument before it rages on further. Tested like this:
>
> $ psql -d postgres://gsmith(at)localhost:5432/gsmith
>
> And that works too now. I doubt either of us like what I did to the
> handoff between conninfo_uri_parse and conninfo_uri_parse_options to
> achieve that, but this feature is still young.

Yes, the caller could just do the pointer arithmetic itself, since the exact URI prefix is known at the time, then pass it to conninfo_uri_parse.

> After this bit of tinkering with the code, it feels to me like this
> really wants a split() function to break out the two sides of a string
> across a delimiter, eating it in the process. Adding the level of
> paranoia I'd like around every bit of code I see that does that type of
> operation right now would take a while. Refactoring in terms of split
> and perhaps a few similarly higher-level string parsing operations,
> targeted for this job, might make it easier to focus on fortifying those
> library routines instead. For example, instead of the gunk I just added
> that moves past either type of protocol prefix, I'd like to just say
> "split(buf,"://",&left,&right) and then move on with processing the
> right side.

A search with cscope over my repo clone doesn't give any results for "split", so I assume you're talking about a new function with a signature similar to the following:

split(char *buf, const char *delim, char **left, char **right)

Note, there should be no need for parameter "left", since that will be pointing to the start of "buf". Also, we might just return "right" as a function's value instead of using out-parameter, with NULL meaning delimiter was not found in the buffer.

Now, if you look carefully at the patch's code, there are numerous places where it accepts either of two delimiting characters and needs to examine one before zeroing it out, so it'll need something more like this:

char *need_a_good_name_for_this(char *buf, const char *possible_delims, char *actual_delim)

where it will store a copy of encountered delimiting char in *actual_delim before modifying the buffer.
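
The intended behavior of that second helper can be modeled in a few lines of Python. This is a sketch of the semantics only, not the C implementation: the name split_at_first is hypothetical, and returning the delimiter as part of a tuple stands in for the *actual_delim out-parameter.

```python
def split_at_first(buf, possible_delims):
    """Split buf at the first character found in possible_delims.

    Returns (left, actual_delim, right). When no delimiter is present,
    actual_delim and right are empty strings.
    """
    for i, ch in enumerate(buf):
        if ch in possible_delims:
            # Report which delimiter was hit, since the caller's next
            # step depends on it (e.g. ":" means a port follows).
            return buf[:i], ch, buf[i + 1:]
    return buf, "", ""
```

For instance, split_at_first("host:5432/db", ":/") stops at the colon, while the same call on "host/db" stops at the slash. In C, the standard strpbrk() function performs the equivalent search for the first byte from a set.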

> I agree with your comment that we need to add some sort of regression
> tests for this. Given how the parsing is done right now, we'd want to
> come up with some interesting invalid strings too. Making sure this
> fails gracefully (and not in a buffer overflow way) might even use
> something like fuzz testing too. Around here we've just been building
> some Python scripting to do that sort of thing, tests that aren't
> practical to do with pg_regress.

I'd appreciate it if you could point me to any specific example of such existing tests to take some inspiration from.

--
Regards,
Alex


From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Alexander Shulgin <ash(at)commandprompt(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-02-22 17:26:20
Message-ID: 4F45253C.4030409@2ndQuadrant.com

This submission has turned into a bit of a mess. I did the closest
thing to a review the day after it was submitted; follow-up review
attempts had issues applying the patch. And it's been stuck there. The
patch is still fine, I just tested it out to pick this back up myself
again. I think this one is a good advocacy feature, and most of the
hard work is done already. Smooth some edge cases and this will be
ready to go.

First thing: the URI prefix. It is possible to connect using a URI in
Python+SQL Alchemy, which was mentioned before as not too relevant due
to their also requiring a driver name. As documented at
http://docs.sqlalchemy.org/en/latest/core/engines.html and demonstrated
at http://packages.python.org/Flask-SQLAlchemy/config.html , it is
possible to leave off the driver part of the connection string. That
assumes the default driver, such that postgresql:// does the same as
postgresql+psycopg2:// , sensibly. That means we absolutely have an
installed base of URI speaking developers split between postgresql://
(Python) and postgres:// (Ruby). Given that, there really isn't a
useful path forward that helps out all those developers without
supporting both prefixes. That's where this left off before, I just
wanted to emphasize how clear that need seems now.

Next thing, also mentioned at that Flask page. SQLite has standardized
the idea that sqlite:////absolute/path/to/foo.db is a URI pointing to a
file. Given that, I wonder if Alex's syntax for specifying a socket
file name might adopt that syntax, rather than requiring the hex
encoding: postgresql://%2Fvar%2Fpgsql%2Ftmp/mydb It's not a big deal,
but it would smooth another rough edge toward making the Postgres URI
implementation look as close as possible to others.

So far I've found only one syntax that I expected this to handle that it
rejects:

psql -d postgresql://gsmith@localhost

It's picky about needing that third slash, but that shouldn't be hard to
fix. I started collecting up all the variants that do work as an
initial shell script regression test, so that changes don't break
something that already works. Here are all the variations that already
work, set up so that a series of "1" outputs is passing:

psql -d postgresql://gsmith@localhost:5432/gsmith -At -c "SELECT 1"
psql -d postgresql://gsmith@localhost/gsmith -At -c "SELECT 1"
psql -d postgresql://localhost:5432/gsmith -At -c "SELECT 1"
psql -d postgresql://localhost/gsmith -At -c "SELECT 1"
psql -d postgresql://gsmith@localhost:5432/ -At -c "SELECT 1"
psql -d postgresql://gsmith@localhost/ -At -c "SELECT 1"
psql -d postgresql://localhost:5432/ -At -c "SELECT 1"
psql -d postgresql://localhost/gsmith -At -c "SELECT 1"
psql -d postgresql://localhost/ -At -c "SELECT 1"
psql -d postgresql:/// -At -c "SELECT 1"
psql -d postgresql://%6Cocalhost/ -At -c "SELECT 1"
psql -d postgresql://localhost/gsmith?user=gsmith -At -c "SELECT 1"
psql -d postgresql://localhost/gsmith?user=gsmith&port=5432 -At -c "SELECT 1"
psql -d postgresql://localhost/gsmith?user=gsmith\&port=5432 -At -c "SELECT 1"

Replace all the "gsmith" with $USER to make this usable for others.
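
One way to avoid hand-editing the user name is to generate the basic combinations. The Python sketch below covers only the user/port/dbname on-off variants from the list above, not the percent-encoded or query-parameter cases; the function names and the default host and port are illustrative assumptions.

```python
from itertools import product

def uri_variants(user, host="localhost", port="5432"):
    """Build URIs with the user, port and dbname each present or absent."""
    variants = []
    for with_user, with_port, with_db in product((True, False), repeat=3):
        authority = (user + "@" if with_user else "") + host
        if with_port:
            authority += ":" + port
        # The database name defaults to the user name, as in the list above.
        path = "/" + (user if with_db else "")
        variants.append("postgresql://" + authority + path)
    return variants

def commands(user):
    """Render each variant as a psql invocation that should print 1."""
    return ['psql -d {} -At -c "SELECT 1"'.format(u) for u in uri_variants(user)]
```

Calling commands() with the value of $USER gives eight command lines that can be written to a shell script and run against a local server.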

My eyes are starting to cross when I look at URI now, so that's enough
for today. If Alex wants to rev this soon, great; if not I have a good
idea what I'd like to do with this next, regardless of that.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


From: Alex Shulgin <ash(at)commandprompt(dot)com>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-02-24 13:01:12
Message-ID: 8762ewz8pz.fsf@commandprompt.com

Greg Smith <greg(at)2ndQuadrant(dot)com> writes:

Thank you for the review, Greg!

> Given that, there really isn't a useful path forward that helps out
> all those developers without supporting both prefixes. That's where
> this left off before, I just wanted to emphasize how clear that need
> seems now.

OK, I've used the code from your earlier review to support the short
prefix. I sincerely hope we don't make the situation any worse by being
flexible about the prefix...

> Next thing, also mentioned at that Flask page. SQLite has
> standardized the idea that sqlite:////absolute/path/to/foo.db is a URI
> pointing to a file. Given that, I wonder if Alex's syntax for
> specifying a socket file name might adopt that syntax, rather than
> requiring the hex encoding: postgresql://%2Fvar%2Fpgsql%2Ftmp/mydb
> It's not a big deal, but it would smooth another rough edge toward
> making the Postgres URI implementation look as close as possible to
> others.

Yeah, this is really appealing, however how do you tell if the part
after the last slash is a socket directory name or a dbname? E.g:

psql postgres:///path/to/different/socket/dir (default dbname)
psql postgres:///path/to/different/socket/dir/other (dbname=other ?)

If we treat the whole URI string as the path to the socket dir (which I
find the most intuitive way to do it,) the only way to specify a
non-default dbname is to use query parameter:

psql postgres:///path/to/different/socket/dir?dbname=other

or pass another -d flag to psql *after* the URI:

psql [-d] postgres:///path/to/different/socket/dir -d other

Reasonable?
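
The first interpretation (whole path is the socket directory, dbname comes from the query string) can be sketched in Python as follows. This models the proposal only and is not the patch's code; parse_socket_uri is a hypothetical name. Storing the directory under "host" follows libpq's existing convention that a host value starting with a slash names a socket directory.

```python
from urllib.parse import parse_qsl, unquote, urlsplit

def parse_socket_uri(uri):
    """Interpret the path of a host-less URI as a Unix socket directory."""
    parts = urlsplit(uri)
    params = dict(parse_qsl(parts.query))
    # Only an empty authority ("postgres:///...") triggers this rule;
    # a bare "/" path means "use the default socket location".
    if not parts.netloc and parts.path not in ("", "/"):
        params["host"] = unquote(parts.path)
    return params
```

With this rule, "postgres:///path/to/different/socket/dir?dbname=other" yields both a host and a dbname, while "postgres:///" yields no parameters at all and leaves everything to the defaults.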

> So far I've found only one syntax that I expected this to handle that
> it rejects:
>
> psql -d postgresql://gsmith(at)localhost
>
> It's picky about needing that third slash, but that shouldn't be hard
> to fix.

Yeah, good that you've spotted it. If my reading of the URI RFC (2396)
is correct, the question mark and query parameters may follow the
hostname, w/o that slash too, like this:

psql -d postgresql://localhost?user=gsmith

So this made me relax some checks and rewrite the code a bit.
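
As an independent point of comparison, Python's standard urllib.parse splits these slash-less forms the same way. The snippet is only a cross-check of the generic URI syntax, not a statement about the patch's behavior; components is a hypothetical helper name.

```python
from urllib.parse import urlsplit

def components(uri):
    """Return (username, hostname, path, query) for a URI string."""
    parts = urlsplit(uri)
    return parts.username, parts.hostname, parts.path, parts.query

for uri in (
    "postgresql://gsmith@localhost",       # no trailing slash at all
    "postgresql://localhost?user=gsmith",  # query directly after the host
):
    print(uri, components(uri))
```

In both cases the path component comes back empty and the host is still recognized, so the authority ends at either a slash or a question mark.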

> I started collecting up all the variants that do work as an initial
> shell script regression test, so that changes don't break something
> that already works. Here are all the variations that already work,
> setup so that a series of "1" outputs is passing:
>
[snip]

Yes, the original code was just a bit too picky about URI component
separators. Attached also is a simplified test shell script.

I have also added a warning message for when a query parameter is not
recognized and is being ignored. Not sure if a plain fprintf to stderr is
accepted practice for libpq; please correct me if you have a better idea.

--
Regards,
Alex

Attachments:
libpq-uri-v4.patch (text/x-diff, 18.3 KB)
psql-uri-regress.sh (text/x-sh, 879 bytes)

From: Florian Weimer <fweimer(at)bfk(dot)de>
To: Alex Shulgin <ash(at)commandprompt(dot)com>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-02-24 13:07:21
Message-ID: 82k43cz8fq.fsf@mid.bfk.de

* Alex Shulgin:

> Yeah, this is really appealing, however how do you tell if the part
> after the last slash is a socket directory name or a dbname? E.g:
>
> psql postgres:///path/to/different/socket/dir (default dbname)
> psql postgres:///path/to/different/socket/dir/other (dbname=other ?)

The HTTP precedent is to probe the file system until you find something.
Most HTTP servers have something similar to the PATH_INFO variable which
captures trailing path segments.

It's ugly, but it's standard practice, and seems better than a separate
-d parameter (which sort of defeats the purpose of URIs).
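
A rough Python model of that probing approach: take the longest leading part of the path that exists as a directory, and treat whatever follows as the database name. The function name split_socket_path is hypothetical, and choosing the longest match is an assumption about how the ambiguity would be resolved.

```python
import os

def split_socket_path(path):
    """Split "/dir/.../dbname" into (socket_dir, dbname) by probing.

    Returns (None, remainder) when no leading part is a directory.
    """
    segments = [s for s in path.split("/") if s]
    # Try the longest candidate first, then drop trailing segments.
    for n in range(len(segments), 0, -1):
        candidate = "/" + "/".join(segments[:n])
        if os.path.isdir(candidate):
            return candidate, "/".join(segments[n:])
    return None, "/".join(segments)
```

The result depends on the local file system, so the same URI could resolve differently on two machines, for example if the socket directory happens to contain a subdirectory named like the database.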

--
Florian Weimer <fweimer(at)bfk(dot)de>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99


From: Alex Shulgin <ash(at)commandprompt(dot)com>
To: Florian Weimer <fweimer(at)bfk(dot)de>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-02-24 13:16:53
Message-ID: 87wr7cxtfe.fsf@commandprompt.com
Lists: pgsql-hackers


Florian Weimer <fweimer(at)bfk(dot)de> writes:

> * Alex Shulgin:
>
>> Yeah, this is really appealing, however how do you tell if the part
>> after the last slash is a socket directory name or a dbname? E.g:
>>
>> psql postgres:///path/to/different/socket/dir (default dbname)
>> psql postgres:///path/to/different/socket/dir/other (dbname=other ?)
>
> The HTTP precedent is to probe the file system until you find something.
> Most HTTP servers have something similar to the PATH_INFO variable which
> captures trailing path segments.
>
> It's ugly, but it's standard practice, and seems better than a separate
> -d parameter (which sort of defeats the purpose of URIs).

Hm, do you see anything wrong with "?dbname=other" if you don't
like a separate -d?

--
Alex


From: Florian Weimer <fweimer(at)bfk(dot)de>
To: Alex Shulgin <ash(at)commandprompt(dot)com>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-02-24 13:18:44
Message-ID: 82fwe0z7wr.fsf@mid.bfk.de
Lists: pgsql-hackers

* Alex Shulgin:

>> It's ugly, but it's standard practice, and seems better than a separate
>> -d parameter (which sort of defeats the purpose of URIs).
>
> Hm, do you see anything wrong with "?dbname=other" if you don't
> like a separate -d?

It's not nice URI syntax, but it's better than an out-of-band mechanism.

--
Florian Weimer <fweimer(at)bfk(dot)de>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99


From: Cédric Villemain <cedric(at)2ndquadrant(dot)fr>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Florian Weimer <fweimer(at)bfk(dot)de>, Alex Shulgin <ash(at)commandprompt(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-02-25 19:37:01
Message-ID: 201202252037.02330.cedric@2ndquadrant.fr
Lists: pgsql-hackers

On Friday 24 February 2012 14:18:44, Florian Weimer wrote:
> * Alex Shulgin:
> >> It's ugly, but it's standard practice, and seems better than a separate
> >> -d parameter (which sort of defeats the purpose of URIs).
> >
> > Hm, do you see anything wrong with "?dbname=other" if you don't
> > like a separate -d?
>
> It's not nice URI syntax, but it's better than an out-of-band mechanism.

I've not followed all the mails about this feature, but I don't find it a
nice syntax either.

"?dbname=other" looks like dbname is an argument, but dbname is a requirement
for a PostgreSQL connection.

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: 24x7 Support - Development, Expertise and Training


From: Alexander Shulgin <ash(at)commandprompt(dot)com>
To: cedric(at)2ndquadrant(dot)fr
Cc: pgsql-hackers(at)postgresql(dot)org, Florian Weimer <fweimer(at)bfk(dot)de>, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-02-26 07:33:00
Message-ID: 4F49E02C.8060401@commandprompt.com
Lists: pgsql-hackers


On 02/25/2012 09:37 PM, Cédric Villemain wrote:
>
> I've not followed all the mails about this feature, but I don't find it a
> nice syntax either.
>
> "?dbname=other" looks like dbname is an argument, but dbname is a requirement
> for a PostgreSQL connection.

Ugh, not really. AFAIK, dbname is a connection option which defaults to
$USER, unless overridden on the command line or in the environment (or via
a service file).

--
Alex


From: Alexander Shulgin <ash(at)commandprompt(dot)com>
To: Florian Weimer <fweimer(at)bfk(dot)de>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-02-28 18:45:23
Message-ID: 4F4D20C3.2060008@commandprompt.com
Lists: pgsql-hackers

On 02/24/2012 03:18 PM, Florian Weimer wrote:
>
> * Alex Shulgin:
>
>>> It's ugly, but it's standard practice, and seems better than a separate
>>> -d parameter (which sort of defeats the purpose of URIs).
>>
>> Hm, do you see anything wrong with "?dbname=other" if you don't
>> like a separate -d?
>
> It's not nice URI syntax, but it's better than an out-of-band mechanism.

Attached is v5 of the patch, adding support for specifying a local Unix
socket directory without the need to percent-encode path separators.
The path to the directory must start with a forward slash, like so:

postgres:///path/to/socket/dir

To specify a non-default dbname, use URI query parameters:

postgres:///path/to/socket/dir?dbname=other

In this case, the username and password should also be specified as query
parameters, as opposed to the "user:pw@host" syntax supported by host URIs.
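
The two URI shapes described above can be told apart as in the following
Python sketch. It is only an approximation of the rules as stated in this
message, not a port of the patch; the function name parse_v5_style is
hypothetical, percent-decoding of the user name and password is left out,
and letting the authority part win over a conflicting query parameter is
an arbitrary choice made for the example.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_v5_style(uri):
    """Map a URI to a dict of connection options, per the v5 description."""
    parts = urlsplit(uri)
    opts = dict(parse_qsl(parts.query))
    if not parts.netloc and len(parts.path) > 1 and parts.path.startswith("/"):
        # Local form: the whole path names the Unix socket directory,
        # so dbname, user and password can only come from the query.
        opts["host"] = parts.path
    else:
        # Host form: user:pw@host:port/dbname, every component optional.
        if parts.hostname:
            opts["host"] = parts.hostname
        if parts.username:
            opts["user"] = parts.username
        if parts.password:
            opts["password"] = parts.password
        if parts.port:
            opts["port"] = str(parts.port)
        dbname = parts.path.lstrip("/")
        if dbname:
            opts["dbname"] = dbname
    return opts


print(parse_v5_style("postgres:///path/to/socket/dir?dbname=other"))
print(parse_v5_style("postgres://user:pw@host:5432/mydb"))
```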

--
Alex

Attachment          Content-Type  Size
libpq-uri-v5.patch  text/x-patch  19.4 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: Alexander Shulgin <ash(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-03-05 23:09:12
Message-ID: 1330988952.14443.0.camel@vanquo.pezone.net
Lists: pgsql-hackers

On Wed, 2012-02-22 at 12:26 -0500, Greg Smith wrote:
> I started collecting up all the variants that do work as an
> initial shell script regression test, so that changes don't break
> something that already works. Here are all the variations that
> already work, set up so that a series of "1" outputs is passing:

Let's please add something like this to the patch. Otherwise, I foresee
a lot of potential to break corner cases in the future.


From: Alexander Shulgin <ash(at)commandprompt(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-03-06 08:11:49
Message-ID: 4F55C6C5.3040800@commandprompt.com
Lists: pgsql-hackers


On 03/06/2012 01:09 AM, Peter Eisentraut wrote:
>
> On Wed, 2012-02-22 at 12:26 -0500, Greg Smith wrote:
>> I started collecting up all the variants that do work as an
>> initial shell script regression test, so that changes don't break
>> something that already works. Here are all the variations that
>> already work, set up so that a series of "1" outputs is passing:
>
> Let's please add something like this to the patch. Otherwise, I foresee
> a lot of potential to break corner cases in the future.

I've included a (separate) test shell script based on Greg's cases in
one of the updates. What would be the preferred place to plug it in?
Override installcheck in the libpq Makefile?

--
Alex


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Alexander Shulgin <ash(at)commandprompt(dot)com>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-03-06 17:30:28
Message-ID: 1331055028.19112.1.camel@vanquo.pezone.net
Lists: pgsql-hackers

On Tue, 2012-03-06 at 10:11 +0200, Alexander Shulgin wrote:
> On 03/06/2012 01:09 AM, Peter Eisentraut wrote:
> >
> > On Wed, 2012-02-22 at 12:26 -0500, Greg Smith wrote:
> >> I started collecting up all the variants that do work as an
> >> initial shell script regression test, so that changes don't break
> >> something that already works. Here are all the variations that
> >> already work, set up so that a series of "1" outputs is passing:
> >
> > Let's please add something like this to the patch. Otherwise, I foresee
> > a lot of potential to break corner cases in the future.
>
> I've included a (separate) test shell script based on Greg's cases in
> one of the updates. What would be the preferred place to plug it in?
> Override installcheck in the libpq Makefile?

I think that would be the right place.


From: Alex Shulgin <ash(at)commandprompt(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-03-07 16:31:17
Message-ID: 87ipigwey2.fsf@commandprompt.com
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:

>> I've included a (separate) test shell script based on Greg's cases in
>> one of the updates. What would be the preferred place to plug it in?
>> Override installcheck in the libpq Makefile?
>
> I think that would be the right place.

I figured that adding this right into src/interfaces/libpq would pollute
the source dir, so I've used src/test instead.

The attached v6 adds a src/test/uri directory complete with the test
script, an expected output file, and a Makefile which responds to
installcheck. A README file is also included.

--
Alex


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Alex Shulgin <ash(at)commandprompt(dot)com>
Cc: Greg Smith <greg(at)2ndQuadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-03-07 18:03:03
Message-ID: 1331143383.12416.0.camel@vanquo.pezone.net
Lists: pgsql-hackers

On Wed, 2012-03-07 at 18:31 +0200, Alex Shulgin wrote:
> I figured that adding this right into src/interfaces/libpq would
> pollute the source dir, so I've used src/test instead.

I would prefer src/interfaces/libpq/test, to keep it close to the code.


From: Alexander Shulgin <ash(at)commandprompt(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-03-07 19:16:28
Message-ID: 4F57B40C.3060406@commandprompt.com
Lists: pgsql-hackers


On 03/07/2012 08:03 PM, Peter Eisentraut wrote:
>
> On Wed, 2012-03-07 at 18:31 +0200, Alex Shulgin wrote:
>> I figured that adding this right into src/interfaces/libpq would
>> pollute the source dir, so I've used src/test instead.
>
> I would prefer src/interfaces/libpq/test, to keep it close to the code.

Hm, actually that makes more sense and is not unprecedented (I see ecpg
has its own 'test' subdir). Apparently I was under the false impression
that all regression tests are concentrated under $(topdir)/src/test.

I'll post an updated patch shortly (unless someone would like to argue
for keeping the tests where they are now).


From: Alexander Shulgin <ash(at)commandprompt(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: URI connection string support for libpq
Date: 2012-03-07 21:54:06
Message-ID: 4F57D8FE.9070201@commandprompt.com
Lists: pgsql-hackers

On 03/07/2012 09:16 PM, Alexander Shulgin wrote:
>
>> I would prefer src/interfaces/libpq/test, to keep it close to the code.
>
> Hm, actually that makes more sense and is not unprecedented (I see ecpg
> has its own 'test' subdir). Apparently I was under the false impression
> that all regression tests are concentrated under $(topdir)/src/test.
>
> I'll post an updated patch shortly (unless someone would like to argue
> for keeping the tests where they are now).

And here it is attached (v7). The test code now lives under libpq/test.

A colleague of mine also pointed out that expanded PGUSER/PGPORT vars
slipped into the expected.out file in the previous version, so that file
was not really useful for testing.

The new version addresses this by expanding shell vars in a separate
step. The test lines have moved to a separate file, 'regress.in', since
we are expanding the variables manually now (no need for a heredoc).
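
To make the idea concrete, here is a small Python sketch of keeping test
lines unexpanded on disk and substituting variables only when a test runs.
The patch itself uses a shell script; the ${PGUSER}/${PGPORT} spelling in
the sample line and the helper name expand_test_line are assumptions made
for this example, not contents of 'regress.in'.

```python
from string import Template


def expand_test_line(line, env):
    """Expand $VAR / ${VAR} references in one test line.

    Unknown variables are left untouched, so the stored line stays
    readable and the expected output never contains local values.
    """
    return Template(line).safe_substitute(env)


env = {"PGUSER": "alice", "PGPORT": "5433"}
print(expand_test_line("postgresql://${PGUSER}@localhost:${PGPORT}/", env))
```

Because only the unexpanded lines are stored, the expected output file
stays the same on every machine regardless of the local user name or port.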

After moving the test lines to a separate file, I noticed that it was
identical to the expected output file. So I thought it would be nice
to add some failing URIs as well (this improves code coverage). I did
that, and one test highlighted a minor bug, which I've also fixed.

For that fix, I decided to move the previously extracted parts of the code
back into the main parser routine, as it was getting too ugly to pass all
the required local vars to them. I hope this won't be too confusing for
those who had a chance to review the previous version.

--
Regards,
Alex

Attachment          Content-Type  Size
libpq-uri-v7.patch  text/x-patch  24.8 KB