Re: Feedback about Drupal SQL debugging

Lists: pgsql-hackers
From: Jean-Michel Pouré <jm(at)poure(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Feedback about Drupal SQL debugging
Date: 2009-08-21 16:22:41
Message-ID: 1250871761.5834.11.camel@acer
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dear friends,

I have been using PostgreSQL since 6.3 releases and I am a real fan.
Of course, I never use nor trust MySQL to deliver data.

Now I use Drupal 6.3 with PostgreSQL 8.4.

I loose a lot of time correcting Drupal SQL.
You may be interested in my developer feedback.

I gathered some real examples here: Guidelines for writing MySQL and
PostgreSQL compliant SQL => http://drupal.org/node/555514

This page gathers most frequent problems that Drupal users and
developers encounter when using PostgreSQL.

I would be delighted to have your feedback.
Could some issues reasonably be fixed for a better Drupal support?

Kind regards,
Jean-Michel


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Jean-Michel Pouré <jm(at)poure(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 16:39:23
Message-ID: 1250872763.4365.1.camel@jd-desktop.unknown.charter.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> This page gathers most frequent problems that Drupal users and
> developers encounter when using PostgreSQL.
>
> I would be delighted to have your feedback.
> Could some issues reasonably be fixed for a better Drupal support?

Well I doubt we would do anything to copy MySQL. However Drupal has
already made strides to solve this in 7.x. The abstraction layer is much
smarter.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering


From: "D'Arcy J(dot)M(dot) Cain" <darcy(at)druid(dot)net>
To: Jean-Michel Pouré <jm(at)poure(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 16:48:52
Message-ID: 20090821124852.b927fc55.darcy@druid.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 21 Aug 2009 18:22:41 +0200
Jean-Michel Pouré <jm(at)poure(dot)com> wrote:
> I gathered some real examples here: Guidelines for writing MySQL and
> PostgreSQL compliant SQL => http://drupal.org/node/555514
>
> This page gathers most frequent problems that Drupal users and
> developers encounter when using PostgreSQL.
>
> I would be delighted to have your feedback.

It looks to me like you could just reference SQL99 rather than
mentioning PostgreSQL other than as an example of a standards compliant
database engine.

How would those constructs work in MS-SQL or Oracle?

--
D'Arcy J.M. Cain <darcy(at)druid(dot)net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Jean-Michel Pouré <jm(at)poure(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 16:50:08
Message-ID: 200908211650.n7LGo8X09335@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jean-Michel Pour wrote:
-- Start of PGP signed section.
> Dear friends,
>
> I have been using PostgreSQL since 6.3 releases and I am a real fan.
> Of course, I never use nor trust MySQL to deliver data.
>
> Now I use Drupal 6.3 with PostgreSQL 8.4.
>
> I loose a lot of time correcting Drupal SQL.
> You may be interested in my developer feedback.
>
> I gathered some real examples here: Guidelines for writing MySQL and
> PostgreSQL compliant SQL => http://drupal.org/node/555514
>
> This page gathers most frequent problems that Drupal users and
> developers encounter when using PostgreSQL.
>
> I would be delighted to have your feedback.
> Could some issues reasonably be fixed for a better Drupal support?

I doubt we are going to change Postgres to improve Drupal support --- it
would be better to fix Drupal.

However, I have a few suggestions:

For this item, http://drupal.org/node/555580, use || for concatentation.

I can't believe MySQL doesn't support multi-column indexes,
http://drupal.org/node/555558.

For this item, I think you want DELETE FROM history USING ...,
http://drupal.org/node/555562.

The SQL standard doesn't support multiple deletes, so odds are we will
not either, http://drupal.org/node/555648.

I show multi-value INSERT was added in PG 8.2, not 8.4, * Add
support for multiple-row VALUES clauses, per SQL standard (Joe, Tom),
http://drupal.org/node/555568.

I am confused because I thought Drupal worked with Postgres, but looking
at your list, it seems it doesn't.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 16:53:45
Message-ID: 1250873625.4365.3.camel@jd-desktop.unknown.charter.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2009-08-21 at 12:50 -0400, Bruce Momjian wrote:

> I show multi-value INSERT was added in PG 8.2, not 8.4, * Add
> support for multiple-row VALUES clauses, per SQL standard (Joe, Tom),
> http://drupal.org/node/555568.
>
> I am confused because I thought Drupal worked with Postgres, but looking
> at your list, it seems it doesn't.

Drupal itself works perfectly on Postgres. The problem is module authors
that are MySQL developers who never test their module on Postgres.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: jd(at)commandprompt(dot)com
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 16:54:24
Message-ID: 4A8ED140.4070102@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joshua D. Drake wrote:
>> This page gathers most frequent problems that Drupal users and
>> developers encounter when using PostgreSQL.
>>
>> I would be delighted to have your feedback.
>> Could some issues reasonably be fixed for a better Drupal support?
>>
>
> Well I doubt we would do anything to copy MySQL. However Drupal has
> already made strides to solve this in 7.x. The abstraction layer is much
> smarter.
>
>
>
Joshua,

Since you haven't shown us what page this refers to, I at least am
totally in the dark about what is being discussed.

cheers

andrew


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: jd(at)commandprompt(dot)com, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 17:06:43
Message-ID: 407d949e0908211006j178dac34y6a91092282275bde@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/8/21 Andrew Dunstan <andrew(at)dunslane(dot)net>:
>
> Since you haven't shown us what page this refers to, I at least am totally
> in the dark about what is being discussed.

It was in the original post

http://drupal.org/node/555514

--
greg
http://mit.edu/~gsstark/resume.pdf


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: jd(at)commandprompt(dot)com, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 17:17:11
Message-ID: 4A8ED697.1070006@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark wrote:
> 2009/8/21 Andrew Dunstan <andrew(at)dunslane(dot)net>:
>
>> Since you haven't shown us what page this refers to, I at least am totally
>> in the dark about what is being discussed.
>>
>
> It was in the original post
>
> http://drupal.org/node/555514
>
>

Darn. Our mail system sucks badly. For some insane reason some
postgresbut not all emails to me get badly delayed.

cheers

andrew


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jean-Michel Pouré <jm(at)poure(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 17:30:35
Message-ID: 603c8f070908211030k6dc39339r78bbd0623d2647c9@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/8/21 Jean-Michel Pouré <jm(at)poure(dot)com>:
> Dear friends,
>
> I have been using PostgreSQL since 6.3 releases and I am a real fan.
> Of course, I never use nor trust MySQL to deliver data.
>
> Now I use Drupal 6.3 with PostgreSQL 8.4.
>
> I loose a lot of time correcting Drupal SQL.
> You may be interested in my developer feedback.
>
> I gathered some real examples here: Guidelines for writing MySQL and
> PostgreSQL compliant SQL => http://drupal.org/node/555514
>
> This page gathers most frequent problems that Drupal users and
> developers encounter when using PostgreSQL.
>
> I would be delighted to have your feedback.
> Could some issues reasonably be fixed for a better Drupal support?

A lot of these issues seem to have easy workarounds, so I'm not sure
what the big deal is. If you don't write standards-compliant SQL, you
shouldn't be surprised when you find out that it's not portable. Most
of those constructions wouldn't work on Microsoft SQL server either,
and I bet at least some of them would fail under Oracle as well.

For the int/varchar casting issue, you might try putting single quotes
around the values. I would expect that to work in both databases,
though I don't use MySQL.

...Robert


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Jean-Michel Pouré <jm(at)poure(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 19:14:12
Message-ID: 4A8EF204.2060708@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jean-Michel,

Thank you for doing this!

I've registered for the Drupal site so that I can fix and/or expand some
of your items.

People who know Drupal better than me should add to them.

If you want to discuss Drupal & PostgreSQL again, please post on the
pgsql-advocacy list or the pgsql-php mailing lists. pgsql-hackers isn't
the best place to get people to help you.

BTW, why don't we have a multi-argument version of CONCAT()? In 8.4, it
would be possible ... I should add it to mysqlcompat library.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 20:19:43
Message-ID: 9631.1250885983@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> BTW, why don't we have a multi-argument version of CONCAT()?

Why wouldn't people use the SQL-standard || operator instead?

regards, tom lane


From: Jean-Michel Pouré <jm(at)poure(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 21:34:57
Message-ID: 1250890497.5563.6.camel@acer
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I've registered for the Drupal site so that I can fix and/or expand
> some
> of your items.

Thanks. I corrected the index on dual fields page.

> If you want to discuss Drupal & PostgreSQL again, please post on the
> pgsql-advocacy list or the pgsql-php mailing lists. pgsql-hackers
> isn't
> the best place to get people to help you.

I would prefer no, please. This post is made to understand what needs to
be done at PostgreSQL level for better Drupal supports.

As written previously, Drupal developers write MySQL code. Some of this
code is not portable, okay.

> BTW, why don't we have a multi-argument version of CONCAT()? In 8.4,
> it
> would be possible ... I should add it to mysqlcompat library.

yes. In PostgreSQL core ...

PostgreSQL requires all non-aggregated fields to be present in the GROUP
BY clause (I fixed 10 such issues in Drupal code).
http://drupal.org/node/555530

Why can't PostgreSQL add the required field automatically? Could this be
added to PostgreSQL to-do-list?

Kind regards,
Jean-Michel


From: David Fetter <david(at)fetter(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 22:15:16
Message-ID: 20090821221516.GB6792@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Aug 21, 2009 at 04:19:43PM -0400, Tom Lane wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
> > BTW, why don't we have a multi-argument version of CONCAT()?
>
> Why wouldn't people use the SQL-standard || operator instead?

Because by default, MySQL uses that as, get this, "logical OR."

Cheers,
David (grateful he's with a project that doesn't just gratuitously go
around breaking stuff)
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Fetter <david(at)fetter(dot)org>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 22:17:18
Message-ID: 11606.1250893038@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David Fetter <david(at)fetter(dot)org> writes:
> On Fri, Aug 21, 2009 at 04:19:43PM -0400, Tom Lane wrote:
>> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>>> BTW, why don't we have a multi-argument version of CONCAT()?
>>
>> Why wouldn't people use the SQL-standard || operator instead?

> Because by default, MySQL uses that as, get this, "logical OR."

Egad. Well, I think that's something for the mysqlcompat project
not core ...

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 22:48:00
Message-ID: 4A8F2420.8020605@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 8/21/09 3:17 PM, Tom Lane wrote:
> David Fetter <david(at)fetter(dot)org> writes:
>> On Fri, Aug 21, 2009 at 04:19:43PM -0400, Tom Lane wrote:
>>> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>>>> BTW, why don't we have a multi-argument version of CONCAT()?
>>> Why wouldn't people use the SQL-standard || operator instead?
>
>> Because by default, MySQL uses that as, get this, "logical OR."
>
> Egad. Well, I think that's something for the mysqlcompat project
> not core ...

Yeah, I'll write one. I'll also blog it as an example of the new
variable argument functions.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Jean-Michel Pouré <jm(at)poure(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 23:01:36
Message-ID: 4A8F2750.1010504@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jean-Michel Pouré wrote:
>
>> BTW, why don't we have a multi-argument version of CONCAT()? In 8.4,
>> it
>> would be possible ... I should add it to mysqlcompat library.
>>
>
> yes. In PostgreSQL core ...
>

No. That is exactly where it shouldn't go. And frankly, Drupal
developers should stop writing non-portable code. Isn't there a Mysql
mode that is supposed to conform to the standard, at least more than
their default mode?

> PostgreSQL requires all non-aggregated fields to be present in the GROUP
> BY clause (I fixed 10 such issues in Drupal code).
> http://drupal.org/node/555530
>
> Why can't PostgreSQL add the required field automatically? Could this be
> added to PostgreSQL to-do-list?
>
>
>

Isn't that contrary to the standard?

cheers

andrew


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Jean-Michel Pouré <jm(at)poure(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 23:03:40
Message-ID: 407d949e0908211603i53553592m5923c7fa18ade891@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/8/21 Jean-Michel Pouré <jm(at)poure(dot)com>:
> PostgreSQL requires all non-aggregated fields to be present in the GROUP
> BY clause (I fixed 10 such issues in Drupal code).
> http://drupal.org/node/555530
>
> Why can't PostgreSQL add the required field automatically? Could this be
> added to PostgreSQL to-do-list?

This is a more complex (and more interesting) topic than what your
blog discusses.

Firstly understand what MySQL is *actually* doing:

select a,b,c from tab group by a

Only sorts and groups by "a" as instructed. The b columns and c
columns are not included in the grouping. So if you have data like:

a,b,c
1,1,1
1,2,2
2,1,1
2,2,2

You'll get two groups because there are only two values of "a". One
group will have a=1 and one group will have a=2. Which value you get
for b and c will be completely arbitrary and unpredictable.

If Postgres added b,c to the GROUP BY it would produce four groups,
because there four different values of <a,b,c>. You *can* get
something similar to MySQL's behaviour using DISTINCT ON:

select distinct on (a) a,b,c from a ORDER BY a,b,c

But Postgres insists you have an ORDER BY which has to agree with the
DISTINCT ON columns and provide some extra column(s) to determine
which values of b,c are chosen.

If Postgres changed on this front it would be to support the SQL
Standard concept of "functional dependency". In cases where some
columns are guaranteed to be unique you can leave them out of the
GROUP BY but still use them in the select list. This isn't MySQL's
behaviour of just allowing you to leave them out and hope that it
doesn't matter which row's values are used. The database has to
actually determine that it really doesn't matter. Typically that would
be because you've grouped by a set of columns which form the key of a
unique constraint, in which case every other column from that table
would also necessarily be the same since they would all come from the
same row of that table.

--
greg
http://mit.edu/~gsstark/resume.pdf


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, "Jean-Michel =?iso-8859-1?q?Pour=E9?=" <jm(at)poure(dot)com>
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 23:21:47
Message-ID: 200908211621.47589.jd@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Friday 21 August 2009 04:01:36 pm Andrew Dunstan wrote:
> Jean-Michel Pouré wrote:
> >> BTW, why don't we have a multi-argument version of CONCAT()? In 8.4,
> >> it
> >> would be possible ... I should add it to mysqlcompat library.
> >
> > yes. In PostgreSQL core ...
>
> No. That is exactly where it shouldn't go. And frankly, Drupal
> developers should stop writing non-portable code. Isn't there a Mysql
> mode that is supposed to conform to the standard, at least more than
> their default mode?

This is all solved with Drupal 7. We really shouldn't be burning time on this.

Joshua D. Drake
--
PostgreSQL.org Major Contributor
Command Prompt, Inc : 503-667-4564 - http://www.commandprompt.com/
Since 1997, Consulting, Development, Support, Training


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-21 23:38:52
Message-ID: 4A8F300C.9090706@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark wrote:
> If Postgres changed on this front it would be to support the SQL
> Standard concept of "functional dependency". In cases where some
> columns are guaranteed to be unique you can leave them out of the
> GROUP BY but still use them in the select list. This isn't MySQL's
> behaviour of just allowing you to leave them out and hope that it
> doesn't matter which row's values are used. The database has to
> actually determine that it really doesn't matter. Typically that would
> be because you've grouped by a set of columns which form the key of a
> unique constraint, in which case every other column from that table
> would also necessarily be the same since they would all come from the
> same row of that table.
>
>

That would make much more sense.

You can also get the effect of picking an arbitrary row now by use
max(column) or min(column) in place of the straight column.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-22 00:07:07
Message-ID: 13107.1250899627@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Jean-Michel Pour wrote:
>> Why can't PostgreSQL add the required field automatically? Could this be
>> added to PostgreSQL to-do-list?

> Isn't that contrary to the standard?

As of SQL99 it's supposed to be legal if you're grouping by a primary key
(or some other cases where the other columns can be proved functionally
dependent on the grouping columns, but that's the most useful one).
We haven't got round to implementing that, but I'm not sure that it
would make the Drupal code work anyway. Are they actually writing to
spec here, or just doing whatever mysql will let them?

BTW, I was under the impression there already *was* a TODO entry about
improving our standards compliance in this area. I can't find it in
the list right now, though.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-22 00:51:14
Message-ID: 20090822005114.GV23840@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Greg Stark (gsstark(at)mit(dot)edu) wrote:
> select distinct on (a) a,b,c from a ORDER BY a,b,c
>
> But Postgres insists you have an ORDER BY which has to agree with the
> DISTINCT ON columns and provide some extra column(s) to determine
> which values of b,c are chosen.

Not quite technically correct. You have to have an ORDER BY which
includes the columns inside the DISTINCT ON, but not any more than that.
At that point, the values you get for the other columns are arbitrary.
PG does *allow* you to provide other columns in the ORDER BY, so you can
specify which values from those other columns should be used.

I'm not advocating that we force another column to be used, nor do I
think you are, but I have to admit that I don't think I've ever used it
w/o other columns in the ORDER BY.

> If Postgres changed on this front it would be to support the SQL
> Standard concept of "functional dependency". In cases where some
> columns are guaranteed to be unique you can leave them out of the
> GROUP BY but still use them in the select list. This isn't MySQL's
> behaviour of just allowing you to leave them out and hope that it
> doesn't matter which row's values are used. The database has to
> actually determine that it really doesn't matter. Typically that would
> be because you've grouped by a set of columns which form the key of a
> unique constraint, in which case every other column from that table
> would also necessarily be the same since they would all come from the
> same row of that table.

Hrmm. That sounds kinda neat, but you'd still have to specify one of
the columns in the GROUP BY, I presume? Or could you just say 'GROUP
BY' without any columns, and have it GROUP BY the key of the table
you're using?

Thanks,

Stephen


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-22 01:01:35
Message-ID: 407d949e0908211801y2648dedcrd4d5f1494939e24c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/8/22 Stephen Frost <sfrost(at)snowman(dot)net>:
> Hrmm.  That sounds kinda neat, but you'd still have to specify one of
> the columns in the GROUP BY, I presume?  Or could you just say 'GROUP
> BY' without any columns, and have it GROUP BY the key of the table
> you're using?

You would have to specify the key. I think typically you would have
something like:

SELECT a.*, sum(b.col)
FROM a,b
GROUP BY a.pk

Since you have the primary key of a in your group by column you're
allowed to use any columns from a in your select list even if they're
not listed in the group by clause.

The database knows that it can use those values from any output row of
the group since they'll all come from the same orginal row of a. Or
possibly it could use some plan that doesn't involve multiplying that
data in the first place.

--
greg
http://mit.edu/~gsstark/resume.pdf


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-22 01:09:04
Message-ID: 20090822010904.GX23840@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Greg Stark (gsstark(at)mit(dot)edu) wrote:
> You would have to specify the key. I think typically you would have
> something like:
>
> SELECT a.*, sum(b.col)
> FROM a,b
> GROUP BY a.pk

Ahhh, ok, this makes more sense. This is SQL standard? Do we have a
TODO for it?

> The database knows that it can use those values from any output row of
> the group since they'll all come from the same orginal row of a. Or
> possibly it could use some plan that doesn't involve multiplying that
> data in the first place.

Right. It strikes me as a relativly small amount of work to get the
initial "just add the columns to the group by" logic implemented. I'd
start from exactly where that ERROR comes from, to minimize any
performance hit from having to go figure out if the columns in the GROUP
BY comprise a key. Doing something different in the planner based on
that could come later, if necessary.

I havn't looked at any code yet, but those who are familiar with these
areas- any gotchas you can think of off-hand to make this more difficult
than I'm hoping it is?

Thanks,

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-22 01:23:09
Message-ID: 14300.1250904189@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> Right. It strikes me as a relativly small amount of work to get the
> initial "just add the columns to the group by" logic implemented.

Well, no, you *aren't* adding the columns to the GROUP BY. You're just
not throwing the error. You really don't want to add redundant columns
to GROUP BY because it makes more work for the planner and executor
(unless the planner can figure out they're redundant, which in itself
takes work).

This is a bit trickier than it looks because it makes the validity of a
query dependent on the existence of an appropriate uniqueness
constraint; thus for example DROP CONSTRAINT might invalidate a stored
rule or view. See prior discussions.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-22 01:31:44
Message-ID: 20090822013144.GY23840@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > Right. It strikes me as a relativly small amount of work to get the
> > initial "just add the columns to the group by" logic implemented.
>
> Well, no, you *aren't* adding the columns to the GROUP BY. You're just
> not throwing the error. You really don't want to add redundant columns
> to GROUP BY because it makes more work for the planner and executor
> (unless the planner can figure out they're redundant, which in itself
> takes work).

Hmm, right. Possibly also add some bit of info to pass to something
down the line, if necessary.

> This is a bit trickier than it looks because it makes the validity of a
> query dependent on the existence of an appropriate uniqueness
> constraint; thus for example DROP CONSTRAINT might invalidate a stored
> rule or view. See prior discussions.

Ah, yes. Couldn't the dependency system be used to handle this though?
If you try to drop that constraint it'll complain unless you use cascade
which would drop the view? I'll try and find older discussions. Sadly,
I don't see it on the TODO. Perhaps I can add it.

Thanks,

Stephen


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-22 01:31:57
Message-ID: 407d949e0908211831r18cb498av5fc172b96210317f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/8/22 Stephen Frost <sfrost(at)snowman(dot)net>:
> * Greg Stark (gsstark(at)mit(dot)edu) wrote:
>> You would have to specify the key. I think typically you would have
>> something like:
>>
>> SELECT a.*, sum(b.col)
>>    FROM a,b
>>  GROUP BY a.pk
>
> Ahhh, ok, this makes more sense.  This is SQL standard?

Incidentally it makes even more sense that MySQL would do what they do
when you remember that they didn't have subqueries until recently. So
MySQL programmers had all become accustomed to the circumlocutions
like:

SELECT a.*
FROM a left join b USING (a.b_id = b.id)
WHERE b.id IS NULL
GROUP BY a.id

to express the much simpler

select * from a where b_id in (select id from b)

So not many uses of it in MySQL actually *would* be valid if we
implemented the shortcut. But MySQL doesn't enforce that so it serves
that purpose as well as what we get out of DISTINCT ON.

--
greg
http://mit.edu/~gsstark/resume.pdf


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-22 01:36:06
Message-ID: 20090822013605.GZ23840@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Greg Stark (gsstark(at)mit(dot)edu) wrote:
> So not many uses of it in MySQL actually *would* be valid if we
> implemented the shortcut. But MySQL doesn't enforce that so it serves
> that purpose as well as what we get out of DISTINCT ON.

That's probably a good thing- if they're valid then we'd probably return
something different which would be a suprise. I'm not really looking at
this from the "help MySQL apps" point of view.. It just strikes me as
something nice to have.

Thanks,

Stephen


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-22 01:41:09
Message-ID: 407d949e0908211841p6968d7bbv6969f08b9135876f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/8/22 Stephen Frost <sfrost(at)snowman(dot)net>:
>> This is a bit trickier than it looks because it makes the validity of a
>> query dependent on the existence of an appropriate uniqueness
>> constraint; thus for example DROP CONSTRAINT might invalidate a stored
>> rule or view.  See prior discussions.
>
> Ah, yes.  Couldn't the dependency system be used to handle this though?
> If you try to drop that constraint it'll complain unless you use cascade
> which would drop the view?  I'll try and find older discussions.  Sadly,
> I don't see it on the TODO.  Perhaps I can add it.

All this wasn't possible before 8.3 so there are a whole slew of
optimizations that have been kind of waiting in the wings until we got
that infrastructure.

The first step is probably to do the opposite of what we're talking
about here: cases where people *have* added extra columns to the GROUP
BY key so they can use those columns in their select list. We can
remove those columns from the sort or hash comparison key if there's a
column (or columns) which is a unique constraint key for the same
source. Similarly we can remove columns from an ORDER BY if the order
key has earlier columns which are already a unique key for the same
source.

That would be fairly simple and it would provide a good test case for
the dependency tracking stuff and plan invalidation triggered by
constraint ddl. It wouldn't be a massive performance change but it
would help some cases where the sort node is comparing a lot of
redundant keys.

--
greg
http://mit.edu/~gsstark/resume.pdf


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-22 01:55:07
Message-ID: 25669.1250906107@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> The first step is probably to do the opposite of what we're talking
> about here: cases where people *have* added extra columns to the GROUP
> BY key so they can use those columns in their select list. We can
> remove those columns from the sort or hash comparison key if there's a
> column (or columns) which is a unique constraint key for the same
> source. Similarly we can remove columns from an ORDER BY if the order
> key has earlier columns which are already a unique key for the same
> source.

This is something we could only do at plan time --- if we do it at parse
time we risk making a robust query into one that will break when
somebody drops a constraint. So it's not really the inverse of the
other case.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Feedback about Drupal SQL debugging
Date: 2009-08-24 20:17:23
Message-ID: 1251145044.10096.20.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On fre, 2009-08-21 at 20:07 -0400, Tom Lane wrote:
> As of SQL99 it's supposed to be legal if you're grouping by a primary key
> (or some other cases where the other columns can be proved functionally
> dependent on the grouping columns, but that's the most useful one).
> We haven't got round to implementing that, but I'm not sure that it
> would make the Drupal code work anyway. Are they actually writing to
> spec here, or just doing whatever mysql will let them?
>
> BTW, I was under the impression there already *was* a TODO entry about
> improving our standards compliance in this area. I can't find it in
> the list right now, though.

I added "Add support for functional dependencies" just now, with a
comment how this relates to GROUP BY.