Replacing plpgsql's lexer

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Replacing plpgsql's lexer
Date: 2009-04-14 20:37:06
Message-ID: 18653.1239741426@sss.pgh.pa.us

Whichever way the current discussion about Unicode literals turns out,
it's clear that plpgsql is not up to speed on matching the core lexer's
behavior --- it's wrong anyway with respect to
standard_conforming_strings.

I had earlier speculated semi-facetiously about ripping out the plpgsql
lexer altogether, but the more I think about it the less silly the idea
looks. Suppose that we change the core lexer so that the keyword lookup
table it's supposed to use is passed to scanner_init() rather than being
hard-wired in. Then make plpgsql call the core lexer using its own
keyword table. Everything else would match core lexical behavior
automatically. The special behavior that we do want, such as being
able to construct a string representing a desired subrange of the input,
could all be handled in plpgsql-specific wrapper code.
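
A rough illustration of that idea, as a toy Python sketch rather than anything resembling the real flex-based scanner: the keyword table is a parameter of the scanner instead of being hard-wired, and a wrapper uses token offsets to recover a subrange of the input. The names toy_scanner, source_between and both keyword sets are hypothetical, chosen only for this example.

```python
import re

# Hypothetical, heavily abbreviated keyword tables (assumptions for the sketch).
CORE_KEYWORDS = frozenset({"select", "from", "where"})
PLPGSQL_KEYWORDS = frozenset({"begin", "end", "if", "then", "loop"})

# One shared set of lexical rules: words, integers, any other non-space char.
_TOKEN = re.compile(r"(?P<word>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<other>\S)")

def toy_scanner(text, keywords):
    """Yield (kind, lexeme, start, end); only the keyword table varies."""
    for m in _TOKEN.finditer(text):
        lexeme = m.group()
        if m.lastgroup == "word":
            kind = "KEYWORD" if lexeme.lower() in keywords else "IDENT"
        else:
            kind = m.lastgroup.upper()
        yield kind, lexeme, m.start(), m.end()

def source_between(text, first, last):
    """Wrapper-level helper: raw input spanned by two tokens, inclusive."""
    return text[first[2]:last[3]]
```

With the plpgsql table, BEGIN and END come back as keywords; with the core table in this toy, the same words come back as plain identifiers, while every other lexical rule is shared.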

I've just spent a few minutes looking for trouble spots in this theory,
and so far the only real ugliness I can see is that plpgsql treats
":=" and ".." as single tokens whereas the core would parse them as two
tokens. We could hack the core lexer to have an additional switch that
controls that. Or maybe just make it always return them as single
tokens --- AFAICS, neither combination is legal in core SQL anyway,
so this would only result in a small change in the exact syntax error
you get if you write such a thing in core SQL.
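
The two options can be sketched with a toy tokenizer in Python. This is only an illustration under simplifying assumptions (no strings, comments, or multi-character SQL operators); the function name tokenize and the flag merge_compound are hypothetical stand-ins for the "additional switch" mentioned above.

```python
import re

def tokenize(text, merge_compound=True):
    """Split text into lexemes; optionally treat ':=' and '..' as one token."""
    # When the switch is on, the compound operators are tried first so they
    # win over the single-character fallback rule (\S).
    compound = r":=|\.\.|" if merge_compound else ""
    pattern = re.compile(compound + r"[A-Za-z_]\w*|\d+|\S")
    return pattern.findall(text)
```

For "select a .. 2" the merged mode yields "..", while the unmerged mode yields two separate "." tokens, which is the only difference a core-SQL user would notice (in the text of the syntax error).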

Another trouble spot is the #option syntax, but that could be handled
by a special-purpose prescan, or just dropped altogether; it's not like
we've ever used that for anything but debugging.
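
A special-purpose prescan could look roughly like the Python sketch below. It assumes, purely for illustration, that #option lines appear only at the very start of the function body, one per line; prescan_options is a hypothetical name, and a real version would also need to preserve line numbers for error reports.

```python
def prescan_options(source):
    """Split leading '#option NAME' lines off a function body.

    Returns (options, remaining_source). Leading blank lines are dropped
    along with the option lines in this simplified sketch.
    """
    options = []
    lines = source.splitlines(keepends=True)
    index = 0
    while index < len(lines):
        parts = lines[index].split()
        if parts and parts[0] == "#option":
            options.append(" ".join(parts[1:]))
        elif parts:
            break  # first real line of the body: stop prescanning
        index += 1
    return options, "".join(lines[index:])
```

The remaining source can then be handed to the shared lexer, which never has to know about the # syntax.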

It looks like this might take about a day's worth of work (IOW two
or three days real time) to get done.

Normally I'd only consider doing such a thing during development phase,
but since we're staring at at least one and maybe two bugs that are
going to be hard to fix in any materially-less-intrusive way, I'm
thinking about doing it now. Theoretically this change shouldn't break
any working code, so letting it hit the streets in 8.4beta2 doesn't seem
totally unreasonable.

Comments, objections, better ideas?

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-14 20:56:56
Message-ID: 603c8f070904141356o7522e8fbu7f45e6d10e3dc139@mail.gmail.com

On Tue, Apr 14, 2009 at 4:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Whichever way the current discussion about Unicode literals turns out,
> it's clear that plpgsql is not up to speed on matching the core lexer's
> behavior --- it's wrong anyway with respect to
> standard_conforming_strings.
>
> I had earlier speculated semi-facetiously about ripping out the plpgsql
> lexer altogether, but the more I think about it the less silly the idea
> looks.  Suppose that we change the core lexer so that the keyword lookup
> table it's supposed to use is passed to scanner_init() rather than being
> hard-wired in.  Then make plpgsql call the core lexer using its own
> keyword table.  Everything else would match core lexical behavior
> automatically.  The special behavior that we do want, such as being
> able to construct a string representing a desired subrange of the input,
> could all be handled in plpgsql-specific wrapper code.
>
> I've just spent a few minutes looking for trouble spots in this theory,
> and so far the only real ugliness I can see is that plpgsql treats
> ":=" and ".." as single tokens whereas the core would parse them as two
> tokens.  We could hack the core lexer to have an additional switch that
> controls that.  Or maybe just make it always return them as single
> tokens --- AFAICS, neither combination is legal in core SQL anyway,
> so this would only result in a small change in the exact syntax error
> you get if you write such a thing in core SQL.
>
> Another trouble spot is the #option syntax, but that could be handled
> by a special-purpose prescan, or just dropped altogether; it's not like
> we've ever used that for anything but debugging.
>
> It looks like this might take about a day's worth of work (IOW two
> or three days real time) to get done.
>
> Normally I'd only consider doing such a thing during development phase,
> but since we're staring at at least one and maybe two bugs that are
> going to be hard to fix in any materially-less-intrusive way, I'm
> thinking about doing it now.  Theoretically this change shouldn't break
> any working code, so letting it hit the streets in 8.4beta2 doesn't seem
> totally unreasonable.
>
> Comments, objections, better ideas?

All this sounds good. As for how to handle := and .., I think making
them lex the same way in PL/pgsql and core SQL would be a good thing.

...Robert


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-14 21:06:32
Message-ID: 1239743192.16396.147.camel@ebony.2ndQuadrant


On Tue, 2009-04-14 at 16:37 -0400, Tom Lane wrote:
> Comments, objections, better ideas?

Please, if you do this, make it optional.

Potentially changing the behaviour of thousands of functions just to fix
a rare bug will not endear us to our users. The bug may be something
that people are relying on in some subtle way, ugly as that sounds.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-14 21:07:49
Message-ID: 49E4FB25.9050602@dunslane.net

Robert Haas wrote:
> All this sounds good. As for how to handle := and .., I think making
> them lex the same way in PL/pgsql and core SQL would be a good thing.

They don't have any significance in core SQL. What would we do with the
lexeme?

ISTR we've used some hacks in the past to split lexemes into pieces, and
presumably we'd have to do something similar with these.

The only thing that makes me nervous about this is that we're very close
to Beta. OTOH, this is one area the regression suite should give a
fairly good workout to.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-14 21:32:51
Message-ID: 19917.1239744771@sss.pgh.pa.us

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Robert Haas wrote:
>> All this sounds good. As for how to handle := and .., I think making
>> them lex the same way in PL/pgsql and core SQL would be a good thing.

> They don't have any significance in core SQL. What would we do with the
> lexeme?

It would just fail --- the core grammar will have no production that can
accept it. Right offhand I think the only difference is that instead of

regression=# select a .. 2;
ERROR: syntax error at or near "."
LINE 1: select a .. 2;
                 ^

you'd see

regression=# select a .. 2;
ERROR: syntax error at or near ".."
LINE 1: select a .. 2;
                 ^

i.e., it acts like one token, not two, in the error message.

This solution would become problematic if the core grammar ever had a
meaning for := or .. that required treating them as two tokens (eg,
the grammar allowed this sequence with whitespace between). I don't
think that's very likely though; and if it did happen we could fix it
with the aforementioned control switch.

> The only thing that makes me nervous about this is that we're very close
> to Beta. OTOH, this is one area the regression suite should give a
> fairly good workout to.

Yeah, I'd rather have done it before beta1, but too late. The other
solution still entails massive changes to the plpgsql lexer, so it
doesn't really look like much lower risk. AFAICS the practical
alternatives are a reimplementation in beta2, or no fix until 8.5.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-14 22:29:38
Message-ID: 22440.1239748178@sss.pgh.pa.us

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Tue, 2009-04-14 at 16:37 -0400, Tom Lane wrote:
>> Comments, objections, better ideas?

> Please, if you do this, make it optional.

I don't think making the plpgsql lexer pluggable is realistic.

> Potentially changing the behaviour of thousands of functions just to fix
> a rare bug will not endear us to our users. The bug may be something
> that people are relying on in some subtle way, ugly as that sounds.

That's why I don't want to change it in a minor release. In a major
release, however, it's fair game.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-14 23:49:56
Message-ID: 200904142349.n3ENnur25728@momjian.us

Tom Lane wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > On Tue, 2009-04-14 at 16:37 -0400, Tom Lane wrote:
> >> Comments, objections, better ideas?
>
> > Please, if you do this, make it optional.
>
> I don't think making the plpgsql lexer pluggable is realistic.
>
> > Potentially changing the behaviour of thousands of functions just to fix
> > a rare bug will not endear us to our users. The bug may be something
> > that people are relying on in some subtle way, ugly as that sounds.
>
> That's why I don't want to change it in a minor release. In a major
> release, however, it's fair game.

Well, this bug has existed long before 8.4 so we could just leave it for
8.5, and it is not like we have had tons of complaints; the only
complaint I saw was one from March, 2008.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-14 23:54:07
Message-ID: 24524.1239753247@sss.pgh.pa.us

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Well, this bug has existed long before 8.4 so we could just leave it for
> 8.5, and it is not like we have had tons of complaints; the only
> complaint I saw was one from March, 2008.

We had one last week, which is what prompted me to start looking at the
plpgsql lexer situation in the first place. Also, if the unicode
literal situation doesn't change, that's going to be problematic as
well.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 00:02:45
Message-ID: 20090415000245.GO8123@tamriel.snowman.net

* Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> Well, this bug has existed long before 8.4 so we could just leave it for
> 8.5, and it is not like we have had tons of complaints; the only
> complaint I saw was one from March, 2008.

I think it's a good thing to do in general. I'm also concerned about
whether it will impact the plpgsql functions we have (which are pretty
numerous..) but in the end I'd rather have it fixed in 8.4 than possibly
delayed indefinitely (after all, if it's in 8.4, why fix it for 8.5?).

Thanks,

Stephen


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 06:57:50
Message-ID: 1239778670.16396.164.camel@ebony.2ndQuadrant


On Tue, 2009-04-14 at 18:29 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > On Tue, 2009-04-14 at 16:37 -0400, Tom Lane wrote:
> >> Comments, objections, better ideas?
>
> > Please, if you do this, make it optional.
>
> I don't think making the plpgsql lexer pluggable is realistic.

Doesn't sound easy, no. (I didn't suggest pluggable, just optional).

> > Potentially changing the behaviour of thousands of functions just to fix
> > a rare bug will not endear us to our users. The bug may be something
> > that people are relying on in some subtle way, ugly as that sounds.
>
> That's why I don't want to change it in a minor release. In a major
> release, however, it's fair game.

If we want to make easy upgrades a reality, this is the type of issue we
must consider. Not much point having perfect binary upgrades if all your
functions start behaving differently after upgrade and then you discover
there isn't a binary downgrade path...

Rather than come up with specific solutions, let me just ask the
question: Is there a workaround for people caught by these changes?
Let's plan that alongside the change itself, so we have a reserve
'chute.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 08:36:43
Message-ID: 49E59C9B.7070809@enterprisedb.com

Simon Riggs wrote:
> On Tue, 2009-04-14 at 18:29 -0400, Tom Lane wrote:
>> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>>> Potentially changing the behaviour of thousands of functions just to fix
>>> a rare bug will not endear us to our users. The bug may be something
>>> that people are relying on in some subtle way, ugly as that sounds.
>> That's why I don't want to change it in a minor release. In a major
>> release, however, it's fair game.
>
> If we want to make easy upgrades a reality, this is the type of issue we
> must consider. Not much point having perfect binary upgrades if all your
> functions start behaving differently after upgrade and then you discover
> there isn't a binary downgrade path...
>
> Rather than come up with specific solutions, let me just ask the
> question: Is there a workaround for people caught by these changes?
> Let's plan that alongside the change itself, so we have a reserve
> 'chute.

Extract the source of the offending plpgsql function using e.g. pg_dump,
modify it so that it works again, and restore the function. There's your
workaround.

I haven't been following what the issues we have with the current
plpgsql lexer are, so I'm not sure what I think of the plan as a whole.
Sharing the main lexer seems like a good idea, but it also seems like
it's way too late in the release cycle for such changes. But then again,
if we have issues that need to be fixed anyway, it might well be the
best way to fix them.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 09:12:22
Message-ID: 1239786742.16396.194.camel@ebony.2ndQuadrant


On Wed, 2009-04-15 at 11:36 +0300, Heikki Linnakangas wrote:

> Extract the source of the offending plpgsql function using e.g.
> pg_dump, modify it so that it works again, and restore the function.
> There's your workaround.

Forcing manual re-editing of an unknown number of lines of code is not a
useful workaround, it's just the default.

How do you know which is the offending function? If we force a full
application retest we put in place a significant barrier to upgrade.
That isn't useful for us as developers, nor is it useful for users.

I'm happy for Tom to make changes now; delay has no advantage. If we
have to add some lines of code/complexity to help our users then it
seems a reasonable thing to do rather than keeping our code pure and
demanding everybody else make changes.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 09:56:37
Message-ID: 4136ffa0904150256p5c160c34mb7f6584accef4f5a@mail.gmail.com

On Wed, Apr 15, 2009 at 10:12 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> How do you know which is the offending function? If we force a full
> application retest we put in place a significant barrier to upgrade.
> That isn't useful for us as developers, nor is it useful for users.

This is a fundamental conflict, not one that has a single simple answer.

However this seems like a strange place to pick your battle. Something
as low-level as the lexer is very costly to provide multiple
interfaces to. It's basically impossible short of simply providing two
different plpgsql languages -- something which won't scale at all if
we have to do it every time we make a syntax change to the language.

I'm actually concerned that we've become *too* conservative. Pretty
much any change that doesn't have Tom's full support and credibility
standing behind it ends up being criticized on the basis that we don't
know precisely what effects it will have in every possible scenario.

One of free software's big advantages over commercial software is that
it moves so much more quickly. Oracle, AIX, Windows, etc are burdened
by hundreds of layers of backwards-compatibility which take up a huge
portion of their development and Q/A effort. The reason Linux,
Postgres, and others have been able to come up so quickly and overtake
them is partly because we don't worry about such things.

As far as I'm concerned commercial support companies can put effort
into developing backwards-compatibility modules which add no long-term
value for their paying customers who need it today while the free
software developers can keep improving the software for new users.

--
greg


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 09:57:21
Message-ID: 49E5AF81.90002@enterprisedb.com

Simon Riggs wrote:
> On Wed, 2009-04-15 at 11:36 +0300, Heikki Linnakangas wrote:
>
>> Extract the source of the offending plpgsql function using e.g.
>> pg_dump, modify it so that it works again, and restore the function.
>> There's your workaround.
>
> Forcing manual re-editing of an unknown number of lines of code is not a
> useful workaround, it's just the default.
>
> How do you know which is the offending function? If we force a full
> application retest we put in place a significant barrier to upgrade.
> That isn't useful for us as developers, nor is it useful for users.

If I understood correctly, the proposed change is not supposed to have
any user-visible effects. It doesn't force a full application retest any
more than any of the other changes that have gone into 8.4.

We're talking about what we'll do or tell the users to do if we missed
something. By definition we don't know what we've missed, so I don't
think we can come up with a more specific solution than that.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 10:33:19
Message-ID: 1239791599.23905.10.camel@ebony.2ndQuadrant


On Wed, 2009-04-15 at 10:56 +0100, Greg Stark wrote:
> On Wed, Apr 15, 2009 at 10:12 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >
> > How do you know which is the offending function? If we force a full
> > application retest we put in place a significant barrier to upgrade.
> > That isn't useful for us as developers, nor is it useful for users.
>
> This is a fundamental conflict, not one that has a single simple answer.
>
> However this seems like a strange place to pick your battle.

I think you are right that you perceive a fundamental conflict and most
things I say become battles. That is not my choice and I will withdraw
from further discussion. My point has been made clearly and has not been
made to cause conflict. I've better things to do with my time than that,
though it's a shame you think that of me.

> As far as I'm concerned commercial support companies can put effort
> into developing backwards-compatibility modules which add no long-term
> value for their paying customers who need it today while the free
> software developers can keep improving the software for new users.

We will all doubtless make money from difficult upgrades, though that is
not my choice, nor that of my customers.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 10:50:08
Message-ID: 4136ffa0904150350j222f1e6brfdca27652d3986fd@mail.gmail.com

On Wed, Apr 15, 2009 at 11:33 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
>> This is a fundamental conflict, not one that has a single simple answer.
>>
>> However this seems like a strange place to pick your battle.
>
> I think you are right that you perceive a fundamental conflict and most
> things I say become battles. That is not my choice and I will withdraw
> from further discussion. My point has been made clearly and has not been
> made to cause conflict. I've better things to do with my time than that,
> though it's a shame you think that of me.

Uhm, I didn't intend this as criticism at all, except inasmuch as the
judgement about whether the plpgsql lexer was a good choice of place
to make this stand. The use of "battle" was only because of the idiom
"pick your battle".

I think we are in general too conservative about making changes and
you are concerned that we're not giving enough thought to the upgrade
pain and should be more conservative. We can talk about general
policies but ultimately we'll have to debate each change on its
merits.

In this case it would help if we described the specific kinds of code
affected and the consequences for users. I'm not sure we're all on the
same page.

I think changing the lexer to match the SQL lexer will only affect
string constants and only if standard_conforming_strings is enabled,
and only those instances which are handled internally to plpgsql and
not passed to the SQL engine. So the fix will pretty much always be
local to the behaviour change. It's possible for an escaped string to
need an E'' and for the backslash to migrate to other parts of the
code before triggering a bug (or possibly even get stored in the
database and cause a problem in other parts of the application). But
it should still be pretty straightforward to find the original source
of the string and also pretty easy to recognize string constants
throughout the source code.

As it currently stands a programmer sometimes has to use E'\x' and
sometimes has to use '\x' depending on whether the plpgsql is lexing
the string or is passing it to the SQL engine unlexed. It's not
obvious which parts get handled in which way to a user since some
constructs are handled as SQL which don't appear to be SQL and vice
versa -- at least it's not obvious to me even having read the source
in the past.
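
To make the '\x' versus E'\x' distinction concrete, here is a small Python sketch of how a single-quoted literal is decoded depending on standard_conforming_strings. It is a simplified model, not the actual scanner code: only a few simple backslash escapes are handled (no octal, hex, or Unicode forms), and the names decode_literal and SIMPLE_ESCAPES are hypothetical.

```python
SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", "'": "'"}

def decode_literal(literal, standard_conforming_strings):
    """Decode 'text' or E'text' into the string value it denotes."""
    escaped = literal[:1] in ("E", "e")
    body = literal[2:-1] if escaped else literal[1:-1]
    # Backslashes are escapes in E'' strings always, and in plain strings
    # only when standard_conforming_strings is off.
    process_backslashes = escaped or not standard_conforming_strings
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "'" and body[i:i + 2] == "''":
            out.append("'")  # doubled quote means one literal quote
            i += 2
        elif ch == "\\" and process_backslashes and i + 1 < len(body):
            out.append(SIMPLE_ESCAPES.get(body[i + 1], body[i + 1]))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

A lexer that ignores the setting, or applies it inconsistently, gives a different value for the same source text, which is the kind of mismatch between plpgsql and the core lexer under discussion.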

If I understand things correctly I think the change improves the
language for future users by far more than it imposes maintenance
costs on existing users, especially considering that anyone depending
on '\x' strings with standard_conforming_strings enabled is probably
getting it wrong in some places without realizing it anyway.

--
greg


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 12:27:34
Message-ID: 603c8f070904150527h7690bb39u99364b7601aff545@mail.gmail.com

On Wed, Apr 15, 2009 at 5:56 AM, Greg Stark <stark(at)enterprisedb(dot)com> wrote:
> On Wed, Apr 15, 2009 at 10:12 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>
>> How do you know which is the offending function? If we force a full
>> application retest we put in place a significant barrier to upgrade.
>> That isn't useful for us as developers, nor is it useful for users.
>
> This is a fundamental conflict, not one that has a single simple answer.
>
> However this seems like a strange place to pick your battle. Something
> as low-level as the lexer is very costly to provide multiple
> interfaces to. It's basically impossible short of simply providing two
> different plpgsql languages -- something which won't scale at all if
> we have to do it every time we make a syntax change to the language.

Completely agreed.

> I'm actually concerned that we've become *too* conservative. Pretty
> much any change that doesn't have Tom's full support and credibility
> standing behind it ends up being criticized on the basis that we don't
> know precisely what effects it will have in every possible scenario.

I think we've become too conservative in some areas and not
conservative enough in others. In particular, we're not very
conservative AT ALL about changes to the on-disk format - which is
like unto a bullet through the head for in-place upgrade. And we
sometimes make behavior changes that have potentially catastrophic
user consequences (like that one to TRUNCATE... which one, you ask?
ah, well, you'd better not use TRUNCATE in 8.4 until you RTFM then),
but then we'll have an argument about whether it's OK to make some
change where it's difficult to imagine the user impact being all that
severe, like:

- this one
- removing the special case for %% in the log_filename
- forward-compatible, backward-compatible improvements to CREATE OR REPLACE VIEW
- lots of others

So it seems that there is no consistent heuristic (other than, as you
say, Tom's approval or lack thereof) applied to these changes.

...Robert


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 12:27:38
Message-ID: 49E5D2BA.7030009@dunslane.net

Simon Riggs wrote:
> How do you know which is the offending function? If we force a full
> application retest we put in place a significant barrier to upgrade.
> That isn't useful for us as developers, nor is it useful for users.

We support back branches for a long time for a reason. Nobody in their
right mind should upgrade to a new version without first
extensively testing (and if necessary adjusting) their application on
it. This is true regardless of this issue and will be true for every
release.

cheers

andrew


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 13:31:08
Message-ID: 603c8f070904150631o3818d128wd37266d0ebbce667@mail.gmail.com

On Wed, Apr 15, 2009 at 5:56 AM, Greg Stark <stark(at)enterprisedb(dot)com> wrote:
> On Wed, Apr 15, 2009 at 10:12 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>
>> How do you know which is the offending function? If we force a full
>> application retest we put in place a significant barrier to upgrade.
>> That isn't useful for us as developers, nor is it useful for users.
>
> This is a fundamental conflict, not one that has a single simple answer.
>
> However this seems like a strange place to pick your battle. Something
> as low-level as the lexer is very costly to provide multiple
> interfaces to. It's basically impossible short of simply providing two
> different plpgsql languages -- something which won't scale at all if
> we have to do it every time we make a syntax change to the language.

Completely agreed.

> I'm actually concerned that we've become *too* conservative. Pretty
> much any change that doesn't have Tom's full support and credibility
> standing behind it ends up being criticized on the basis that we don't
> know precisely what effects it will have in every possible scenario.

I think we've become too conservative in some areas and not
conservative enough in others. In particular, we're not very
conservative AT ALL about changes to the on-disk format - which is
like unto a bullet through the head for in-place upgrade. And we
sometimes make behavior changes that have potentially catastrophic
user consequences (like that one to TRUNCATE... which one, you ask?
ah, well, you'd better not use TRUNCATE in 8.4 until you RTFM then),
but then we'll have an argument about whether it's OK to make some
change where it's difficult to imagine the user impact being all that
severe - like this one, for example (or removing the special case for
no %-escapes in log_filename).

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 15:45:06
Message-ID: 11501.1239810306@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> We support back branches for a long time for a reason.

I think that's really the bottom line here. If we insist on new major
releases always being bug-compatible with prior releases, our ability to
improve the software will go to zero. The solution we have opted for
instead is to support back branches for a long time and to avoid making
that type of change in a back branch.

regards, tom lane


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-15 15:53:37
Message-ID: 1239810817.7840.235.camel@jd-laptop.pragmaticzealot.org
Lists: pgsql-hackers

On Wed, 2009-04-15 at 11:45 -0400, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> > We support back branches for a long time for a reason.
>
> I think that's really the bottom line here. If we insist on new major
> releases always being bug-compatible with prior releases, our ability to
> improve the software will go to zero. The solution we have opted for
> instead is to support back branches for a long time and to avoid making
> that type of change in a back branch.
>

+1

Joshua D. Drake

> regards, tom lane
>
--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-17 16:12:12
Message-ID: 23254.1239984732@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> I had earlier speculated semi-facetiously about ripping out the plpgsql
> lexer altogether, but the more I think about it the less silly the idea
> looks.

This little project crashed and burned upon remembering that plpgsql
invokes raw_parser() to syntax-check each chunk of SQL that it extracts.
If plpgsql were using the main lexer, that would mean recursive use of
the lexer --- and it's not re-entrant.
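
To make the failure mode concrete, here is a minimal sketch in Python rather than flex/C. Every name in it is hypothetical and nothing here is taken from the PostgreSQL sources; it only assumes that a non-re-entrant scanner keeps its buffer and cursor in shared global state, so a nested scan started from inside an outer scan overwrites the outer scan's position.

```python
# Hypothetical illustration only; none of these names exist in PostgreSQL.

# Non-re-entrant style: a single module-level buffer and cursor, similar in
# spirit to the global state of a classic generated scanner.
_tokens = []
_pos = 0


def scan_init(text):
    """Point the shared scanner at new input (whitespace-separated tokens)."""
    global _tokens, _pos
    _tokens, _pos = text.split(), 0


def next_token():
    """Return the next token from the shared scanner, or None at end of input."""
    global _pos
    if _pos >= len(_tokens):
        return None
    tok = _tokens[_pos]
    _pos += 1
    return tok


def outer_scan_shared(text):
    """Scan text, running a nested 'syntax check' on the same shared scanner."""
    scan_init(text)
    seen = []
    tok = next_token()
    while tok is not None:
        seen.append(tok)
        if tok == "SELECT":
            # Nested use: this clobbers the outer buffer and cursor.
            scan_init("SELECT 1")
            while next_token() is not None:
                pass
        tok = next_token()
    return seen


class Scanner:
    """Re-entrant style: each scan owns its buffer and cursor."""

    def __init__(self, text):
        self.tokens = text.split()
        self.pos = 0

    def next_token(self):
        if self.pos >= len(self.tokens):
            return None
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok


def outer_scan_reentrant(text):
    """Same control flow, but the nested check uses its own Scanner object."""
    outer = Scanner(text)
    seen = []
    tok = outer.next_token()
    while tok is not None:
        seen.append(tok)
        if tok == "SELECT":
            inner = Scanner("SELECT 1")
            while inner.next_token() is not None:
                pass
        tok = outer.next_token()
    return seen
```

With the input "x := SELECT 1 ; y", the shared-state version stops after "SELECT" because the nested scan leaves the cursor at the end of its own, shorter buffer; the per-object version sees all six tokens.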

We could think about making the main lexer re-entrant, but that would
involve a bump in the minimum required flex version (I don't know when
%option reentrant got added, but it's not in 2.5.4). And it definitely
doesn't seem like something to be doing during beta.

Getting rid of the requirement for recursion doesn't look palatable
either. We don't want to delay the syntax check for reasons explained
in check_sql_expr()'s comments; and that's not the only source of
recursion anyway --- plpgsql_parse_datatype does it too, and there could
be other places.

So I think we are down to a choice of doing nothing for 8.4, or teaching
the existing plpgsql lexer about standard_conforming_strings. Assuming
the current proposal for U& literals holds up, it should not be
necessary for plpgsql to know about those explicitly as long as it obeys
standard_conforming_strings, so this might not be too horrid a project.
I'll take a look at that next.

regards, tom lane


From: David Fetter <david(at)fetter(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-17 16:58:06
Message-ID: 20090417165806.GB10700@fetter.org
Lists: pgsql-hackers

On Fri, Apr 17, 2009 at 12:12:12PM -0400, Tom Lane wrote:
> I wrote:
> > I had earlier speculated semi-facetiously about ripping out the
> > plpgsql lexer altogether, but the more I think about it the less
> > silly the idea looks.
>
> This little project crashed and burned upon remembering that plpgsql
> invokes raw_parser() to syntax-check each chunk of SQL that it
> extracts. If plpgsql were using the main lexer, that would mean
> recursive use of the lexer --- and it's not re-entrant.
>
> We could think about making the main lexer re-entrant, but that
> would involve a bump in the minimum required flex version (I don't
> know when %option reentrant got added, but it's not in 2.5.4). And
> it definitely doesn't seem like something to be doing during beta.
>
> Getting rid of the requirement for recursion doesn't look palatable
> either. We don't want to delay the syntax check for reasons
> explained in check_sql_expr()'s comments; and that's not the only
> source of recursion anyway --- plpgsql_parse_datatype does it too,
> and there could be other places.
>
> So I think we are down to a choice of doing nothing for 8.4, or
> teaching the existing plpgsql lexer about
> standard_conforming_strings. Assuming the current proposal for U&
> literals holds up, it should not be necessary for plpgsql to know
> about those explicitly as long as it obeys
> standard_conforming_strings, so this might not be too horrid a
> project. I'll take a look at that next.

Speaking of standard_conforming_strings, I know it's late, but if we
make it a requirement now, a lot of problems just go away. Yes, it's
inconvenient, but we're making lots of big changes, so one more
shouldn't halt adoption.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-17 17:01:39
Message-ID: 20090417170139.GI7709@alvh.no-ip.org
Lists: pgsql-hackers

David Fetter wrote:

> Speaking of standard_conforming_strings, I know it's late, but if we
> make it a requirement now, a lot of problems just go away. Yes, it's
> inconvenient, but we're making lots of big changes, so one more
> shouldn't halt adoption.

16 days too late ...

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: David Fetter <david(at)fetter(dot)org>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-17 17:06:32
Message-ID: 20090417170632.GC10700@fetter.org
Lists: pgsql-hackers

On Fri, Apr 17, 2009 at 01:01:39PM -0400, Alvaro Herrera wrote:
> David Fetter wrote:
>
> > Speaking of standard_conforming_strings, I know it's late, but if
> > we make it a requirement now, a lot of problems just go away.
> > Yes, it's inconvenient, but we're making lots of big changes, so
> > one more shouldn't halt adoption.
>
> 16 days too late ...

Depends. If we've found show-stopping bugs, as it appears we may have
done, in not requiring standard_conforming_strings, we can't just
pull a MySQL and ship anyhow.

Cheers,
David.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-17 18:03:45
Message-ID: 49E8C481.7070700@dunslane.net
Lists: pgsql-hackers

Alvaro Herrera wrote:
> David Fetter wrote:
>
>
>> Speaking of standard_conforming_strings, I know it's late, but if we
>> make it a requirement now, a lot of problems just go away. Yes, it's
>> inconvenient, but we're making lots of big changes, so one more
>> shouldn't halt adoption.
>>
>
> 16 days too late ...
>
>

More like several months plus 16 days, I'd say.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Fetter <david(at)fetter(dot)org>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-17 18:07:55
Message-ID: 25507.1239991675@sss.pgh.pa.us
Lists: pgsql-hackers

David Fetter <david(at)fetter(dot)org> writes:
> Depends. If we've found show-stopping bugs, as it appears we may have
> done, in not requiring standard_conforming_strings, we can't just
> pull a MySQL and ship anyhow.

It's hardly a "show stopping bug", considering it's been there since
standard_conforming_strings was invented.

regards, tom lane


From: David Fetter <david(at)fetter(dot)org>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-17 18:08:11
Message-ID: 20090417180811.GD10700@fetter.org
Lists: pgsql-hackers

On Fri, Apr 17, 2009 at 02:03:45PM -0400, Andrew Dunstan wrote:
> Alvaro Herrera wrote:
>> David Fetter wrote:
>>
>>> Speaking of standard_conforming_strings, I know it's late, but if
>>> we make it a requirement now, a lot of problems just go away.
>>> Yes, it's inconvenient, but we're making lots of big changes, so
>>> one more shouldn't halt adoption.
>>
>> 16 days too late ...
>
> More like several months plus 16 days, I'd say.

If our string problems turn out to be fixable short of making this
option mandatory, possibly. It would depend on how fragile that
collection of fixes turns out to be.

We haven't shipped with glaring known-broken stuff in userland before.
Are we going to start now?

Cheers,
David.


From: David Fetter <david(at)fetter(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-17 18:09:22
Message-ID: 20090417180922.GE10700@fetter.org
Lists: pgsql-hackers

On Fri, Apr 17, 2009 at 02:07:55PM -0400, Tom Lane wrote:
> David Fetter <david(at)fetter(dot)org> writes:
> > Depends. If we've found show-stopping bugs, as it appears we may
> > have done, in not requiring standard_conforming_strings, we can't
> > just pull a MySQL and ship anyhow.
>
> It's hardly a "show stopping bug", considering it's been there since
> standard_conforming_strings was invented.

A known sploit would be a show-stopper.

Cheers,
David.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Fetter <david(at)fetter(dot)org>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-17 18:14:01
Message-ID: 25705.1239992041@sss.pgh.pa.us
Lists: pgsql-hackers

David Fetter <david(at)fetter(dot)org> writes:
>> It's hardly a "show stopping bug", considering it's been there since
>> standard_conforming_strings was invented.

> A known sploit would be a show-stopper.

We're not turning on standard_conforming_strings right now. We are
*certainly* not forcing it on without recourse in existing branches,
which would be the logical conclusion if we considered this to be a
security issue fixable in no other way. Would you mind not wasting the
list's time with this?

regards, tom lane


From: Stephen Cook <sclists(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-18 09:23:37
Message-ID: 49E99C19.5080302@gmail.com
Lists: pgsql-hackers

++

David Fetter wrote:
> On Fri, Apr 17, 2009 at 01:01:39PM -0400, Alvaro Herrera wrote:
>> David Fetter wrote:
>>
>>> Speaking of standard_conforming_strings, I know it's late, but if
>>> we make it a requirement now, a lot of problems just go away.
>>> Yes, it's inconvenient, but we're making lots of big changes, so
>>> one more shouldn't halt adoption.
>> 16 days too late ...
>
> Depends. If we've found show-stopping bugs, as it appears we may have
> done, in not requiring standard_conforming_strings, we can't just
> pull a MySQL and ship anyhow.
>
> Cheers,
> David.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-19 16:42:44
Message-ID: 15478.1240159364@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> So I think we are down to a choice of doing nothing for 8.4, or teaching
> the existing plpgsql lexer about standard_conforming_strings. Assuming
> the current proposal for U& literals holds up, it should not be
> necessary for plpgsql to know about those explicitly as long as it obeys
> standard_conforming_strings, so this might not be too horrid a project.
> I'll take a look at that next.

The attached proposed patch rips out plpgsql's handling of comments and
string literals, and puts in scanner rules that are extracted from the
core lexer (but simplified in a few places where we don't need all the
complexity). The net user-visible effects should be:

* Both regular and E'' literals should now be parsed exactly the same
as the core does it.

* Nested slash-star comments are now handled properly.

* Warnings and errors associated with string parsing should now match
the core, which means they might vary a bit from previous plpgsql
behavior.
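
As an illustration of why the scanner has to know about standard_conforming_strings at all, here is a rough Python sketch of finding the end of a quoted literal. It is not the patch's flex rules. The function name and parameters are made up for this example, and it assumes a simplified model: a doubled quote always stands for an embedded quote, and a backslash escapes the next character only in E'' literals or when standard_conforming_strings is off (settings such as backslash_quote are ignored).

```python
def literal_end(src, start, standard_conforming_strings, is_e_literal=False):
    """Return the index just past the closing quote of the literal at start.

    Hypothetical helper; src[start] must be the opening single quote.
    """
    if src[start] != "'":
        raise ValueError("expected an opening quote")
    # Assumption: backslashes act as escapes in E'' literals, and in plain
    # literals only when standard_conforming_strings is off.
    backslash_escapes = is_e_literal or not standard_conforming_strings
    i = start + 1
    while i < len(src):
        ch = src[i]
        if ch == "\\" and backslash_escapes:
            i += 2  # skip the backslash and the character it escapes
        elif ch == "'":
            if i + 1 < len(src) and src[i + 1] == "'":
                i += 2  # doubled quote: embedded quote, not the end
            else:
                return i + 1  # closing quote found
        else:
            i += 1
    raise ValueError("unterminated string literal")


# The same bytes end in different places depending on the setting.
text = r"'a\' ; b'"
end_when_on = literal_end(text, 0, standard_conforming_strings=True)
end_when_off = literal_end(text, 0, standard_conforming_strings=False)
```

Under this model the literal ends right after the backslash-quote pair when the setting is on, and at the final quote when it is off, so a scanner that assumes the wrong mode misjudges where the string (and therefore the statement) ends.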

I need to test this a bit more, and it could probably do with adding
a few regression test cases, but I think it's code-complete.

Comments?

regards, tom lane

PS: in passing I got rid of the scanner_functype/scanner_typereported
kluge, which might once have had some purpose but now is just cluttering
both the scanner and the grammar. This is a leftover from my failed
attempt at removing the scanner altogether. Since it simplifies the
code I thought I'd keep it.

Attachment: unknown_filename (text/plain, 25.0 KB)

From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-19 17:18:32
Message-ID: 4136ffa0904191018s448bb55v3a15d44a8ac76b75@mail.gmail.com
Lists: pgsql-hackers

On Sun, Apr 19, 2009 at 5:42 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> * Nested slash-star comments are now handled properly.

as opposed to?

--
greg


From: Grzegorz Jaskiewicz <gj(at)pointblue(dot)com(dot)pl>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-19 17:24:31
Message-ID: 1FD7BF1E-AFD4-4374-A0C1-4AA1D6A9349F@pointblue.com.pl
Lists: pgsql-hackers


On 19 Apr 2009, at 17:42, Tom Lane wrote:
>
> The attached proposed patch rips out plpgsql's handling of comments and
> string literals, and puts in scanner rules that are extracted from the
> core lexer (but simplified in a few places where we don't need all the
> complexity). The net user-visible effects should be:
>
>
> Comments?

Will it also mean that queries are going to be analyzed more deeply?
I.e., AFAIK I am currently able to create a plpgsql function that tries to
run a query accessing nonexistent tables or columns.
Or, if I rename a column/table/relation now, views etc. get updated,
but plpgsql functions do not. Will that change with your patch?


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Grzegorz Jaskiewicz <gj(at)pointblue(dot)com(dot)pl>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-19 17:28:49
Message-ID: 4136ffa0904191028n60b02d88k706af6256c816bb3@mail.gmail.com
Lists: pgsql-hackers

On Sun, Apr 19, 2009 at 6:24 PM, Grzegorz Jaskiewicz
<gj(at)pointblue(dot)com(dot)pl> wrote:
> Will it also mean that queries are going to be analyzed more deeply?
> I.e., AFAIK I am currently able to create a plpgsql function that tries to
> run a query accessing nonexistent tables or columns.
> Or, if I rename a column/table/relation now, views etc. get updated,
> but plpgsql functions do not. Will that change with your patch?

The scanner isn't responsible for anything like this. It just breaks
the input up into tokens. So it's responsible for determining where
strings start and end and where table names start and end but doesn't
actually look up the name anywhere -- that's up to the parser and
later steps. So no.
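
A toy tokenizer shows the division of labor. This is only a sketch in Python with made-up names, far simpler than either the core lexer or plpgsql's: it splits the input into tokens and never consults a catalog, so a reference to a table that does not exist tokenizes exactly like any other identifier.

```python
import re

# Hypothetical toy token pattern: string literals, identifiers, numbers, and
# single-character operators. Leading whitespace is skipped before each token.
TOKEN_RE = re.compile(
    r"""
    \s*(?:
        (?P<string>'(?:[^']|'')*')          # quoted literal, doubled quote inside
      | (?P<ident>[A-Za-z_][A-Za-z_0-9]*)   # identifier or keyword
      | (?P<number>\d+)                     # integer
      | (?P<op>.)                           # any other single character
    )""",
    re.VERBOSE,
)


def tokenize(sql):
    """Split sql into (kind, text) pairs without looking any name up."""
    sql = sql.rstrip()
    tokens = []
    pos = 0
    while pos < len(sql):
        m = TOKEN_RE.match(sql, pos)
        kind = m.lastgroup
        tokens.append((kind, m.group(kind)))
        pos = m.end()
    return tokens


# The scanner is equally happy whether or not the table exists.
tokens = tokenize("SELECT x FROM no_such_table")
```

Whether no_such_table actually exists is only discovered later, when the name is looked up during analysis of the query.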

--
greg


From: Grzegorz Jaskiewicz <gj(at)pointblue(dot)com(dot)pl>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-19 17:30:12
Message-ID: C757CC22-75BE-4C00-A59E-DC2897F7F0B4@pointblue.com.pl
Lists: pgsql-hackers


On 19 Apr 2009, at 18:28, Greg Stark wrote:

> On Sun, Apr 19, 2009 at 6:24 PM, Grzegorz Jaskiewicz
> <gj(at)pointblue(dot)com(dot)pl> wrote:
>> Will it also mean that queries are going to be analyzed more deeply?
>> I.e., AFAIK I am currently able to create a plpgsql function that tries to
>> run a query accessing nonexistent tables or columns.
>> Or, if I rename a column/table/relation now, views etc. get updated,
>> but plpgsql functions do not. Will that change with your patch?
>
>
> The scanner isn't responsible for anything like this. It just breaks
> the input up into tokens. So it's responsible for determining where
> strings start and end and where table names start and end but doesn't
> actually look up the name anywhere -- that's up to the parser and
> later steps. So no.

OK, thanks.
To be honest, that would be a great feature.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replacing plpgsql's lexer
Date: 2009-04-19 17:34:20
Message-ID: 18237.1240162460@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> writes:
> On Sun, Apr 19, 2009 at 5:42 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> * Nested slash-star comments are now handled properly.

> as opposed to?

They nest, as required by the SQL spec and implemented by our core
lexer. plpgsql didn't use to get this right.
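
For readers who have not met nesting comments, a small Python sketch of the idea follows. It is not the flex rules from the patch and the function name is invented for this example; it simply keeps a depth counter so that the comment ends only when every opening "/*" has been matched by a "*/".

```python
def skip_comment(src, start):
    """Return the index just past the slash-star comment opening at start.

    Hypothetical helper: nested comments are tracked with a depth counter.
    """
    if not src.startswith("/*", start):
        raise ValueError("expected a comment opener")
    depth = 0
    i = start
    while i < len(src):
        if src.startswith("/*", i):
            depth += 1  # entering a (possibly nested) comment
            i += 2
        elif src.startswith("*/", i):
            depth -= 1  # leaving one nesting level
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated comment")


text = "/* a /* b */ c */ x"
nested_end = skip_comment(text, 0)    # past the outermost closer
naive_end = text.index("*/") + 2      # where a non-nesting scanner would stop
```

A scanner that stops at the first "*/" would treat " c */ x" as code, which is the kind of mismatch with the core lexer described above.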

regards, tom lane