Re: enhanced error fields

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <peter(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: enhanced error fields
Date: 2012-12-28 15:09:19
Message-ID: CAFj8pRCPWzAvyKkL=K5qCdkPG+LA75F9p7OiH6Gf666Yg319WA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

2012/12/28 Stephen Frost <sfrost(at)snowman(dot)net>:
> Pavel, Peter,
>
> To be honest, I haven't been following this thread at all, but I'm kind
> of amazed that this patch seems to be progressing. I spent quite a bit
> of time previously trying to change the CSV log structure to add a
> single column and this patch is adding a whole slew of them. My prior
> patch also allowed customization of the CSV log output, which seems like
> it would be even more valuable if we're going to be adding a bunch more
> fields. I know there are dissenting viewpoints on that, however, as
> application authors would prefer to not have to deal with variable CSV
> output formats, which I can understand.

Some smarter design of decision what will be stored to CSV and what
now can be great. I am not thinking so these enhanced fields has value
pro typical DBA and should be stored to CSV only when somebody need
it. But it is different story - although it can simplify our work
because then we don't need to solve this issue.

>
> As regards this specific patch, I trust that the protocol error response
> changes won't have any impact on existing clients? Do we anticipate
> something like the JDBC driver having any issues? If there's any chance
> that we're going to break a client by sending it things which it won't
> understand, it seems we'd need to bump the protocol (which hasn't been
> done in quite some time and would likely needs its own discussion...).
>

Depends on implementation. Implementations designed similar to libpq
should not have a problems.

> Regarding qualifying what we're returning- I tend to agree with Pavel
> that if we're going to provide this information, we should do it in a
> fully qualified manner. I can certainly see situations where constraint
> names are repeated in environments and it may not be clear what schema
> it's in (think application development with a dev and a test schema in
> the same database), and the same could hold for constraints against
> tables (though having those repeated would certainly be more rare, since
> you can't repeat a named constraint in a given schema if it has an index
> behind it, though you could with simple CHECK constraints). Again, my
> feeling is that if we're going to provide such information, it should be
> fully-qualified.
>
> There are some additional concerns regarding the patch itself that I
> have (do we really want to modify ereport() in this way? How can we
> implement something which scales better than just adding more and more
> parameters?) but I think we need to figure out exactly what we're agreed
> to be doing with this patch and get buy-in from everyone first.

Related fields are fundamental - and I am thinking so is good to
optimize it - it has semantic tag, and tag has one char size. Some set
of fields is given by ANSI - we can select some subset because not all
fields described by ANSI has sense in pg. I don't would to enhance
range of this patch too much and I don't would to redesign current
concept of error handling. I am thinking so it works well and I have
no negative feedback from PostgreSQL users that I know.

But probably only reserved memory for ErrorData is limit for
"serialized custom fields" - I believe so we will have a associative
array - and one field can be of this type for any custom fields.

Regards

Pavel

>
> Thanks,
>
> Stephen
>
> * Pavel Stehule (pavel(dot)stehule(at)gmail(dot)com) wrote:
>> I rechecked your version eelog4 and I am thinking so it is ok. From my
>> view it can be enough for application developer and I am not against
>> to commit this (or little bit reduced version) as first step.
>>
>> As plpgsql developer I really miss a fields "routine_name,
>> routine_schema and trigger_name". These fields can be simply
>> implemented without any impact on performance. Because routine_name
>> and routine_schema is not unique in postgres, I propose third field
>> "routine_oid". My patch eelog5 is based on your eelog4 with enhancing
>> for these mentioned fields - 5KB more - but only PL/pgSQL is supported
>> now. I would to find a acceptable design now.
>>
>> Second - I don't see any value for forwarding these new fields
>> (column_name, table_name, constraint_name, schema_name, routine_name,
>> routine_schema, routine_id, trigger_name) to log or to csvlog and I
>> propose don't push it to log. We can drop logging in next iteration if
>> you agree.
>>
>> What do you thinking about new version?
>>
>> Regards
>>
>> Pavel
>>
>>
>>
>> 2012/12/10 Peter Geoghegan <peter(at)2ndquadrant(dot)com>:
>> > So, I took a look at this, and made some more changes.
>> >
>> > I have a hard time seeing the utility of some fields that were in this
>> > patch, and so they have been removed from this revision.
>> >
>> > Consider, for example:
>> >
>> > + constraint_table text,
>> > + constraint_schema text,
>> >
>> > While constraint_name remains, I find it really hard to believe that
>> > applications need a perfectly unambiguous idea of what constraint
>> > they're dealing with. If applications have a constraint that has a
>> > different domain-specific meaning depending on what schema it is in,
>> > while a table in one schema uses that constraint in another, well,
>> > that's a fairly awful design. Is it something that we really need to
>> > care about? You mentioned the SQL standard, but as far as I know the
>> > SQL standard has nothing to say about these fields within something
>> > like errdata - their names are lifted from information_schema, but
>> > being unambiguous is hardly so critical here. We want to identify an
>> > object for the purposes of displaying an error message only. Caring
>> > deeply about ambiguity seems wrong-headed to me in this context.
>> >
>> > + routine_name text,
>> > + trigger_name text,
>> > + trigger_table text,
>> > + trigger_schema text,
>> >
>> > The whole point of an exception (which ereport() is very similar to)
>> > is to decouple errors from error handling as much as possible - any
>> > application that magically takes a different course of action based on
>> > where that error occurred as reported by the error itself (and not the
>> > location of where handling the error occurs) seems kind of
>> > wrong-headed to me. Sure, a backtrace may be supplied, but that's only
>> > for diagnostic purposes, and a human can just as easily get that
>> > information already. If you can present an example of this information
>> > actually being present in a way that is amenable to that, either in
>> > any other RDBMS or any major programming language or framework, I
>> > might reconsider.
>> >
>> > This just leaves:
>> >
>> > + column_name text,
>> > + table_name text,
>> > + constraint_name text,
>> > + schema_name text,
>> >
>> > This seems like enough information for any reasonable use of this
>> > infrastructure, and I highly doubt anyone would avail of the other
>> > fields if they remained. I guess an application might have done
>> > something with "constraint_table", as with foreign keys for example,
>> > but I just can't see that being used when constraint_name can be used.
>> >
>> > Removing these fields has also allowed me to remove the "avoid setting
>> > field at lower point in the callstack" logic, which was itself kind of
>> > ugly. Since fields like routine_name only set themselves at the top of
>> > the callstack, the potential for astonishing outcomes was pretty high
>> > - what if you expect one routine_name, but then that routine follows a
>> > slightly different code-path one day and you get another function
>> > setting routine_name and undermining that expectation?
>> >
>> > There were some bugs left in the patch eelog3.diff, mostly due to
>> > things like setting table name to what is actually an index name. As I
>> > mentioned, we now assert that:
>> >
>> > + Assert(table->rd_rel->relkind == RELKIND_RELATION);
>> >
>> > in the functions within relerror.c.
>> >
>> > I have tightened up where these fields are available, and
>> > appropriately documented that for the benefit of both application
>> > developers and developers of client libraries. I went so far as to
>> > hack the Perl scripts that generate .sgml and .h files from
>> > errcodes.txt (i.e. the infrastrucutre that produces tables of errcodes
>> > in various places from a single authoritative place) - I have
>> > instituted a coding standard so that these fields are reliably
>> > available and have documented that requirement at both the user and
>> > hacker level.
>> >
>> > It would be nice if a Perl hacker could eyeball those changes - this
>> > is my first time writing Perl, and I suspect it may be worth having
>> > someone else to polish the Perl code a bit.
>> >
>> > I have emphasized the need for consistency and a sane contract for
>> > application developers and third-party client driver authors - they
>> > *need* to know that certain fields will always be available, or at
>> > least won't be unavailable due to a change in the phase of the moon.
>> >
>> > errcodes.txt now says:
>> >
>> > + # Postgres coding standards mandate that certain fields be available in all
>> > + # instances for some of the Class 23 errcodes, documented under
>> > "Requirement: "
>> > + # here. Some other errcode's ereport sites may, at their own discretion, make
>> > + # errcolumn, errtable, errconstraint and errschema fields available too.
>> > + # Furthermore, it is possible to make some fields available beyond those
>> > + # formally required at callsites involving these Class 23 errcodes with
>> > + # "Requirements: ".
>> > Section: Class 23 - Integrity Constraint Violation
>> > ! Requirement: unused
>> > 23000 E ERRCODE_INTEGRITY_CONSTRAINT_VIOLATION
>> > integrity_constraint_violation
>> > + Requirement: unused
>> > 23001 E ERRCODE_RESTRICT_VIOLATION
>> > restrict_violation
>> > + # Note that requirements for ERRCODE_NOT_NULL do not apply to domains:
>> > + Requirement: schema_name, table_name
>> > 23502 E ERRCODE_NOT_NULL_VIOLATION
>> > not_null_violation
>> > + Requirement: schema_name, table_name, constraint_name
>> > 23503 E ERRCODE_FOREIGN_KEY_VIOLATION
>> > foreign_key_violation
>> > + Requirement: schema_name, table_name, constraint_name
>> > 23505 E ERRCODE_UNIQUE_VIOLATION
>> > unique_violation
>> > + Requirement: constraint_name
>> > 23514 E ERRCODE_CHECK_VIOLATION
>> > check_violation
>> > + Requirement: schema_name, table_name, constraint_name
>> > 23P01 E ERRCODE_EXCLUSION_VIOLATION
>> > exclusion_violation
>> >
>> > Now, there are one or two places where these fields are not actually
>> > available even though they're formally required according to a literal
>> > reading of the above. This is only because there is clearly no such
>> > field sensibly available, even in principle - to my mind this cannot
>> > be a problem, because the application developer cannot have any
>> > reasonable expectation of a field being set. I'm really talking about
>> > two cases in particular:
>> >
>> > * For ERRCODE_NOT_NULL_VIOLATION, we don't actually provide
>> > schema_name and table_name in the event of domains. This was
>> > previously identified as an issue. If it is judged better to not have
>> > any requirements there at all, so be it.
>> >
>> > * For the validateDomainConstraint() ERRCODE_CHECK_VIOLATION ereport
>> > call, we may not provide a constraint name iff a Constraint.connname
>> > is NULL. Since there isn't a constraint name to give even in
>> > principle, and this is an isolated case, this seems reasonable.
>> >
>> > To make logging less verbose, TABLE NAME isn't consistently split out
>> > as a separate field - this seems fine to me, since application code
>> > doesn't target logs:
>> >
>> > + if (edata->column_name && edata->table_name)
>> > + {
>> > + log_line_prefix(&buf, edata);
>> > + appendStringInfo(&buf, _("COLUMN NAME: %s:%s\n"),
>> > + edata->table_name, edata->column_name);
>> > + }
>> > + else if (edata->table_name)
>> > + {
>> > + log_line_prefix(&buf, edata);
>> > + appendStringInfo(&buf, _("TABLE NAME: %s\n"),
>> > + edata->table_name);
>> > + }
>> >
>> > I used pgindent to selectively indent parts of the codebase affected
>> > by this patch. I am about ready to mark this one "ready for
>> > committer", but it would be nice at this point to get some buy-in on
>> > the basic view of how these things should work that informed this
>> > revision. Does anyone disagree with my contention that there should be
>> > a sane, well-defined contract, or any of the details of what that
>> > should look like? Was I right to suggest that some of the set of
>> > fields that appeared in Pavel's eelog3.diff revision are unnecessary?
>> >
>> > I'm sorry it took me as long as it did to get back to you on this.
>> >
>> > --
>> > Peter Geoghegan http://www.2ndQuadrant.com/
>> > PostgreSQL Development, 24x7 Support, Training and Services
>
>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2012-12-28 15:23:20 Re: multiple CREATE FUNCTION AS items for PLs
Previous Message Stephen Frost 2012-12-28 14:00:50 Re: enhanced error fields