Re: Similar to csvlog but not really, json logs?

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Jim Nasby <jim(at)nasby(dot)net>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Similar to csvlog but not really, json logs?
Date: 2014-08-28 18:40:52
Message-ID: 53FF77B4.7020109@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 08/27/2014 09:53 AM, Andres Freund wrote:
>> > Perhaps instead of doing this in-core it would be better to make log handling more extensible? I'm thinking add a specific "binary" format and an external tool that can parse that and do whatever the user wants with it. That means we don't have to keep adding more complexity to the internal log handling (which already has the risk of being a bottleneck), while allowing maximum user flexibility.
> There's a logging hook. Most of this should be doable from there.

Is there any docs at all on the logging hooks? I couldn't find any.
Maybe that's why people keep trying to reinvent them.

The main reason I personally would like to have JSON logs has more to do
with formatting and extracting data. For example, if you have a
prepared statement and log_min_duration_statement turned on, you get
something like this in the "message" and "details" fields:

"duration: 8.253 ms execute <unnamed>: SELECT login FROM users WHERE id
= $1, "parameters: $1 = '90700'"

... and then for various analytics, like pgBadger and troubleshooting,
you have to break out the parts of those fields by regex and split,
which is error-prone; in our query replay tool, I have at least a dozen
commits tweaking the regexes because of some special case I didn't
account for (like an array parameter).

It would be vastly easier to work with the above as JSON:

...
"message" : { "duration" : 8.253, "command" : "execute",
"statement_name" : "<unnamed>", "statement" : "SELECT login FROM users
WHERE id = $1" }, "details" : { "parameters" : { "$1" : 90700 } }
...

This would allow me, or Dalibo, to remove literally dozens of lines of
error-prone regexing code.

That fix would, IMHO, make it worth enabling JSON logging as a logging
hook or something similar. If we're just going to convert the CSV to
JSON, with the existing fields? Waste of time, I can do that with a
5-line script in at least 3 different languages.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2014-08-28 19:05:05 Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}
Previous Message Pavel Stehule 2014-08-28 18:35:03 Re: Why data of timestamptz does not store value of timezone passed to it?