Re: Publish autovacuum informations

Lists: pgsql-hackers
From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Publish autovacuum informations
Date: 2014-12-29 15:55:34
Message-ID: CAECtzeU_VyakTqtL7mPHo=pdRi4oD4PVuGymjoK+Nw1O9Aaenw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hey,

There are times where I would need more informations on the autovacuum
processes.

I'd love to know what each worker is currently doing. I can get something
like this from the pg_stat_activity view but it doesn't give me as much
informations as the WorkerInfoData struct.

I'd also love to have more informations on the contents of the tables list
(how many tables still to process, which table next, what kind of
processing they'll get, etc... kinda what you have in the autovac_table
struct).

All in all, I want to get informations that are typically stored in shared
memory, handled by the autovacuum launcher and autovacuum workers. I first
thought I could get that by writing some C functions embedded in an
extension. But it doesn't seem to me I can access this part of the shared
memory from a C function. If I'm wrong, I'd love to get a pointer on how to
do this.

Otherwise, I wonder what would be more welcome: making the shared memory
structs handles available outside of the autovacuum processes (and then
build an extension to decode the informations I need), or adding functions
in core to get access to this information (in that case, no need for an
extension)?

Thanks.

Regards.

--
Guillaume.
http://blog.guillaume.lelarge.info
http://www.dalibo.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2014-12-29 16:03:44
Message-ID: 7087.1419869024@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Guillaume Lelarge <guillaume(at)lelarge(dot)info> writes:
> All in all, I want to get informations that are typically stored in shared
> memory, handled by the autovacuum launcher and autovacuum workers. I first
> thought I could get that by writing some C functions embedded in an
> extension. But it doesn't seem to me I can access this part of the shared
> memory from a C function. If I'm wrong, I'd love to get a pointer on how to
> do this.

> Otherwise, I wonder what would be more welcome: making the shared memory
> structs handles available outside of the autovacuum processes (and then
> build an extension to decode the informations I need), or adding functions
> in core to get access to this information (in that case, no need for an
> extension)?

Either one of those approaches would cripple our freedom to change those
data structures; which we've done repeatedly in the past and will surely
want to do again. So I'm pretty much -1 on exposing them.

regards, tom lane


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2014-12-29 16:07:35
Message-ID: CAECtzeWUB6cRgEJQM+wf9+TzMEFYHrCzUjGjYQEwEvDNupabNg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2014-12-29 17:03 GMT+01:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Guillaume Lelarge <guillaume(at)lelarge(dot)info> writes:
> > All in all, I want to get informations that are typically stored in
> shared
> > memory, handled by the autovacuum launcher and autovacuum workers. I
> first
> > thought I could get that by writing some C functions embedded in an
> > extension. But it doesn't seem to me I can access this part of the shared
> > memory from a C function. If I'm wrong, I'd love to get a pointer on how
> to
> > do this.
>
> > Otherwise, I wonder what would be more welcome: making the shared memory
> > structs handles available outside of the autovacuum processes (and then
> > build an extension to decode the informations I need), or adding
> functions
> > in core to get access to this information (in that case, no need for an
> > extension)?
>
> Either one of those approaches would cripple our freedom to change those
> data structures; which we've done repeatedly in the past and will surely
> want to do again. So I'm pretty much -1 on exposing them.
>
>
I don't see how that's going to deny us the right to change any structs. If
they are in-core functions, we'll just have to update them. If they are
extension functions, then the developer of those functions would simply
need to update his code.

--
Guillaume.
http://blog.guillaume.lelarge.info
http://www.dalibo.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2014-12-31 16:19:52
Message-ID: CA+TgmoYxtr1rOHSaxn43MUqrwPqZcXVjnkR3fq8SK6cG4BjJKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 29, 2014 at 11:03 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Either one of those approaches would cripple our freedom to change those
> data structures; which we've done repeatedly in the past and will surely
> want to do again. So I'm pretty much -1 on exposing them.

We could instead add a view of this information to core --
pg_stat_autovacuum, or whatever.

But to be honest, I'm more in favor of Guillaume's proposal. I will
repeat my recent assertion that we -- you in particular -- are too
reluctant to expose internal data structures to authors of C
extensions, and that this is developer-hostile.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2014-12-31 17:46:17
Message-ID: 21803.1420047977@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Dec 29, 2014 at 11:03 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Either one of those approaches would cripple our freedom to change those
>> data structures; which we've done repeatedly in the past and will surely
>> want to do again. So I'm pretty much -1 on exposing them.

> We could instead add a view of this information to core --
> pg_stat_autovacuum, or whatever.

> But to be honest, I'm more in favor of Guillaume's proposal. I will
> repeat my recent assertion that we -- you in particular -- are too
> reluctant to expose internal data structures to authors of C
> extensions, and that this is developer-hostile.

Well, the core question there is whether we have a policy of not breaking
extension-visible APIs. While we will very often do things like adding
parameters to existing functions, I think we've tended to refrain from
making wholesale semantic revisions to exposed data structures.

I'd be all right with putting the data structure declarations in a file
named something like autovacuum_private.h, especially if it carried an
annotation that "if you depend on this, don't be surprised if we break
your code in future".

regards, tom lane


From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2015-01-01 22:17:33
Message-ID: 20150101221733.GC2169761@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Dec 31, 2014 at 12:46:17PM -0500, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Mon, Dec 29, 2014 at 11:03 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Either one of those approaches would cripple our freedom to change those
> >> data structures; which we've done repeatedly in the past and will surely
> >> want to do again. So I'm pretty much -1 on exposing them.
>
> > We could instead add a view of this information to core --
> > pg_stat_autovacuum, or whatever.
>
> > But to be honest, I'm more in favor of Guillaume's proposal. I will
> > repeat my recent assertion that we -- you in particular -- are too
> > reluctant to expose internal data structures to authors of C
> > extensions, and that this is developer-hostile.
>
> Well, the core question there is whether we have a policy of not breaking
> extension-visible APIs.

No, we have no policy restricting backend C API changes in major releases.
Though this message is old enough to enroll in first grade, I know of no
policy decision supplanting it:
http://www.postgresql.org/message-id/8706.1230569070@sss.pgh.pa.us

> While we will very often do things like adding
> parameters to existing functions, I think we've tended to refrain from
> making wholesale semantic revisions to exposed data structures.

True. I especially look to avoid changes that will cause extensions to build
and run, yet silently misbehave at runtime. For example, had I reviewed the
pg_policy patch, I would have examined whether an unmodified 9.4 extension
might let a user bypass relation policy. I oppose most header reorganization,
which breaks builds in exchange for insubstantial benefits. I don't wish to
extend that anywhere near to the point of saying, "Your C function can't use
struct foo, because exposing struct foo in a header file would imply freezing
it." Desire for backend API stability should not drive us to reject new
functionality.

> I'd be all right with putting the data structure declarations in a file
> named something like autovacuum_private.h, especially if it carried an
> annotation that "if you depend on this, don't be surprised if we break
> your code in future".

Such an annotation would be no more true than it is for the majority of header
files. If including it makes you feel better, I don't object.

nm


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2015-01-02 17:21:56
Message-ID: 54A6D3B4.3090505@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/1/15, 4:17 PM, Noah Misch wrote:
>> I'd be all right with putting the data structure declarations in a file
>> >named something like autovacuum_private.h, especially if it carried an
>> >annotation that "if you depend on this, don't be surprised if we break
>> >your code in future".
> Such an annotation would be no more true than it is for the majority of header
> files. If including it makes you feel better, I don't object.

We need to be careful with that. Starting to segregate things into _private headers implies that stuff in non-private headers *is* locked down. We'd need to set clear expectations.

I do think more clarity would be good here. Right now the only distinction we have is things like SPI are spelled out in the docs. Other than that, the there really isn't anything to indicate how safe it is to rely on what's in the headers. For example, I've got some code that's looking at fcinfo->flinfo->fn_expr, and I have no idea how likely that is to get broken. Since it's a parse node, my guess is "likely", but I'm just guessing.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2015-01-05 16:40:20
Message-ID: CA+Tgmob3OsG9BqcHZv7Q__=sG78byPkkAx=fTS8Tdz-LTrUB_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Dec 31, 2014 at 12:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I'd be all right with putting the data structure declarations in a file
> named something like autovacuum_private.h, especially if it carried an
> annotation that "if you depend on this, don't be surprised if we break
> your code in future".

Works for me. I am not in general surprised when we do things that
break my code, or anyway, the code that I'm responsible for
maintaining. But I think it makes sense to segregate this into a
separate header file so that we are clear that it is only exposed for
the benefit of extension authors, not so that other things in the core
system can touch it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2015-01-05 16:44:13
Message-ID: CAECtzeWkxwqcAE5VfzkSjk44Xsa0GdTL36d9UaHNvVQ-cG-9xg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-01-05 17:40 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com>:

> On Wed, Dec 31, 2014 at 12:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > I'd be all right with putting the data structure declarations in a file
> > named something like autovacuum_private.h, especially if it carried an
> > annotation that "if you depend on this, don't be surprised if we break
> > your code in future".
>
> Works for me. I am not in general surprised when we do things that
> break my code, or anyway, the code that I'm responsible for
> maintaining. But I think it makes sense to segregate this into a
> separate header file so that we are clear that it is only exposed for
> the benefit of extension authors, not so that other things in the core
> system can touch it.
>
>
I'm fine with that too. I'll try to find some time to work on that.

Thanks.

--
Guillaume.
http://blog.guillaume.lelarge.info
http://www.dalibo.com


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2015-06-04 20:10:00
Message-ID: CAECtzeVfJ1r7MQ5DmpzvMP2NFLrtGkc6TOikm9nuaEFXkbnhrA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-01-05 17:44 GMT+01:00 Guillaume Lelarge <guillaume(at)lelarge(dot)info>:

> 2015-01-05 17:40 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com>:
>
>> On Wed, Dec 31, 2014 at 12:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> > I'd be all right with putting the data structure declarations in a file
>> > named something like autovacuum_private.h, especially if it carried an
>> > annotation that "if you depend on this, don't be surprised if we break
>> > your code in future".
>>
>> Works for me. I am not in general surprised when we do things that
>> break my code, or anyway, the code that I'm responsible for
>> maintaining. But I think it makes sense to segregate this into a
>> separate header file so that we are clear that it is only exposed for
>> the benefit of extension authors, not so that other things in the core
>> system can touch it.
>>
>>
> I'm fine with that too. I'll try to find some time to work on that.
>
>
So I took a look at this this week. I discovered, with the help of a
coworker, that I can already use the AutoVacuumShmem pointer and read the
struct. Unfortunately, it doesn't give me as much details as I would have
liked. The list of databases and tables aren't in shared memory. They are
local to the process that uses them. Putting them in shared memory (if at
all possible) would imply a much bigger patch than I was willing to write
right now.

Thanks anyway for the help.

--
Guillaume.
http://blog.guillaume.lelarge.info
http://www.dalibo.com


From: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-02-29 18:04:58
Message-ID: 56D4884A.9010602@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 04/06/2015 22:10, Guillaume Lelarge wrote:
> 2015-01-05 17:44 GMT+01:00 Guillaume Lelarge <guillaume(at)lelarge(dot)info
> <mailto:guillaume(at)lelarge(dot)info>>:
>
> 2015-01-05 17:40 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com
> <mailto:robertmhaas(at)gmail(dot)com>>:
>
> On Wed, Dec 31, 2014 at 12:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us
> <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
> > I'd be all right with putting the data structure declarations in a file
> > named something like autovacuum_private.h, especially if it carried an
> > annotation that "if you depend on this, don't be surprised if we break
> > your code in future".
>
> Works for me. I am not in general surprised when we do things that
> break my code, or anyway, the code that I'm responsible for
> maintaining. But I think it makes sense to segregate this into a
> separate header file so that we are clear that it is only
> exposed for
> the benefit of extension authors, not so that other things in
> the core
> system can touch it.
>
>
> I'm fine with that too. I'll try to find some time to work on that.
>
>
> So I took a look at this this week. I discovered, with the help of a
> coworker, that I can already use the AutoVacuumShmem pointer and read
> the struct. Unfortunately, it doesn't give me as much details as I would
> have liked. The list of databases and tables aren't in shared memory.
> They are local to the process that uses them. Putting them in shared
> memory (if at all possible) would imply a much bigger patch than I was
> willing to write right now.
>
> Thanks anyway for the help.
>
>

Sorry to revive such an old thread.

I think some hooks in the autovacuum could be enough to have good
insight without exposing private structure.

Please find attached a patch that adds some hooks to the autovacuum, and
as an example a quick proof of concept extension that use them and allow
to see what are the autovacuum worker todo list, skipped tables and so on.

I'm not really sure about which information should be provided, so I'm
open to any suggestion to improve this.
--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

Attachment Content-Type Size
autovacuum_hook_v1.diff text/plain 4.8 KB
pg_stat_autovacuum.tgz application/x-compressed-tar 4.7 KB

From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-02-29 19:20:28
Message-ID: CAFcNs+py_4Dz7ThTCr0-yJFQh0i55uEXnpn8ZZ6REnSXEBn4nA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 29, 2016 at 3:04 PM, Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
wrote:
>
> On 04/06/2015 22:10, Guillaume Lelarge wrote:
> > 2015-01-05 17:44 GMT+01:00 Guillaume Lelarge <guillaume(at)lelarge(dot)info
> > <mailto:guillaume(at)lelarge(dot)info>>:
> >
> > 2015-01-05 17:40 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com
> > <mailto:robertmhaas(at)gmail(dot)com>>:
> >
> > On Wed, Dec 31, 2014 at 12:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us
> > <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
> > > I'd be all right with putting the data structure declarations
in a file
> > > named something like autovacuum_private.h, especially if it
carried an
> > > annotation that "if you depend on this, don't be surprised if
we break
> > > your code in future".
> >
> > Works for me. I am not in general surprised when we do things
that
> > break my code, or anyway, the code that I'm responsible for
> > maintaining. But I think it makes sense to segregate this into
a
> > separate header file so that we are clear that it is only
> > exposed for
> > the benefit of extension authors, not so that other things in
> > the core
> > system can touch it.
> >
> >
> > I'm fine with that too. I'll try to find some time to work on that.
> >
> >
> > So I took a look at this this week. I discovered, with the help of a
> > coworker, that I can already use the AutoVacuumShmem pointer and read
> > the struct. Unfortunately, it doesn't give me as much details as I would
> > have liked. The list of databases and tables aren't in shared memory.
> > They are local to the process that uses them. Putting them in shared
> > memory (if at all possible) would imply a much bigger patch than I was
> > willing to write right now.
> >
> > Thanks anyway for the help.
> >
> >
>
> Sorry to revive such an old thread.
>
> I think some hooks in the autovacuum could be enough to have good
> insight without exposing private structure.
>

Interesting idea...

> Please find attached a patch that adds some hooks to the autovacuum, and
> as an example a quick proof of concept extension that use them and allow
> to see what are the autovacuum worker todo list, skipped tables and so on.
>
> I'm not really sure about which information should be provided, so I'm
> open to any suggestion to improve this.
>

I have a look at the patch and it's compile without warning and without
regression.

But something goes wrong when installing the extension:

fabrizio(at)bagual:~/Downloads/pg_stat_autovacuum
$ pg_config
BINDIR = /data/home/fabrizio/pgsql/bin
DOCDIR = /data/home/fabrizio/pgsql/share/doc
HTMLDIR = /data/home/fabrizio/pgsql/share/doc
INCLUDEDIR = /data/home/fabrizio/pgsql/include
PKGINCLUDEDIR = /data/home/fabrizio/pgsql/include
INCLUDEDIR-SERVER = /data/home/fabrizio/pgsql/include/server
LIBDIR = /data/home/fabrizio/pgsql/lib
PKGLIBDIR = /data/home/fabrizio/pgsql/lib
LOCALEDIR = /data/home/fabrizio/pgsql/share/locale
MANDIR = /data/home/fabrizio/pgsql/share/man
SHAREDIR = /data/home/fabrizio/pgsql/share
SYSCONFDIR = /data/home/fabrizio/pgsql/etc
PGXS = /data/home/fabrizio/pgsql/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--prefix=/home/fabrizio/pgsql' '--enable-cassert'
'--enable-coverage' '--enable-tap-tests' '--enable-depend'
CC = gcc
CPPFLAGS = -DFRONTEND -D_GNU_SOURCE
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard
-fprofile-arcs -ftest-coverage
CFLAGS_SL = -fpic
LDFLAGS = -L../../src/common -Wl,--as-needed
-Wl,-rpath,'/home/fabrizio/pgsql/lib',--enable-new-dtags
LDFLAGS_EX =
LDFLAGS_SL =
LIBS = -lpgcommon -lpgport -lz -lreadline -lrt -lcrypt -ldl -lm
VERSION = PostgreSQL 9.6devel

fabrizio(at)bagual:~/Downloads/pg_stat_autovacuum
$ make USE_PGXS=1 install
/bin/mkdir -p '/data/home/fabrizio/pgsql/lib'
/bin/mkdir -p '/data/home/fabrizio/pgsql/share/extension'
/bin/mkdir -p '/data/home/fabrizio/pgsql/share/extension'
/bin/mkdir -p '/data/home/fabrizio/pgsql/share/doc/extension'
/usr/bin/install -c -m 755 pg_stat_autovacuum.so
'/data/home/fabrizio/pgsql/lib/pg_stat_autovacuum.so'
/usr/bin/install -c -m 644 .//pg_stat_autovacuum.control
'/data/home/fabrizio/pgsql/share/extension/'
/usr/bin/install -c -m 644 .//pg_stat_autovacuum--0.0.1.sql
'/data/home/fabrizio/pgsql/share/extension/'
/usr/bin/install -c -m 644 '/data/home/fabrizio/pgsql/share/doc/extension/'
/usr/bin/install: missing destination file operand after
‘/data/home/fabrizio/pgsql/share/doc/extension/’
Try '/usr/bin/install --help' for more information.
make: *** [install] Error 1

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


From: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
To: fabriziomello(at)gmail(dot)com
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-02-29 19:38:30
Message-ID: 56D49E36.9070004@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 29/02/2016 20:20, Fabrízio de Royes Mello wrote:
>
> On Mon, Feb 29, 2016 at 3:04 PM, Julien Rouhaud
> <julien(dot)rouhaud(at)dalibo(dot)com <mailto:julien(dot)rouhaud(at)dalibo(dot)com>> wrote:
>>
>> On 04/06/2015 22:10, Guillaume Lelarge wrote:
>> > 2015-01-05 17:44 GMT+01:00 Guillaume Lelarge <guillaume(at)lelarge(dot)info
> <mailto:guillaume(at)lelarge(dot)info>
>> > <mailto:guillaume(at)lelarge(dot)info <mailto:guillaume(at)lelarge(dot)info>>>:
>> >
>> > 2015-01-05 17:40 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com
> <mailto:robertmhaas(at)gmail(dot)com>
>> > <mailto:robertmhaas(at)gmail(dot)com <mailto:robertmhaas(at)gmail(dot)com>>>:
>> >
>> > On Wed, Dec 31, 2014 at 12:46 PM, Tom Lane
> <tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>
>> > <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>> wrote:
>> > > I'd be all right with putting the data structure
> declarations in a file
>> > > named something like autovacuum_private.h, especially if
> it carried an
>> > > annotation that "if you depend on this, don't be surprised
> if we break
>> > > your code in future".
>> >
>> > Works for me. I am not in general surprised when we do
> things that
>> > break my code, or anyway, the code that I'm responsible for
>> > maintaining. But I think it makes sense to segregate this
> into a
>> > separate header file so that we are clear that it is only
>> > exposed for
>> > the benefit of extension authors, not so that other things in
>> > the core
>> > system can touch it.
>> >
>> >
>> > I'm fine with that too. I'll try to find some time to work on that.
>> >
>> >
>> > So I took a look at this this week. I discovered, with the help of a
>> > coworker, that I can already use the AutoVacuumShmem pointer and read
>> > the struct. Unfortunately, it doesn't give me as much details as I would
>> > have liked. The list of databases and tables aren't in shared memory.
>> > They are local to the process that uses them. Putting them in shared
>> > memory (if at all possible) would imply a much bigger patch than I was
>> > willing to write right now.
>> >
>> > Thanks anyway for the help.
>> >
>> >
>>
>> Sorry to revive such an old thread.
>>
>> I think some hooks in the autovacuum could be enough to have good
>> insight without exposing private structure.
>>
>
> Interesting idea...
>

Thanks for looking at it!

>
>> Please find attached a patch that adds some hooks to the autovacuum, and
>> as an example a quick proof of concept extension that use them and allow
>> to see what are the autovacuum worker todo list, skipped tables and so on.
>>
>> I'm not really sure about which information should be provided, so I'm
>> open to any suggestion to improve this.
>>
>
> I have a look at the patch and it's compile without warning and without
> regression.
>
> But something goes wrong when installing the extension:
>
> [...]
> /usr/bin/install -c -m 644 '/data/home/fabrizio/pgsql/share/doc/extension/'
> /usr/bin/install: missing destination file operand after
> ‘/data/home/fabrizio/pgsql/share/doc/extension/’
> Try '/usr/bin/install --help' for more information.
> make: *** [install] Error 1
>

Oups, I'm not really sure what I removed and shouldn't have. I attached
v2 of the extension with a "standard" Makefile, which I just tested and
works fine.

> Regards,
>
> --
> Fabrízio de Royes Mello
> Consultoria/Coaching PostgreSQL
>>> Timbira: http://www.timbira.com.br
>>> Blog: http://fabriziomello.github.io
>>> Linkedin: http://br.linkedin.com/in/fabriziomello
>>> Twitter: http://twitter.com/fabriziomello
>>> Github: http://github.com/fabriziomello

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

Attachment Content-Type Size
pg_stat_autovacuum_v2.tgz application/x-compressed-tar 4.9 KB

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
Cc: Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-01 06:50:34
Message-ID: CAB7nPqRL2S5Vm9Sx9+wuXZZ6H02j7AmtsgwNUgeACUYk1sgPJA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 1, 2016 at 4:38 AM, Julien Rouhaud
<julien(dot)rouhaud(at)dalibo(dot)com> wrote:
> On 29/02/2016 20:20, Fabrízio de Royes Mello wrote:
>>
>> On Mon, Feb 29, 2016 at 3:04 PM, Julien Rouhaud
>> <julien(dot)rouhaud(at)dalibo(dot)com <mailto:julien(dot)rouhaud(at)dalibo(dot)com>> wrote:
>>>
>>> On 04/06/2015 22:10, Guillaume Lelarge wrote:
>>> > 2015-01-05 17:44 GMT+01:00 Guillaume Lelarge <guillaume(at)lelarge(dot)info
>> <mailto:guillaume(at)lelarge(dot)info>
>>> > <mailto:guillaume(at)lelarge(dot)info <mailto:guillaume(at)lelarge(dot)info>>>:
>>> >
>>> > 2015-01-05 17:40 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com
>> <mailto:robertmhaas(at)gmail(dot)com>
>>> > <mailto:robertmhaas(at)gmail(dot)com <mailto:robertmhaas(at)gmail(dot)com>>>:
>>> >
>>> > On Wed, Dec 31, 2014 at 12:46 PM, Tom Lane
>> <tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>
>>> > <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>> wrote:
>>> > > I'd be all right with putting the data structure
>> declarations in a file
>>> > > named something like autovacuum_private.h, especially if
>> it carried an
>>> > > annotation that "if you depend on this, don't be surprised
>> if we break
>>> > > your code in future".
>>> >
>>> > Works for me. I am not in general surprised when we do
>> things that
>>> > break my code, or anyway, the code that I'm responsible for
>>> > maintaining. But I think it makes sense to segregate this
>> into a
>>> > separate header file so that we are clear that it is only
>>> > exposed for
>>> > the benefit of extension authors, not so that other things in
>>> > the core
>>> > system can touch it.
>>> >
>>> >
>>> > I'm fine with that too. I'll try to find some time to work on that.
>>> >
>>> >
>>> > So I took a look at this this week. I discovered, with the help of a
>>> > coworker, that I can already use the AutoVacuumShmem pointer and read
>>> > the struct. Unfortunately, it doesn't give me as much details as I would
>>> > have liked. The list of databases and tables aren't in shared memory.
>>> > They are local to the process that uses them. Putting them in shared
>>> > memory (if at all possible) would imply a much bigger patch than I was
>>> > willing to write right now.
>>> >
>>> > Thanks anyway for the help.
>>> >
>>> >
>>>
>>> Sorry to revive such an old thread.
>>>
>>> I think some hooks in the autovacuum could be enough to have good
>>> insight without exposing private structure.

Instead of introducing 4 new hooks, which do not represent a general
use actually, why don't you expose a portion of this information in
shared memory as mentioned upthread? This sounds like a good approach
to me. Your extension could then scan them as needed and put that on
view or a function. This information is now private in the autovacuum
processes, exposing them would allow plugin authors to do a bunch of
fancy things I think, in a more flexible way than those hooks. And
there is no need to add more hooks should the structure of the
autovacuum code change for a reason or another in the future.
--
Michael


From: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-01 11:44:30
Message-ID: 56D5809E.2000408@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/03/2016 07:50, Michael Paquier wrote:
> On Tue, Mar 1, 2016 at 4:38 AM, Julien Rouhaud
> <julien(dot)rouhaud(at)dalibo(dot)com> wrote:
>> On 29/02/2016 20:20, Fabrízio de Royes Mello wrote:
>>>
>>> On Mon, Feb 29, 2016 at 3:04 PM, Julien Rouhaud
>>> <julien(dot)rouhaud(at)dalibo(dot)com <mailto:julien(dot)rouhaud(at)dalibo(dot)com>> wrote:
>>>>
>>>> On 04/06/2015 22:10, Guillaume Lelarge wrote:
>>>>> 2015-01-05 17:44 GMT+01:00 Guillaume Lelarge <guillaume(at)lelarge(dot)info
>>> <mailto:guillaume(at)lelarge(dot)info>
>>>>> <mailto:guillaume(at)lelarge(dot)info <mailto:guillaume(at)lelarge(dot)info>>>:
>>>>>
>>>>> 2015-01-05 17:40 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com
>>> <mailto:robertmhaas(at)gmail(dot)com>
>>>>> <mailto:robertmhaas(at)gmail(dot)com <mailto:robertmhaas(at)gmail(dot)com>>>:
>>>>>
>>>>> On Wed, Dec 31, 2014 at 12:46 PM, Tom Lane
>>> <tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>
>>>>> <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>> wrote:
>>>>> > I'd be all right with putting the data structure
>>> declarations in a file
>>>>> > named something like autovacuum_private.h, especially if
>>> it carried an
>>>>> > annotation that "if you depend on this, don't be surprised
>>> if we break
>>>>> > your code in future".
>>>>>
>>>>> Works for me. I am not in general surprised when we do
>>> things that
>>>>> break my code, or anyway, the code that I'm responsible for
>>>>> maintaining. But I think it makes sense to segregate this
>>> into a
>>>>> separate header file so that we are clear that it is only
>>>>> exposed for
>>>>> the benefit of extension authors, not so that other things in
>>>>> the core
>>>>> system can touch it.
>>>>>
>>>>>
>>>>> I'm fine with that too. I'll try to find some time to work on that.
>>>>>
>>>>>
>>>>> So I took a look at this this week. I discovered, with the help of a
>>>>> coworker, that I can already use the AutoVacuumShmem pointer and read
>>>>> the struct. Unfortunately, it doesn't give me as much details as I would
>>>>> have liked. The list of databases and tables aren't in shared memory.
>>>>> They are local to the process that uses them. Putting them in shared
>>>>> memory (if at all possible) would imply a much bigger patch than I was
>>>>> willing to write right now.
>>>>>
>>>>> Thanks anyway for the help.
>>>>>
>>>>>
>>>>
>>>> Sorry to revive such an old thread.
>>>>
>>>> I think some hooks in the autovacuum could be enough to have good
>>>> insight without exposing private structure.
>
> Instead of introducing 4 new hooks, which do not represent a general
> use actually, why don't you expose a portion of this information in
> shared memory as mentioned upthread? This sounds like a good approach
> to me. Your extension could then scan them as needed and put that on
> view or a function. This information is now private in the autovacuum
> processes, exposing them would allow plugin authors to do a bunch of
> fancy things I think, in a more flexible way than those hooks. And
> there is no need to add more hooks should the structure of the
> autovacuum code change for a reason or another in the future.
>

I thought exposing private structures could be a blocking issue. I
tried to see what could be done using hooks, and one thing I like is
that we can compute the process time of each relation, or even aggregate
some statistics. Having the vacuum time is something that we can
actually only obtain by setting log_autovacuum_min_duration and parsing
the logs, and I don't think it would be possible to do this by just
exposing current private structure.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org


From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-01 12:47:36
Message-ID: CAFcNs+oj9QV=4iJbhkp1+Kmsw8XGpx=cNsaK6eaWtUAd02xcCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 1, 2016 at 8:44 AM, Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
wrote:
>
> On 01/03/2016 07:50, Michael Paquier wrote:
> > On Tue, Mar 1, 2016 at 4:38 AM, Julien Rouhaud
> > <julien(dot)rouhaud(at)dalibo(dot)com> wrote:
> >> On 29/02/2016 20:20, Fabrízio de Royes Mello wrote:
> >>>
> >>> On Mon, Feb 29, 2016 at 3:04 PM, Julien Rouhaud
> >>> <julien(dot)rouhaud(at)dalibo(dot)com <mailto:julien(dot)rouhaud(at)dalibo(dot)com>> wrote:
> >>>>
> >>>> On 04/06/2015 22:10, Guillaume Lelarge wrote:
> >>>>> 2015-01-05 17:44 GMT+01:00 Guillaume Lelarge <guillaume(at)lelarge(dot)info
> >>> <mailto:guillaume(at)lelarge(dot)info>
> >>>>> <mailto:guillaume(at)lelarge(dot)info <mailto:guillaume(at)lelarge(dot)info>>>:
> >>>>>
> >>>>> 2015-01-05 17:40 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com
> >>> <mailto:robertmhaas(at)gmail(dot)com>
> >>>>> <mailto:robertmhaas(at)gmail(dot)com <mailto:robertmhaas(at)gmail(dot)com>>>:
> >>>>>
> >>>>> On Wed, Dec 31, 2014 at 12:46 PM, Tom Lane
> >>> <tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>
> >>>>> <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>>
wrote:
> >>>>> > I'd be all right with putting the data structure
> >>> declarations in a file
> >>>>> > named something like autovacuum_private.h, especially if
> >>> it carried an
> >>>>> > annotation that "if you depend on this, don't be surprised
> >>> if we break
> >>>>> > your code in future".
> >>>>>
> >>>>> Works for me. I am not in general surprised when we do
> >>> things that
> >>>>> break my code, or anyway, the code that I'm responsible for
> >>>>> maintaining. But I think it makes sense to segregate this
> >>> into a
> >>>>> separate header file so that we are clear that it is only
> >>>>> exposed for
> >>>>> the benefit of extension authors, not so that other things
in
> >>>>> the core
> >>>>> system can touch it.
> >>>>>
> >>>>>
> >>>>> I'm fine with that too. I'll try to find some time to work on
that.
> >>>>>
> >>>>>
> >>>>> So I took a look at this this week. I discovered, with the help of a
> >>>>> coworker, that I can already use the AutoVacuumShmem pointer and
read
> >>>>> the struct. Unfortunately, it doesn't give me as much details as I
would
> >>>>> have liked. The list of databases and tables aren't in shared
memory.
> >>>>> They are local to the process that uses them. Putting them in shared
> >>>>> memory (if at all possible) would imply a much bigger patch than I
was
> >>>>> willing to write right now.
> >>>>>
> >>>>> Thanks anyway for the help.
> >>>>>
> >>>>>
> >>>>
> >>>> Sorry to revive such an old thread.
> >>>>
> >>>> I think some hooks in the autovacuum could be enough to have good
> >>>> insight without exposing private structure.
> >
> > Instead of introducing 4 new hooks, which do not represent a general
> > use actually, why don't you expose a portion of this information in
> > shared memory as mentioned upthread? This sounds like a good approach
> > to me. Your extension could then scan them as needed and put that on
> > view or a function. This information is now private in the autovacuum
> > processes, exposing them would allow plugin authors to do a bunch of
> > fancy things I think, in a more flexible way than those hooks. And
> > there is no need to add more hooks should the structure of the
> > autovacuum code change for a reason or another in the future.
> >
>
> I thought exposing private structures could be a blocking issue. I
> tried to see what could be done using hooks, and one thing I like is
> that we can compute the process time of each relation, or even aggregate
> some statistics. Having the vacuum time is something that we can
> actually only obtain by setting log_autovacuum_min_duration and parsing
> the logs, and I don't think it would be possible to do this by just
> exposing current private structure.
>

We understood (IMHO is an interesting idea) but as Michael said hooks is
for a general purpose. So can you demonstrate other use cases for this new
hooks?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


From: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
To: fabriziomello(at)gmail(dot)com
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-01 14:37:28
Message-ID: 56D5A928.6040207@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/03/2016 13:47, Fabrízio de Royes Mello wrote:
> On Tue, Mar 1, 2016 at 8:44 AM, Julien Rouhaud
> <julien(dot)rouhaud(at)dalibo(dot)com <mailto:julien(dot)rouhaud(at)dalibo(dot)com>> wrote:
>>
>> On 01/03/2016 07:50, Michael Paquier wrote:
>> > On Tue, Mar 1, 2016 at 4:38 AM, Julien Rouhaud
>> > <julien(dot)rouhaud(at)dalibo(dot)com <mailto:julien(dot)rouhaud(at)dalibo(dot)com>> wrote:
>> >> On 29/02/2016 20:20, Fabrízio de Royes Mello wrote:
>> >>>
>> >>> On Mon, Feb 29, 2016 at 3:04 PM, Julien Rouhaud
>> >>> <julien(dot)rouhaud(at)dalibo(dot)com <mailto:julien(dot)rouhaud(at)dalibo(dot)com>
> <mailto:julien(dot)rouhaud(at)dalibo(dot)com <mailto:julien(dot)rouhaud(at)dalibo(dot)com>>>
> wrote:
>> >>>>
>> >>>> I think some hooks in the autovacuum could be enough to have good
>> >>>> insight without exposing private structure.
>> >
>> > Instead of introducing 4 new hooks, which do not represent a general
>> > use actually, why don't you expose a portion of this information in
>> > shared memory as mentioned upthread? This sounds like a good approach
>> > to me. Your extension could then scan them as needed and put that on
>> > view or a function. This information is now private in the autovacuum
>> > processes, exposing them would allow plugin authors to do a bunch of
>> > fancy things I think, in a more flexible way than those hooks. And
>> > there is no need to add more hooks should the structure of the
>> > autovacuum code change for a reason or another in the future.
>> >
>>
>> I thought exposing private structures could be a blocking issue. I
>> tried to see what could be done using hooks, and one thing I like is
>> that we can compute the process time of each relation, or even aggregate
>> some statistics. Having the vacuum time is something that we can
>> actually only obtain by setting log_autovacuum_min_duration and parsing
>> the logs, and I don't think it would be possible to do this by just
>> exposing current private structure.
>>
>
> We understood (IMHO is an interesting idea) but as Michael said hooks is
> for a general purpose. So can you demonstrate other use cases for this
> new hooks?
>

I can think of several usage. First, since the hook will always be
called, an extension will see all the activity a worker is doing when
exposing private structure will always be some kind of sampling. Then,
you can have other information that wouldn't be available just by
exposing private structure. For instance knowing a VACUUM isn't
performed by the worker (either because another worker is already
working on it or because it isn't needed anymore). IIRC there was a
discussion about concurrency issue in this case. We can also know if the
maintenance was cancelled due to lock not obtained fast enough.
Finally, as long as the hooks aren't use, they don't have any overhead.
I agree that all this is for monitoring purpose.

I'm not sure what are the fancy things that Michael had in mind with
exposing the private structure. Michael, was it something like having
the ability to change some of these data through an extension?

> Regards,
>
> --
> Fabrízio de Royes Mello
> Consultoria/Coaching PostgreSQL
>>> Timbira: http://www.timbira.com.br
>>> Blog: http://fabriziomello.github.io
>>> Linkedin: http://br.linkedin.com/in/fabriziomello
>>> Twitter: http://twitter.com/fabriziomello
>>> Github: http://github.com/fabriziomello

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>, <fabriziomello(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-01 19:29:29
Message-ID: 56D5ED99.9010501@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/1/16 8:37 AM, Julien Rouhaud wrote:
>> >
>> >We understood (IMHO is an interesting idea) but as Michael said hooks is
>> >for a general purpose. So can you demonstrate other use cases for this
>> >new hooks?
>> >
> I can think of several usage. First, since the hook will always be
> called, an extension will see all the activity a worker is doing when
> exposing private structure will always be some kind of sampling. Then,

I think that's pretty key. If you wanted to create an extension that
logs vacuums (which would be great, since current state of the art is
logs + pgBadger), you'd want to gather your data about what the vacuum
did as the vacuum was ending.

I can certainly see cases where you don't care about that and just want
what's in shared memory, but that would only be useful for monitoring
what's happening real-time, not for knowing what final results are.

BTW, I think as much of this as possible should also work for regular
vacuums.

> you can have other information that wouldn't be available just by
> exposing private structure. For instance knowing a VACUUM isn't
> performed by the worker (either because another worker is already
> working on it or because it isn't needed anymore). IIRC there was a
> discussion about concurrency issue in this case. We can also know if the
> maintenance was cancelled due to lock not obtained fast enough.
> Finally, as long as the hooks aren't use, they don't have any overhead.
> I agree that all this is for monitoring purpose.
>
> I'm not sure what are the fancy things that Michael had in mind with
> exposing the private structure. Michael, was it something like having
> the ability to change some of these data through an extension?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, fabriziomello(at)gmail(dot)com
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-01 21:02:03
Message-ID: 56D6034B.7030202@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/03/2016 20:29, Jim Nasby wrote:
> On 3/1/16 8:37 AM, Julien Rouhaud wrote:
>>> >
>>> >We understood (IMHO is an interesting idea) but as Michael said
>>> hooks is
>>> >for a general purpose. So can you demonstrate other use cases for this
>>> >new hooks?
>>> >
>> I can think of several usage. First, since the hook will always be
>> called, an extension will see all the activity a worker is doing when
>> exposing private structure will always be some kind of sampling. Then,
>
> I think that's pretty key. If you wanted to create an extension that
> logs vacuums (which would be great, since current state of the art is
> logs + pgBadger), you'd want to gather your data about what the vacuum
> did as the vacuum was ending.
>

Indeed these information are missing. I guess that'd be possible by
adding (or moving) a hook in lazy_vacuum_rel() that provide access to
part or all of the LVRelStats and rusage informations.

> I can certainly see cases where you don't care about that and just want
> what's in shared memory, but that would only be useful for monitoring
> what's happening real-time, not for knowing what final results are.
>
> BTW, I think as much of this as possible should also work for regular
> vacuums.
>

You mean for database wide vacuum?

>> you can have other information that wouldn't be available just by
>> exposing private structure. For instance knowing a VACUUM isn't
>> performed by the worker (either because another worker is already
>> working on it or because it isn't needed anymore). IIRC there was a
>> discussion about concurrency issue in this case. We can also know if the
>> maintenance was cancelled due to lock not obtained fast enough.
>> Finally, as long as the hooks aren't use, they don't have any overhead.
>> I agree that all this is for monitoring purpose.
>>
>> I'm not sure what are the fancy things that Michael had in mind with
>> exposing the private structure. Michael, was it something like having
>> the ability to change some of these data through an extension?

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>, <fabriziomello(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-01 21:30:37
Message-ID: 56D609FD.1040305@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/1/16 3:02 PM, Julien Rouhaud wrote:
> You mean for database wide vacuum?

I mean manual vacuum. Some hooks and stats would apply only to autovac
obviously (and it'd be nice to get visibility into the scheduling
decisions both daemons are making). But as much as possible things
should be done in vacuum.c/lazyvacuum.c so it works for manual vacuums
as well.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
Cc: Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-02 06:30:53
Message-ID: CAB7nPqT+edmThH=rSsK4uDHmCWgycVPqVe2K4TeoBf_w+AUgfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 1, 2016 at 11:37 PM, Julien Rouhaud wrote:
> I'm not sure what are the fancy things that Michael had in mind with
> exposing the private structure. Michael, was it something like having
> the ability to change some of these data through an extension?

I was referring to you here :)
I have witnessed already many fancy things coming out of brain, and I
have no doubt you could make something out of just a sampling of data.
Jokes apart, what I mean with fancy things here is putting the
sampling data in a shape that a user suits to him. It would be easy to
exploit that for example in a background worker that scans
periodically the shared memory, or have an extension that represents
this shared memory data into something that can be queried at SQL
level. Now, the main use case that I have with the data available in
shared memory is more or less:
- represent the current shared memory as SQL
- Scan this data, reshape it and report it elsewhere, say a background
worker printing out a file.

Now, take your set of hooks... There are 4 hooks here:
- autovacuum_list_tables_hook, to do something with the list of tables
a worker has collected, at the moment they have been collected.
- autovacuum_end_table_hook, to do something when knowing that a table
is skipped or cancelled
- autovacuum_begin_table_hook, to trigger something when a relation is
beginning to be processed.
- autovacuum_database_finished_hook, to trigger something once a
database is done with its processing.

The only use cases that I have in mind here for those hooks, which
would help in decision-making to tune autovacuum-related parameters
would be the following:
- Log entries regarding those operations, then why not introducing a
GUC parameter that is an on/off switch, like log_autovacuum (this is
not a per-relation parameter), defaulting to off. Those could be used
by pgbadger.
- Have system statistics with a new system relation like
pg_stat_autovacuum, and make this information available to user.
Are there other things that you think could make use those hooks? Your
extension just does pg_stat_autovacuum and emulates the existing
pg_stat_* facility when gathering information about the global
autovacuum statistics. So it seems to me that those hooks are not that
necessary, and that this may help in tuning a system, particularly the
number of relations skipped would be interesting to have.

The stats could have a delay, the point being to have hints on how
autovacuum workers are sorting things out. In short, I am doubtful
about the need of hooks in those code paths, the thing that we should
try to do instead is to improve native solutions to give user more
information regarding how autovacuum works, which help in tuning it.
--
Michael


From: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-02 16:48:45
Message-ID: 56D7196D.1060504@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/03/2016 07:30, Michael Paquier wrote:
> On Tue, Mar 1, 2016 at 11:37 PM, Julien Rouhaud wrote:
>> I'm not sure what are the fancy things that Michael had in mind with
>> exposing the private structure. Michael, was it something like having
>> the ability to change some of these data through an extension?
>
> I was referring to you here :)
> I have witnessed already many fancy things coming out of brain, and I
> have no doubt you could make something out of just a sampling of data.

:D

> Jokes apart, what I mean with fancy things here is putting the
> sampling data in a shape that a user suits to him. It would be easy to
> exploit that for example in a background worker that scans
> periodically the shared memory, or have an extension that represents
> this shared memory data into something that can be queried at SQL
> level. Now, the main use case that I have with the data available in
> shared memory is more or less:
> - represent the current shared memory as SQL
> - Scan this data, reshape it and report it elsewhere, say a background
> worker printing out a file.
>
> Now, take your set of hooks... There are 4 hooks here:
> - autovacuum_list_tables_hook, to do something with the list of tables
> a worker has collected, at the moment they have been collected.
> - autovacuum_end_table_hook, to do something when knowing that a table
> is skipped or cancelled
> - autovacuum_begin_table_hook, to trigger something when a relation is
> beginning to be processed.
> - autovacuum_database_finished_hook, to trigger something once a
> database is done with its processing.
>
> The only use cases that I have in mind here for those hooks, which
> would help in decision-making to tune autovacuum-related parameters
> would be the following:
> - Log entries regarding those operations, then why not introducing a
> GUC parameter that is an on/off switch, like log_autovacuum (this is
> not a per-relation parameter), defaulting to off. Those could be used
> by pgbadger.

This would be nice, but the point if this proposal is to be able to have
this available at SQL level. (but big +1 on the feature)

> - Have system statistics with a new system relation like
> pg_stat_autovacuum, and make this information available to user.
> Are there other things that you think could make use those hooks? Your
> extension just does pg_stat_autovacuum and emulates the existing
> pg_stat_* facility when gathering information about the global
> autovacuum statistics. So it seems to me that those hooks are not that
> necessary, and that this may help in tuning a system, particularly the
> number of relations skipped would be interesting to have.
>
> The stats could have a delay, the point being to have hints on how
> autovacuum workers are sorting things out. In short, I am doubtful
> about the need of hooks in those code paths, the thing that we should
> try to do instead is to improve native solutions to give user more
> information regarding how autovacuum works, which help in tuning it.
>

Good point, I don't see a lot of information available with this hooks
that a native system statistics couldn't offer. To have the same amount
of information, I think we'd need a pg_stat_autovacuum view that shows a
realtime insight of the workers, and also add some aggregated counters
to PgStat_StatTabEntry. I wonder if adding counters to
PgStat_StatTabEntry would be accepted though.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-02 23:48:06
Message-ID: 56D77BB6.6080606@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/2/16 10:48 AM, Julien Rouhaud wrote:
> Good point, I don't see a lot of information available with this hooks
> that a native system statistics couldn't offer. To have the same amount
> of information, I think we'd need a pg_stat_autovacuum view that shows a
> realtime insight of the workers, and also add some aggregated counters
> to PgStat_StatTabEntry. I wonder if adding counters to
> PgStat_StatTabEntry would be accepted though.

I would also really like to see a means of logging (auto)vacuum activity
in the database itself. We figured out how to do that with
pg_stat_statements, which was a lot harder... it seems kinda silly not
to offer that for vacuum. Hooks plus shared memory data should allow for
that (the only tricky bit is the hook would need to start and then
commit a transaction, but that doesn't seem onerous).

I think the shared memory structures should be done as well. Having that
real-time info is also valuable.

I don't see too much point in adding stuff to the stats system for this.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: Jim(dot)Nasby(at)BlueTreble(dot)com
Cc: julien(dot)rouhaud(at)dalibo(dot)com, michael(dot)paquier(at)gmail(dot)com, fabriziomello(at)gmail(dot)com, guillaume(at)lelarge(dot)info, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Publish autovacuum informations
Date: 2016-03-03 09:54:16
Message-ID: 20160303.185416.153711032.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

At Wed, 2 Mar 2016 17:48:06 -0600, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> wrote in <56D77BB6(dot)6080606(at)BlueTreble(dot)com>
> On 3/2/16 10:48 AM, Julien Rouhaud wrote:
> > Good point, I don't see a lot of information available with this hooks
> > that a native system statistics couldn't offer. To have the same
> > amount
> > of information, I think we'd need a pg_stat_autovacuum view that shows
> > a
> > realtime insight of the workers, and also add some aggregated counters
> > to PgStat_StatTabEntry. I wonder if adding counters to
> > PgStat_StatTabEntry would be accepted though.
>
> I would also really like to see a means of logging (auto)vacuum
> activity in the database itself. We figured out how to do that with
> pg_stat_statements, which was a lot harder... it seems kinda silly not
> to offer that for vacuum. Hooks plus shared memory data should allow
> for that (the only tricky bit is the hook would need to start and then
> commit a transaction, but that doesn't seem onerous).
>
> I think the shared memory structures should be done as well. Having
> that real-time info is also valuable.
>
> I don't see too much point in adding stuff to the stats system for
> this.

I wonder why there haven't been discussions so far on what kind
of information we want by this feature. For example I'd be happy
to see the time of last autovacuum trial and the cause if it has
been skipped for every table. Such information would (maybe)
naturally be shown in pg_stat_*_tables.

=====
=# select relid, last_completed_autovacuum, last_completed_autovacv_status, last_autovacuum_trial, last_autovacuum_result from pg_stat_user_tables;
-[ RECORD 1 ]-----------------+------
relid | 16390
last_completed_autovacuum | 2016-03-01 01:25:00.349074+09
last_completed_autovac_status | Completed in 4 seconds. Scanned 434 pages, skipped 23 pages
last_autovacuum_trial | 2016-03-03 17:33:04.004322+09
last_autovac_traial_status | Canceled by PID 2355. Processed 144/553 pages.
-[ RECORD 2 ]----------+------
...
last_autovacuum_trial | 2016-03-03 07:25:00.349074+09
last_autovac_traial_status | Completed in 4 seconds. Scanned 434 pages, skipped 23 pages
-[ RECORD 3 ]----------+------
...
last_autovacuum_trial | 2016-03-03 17:59:12.324454+09
last_autovac_trial_status | Processing by PID 42334, 564 / 32526 pages done.
-[ RECORD 4 ]----------+------
...
last_autovacuum_trial | 2016-03-03 17:59:12.324454+09
last_autovac_trial_status | Skipped by dead-tuple threashold.
=====

Apart from the appropriateness of the concrete shape, it would be
done by extending the current stats system and needs modification
of some other parts but the hooks and WorkerInfoData is not
enough. This might be a business of Rahila's "VACUUM Progress
Checker" and it convers some real-time info.

https://commitfest.postgresql.org/9/545/

On the other hand, it would be in another place and needs another
method if we want a history like the current autovacuum
completion logs (at debug3..) of 100 latest invocation of
autovacuum worker. Anyway the WorkerInfoData is not enough.

What kind of information we (will) want to have?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Jim(dot)Nasby(at)BlueTreble(dot)com
Cc: michael(dot)paquier(at)gmail(dot)com, fabriziomello(at)gmail(dot)com, guillaume(at)lelarge(dot)info, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Publish autovacuum informations
Date: 2016-03-04 21:52:03
Message-ID: 56DA0383.2090108@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/03/2016 10:54, Kyotaro HORIGUCHI wrote:
> Hello,
>
> At Wed, 2 Mar 2016 17:48:06 -0600, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> wrote in <56D77BB6(dot)6080606(at)BlueTreble(dot)com>
>> On 3/2/16 10:48 AM, Julien Rouhaud wrote:
>>> Good point, I don't see a lot of information available with this hooks
>>> that a native system statistics couldn't offer. To have the same
>>> amount
>>> of information, I think we'd need a pg_stat_autovacuum view that shows
>>> a
>>> realtime insight of the workers, and also add some aggregated counters
>>> to PgStat_StatTabEntry. I wonder if adding counters to
>>> PgStat_StatTabEntry would be accepted though.
>>
>> I would also really like to see a means of logging (auto)vacuum
>> activity in the database itself. We figured out how to do that with
>> pg_stat_statements, which was a lot harder... it seems kinda silly not
>> to offer that for vacuum. Hooks plus shared memory data should allow
>> for that (the only tricky bit is the hook would need to start and then
>> commit a transaction, but that doesn't seem onerous).
>>
>> I think the shared memory structures should be done as well. Having
>> that real-time info is also valuable.
>>
>> I don't see too much point in adding stuff to the stats system for
>> this.
>
> I wonder why there haven't been discussions so far on what kind
> of information we want by this feature. For example I'd be happy
> to see the time of last autovacuum trial and the cause if it has
> been skipped for every table. Such information would (maybe)
> naturally be shown in pg_stat_*_tables.
>
> =====
> =# select relid, last_completed_autovacuum, last_completed_autovacv_status, last_autovacuum_trial, last_autovacuum_result from pg_stat_user_tables;
> -[ RECORD 1 ]-----------------+------
> relid | 16390
> last_completed_autovacuum | 2016-03-01 01:25:00.349074+09
> last_completed_autovac_status | Completed in 4 seconds. Scanned 434 pages, skipped 23 pages
> last_autovacuum_trial | 2016-03-03 17:33:04.004322+09
> last_autovac_traial_status | Canceled by PID 2355. Processed 144/553 pages.
> -[ RECORD 2 ]----------+------
> ...
> last_autovacuum_trial | 2016-03-03 07:25:00.349074+09
> last_autovac_traial_status | Completed in 4 seconds. Scanned 434 pages, skipped 23 pages
> -[ RECORD 3 ]----------+------
> ...
> last_autovacuum_trial | 2016-03-03 17:59:12.324454+09
> last_autovac_trial_status | Processing by PID 42334, 564 / 32526 pages done.
> -[ RECORD 4 ]----------+------
> ...
> last_autovacuum_trial | 2016-03-03 17:59:12.324454+09
> last_autovac_trial_status | Skipped by dead-tuple threashold.
> =====
>
> Apart from the appropriateness of the concrete shape, it would be
> done by extending the current stats system and needs modification
> of some other parts but the hooks and WorkerInfoData is not
> enough. This might be a business of Rahila's "VACUUM Progress
> Checker" and it convers some real-time info.
>
> https://commitfest.postgresql.org/9/545/
>
> On the other hand, it would be in another place and needs another
> method if we want a history like the current autovacuum
> completion logs (at debug3..) of 100 latest invocation of
> autovacuum worker. Anyway the WorkerInfoData is not enough.
>
>
> What kind of information we (will) want to have?
>

Very good suggestion.

I think the most productive way to work on this is to start a wiki page
to summarize what's the available information, what we should store and
how to represent it.

I'll update this thread as soon as I'll have a first draft finished.

>
> regards,
>

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Jim(dot)Nasby(at)bluetreble(dot)com, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, guillaume(at)lelarge(dot)info, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-04 22:34:24
Message-ID: CAB7nPqS+QT8SRv0WTF1Zxtfa3nb9_pcrv3Pkun9wGGkMARjBVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Mar 5, 2016 at 6:52 AM, Julien Rouhaud
<julien(dot)rouhaud(at)dalibo(dot)com> wrote:
> Very good suggestion.
>
> I think the most productive way to work on this is to start a wiki page
> to summarize what's the available information, what we should store and
> how to represent it.
>
> I'll update this thread as soon as I'll have a first draft finished.

New design discussions are a little bit late for 9.6 I am afraid :(
Perhaps we should consider this patch as returned with feedback for
the time being? The hook approach is not something I'd wish for if we
can improve in-core facility that would help user to decide better how
to tune autovacuum parameters. The VACUUM progress facility covers a
different need by helping to track how long a scan is still going to
take. What we want here is something that would run on top of that.
Logs at least may be helpful for things like pgbadger.
--
Michael


From: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Jim(dot)Nasby(at)bluetreble(dot)com, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, guillaume(at)lelarge(dot)info, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-05 00:21:13
Message-ID: 56DA2679.60402@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 04/03/2016 23:34, Michael Paquier wrote:
> On Sat, Mar 5, 2016 at 6:52 AM, Julien Rouhaud
> <julien(dot)rouhaud(at)dalibo(dot)com> wrote:
>> Very good suggestion.
>>
>> I think the most productive way to work on this is to start a wiki page
>> to summarize what's the available information, what we should store and
>> how to represent it.
>>
>> I'll update this thread as soon as I'll have a first draft finished.
>
> New design discussions are a little bit late for 9.6 I am afraid :(
> Perhaps we should consider this patch as returned with feedback for
> the time being? The hook approach is not something I'd wish for if we
> can improve in-core facility that would help user to decide better how
> to tune autovacuum parameters.

Yes, it's clearly not suited for the final commitfest. I just closed the
patch as "returned with feedback".

I'll work on the feedbacks I already had to document a wiki page, and
wait for this commitfest to be more or less finished before starting a
new thread on autovacuum instrumentation design.

> The VACUUM progress facility covers a
> different need by helping to track how long a scan is still going to
> take. What we want here is something that would run on top of that.
> Logs at least may be helpful for things like pgbadger.
>

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Jim(dot)Nasby(at)bluetreble(dot)com, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, guillaume(at)lelarge(dot)info, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-05 00:32:04
Message-ID: CAB7nPqRQrw7ms02x-eCknKfYcOLOX2JvHd7h=XRWg==9f3_z5w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Mar 5, 2016 at 9:21 AM, Julien Rouhaud
<julien(dot)rouhaud(at)dalibo(dot)com> wrote:
> On 04/03/2016 23:34, Michael Paquier wrote:
>> New design discussions are a little bit late for 9.6 I am afraid :(
>> Perhaps we should consider this patch as returned with feedback for
>> the time being? The hook approach is not something I'd wish for if we
>> can improve in-core facility that would help user to decide better how
>> to tune autovacuum parameters.
>
> Yes, it's clearly not suited for the final commitfest. I just closed the
> patch as "returned with feedback".
>
> I'll work on the feedbacks I already had to document a wiki page, and
> wait for this commitfest to be more or less finished before starting a
> new thread on autovacuum instrumentation design.

OK, thanks.
--
Michael


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: <julien(dot)rouhaud(at)dalibo(dot)com>, <michael(dot)paquier(at)gmail(dot)com>, <fabriziomello(at)gmail(dot)com>, <guillaume(at)lelarge(dot)info>, <robertmhaas(at)gmail(dot)com>, <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-19 00:11:56
Message-ID: 56EC994C.9040001@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/3/16 3:54 AM, Kyotaro HORIGUCHI wrote:
> I wonder why there haven't been discussions so far on what kind
> of information we want by this feature. For example I'd be happy
> to see the time of last autovacuum trial and the cause if it has
> been skipped for every table. Such information would (maybe)
> naturally be shown in pg_stat_*_tables.
>
> =====
> =# select relid, last_completed_autovacuum, last_completed_autovacv_status, last_autovacuum_trial, last_autovacuum_result from pg_stat_user_tables;
> -[ RECORD 1 ]-----------------+------
> relid | 16390
> last_completed_autovacuum | 2016-03-01 01:25:00.349074+09
> last_completed_autovac_status | Completed in 4 seconds. Scanned 434 pages, skipped 23 pages
> last_autovacuum_trial | 2016-03-03 17:33:04.004322+09
> last_autovac_traial_status | Canceled by PID 2355. Processed 144/553 pages.
> -[ RECORD 2 ]----------+------
> ...
> last_autovacuum_trial | 2016-03-03 07:25:00.349074+09
> last_autovac_traial_status | Completed in 4 seconds. Scanned 434 pages, skipped 23 pages
> -[ RECORD 3 ]----------+------
> ...
> last_autovacuum_trial | 2016-03-03 17:59:12.324454+09
> last_autovac_trial_status | Processing by PID 42334, 564 / 32526 pages done.
> -[ RECORD 4 ]----------+------
> ...
> last_autovacuum_trial | 2016-03-03 17:59:12.324454+09
> last_autovac_trial_status | Skipped by dead-tuple threashold.
> =====

I kinda like where you're going here, but I certainly don't think the
stats system is the way to do it. Stats bloat is already a problem on
bigger systems. More important, I don't think having just the last
result is very useful. If you've got a vacuum problem, you want to see
history, especially history of the vacuum runs themselves.

The good news is that vacuum is a very low-frequency operation, so it
has none of the concerns that the generic stats system does. I think it
would be reasonable to provide event triggers that fire on every
launcher loop, after a worker has built it's "TODO list", and after
every (auto)vacuum.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: michael(dot)paquier(at)gmail(dot)com, fabriziomello(at)gmail(dot)com, guillaume(at)lelarge(dot)info, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Publish autovacuum informations
Date: 2016-03-30 21:09:33
Message-ID: 56FC408D.9070708@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 19/03/2016 01:11, Jim Nasby wrote:
> On 3/3/16 3:54 AM, Kyotaro HORIGUCHI wrote:
>> I wonder why there haven't been discussions so far on what kind
>> of information we want by this feature. For example I'd be happy
>> to see the time of last autovacuum trial and the cause if it has
>> been skipped for every table. Such information would (maybe)
>> naturally be shown in pg_stat_*_tables.
>>
>> =====
>> =# select relid, last_completed_autovacuum,
>> last_completed_autovacv_status, last_autovacuum_trial,
>> last_autovacuum_result from pg_stat_user_tables;
>> -[ RECORD 1 ]-----------------+------
>> relid | 16390
>> last_completed_autovacuum | 2016-03-01 01:25:00.349074+09
>> last_completed_autovac_status | Completed in 4 seconds. Scanned 434
>> pages, skipped 23 pages
>> last_autovacuum_trial | 2016-03-03 17:33:04.004322+09
>> last_autovac_traial_status | Canceled by PID 2355. Processed
>> 144/553 pages.
>> -[ RECORD 2 ]----------+------
>> ...
>> last_autovacuum_trial | 2016-03-03 07:25:00.349074+09
>> last_autovac_traial_status | Completed in 4 seconds. Scanned 434
>> pages, skipped 23 pages
>> -[ RECORD 3 ]----------+------
>> ...
>> last_autovacuum_trial | 2016-03-03 17:59:12.324454+09
>> last_autovac_trial_status | Processing by PID 42334, 564 / 32526
>> pages done.
>> -[ RECORD 4 ]----------+------
>> ...
>> last_autovacuum_trial | 2016-03-03 17:59:12.324454+09
>> last_autovac_trial_status | Skipped by dead-tuple threashold.
>> =====
>
> I kinda like where you're going here, but I certainly don't think the
> stats system is the way to do it. Stats bloat is already a problem on
> bigger systems. More important, I don't think having just the last
> result is very useful. If you've got a vacuum problem, you want to see
> history, especially history of the vacuum runs themselves.
>
> The good news is that vacuum is a very low-frequency operation, so it
> has none of the concerns that the generic stats system does. I think it
> would be reasonable to provide event triggers that fire on every
> launcher loop, after a worker has built it's "TODO list", and after
> every (auto)vacuum.

The main issue I see with an event trigger based solution is that you'll
always have to create them and the needed objects on every database.

Another issue is that both of these approach are not intended to give a
global overview but per-database statistics. I'd prefer a global overview.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, guillaume(at)lelarge(dot)info, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish autovacuum informations
Date: 2016-03-31 00:42:14
Message-ID: CAB7nPqRJupX1-de=0-BMq_v0DXUxOdHJdnw+T5vdgEvU+3o-bg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 31, 2016 at 6:09 AM, Julien Rouhaud
<julien(dot)rouhaud(at)dalibo(dot)com> wrote:
> On 19/03/2016 01:11, Jim Nasby wrote:
>> On 3/3/16 3:54 AM, Kyotaro HORIGUCHI wrote:
>>> I wonder why there haven't been discussions so far on what kind
>>> of information we want by this feature. For example I'd be happy
>>> to see the time of last autovacuum trial and the cause if it has
>>> been skipped for every table. Such information would (maybe)
>>> naturally be shown in pg_stat_*_tables.
>>>
>>> =====
>>> =# select relid, last_completed_autovacuum,
>>> last_completed_autovacv_status, last_autovacuum_trial,
>>> last_autovacuum_result from pg_stat_user_tables;
>>> -[ RECORD 1 ]-----------------+------
>>> relid | 16390
>>> last_completed_autovacuum | 2016-03-01 01:25:00.349074+09
>>> last_completed_autovac_status | Completed in 4 seconds. Scanned 434
>>> pages, skipped 23 pages
>>> last_autovacuum_trial | 2016-03-03 17:33:04.004322+09
>>> last_autovac_traial_status | Canceled by PID 2355. Processed
>>> 144/553 pages.
>>> -[ RECORD 2 ]----------+------
>>> ...
>>> last_autovacuum_trial | 2016-03-03 07:25:00.349074+09
>>> last_autovac_traial_status | Completed in 4 seconds. Scanned 434
>>> pages, skipped 23 pages
>>> -[ RECORD 3 ]----------+------
>>> ...
>>> last_autovacuum_trial | 2016-03-03 17:59:12.324454+09
>>> last_autovac_trial_status | Processing by PID 42334, 564 / 32526
>>> pages done.
>>> -[ RECORD 4 ]----------+------
>>> ...
>>> last_autovacuum_trial | 2016-03-03 17:59:12.324454+09
>>> last_autovac_trial_status | Skipped by dead-tuple threashold.
>>> =====
>>
>> I kinda like where you're going here, but I certainly don't think the
>> stats system is the way to do it. Stats bloat is already a problem on
>> bigger systems. More important, I don't think having just the last
>> result is very useful. If you've got a vacuum problem, you want to see
>> history, especially history of the vacuum runs themselves.
>>
>> The good news is that vacuum is a very low-frequency operation, so it
>> has none of the concerns that the generic stats system does. I think it
>> would be reasonable to provide event triggers that fire on every
>> launcher loop, after a worker has built it's "TODO list", and after
>> every (auto)vacuum.
>
> The main issue I see with an event trigger based solution is that you'll
> always have to create them and the needed objects on every database.

Which has surely a performance impact as those are row-based. I have
seen complains regarding the fact that those objects can be easily
forgotten...

> Another issue is that both of these approach are not intended to give a
> global overview but per-database statistics. I'd prefer a global overview.

That's important, autovacuum GUC parameters, like the number of
workers, are system-wide.
--
Michael