Review: Display number of changed rows since last analyze

Lists: pgsql-hackers
From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Review: Display number of changed rows since last analyze
Date: 2013-06-17 11:49:02
Message-ID: A737B7A37273E048B164557ADEF4A58B17BB36DC@ntex2010a.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

This is a review of the patch in 5192D7D2(dot)8020605(at)catalyst(dot)net(dot)nz

The patch applies cleanly (with the exception of catversion.h of course),
compiles without warnings and passes the regression tests.

It contains enough documentation, though I'd prefer
"Estimated number of rows modified since the table was last analyzed"
to
"Estimated number of row changes (inserts + updates + deletes) since the last analyze"

The patch works as it should, and I think that this is a
useful addition. It only exposes a value that is already
available internally, so there shouldn't be any penalties.

I think that the column name is ok as it is, even if it
is a bit long - I cannot come up with a more succinct
idea. Perhaps "n_changed_since_analyze" could be shortened
to "n_mod_since_analyze", but that's not much of an improvement.

This is a very simple change, and I'll mark this patch "ready for committer".

Yours,
Laurenz Albe


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Review: Display number of changed rows since last analyze
Date: 2013-07-01 11:43:33
Message-ID: CABUevExhWPJMv_PAAH8nPRyS427DviT0pfRYpjWHZ885g=kZYg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 17, 2013 at 1:49 PM, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
> This is a review of the patch in 5192D7D2(dot)8020605(at)catalyst(dot)net(dot)nz
>
> The patch applies cleanly (with the exception of catversion.h of course),
> compiles without warnings and passes the regression tests.
>
> It contains enough documentation, though I'd prefer
> "Estimated number of rows modified since the table was last analyzed"
> to
> "Estimated number of row changes (inserts + updates + deletes) since the last analyze"
>
> The patch works as it should, and I think that this is a
> useful addition. It only exposes a value that is already
> available internally, so there shouldn't be any penalties.
>
> I think that the column name is ok as it is, even if it
> is a bit long - I cannot come up with a more succinct
> idea. Perhaps "n_changed_since_analyze" could be shortened
> to "n_mod_since_analyze", but that's not much of an improvement.

AFAICT it's related to "n_live_tup", and "n_dead_tup". How about just
"n_mod_tup"? Though that doesn't convey that it's since the last
analyze, I guess.

But given that both n_dead_tup and n_live_tup don't really indicate
that they're not "since the beginning of stats" in the name (which
other stats counters are), I'm not sure that's a problem? It would be
a name that sounds more similar to the rest of the table.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Magnus Hagander *EXTERN*" <magnus(at)hagander(dot)net>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Review: Display number of changed rows since last analyze
Date: 2013-07-01 12:48:03
Message-ID: A737B7A37273E048B164557ADEF4A58B17BC290B@ntex2010a.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus Hagander wrote:
> On Mon, Jun 17, 2013 at 1:49 PM, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
>> This is a review of the patch in 5192D7D2(dot)8020605(at)catalyst(dot)net(dot)nz
>>
>> The patch applies cleanly (with the exception of catversion.h of course),
>> compiles without warnings and passes the regression tests.
>>
>> It contains enough documentation, though I'd prefer
>> "Estimated number of rows modified since the table was last analyzed"
>> to
>> "Estimated number of row changes (inserts + updates + deletes) since the last analyze"
>>
>> The patch works as it should, and I think that this is a
>> useful addition. It only exposes a value that is already
>> available internally, so there shouldn't be any penalties.
>>
>> I think that the column name is ok as it is, even if it
>> is a bit long - I cannot come up with a more succinct
>> idea. Perhaps "n_changed_since_analyze" could be shortened
>> to "n_mod_since_analyze", but that's not much of an improvement.
>
> AFAICT it's related to "n_live_tup", and "n_dead_tup". How about just
> "n_mod_tup"? Though that doesn't convey that it's since the last
> analyze, I guess.
>
> But given that both n_dead_tup and n_live_tup don't really indicate
> that they're not "since the beginning of stats" in the name (which
> other stats counters are), I'm not sure that's a problem? It would be
> a name that sounds more similar to the rest of the table.

I don't get that.

As far as I know, n_dead_tup and n_live_tup are estimates for
the total number of live and dead tuples, period.

Both numbers are not reset to zero when ANALYZE (or even VACUUM)
takes place.

Yours,
Laurenz Albe


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Review: Display number of changed rows since last analyze
Date: 2013-07-01 12:51:11
Message-ID: CABUevExfvVq1t8w7M-ZaPxCB8kPdYTxqSU-92+FsoCbF8iybzQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jul 1, 2013 at 2:48 PM, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
> Magnus Hagander wrote:
>> On Mon, Jun 17, 2013 at 1:49 PM, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
>>> This is a review of the patch in 5192D7D2(dot)8020605(at)catalyst(dot)net(dot)nz
>>>
>>> The patch applies cleanly (with the exception of catversion.h of course),
>>> compiles without warnings and passes the regression tests.
>>>
>>> It contains enough documentation, though I'd prefer
>>> "Estimated number of rows modified since the table was last analyzed"
>>> to
>>> "Estimated number of row changes (inserts + updates + deletes) since the last analyze"
>>>
>>> The patch works as it should, and I think that this is a
>>> useful addition. It only exposes a value that is already
>>> available internally, so there shouldn't be any penalties.
>>>
>>> I think that the column name is ok as it is, even if it
>>> is a bit long - I cannot come up with a more succinct
>>> idea. Perhaps "n_changed_since_analyze" could be shortened
>>> to "n_mod_since_analyze", but that's not much of an improvement.
>>
>> AFAICT it's related to "n_live_tup", and "n_dead_tup". How about just
>> "n_mod_tup"? Though that doesn't convey that it's since the last
>> analyze, I guess.
>>
>> But given that both n_dead_tup and n_live_tup don't really indicate
>> that they're not "since the beginning of stats" in the name (which
>> other stats counters are), I'm not sure that's a problem? It would be
>> a name that sounds more similar to the rest of the table.
>
> I don't get that.
>
> As far as I know, n_dead_tup and n_live_tup are estimates for
> the total number of live and dead tuples, period.
>
> Both numbers are not reset to zero when ANALYZE (or even VACUUM)
> takes place.

No, but they are zero *until* vacuum runs.

The point I was trying to make was that they are showing an absolute
number. Unlike for example n_tup_inserted and friends which show the
total number of <event> since stat reset.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Magnus Hagander *EXTERN*" <magnus(at)hagander(dot)net>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Review: Display number of changed rows since last analyze
Date: 2013-07-01 13:15:44
Message-ID: A737B7A37273E048B164557ADEF4A58B17BC2938@ntex2010a.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus Hagander wrote:
>>> On Mon, Jun 17, 2013 at 1:49 PM, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
>>>> I think that the column name is ok as it is, even if it
>>>> is a bit long - I cannot come up with a more succinct
>>>> idea. Perhaps "n_changed_since_analyze" could be shortened
>>>> to "n_mod_since_analyze", but that's not much of an improvement.
>>>
>>> AFAICT it's related to "n_live_tup", and "n_dead_tup". How about just
>>> "n_mod_tup"? Though that doesn't convey that it's since the last
>>> analyze, I guess.
>>>
>>> But given that both n_dead_tup and n_live_tup don't really indicate
>>> that they're not "since the beginning of stats" in the name (which
>>> other stats counters are), I'm not sure that's a problem? It would be
>>> a name that sounds more similar to the rest of the table.
>>
>> I don't get that.
>>
>> As far as I know, n_dead_tup and n_live_tup are estimates for
>> the total number of live and dead tuples, period.
>>
>> Both numbers are not reset to zero when ANALYZE (or even VACUUM)
>> takes place.
>
> No, but they are zero *until* vacuum runs.
>
> The point I was trying to make was that they are showing an absolute
> number. Unlike for example n_tup_inserted and friends which show the
> total number of <event> since stat reset.

Ok, I understand you now.

All the old names are fairly intuitive in my opinion.

"Number of life tuples since the statistics were reset" doesn't make
a lot of sense to me, so I would automatically read that as an absolute
number.

But it would not be clear to me that "n_mod_tuples" are counted
since the last ANALYZE (different from other columns); I would
jump to the conclusion that it is a sum of n_tup_ins, n_tup_upd
and n_tup_del.

So I think that a name that it less likely to cause confusion
would be better that a short, but misleading name.

Yours,
Laurenz Albe


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Review: Display number of changed rows since last analyze
Date: 2013-07-05 13:13:15
Message-ID: CABUevEw8B7MuW7LD9GGs_0_EkcTW4=F_F+=ZUb_n0D9qhTcy6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jul 1, 2013 at 3:15 PM, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
> Magnus Hagander wrote:
>>>> On Mon, Jun 17, 2013 at 1:49 PM, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
>>>>> I think that the column name is ok as it is, even if it
>>>>> is a bit long - I cannot come up with a more succinct
>>>>> idea. Perhaps "n_changed_since_analyze" could be shortened
>>>>> to "n_mod_since_analyze", but that's not much of an improvement.
>>>>
>>>> AFAICT it's related to "n_live_tup", and "n_dead_tup". How about just
>>>> "n_mod_tup"? Though that doesn't convey that it's since the last
>>>> analyze, I guess.
>>>>
>>>> But given that both n_dead_tup and n_live_tup don't really indicate
>>>> that they're not "since the beginning of stats" in the name (which
>>>> other stats counters are), I'm not sure that's a problem? It would be
>>>> a name that sounds more similar to the rest of the table.
>>>
>>> I don't get that.
>>>
>>> As far as I know, n_dead_tup and n_live_tup are estimates for
>>> the total number of live and dead tuples, period.
>>>
>>> Both numbers are not reset to zero when ANALYZE (or even VACUUM)
>>> takes place.
>>
>> No, but they are zero *until* vacuum runs.
>>
>> The point I was trying to make was that they are showing an absolute
>> number. Unlike for example n_tup_inserted and friends which show the
>> total number of <event> since stat reset.
>
> Ok, I understand you now.
>
> All the old names are fairly intuitive in my opinion.
>
> "Number of life tuples since the statistics were reset" doesn't make
> a lot of sense to me, so I would automatically read that as an absolute
> number.
>
> But it would not be clear to me that "n_mod_tuples" are counted
> since the last ANALYZE (different from other columns); I would
> jump to the conclusion that it is a sum of n_tup_ins, n_tup_upd
> and n_tup_del.
>
> So I think that a name that it less likely to cause confusion
> would be better that a short, but misleading name.

Yeah, you're probably right. Applied with your suggested name, and
some further minor tweaking on the wording in the docs.

Thanks!

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/