Re: Postgresql Materialized views

Lists: pgsql-hackers
From: Jean-Michel Pouré <jm(at)poure(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Postgresql Materialized views
Date: 2008-01-12 12:27:34
Message-ID: 1200140854.4905.9.camel@debian
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dear Friends,

In my past development projects, I always used the concept of
Materialized VIEW to speed-up SELECTs over INSERTs.

You are well aware of Jonathan Gardner preliminary work:
http://www.varlena.com/varlena/GeneralBits/Tidbits/matviews.html

When do you plan to add MATERIALIZED VIEWS to PostgreSQL?
This would be major improvement ni the case of Web applications.

I run a 400.000+ message board using PhpBB 3.0. After optimization, some
queries still need 30 millisecond to run. With Materialized views, it
should be possible to drive these queries to 1 millisecond. This means
that in some situations a PostgreSQL backend could handle 10 times more
queries.

My database handles 10 to 20 queries every second. There are 100 selects
for 1 INSERT. But my database could well handle over 500 queries a
second using materialized views.

At my level, here are my plans:

1) Publish some long query LOGs from my database, longuer than 30
milliseconds.

2) Write some PL code to demonstrate the interest in Materialized Views.
Publish benckmarks showing time improvement, like 1 milisecond. 30x
faster.

3) Then wait for someone on Hackers mailing list to pick-up this
important issue and integrate Materialized views in PostgreSQL schema
and SQL language.

Any information and discussion about materialized views is welcome.

Please pick-up this important issue for developpers. There is no need to
concentrate on complex issues, when handling materialized views could
boost somme web apps. by a factor of 10 or more.

Kind regards and happy new year.
I hope that 2008 will be the year of materialized views.

Jean-Michel Pouré


From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Jean-Michel Pouré <jm(at)poure(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-12 15:26:49
Message-ID: 4788DC39.3030303@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jean-Michel Pouré wrote:
> In my past development projects, I always used the concept of
> Materialized VIEW to speed-up SELECTs over INSERTs

Unless you are going to *pay* for it - you do realize that the best way
to get it implemented, would be to open up the source code, and give it
a try yourself?

If it was so easy, and such a clear win, I think one of the very
competent people using PostgreSQL today would have already done it?

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>


From: "Dave Page" <dpage(at)postgresql(dot)org>
To: "Mark Mielke" <mark(at)mark(dot)mielke(dot)cc>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-12 18:29:05
Message-ID: 937d27e10801121029n10fd242gc5a7089e8905b0dc@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/01/2008, Mark Mielke <mark(at)mark(dot)mielke(dot)cc> wrote:
> Jean-Michel Pouré wrote:
> > In my past development projects, I always used the concept of
> > Materialized VIEW to speed-up SELECTs over INSERTs
>
> Unless you are going to *pay* for it - you do realize that the best way
> to get it implemented, would be to open up the source code, and give it
> a try yourself?

In fairness to Jean-Michel, he has spent hundreds of hours in the past
doing just that and far more for the pgAdmin users in the community -
I'm sure we can excuse him for asking for what many do think would be
a useful feature in the hopes that someone listening might just decide
to pick it up.

In the meantime though - have you tried rolling your own materialised
views with some triggers Jean-Michel? I have good results doing that
in the past.

Regards, Dave


From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Dave Page <dpage(at)postgresql(dot)org>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-12 18:50:15
Message-ID: 47890BE7.2050108@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dave Page wrote:
> On 12/01/2008, Mark Mielke <mark(at)mark(dot)mielke(dot)cc> wrote:
>
>> Jean-Michel Pouré wrote:
>>
>>> In my past development projects, I always used the concept of
>>> Materialized VIEW to speed-up SELECTs over INSERTs
>>>
>> Unless you are going to *pay* for it - you do realize that the best way
>> to get it implemented, would be to open up the source code, and give it
>> a try yourself?
>>
>
> In fairness to Jean-Michel, he has spent hundreds of hours in the past
> doing just that and far more for the pgAdmin users in the community -
> I'm sure we can excuse him for asking for what many do think would be
> a useful feature in the hopes that someone listening might just decide
> to pick it up.
>
> In the meantime though - have you tried rolling your own materialised
> views with some triggers Jean-Michel? I have good results doing that
> in the past
I'm not good with names - I suppose Jean-Michel should be asking his
benefactors to return the favour then? :-)

In my own case - I use a combination of triggers and application to
maintain materialized views - but the subject does seem complex to me.

The last two uses of materialized views I used:

Counts, because as we all know, PostgreSQL count(*) is slow, and in any
case, my count(*) is not on the whole table, but on a subset. Doing this
in a general way seems complex to me as it would need to be able to
evaluate whether a given INSERT or UPDATE or one of the dependent tables
would impact the WHERE clause for the materialized view, and it still
wouldn't know which rows to add/update/remove without detailed analysis,
so it would either be throwing out the entire materialized view and
recreating it on INSERT or UPDATE (or deferring until the next query?)
in which case it may be very slow, or it may be very complex.

Another one that I use is a complex join of several tables, and merging
1:N tables including aggregate queries into a 1:1 materialized view. I
see this as the same problem where it needs to do dependency analysis,
and it still doesn't know how to INSERT/UPDATE/DELETE materialized rows
without complex analysis forcing a re-build. In my case, it is 1 ms to
query my materialized view and 1500 ms to rebuild the materialized view.
I do NOT want to rebuild this view after every update.

In summary, I don't think materialized views is an easy thing to do.
Perhaps the very simplest of cases - but the simplest of cases can be
easily managed with triggers or application logic.

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>


From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Dave Page <dpage(at)postgresql(dot)org>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-12 18:58:35
Message-ID: 47890DDB.3080804@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Mielke wrote:
> Counts, because as we all know, PostgreSQL count(*) is slow, and in
> any case, my count(*) is not on the whole table, but on a subset.
> Doing this in a general way seems complex to me as it would need to be
> able to evaluate whether a given INSERT or UPDATE or one of the
> dependent tables would impact the WHERE clause for the materialized
> view, and it still wouldn't know which rows to add/update/remove
> without detailed analysis, so it would either be throwing out the
> entire materialized view and recreating it on INSERT or UPDATE (or
> deferring until the next query?) in which case it may be very slow, or
> it may be very complex.

Bah. I forgot to add: The feature I've been wondering about (and not
necessarily looking for somebody else to do, although I don't think I
know the code well enough to do it at this point):

Web applications often make the same queries over and over. While
memcache can be used to cache results, the memcache interface is
different from the web application interfere requiring complex code, and
as well, one loses the transaction guarantees as the memcache results
are not guaranteed to be up-to-date with the database. I see the
greatest overall performance gain for web applications to be for
PostgreSQL to hang on to the results of the previous X queries along
with transactions numbers of each of the dependent tables as of the
snapshot of the table that is used, and if one of them matches, return
the results immediately. I believe MySQL does this (although not sure
how reliable their implementation is). I believe I have seen this
subject talked about on this list in the past. For web applications, I
believe this gives most of the benefits that materialized views would
provide, with less of the costs?

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-12 18:59:20
Message-ID: 20080112105920.561e97d4@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, 12 Jan 2008 10:26:49 -0500
Mark Mielke <mark(at)mark(dot)mielke(dot)cc> wrote:

> Jean-Michel Pouré wrote:
> > In my past development projects, I always used the concept of
> > Materialized VIEW to speed-up SELECTs over INSERTs
>
> Unless you are going to *pay* for it - you do realize that the best
> way to get it implemented, would be to open up the source code, and
> give it a try yourself?
>
> If it was so easy, and such a clear win, I think one of the very
> competent people using PostgreSQL today would have already done it?
>

No actually, and your reply is offensive. There are a lot of things
PostgreSQL is missing that are "easy" and a clear win, yet people still
don't do them. A simple one is the ridiculous usage of pg_dump and
pg_dumpall. Or that we can't use pg_restore to use the plain text
backup.

I think his email was very well written and a simple request of
discussion of alternatives as well as future plans.

Sincerely,

Joshua D. Drake

> Cheers,
> mark
>

- --
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/
Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHiQ4LATb/zqfZUUQRAvEpAKCmLzjPg1+95ddWHBqogK4Ea981AgCfa4FG
2AVhx/5BNwYbCta086Iz1oo=
=ng7C
-----END PGP SIGNATURE-----


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Jean-Michel Pouré <jm(at)poure(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-12 21:19:11
Message-ID: 1200172751.4266.1411.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2008-01-12 at 13:27 +0100, Jean-Michel Pouré wrote:

> Please pick-up this important issue for developpers. There is no need to
> concentrate on complex issues, when handling materialized views could
> boost somme web apps. by a factor of 10 or more.

It's more complex than you think, but the main reason was that HOT was a
prerequisite for making summary tables work efficiently, which is only
now just about to go live into 8.3

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-12 22:31:32
Message-ID: 47893FC4.9090707@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joshua D. Drake wrote:
>> Unless you are going to *pay* for it - you do realize that the best
>> way to get it implemented, would be to open up the source code, and
>> give it a try yourself?
>>
>> If it was so easy, and such a clear win, I think one of the very
>> competent people using PostgreSQL today would have already done it?
>>
> No actually, and your reply is offensive. There are a lot of things
> PostgreSQL is missing that are "easy" and a clear win, yet people still
> don't do them. A simple one is the ridiculous usage of pg_dump and
> pg_dumpall. Or that we can't use pg_restore to use the plain text
> backup.
>
> I think his email was very well written and a simple request of
> discussion of alternatives as well as future plans.
>
Offensive is relative. I find it offensive when people demand things on
one of the many mailing lists I read without providing anything to the
community.

I didn't realize the original poster did not fit this class of person.
For this, I apologize. As for tone - I don't see anything technically
wrong with my response. The best way to get something done *is* to pay
for it, or do it yourself. It's a tried and true practice in the open
source community. Also, I do not think it is as easy as you say - but
feel free to continue the discussion and prove how idiotic I am for
calling the problem "not easy". :-)

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>


From: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To: "Mark Mielke" <mark(at)mark(dot)mielke(dot)cc>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-13 00:03:20
Message-ID: 36e682920801121603i65266256r8a5692eaef56fea4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jan 12, 2008 5:31 PM, Mark Mielke <mark(at)mark(dot)mielke(dot)cc> wrote:
> Joshua D. Drake wrote:
>> I think his email was very well written and a simple request of
>> discussion of alternatives as well as future plans.

Agreed, JD.

> Offensive is relative. I find it offensive when people demand things on one
> of the many mailing lists I read without providing anything to the
> community.

I have to agree with JD. Your response was quite negative.
Similarly, your statement, "If it was so easy, and such a clear win, I
think one of the very competent people using PostgreSQL today would
have already done it?" is misplaced. Jean-Michel didn't say it was
easy at all, he just explained some of the benefits.

Having used materialized views for years, I can tell you they are an
important feature Postgres currently lacks (in native form). Also, as
I have personally looked into implementing materialized views in
Postgres, I can tell you it's not too difficult. However, as Simon
stated, it's an iterative process.

You don't need to jump on someone for a well-written request with a
specific use-case.

--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah(dot)harris(at)enterprisedb(dot)com
Edison, NJ 08837 | http://www.enterprisedb.com/


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Jean-Michel Pouré <jm(at)poure(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-13 00:20:41
Message-ID: Pine.GSO.4.64.0801121830380.19451@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 12 Jan 2008, Jean-Michel Pour wrote:

> You are well aware of Jonathan Gardner preliminary work:
> http://www.varlena.com/varlena/GeneralBits/Tidbits/matviews.html

There's also PostgreSQL::Snapshots ; intro at
http://cunha17.cristianoduarte.pro.br/postgresql/snapshots.en_us.php and
main project page at http://pgfoundry.org/projects/snapshot/

> I run a 400.000+ message board using PhpBB 3.0...My database handles 10
> to 20 queries every second. There are 100 selects for 1 INSERT. But my
> database could well handle over 500 queries a second using materialized
> views.

I hope you don't take this the wrong way, but if you can't hit 500
queries/second on that volume of messages I would guess that something is
wrong with either the design scalability of the PhpBB software running
against a PostgreSQL database or some detail of how you've got it setup.
A quick read suggests it's not unusual for people to drop PhpBB and use
something vBulletin instead exactly because of PhpBB's issues handling
larger communities. It's probably out of date but I found the discussion
of query optimization for larger message boards at
http://www.phpbb.com/community/viewtopic.php?t=135383 to be informative on
this topic.

> At my level, here are my plans:
> 1) Publish some long query LOGs from my database, longuer than 30
> milliseconds.

Rather than chasing after core product features that are some distance
off, I think what might be more productive for you in the short term is to
collect this information--including EXPLAIN ANALYZE plans--and include it
along with information about your server and how the postgresql.conf is
setup in a message to the performance list. That might get you immediate
suggestions. In addition to "query is badly written/indexed for
PostgreSQL" (which is another potential side to the query issues discussed
in the phpbb forum topic I referenced), problems you might not have caught
that could be nailing you include things like not allocating enough memory
for use by the database and tables not being analyzed frequently enough.

I would love to have materialized views in the core database. But it's a
ways off no matter what, is moderately hard to accomplish, and you can
emulate some of the benefits using things like Gardner's trigger-based
approach. All of that makes it harder to kick off such a project. I
don't think you need to convince anyone that it's important--the
occasional person has been screaming about needing this feature for years
now. The real question is who cares enough about the feature that it's
worth their trouble to fund development, and I'm not sure whether your
personal attempts to rouse demand will impact that.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Jean-Michel Pouré <jm(at)poure(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-13 11:18:00
Message-ID: 1200223080.28643.16.camel@debian
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> In fairness to Jean-Michel, he has spent hundreds of hours in the past
> doing just that and far more for the pgAdmin users in the community -
> I'm sure we can excuse him for asking for what many do think would be
> a useful feature in the hopes that someone listening might just decide
> to pick it up.
>
> In the meantime though - have you tried rolling your own materialised
> views with some triggers Jean-Michel? I have good results doing that
> in the past.

Dear friends,

For of all, thanks for picking up my message and replying.

I agree with Dave Page and others that it is hard to find contributors
for a Free project. When writing 2% of pgAdmin2 code, I noticed that you
could count 10.000 end-users for ONE developper.

This is why I usually offer a bounty for any kind of feature.

Two months ago, my last bounty for a Kdenlive feature (100€), but no-one
aggreed to receive the money after developing the feature, probably
because there were dozens of contibutors (people writing code on top of
someone else code).

But the power of PostgreSQL is to be a real community, like Xorg or
Apache. This makes all the difference. This was the heart of my message.

This being said, our behind the scene needs are:
* We are a non-profit organisation based in France, providing
real-estate listing services.
* We are going to spend 2000€ buying two servers (Phantom 4-die + 8GB 2U
servers), running on Debian: one for PostgreSQL, the other for Apache2.
* We plan to launch the first virtual real-estate agency managed by a
free community.
* We are going to use a mixture of PhpBB 3.0, Drupal and a custom
applications.
* We plan to handle thousands of simultaneous users.

So ...

Based on my current logs, I know that I may have to buy and run a web
farm. Which I do not want, because it implies extra hosting costs.

AND

I know that ONE PostgreSQL database can handle 1000 simultaneous
queries, when using server-side code.

I wrote a small PostgreSQL query optimization HOWTO :
http://area51.phpbb.com/phpBB/viewtopic.php?f=3&t=29292
Pardon me if some information is not accurate.

I tried my best to educate PhpBB developers choosing the right technology.
PostgreSQL... Now, I am going to write the requested server-side code for PhpBB 3.0
Then I will benchmark queries on our large server.

In the end I hope that someone will be willing to pick-up this
MATERIALIZED VIEW issue, so that any applications running
on PostgreSQL can benefit from "lightning speed".

If someone is willing to receive a bounty, please contact me.

Being a non-profit organisation, I can ask money to contributors on our web site.
This will not be any kind of large sum of money, something between 100€ and 500€.

Kind regards and happy new year.
Jean-Michel Pouré


From: James Mansion <james(at)mansionfamily(dot)plus(dot)com>
To: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-13 13:35:35
Message-ID: 478A13A7.7030605@mansionfamily.plus.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Mielke wrote:
> Joshua D. Drake wrote:
>>> Unless you are going to *pay* for it - you do realize that the best
>>> way to get it implemented, would be to open up the source code, and
>>> give it a try yourself?
>>>
Because users possibly want to do that - use it? Some of us have better
things to do than go
through the learning curve of how the internals of a non-trivial system
work. Does that really
mean its unreasonable to voice an opinion of what would make the system
more useful?
>>
> Offensive is relative. I find it offensive when people demand things
> on one of the many mailing lists I read without providing anything to
> the community.
>
If your view of the community is that it should be insular and closed to
those who can't or won't be developers, then fine. But taking that
attitude will pretty much guarantee that your system will never amount
to more than a hill of beans.

One of the major problems with open source as a whole is that you get
this 'fix it yourself or pay for it' business which provides no way to
spread the cost over many users who would all have something to gain -
but none of whom can justify footing the bill for the entire
development. Most of us are in that position as users, even if we do
have skills that would enable us to help - we have our own issues to
deal with. Buying support isn't the answer - its not support that's
needed, after all, so much as an ability to buy a share of influence
over a roadmap..

Do you want ensure that only the very rich in cash or time can have any
influence? You're going the right way about it with your attitude,
which appears deliberately user-hostile.

What do you want?

James


From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: James Mansion <james(at)mansionfamily(dot)plus(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-13 16:40:20
Message-ID: 478A3EF4.5000702@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

FYI: I don't like being attacked for stating the truth, nor distracting
the mailing list with these emotional discussions. However, there are
things that need to be clarified. Feel free to kill the thread in your
mail browser.

James Mansion wrote:
> Mark Mielke wrote:
>> Joshua D. Drake wrote:
>>>> Unless you are going to *pay* for it - you do realize that the best
>>>> way to get it implemented, would be to open up the source code, and
>>>> give it a try yourself?
> Because users possibly want to do that - use it? Some of us have
> better things to do than go
> through the learning curve of how the internals of a non-trivial
> system work. Does that really
> mean its unreasonable to voice an opinion of what would make the
> system more useful?

It's unreasonable to demand or set a time schedule like "drop everything
and work on materialized views in 2008 because you are wasting your time
on other things - web developers can get 10X improvement today if you do
this one thing". It's not unreasonable to ask for support for an idea.
As it was, I believe I misinterpreted Jean-Michel's message as a demand
(the words seemed demanding), whereas it may have been a plea. I was far
more sympathetic to his second post.

>> Offensive is relative. I find it offensive when people demand things
>> on one of the many mailing lists I read without providing anything to
>> the community.
>>
> If your view of the community is that it should be insular and closed
> to those who can't or won't be developers, then fine. But taking that
> attitude will pretty much guarantee that your system will never amount
> to more than a hill of beans.

First point: I don't speak for the developers and it is not my system.
Second point: This has nothing to do with insular vs non-insular
community. Everything I stated was true - you may not like what I
stated, but it was true. It's possible I could have adjusted a word or
two, or added a smiley to get an affect you would find more pleasing,
but there was nothing incorrect about what I stated. The best way to get
a feature in is to pay for somebody to do it, or do it yourself. If you
need reference points look to the last 20 large features that went into
PostgreSQL. See how it progresses. Also, people have been working on
materialized views on and off for years with differing degrees of
success. It is *NOT* the simple problem that people make it out to be.
(Although it seems that some people may have a good grasp of the
problem, and may be verging on a solution)

> One of the major problems with open source as a whole is that you get
> this 'fix it yourself or pay for it' business which provides no way to
> spread the cost over many users who would all have something to gain -
> but none of whom can justify footing the bill for the entire
> development. Most of us are in that position as users, even if we do
> have skills that would enable us to help - we have our own issues to
> deal with. Buying support isn't the answer - its not support that's
> needed, after all, so much as an ability to buy a share of influence
> over a roadmap..

Open source is a double-edged sword for sure. This does not give any
right to demand anything. If you want to work on something *with* other
people, the message is "I am looking at doing tihs, but don't know how
to start - does anybody with similar interest want to help?" This is
*not* the message that was presented.

> Do you want ensure that only the very rich in cash or time can have
> any influence? You're going the right way about it with your
> attitude, which appears deliberately user-hostile.
> What do you want?
I want people to respect the few developers we have, because I value
them and respect them.

As for my "attitude" - words in email are hardly capable of showing
emotion, especially something as short as I wrote. You will think what
you wish of my words, and your decision was already made the moment you
ignored the fact that I spoke truth and focused only on the tone.

I have an apology for Jean-Michel. I don't have an apology for people
such as you who took this opportunity to attack me. Pot, kettle, black.

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: James Mansion <james(at)mansionfamily(dot)plus(dot)com>
Cc: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-13 16:52:40
Message-ID: 20080113165240.GB9224@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

James Mansion wrote:

> If your view of the community is that it should be insular and closed to
> those who can't or won't be developers, then fine. But taking that
> attitude will pretty much guarantee that your system will never amount to
> more than a hill of beans.

Keep in mind that Mark Mielke is not a PostgreSQL developer.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-13 17:07:50
Message-ID: b42b73150801130907h49f3825esd21889e06cf0b3bc@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jan 12, 2008 4:19 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Sat, 2008-01-12 at 13:27 +0100, Jean-Michel Pouré wrote:
>
> > Please pick-up this important issue for developpers. There is no need to
> > concentrate on complex issues, when handling materialized views could
> > boost somme web apps. by a factor of 10 or more.
>
> It's more complex than you think, but the main reason was that HOT was a
> prerequisite for making summary tables work efficiently, which is only
> now just about to go live into 8.3

+1

If you know how to write triggers, materialization techniques aren't
all that difficult. The real technical limitation was not lack of
materialization techniques (write triggers), but was dealing with the
mvcc penalty. Previously to HOT, for summary tables I would redirect
the trigger to insert to a 'roll up' table and move the data to the
summary on cron or from an application event.

Materialized views are syntax sugar (but still very sweet).

merlin


From: "Sean Utt" <sean(at)strateja(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Postgresql Materialized views
Date: 2008-01-13 20:41:40
Message-ID: 022a01c85624$b3754e80$0201a8c0@randomnoise
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

<sarcasm>Good to see</sarcasm> things haven't changed, and requests for
features and improvements on the pgsql-hackers list can still degenerate
rapidly into a discussion about how to request features and improvements.

As Joshua Drake has pointed out before, most of the core people working on
PostgreSQL don't actually use it for anything themselves. I will expand a
little on that and say that this means that while they are extremely good at
what they do, they really don't have a clue what might be useful to someone
"in the wild". Sort of like automotive engineers who in the 1970's made the
Cadillac's engine so large that you couldn't change the spark plugs without
taking the motor mounts loose and lifting the engine.

The basic question this brings up in the context of this latest flurry of
hurt feelings is whether user demand is officially a driving force in
PostgreSQL development.
If the answer to that question is yes, then the next question is how is that
structured?

I'm not sure that the collective answer to the first question is actually
yes. If it actually is yes, then the next question has barely been touched,
as witnessed by these flurries of electrons on the list, unless the
structure is Anarchy. Which is oxymoronic of course.

There doesn't appear to be an easy way to officially "take the temperature"
of either the developer community, or the user community, and there
certainly is no official way to clearly and easily communicate between the
two in order to effect change. Unfortunately for all of us, the
communications, social, organizational, and people skills/talents necessary
to envision and create the type of social structure that benefits the entire
community are outside the range of experience of everyone on this list. How
do I know that? Because if even one person had those talents/skills it would
have happened already. That is what those type of people do, they can't help
themselves.* I'm not sure that anyone reading this would even be able to
recognize such a person if they met them. Perhaps we should go fishing for
some help from one of those "University Places"? From people outside the
Computer Science department? Maybe even some people in Industrial
Psychology? Somebody probably needs a Master's project....

--
Sean Utt

* Actually I'm being optimistic, while organizers are compulsive, they know
a hopeless cause when they see it, and quietly disappear. I'm operating
under the assumption that the PostgreSQL community is not a hopeless cause
organizationally.

[entire original message deleted for lack of usefulness]


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Sean Utt <sean(at)strateja(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-13 20:54:06
Message-ID: 478A7A6E.70800@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Sean Utt wrote:
> <sarcasm>Good to see</sarcasm> things haven't changed, and requests for
> features and improvements on the pgsql-hackers list can still degenerate
> rapidly into a discussion about how to request features and improvements.
>
> As Joshua Drake has pointed out before, most of the core people working
> on PostgreSQL don't actually use it for anything themselves.

<snip>

> Unfortunately for all of us,
> the communications, social, organizational, and people skills/talents
> necessary to envision and create the type of social structure that
> benefits the entire community are outside the range of experience of
> everyone on this list. How do I know that? Because if even one person
> had those talents/skills it would have happened already. That is what
> those type of people do, they can't help themselves.* I'm not sure that
> anyone reading this would even be able to recognize such a person if
> they met them.

I would argue that they know and recognize them but... they don't care.
They are -hackers. They care about code. Some of them in a tertiary
manner care about the entire paragraph above, others not at all.

I think if you look around what you will see is disparate group of
people all organizing the community in their own particular way for the
benefit of the community as a whole.

Consider the following, in the last 2 years we have had a number of
significant community events happen that allow for exactly what you are
stating above. Formation of the Fundraising group and affiliation with
Software in the Public Interest. This allowed us to execute the
Anniversary, PostgreSQL Conference West, as well as the upcoming EAST
conference. It also allowed to to provide help to PgDay.IT for last
years Italian PostgreSQL day. Since that time they have also formed
ITPUG and Italian non-profit for PostgreSQL. Lastly this affiliation has
allowed allowed the community as a whole to sponsor travel for
PostgreSQL speakers to various conferences and support major events such
as OSCON and LinuxWorld.

> Perhaps we should go fishing for some help from one of
> those "University Places"? From people outside the Computer Science
> department? Maybe even some people in Industrial Psychology? Somebody
> probably needs a Master's project....

Actually you just need to not be on -hackers :). This belongs over in
-advocacy.

Sincerely,

Joshua D. Drake


From: "Webb Sprague" <webb(dot)sprague(at)gmail(dot)com>
To:
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-13 20:59:02
Message-ID: b11ea23c0801131259k2bade7e7y953855546543a1b7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Just my two cents on this (rapidly degenerating) thread.

On 1/13/08, Sean Utt <sean(at)strateja(dot)com> wrote:
> <sarcasm>Good to see</sarcasm> things haven't changed, and requests for
> features and improvements on the pgsql-hackers list can still degenerate
> rapidly into a discussion about how to request features and improvements.
>
> As Joshua Drake has pointed out before, most of the core people working on
> PostgreSQL don't actually use it for anything themselves. I will expand a
> little on that and say that this means that while they are extremely good at
> what they do, they really don't have a clue what might be useful to someone
> "in the wild". Sort of like automotive engineers who in the 1970's made the
> Cadillac's engine so large that you couldn't change the spark plugs without
> taking the motor mounts loose and lifting the engine.

As a very satisfied Postgres "customer", I take exception to the comparison.

<SNIP>

> There doesn't appear to be an easy way to officially "take the temperature"
> of either the developer community, or the user community, and there
> certainly is no official way to clearly and easily communicate between the
> two in order to effect change.

Huh? A politely worded feature request generally gets discussed and
then put on the TODO list if there is some consensus about its
usefulness. If there is no consensus, then the requester usually has
to do more work, which might involve prototyping some code etc.
Admittedly, some developers get grumpy sometimes, but, as the man
said, "Let him who is without sin throw the first stone..." There was
an issue with the tone of the request for material views in the
beginning of this thread, but that seems to ironed out among those who
are actually interested in accomplishing something.

May I propose the following: (1) can we put materialized views on the
TODO, perhaps as a library or as core, still subject to a lot of
design work and with no particular deadline? (2) Can we discontinue
this particular flame war about the responsive of the developers (who
in my estimation do a huge amount of work that benefits me immensely
with nary a "thank you")?

Sorry for the meta rant, I just couldn't take it anymore.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Webb Sprague" <webb(dot)sprague(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 00:01:26
Message-ID: 10842.1200268886@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Webb Sprague" <webb(dot)sprague(at)gmail(dot)com> writes:
> May I propose the following: (1) can we put materialized views on the
> TODO, perhaps as a library or as core, still subject to a lot of
> design work and with no particular deadline?

Actually, I had thought they *were* on the TODO list, because certainly
it's been suggested before. I can't find anything about 'em in the
list though, so +1 for adding the entry. There are plenty of other TODO
items that we have no idea how to do, so the lack of a clear design is
hardly an objection ;-)

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Sean Utt <sean(at)strateja(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 00:30:44
Message-ID: 478AAD34.30101@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Sean Utt wrote:
>
> As Joshua Drake has pointed out before, most of the core people
> working on PostgreSQL don't actually use it for anything themselves. I
> will expand a little on that and say that this means that while they
> are extremely good at what they do, they really don't have a clue what
> might be useful to someone "in the wild". Sort of like automotive
> engineers who in the 1970's made the Cadillac's engine so large that
> you couldn't change the spark plugs without taking the motor mounts
> loose and lifting the engine.
>
>

This is both gratuitously offensive and based on a demonstrably false
premise. The definition of "core people working on PostgreSQL" is
somewhat vague. But if you were to take it as, say, the group of active
committers, then I would say that the majority of us earn our living in
whole or in part using PostgreSQL. Certainly I do (there's a reason I
use an elephant logo for my business).

Many of the things I have contributed to PostgreSQL have been in
response to user requests, and often things I personally have no
immediate need for. The same is true of many contributors. And almost
all of mine have been aimed at increasing usability (e.g. dollar
quoting, CSV import/export). So I rather take offense at the quite
inappropriate spark plug analogy.

As for feature requests, everyone has a right to air an opinion.
Personally, I will pay more attention to people who contribute to the
community than to those who don't. That doesn't just mean coders, though
- contribution comes in many forms. In the case of the present request,
Jean-Michel is a contributor, and any suggestion to the contrary is
right out of line.

The real question for me is whether we want to support Materialized
Views at the grammar level. If we do then it should be worked on. If not
then it can probably be dealt with via a bolt-on module. Personally I'm
inclined to say we should support it via the grammar.

cheers

andrew


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Sean Utt <sean(at)strateja(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 00:57:17
Message-ID: 478AB36D.5010708@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan wrote:
>
>
> Sean Utt wrote:
>>
>> As Joshua Drake has pointed out before, most of the core people
>> working on PostgreSQL don't actually use it for anything themselves. I
>> will expand a little on that and say that this means that while they
>> are extremely good at what they do, they really don't have a clue what
>> might be useful to someone "in the wild". Sort of like automotive
>> engineers who in the 1970's made the Cadillac's engine so large that
>> you couldn't change the spark plugs without taking the motor mounts
>> loose and lifting the engine.
>>
>>
>
> This is both gratuitously offensive and based on a demonstrably false
> premise. The definition of "core people working on PostgreSQL" is
> somewhat vague. But if you were to take it as, say, the group of active
> committers, then I would say that the majority of us earn our living in
> whole or in part using PostgreSQL. Certainly I do (there's a reason I
> use an elephant logo for my business).

The consideration of my comment (which I believe was made some time ago)
was not about -hacking which as I understand it is what you and most
everyone else on -hackers does. My comment was a consideration to the
amount of "core" that are managing postgresql in a production
environment. E.g; being DBAs. I would argue that very few committers
actually qualify as that either. Feel free to prove me wrong :)

Sincerely,

Joshua D. Drake


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Sean Utt <sean(at)strateja(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 01:10:38
Message-ID: 478AB68E.3080004@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joshua D. Drake wrote:
> Andrew Dunstan wrote:
>>
>>
>> Sean Utt wrote:
>>>
>>> As Joshua Drake has pointed out before, most of the core people
>>> working on PostgreSQL don't actually use it for anything themselves.
>>> I will expand a little on that and say that this means that while
>>> they are extremely good at what they do, they really don't have a
>>> clue what might be useful to someone "in the wild". Sort of like
>>> automotive engineers who in the 1970's made the Cadillac's engine so
>>> large that you couldn't change the spark plugs without taking the
>>> motor mounts loose and lifting the engine.
>>>
>>>
>>
>> This is both gratuitously offensive and based on a demonstrably false
>> premise. The definition of "core people working on PostgreSQL" is
>> somewhat vague. But if you were to take it as, say, the group of
>> active committers, then I would say that the majority of us earn our
>> living in whole or in part using PostgreSQL. Certainly I do (there's
>> a reason I use an elephant logo for my business).
>
> The consideration of my comment (which I believe was made some time
> ago) was not about -hacking which as I understand it is what you and
> most everyone else on -hackers does. My comment was a consideration to
> the amount of "core" that are managing postgresql in a production
> environment. E.g; being DBAs. I would argue that very few committers
> actually qualify as that either. Feel free to prove me wrong :)
>
>

I'm not going to bother trying, because you just moved the goalposts
(managing in a production environment vs using). And why should the
number of DBAs matter one whit? Why should they matter more than, say
application developers, when it comes to language level features?

cheers

andrew


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Sean Utt <sean(at)strateja(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 01:21:36
Message-ID: 478AB920.7060300@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan wrote:

>>
>
> I'm not going to bother trying, because you just moved the goalposts
> (managing in a production environment vs using). And why should the
> number of DBAs matter one whit? Why should they matter more than, say
> application developers, when it comes to language level features?

Andrew don't get your knickers in a bunch. This was not an attack, I
wasn't declaring that one type of user was better than another, nor was
I moving goal posts. I was simply explaining the intent of my statement
because Sean quoted me without context.

Oh and as a note... the fact that you wonder about the worth of the
number of DBAs proves my point infinitely.

Joshua D. Drake


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Sean Utt <sean(at)strateja(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 01:30:59
Message-ID: 478ABB53.7070406@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joshua D. Drake wrote:
> Andrew Dunstan wrote:
>
>>
>> I'm not going to bother trying, because you just moved the goalposts
>> (managing in a production environment vs using). And why should the
>> number of DBAs matter one whit? Why should they matter more than, say
>> application developers, when it comes to language level features?
>
>
> Andrew don't get your knickers in a bunch. This was not an attack, I
> wasn't declaring that one type of user was better than another, nor
> was I moving goal posts. I was simply explaining the intent of my
> statement because Sean quoted me without context.
>
> Oh and as a note... the fact that you wonder about the worth of the
> number of DBAs proves my point infinitely.
>
>

What is your point? You are implying something, but I'm damned if I know
what. I did not wonder about the worth of DBAs - that would be silly.
Please do not put words in my mouth that I did not speak. What I
wondered was why DBAs should be uniquely important, and why we should
think that there is any significance to the number of DBAs as opposed to
other classes of user, among the ranks of either the core team, or the
active committers, or hackers generally. I don't think they should -
there are many classes of user whose needs we should cater for, of which
DBAs are just one.

cheers

andrew


From: "Sean Utt" <sean(at)strateja(dot)com>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 01:44:25
Message-ID: 028a01c8564e$fed41990$0201a8c0@randomnoise
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

My point is simply this: The lack of a clear formal process for feature
requests leads to this degradation in the conversation. Without a formalized
structure, the conversation devolves rapidly into an argument over semantics
and word choice. It is not my contention that the "core" developers need to
be different in any way. It is also not my contention that the users need to
be different in any way. It is my contention that the "process" currently
generates more ill will than it needs to, and needs to be replaced. The
problem is a systemic one. There needs to be a more formal structure put in
place than just the -hackers mailing list. There needs to be a way to
evaluate the demand for a specific feature as well as the benefits and the
effort it will require. It needs to be done in as neutral a way as possible.
In order to be effective, it will have to be driven into being by the
developers, because they will be the ones who can hamstring it -- not the
users.

Or we can just keep bickering among ourselves over semantics and word
choice. That seems to be fun for everyone so far.

Sean


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Sean Utt <sean(at)strateja(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 01:46:31
Message-ID: 478ABEF7.608@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Sean Utt wrote:
> My point is simply this: The lack of a clear formal process for
> feature requests leads to this degradation in the conversation.
> Without a formalized structure, the conversation devolves rapidly into
> an argument over semantics and word choice. It is not my contention
> that the "core" developers need to be different in any way. It is also
> not my contention that the users need to be different in any way. It
> is my contention that the "process" currently generates more ill will
> than it needs to, and needs to be replaced. The problem is a systemic
> one. There needs to be a more formal structure put in place than just
> the -hackers mailing list. There needs to be a way to evaluate the
> demand for a specific feature as well as the benefits and the effort
> it will require. It needs to be done in as neutral a way as possible.
> In order to be effective, it will have to be driven into being by the
> developers, because they will be the ones who can hamstring it -- not
> the users.
>
> Or we can just keep bickering among ourselves over semantics and word
> choice. That seems to be fun for everyone so far.
>
>

Very lofty sentiments. Perhaps next time you have a suggestion to make
you could rise to that level in the first rather than the second
instance, and avoid the unnecessary insults.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Sean Utt" <sean(at)strateja(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 02:18:15
Message-ID: 17343.1200277095@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Sean Utt" <sean(at)strateja(dot)com> writes:
> My point is simply this: The lack of a clear formal process for feature
> requests leads to this degradation in the conversation.

Two comments:

1) The existing informal process has served us very well for more than
ten years now. I'm disinclined to consider replacing it, because that
would risk altering the community's dynamics for the worse.

2) In the end, this is an open source *community*; no amount of formal
feature requesting will have any material impact on what actually gets
implemented, because there isn't any central control. What gets
implemented is whatever individual contributors choose to work on,
either because they find it interesting or (in some cases) because
someone pays them to do something specific. Certainly, some
contributors pay attention to what's being requested, but I see no
reason to think that increasing the level of formality will help them.

In short: we haven't got a "road map" and we like it that way.

regards, tom lane


From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Sean Utt <sean(at)strateja(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 02:36:04
Message-ID: 478ACA94.7080708@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Sean Utt wrote:
> My point is simply this: The lack of a clear formal process for
> feature requests leads to this degradation in the conversation.
> Without a formalized structure, the conversation devolves rapidly into
> an argument over semantics and word choice. It is not my contention
> that the "core" developers need to be different in any way. It is also
> not my contention that the users need to be different in any way. It
> is my contention that the "process" currently generates more ill will
> than it needs to, and needs to be replaced. The problem is a systemic
> one. There needs to be a more formal structure put in place than just
> the -hackers mailing list. There needs to be a way to evaluate the
> demand for a specific feature as well as the benefits and the effort
> it will require. It needs to be done in as neutral a way as possible.
> In order to be effective, it will have to be driven into being by the
> developers, because they will be the ones who can hamstring it -- not
> the users.
What sort of structure are you envisioning? Features do make it into
PostgreSQL - PostgreSQL has had, and arguable still has a more complete
feature set than well funded alternatives such as MySQL. Their is a TODO
list that both grows and shrinks with each release. I have myself seen
major changes in 8.0, 8.1, 8.2, and 8.3 that have personally benefited
me. What problem needs to be solved?

I agree with your sentiment. For somebody who wants to request a
feature, expect it to be taken up by a champion (other than them), and
monitor it's regular process, no clear infrastructure seems to exist.
However, I question the relevance or value of such a system. My own
initial contribution was a reaction to the notion that somebody should
be able to demand other people to volunteer their time to work on
something. It may have been unfair and cold to the original poster, and
for this, I apologize. I don't believe the user community should
necessary be able to demand or prioritize a feature unless they are
willing to put up resources to support the effort. Resources usually
means either people or money. If people truly have a strong business
case for a feature, there are several qualified companies willing to
take their money and turn it into something real. If people are not able
to produce such a business case or justify the expenditure of funds,
attracting volunteers to do the same work requires a very different
approach. It requires zeal, compelling argument, and a reasonable amount
of ego stroking or challenge. Putting in place an official process can
have the opposite effect. People such as myself are very willing to
volunteer efforts in an informal manner without formal deadlines or
processes, because we enjoy it. Forcing the regular amount of red tape
many of us need to cut just to get our jobs done at work is not very
motivational for people such as myself. Now, while I have contributed to
other open source projects, I have not personally contributed much to
PostgreSQL. The core PostgreSQL contributors would have to make their
voice heard. I think, though, that telling them that they must work on a
certain feature, because that's what the users are asking for, is the
wrong approach. Not to say that is exactly what you are requesting, but
I suggest that is where you are leading.

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>


From: Andrew Chernow <ac(at)esilo(dot)com>
To: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
Cc: Sean Utt <sean(at)strateja(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 03:57:48
Message-ID: 478ADDBC.5070906@esilo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I think, though, that telling them that they must work on a
> certain feature, because that's what the users are asking for, is the
> wrong approach. Not to say that is exactly what you are requesting, but
> I suggest that is where you are leading.
>
> Cheers,
> mark
>

The more communication between users and developers the better.
Companies would pay a fortune for the user complaints, moans and groans,
suggestions, bug reports and demands that postgresql sees in a single day.

IMHO developers need reality checks ... well so do users but for
different reasons. There is no quicker way to kill a product or project
than to ignore the needs/wants of the user/customer base (open-source or
not).

andrew


From: "Webb Sprague" <webb(dot)sprague(at)gmail(dot)com>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 04:09:10
Message-ID: b11ea23c0801132009g1602f741uf3dffb75b30dae8e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I realize that some very important navel-gazing (^H^H^H "group
process") is happening, but let us remember where bona-fide feature
requests should go:

http://www.postgresql.org/docs/faqs.TODO.html

So far, I don't see any mention of materialized views on this page,
and I did refresh ... :)


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Webb Sprague <webb(dot)sprague(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 04:23:10
Message-ID: 478AE3AE.3040400@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Webb Sprague wrote:
> I realize that some very important navel-gazing (^H^H^H "group
> process") is happening, but let us remember where bona-fide feature
> requests should go:
>
> http://www.postgresql.org/docs/faqs.TODO.html
>
> So far, I don't see any mention of materialized views on this page,
> and I did refresh ... :)

But did you clear your cache? :P

Joshua D. Drake

>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>


From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Andrew Chernow <ac(at)esilo(dot)com>
Cc: Sean Utt <sean(at)strateja(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 04:26:47
Message-ID: 478AE487.60301@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Chernow wrote:
>> I think, though, that telling them that they must work on a certain
>> feature, because that's what the users are asking for, is the wrong
>> approach. Not to say that is exactly what you are requesting, but I
>> suggest that is where you are leading.
> The more communication between users and developers the better.
> Companies would pay a fortune for the user complaints, moans and
> groans, suggestions, bug reports and demands that postgresql sees in a
> single day.
>
> IMHO developers need reality checks ... well so do users but for
> different reasons. There is no quicker way to kill a product or
> project than to ignore the needs/wants of the user/customer base
> (open-source or not).

Nobody is ignoring users or needs or wants. It is a question of
priority. My priorities may not match yours may not match Tom's or one
of the other core contributors. Valuable features are being added to
PostgreSQL with each release as community efforts (or business efforts).
With the exception of some arguably naive requests such as "give me a
super fast count(*) - why is that so hard?" users are certainly being
listened to - because we are the users. I have personally requested
items, found myself to be not alone, and seen the group request
implemented ahead of my expectations.

So again, my question is - what is broken? What needs to be fixed?

I, for one, think the core developers are doing a great job, and the
extended team (community) is working very well together. I am a
PostgreSQL advocate *because* of the model, not in spite of the model.
If you want a corporate model, I believe Oracle is also a great product
and it may be more compatible with your expectations? (Ironically, I
would expect to see new features that *I* ask for implemented in
PostgreSQL *before* Oracle would listen to me)

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
Cc: Andrew Chernow <ac(at)esilo(dot)com>, Sean Utt <sean(at)strateja(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 04:52:09
Message-ID: 478AEA79.1050209@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Mielke wrote:

> Nobody is ignoring users or needs or wants. It is a question of
> priority. My priorities may not match yours may not match Tom's or one
> of the other core contributors. Valuable features are being added to

FYI, the terminology core contributors is confusing. There is not really
any such thing in PostgreSQL land. There is "core" which isn't really
relevant to development (except in some specific circumstances such as
addressing security concerns). There are "contributors" which are people
who are constantly contributing to the community. An incomplete list of
them is here:

http://www.postgresql.org/community/contributors

Sincerely,

Joshua D. Drake


From: "Webb Sprague" <webb(dot)sprague(at)gmail(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 04:54:43
Message-ID: b11ea23c0801132054he415229k51a9ad6de57435d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> But did you clear your cache? :P

Freud might say it takes a lifetime to clear one's cache ....
Luckily, in therapy you don't have to wait for those darn Postgres
developers ;)

>
> Joshua D. Drake
>
>
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> > message can get through to the mailing list cleanly
> >
>
>


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Sean Utt <sean(at)strateja(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 09:22:01
Message-ID: 1200302521.4266.1474.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 2008-01-13 at 17:44 -0800, Sean Utt wrote:

> It is not my contention that the "core" developers need to
> be different in any way. It is also not my contention that the users need to
> be different in any way.

First, this is an open forum, so thank you for expressing your views
openly in the manner you've felt them.

General comment:

I think one perspective I have on the above statement is the feeling
that there is a distinction between two groups of people and that one
group is put here to serve the other group better.

Many new users of Postgres are so used to the closed source situation of
Developers being the only people who can see the code that they often
perpetuate the concept of tiering or groups, when it doesn't exist.

Almost all of the people on the list are users of Postgres. There's just
a complete range of people from new users to experienced hackers.
Postgres is well documented, well commented and completely open source,
so there is no barrier to anyone who wishes to change, and if you choose
to define that change positively, improve.

So I support Mark Mielke's views on writing code. Anybody who wants to
code, can. There's probably a project of a size and complexity that's
right for your first project. Apparently the guy that invented the new
scheduling algorithms for Linux wasn't even a coder, but he sat down and
worked it out.

This is Hackers: Write some code today, everybody. You *can*.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Jean-Michel Pouré <jm(at)poure(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 09:57:12
Message-ID: 1200304632.20340.10.camel@debian
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dear Friends,

I hope that this flame war can stop, as it is useless.

The logic of free software is that developers pick-up issues, based on
their skills and interest. The power of the cummunity is to gather very
talented developers from all over the planet. Freedom is the logic and
there is no need to drive the community. In the end, PostgreSQL relies
on the knownledge of talented developers.

When posting this thread, I hope that a talented developer would some
day pick-up the materialized view issue and work on it, during a process
of discussion.

In a few days, I will post some precise statistics on how much
MATERIALIZED VIEW could boost a 400.000 posts PhpBB 3.0 forum. I will
write the required PL trigger, test them and install them on my server.

Then each individual is able to decide whether materiazed views are
important or not. This is my definition of freedom. Freedom of choice.

I don't ask for more.
Now, I hope that the list can return to a more peaceful state.

Kind regards,
Jean-Michel Pouré


From: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Sean Utt <sean(at)strateja(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 10:09:06
Message-ID: 1200305346.7528.67.camel@PCD12478
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 2008-01-14 at 09:22 +0000, Simon Riggs wrote:
> So I support Mark Mielke's views on writing code. Anybody who wants to
> code, can. There's probably a project of a size and complexity that's
> right for your first project.

The main problem is that usually that initial thing is not what you
desperately need today... so the motivation will be pretty low unless
you just have loads of time to start off playing with the code.

> Apparently the guy that invented the new
> scheduling algorithms for Linux wasn't even a coder, but he sat down and
> worked it out.

> This is Hackers: Write some code today, everybody. You *can*.

Certainly everybody can write code, but the barrier to accept it is
pretty high in the postgres community. So you better be a damn good
coder if you expect your code to be accepted... and even then with
considerable fight for justifying the use case for your feature ;-)

This is all good for a stable product, but it really makes the barrier
between simple users and hackers pretty high.

Cheers,
Csaba.


From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Jean-Michel Pouré <jm(at)poure(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 13:11:20
Message-ID: 478B5F78.6030508@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jean-Michel Pouré wrote:
> When posting this thread, I hope that a talented developer would some
> day pick-up the materialized view issue and work on it, during a process
> of discussion.
>

In case it was lost in the noise - there are several capable people that
have been personally on this issue for years. You are not the first to
suggest it, and your opinion that the feature is valuable is shared by a
larger group. There is still a difference between talking about it and
doing it. A few of the doers told me off privately stating that it is
not that difficult. My suspicion is that it *is* difficult and they are
not doing themselves credit, or their solution is incomplete, but
whatever - the result is the same. When one or more of these people are
ready, you will likely see it released. It may even be complete before
2008 is complete.

> In a few days, I will post some precise statistics on how much
> MATERIALIZED VIEW could boost a 400.000 posts PhpBB 3.0 forum. I will
> write the required PL trigger, test them and install them on my server.
>

The two factors of note here: 1) If you can write a PL trigger for it,
the value of the feature is limited. In my own case, I found it fairly
easy to write triggers, or update the summary table from the application
(poor man's view). 2) In my experience, a custom PL trigger can make
assumptions about the application that allow greater optimization that a
general 'syntactical sugar' solution could. In my own case, performance
of queries leapt from 1500 ms to 1 ms. Even if materialized views were
implemented to a level that most people would consider "full", I do not
expect to see the same speed improvement, because a generalized
implementation would not be able to make the assumptions that I can.
FYI, my triggers are perhaps 10 lines each, and I believe I have three
triggers in the 1500 ms -> 1 ms example. I have a view and a summary
table. I update the summary table from the view. In my opinion, this
solution is very manageable given the 1500:1 performance improvement it
grants me.

> Then each individual is able to decide whether materiazed views are
> important or not. This is my definition of freedom. Freedom of choice.
>
> I don't ask for more.
> Now, I hope that the list can return to a more peaceful state

You are doing fine. I am sorry for assuming you intended more and giving
you a cold-ish shoulder.

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 13:26:44
Message-ID: 20080114132644.GD4584@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Mielke wrote:

> FYI, my triggers are perhaps 10 lines each, and I believe I have three
> triggers in the 1500 ms -> 1 ms example. I have a view and a summary
> table. I update the summary table from the view. In my opinion, this
> solution is very manageable given the 1500:1 performance improvement
> it grants me.

But you had to modify your queries. I would think that a materialized
views implementation worth its salt would put the view to work on the
original, unmodified queries.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 14:07:54
Message-ID: 478B6CBA.5000705@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera wrote:
> Mark Mielke wrote:
>
>> FYI, my triggers are perhaps 10 lines each, and I believe I have three
>> triggers in the 1500 ms -> 1 ms example. I have a view and a summary
>> table. I update the summary table from the view. In my opinion, this
>> solution is very manageable given the 1500:1 performance improvement
>> it grants me.
>>
>
> But you had to modify your queries. I would think that a materialized
> views implementation worth its salt would put the view to work on the
> original, unmodified queries.
>

I might be slow today (everyday? :-) ) - but what do you mean by this?
The only difference between *_table and *_view is that *_table is the
summary table and *_view is the view. The triggers on the tables the
view is derived from select from *_view and update *_table. The queries
remain unchanged except for deciding whether to use *_table or *_view.
Yes, syntactical sugar would make it prettier and more manageable - but
I am confused as to how a syntactical sugar solution would improve
performance, and I highly suspect it would not achieve the same
performance benefit. As I said - I can make assumptions about how the
base tables are updated. A generalized solution would not be able to
make these assumptions?

For some further background - the base tables are a mirror of accpac
tables (augh!) from mssql. The view and summary table gathers
information from 5 or so of these tables including aggregates,
conditionals, sub-selects (different queries to the same base tables)
and deep joins. Perhaps my imagination is too limited - but I don't see
how it would be easy to make syntactical sugar for this and still
maintain the performance I describe above. For about 30 lines of
pl/pgsql and some application-side updates (again from the view to the
summary table) in the synchronization script it seems acceptable.

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 14:20:17
Message-ID: 20080114142017.GF4584@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Mielke wrote:
> Alvaro Herrera wrote:

>> But you had to modify your queries. I would think that a materialized
>> views implementation worth its salt would put the view to work on the
>> original, unmodified queries.
>
> I might be slow today (everyday? :-) ) - but what do you mean by this? The
> only difference between *_table and *_view is that *_table is the summary
> table and *_view is the view.

My point is that you should be able to query _table and the system
should automatically use the view, without you saying so (except by
initially creating them).

At least if you had eagerly-updated materialized views. If you had lazy
ones, I think those should be used explicitely only.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: "Zeugswetter Andreas ADI SD" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>
To: "Mark Mielke" <mark(at)mark(dot)mielke(dot)cc>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>
Cc: Jean-Michel Pouré <jm(at)poure(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 14:50:33
Message-ID: E1539E0ED7043848906A8FF995BDA57902B6246B@m0143.s-mxs.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> > But you had to modify your queries. I would think that a materialized
> > views implementation worth its salt would put the view to work on the
> > original, unmodified queries.
> >
>
> I might be slow today (everyday? :-) ) - but what do you mean by this?
> The only difference between *_table and *_view is that *_table is the
> summary table and *_view is the view. The triggers on the tables the
> view is derived from select from *_view and update *_table. The queries
> remain unchanged except for deciding whether to use *_table or *_view.

Traditionally materialized views exist, so that you do not need to code differently.
Your queries still run on the detail table, but are silently answered
by a suitable MV. The MV might have count + other aggregated columns
grouped by some columns, and thus be able e.g. shortcircuit a
"select count(*) from atab". The MV should be MVCC aware (have different
values for different snapshots) and not substantially reduce possible
concurrency of updates to the base table.

> For some further background - the base tables are a mirror of accpac
> tables (augh!) from mssql. The view and summary table gathers
> information from 5 or so of these tables including aggregates,
> conditionals, sub-selects (different queries to the same base tables)
> and deep joins. Perhaps my imagination is too limited - but I
> don't see
> how it would be easy to make syntactical sugar for this and still
> maintain the performance I describe above. For about 30 lines of
> pl/pgsql and some application-side updates (again from the
> view to the
> summary table) in the synchronization script it seems acceptable.

As long as you can formulate a normal view on the above statement,
you should be able to tell the db to materialize that.

A good MV feature would be able to use that MV regardless of whether
you select from the view, or use a statement that the view is a generalization of.

I think MV's where originally invented to boost benchmark results
and thus had to operate on given sql to base tables.

Andreas


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Zeugswetter Andreas ADI SD" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>
Cc: "Mark Mielke" <mark(at)mark(dot)mielke(dot)cc>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 15:54:22
Message-ID: 14479.1200326062@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Zeugswetter Andreas ADI SD" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at> writes:
> Traditionally materialized views exist, so that you do not need to code differently.
> Your queries still run on the detail table, but are silently answered
> by a suitable MV. The MV might have count + other aggregated columns
> grouped by some columns, and thus be able e.g. shortcircuit a
> "select count(*) from atab". The MV should be MVCC aware (have different
> values for different snapshots) and not substantially reduce possible
> concurrency of updates to the base table.

Note that you just raised the minimum bar for implementation of the
feature by a couple orders of magnitude. We cannot automatically
substitute an MV into queries unless this is guaranteed not to change
the results. No lazy updates, MVCC transparency required, etc.

regards, tom lane


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 16:06:55
Message-ID: 163677e12fa23bdda63b94b5db61abc1@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> My point is simply this: The lack of a clear formal process for feature
> requests leads to this degradation in the conversation. Without a formalized
> structure, the conversation devolves rapidly into an argument over semantics
> and word choice.
...
> There needs to be a way to evaluate the demand for a specific feature as
> well as the benefits and the effort it will require.

You could always start a page on the developer's wiki:

http://developer.postgresql.org/

That would seem to be a good place to at least describe the problem in detail,
show how you would like a feature to behave, and have people add the pros and
cons of certain approaches. Certainly would be better to have a page to point
to rather than trying to trawl through mailing archives (heck, the page could
even mostly be a collection of such links).

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200801141104
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iD8DBQFHi4gvvJuQZxSWSsgRA66dAKCGCPBPDfTFDoizE0WDwXBzDK/W3ACg8dwZ
99OvuSU9PPmG6XDPPK2iQzA=
=Xseg
-----END PGP SIGNATURE-----


From: "Zeugswetter Andreas ADI SD" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Mark Mielke" <mark(at)mark(dot)mielke(dot)cc>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 16:39:46
Message-ID: E1539E0ED7043848906A8FF995BDA57902B6249A@m0143.s-mxs.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> > Traditionally materialized views exist, so that you do not need to code differently.
> > Your queries still run on the detail table, but are silently answered
> > by a suitable MV. The MV might have count + other aggregated columns
> > grouped by some columns, and thus be able e.g. shortcircuit a
> > "select count(*) from atab". The MV should be MVCC aware (have different
> > values for different snapshots) and not substantially reduce possible
> > concurrency of updates to the base table.
>
> Note that you just raised the minimum bar for implementation of the
> feature by a couple orders of magnitude. We cannot automatically
> substitute an MV into queries unless this is guaranteed not to change
> the results. No lazy updates, MVCC transparency required, etc.

Yes, unfortunately. But don't you also think that this is what makes it
a worthwhile feature ?

I mean, we do have the doityourself triggered summary table approach,
which is not overly difficult to set up. It needs some thought and possibly design
by the user to solve the most obvious concurrency issues, but it is doable.

Imho MV could be separated in 2 parts:
1: materialized and MVCC aware views (only used explicitly)
2: add the smarts to rewrite sql

Part 1 is already useful by itself since it provides a generic and easy
solution to concurrency for the user. (probably nice and mindboggling, how to
best implement that, though :-)

The "lazy update" and non MVCC approach imho sounds too much like your
"you can make it arbitrarily fast if it does not need to be correct" :-)

Andreas


From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 17:17:10
Message-ID: 478B9916.2090305@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Mielke wrote:
> Mark Mielke wrote:
>> Counts, because as we all know, PostgreSQL count(*) is slow, and in
>> any case, my count(*) is not on the whole table, but on a subset.
>> Doing this in a general way seems complex to me as it would need to be
>> able to evaluate whether a given INSERT or UPDATE or one of the
>> dependent tables would impact the WHERE clause for the materialized
>> view, and it still wouldn't know which rows to add/update/remove
>> without detailed analysis, so it would either be throwing out the
>> entire materialized view and recreating it on INSERT or UPDATE (or
>> deferring until the next query?) in which case it may be very slow, or
>> it may be very complex.
>
> Bah. I forgot to add: The feature I've been wondering about (and not
> necessarily looking for somebody else to do, although I don't think I
> know the code well enough to do it at this point):
>
> Web applications often make the same queries over and over. While
> memcache can be used to cache results, the memcache interface is
> different from the web application interfere requiring complex code, and
> as well, one loses the transaction guarantees as the memcache results
> are not guaranteed to be up-to-date with the database.

Regarding up-to-dateness note that there is a pgfoundry project that
helps there. http://pgfoundry.org/projects/pgmemcache/ The other
advantages of doing the caching outside the database is that (a) the
memory for the cached results don't have to sit in the database machine,
and (b) you can cache post-processed (rendered into HTML or gifs)
fragments rather than raw data.


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Mark Mielke <mark(at)mark(dot)mielke(dot)cc>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-14 22:14:18
Message-ID: 200801142214.m0EMEIR12471@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Added to TODO:

* Add the ability to automatically create materialized views

Right now materialized views require the user to create triggers on the
main table to keep the summary table current. SQL syntax should be able
to manager the triggers and summary table automatically. A more
sophisticated implementation would automatically retrieve from the
summary table when the main table is referenced, if possible.

I also thought this was on the TODO list.

---------------------------------------------------------------------------

Zeugswetter Andreas ADI SD wrote:
>
> > > Traditionally materialized views exist, so that you do not need to code differently.
> > > Your queries still run on the detail table, but are silently answered
> > > by a suitable MV. The MV might have count + other aggregated columns
> > > grouped by some columns, and thus be able e.g. shortcircuit a
> > > "select count(*) from atab". The MV should be MVCC aware (have different
> > > values for different snapshots) and not substantially reduce possible
> > > concurrency of updates to the base table.
> >
> > Note that you just raised the minimum bar for implementation of the
> > feature by a couple orders of magnitude. We cannot automatically
> > substitute an MV into queries unless this is guaranteed not to change
> > the results. No lazy updates, MVCC transparency required, etc.
>
> Yes, unfortunately. But don't you also think that this is what makes it
> a worthwhile feature ?
>
> I mean, we do have the doityourself triggered summary table approach,
> which is not overly difficult to set up. It needs some thought and possibly design
> by the user to solve the most obvious concurrency issues, but it is doable.
>
> Imho MV could be separated in 2 parts:
> 1: materialized and MVCC aware views (only used explicitly)
> 2: add the smarts to rewrite sql
>
> Part 1 is already useful by itself since it provides a generic and easy
> solution to concurrency for the user. (probably nice and mindboggling, how to
> best implement that, though :-)
>
> The "lazy update" and non MVCC approach imho sounds too much like your
> "you can make it arbitrarily fast if it does not need to be correct" :-)
>
> Andreas
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Zeugswetter Andreas ADI SD" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>
Cc: "Mark Mielke" <mark(at)mark(dot)mielke(dot)cc>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-15 00:18:28
Message-ID: 9109.1200356308@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Zeugswetter Andreas ADI SD" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at> writes:
>> Note that you just raised the minimum bar for implementation of the
>> feature by a couple orders of magnitude.

> Yes, unfortunately. But don't you also think that this is what makes it
> a worthwhile feature ?

Well, my point is that taking automatic rewriting as a required feature
has at least two negative impacts:

* it rules out any form of lazy update, even though for many applications
an out-of-date summary view would be acceptable for some purposes;

* requiring MVCC consistency will probably hugely reduce the variety of
views that we can figure out how to materialize, and cost performance
even for the ones we can do at all.

It's not zero-cost, even if you consider implementation effort and
complexity as free (which I don't).

regards, tom lane


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Postgresql Materialized views
Date: 2008-01-16 11:16:55
Message-ID: 478DE7A7.1050100@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure wrote:
> On Jan 12, 2008 4:19 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On Sat, 2008-01-12 at 13:27 +0100, Jean-Michel Pouré wrote:
>>
>>> Please pick-up this important issue for developpers. There is no need to
>>> concentrate on complex issues, when handling materialized views could
>>> boost somme web apps. by a factor of 10 or more.
>> It's more complex than you think, but the main reason was that HOT was a
>> prerequisite for making summary tables work efficiently, which is only
>> now just about to go live into 8.3
>
> +1

I don't quite agree with that. HOT certainly speeds up UPDATEs on small
tables, like you a summary table, but there's a lot of use cases like
data warehousing, where the summary tables are not updated that often
for the updates to become a bottleneck.

> If you know how to write triggers, materialization techniques aren't
> all that difficult. The real technical limitation was not lack of
> materialization techniques (write triggers), but was dealing with the
> mvcc penalty. Previously to HOT, for summary tables I would redirect
> the trigger to insert to a 'roll up' table and move the data to the
> summary on cron or from an application event.
>
> Materialized views are syntax sugar (but still very sweet).

There's two things involved in materialized views:

1. Automatically updating the materialized view, when the tables change.
This can be done with triggers, right now, but requires quite a bit of
manual work to set up, especially with more complex views.

2. Using the materialized views to speed up existing queries. For
example, if you have a materialized view on "SELECT COUNT(*) FROM foo",
and someone issues the query "SELECT COUNT(*) FROM foo", the planner
should automatically use the view to satisfy that.

1 is syntactic sugar, but 2 isn't.

These are orthogonal features. Implementing just 1 without 2 would still
be very useful, and in fact that seems to be what most people mean by
materialized views.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-16 22:16:02
Message-ID: 1200521762.4255.37.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2008-01-16 at 11:16 +0000, Heikki Linnakangas wrote:
> Merlin Moncure wrote:
> > On Jan 12, 2008 4:19 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >> On Sat, 2008-01-12 at 13:27 +0100, Jean-Michel Pouré wrote:
> >>
> >>> Please pick-up this important issue for developpers. There is no need to
> >>> concentrate on complex issues, when handling materialized views could
> >>> boost somme web apps. by a factor of 10 or more.
> >> It's more complex than you think, but the main reason was that HOT was a
> >> prerequisite for making summary tables work efficiently, which is only
> >> now just about to go live into 8.3
> >
> > +1
>
> I don't quite agree with that. HOT certainly speeds up UPDATEs on small
> tables, like you a summary table, but there's a lot of use cases like
> data warehousing, where the summary tables are not updated that often
> for the updates to become a bottleneck.

I should have said that was *my* reason for not doing it sooner.

My thinking was if you load a 1000 rows and they all have the same key
in your summary table then you'll be doing 1000 updates on a single row.
With HOT that will cause the block to fill up and then we wipe out the
previously updated rows, since they are inserted and deleted in same
transaction. So all 1000 updates can occur without going beyond that
single block. Much faster, no mess.

Before HOT, large loads were worse, since there was no chance of
VACUUMing the table between updates.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-17 00:06:59
Message-ID: 478E9C23.1020206@pse-consulting.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
>
> My thinking was if you load a 1000 rows and they all have the same key
> in your summary table then you'll be doing 1000 updates on a single row.
>
This is true because the statement level triggers are still rudimentary,
with no OLD and NEW support. A single AFTER statement trigger execution
could maintain the summary table with much less effort.

Regards,
Andreas


From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Mark Mielke <mark(at)mark(dot)mielke(dot)cc>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Jean-Michel Pouré <jm(at)poure(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Materialized views
Date: 2008-01-17 11:30:36
Message-ID: 478F3C5C.9090701@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Well, my point is that taking automatic rewriting as a required feature
> has at least two negative impacts:
>
> * it rules out any form of lazy update, even though for many applications
> an out-of-date summary view would be acceptable for some purposes;
>
> * requiring MVCC consistency will probably hugely reduce the variety of
> views that we can figure out how to materialize, and cost performance
> even for the ones we can do at all.
>
> It's not zero-cost, even if you consider implementation effort and
> complexity as free (which I don't).

There is one big additional advantage of automatic rewriting though, I
believe. If we had the infrastructure to recognize that possibility of
using a predefined (materialized) view for executing a query, we can
also use that infrastructure to get implement a kind of optimizer hints.

How? We'd need "statistics-materialized views", which don't materialize
the full result of the view, but instead compute it's statistical
properties (the same which ANALYZE computes for a table). When planning
a query we can then substitute the guessed values for rowcount and
friends of a subplan by the values computed for the corresponding
statistics-materialized view.

However, until someone figures out *how* to *actually* recognize that
possibility of using a MV for executing a query, this is just
hand-wavering of course...

regards, Florian Pflug