GSoC - proposal - Materialized Views in PostgreSQL

From: pavelbaros <baros(dot)p(at)seznam(dot)cz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: GSoC - proposal - Materialized Views in PostgreSQL
Date: 2010-04-09 20:36:04
Message-ID: hpo33i$7am$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

I am sending my proposal on GSoC. Details are listed below. Please, if
you have ideas, tips, or if you only want to say you opinion about my
project, go ahead.

thanks,
Pavel Baros

Abstract:

It is effort to implement snapshot materialized view (are only updated
when refreshed) in PostgreSQL. In this time I finished some part of it
and I am trying to publish my present work on git.postgresql.org

Benefits to the PostgreSQL Community

First of all, it would be the best if my work is helpful to everybody
who misses materialized views in PostgreSQL, because PostgreSQL do not
have still implemented materialized views. In addition, MV is mentioned
as feature in TODO list.

Deliverables

First of all, at the end of whole my project is not only writing
bachelors thesis, but finish it as patch and if possible, get patch into
next PostgrSQL release, or keep git repository actual to last PosgreSQL
version. I have also personal goals. Arouse the interest about
implementing MV in PostgreSQL, or at least arouse discussion about it.

Project Schedule

My work goes quite good, I am on good way to finish main parts on
backend in few weeks. After that I will make and run tests and implement
related changes to PosgreSQL tools (psql, pg_dump, etc.). I am also
waiting for approval for my repository named "materialized_view" on
git.postgresql.org, so I could publish completed parts. For now next
step will be to discuss implementation on postgresql.hackers.

Bio

I am from Czech Republic and I am studying on Faculty of Electrical
Engineering on Czech Technical University in Prague www.fel.cvut.cz/en/.
My bachelor thesis is based on this project, implementing MV in PostgreSQL.

I've experienced many different jobs. The best experience for me was,
when I've worked as tester and software engineer in C/C++ and C# for
Radiant Systems Inc. for more than year. After that I've worked as Web
developer with Internet technologies (PHP, HTML, CSS, ...), where the
goal was to make an internal system for an advertising agency. Finally
my recent job was as Windows Mobile Developer. Except the first
experience, others lasts only few months mainly because those were
temporary projects. For now I am looking for some part time job, of
course, preferably something closer to database systems.

Implementation: could be divided to few steps:

1) create materialized view
- modify grammar (parser): CREATE MATERIALIZED VIEW mv_name AS SELECT ...
- change executor, so that it will create physical table defined by
select statement

2) change rewriter
- usually, view is relation with defined rule and when rewriting, rule
is fired and relation (view) is replaced by definition of view. If
relation do not have rule, planner and executor behave to it as physical
table (relation). In case of materialized view we want to rewrite select
statement only in case when we refreshing MV. In other cases rewriter
should skip rewriting and pick up physical relation. Exclude situation
when other rewrite rules which are not related to MV definition are
specified.

3) create command that takes snapshot (refresh MV)
- modify grammar: ALTER MATERIALIZED VIEW mv_name REFRESH;
- taking snapshot (refreshing) is similar to command "SELECT INTO ..."
and I decided to follow the way it works. After parsing query and before
transformation is MANUALLY created tree representation of "SELECT * INTO
..." with flag IntoClause->isrefresh set true, indicating it is
refreshing materialized view. Everithing acts as it would be regular
"SELECT INTO ..." except functions OpenIntoRel() and CloseIntoRel(). In
function OpenIntoRel is created temp table (without catalog) and set as
destination for result of select. In function CloseIntoRel executor swap
relfilenode's of temp table and original table and finally delete temp
table. Behavior of CloseIntoRel function is inspired by CLUSTER statement.

Contacts: baros(dot)p(at)seznam(dot)cz

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2010-04-09 20:36:50 Re: extended operator classes vs. type interfaces
Previous Message Dimitri Fontaine 2010-04-09 20:08:18 Re: GSOC PostgreSQL partitioning issue