Re: GSoC proposal - "make an unlogged table logged"

From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>
Subject: Re: GSoC proposal - "make an unlogged table logged"
Date: 2014-03-07 03:01:46
Message-ID: CAFcNs+rE_o2zwk=4L7U4xa_sUnxejKM6a161kBBz+vc12D4Qug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 6, 2014 at 5:04 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Mar 6, 2014 at 2:52 PM, Fabrízio de Royes Mello
> <fabriziomello(at)gmail(dot)com> wrote:
> > On Thu, Mar 6, 2014 at 4:42 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
wrote:
> >> I think this isn't a good design. Per the discussion between Andres
> >> and I, I think that I think you should do is make ALTER TABLE .. SET
> >> LOGGED work just like VACUUM FULL, with the exception that it will set
> >> a different relpersistence for the new relfilenode. If you do it that
> >> way, this will be less efficient, but much simpler, and you might
> >> actually finish it in one summer.
> >>
> >
> > Do it like 'VACUUM FULL' for any wal_level?
>
> Yep. Anything else appears to be a research problem.
>

Updated proposal:

<proposal>

** Add to PostgreSQL the capacity to making an "Unlogged" table "Logged" **

Introduction

This project will allow to change an "unlogged" table (that doesn't create
transaction logs - WAL files) and it's dependencies to a "logged" table, in
other words, a regular table that create WAL files. To make this happen
we'll introduce a new SQL syntax:

ALTER TABLE name SET LOGGED;

Benefits to the PostgreSQL Community

The "unlogged" tables feature was introduced by 9.1 version, and provide
better write performance than regular tables (logged), but are not
crash-safe. Their contents are automatically discarded (cleared) in a case
of a server crash, and their contents do not propagate to replication
slaves, either.
With the capacity of turning an "unlogged" table in a logged table will
allow us have the better of two features, in other words, we can use an
"unlogged" table to run a bulk load a thousands of lines (ETL scripts) and
get better performance, and then change it to a "logged" table to get
durability of loaded data.

Deliverables

This project has just one deliverable at the end. The deliverable will be
the implementation of the routines that transform an "unlogged" table to
"logged", using the same algorithm of the "vacuum full", with the exception
that it will set a different "relpersistence" for the new "relfilenode".

Project Schedule

until May 19:
* create a website to the project (wiki.postgresql.org)
* create a public repository to the project (github.com/fabriziomello)
* read what has already been discussed by the community about the project (
http://wiki.postgresql.org/wiki/Todo)
* as already discussed in pgsql-hackers mailing list this feature will be
implemented similar to "vacuum full", with the exception that it will set a
differente "relpersistence" for the new "relfilenode"
* learn about some PostgreSQL internals:
. grammar (src/backend/parser/gram.y)
. vacuum full (src/backend/commands/[vacuum.c | cluster.c])

May 19 - June 23
* implementation of the first prototype:
. change the grammar of PostgreSQL to support "ALTER TABLE ... SET LOGGED"
. implement and/or adapt the routines to change an "unlogged" table to
"logged" (similar to "vacuum full")
* write documentation and the test cases
* submit this first prototype to the commitfest 2014/06 (
https://commitfest.postgresql.org/action/commitfest_view?id=22)

June 23 - June 27
* review with the Mentor of the work done until now

June 27 - August 18
* do the adjustments based on the community feedback during the commitfest
2014/06
* submit to the commitfest 2014/09 for final evaluation and maybe will be
committed to 9.5 version (webpage don't created yet)

August 18 - August 22
* final review with the Mentor of all work done.

About the proponent

Fabrízio de Royes Mello
e-mail: fabriziomello(at)gmail(dot)com
twitter: @fabriziomello
github: http://github.com/fabriziomello
linkedin: http://linkedin.com/in/fabriziomello

Currently I help people and teams to take the full potential of relational
databases, especially PostgreSQL, helping teams to design the structure of
the database (modeling), build physical architecture (database schema),
programming (procedural languages), SQL (usage, tuning, best practices),
optimization and orchestration of instances in production too. I perform a
volunteer work for Brazilian Community of PostgreSQL (www.postgresql.org.br),
supporting mailing lists, organizing events (pgbr.postgresql.org.br) and
some admin tasks. And also I help a little the PostgreSQL Global
Development Group (PGDG) in the implementation of some features and review
of patches (git.postgresql.org).

</proposal>

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-03-07 03:36:00 Re: GSoC proposal - "make an unlogged table logged"
Previous Message Bruce Momjian 2014-03-07 02:25:11 Re: pg_ctl status with nonexistent data directory