Updateable cursors

From: "FAST PostgreSQL" <fastpgs(at)fast(dot)fujitsu(dot)com(dot)au>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Updateable cursors
Date: 2007-01-23 15:42:21
Message-ID: 5537.10001169527168.fast.fujitsu.com.au@MHS
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

We are trying to develop the updateable cursors functionality into
Postgresql. I have given below details of the design and also issues we are
facing. Looking forward to the advice on how to proceed with these issues.

Rgds,
Arul Shaji


1. Introduction
--------------
This is a combined proposal and design document for adding updatable
(insensitive) cursor capability to the PostgreSQL database.
There have already been a couple of previous proposals since 2003 for
implementing this feature so there appears to be community interest in doing
so. This will enable the following constructs to be processed:

UPDATE <table_name> SET value_list WHERE CURRENT OF <cursor_name>
DELETE FROM <table_name> WHERE CURRENT OF <cursor_name>

This has the effect of users being able to update or delete specific rows of
a table, as defined by the row currently fetched into the cursor.

2. Overall Conceptual Design
-----------------------------
The design is considered from the viewpoint of progression of a command
through the various stages of processing, from changes to the file ‘gram.y’
to implement the actual grammar changes, through to changes in the Executor
portion of the database architecture.

2.1 Changes to the Grammar
------------------------------
The following changes will be done to the PostgreSQL grammar:

UPDATE statement has the option ‘WHERE CURRENT OF <cursor_name>’ added
DELETE statement has the option ‘WHERE CURRENT OF <cursor_name>’ added

The cursor_name data is held in the UpdateStmt and DeleteStmt structures and
contains just the name of the cursor.

The pl/pgsql grammar changes in the same manner.

The word CURRENT will be added to the ScanKeywords array in keywords.c.

2.2 Changes to Affected Data Structures
------------------------------------------
The following data structures are affected by this change:

Portal structure, QueryDesc structure, the UpdateStmt and DeleteStmt
structures

The Portal will contain a list of structures of relation ids and tuple ids
relating to the tuple held in the QueryDesc structure. There will be one
entry in the relation and tuple id list for each entry in the relation-list
of the statement below:

DECLARE <cursor_name> [WITH HOLD] SELECT FOR UPDATE OF <relation-list>

The QueryDesc structure will contain the relation id and the tuple id
relating to the tuple obtained via the FETCH command so that it can be
propagated back to the Portal for storage in the list described above.

The UpdateStmt and DeleteStmt structures have the cursor name added so that
the information is available for use in obtaining the portal structure
related to the cursor previously opened via the DECLARE CURSOR request.

2.3 Changes to the SQL Parser
------------------------------------
At present, although the FOR UPDATE clause of the DECLARE CURSOR command has
been present in the grammar, it causes an error message later in the
processing since cursors are currently not updatable. This now needs to
change. The ‘FOR UPDATE’ clause has to be valid, but not the ‘FOR SHARE’
clause.

The relation names that follow the ‘FOR UPDATE’ clause will be added to the
rtable in the Query structure and identified by means of the rowMarks array.
In the case of an updatable cursor the FOR SHARE option is not allowed
therefore all entries in the rtable that are identified by the rowMarks array
must relate to tables that are FOR UPDATE.

In the UPDATE or DELETE statements the ‘WHERE CURRENT OF <cursor_name>’
clause results in the cursor name being placed in the UpdateStmt or
DeleteStmt structure. During the processing of the functions -
transformDeleteStmt() and transformUpdateStmt() - the cursor name is used to
obtain a pointer to the related Portal structure and the tuple affected by
the current UPDATE or DELETE statement is extracted from the Portal, where it
has been placed as the result of a previous FETCH request. At this point all
the information for the UPDATE or DELETE statement is available so the
statements can be transformed into standard UPDATE or DELETE statements and
sent for re-write/planning/execution as usual.

2.4 Changes to the Optimizer
------------------------------
There is a need to add a TidScan node to planning UPDATE / DELETE statements
where the statements are ‘UPDATE / DELETE at position’. This is to enable the
tuple ids of the tuples in the tables relating to the query to be obtained.
There will need to be a new mechanism to achieve this, as at present, a Tid
scan is done only if there is a standard WHERE condition on update or delete
statements to provide Tid qualifier data.

2.5 Changes to the Executor
-------------------------------
There are various options that have been considered for this part of the
enhancement. These are described in the sections below.

We would like to hear opinions on which option is the best way to go or if
none of these is acceptable, any alternate ideas ?

Option 1 MVCC Via Continuous Searching of Database

The Executor is to be changed in the following ways:
1) When the FETCH statement is executed the id of the resulting tuple is
extracted and passed back to the Portal structure to be saved to indicate the
cursor is currently positioned on a tuple.
2) When the UPDATE or DELETE request is executed the tuple id previously
FETCHed is held in the QueryDesc structure so that it can be compared with
the tuple ids returned from the TidScan node processed prior to the actual
UPDATE / DELETE node in the plan. This enables a decision to be made as to
whether the tuple held in the cursor is visible to the UPDATE / DELETE
request according to the rules of concurrency. The result is that, at the
cost of repeatedly searching the database at each UPDATE / DELETE command,
the hash table is no longer required.
This approach has the advantage that there is no hash table held in memory or
on disk so it will not be memory intensive but will be processing intensive.

This is a good ‘one-off’ solution to the problem and, taken in isolation is
probably the best approach. However, if one considers the method(s) used in
other areas of PostgreSQL, it is probably not the best solution. This option
will probably not be used further.

Option 2 MVCC via New Snapshot

The executor can be changed by adding a new kind of snapshot that is
specifically used for identifying if a given tuple, retrieved from the
database during an update or delete statement should be visible during the
current transaction.

This approach requires a new kind of snapshot (this idea was used by Gavin
for a previous updatable cursor patch but objections were raised.)

Option 3 MVCC Via Hash Table in Memory

The executor can be changed by saving into a hash table and comparing each
tuple in the cursor with that set to check if the tuple should be visible.
This approach has the advantage that it will be quick. It has the
disadvantage that, since the hash table will contain all the tuples of the
table being checked that it may use all local memory for a large table.

Option 4 MVCC Via Hash Table on Disk

When the UPDATE or DELETE request is executed the first time the Tid scan
database retrieval will be done first. At this time the tuple id of each row
in the table to be updated by the request will be available in the executor.
These tuple ids need to be stored in a hash table that is stored to disk, as,
if the table is large there could be a huge number of tuple ids. This data is
then available for comparison with the individual tuple to be updated or
deleted to check if it should be processed. The hash table will exist for the
duration of the transaction, from BEGIN to END (or ABORT).

The hash table is then used to identify if the tuple should be visible during
the current transaction. If the tuple should be visible then the update or
delete proceeds as usual.

This approach has the advantage that it will use little memory but will be
relatively slow as the data has to be accessed from disk.

Option 5 Store Tuple Id in Snapshot.

The Snapshot structure can be changed to include the tuple id. This enables
the current state of the tuple to be identified with respect to the current
transaction.
The tuple id, as identified in the cursor at the point where the
DELETE/UPDATE statement is being processed, can use the snapshot to identify
if the tuple should be visible in the context of the current transaction.

2.6 Changes to the Catalog
----------------------------
The Catalog needs to reflect changes introduced by the updatable cursor
implementation. A boolean attribute ‘is_for_update’ is to be added to the
pg_cursors implementation. It will define that the cursor is for update
(value is FALSE) or for share (value is TRUE, the default value).

3 Design Assumptions
----------------------------
The following design assumptions are made:

As PostgreSQL8.2 does not support the SENSITIVE cursor option the tuples
contained in a cursor can never be updated so these tuples will always appear
in their ‘original’ form as at the start of the transaction. This is in
breach of the SQL2003 Standard as described in 5WD-02-Foundation-2003-09.pdf,
p 810. The standard requires the updatable cursor to be declared as sensitive.

With respect to nested transactions – In PostgreSQL nested transactions are
implemented by defining ‘save points’ via the keyword SAVEPOINT. A ‘ROLLBACK
TO SAVEPOINT’ rolls back the database contents to the last savepoint in this
transaction or the begin statement, whichever is closer.

It is assumed that the FETCH statement is used to return only a single row
into the cursor with each command when the cursor is updatable.

According to the SQL2003 Standard Update and Delete statements may contain
only a single base table.

The DECLARE CURSOR statement is supposed to use column level locking, but
PostgreSQL supports only row level locking. The result of this is that the
column list that the standard requires ‘DECLARE <cursor_name> SELECT … FOR
UPDATE OF column-list’ becomes a relation (table) list.

This is an email from Fujitsu Australia Software Technology Pty Ltd, ABN 27 003 693 481. It is confidential to the ordinary user of the email address to which it was addressed and may contain copyright and/or legally privileged information. No one else may read, print, store, copy or forward all or any of it or its attachments. If you receive this email in error, please return to sender. Thank you.

If you do not wish to receive commercial email messages from Fujitsu Australia Software Technology Pty Ltd, please email unsubscribe(at)fast(dot)fujitsu(dot)com(dot)au

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2007-01-23 15:42:39 Re: [HACKERS] Win32 WEXITSTATUS too
Previous Message Bruce Momjian 2007-01-23 15:41:45 Re: [HACKERS] Win32 WEXITSTATUS too