implicit abort harmful?

From: "Wayne Armstrong" <wdarmst(at)bacchus(dot)com(dot)au>
To: pgsql-general(at)postgresql(dot)org
Subject: implicit abort harmful?
Date: 2003-05-25 00:04:30
Message-ID: 200305250004.h4P04XSY019198@mail.bacchus.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,
I have been using postgresql for about 3 months now (after about 15 years of
using various other dbms).

I love it :) but - (and you just knew that but was coming :) there is one thing
that really causes me grief.
.
It is the tendency for postgres to rollback automatically on some errors.

What this leads to is the need for an application to commit much more
frequently than is really desirable.

Example1.

During import of 120 thousand records from an isam file system, 3 say records
fail integrity checks ( files in non-database systems tend not to have
referential integrity implemented on them except at an application level
(meaning not at all :). The desired result is to drop the records failing
integrity checks.
Importing into db2 or oracle say - I have the option to ignore the referential
integrity errors (i get the error code on the insert anyway), and continue with
the insert of the 120 thousand - 3 records.
In postgres, I either have to commit after every record write, or guarantee the
"cleanness" of the data before I begin the import - which is sometimes
difficult given the data sources I may be importing from (Often I neither own
nor can modify them).
Worse, if the intention behind the automatic rollback is to guarantee data
purity (from the postgresql manual 10.4.1 -
" Turn off autocommit and just do one commit at the end. (In plain SQL, this
means issuing BEGIN at the start and COMMIT at the end. Some client libraries
may do this behind your back, in which case you need to make sure the library
does it when you want it done.) If you allow each insertion to be committed
separately, PostgreSQL is doing a lot of work for each record added. An
additional benefit of doing all insertions in one transaction is that if the
insertion of one record were to fail then the insertion of all records inserted
up to that point would be rolled back, so you won't be stuck with partially
loaded data." ),
It actually makes more of a mess than leaving the decision to rollback under
application control.
If in this example the three "bad" records are scattered throughout the
dataset, I could end up (if I ignore the error as i would do for most other
dbms), with a random number of records in the table. etc etc.

Of course also, to do robust imports of the insert ifError update style
avaiilable in most other dbms I have worked with, it is also nescessary to
issue a commit before each insert/update attempt.

Example2 - ODBC driver rollbacks :-

Here is a snippet from a postgres log :-
2003-05-24 23:09:14 [1525] LOG: duration: 0.134503 sec
2003-05-24 23:09:14 [1525] LOG: query: select nspname from pg_namespace n, p
_class c where c.relnamespace=n.oid and c.oid='select'::regclass
2003-05-24 23:09:14 [1525] ERROR: Relation "select" does not exist
2003-05-24 23:09:14 [1525] LOG: statement: select nspname from pg_namespace
, pg_class c where c.relnamespace=n.oid and c.oid='select'::regclass
2003-05-24 23:09:14 [1525] LOG: query: ROLLBACK
2003-05-24 23:09:14 [1525] LOG: duration: 0.000538 sec

The rollback here is generated by the odbc driver to clear an error created by
the odbc driver incorrectly parsing the select from a subselect statement as a
table name.
The application is totally unaware an error has occured and a rollback has been
issued.
This is most likely to lead to data loss, unless, basically, an application
using odbc is in autocommit mode or commits after every sql statement issued.

This concerns me to the piont where I would really recommend not using the
parse statement option in the odbc driver (even though that reduces the odbc
drivers ability to mimic a prepare which is also problematic :) unless you are
using autocommit.

For any application complex enough to be working with autocommit turned off in
the odbc driver, the auto rollback is a real problem.

Please consider making the automatic abort/rollback feature an option ?

Regards,
Wayne Armstorng
Bacchus Management Systems
http://www.bacchus.com.au

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Doug McNaught 2003-05-25 00:50:13 Re: pg newbie stumped on sequences!
Previous Message Wayne Armstrong 2003-05-24 23:44:11 Re: Query error=Implicit ABORT?