pgcheck - data integrity check

From: Rober Mach <mach(dot)robert(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: pgcheck - data integrity check
Date: 2007-08-10 03:48:08
Message-ID: 46BBDFF8.9080909@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I am working on a data integrity check tool (pgcheck).
I would like to discuss the following issues:

Right now I am working on a function, which should check the validity of
pages in relation. The relation is passed to the function as its
argument (its oid). For the integrity check of a page, I am using an
AccessShare lock on a relation as you can see on
http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgcheck/pokus/pgcheck_page/
. In near future, I would like to extend the functionality also with a
recovery tool which would be able to repair broken pages. Should the
function also repair the corrupted data on a page, the AccessShare lock
on a relation would not be sufficient. But on the other hand
AccessExclusive lock on the entire relation could significantly
influence the performance of a database.
So far I see these possibilities:
1- use AccessShare lock for the integrity check function and in case of
faulty page, pass this page to a special function, which would lock the
page using AccessExclusiveLock on the relation for correction.
+ it would not influence the performance so much
- higher complexity
2- use one function which would lock the relation with AccessExclusive
lock, check the integrity of data and in case of faulty pages, it would
repair it at once.
+ easier to implement
- could cause performance downturn because of relatively long
Exclusive lock on some relations.
3- use the AccessShare lock for the integrity check and use special
"single-user mode" for recovery of data
+ safest option for recovery of data
- long down time of database

Furhter, I would like to know your opinions on what should be checked
next in order to check the integrity of data in database. I am thinking
of checking:
-in case of variable-length data, compare the formal and actual size of data
-check whether constrains applied on items are fulfilled
-compare data in indexes with indexed tables, whether they are correct

Robert

P.S. Any comments to the c-funtion I made so far a welcome

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Adrian Maier 2007-08-10 05:08:55 Re: Compilation of pg 7.4.17 fails on HP-UX
Previous Message Gregory Stark 2007-08-10 03:35:57 Re: Unexpected VACUUM FULL failure