Re: PostgreSQL Data Loss

Lists: pgsql-hackers
From: BluDes <DESPAMMAMIdarocchi(at)PERFAVOREtiscali(dot)it>
To: pgsql-hackers(at)postgresql(dot)org
Subject: PostgreSQL Data Loss
Date: 2007-01-26 10:22:37
Message-ID: epckqg$aq9$1@nnrp.ngi.it

Hi everyone,
I have a problem with one of my customers.
I made a program that uses a PostgreSQL (win32) database to save its data.
My customer claims that he lost lots of data regarding his own clients
and that those data had surely been saved in the database.
My first guess is that he is the one who deleted the data but wants to
blame someone else; obviously I can't prove it.

Could it be possible for PostgreSQL to lose its data? Maybe with a file
corruption? Could it be possible to restore these data?

My program does not modify or delete data since it's more like a log that
only adds information. It is obviously possible to delete these logs but
that requires answering "yes" to 2 different warnings, so the data can't
be deleted accidentally.

I have other customers with even 10 times the amount of data of the one
who claimed the loss but no problems with them.
He obviously made no backups (and claims we never told him to make them,
so we are responsible even for this) though the program has a dedicated
Backup section.

Any suggestion?

Daniele


From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: BluDes <darocchi(at)tiscali(dot)it>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PostgreSQL Data Loss
Date: 2007-01-26 21:46:28
Message-ID: 45BA76B4.2080904@enterprisedb.com

BluDes wrote:
> I made a program that uses a PostgreSQL (win32) database to save its data.

What version of PostgreSQL is this?

> My customer claims that he lost lots of data regarding his own clients
> and that those data had surely been saved on the database.
> My first guess is that he is the one who deleted the data but wants to
> blame someone else, obviously I can't prove it.

Did he lose all data in one table, or just some rows? Or is there some
other pattern?

> Could it be possible for PostgreSQL to lose its data?

Not when properly installed.

> Maybe with a file corruption?

I doubt it. You'd almost certainly get warnings or errors if there's
corruption.

> Could it be possible to restore these data?

The first thing to do is to take a filesystem-level physical copy of the
data directory to prevent further damage. Copy the data directory to
another system for forensics.
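
A minimal sketch of such a copy in Python, assuming the PostgreSQL
service has been stopped first so the files are not changing underneath
the copy. The helper name copy_data_directory and the paths in the usage
note are hypothetical, not part of any PostgreSQL tool. It copies the
tree and then compares checksums so that the forensic copy is known to
match the original byte for byte.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_data_directory(src: Path, dst: Path) -> dict:
    """Copy src to dst (dst must not exist yet) and verify every file.

    Returns a mapping {relative path: sha256} for the files copied.
    Raises IOError if any copied file differs from its original.
    """
    shutil.copytree(src, dst)
    checksums = {}
    for f in sorted(src.rglob("*")):
        if f.is_file():
            rel = f.relative_to(src)
            digest = sha256_of(f)
            if sha256_of(dst / rel) != digest:
                raise IOError(f"copy of {rel} differs from the original")
            checksums[rel.as_posix()] = digest
    return checksums
```

It could be called as copy_data_directory(Path(r"C:\pgdata"),
Path(r"E:\forensics\pgdata")), where both paths are placeholders. All
later experiments should then be done on the copy only.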

You might be able to get a picture of what happened by looking at the
WAL logs using the xlogviewer tool in pgfoundry.

You can also modify the PostgreSQL source code so that it also shows row
versions marked as deleted, and recover the deleted data. I can't
remember exactly how to do it, maybe others who have done it can fill
in. A row stays physically in the file until the table is vacuumed;
hopefully it hasn't been.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: BluDes <DESPAMMAMIdarocchi(at)PERFAVOREtiscali(dot)it>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PostgreSQL Data Loss
Date: 2007-01-26 22:00:48
Message-ID: 45BA7A10.5050700@sun.com

If data are deleted, they are still stored in the database files until
VACUUM cleans them up. You can look with a hex viewer to see whether some
known text data is still there. I think there is also a tool that dumps
the tuple list from pages.
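
The hex-viewer check can be sketched in Python as below. This is only an
illustration under some assumptions: the function name
find_text_in_files is hypothetical, the search runs on a copy of the
data directory, the text was stored uncompressed, and the search string
is encoded the same way as the database encoding. It reads each file
whole, which is simple but memory-hungry for large tables.

```python
from pathlib import Path


def find_text_in_files(root: Path, needle: bytes):
    """Yield (file path, byte offset) for every occurrence of needle
    in any regular file under root."""
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()  # whole file in memory; fine for a sketch
        pos = data.find(needle)
        while pos != -1:
            yield path, pos
            pos = data.find(needle, pos + 1)
```

A hit only shows that the bytes are still physically on disk; it does
not say whether the row is live or deleted.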

You can also see deleted data if you change the current transaction ID,
but I am not sure whether that can be done easily.

Before experimenting, do not forget to back up the database files.

Zdenek



From: "J(dot) Andrew Rogers" <jrogers(at)neopolitan(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PostgreSQL Data Loss
Date: 2007-01-26 22:09:54
Message-ID: DA124DC7-B25A-4EAB-ACAC-96AF87D102C3@neopolitan.com


On Jan 26, 2007, at 2:22 AM, BluDes wrote:
> I have a problem with one of my customers.
> I made a program that uses a PostgreSQL (win32) database to save
> its data.
> My customer claims that he lost lots of data regarding his own
> clients and that those data had surely been saved on the database.
> My first guess is that he is the one who deleted the data but wants
> to blame someone else, obviously I can't prove it.
>
> Could it be possible for PostgreSQL to lose its data? Maybe with a
> file corruption? Could it be possible to restore these data?
>
> My program does not modify or delete data since it's more like a log
> that only adds information. It is obviously possible to delete
> these logs but it requires to answer "yes" to 2 different warnings,
> so the data can't be deleted accidentally.
>
> I have other customers with even 10 times the amount of data of the
> one who claimed the loss but no problems with them.
> He obviously made no backups (and claims we never told him to do
> them so we are responsible even for this) though the program has a
> dedicated Backup-section.

I have seen this data loss pattern many, many times, and on Oracle
too. The most frequent culprits in my experience:

1.) The customer screwed up big time and does not want to admit that
they made a mistake, hoping you can somehow pull their butt out of
the fire for free.

2.) Someone else sabotaged or messed up the customer's database, and
the customer is not aware of it.

3.) The customer deleted their own data and is oblivious to the fact
that they are responsible.

4.) There is some rare edge case in your application that generates
SQL that deletes all the data.

There is always the possibility that there is in fact some data loss
due to a failure of the database, but it is a rare kind of corruption
that deletes a person's data but leaves everything else intact with
no error messages, warnings, or other indications that something is
not right. Given the description of the problem, I find an internal
failure of the database to be a low probability reason for the data
loss.

Having run many database systems that had various levels of pervasive
internal change auditing/versioning, often unbeknownst to the casual
user, virtually all of the several "data loss" cases I've seen with a
description like the above clearly fit into cases #1-3 above when
we went into the audit logs, i.e. someone explicitly did the
deleting. I cannot tell you how many times people have tried to
pretend that the database "lost" or "messed up" their data and then
been embarrassed when they discover that I can step through every
single action they took to destroy their own data. I've never seen a
single case like the one described above that was due to an internal
database failure; when there is an internal database failure, it is
usually ugly and obvious.

Cheers,

J. Andrew Rogers
jrogers(at)neopolitan(dot)com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: BluDes <DESPAMMAMIdarocchi(at)PERFAVOREtiscali(dot)it>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PostgreSQL Data Loss
Date: 2007-01-26 23:46:49
Message-ID: 45BA92E9.3000606@dunslane.net

BluDes wrote:
> Hi everyone,
> I have a problem with one of my customers.
> I made a program that uses a PostgreSQL (win32) database to save its
> data.
> My customer claims that he lost lots of data regarding his own
> clients and that those data had surely been saved on the database.
> My first guess is that he is the one who deleted the data but wants to
> blame someone else, obviously I can't prove it.
>
> Could it be possible for PostgreSQL to lose its data? Maybe with a
> file corruption? Could it be possible to restore these data?
>
> My program does not modify or delete data since it's more like a log
> that only adds information. It is obviously possible to delete these
> logs but it requires to answer "yes" to 2 different warnings, so the
> data can't be deleted accidentally.
>
> I have other customers with even 10 times the amount of data of the
> one who claimed the loss but no problems with them.
> He obviously made no backups (and claims we never told him to do them
> so we are responsible even for this) though the program has a
> dedicated Backup-section.
>
> Any suggestion?
>
>

This isn't any sort of report that can be responded to. We need to know
what has happened to the machine, what is in the server logs, what are
the symptoms of data loss. The most likely explanations are pilot error
and hardware error.

cheers

andrew


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "BluDes" <DESPAMMAMIdarocchi(at)PERFAVOREtiscali(dot)it>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL Data Loss
Date: 2007-01-27 00:11:59
Message-ID: 87odol8h0w.fsf@stark.xeocode.com

"BluDes" <DESPAMMAMIdarocchi(at)PERFAVOREtiscali(dot)it> writes:

> My customer claims that he lost lots of data reguarding his own clients and
> that those data had surely been saved on the database.

Has this Postgres database been running for a long time? There is a regular
job called VACUUM that has to be run on every table periodically to recover
free space.

If this isn't run for a very long time (how long depends on how busy the
database is, but even on extremely large databases it's usually a matter of
months, on more normal databases it would be years) then very old records seem
to suddenly disappear. There is a way to recover data that this has happened
to, though, as long as you don't run vacuum after the data has disappeared.

To repeat: If you think this may have happened DO NOT run vacuum now.

Do you think this may have happened? How long ago was this database created?
Does your system periodically run VACUUM? Is the missing data in every table
or just a particular table?

Incidentally recent versions of Postgres don't allow this to occur and stop
running with a message insisting you run vacuum before continuing.
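
Why old records can "disappear" is easier to see with the arithmetic
written out. Transaction IDs are 32-bit counters compared modulo 2^32,
so an ID only counts as "in the past" while it is less than about 2
billion (2^31) transactions behind the current one. The Python sketch
below illustrates that comparison; the function name xid_precedes is
made up for this example, and the special reserved IDs (such as the
frozen ID) are deliberately ignored.

```python
def xid_precedes(a: int, b: int) -> bool:
    """True if 32-bit transaction ID a is logically older than b.

    The difference a - b is taken modulo 2**32 and interpreted as a
    signed 32-bit number; a precedes b when that number is negative.
    """
    diff = (a - b) & 0xFFFFFFFF
    return diff >= 0x80000000  # top bit set means negative as int32


if __name__ == "__main__":
    row_xid = 100  # hypothetical ID of the transaction that inserted a row
    for current in (200, 100 + 2**31 - 1, 100 + 2**31 + 1):
        print(current, "row in the past:", xid_precedes(row_xid, current))
```

Once the current ID is more than 2^31 ahead of the row's ID, the row no
longer compares as being in the past, which is the situation described
above where a never-vacuumed row stops being visible.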

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: BluDes <DESPAMMAMIdarocchi(at)PERFAVOREtiscali(dot)it>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PostgreSQL Data Loss
Date: 2007-01-27 05:31:01
Message-ID: 20070127053101.GB2917@svana.org

On Sat, Jan 27, 2007 at 12:11:59AM +0000, Gregory Stark wrote:
> If this isn't run for a very long time (how long depends on how busy the
> database is, but even on extremely large databases it's usually a matter of
> months, on more normal databases it would be years) then very old records seem
> to suddenly disappear. There is a way to recover data that this has happened
> to though as long as you don't run vacuum after the data has disappeared.
>
> To repeat: If you think this may have happened DO NOT run vacuum now.

Actually, for XID wraparound a VACUUM may be the right thing.
I looked at this (with guidance from Tom) and we came to the conclusion
that XID wraparound will hide tuples older than 2 billion transactions,
but VACUUM will mark as frozen anything newer than 3 billion
transactions, so for 1 billion transactions you can actually get your
data back.

Except for things like uniqueness guarantees, but they're solvable.

Not that I'm saying that the OP has this issue...

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: desrocchi(at)gmail(dot)com
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PostgreSQL Data Loss
Date: 2007-01-27 15:01:17
Message-ID: 1169910077.368699.205340@v33g2000cwv.googlegroups.com

On 27 Jan, 06:31, klep(dot)(dot)(dot)(at)svana(dot)org (Martijn van Oosterhout) wrote:

> > To repeat: If you think this may have happened DO NOT run vacuum now.
>
> Actually, for XID wraparound a VACUUM may actually be the right thing.
> I looked at this (with guidance from Tom) and we came to the conclusion
> that XID wraparound will hide tuples older than 2 billion transactions,
> but VACUUM will mark as frozen anything newer than 3 billion
> transactions, so for 1 billion transactions you can actually get your
> data back.
>
> Except for things like uniqueness guarantees, but they're solvable.

Hello,
thank you all for the help.
@Andrew Dunstan: this is the first time I've had this kind of
problem with PostgreSQL, I'm sorry I didn't provide all the needed
information.
Let me try to fill in something:
- the postgresql version is 8.1.4-1
- as far as I know, nothing happened to the machine. I work near
Milan, my customer is from somewhere between Rome and Tuscany. It
would be a long journey to retrieve a PC that he surely won't give us.
- The server logs... huh? Never heard of them... or rather, never
needed them. Where can I find them?

There is an even more foolish explanation for all of this, but my
customer denies that it happened:
in my program it is possible to deactivate the auto-save function for
the work done. Without this option the user has to click the button
himself to store the data in the database... so it could even be that
I'm trying to find data that has never even been saved.

Anyway this teaches me that I have to put logs in my programs to trace
every single time the users change settings.
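
As a rough idea of what I mean, here is a sketch in Python using the
standard logging module. The names make_audit_logger and
log_setting_change, the "auto_save" setting, and the file name are just
examples, not taken from my actual program.

```python
import logging


def make_audit_logger(path: str) -> logging.Logger:
    """Return a logger that appends one timestamped line per event to path.

    Uses a single fixed logger name, so the file chosen on the first
    call is kept for the lifetime of the process.
    """
    logger = logging.getLogger("settings_audit")
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def log_setting_change(logger: logging.Logger, user: str,
                       setting: str, old, new) -> None:
    """Record who changed which setting, with the old and new values."""
    logger.info("user=%s setting=%s old=%r new=%r", user, setting, old, new)
```

For example, log_setting_change(make_audit_logger("audit.log"),
"mario", "auto_save", True, False) would leave a dated line showing that
auto-save was switched off, which is exactly what I cannot prove today.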

Bye,
Daniele