Re: beta testing version

From: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>
To: "'Zeugswetter Andreas SB'" <ZeugswetterA(at)wien(dot)spardat(dot)at>, "'pgsql-hackers(at)postgresql(dot)org'" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: beta testing version
Date: 2000-12-07 03:50:42
Message-ID: 8F4C99C66D04D4118F580090272A7A234D31D4@sectorbase1.sectorbase.com
Lists: pgsql-hackers

> > > > Sounds great! We can follow this way: when the first update to
> > > > a page after the last checkpoint is logged, the XLOG code can
> > > > log not the AM-specific update record but the entire page
> > > > (creating a backup "physical log"). During after-crash recovery
> > > > such pages will be redone first, ensuring page consistency for
> > > > further redo ops. This means a bigger log, of course.
> > >
> > > Be sure to include a CRC of each part of the block that you hope
> > > to replay individually.
> >
> > Why should we do this? I'm not going to replay parts individually,
> > I'm going to write entire pages to the OS cache and then apply
> > changes to them. Recovery is considered successful once the server
> > has ensured that all applied changes are on the disk. In the case
> > of a crash during recovery we'll replay the entire game.
>
> Yes, but there would need to be a way to verify the last page
> or record from the txlog when running on crap hardware. The point
> was that crap hardware writes our 8k pages in any order (e.g. 512
> bytes from the end, then 512 bytes from the front ...), and does not
> even notice that it only wrote part of one such 512-byte block when
> reading it back after a crash. But I actually doubt that this is
> true for all but the most crappy hardware.

Oh, I didn't consider log consistency at the time. Anyway, we need a
CRC for the entire log record, not for its 512-byte parts.

Well, I didn't worry about non-atomic 8K-block writes in the current
WAL implementation - we were never protected from this: a backend
inserts a tuple, but only the line pointers make it to disk => the new
line pointer points at some garbage inside the un-updated page content.
Yes, the transaction was not committed, but who knows what that garbage
contains and what we'll get from a scan trying to read it. Same for
index pages.

Can we come to an agreement about a CRC in log records? It's probably
not too late to add it (initdb).

On seeing a bad CRC, the recovery procedure will assume that the
current record (and all others after it, if any) is garbage - i.e. it
comes from an interrupted disk write - and may be ignored (a backend
writes data pages only after the changes are logged - if the changes
weren't successfully logged, then the on-disk image of the data pages
was not updated and we are not interested in those log records).
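
Roughly like this - just a sketch, the record layout and helper
functions below are invented placeholders, not the real WAL code,
and any CRC-32 routine would do:

typedef struct {
    unsigned long  xl_crc;      /* CRC of xl_data[0..xl_len-1]     */
    unsigned long  xl_len;      /* length of the redo information  */
    unsigned char *xl_data;     /* AM-specific redo information    */
} XLogRecordSketch;

extern unsigned long crc32_of(unsigned char *buf, int len);   /* any CRC func */
extern XLogRecordSketch *read_next_record(void);              /* NULL at end  */
extern void redo_record(XLogRecordSketch *rec);

void replay_log(void)
{
    XLogRecordSketch *rec;

    while ((rec = read_next_record()) != NULL)
    {
        if (crc32_of(rec->xl_data, (int) rec->xl_len) != rec->xl_crc)
            break;              /* bad CRC: this record and everything
                                 * after it comes from an interrupted
                                 * write, so just stop replaying here  */
        redo_record(rec);       /* record is intact, apply the changes */
    }
}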

This could be implemented very quickly (if someone points me to where
I can find a CRC function). And I could implement the "physical log"
by next Monday.
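
The "physical log" decision itself is simple - something like this
(again only a sketch, all names invented):

/* On the first change to a page since the last checkpoint, log the
 * entire 8K page instead of the AM-specific record, so that redo can
 * restore a consistent page image before any further redo ops. */
extern unsigned long GetLastCheckpointLSN(void);
extern void XLogWriteBackupPage(char *page);                /* whole 8K page */
extern void XLogWriteRecord(char *rec_data, int rec_len);   /* normal record */

void log_page_update(char *page, unsigned long page_lsn,
                     char *rec_data, int rec_len)
{
    if (page_lsn <= GetLastCheckpointLSN())
        XLogWriteBackupPage(page);          /* first touch since checkpoint */
    else
        XLogWriteRecord(rec_data, rec_len); /* ordinary AM-specific record  */
}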

Comments?

Vadim


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>
Cc: "'Zeugswetter Andreas SB'" <ZeugswetterA(at)wien(dot)spardat(dot)at>, "'pgsql-hackers(at)postgresql(dot)org'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: beta testing version
Date: 2000-12-07 04:26:11
Message-ID: 331.976163171@sss.pgh.pa.us
Lists: pgsql-hackers

"Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM> writes:
> This may be implemented very fast (if someone points me where
> I can find CRC func).

Lifted from the PNG spec (RFC 2083):

15. Appendix: Sample CRC Code

The following sample code represents a practical implementation of
the CRC (Cyclic Redundancy Check) employed in PNG chunks. (See also
ISO 3309 [ISO-3309] or ITU-T V.42 [ITU-V42] for a formal
specification.)

/* Make the table for a fast CRC. */
void make_crc_table(void)
{
    unsigned long c;
    int n, k;

    for (n = 0; n < 256; n++) {
        c = (unsigned long) n;
        for (k = 0; k < 8; k++) {
            if (c & 1)
                c = 0xedb88320L ^ (c >> 1);
            else
                c = c >> 1;
        }
        crc_table[n] = c;
    }
    crc_table_computed = 1;
}

/* Update a running CRC with the bytes buf[0..len-1]--the CRC
   should be initialized to all 1's, and the transmitted value
   is the 1's complement of the final running CRC (see the
   crc() routine below). */

unsigned long update_crc(unsigned long crc, unsigned char *buf,
                         int len)
{
    unsigned long c = crc;
    int n;

    if (!crc_table_computed)
        make_crc_table();
    for (n = 0; n < len; n++) {
        c = crc_table[(c ^ buf[n]) & 0xff] ^ (c >> 8);
    }
    return c;
}

/* Return the CRC of the bytes buf[0..len-1]. */
unsigned long crc(unsigned char *buf, int len)
{
    return update_crc(0xffffffffL, buf, len) ^ 0xffffffffL;
}

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>, "'Zeugswetter Andreas SB'" <ZeugswetterA(at)wien(dot)spardat(dot)at>, "'pgsql-hackers(at)postgresql(dot)org'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: beta testing version
Date: 2000-12-07 04:48:57
Message-ID: 382.976164537@sss.pgh.pa.us
Lists: pgsql-hackers

> Lifted from the PNG spec (RFC 2083):

Drat, I dropped the table declarations:

/* Table of CRCs of all 8-bit messages. */
unsigned long crc_table[256];

/* Flag: has the table been computed? Initially false. */
int crc_table_computed = 0;
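
With those in place, usage is just (made-up buffer, of course):

#include <stdio.h>

/* Example: checksum an arbitrary buffer with the crc() routine above. */
int main(void)
{
    unsigned char buf[] = "some log record bytes";
    unsigned long sum = crc(buf, (int) sizeof(buf) - 1);   /* skip the '\0' */

    printf("crc = 0x%08lx\n", sum);
    return 0;
}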

regards, tom lane


From: "Horst Herb" <hherb(at)malleenet(dot)net(dot)au>
To: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>, <pgsql-hackers(at)postgresql(dot)org>
Subject: CRC was: Re: beta testing version
Date: 2000-12-07 07:40:49
Message-ID: 00e701c06021$04f92d00$fcee2bcb@midgard
Lists: pgsql-hackers

> This could be implemented very quickly (if someone points me to where
> I can find a CRC function). And I could implement the "physical log"
> by next Monday.

I have been experimenting with CRCs for the past 6 months in our
database for internal logging purposes. I downloaded a lot of hash
libraries, tried different algorithms, and implemented a few myself.
Which algorithm do you want? Have a look at the openssl libraries
(www.openssl.org) for a start - if you don't find what you want, let
me know.

As the logging might include large data blocks, especially now that we
can TOAST our data, I would strongly suggest using strong hashes like
RIPEMD or MD5 instead of CRC-32 and the like. Sure, it takes more time
to calculate and more space on the hard disk, but then: a database
without data integrity (and means of _proving_ integrity) is pretty
worthless.
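
For example, with the OpenSSL one-shot interface the check is tiny
(just a sketch; the record arguments here are made up):

#include <string.h>
#include <openssl/md5.h>

/* Compare a log record's payload against its stored MD5 digest. */
int record_digest_ok(const unsigned char *data, size_t len,
                     const unsigned char stored[MD5_DIGEST_LENGTH])
{
    unsigned char digest[MD5_DIGEST_LENGTH];    /* 16 bytes */

    MD5(data, len, digest);                     /* one-shot digest */
    return memcmp(digest, stored, MD5_DIGEST_LENGTH) == 0;
}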

Horst


From: "xuyifeng" <jamexu(at)telekbird(dot)com(dot)cn>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: pre-beta is slow
Date: 2000-12-07 08:46:28
Message-ID: 003d01c0602a$32751240$5ac809c0@xyf
Lists: pgsql-hackers

Recently I downloaded a pre-beta PostgreSQL, and I found that insert
and update speed is slower than 7.0.3. Even if I turn off the sync
flag, it is still slower than 7.0. Why? How can I make it faster?

Regards,
XuYifeng


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Horst Herb <hherb(at)malleenet(dot)net(dot)au>
Cc: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: CRC was: Re: beta testing version
Date: 2000-12-07 10:17:23
Message-ID: 3A2F63B3.75A46541@tm.ee
Lists: pgsql-hackers

Horst Herb wrote:
>
> > This could be implemented very quickly (if someone points me to
> > where I can find a CRC function). And I could implement the
> > "physical log" by next Monday.
>
> I have been experimenting with CRCs for the past 6 months in our
> database for internal logging purposes. I downloaded a lot of hash
> libraries, tried different algorithms, and implemented a few myself.
> Which algorithm do you want? Have a look at the openssl libraries
> (www.openssl.org) for a start - if you don't find what you want, let
> me know.
>
> As the logging might include large data blocks, especially now that
> we can TOAST our data, I would strongly suggest using strong hashes
> like RIPEMD or MD5 instead of CRC-32 and the like. Sure, it takes
> more time to calculate and more space on the hard disk, but then: a
> database without data integrity (and means of _proving_ integrity)
> is pretty worthless.

The choice of hash algorithm could be made a compile-time switch quite
easily, I guess.
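
Something along these lines, I suppose (sketch only, the macro and
function names are invented):

/* Hypothetical compile-time selection of the log checksum function. */
#ifdef XLOG_USE_MD5
#define LOG_RECORD_CHECKSUM(buf, len)  log_md5_checksum(buf, len)
#else
#define LOG_RECORD_CHECKSUM(buf, len)  crc(buf, len)
#endif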

---------
Hannu


From: ncm(at)zembu(dot)com (Nathan Myers)
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: CRC was: Re: beta testing version
Date: 2000-12-07 19:42:53
Message-ID: 20001207114253.Y30335@store.zembu.com
Lists: pgsql-hackers

On Thu, Dec 07, 2000 at 06:40:49PM +1100, Horst Herb wrote:
> > This may be implemented very fast (if someone points me where
> > I can find CRC func). And I could implement "physical log"
> > till next monday.
>
> As the logging might include large data blocks, especially now that
> we can TOAST our data, I would strongly suggest using strong hashes
> like RIPEMD or MD5 instead of CRC-32 and the like.

Cryptographically-secure hashes are unnecessarily expensive to compute.
A simple 64-bit CRC would be of equal value, at much less expense.
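
For illustration, a table-driven CRC-64 looks exactly like the 32-bit
PNG code posted earlier; this sketch uses the bit-reversed ECMA-182
polynomial, but the particular polynomial is an open choice:

/* Table-driven CRC-64, same structure as the PNG CRC-32 sample.
   0xC96C5795D7870F42 is the bit-reversed ECMA-182 polynomial. */
static unsigned long long crc64_table[256];
static int crc64_table_computed = 0;

static void make_crc64_table(void)
{
    unsigned long long c;
    int n, k;

    for (n = 0; n < 256; n++) {
        c = (unsigned long long) n;
        for (k = 0; k < 8; k++)
            c = (c & 1) ? (0xC96C5795D7870F42ULL ^ (c >> 1)) : (c >> 1);
        crc64_table[n] = c;
    }
    crc64_table_computed = 1;
}

/* Return the CRC-64 of buf[0..len-1]. */
unsigned long long crc64(unsigned char *buf, int len)
{
    unsigned long long c = 0xFFFFFFFFFFFFFFFFULL;   /* init to all 1's */
    int n;

    if (!crc64_table_computed)
        make_crc64_table();
    for (n = 0; n < len; n++)
        c = crc64_table[(c ^ buf[n]) & 0xff] ^ (c >> 8);
    return c ^ 0xFFFFFFFFFFFFFFFFULL;               /* final complement */
}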

Nathan Myers
ncm(at)zembu(dot)com