Re: MD5 aggregate

From: Benedikt Grundmann <bgrundmann(at)janestreet(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MD5 aggregate
Date: 2013-06-14 13:20:45
Message-ID: CADbMkNPFHbevxzfDN5iCnR+pF18cPTbf0SC+GcVC0+7epZO3fw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 14, 2013 at 2:14 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Marko Kreen <markokr(at)gmail(dot)com> writes:
> > On Thu, Jun 13, 2013 at 12:35 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
> >> Attached is a patch implementing a new aggregate function md5_agg() to
> >> compute the aggregate MD5 sum across a number of rows.
>
> > It's more efficient to calculate per-row md5, and then sum() them.
> > This avoids the need for ORDER BY.
>
> Good point. The aggregate md5 function also fails to distinguish the
> case where we have 'xyzzy' followed by 'xyz' in two adjacent rows
> from the case where they contain 'xyz' followed by 'zyxyz'.
>
> Now, as against that, you lose any sensitivity to the ordering of the
> values.
>
> Personally I'd be a bit inclined to xor the per-row md5's rather than
> sum them, but that's a small matter.
>
> regards, tom lane
>
>
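xor works, but only if every row is distinct (i.e. at the very least all
columns together form a unique key). Otherwise identical rows cancel out
in pairs, since x xor x = 0, and the aggregate cannot tell whether they
were there at all.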
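
To make the duplicate-row problem concrete, here is a rough sketch in Python rather than SQL. The helper names (row_md5, xor_agg, sum_agg) are made up for illustration and are not part of Dean's patch or of PostgreSQL; each row is assumed to be already serialized to a string, and the sum is left as an unbounded integer rather than being wrapped to a fixed width.

```python
import hashlib
from functools import reduce


def row_md5(row: str) -> int:
    """Per-row MD5 digest, read as a 128-bit unsigned integer."""
    return int.from_bytes(hashlib.md5(row.encode("utf-8")).digest(), "big")


def xor_agg(rows) -> int:
    """Combine per-row digests with xor (hypothetical helper)."""
    return reduce(lambda acc, r: acc ^ row_md5(r), rows, 0)


def sum_agg(rows) -> int:
    """Combine per-row digests by summing them (hypothetical helper)."""
    return sum(row_md5(r) for r in rows)


# Both combiners ignore row order.
assert xor_agg(["x", "y"]) == xor_agg(["y", "x"])
assert sum_agg(["x", "y"]) == sum_agg(["y", "x"])

# A duplicated row cancels itself under xor, so these two inputs collide...
assert xor_agg(["a", "b", "b"]) == xor_agg(["a"])

# ...while summing still sees the duplicates.
assert sum_agg(["a", "b", "b"]) != sum_agg(["a"])
```

So with xor, any row that occurs an even number of times vanishes from the result, which is why it is only safe when the rows are known to be unique.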

