From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, david(at)fetter(dot)org |
Subject: | Re: MD5 aggregate |
Date: | 2013-06-26 21:48:14 |
Message-ID: | 20130626214814.GA865548@tornado.leadboat.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Jun 26, 2013 at 09:04:34PM +0100, Dean Rasheed wrote:
> On 26 June 2013 19:32, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote:
> > md5_agg() is well-defined and not cryptographically novel, and your use case
> > is credible. However, not every useful-sometimes function belongs in core; we
> > mostly stick to functions with ubiquitous value and functions that would be
> > onerous to implement externally. (We do carry legacy stuff that wouldn't make
> > the cut today.) In my estimation, md5_agg() does not make that cut. The
> > variety of intuitive md5_agg() definitions attested upthread doesn't bode well
> > for its broad applicability. It could just as well live in an extension
> > published on PGXN. Mine is just one opinion, though; I won't be horrified if
> > the community wants an md5_agg() in core.
>
> I disagree with this though. I see md5_agg() as a natural extension of
> the already-in-core md5() functions and underlying code, satisfying a
> well-defined use-case, and providing a checksum comparable with
> externally generated md5 sums.
All true, but I don't see those facts justifying core inclusion.
> A quick google search reveals several people asking for something like
> this, and people recommending md5(string_agg(...)) or
> md5(string_agg(md5(...))) based solutions, which are doomed to failure
> on larger tables. So I think that there is a case for having md5_agg()
> in core as an alternative to such hacks, while having more
> sophisticated crypto functions available as extensions.
My nine Google hits for "md5(string_agg" included one Stack Overflow answer,
several mirrors of that answer, and a few posts on this thread.
> I haven't looked at pgcrypto because, as I said, performance wasn't my
> primary criterion either, but removing the unnessary data copy just
> seemed like good sense.
>
> I'll take a look at it, and then as you and Peter suggest, look to
> split the patch into two. I think I will withdraw md5_total() because
> I was never entirely happy with that anyway.
Understood; feel free to back off from any performance improvements in which
you didn't wish to involve yourself.
If we do end up moving forward with md5_agg(), I note that the pgcrypto
version already has an initialize/accumulate/finalize API division. A patch
importing that code largely-unchanged would be easier to verify than a patch
restructuring the src/backend/libpq/md5.c implementation. The two patches
would then be a "use pgcrypto md5.c in core" patch and an md5_agg() patch.
Thanks,
nm
--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2013-06-26 21:49:46 | Re: Spin Lock sleep resolution |
Previous Message | Selena Deckelmann | 2013-06-26 21:32:24 | Re: Kudos for Reviewers -- straw poll |