From: | Gaetano Mendola <mendola(at)gmail(dot)com> |
---|---|
To: | PostgreSQL - Hans-Jürgen Schönig <postgres(at)cybertec(dot)at> |
Subject: | Re: CUDA Sorting |
Date: | 2012-02-12 01:20:16 |
Message-ID: | 4F3713D0.1030605@gmail.com |
Lists: | pgsql-hackers |
On 19/09/2011 21:41, PostgreSQL - Hans-Jürgen Schönig wrote:
>
> On Sep 19, 2011, at 5:16 PM, Tom Lane wrote:
>
>> Greg Stark<stark(at)mit(dot)edu> writes:
>>> That said, to help in the case I described you would have to implement
>>> the tapesort algorithm on the GPU as well.
>>
>> I think the real problem would be that we are seldom sorting just the
>> key values. If you have to push the tuples through the GPU too, your
>> savings are going to go up in smoke pretty quickly …
>>
>
>
> i would argue along a similar line.
> to make GPU code fast it has to be pretty much tailored to do exactly one thing - otherwise you have no chance to get anywhere close to card bandwidth.
> if you look at two "similar" GPU codes which seem to do the same thing you might easily see that one is 10 times faster than the other - for some bloody reason such as memory alignment, memory transaction size or whatever.
> this poses a bit of a problem: PostgreSQL sorting is so generic and so flexible that i would be really surprised if somebody could come up with a solution which really comes close to what the GPU can do.
> it would definitely be interesting to see a prototype, however.
NVIDIA's Thrust library provides the same sorting flexibility as postgres
does.
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/generate.h>
#include <thrust/sort.h>
#include <thrust/copy.h>
#include <cstdlib>
int main() {
  // generate 32M random numbers on the host
  thrust::host_vector<int> h_vec(32 << 20);
  thrust::generate(h_vec.begin(), h_vec.end(), rand);
  // transfer data to the device
  thrust::device_vector<int> d_vec = h_vec;
  // sort data on the device (846M keys per second on GeForce GTX 480)
  thrust::sort(d_vec.begin(), d_vec.end());
  // transfer data back to host
  thrust::copy(d_vec.begin(), d_vec.end(), h_vec.begin());
}
As you can see, the element type to be sorted is a template parameter, and
thrust::sort also has an overload that takes the comparator to use.
So compared with pg_qsort, thrust::sort gives you the same flexibility.
http://docs.thrust.googlecode.com/hg/group__sorting.html
Regards
Gaetano Mendola