Re: Cube extension improvement, GSoC

From: Stas Kelvich <stanconn(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cube extension improvement, GSoC
Date: 2013-05-14 12:30:21
Message-ID: 3AF67843-F8C3-4666-A234-39DCA33A6715@gmail.com
Lists: pgsql-hackers

Hi.

Thanks, Heikki, for the answer on google-melange. For some reason I didn't receive an email notification, so I only saw this answer today.

> Do you have access to a server you can use to perform those tests? (...)

Yes, I do. I maintain an MPI cluster at my university, so that is not a problem; in fact, the tests for this proposal were run on servers from that cluster. But thanks anyway for offering help.

There is an open question about supporting different data types in the cube extension. As I understand it, we have the following ideas:

* Add cube-like operators for arrays. We already have support for arrays of any data type and any number of dimensions.
If we want to use tree-like data structures for these operators, we will run into the same problems with trees and types. Besides, we can always cast an array to a cube and use the existing operators (see the first example after this list) -- or maybe I am misunderstanding this.

* Add support for storing cube coordinates in different data types (2-, 4-, 8-byte integers; 4-, 8-byte floats).
The main goal is to reduce the index size. In order not to break a large amount of code, we can store the data according to the data type's size (i.e. | smallint | real | real | double | double |) and, when we load it from disk or cache, cast it to float8 so that the existing code keeps working (the second example after this list shows the current per-value overhead). Two steps are needed to achieve this behavior:
1) Store information about the coordinate types when the index is created. A good question is where to keep this data structure, but I believe it can be done.
2) Change the functions that read and write data to disk so that they cast to/from float8 using the information from the previous step.

* Don't add type support to cube at all.
After all, there are other ways of reducing R-tree size. For example, we can store relative coordinates with dynamically sized MBRs (VRMBR) instead of absolute coordinates with fixed-size MBRs. There is some evidence that this can significantly reduce the size: http://link.springer.com/chapter/10.1007/11427865_13
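
To illustrate the first idea with what contrib/cube already provides (just a sketch; cube(float8[]), cube_distance() and the @> operator already exist):

-- distance between two points given as arrays cast to cube
select cube_distance(cube(array[1,2,3]::float8[]), cube(array[4,5,6]::float8[]));
-- containment test with the existing @> operator
select cube(array[0,0]::float8[], array[10,10]::float8[]) @> cube(array[3,4]::float8[]);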
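
For the second idea, the per-value cost it tries to reduce is easy to see today (the exact byte counts depend on the server version, but every coordinate is stored as a full float8):

-- compare on-disk sizes: cube vs. float8[] vs. real[] for the same point
select pg_column_size(cube(array[1,2,3]::float8[])) as cube_size,
       pg_column_size(array[1,2,3]::float8[])       as float8_array_size,
       pg_column_size(array[1,2,3]::real[])         as real_array_size;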

On May 8, 2013, at 2:35 PM, Alexander Korotkov wrote:

> On Sat, May 4, 2013 at 11:19 PM, Stas Kelvich <stanconn(at)gmail(dot)com> wrote:
> > I think we have at least 3 data types more or less similar to cube.
> > 1) array of ranges
> > 2) range of arrays
> > 3) 2d arrays
> > Semantically, cube is closest to an array of ranges. However, an array of ranges has huge storage overhead.
> > Also, we could declare cube as a domain over 2d arrays and define the operations of that domain.
>
> But what should we do when arrays in different records have different numbers of elements?
>
> We can be faced with exactly the same situation with cube.
>
> test=# create table cube_test (v cube);
> CREATE TABLE
>
> test=# insert into cube_test values (cube(array[1,2])), (cube(array[1,2,3]));
> INSERT 0 2
>
> In order to force all cubes to have the same number of dimensions, an explicit CHECK on the table is required.
> As I remember, cube treats absent dimensions as zeros.
>
> ------
> With best regards,
> Alexander Korotkov.
>
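
For the record, the explicit CHECK mentioned above can be written with the existing cube_dim() function, e.g.:

create table cube_test (v cube check (cube_dim(v) = 3));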
