Re: SQL/MED estimated time of arrival?

From: Eric Davies <eric(at)barrodale(dot)com>
To: Shigeru HANADA <hanada(at)metrosystems(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org,Mike Dunham-Wilkie <Mike(at)barrodale(dot)com>, Ian Barrodale <Ian(at)barrodale(dot)com>
Subject: Re: SQL/MED estimated time of arrival?
Date: 2010-11-16 17:31:43
Message-ID: 20101116173151.534E21337B7E@mail.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At 01:36 AM 11/16/2010, Shigeru HANADA wrote:
>Thanks for the information about Informix VTI. Because I'm not
>familiar to Informix, I might have missed your point. Would you mind
>telling me more about Informix VTI?

>On Mon, 15 Nov 2010 08:45:14 -0800
>Eric Davies <eric(at)barrodale(dot)com> wrote:
> > With Informix VTI, indexing is the same for native tables as for
> > virtual tables, except the interpretation of the 32 bit rowid is left
> > up to the developer. When you define the VTI class, you optionally
> > supply a method that can fetch data based on a 32 bit rowid, and it's
> > the responsibility of your non-indexed scanning methods to provide
> > rowids along with the row tuple.
>
>ISTM that index on a VTI table could be inconsistent when original
>(remote) data was changed in the way other than VTI. Is it assumed
>that the data source is never updated without VTI interface?

Yes, the data sources are assumed to updated only through the VTI interface.
With our UFI product, the data sources are assumed to be unchanging
files, you'd need to re-index them if they changed.

> > Having local indexes can be very useful if you have a user that
> > issues queries like:
> > select count(*) from some_external_table where .... ;
> > With VTI, the "count" aggregate doesn't get pushed down, meaning that
> > without a local index, your scanning method has to return as many
> > tuples as match the where clause, which can be very slow.
>
>How can Informix server optimize such kind of query? Counts the index
>tuple which match the WHERE clause?

That would be my assumption.

> If so, such optimization seems to
>be limited to "count" and wouldn't be able to be useful for "max" or
>"sum". Or, specialized index or VTI class is responsible to the
>optimization?

If there is an index on the column you want to sum/min/max, and your
where clause restricts the query to a particular set of rows based on
that index, Informix can get the values for that column from the
index (which it needed to scan anyhow) without looking at the table.
This isn't particular to VTI, it's just a clever use of indexes.

Here is a clipping from one of the Informix manuals on the topic:
The way that the optimizer chooses to read a table is called an
access plan. The simplest method to access a table is to read it
sequentially, which is called a table scan. The optimizer chooses a
table scan when most of the table must be read or the table does not
have an index that is useful for the query.
The optimizer can also choose to access the table by an index. If the
column in the index is the same as a column in a filter of the query,
the optimizer can use the index to retrieve only the rows that the
query requires. The optimizer can use a key-only index scan if the
columns requested are within one index on the table. The database
server retrieves the needed data from the index and does not access
the associated table.
Important:
The optimizer does not choose a key-only scan for a VARCHAR column.
If you want to take advantage of key-only scans, use the ALTER TABLE
with the MODFIY clause to change the column to a CHAR data type.
The optimizer compares the cost of each plan to determine the best
one. The database server derives cost from estimates of the number of
I/O operations required, calculations to produce the results, rows
accessed, sorting, and so forth.

> > Local indexes also affords the opportunity of using specialized
> > indexes built into the database. My guess is that without some form
> > of rowids being passed back and forth, you couldn't define
> > non-materialized views of virtual tables that could be indexed.
> >
> > That said, we implemented our own btree-like index that used the
> > pushed down predicates because fetching data one row at a time wasn't
> > desirable with our design choices, and we wanted to support virtual
> > tables with more than 4 billion rows.
>
>I couldn't see the way to handle virtual table with more than 4
>billion rows with 32 bit rowids in local index. Do you mean that your
>"btree-like index" searches result rows by predicates directly and
>skips getbyid()?

Exactly. Our own "rowids" can be up to 64 bits but are never seen by
Informix. As far as Informix is concerned, it's a regular table scan
because the use of our indexes is hidden.

>Regards,
>--
>Shigeru Hanada

Cheers,
Eric.

**********************************************
Eric Davies, M.Sc.
Senior Programmer Analyst
Barrodale Computing Services Ltd.
1095 McKenzie Ave., Suite 418
Victoria BC V8P 2L5
Canada

Tel: (250) 704-4428
Web: http://www.barrodale.com
Email: eric(at)barrodale(dot)com
**********************************************

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-11-16 18:01:04 Re: GiST insert algorithm rewrite
Previous Message Robert Haas 2010-11-16 17:27:05 Re: autovacuum maintenance_work_mem