Re: timestamp with time zone

Lists: pgsql-performance
From: Alessandro Gagliardi <alessandro(at)path(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: timestamp with time zone
Date: 2012-02-09 18:42:15
Message-ID: CAAB3BBJnD+dcaOQ_T8g+y==zen9yuOj2NvXrE5vhVe=RP28hMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Here's my query:

SELECT foursquare.name, foursquare.city, COUNT(moment_id) AS popularity
FROM foursq_categories
JOIN foursquare USING (foursq_id)
JOIN places USING (foursq_id)
JOIN blocks USING (block_id)
WHERE "primary"
AND (created at time zone timezone)::date = 'yesterday'
AND (country = 'USA' OR country = 'United States')
AND foursq_categories.name @@ to_tsquery('Restaurant')
GROUP BY foursq_id, foursquare.name, foursquare.city ORDER BY popularity
DESC LIMIT 12;

Here's my explain: http://explain.depesz.com/s/xoH

To my surprise, it was not the tsquery that made this slow (which is
awesome, because I was worried about that) but rather the filter: (created
at time zone timezone)::date = 'yesterday'
created has an index (btree if it matters). timezone does not. I'm
wondering if the solution to my problem is to create a joint index between
created and timezone (and if so, if there is a particular way to do that to
make it work the way I want).

Thanks in advance.

-Alessandro


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alessandro Gagliardi <alessandro(at)path(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: timestamp with time zone
Date: 2012-02-09 19:46:06
Message-ID: 27573.1328816766@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Alessandro Gagliardi <alessandro(at)path(dot)com> writes:
> WHERE ... (created at time zone timezone)::date = 'yesterday'

> created has an index (btree if it matters). timezone does not. I'm
> wondering if the solution to my problem is to create a joint index between
> created and timezone (and if so, if there is a particular way to do that to
> make it work the way I want).

The only way to make that indexable is to create an expression index on
the whole expression "(created at time zone timezone)::date". Seems
pretty special-purpose, though it might be worthwhile if you do that a
lot.

regards, tom lane


From: Alessandro Gagliardi <alessandro(at)path(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: timestamp with time zone
Date: 2012-02-09 20:00:56
Message-ID: CAAB3BB+R7O7NCDZJJp7sABWSDtbYsSf4H7ugxGn8EDcJ9H_VqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

I tried: CREATE INDEX blocks_created_at_timezone_idx ON blocks USING btree
((created at time zone timezone));

(Actually, I originally did try one on "(created at time zone
timezone)::date" but couldn't figure out how to phrase it in a way
PostgreSQL would accept.)

Anyway, no difference: http://explain.depesz.com/s/Zre

I even tried changing the filter to (created at time zone timezone) >
'yesterday' AND (created at time zone timezone) < 'today' to see if that
might make a difference. Sadly, no: http://explain.depesz.com/s/dfh

Here's the definition for the offending table:

CREATE TABLE blocks
(
block_id character(24) NOT NULL,
user_id character(24) NOT NULL,
created timestamp with time zone,
locale character varying,
shared boolean,
private boolean,
moment_type character varying NOT NULL,
user_agent character varying,
inserted timestamp without time zone NOT NULL DEFAULT now(),
networks character varying[],
lnglat point,
timezone character varying,
CONSTRAINT blocks_pkey PRIMARY KEY (block_id )
)
WITH (
OIDS=FALSE
);

CREATE INDEX blocks_created_at_timezone_idx
ON blocks
USING btree
(timezone(timezone::text, created) );

CREATE INDEX blocks_created_idx
ON blocks
USING btree
(created DESC NULLS LAST);

CREATE INDEX blocks_lnglat_idx
ON blocks
USING gist
(lnglat );

CREATE INDEX blocks_moment_type_idx
ON blocks
USING btree
(moment_type );

CREATE INDEX blocks_networks_idx
ON blocks
USING btree
(networks );

CREATE INDEX blocks_private_idx
ON blocks
USING btree
(private );

CREATE INDEX blocks_shared_idx
ON blocks
USING btree
(shared );

CREATE INDEX blocks_timezone_idx
ON blocks
USING btree
(timezone );

On Thu, Feb 9, 2012 at 11:46 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Alessandro Gagliardi <alessandro(at)path(dot)com> writes:
> > WHERE ... (created at time zone timezone)::date = 'yesterday'
>
> > created has an index (btree if it matters). timezone does not. I'm
> > wondering if the solution to my problem is to create a joint index
> between
> > created and timezone (and if so, if there is a particular way to do that
> to
> > make it work the way I want).
>
> The only way to make that indexable is to create an expression index on
> the whole expression "(created at time zone timezone)::date". Seems
> pretty special-purpose, though it might be worthwhile if you do that a
> lot.
>
> regards, tom lane
>


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alessandro Gagliardi" <alessandro(at)path(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: timestamp with time zone
Date: 2012-02-09 20:15:26
Message-ID: 4F33D4FE0200002500045139@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Alessandro Gagliardi <alessandro(at)path(dot)com> wrote:

> (Actually, I originally did try one on "(created at time zone
> timezone)::date" but couldn't figure out how to phrase it in a way
> PostgreSQL would accept.)

CREATE INDEX blocks_created_date_idx
ON blocks
USING btree
(((created at time zone timezone)::date));

-Kevin


From: Alessandro Gagliardi <alessandro(at)path(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: timestamp with time zone
Date: 2012-02-09 20:38:22
Message-ID: CAAB3BBJyVqQQKqDZSz3ankPuT0PcsNxfC6U_GaPCCLDVWBRpzA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Still slow as mud: http://explain.depesz.com/s/Zfn

Now I've got indices on created, timezone, created at time zone timezone,
and (created at time zone timezone)::date. Clearly the problem isn't a lack
of indices!...except, wait, it's not actually using blocks_created_date_idx
(or blocks_created_at_timezone_idx). How do I make that happen?

On Thu, Feb 9, 2012 at 12:15 PM, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov
> wrote:

> Alessandro Gagliardi <alessandro(at)path(dot)com> wrote:
>
> > (Actually, I originally did try one on "(created at time zone
> > timezone)::date" but couldn't figure out how to phrase it in a way
> > PostgreSQL would accept.)
>
> CREATE INDEX blocks_created_date_idx
> ON blocks
> USING btree
> (((created at time zone timezone)::date));
>
> -Kevin
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alessandro Gagliardi <alessandro(at)path(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-performance(at)postgresql(dot)org
Subject: Re: timestamp with time zone
Date: 2012-02-10 06:19:52
Message-ID: 7688.1328854792@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Alessandro Gagliardi <alessandro(at)path(dot)com> writes:
> Still slow as mud: http://explain.depesz.com/s/Zfn
> Now I've got indices on created, timezone, created at time zone timezone,
> and (created at time zone timezone)::date. Clearly the problem isn't a lack
> of indices!...except, wait, it's not actually using blocks_created_date_idx
> (or blocks_created_at_timezone_idx). How do I make that happen?

Did you ANALYZE the table after creating those indexes? Generally you
need an ANALYZE so that the planner will have some stats about an
expression index.

It might still think that the other index is a better option. In that
case you can experiment to see if it's right or not; the general idea
is

begin;
drop index index_that_planner_prefers;
explain analyze your_query;
rollback; -- revert the index drop

If that EXPLAIN isn't actually any better than what you had, then the
planner was right. If it is better, let's see 'em both.

regards, tom lane


From: Alessandro Gagliardi <alessandro(at)path(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-performance(at)postgresql(dot)org
Subject: Re: timestamp with time zone
Date: 2012-02-10 19:53:54
Message-ID: CAAB3BBL4JyD7cPo6uxnYL0-2CdtW+UdWP16pyOm85yEEtJqh2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hm. Tried running ANALYZE. Took almost 10 minutes to run. (Don't know if it
would have been run automatically since I last tried this yesterday, but
figured it couldn't hurt.) Still, no difference:
http://explain.depesz.com/s/xHq
Actually, it's 10x worse (maybe because this is my first time running this
query today, whereas last time I had run it, or a version of it, several
times before running that EXPLAIN). Anyway, good tip on dropping the index,
but I don't think that would be a good idea in this case because the index
it appears to be choosing is the primary key!

On Thu, Feb 9, 2012 at 10:19 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Alessandro Gagliardi <alessandro(at)path(dot)com> writes:
> > Still slow as mud: http://explain.depesz.com/s/Zfn
> > Now I've got indices on created, timezone, created at time zone timezone,
> > and (created at time zone timezone)::date. Clearly the problem isn't a
> lack
> > of indices!...except, wait, it's not actually using
> blocks_created_date_idx
> > (or blocks_created_at_timezone_idx). How do I make that happen?
>
> Did you ANALYZE the table after creating those indexes? Generally you
> need an ANALYZE so that the planner will have some stats about an
> expression index.
>
> It might still think that the other index is a better option. In that
> case you can experiment to see if it's right or not; the general idea
> is
>
> begin;
> drop index index_that_planner_prefers;
> explain analyze your_query;
> rollback; -- revert the index drop
>
> If that EXPLAIN isn't actually any better than what you had, then the
> planner was right. If it is better, let's see 'em both.
>
> regards, tom lane
>