Performance on large, append-only tables

From: David Yeu <david(dot)yeu(at)skype(dot)net>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Cc: "ops(at)groupme(dot)com" <ops(at)groupme(dot)com>
Subject: Performance on large, append-only tables
Date: 2012-02-08 18:03:04
Message-ID: CB582308.955%david.yeu@skype.net
Lists: pgsql-performance

Hi there,

We've got a pretty large table that sees millions of new rows a day, and
we're trying our best to optimize queries against it. We're hoping to find
some guidance on this list.

Thankfully, the types of queries that we perform against this table are
pretty constrained. We never update rows and we never join against other
tables. The table essentially looks like this:

| id | group_id | created_at | everything else…

Here, `id' is the auto-incrementing primary key, `group_id' is the
foreign key that we always scope against, and `created_at' is the
insertion time. We have indexes on the primary key and on group_id.
Our queries essentially fall into the following cases:

* … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
* … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
* … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
* … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;

In human words, we're looking for:

* The most recent (20) rows.
* The most recent rows after a given `id'.
* Twenty rows before a given `id'.
* Pages of twenty rows.

Originally, this table was part of our primary database, but recently we
saw queries take upwards of thirty seconds to complete. Since we're
serving web requests, that's basically unacceptable, and it caused a lot
of requests to back up. Our interim solution has been to simply carve
out a new database that hosts only this table, and that has worked to
some degree. We aren't seeing thirty-second-plus database response times
anymore, but some queries still take many seconds, and the cost of
spinning up a new master-slave configuration hasn't been cheap.

In the meantime, we're hoping to investigate other ways to optimize this
table and the queries against it. Heroku's data team has suggested balling
up these rows into arrays, where a single row would represent a group_id,
and the data would occupy a single column as an array. We don't have any
experience with this and were wondering if anyone here has tried it.

And finally, we're also trying out alternative stores, since it seems like
this data and its retrieval could be well suited to document-oriented
backends. Redis and DynamoDB are currently the best contenders.

Thanks in advance for any help,

Regards,

Dave Yeu & Neil Sarkar
GroupMe


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: David Yeu <david(dot)yeu(at)skype(dot)net>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, "ops(at)groupme(dot)com" <ops(at)groupme(dot)com>
Subject: Re: Performance on large, append-only tables
Date: 2012-02-10 15:19:10
Message-ID: CAHyXU0wVRgMW_cNwhvg51PT=Rb3Cbu0EEs_2t3QfVf1t33gd4g@mail.gmail.com
Lists: pgsql-performance

On Wed, Feb 8, 2012 at 12:03 PM, David Yeu <david(dot)yeu(at)skype(dot)net> wrote:
> Hi there,
>
> We've got a pretty large table that sees millions of new rows a day, and
> we're trying our best to optimize queries against it. We're hoping to find
> some guidance on this list.
>
> Thankfully, the types of queries that we perform against this table are
> pretty constrained. We never update rows and we never join against other
> tables. The table essentially looks like this:
>
> | id | group_id | created_at | everything else…
>
> Here, `id' is the auto-incrementing primary key, `group_id' is the
> foreign key that we always scope against, and `created_at' is the
> insertion time. We have indexes on the primary key and on group_id.
> Our queries essentially fall into the following cases:
>
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
>  * … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
>  * … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;
>
> In human words, we're looking for:
>
>  * The most recent (20) rows.
>  * The most recent rows after a given `id'.
>  * Twenty rows before a given `id'.
>  * Pages of twenty rows.

You can probably significantly optimize this. But first, can we see
some EXPLAIN ANALYZE output for the affected queries?
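
For example, something like this (the table name `messages' and the
group id literal are placeholders for your real ones):

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM messages
WHERE group_id = 12345
ORDER BY created_at DESC
LIMIT 20;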

merlin


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: David Yeu <david(dot)yeu(at)skype(dot)net>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, "ops(at)groupme(dot)com" <ops(at)groupme(dot)com>
Subject: Re: Performance on large, append-only tables
Date: 2012-02-10 15:33:33
Message-ID: CAGTBQpY8ySbL0=b0GG75tsR-8dLaXFq-V3k5=WujXPDoorYJbg@mail.gmail.com
Lists: pgsql-performance

On Wed, Feb 8, 2012 at 3:03 PM, David Yeu <david(dot)yeu(at)skype(dot)net> wrote:
> Thankfully, the types of queries that we perform against this table are
> pretty constrained. We never update rows and we never join against other
> tables. The table essentially looks like this:
>
> | id | group_id | created_at | everything else…
...
> Our queries essentially fall into the following cases:
>
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
>  * … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
>  * … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;

I think you have something to gain from partitioning.
You could partition on group_id, which is akin to sharding, only on a
single server, and that would significantly decrease each partition's
index size. Since those queries' performance is highly dependent on
index size, and since you seem to have such a huge table, I would
imagine such partitioning would help keep the indices performant.
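
A minimal sketch with inheritance-based partitioning (the table name
`messages' and the range boundaries here are made up for illustration):

CREATE TABLE messages_g0 (
    CHECK (group_id >= 0 AND group_id < 1000000)
) INHERITS (messages);

CREATE INDEX messages_g0_group_created_idx
    ON messages_g0 (group_id, created_at DESC);

-- one child table per range; route inserts with a trigger or in the
-- application, and let constraint exclusion prune non-matching children
SET constraint_exclusion = partition;

Each child's index then covers only its slice of the groups, so index
lookups stay shallow even as the total row count grows.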

Now, we do need statistics. How many groups are there? Do they grow
with your table, or is the number of groups constant? Which OFFSET
values do you use? (OFFSET is quite expensive.)

And of course... explain analyze.


From: Marti Raudsepp <marti(at)juffo(dot)org>
To: David Yeu <david(dot)yeu(at)skype(dot)net>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, "ops(at)groupme(dot)com" <ops(at)groupme(dot)com>
Subject: Re: Performance on large, append-only tables
Date: 2012-02-10 15:34:39
Message-ID: CABRT9RBDXQDwjHSA1GGeAKd97UXRaSTK1B58V6rRmZvit8G2CQ@mail.gmail.com
Lists: pgsql-performance

On Wed, Feb 8, 2012 at 20:03, David Yeu <david(dot)yeu(at)skype(dot)net> wrote:
>  * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;
>  * Pages of twenty rows.

A good improvement for this sort of query is the "scalable paging"
trick. Instead of increasing the OFFSET argument -- which means that
Postgres has to scan and discard more and more rows -- you should
remember the index key where the last page ended.

In other words, you get the first page using:
WHERE group_id = ? ORDER BY created_at DESC LIMIT 20

Say this page returns created_at values between 2012-01-01 and
2012-01-10. Because the rows are sorted descending, the last row on the
page is the oldest one, 2012-01-01. If the user clicks "next page", you
run a query like this instead:
WHERE group_id = ? AND created_at < '2012-01-01' ORDER BY created_at DESC LIMIT 20

Thus, every "next page" fetch takes roughly constant time. Of course,
there's a small problem when two rows have equal times. Then you can
add the primary key to the sort key to disambiguate those rows,
comparing against the (created_at, pkey_col) pair of the last row on
the previous page:

WHERE group_id = ? AND (created_at, pkey_col) < ('2012-01-01', 712)
ORDER BY created_at DESC, pkey_col DESC LIMIT 20

Of course, an index on (group_id, created_at) or (group_id, created_at,
pkey_col) is necessary for these to work well.
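
For example (CONCURRENTLY avoids blocking writes while the index
builds; the table name `messages' is a placeholder):

CREATE INDEX CONCURRENTLY messages_keyset_idx
    ON messages (group_id, created_at DESC, pkey_col DESC);

With the index columns in the same order and direction as the ORDER BY,
each page becomes a short range scan of the index.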

Regards,
Marti


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Yeu <david(dot)yeu(at)skype(dot)net>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, "ops(at)groupme(dot)com" <ops(at)groupme(dot)com>
Subject: Re: Performance on large, append-only tables
Date: 2012-02-10 15:35:52
Message-ID: 13841.1328888152@sss.pgh.pa.us
Lists: pgsql-performance

David Yeu <david(dot)yeu(at)skype(dot)net> writes:
> Our queries essentially fall into the following cases:

> * WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
> * WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
> * WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
> * WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;

All of those should be extremely cheap if you've got the right indexes,
with the exception of the last one. Large OFFSET values are never a
good idea, because Postgres always has to scan and discard that many
rows. If you need to fetch successive pages, consider using a cursor
with a series of FETCH commands. Another possibility, if the data is
sufficiently constrained, is to move the limit point with each new
query, i.e., instead of OFFSET use something like

WHERE group_id = ? AND created_at < last-previous-value
ORDER BY created_at DESC LIMIT 20;
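
For the cursor approach, something along these lines (the table name is
a placeholder; a cursor without WITH HOLD only lives until its
transaction ends, so this suits batch-style paging rather than
stateless web requests):

BEGIN;
DECLARE recent_rows CURSOR FOR
    SELECT * FROM messages
    WHERE group_id = 12345
    ORDER BY created_at DESC;
FETCH 20 FROM recent_rows;  -- first page
FETCH 20 FROM recent_rows;  -- second page; the scan just continues
CLOSE recent_rows;
COMMIT;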

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, "David Yeu" <david(dot)yeu(at)skype(dot)net>
Cc: "ops(at)groupme(dot)com" <ops(at)groupme(dot)com>
Subject: Re: Performance on large, append-only tables
Date: 2012-02-10 15:44:02
Message-ID: 4F34E6E20200002500045251@gw.wicourts.gov
Lists: pgsql-performance

David Yeu <david(dot)yeu(at)skype(dot)net> wrote:

> We have indexes on the primary key and on group_id.
> Our queries essentially fall into the following cases:
>
> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
> * … WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
> * … WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
> * … WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;
>
> In human words, we're looking for:
>
> * The most recent (20) rows.
> * The most recent rows after a given `id'.
> * Twenty rows before a given `id'.
> * Pages of twenty rows.

The first thing I would try is building an index (perhaps
CONCURRENTLY to avoid disrupting production) on (group_id,
created_at). It might also be worth creating an index on (group_id,
id, created_at), but that's a less-sure win.
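
Something like the following, assuming the table is called `messages'
(a name invented here for illustration):

CREATE INDEX CONCURRENTLY messages_group_created_idx
    ON messages (group_id, created_at);

-- the less-sure win:
CREATE INDEX CONCURRENTLY messages_group_id_created_idx
    ON messages (group_id, id, created_at);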

> Originally, this table was part of our primary database, but
> recently we saw queries take upwards of thirty seconds to
> complete. Since we're serving web requests, that's basically
> unacceptable, and it caused a lot of requests to back up.

With only the indexes you mention, it had to be doing either
complete table scans for each request, or a lot of random access to
rows it didn't need.

> Our interim solution has been to simply carve out a new database
> that hosts only this table, and that has worked to some degree. We
> aren't seeing thirty-second-plus database response times anymore,
> but some queries still take many seconds, and the cost of spinning
> up a new master-slave configuration hasn't been cheap.

Well, throwing hardware at something doesn't generally hurt, but
it's not the first solution I would try, especially when the product
you're using has ways to tune performance.

> In the meantime, we're hoping to investigate other ways to
> optimize this table and the queries against it. Heroku's data team
> has suggested balling up these rows into arrays, where a single
> row would represent a group_id, and the data would occupy a single
> column as an array.

Ugh. You're a long way from needing to give up on the relational
model here.

> And finally, we're also trying out alternative stores, since it
> seems like this data and its retrieval could be well suited to
> document-oriented backends. Redis and DynamoDB are currently the
> best contenders.

Your current use of PostgreSQL is more or less equivalent to driving
a car around in first gear. You might consider a tuned PostgreSQL
as another alternative store. :-)

-Kevin