Re: need some help with a delete statement

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Matthew Hixson <hixson(at)poindextrose(dot)org>
Cc: <pgsql-sql(at)postgresql(dot)org>
Subject: Re: need some help with a delete statement
Date: 2003-06-30 12:06:25
Message-ID: Pine.LNX.4.33.0306300604080.13665-100000@css120.ihs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Fri, 27 Jun 2003, Matthew Hixson wrote:

> Hi, I have a bunch of records that I need to delete from our database.
> These records represent shopping carts for visitors to our website.
> The shopping carts I'd like to delete are the ones without anything in
> them. Here is the schema:
>
> create sequence carts_sequence;
> create table carts(
> cart_id integer default nextval('carts_sequence') primary key,
> cart_cookie varchar(24));
>
> create sequence cart_contents_sequence;
> create table cart_contents(
> cart_contents_id integer default nextval('cart_contents_sequence')
> primary key,
> cart_id integer not null,
> content_id integer not null,
> expire_time timestamp);
>
> I'm trying to use this query to delete the carts that are not
> referenced from the cart_contents table.
>
> delete from carts where cart_id in (select cart_id from carts except
> (select distinct cart_id from cart_contents));
>
> My dev machine is running Postgres 7.3.3 and is a 550Mhz Titanium
> running MacOS X 10.2.6. It has 1GB of RAM. I have 266777 entries in
> v_carts and only 3746 entries in v_cart_contents. Clearly there are a
> very large number of empty carts. Running the delete statement above
> runs for over 15 minutes on this machine. I just cancelled it because
> I want to find a faster query to use in case I ever need to do this
> again. While the query is running the disk does not thrash at all. It
> is definitely CPU bound.
> Limiting the statement to 1 item takes about 12 seconds to run:
>
> delete from carts where cart_id in (select cart_id from carts except
> (select distinct cart_id from cart_contents) limit 1);
> Time: 12062.16 ms

While in() is notoriously slow, this sounds more like a problem where your
query is having to seq scan due to mismatching or missing indexes.

So, what kind of index do you have on cart_id, and what happens if you:

select cart_id from carts except
(select distinct cart_id from cart_contents) limit 1;

then feed the cart_id into

explain analyze delete from carts where cart_id=id_from_above;

from psql?

Is cart_id a fk to another table (or is another table using it as a fk?)

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Rudi Starcevic 2003-06-30 12:28:15 Bitwise operation
Previous Message Anagha Joshi 2003-06-30 10:53:50 Seqno. is not btree?