Re: need some help with a delete statement

From: Matthew Hixson <hixson(at)poindextrose(dot)org>
To: <pgsql-sql(at)postgresql(dot)org>
Subject: Re: need some help with a delete statement
Date: 2003-07-01 08:24:04
Message-ID: 606DE816-AB9D-11D7-AB18-000393669C1A@poindextrose.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql


On Monday, June 30, 2003, at 12:00 PM, scott.marlowe wrote:

> On Mon, 30 Jun 2003, Matthew Hixson wrote:
>
>> On Monday, June 30, 2003, at 05:06 AM, scott.marlowe wrote:
>>
>>> On Fri, 27 Jun 2003, Matthew Hixson wrote:
>>>
>>>> Hi, I have a bunch of records that I need to delete from our
>>>> database.
>>>> These records represent shopping carts for visitors to our website.
>>>> The shopping carts I'd like to delete are the ones without anything
>>>> in
>>>> them. Here is the schema:
>>>>
>>>> create sequence carts_sequence;
>>>> create table carts(
>>>> cart_id integer default nextval('carts_sequence') primary key,
>>>> cart_cookie varchar(24));
>>>>
>>>> create sequence cart_contents_sequence;
>>>> create table cart_contents(
>>>> cart_contents_id integer default nextval('cart_contents_sequence')
>>>> primary key,
>>>> cart_id integer not null,
>>>> content_id integer not null,
>>>> expire_time timestamp);
>>>>
>>>> I'm trying to use this query to delete the carts that are not
>>>> referenced from the cart_contents table.
>>>>
>>>> delete from carts where cart_id in (select cart_id from carts except
>>>> (select distinct cart_id from cart_contents));
>>>>
>>>> My dev machine is running Postgres 7.3.3 and is a 550Mhz Titanium
>>>> running MacOS X 10.2.6. It has 1GB of RAM. I have 266777 entries
>>>> in
>>>> v_carts and only 3746 entries in v_cart_contents. Clearly there
>>>> are a
>>>> very large number of empty carts. Running the delete statement
>>>> above
>>>> runs for over 15 minutes on this machine. I just cancelled it
>>>> because
>>>> I want to find a faster query to use in case I ever need to do this
>>>> again. While the query is running the disk does not thrash at all.
>>>> It
>>>> is definitely CPU bound.
>>>> Limiting the statement to 1 item takes about 12 seconds to run:
>>>>
>>>> delete from carts where cart_id in (select cart_id from carts except
>>>> (select distinct cart_id from cart_contents) limit 1);
>>>> Time: 12062.16 ms
>>>
>>> While in() is notoriously slow, this sounds more like a problem where
>>> your
>>> query is having to seq scan due to mismatching or missing indexes.
>>>
>>> So, what kind of index do you have on cart_id,
>>
>> Its is a btree index.
>>
>> Table "public.carts"
>> Column | Type | Modifiers
>> -------------+-----------------------
>> +--------------------------------------------------
>> cart_id | integer | not null default
>> nextval('carts_sequence'::text)
>> cart_cookie | character varying(24) |
>> Indexes: v_carts_pkey primary key btree (cart_id),
>> cart_cart_cookie btree (cart_cookie)
>>
>>
>>> and what happens if you:
>>>
>>> select cart_id from carts except
>>> (select distinct cart_id from cart_contents) limit 1;
>>>
>>> then feed the cart_id into
>>>
>>> explain analyze delete from carts where cart_id=id_from_above;
>>>
>>> from psql?
>>
>> #explain analyze delete from carts where cart_id=2700;
>> QUERY PLAN
>> ----------------------------------------------------------------------
>> --
>> ------------------------------------------------
>> Index Scan using carts_pkey on carts (cost=0.00..3.16 rows=1
>> width=6)
>> (actual time=162.14..162.17 rows=1 loops=1)
>> Index Cond: (cart_id = 2700)
>> Total runtime: 162.82 msec
>> (3 rows)
>
> what does the output of psql say if you have the /timing switch on?

# select cart_id from carts except (select distinct cart_id from
cart_contents) limit 1;
cart_id
---------
2701
(1 row)
Time: 10864.89 ms

# explain analyze delete from carts where cart_id=2701;
QUERY PLAN
------------------------------------------------------------------------
--------------------------------------------
Index Scan using carts_pkey on carts (cost=0.00..3.16 rows=1 width=6)
(actual time=0.50..0.52 rows=1 loops=1)
Index Cond: (cart_id = 2701)
Total runtime: 1.06 msec
(3 rows)
Time: 257.83 ms

Thanks,
-M@

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Alon Noy 2003-07-01 09:01:22 passing a record as a function argument in pl/pgsql
Previous Message Stefan Bill 2003-07-01 04:54:45 Re: LEAST and GREATEST functions?