Re: valid use of wildcard

Lists: pgsql-general
From: Irene Barg <ibarg(at)noao(dot)edu>
To: pgsql-general(at)postgresql(dot)org
Subject: valid use of wildcard
Date: 2008-10-29 23:04:03
Message-ID: 4908EBE3.3050505@noao.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi,

Is the following query a valid use of the 'wildcard' in (='2008-10-27%')?

> [arcsoft(at)dsan4 arcsoft]$ psql metadata
> Password:
> Welcome to psql 8.1.9, the PostgreSQL interactive terminal.
>
> metadata=# SELECT * FROM viewspace.siap AS t WHERE t."startDate"='2008-10-27%' AND t.prop_id LIKE '%' LIMIT 1000;

Causes the %CPU to jump and process lingers for over an hour.

> Processes: 87 total, 3 running, 84 sleeping... 321 threads 15:51:49
> Load Avg: 0.28, 0.28, 0.24 CPU usage: 11.4% user, 9.1% sys, 79.5% idle
> SharedLibs: num = 164, resident = 29.5M code, 4.52M data, 7.30M LinkEdit
> MemRegions: num = 10409, resident = 311M + 13.8M private, 501M shared
> PhysMem: 750M wired, 125M active, 1.42G inactive, 2.27G used, 1.73G free
> VM: 13.2G + 97.3M 30039(0) pageins, 0(0) pageouts
>
> PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
> 10637 postgres 69.1% 0:17.43 1 9 52 7.60M- 433M 56.9M- 1.06G
> 10635 psql 0.0% 0:00.00 1 14 22 256K+ 608K 728K+ 27.2M
> 10634 top 9.1% 0:03.96 1 21 20 492K 396K 976K 27.0M
> 10633 bash 0.0% 0:00.00 1 14 16 204K 792K 808K 27.1M
> 10632 sshd 0.0% 0:00.00 1 11 45 116K 1.58M 516K 30.0M
> 10628 sshd 0.0% 0:00.09 1 18 46 144K 1.58M 1.47M 30.1M
> 10562 postgres 0.0% 0:43.65 1 9 30 1.30M 433M 64.8M 1.05G
> 10559 psql 0.0% 0:00.03 1 14 23 252K 608K 736K 27.2M

I do a 'reindexdb -d metadata' and re-run same query and get a response
back quickly:

> [arcsoft(at)dsan4 arcsoft]$ psql metadata
> Password:
> Welcome to psql 8.1.9, the PostgreSQL interactive terminal.
>
>
> metadata=# SELECT * FROM viewspace.siap AS t WHERE t."startDate"='2008-10-27%' AND t.prop_id LIKE '%' LIMIT 1000;
> image_id | reference | fits_extension | object | prop_id | startDate | ra
> | dec | equinox | numberOfAxes | naxis_length | scale | mimeType | instrument | telesco
> pe | cprojection | crefpixel | crefvalue | cdmatrix | fileSize | pixflags | bandpass_id | bandpas
> s_unit | bandpass_lolimit | bandpass_hilimit | exposure | depth | depthErr | seeing | releaseDate
> | vo_id
> -----------+-------------------+----------------+-----------+------------+---------------------+----------
> -----+---------------+---------+--------------+--------------+---------+------------+------------+--------
> ---+-------------+-----------+-----------+----------+-----------+----------+---------------------+--------
> -------+------------------+------------------+----------+---------+----------+---------+------------------
> ---+-------
> ct1417659 | ct1417659.fits.gz | 1 | object | noao | 2008-10-27 00:00:00 | 14:59:22.
> 49 | -30:08:17.49 | 2000.0 | 2 | unknown | unknown | image/fits | mosaic_2 | ct4m
> | unknown | unknown | unknown | unknown | 88343772 | unknown | VR Supermacho c6027 | unknown
> | unknown | unknown | 1.000 | unknown | unknown | unknown | 2010-04-27 00:00:
> 00 |
> ct1417660 | ct1417660.fits.gz | 1 | unknown | smarts | 2008-10-27 00:00:00 | 18:05:49.
> 42 | -19:26:22.6 | 2000.0 | 2 | unknown | unknown | image/fits | ccd_spec | ct15m
> | unknown | unknown | unknown | unknown | 270250 | unknown | CuSO4 | unknown
> | unknown | unknown | 0.000 | unknown | unknown | unknown | 2010-04-27 00:00:
> 00 |
> ct1417661 | ct1417661.fits.gz | 1 | unknown | smarts | 2008-10-27 00:00:00 | 18:06:02.
> 66 | -19:26:22.8 | 2000.0 | 2 | unknown | unknown | image/fits | ccd_spec | ct15m
> | unknown | unknown | unknown | unknown | 269673 | unknown | CuSO4 | unknown
>

Why does reindexdb help?
How is WHERE t."startDate"='2008-10-27%' getting interpreted?

Thank you.
-- irene
---------------------------------------------------------------------
Irene Barg Email: ibarg(at)noao(dot)edu
NOAO/AURA Inc. http://www.noao.edu/noao/staff/ibarg
950 N. Cherry Ave. Voice: 520-318-8273
Tucson, AZ 85726 USA FAX: 520-318-8360
---------------------------------------------------------------------


From: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To: ibarg(at)noao(dot)edu
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: valid use of wildcard
Date: 2008-10-30 02:56:57
Message-ID: dcc563d10810291956pf6c0e7cpc8a8f50c53fd6ae7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, Oct 29, 2008 at 5:04 PM, Irene Barg <ibarg(at)noao(dot)edu> wrote:
> Hi,
>
> Is the following query a valid use of the 'wildcard' in (='2008-10-27%')?
>
>> [arcsoft(at)dsan4 arcsoft]$ psql metadata
>> Password: Welcome to psql 8.1.9, the PostgreSQL interactive terminal.
>>
>> metadata=# SELECT * FROM viewspace.siap AS t WHERE
>> t."startDate"='2008-10-27%' AND t.prop_id LIKE '%' LIMIT 1000;
>
> Causes the %CPU to jump and process lingers for over an hour.

Bad move. dates aren't strings, and their format can change based on
what you've got set for datestyle.

If you want a start date (that's a date or a timestamp) then use the
proper operators

where startDate='2008-10-27'

If startDate is a text / varchar type then you need to change it to a
date. storing dates in strings is bad.


From: Klint Gore <kgore4(at)une(dot)edu(dot)au>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: ibarg(at)noao(dot)edu, pgsql-general(at)postgresql(dot)org
Subject: Re: valid use of wildcard
Date: 2008-10-30 03:45:32
Message-ID: 49092DDC.80505@une.edu.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Scott Marlowe wrote:
> On Wed, Oct 29, 2008 at 5:04 PM, Irene Barg <ibarg(at)noao(dot)edu> wrote:
> > Hi,
> >
> > Is the following query a valid use of the 'wildcard' in (='2008-10-27%')?
> >
> >> [arcsoft(at)dsan4 arcsoft]$ psql metadata
> >> Password: Welcome to psql 8.1.9, the PostgreSQL interactive terminal.
> >>
> >> metadata=# SELECT * FROM viewspace.siap AS t WHERE
> >> t."startDate"='2008-10-27%' AND t.prop_id LIKE '%' LIMIT 1000;
> >
> > Causes the %CPU to jump and process lingers for over an hour.
>
> Bad move. dates aren't strings, and their format can change based on
> what you've got set for datestyle.
>
> If you want a start date (that's a date or a timestamp) then use the
> proper operators
>
> where startDate='2008-10-27'
>
> If startDate is a text / varchar type then you need to change it to a
> date. storing dates in strings is bad.
>

Surprisingly, '2008-10-27%' casts to a date in 8.3.3. I was expecting
the planner to cast the field to string to compare it (or throw an error
about implicit casting), but the literal goes to the field type (see
explain on a timestamp field below). Does the % have any special
meaning in casts to date/timestamp?

postgres=# select version();
version
-----------------------------------------------------
PostgreSQL 8.3.3, compiled by Visual C++ build 1400
(1 row)

postgres=# select '2008-10-27%'::date;
date
------------
2008-10-27
(1 row)

postgres=# explain select * from data where "timestamp" = '2008-10-27%';
QUERY PLAN
------------------------------------------------------------------------------
Seq Scan on data (cost=0.00..504.68 rows=2 width=27)
Filter: ("timestamp" = '2008-10-27 00:00:00'::timestamp without time
zone)
(2 rows)

postgres=#

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4(at)une(dot)edu(dot)au


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Klint Gore <kgore4(at)une(dot)edu(dot)au>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, ibarg(at)noao(dot)edu, pgsql-general(at)postgresql(dot)org
Subject: Re: valid use of wildcard
Date: 2008-10-30 04:12:18
Message-ID: 4441.1225339938@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Klint Gore <kgore4(at)une(dot)edu(dot)au> writes:
> Surprisingly, '2008-10-27%' casts to a date in 8.3.3.

Yeah, the datetime input code is pretty willing to overlook unexpected
punctuation. There are enough odd formats out there that I'm not sure
tightening it up would be a good idea.

regards, tom lane


From: Irene Barg <ibarg(at)noao(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Klint Gore <kgore4(at)une(dot)edu(dot)au>, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: valid use of wildcard
Date: 2008-11-03 17:39:32
Message-ID: 490F3754.7030802@noao.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

The 'real' problem was the database had not been re-indexed in a long
while (it is a test system). After re-indexing the db, the query below
ran fairly quicky:

>>> metadata=# SELECT * FROM viewspace.siap AS t WHERE
>>> t."startDate"='2008-10-27%' AND t.prop_id LIKE '%' LIMIT 1000;

The 'startDate' is a timestamp. I was just questioning the use of the
'=' operator with '%' instead of LIKE. I would have expected the '=' to
take the '%' as a literal.

Thanks Tom, Klint and Scott. I learned some debugging tips from this post.

--irene

Tom Lane wrote:
> Klint Gore <kgore4(at)une(dot)edu(dot)au> writes:
>> Surprisingly, '2008-10-27%' casts to a date in 8.3.3.
>
> Yeah, the datetime input code is pretty willing to overlook unexpected
> punctuation. There are enough odd formats out there that I'm not sure
> tightening it up would be a good idea.
>
> regards, tom lane

--
---------------------------------------------------------------------
Irene Barg Email: ibarg(at)noao(dot)edu
NOAO/AURA Inc. http://www.noao.edu/noao/staff/ibarg
950 N. Cherry Ave. Voice: 520-318-8273
Tucson, AZ 85726 USA FAX: 520-318-8360
---------------------------------------------------------------------