Re: Query optimizer 8.0.1 (and 8.0)

From: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: pgsql(at)mohawksoft(dot)com, Ron Mayer <ron(at)cheapcomplexdevices(dot)com>, Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Query optimizer 8.0.1 (and 8.0)
Date: 2005-02-13 21:14:02
Message-ID: 20050213211402.GB52357@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

What's the purpose of doing this transformation? Is it just a means to
sub-divide the dataset? It's very possible that PostGIS would do just as
good a job, without using HTM. Granted, GIS is designed more for working
in LAT/LONG, but I suspect it should work just as well in whatever
coordinate system astronomers use.

Something else to consider is that it would be relatively easy to
imlpement an HTM type in postgresql, which would result in substantial
space savings. You need 3 bits for level 0, each additional level
requires 2 bits. This will be much smaller than storing the HTM in a
varchar, and also smaller than using a bit to indicate N vs S and an
int (or using sign to indicate N/S with an int).

On Sun, Feb 13, 2005 at 08:14:58PM +0300, Oleg Bartunov wrote:
> Probably off-topic, but I think it's worth to see what astronomers are
> doing with their very big spatial databases. For example, we are working
> with more than 500,000,000 rows catalog and we use some special
> transformation
> of coordinates to integer numbers with preserving objects closeness.
> I hope we could show postgresql is good enough to be used in astronomy
> for very big catalogs. Currently, MS SQL is in use.
> See http://www.sdss.jhu.edu/htm/ for details. We use another technique.
>
>
> Oleg
> On Wed, 9 Feb 2005 pgsql(at)mohawksoft(dot)com wrote:
>
> >I wrote a message caled "One Big trend vs multiple smaller trends in table
> >statistics" that, I think, explains what we've been seeing.
> >
> >
> >>pgsql(at)mohawksoft(dot)com wrote:
> >>>
> >>>In this case, the behavior observed could be changed by altering the
> >>>sample size for a table. I submit that an arbitrary fixed sample size is
> >>>not a good base for the analyzer, but that the sample size should be
> >>>based
> >>>on the size of the table or some calculation of its deviation.
> >>>
> >>
> >> Mark,
> >>
> >>Do you have any evidence that the Sample Size had anything to do
> >>with the performance problem you're seeing?
> >
> >Sample size is only a bandaid for the issue, however, more samples always
> >provide more information.
> >
> >
> >>
> >>I also do a lot with the complete Census/TIGER database.
> >>
> >>Every problem I have with the optimizer comes down to the
> >>fact that the data is loaded (and ordered on disk) by
> >>State/County FIPS codes, and then queried by zip-code
> >>or by city name. Like this:
> >>
> >> Alabama 36101 [hundreds of pages with zip's in 36***]
> >> Alaska 99686 [hundreds of pages with zip's in 9****]
> >> Arizona 85701 [hundreds of pages with zip's in 855**]
> >>
> >>Note that the zip codes are *NOT* sequential.
> >
> >Again, read "One Big Trend..." and let me know what you think. I think it
> >describes exactly the problem that we see.
> >
> >For now, the solution that works for me is to seriously up the value of
> >"targrows" in analyze.c. It makes it take longer, and while the stats are
> >not "correct" because they are not designed to detect these sorts of
> >patterns, a larger sample allows them to be "less wrong" enough to give a
> >better hint to the planner.
> >
> >
> >
> >---------------------------(end of broadcast)---------------------------
> >TIP 7: don't forget to increase your free space map settings
> >
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> Sternberg Astronomical Institute, Moscow University (Russia)
> Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> phone: +007(095)939-16-83, +007(095)939-23-83
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>

--
Jim C. Nasby, Database Consultant decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-02-13 22:07:52 Design notes for BufMgrLock rewrite
Previous Message Jeroen van Iddekinge 2005-02-13 18:18:56 Re: strerror_r int and char* return type mixup on FC2