From: Greg Stark <stark(at)mit(dot)edu>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minmax indexes
Date: 2013-09-27 18:43:51
Message-ID: CAM-w4HOJ4qq=HFy_5AUdbbdojN_nYjdeLooHFO7_N4c6j16Dkw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 27, 2013 at 7:22 PM, Jim Nasby <jim(at)nasby(dot)net> wrote:
>
> Yeah, we obviously kept things simpler when adding forks in order to get the feature out the door. There's improvements that need to be made. But IMHO that's not reason to automatically avoid forks; we need to consider the cost of improving them vs what we gain by using them.

I think this gives short shrift to the decision to introduce forks.
If you go back to the discussion at the time, it was a topic of
debate, and the argument that won the day was that interleaving
different streams of data in one storage system is exactly what the
file system is designed to do; we would just be reinventing the wheel
if we tried to do it ourselves. I think that makes a lot of sense for
things like the fsm or vm, which grow indefinitely and are maintained
by a different piece of code from the main heap.

The tradeoff might be somewhat different for the pieces of a data
structure like a bitmap index or GIN index, where the code responsible
for maintaining all the pieces is the same.

> Honestly, I think we actually need more obfuscation between what happens on the filesystem and the rest of postgres... we're starting to look at areas where that would help. For example, the recent idea of being able to truncate individual relation files and not being limited to only truncating the end of the relation. My concern in that case is that 1GB is a pretty arbitrary size that we happened to pick, so if we're going to go for more efficiency in storage we probably shouldn't just blindly stick with 1G (though of course initial implementation might do that to reduce complexity, but we better still consider where we're headed).

The ultimate goal here would be to get the filesystem to issue a TRIM
call so an SSD storage system can reuse the underlying blocks.
Truncating 1GB files might be a convenient way to do it, especially if
we have some new kind of vacuum full that can pack tuples within each
1GB file.

But there may be easier ways to achieve the same thing. If we can
notify the filesystem that we're not using some of the blocks in the
middle of the file, we might be able to just leave things where they
are and have holes in the files. Or we might be better off not
depending on truncate at all and instead looking for ways to mark
entire 1GB files as "deprecated", moving tuples out of them until we
can remove the whole file.

--
greg
