Re: Question about databases in alternate locations...

Lists: pgsql-generalpgsql-hackerspgsql-sql
From: Richard J Kuhns <rjk(at)grauel(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Question about databases in alternate locations...
Date: 2000-05-16 15:25:48
Message-ID: 14625.26748.356187.68826@localhost.grauel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

Could anyone please tell me what I'm doing wrong? I'm sure I'm just
overlooking something, but what?

======================

moran:/acct$ id
uid=1007(postgres) gid=1003(postgres) groups=1003(postgres)
moran:/acct$ export P=/acct/pindybook
moran:/acct$ initlocation P
The location will be initialized with username "postgres".
This user will own all the files and must also own the server process.

Fixing permissions on pre-existing directory /acct/pindybook
Creating directory /acct/pindybook/base

initlocation is complete.
You can now create a database using
CREATE DATABASE <name> WITH LOCATION = 'P'
in SQL, or
createdb <name> -D 'P'
from the shell.

moran:/acct$ createdb indybook -D 'P'
ERROR: The database path 'P' is invalid. This may be due to a character that is not allowed or because the chosen path isn't permitted for databases
createdb: database creation failed
moran:/acct$ ls -ld pindybook
drwx------ 3 postgres postgres 512 May 16 09:40 pindybook
moran:/acct$ ls -l pindybook
total 1
drwx------ 2 postgres postgres 512 May 16 09:40 base
moran:/acct$

======================

Thanks...

--
Richard Kuhns rjk(at)grauel(dot)com
PO Box 6249 Tel: (765)477-6000 \
100 Sawmill Road x319
Lafayette, IN 47903 (800)489-4891 /


From: Jeff Hoffmann <jeff(at)propertykey(dot)com>
To: Richard J Kuhns <rjk(at)grauel(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Question about databases in alternate locations...
Date: 2000-05-16 15:47:38
Message-ID: 39216D9A.48D1577A@propertykey.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

Richard J Kuhns wrote:
>
> Could anyone please tell me what I'm doing wrong? I'm sure I'm just
> overlooking something, but what?
>
> ======================
>
> moran:/acct$ id
> uid=1007(postgres) gid=1003(postgres) groups=1003(postgres)
> moran:/acct$ export P=/acct/pindybook

first guess is this: did you export that value before you started the
postmaster? the postmaster needs to have that value in it's environment
before it is started in order for you to use the alternate location.

good luck,

jeff


From: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
To: Richard J Kuhns <rjk(at)grauel(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Question about databases in alternate locations...
Date: 2000-05-17 05:38:58
Message-ID: 39223072.63596770@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

> Could anyone please tell me what I'm doing wrong? I'm sure I'm just
> overlooking something, but what?

As Jeff pointed out, the environment variable "P" must be known to the
server backend to be used in the WITH LOCATION clause. Using it in the
preceeding initlocation invocation was correct. The utility tries it
as an environment variable, then as an absolute path, so "initlocation
P" and "initlocation $P" are both valid. You can make the environment
variable known to the backend by defining it in the postgres account's
.cshrc or .bashrc file, or by explicitly setting it before firing up
the backend.

- Thomas

--
Thomas Lockhart lockhart(at)alumni(dot)caltech(dot)edu
South Pasadena, California


From: Richard J Kuhns <rjk(at)grauel(dot)com>
To: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: Richard J Kuhns <rjk(at)grauel(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Question about databases in alternate locations...
Date: 2000-05-17 13:50:27
Message-ID: 14626.41891.173479.250021@localhost.grauel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

Thomas Lockhart writes:
> > Could anyone please tell me what I'm doing wrong? I'm sure I'm just
> > overlooking something, but what?
>
> As Jeff pointed out, the environment variable "P" must be known to the
> server backend to be used in the WITH LOCATION clause. Using it in the
> preceeding initlocation invocation was correct. The utility tries it
> as an environment variable, then as an absolute path, so "initlocation
> P" and "initlocation $P" are both valid. You can make the environment
> variable known to the backend by defining it in the postgres account's
> .cshrc or .bashrc file, or by explicitly setting it before firing up
> the backend.
>
> - Thomas

Thanks to everyone who answered; my problem was that the backend knew
nothing about it.

That brings up a comment, a question, and an offer. First, the comment: I
actually did check the user's guide before I posted the question, but the
description of initlocation doesn't mention it at all -- it just gives an
example that doesn't work unless the backend already knows about the
variable. It does refer to the CREATE DATABASE section, but at a quick
glance (I know, I should have read more carefully, mea culpa!) I just saw
an example that looked similar to the initlocation example.

Now for the question. What's the reason for using this method, as opposed
to using, say, a system catalog to hold the valid locations? Historical?
Having to stop and restart the backend so it can re-read its environment
seems kind of archaic.

Now the offer. I'm in the design stage of the process of converting a
fairly large legacy application to PostgreSQL. It's going to be
essentially a complete re-write, but I should be able to do it in
more-or-less independent sections. I really like what I've experienced so
far of PostgreSQL, I'd like to contribute, and modifying the postmaster to
use (or at least look at, if it exists) a system catalog for this info
might be a good way to get my feet wet. Comments?

Thanks...
- Rich

--
Richard Kuhns rjk(at)grauel(dot)com
PO Box 6249 Tel: (765)477-6000 \
100 Sawmill Road x319
Lafayette, IN 47903 (800)489-4891 /


From: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
To: Richard J Kuhns <rjk(at)grauel(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Question about databases in alternate locations...
Date: 2000-05-17 15:27:46
Message-ID: 3922BA72.7D78BF69@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

> That brings up a comment, a question, and an offer. First, the comment: I
> actually did check the user's guide before I posted the question, but the
> description of initlocation doesn't mention it at all -- it just gives an
> example that doesn't work unless the backend already knows about the
> variable. It does refer to the CREATE DATABASE section, but at a quick
> glance (I know, I should have read more carefully, mea culpa!) I just saw
> an example that looked similar to the initlocation example.

Hmm. You are right, the doc on initlocation is weak. I'll put it on my
todo list (and will always gladly accept patches to the sgml sources
or just new words in an email ;)

However, the topic is covered in more detail in the Admin Guide, in
the chapter on "Disk Management" (actually, it is the only topic in
that chapter so far :(

> Now for the question. What's the reason for using this method, as opposed
> to using, say, a system catalog to hold the valid locations? Historical?
> Having to stop and restart the backend so it can re-read its environment
> seems kind of archaic.

This was and is a topic of discussion on the -hackers list. Peter E
(if I recall right) was proposing some changes to remove the
environment variable capabilities in Postgres. He also proposed making
a *list* of allowed locations as an environment variable as a way of
managing or controlling the allowed locations.

In my view, environment variables (or some other mechanism,
potentially) allow a dbamin to decouple the storage location from the
database contents, and give some control over allowed locations. The
current implementation is not ideal; for example Peter's proposal to
have a list of allowed locations seems great, since at the moment the
backend will try *any* environment variable (e.g. $HOME) so could be a
security problem.

Putting all of this stuff in a table is a possibility, but
1) Ingres did this, but they had way too many tables involved in
defining and using tables imho. We should do better.
2) If a dbadmin wants to *carefully* move database locations around,
the environment variables allow this to happen by just shutting down
the backend, tarring/untarring a disk area, redefining the environment
variable, and restarting the backend.
3) We don't (yet) have a way to move tables from within Postgres. So
hardcoding or "hard storing" absolute paths would make it pretty
difficult to accomplish (2).

> Now the offer. I'm in the design stage of the process of converting a
> fairly large legacy application to PostgreSQL. It's going to be
> essentially a complete re-write, but I should be able to do it in
> more-or-less independent sections. I really like what I've experienced so
> far of PostgreSQL, I'd like to contribute, and modifying the postmaster to
> use (or at least look at, if it exists) a system catalog for this info
> might be a good way to get my feet wet. Comments?

Not sure that we should do the system catalog thing without first
implementing the ability to do a "ALTER TABLE SET LOCATION=..."
command from within Postgres. But it's time to move the the -hackers
list. Welcome!

- Thomas

--
Thomas Lockhart lockhart(at)alumni(dot)caltech(dot)edu
South Pasadena, California


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard J Kuhns <rjk(at)grauel(dot)com>
Cc: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>, pgsql-general(at)postgresql(dot)org
Subject: Re: Question about databases in alternate locations...
Date: 2000-05-17 16:35:14
Message-ID: 17483.958581314@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

Richard J Kuhns <rjk(at)grauel(dot)com> writes:
> Now for the question. What's the reason for using this method, as opposed
> to using, say, a system catalog to hold the valid locations? Historical?
> Having to stop and restart the backend so it can re-read its environment
> seems kind of archaic.

Well, there'd be a certain amount of circularity in consulting a table
to find out where you can find tables, no? ;-) But you're right, the
environment-variable mechanism is pretty grotty. There's been a great
deal of discussion already in pg-hackers about how to clean up this
and related issues; suggest you consult the archives if you want to get
involved with fixing it.

regards, tom lane


From: Patrick Robin <Patrick(dot)Robin(at)disney(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: pg_dump return failed sanity check
Date: 2000-05-17 17:13:20
Message-ID: 3922D330.470E9451@fa.disney.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

Hi,

When I try to use pg_dump, I get this error. Can it have something to

do with a custom type I added. I made sure I added the input/output functions

and comparision functions for sorting and queries.

The type works fine in SQL queries in general.

> pg_dump -s scm
\connect - d23adm
failed sanity check, type with oid 457690 was not found

Thanks

Patrick

--
________________________________________
Patrick Robin
patrickr(at)fa(dot)disney(dot)com
Walt Disney Feature Animation
500 South Buena Vista Street
Burbank,California 91521-4817


From: mikeo <mikeo(at)spectrumtelecorp(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: line type
Date: 2000-05-17 17:41:24
Message-ID: 3.0.1.32.20000517134124.0092e240@pop.spectrumtelecorp.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

hi,
we're looking at migrating from ORACLE to postgres in the
very near future and we've run into a small problem. there's
a data type defined "LINE". we have named one of our tables
as "LINE" also and it would require a great deal of code
changes to rename that table. is it possible to simply
"turn off" the line type? any help is appreciated.

thanks,
mikeo


From: mikeo <mikeo(at)spectrumtelecorp(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: remove line type?
Date: 2000-05-17 18:43:40
Message-ID: 3.0.1.32.20000517144340.0092faa0@pop.spectrumtelecorp.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

hi,
we're looking at migrating from ORACLE to postgres in the
very near future and we've run into a small problem. there's
a data type defined "LINE". we have named one of our tables
as "LINE" also and it would require a great deal of code
changes to rename that table. is it possible to simply
"turn off" the line type? any help is appreciated.

thanks,
mikeo


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: mikeo <mikeo(at)spectrumtelecorp(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: remove line type?
Date: 2000-05-17 18:51:18
Message-ID: 200005171851.OAA02070@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

I guess you could remove the line type from the pg_type table and see if
that helps.

> hi,
> we're looking at migrating from ORACLE to postgres in the
> very near future and we've run into a small problem. there's
> a data type defined "LINE". we have named one of our tables
> as "LINE" also and it would require a great deal of code
> changes to rename that table. is it possible to simply
> "turn off" the line type? any help is appreciated.
>
> thanks,
> mikeo
>
>

--
Bruce Momjian | http://www.op.net/~candle
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: mikeo <mikeo(at)spectrumtelecorp(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: remove line type?
Date: 2000-05-17 19:04:52
Message-ID: 200005171904.PAA02741@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

If you do it in template1 database after initdb, all new databases will
not have that type either.

> that worked!!! thanks!
>
> mikeo
>
>
> At 02:51 PM 5/17/00 -0400, Bruce Momjian wrote:
> >I guess you could remove the line type from the pg_type table and see if
> >that helps.
> >
> >> hi,
> >> we're looking at migrating from ORACLE to postgres in the
> >> very near future and we've run into a small problem. there's
> >> a data type defined "LINE". we have named one of our tables
> >> as "LINE" also and it would require a great deal of code
> >> changes to rename that table. is it possible to simply
> >> "turn off" the line type? any help is appreciated.
> >>
> >> thanks,
> >> mikeo
> >>
> >>
> >
> >
> >--
> > Bruce Momjian | http://www.op.net/~candle
> > pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
> > + If your life is a hard drive, | 830 Blythe Avenue
> > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
> >
>

--
Bruce Momjian | http://www.op.net/~candle
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: mikeo <mikeo(at)spectrumtelecorp(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: remove line type?
Date: 2000-05-17 19:09:07
Message-ID: 3.0.1.32.20000517150907.009318c0@pop.spectrumtelecorp.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

that worked!!! thanks!

mikeo

At 02:51 PM 5/17/00 -0400, Bruce Momjian wrote:
>I guess you could remove the line type from the pg_type table and see if
>that helps.
>
>> hi,
>> we're looking at migrating from ORACLE to postgres in the
>> very near future and we've run into a small problem. there's
>> a data type defined "LINE". we have named one of our tables
>> as "LINE" also and it would require a great deal of code
>> changes to rename that table. is it possible to simply
>> "turn off" the line type? any help is appreciated.
>>
>> thanks,
>> mikeo
>>
>>
>
>
>--
> Bruce Momjian | http://www.op.net/~candle
> pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
> + If your life is a hard drive, | 830 Blythe Avenue
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Patrick Robin <Patrick(dot)Robin(at)disney(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump return failed sanity check
Date: 2000-05-18 03:18:54
Message-ID: 21336.958619934@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

Patrick Robin <Patrick(dot)Robin(at)disney(dot)com> writes:
> When I try to use pg_dump, I get this error. Can it have something to
> do with a custom type I added. I made sure I added the input/output functions
> and comparision functions for sorting and queries.
> The type works fine in SQL queries in general.

>> pg_dump -s scm
> \connect - d23adm
> failed sanity check, type with oid 457690 was not found

That's probably an indication that you forgot to delete a function that
takes or returns an older custom type that you deleted.

Look in pg_proc for a function containing 457690 in proargtypes or
prorettype, and delete that tuple (or tuples if more than one).

pg_dump oughta be more helpful about where it sees the problem...

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: Richard J Kuhns <rjk(at)grauel(dot)com>, pgsql-general(at)postgresql(dot)org, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Question about databases in alternate locations...
Date: 2000-05-18 23:50:18
Message-ID: Pine.LNX.4.21.0005181422410.349-100000@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

Thomas Lockhart writes:

> Peter E (if I recall right) was proposing some changes to remove the
> environment variable capabilities in Postgres. He also proposed making
> a *list* of allowed locations as an environment variable as a way of
> managing or controlling the allowed locations.

That was an interesting line of thought until the system catalog idea came
up. I believe everyone would agree that keeping things system catalog
controlled is the generally preferred choice. If you create a system
catalog pg_location(locname name, locpath text) then you still have in
fact a list of allowed locations, but one that can be changed while the
server is up, that can be queried, that can easily be joined against
pg_database, etc. Heck, finely grained permissions are the next logical
step.

Table spaces are another point of consideration. Surely you would
eventually want table space administration to be via query language
commands. In essence, the alternative locations are a table space kind of
thingy. The only difference is that the granularity of control stops at
the database level, but that's only a difference of degree, not kind. In
fact, if someone comes around to reworking the logical->physical relation
name mapping then you could add a field pg_class.rellocation and voilà,
there's your table spaces.

So all in all I do like the system catalog driven model much better in
terms of ease of use, functionality, extensibility, everything. And no,
there's no chicken-and-egg problem because the relation name mapping for
shared system relations would presumably not be changed. (How would that
work anyway?)

> Putting all of this stuff in a table is a possibility, but
> 1) Ingres did this, but they had way too many tables involved in
> defining and using tables imho. We should do better.

Well, so far we'd have one table. Is there any reason why we would need
more? Why did they have so many? I don't mind many tables if they give
more functionality.

> 2) If a dbadmin wants to *carefully* move database locations around,
> the environment variables allow this to happen by just shutting down
> the backend, tarring/untarring a disk area, redefining the environment
> variable, and restarting the backend.

1. shut down database
2. move data area
3. connect to template1
4. update pg_location
5. connect to the moved database

That's not very different.

> 3) We don't (yet) have a way to move tables from within Postgres. So
> hardcoding or "hard storing" absolute paths would make it pretty
> difficult to accomplish (2).

I don't know what you mean with "hard storing".

All in all this might be a relatively small job for great immediate and
future benefit.

--
Peter Eisentraut Sernanders väg 10:115
peter_e(at)gmx(dot)net 75262 Uppsala
http://yi.org/peter-e/ Sweden


From: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Richard J Kuhns <rjk(at)grauel(dot)com>, pgsql-general(at)postgresql(dot)org, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Question about databases in alternate locations...
Date: 2000-05-19 13:43:45
Message-ID: 39254511.5447BBA9@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

> > Peter E (if I recall right) was proposing some changes to remove the
> > environment variable capabilities in Postgres. He also proposed making
> > a *list* of allowed locations as an environment variable as a way of
> > managing or controlling the allowed locations.
> That was an interesting line of thought until the system catalog idea came
> up. I believe everyone would agree that keeping things system catalog
> controlled is the generally preferred choice. If you create a system
> catalog pg_location(locname name, locpath text) then you still have in
> fact a list of allowed locations, but one that can be changed while the
> server is up, that can be queried, that can easily be joined against
> pg_database, etc. Heck, finely grained permissions are the next logical
> step.

So pg_location would hold the full path (absolute or logical) to every
file resource in every database? Or would it hold only a list of
allowed paths? Or only a list of resources for each database (~1 row
per database) and then table-specific info would be stored somewhere
local to the database itself?

> Table spaces are another point of consideration. Surely you would
> eventually want table space administration to be via query language
> commands. In essence, the alternative locations are a table space kind of
> thingy. The only difference is that the granularity of control stops at
> the database level, but that's only a difference of degree, not kind. In
> fact, if someone comes around to reworking the logical->physical relation
> name mapping then you could add a field pg_class.rellocation and voilà,
> there's your table spaces.

Yes, this capability will be great.
ALTER TABLE SET LOCATION=...
and/or
ALTER DATABASE SET LOCATION=...
should help administration and scalability.

> So all in all I do like the system catalog driven model much better in
> terms of ease of use, functionality, extensibility, everything. And no,
> there's no chicken-and-egg problem because the relation name mapping for
> shared system relations would presumably not be changed. (How would that
> work anyway?)
> > Putting all of this stuff in a table is a possibility, but
> > 1) Ingres did this, but they had way too many tables involved in
> > defining and using tables imho. We should do better.
> Well, so far we'd have one table. Is there any reason why we would need
> more? Why did they have so many? I don't mind many tables if they give
> more functionality.

I have no idea why they had so many. Probably because it grew
incrementally, or possibly because they normalized their tables to the
theoretically correct point. It was ugly either way (right Bruce?).

> > 2) If a dbadmin wants to *carefully* move database locations around,
> > the environment variables allow this to happen by just shutting down
> > the backend, tarring/untarring a disk area, redefining the environment
> > variable, and restarting the backend.
> 1. shut down database
> 2. move data area
> 3. connect to template1
> 4. update pg_location
> 5. connect to the moved database
> That's not very different.

But hard to do? If pg_location has 5000 entries, and you've scattered
tables all over the place (perhaps a bad decision, but we *should*
have the flexibility to do that) then it might be very error prone
when working with absolute paths imho.

> > 3) We don't (yet) have a way to move tables from within Postgres. So
> > hardcoding or "hard storing" absolute paths would make it pretty
> > difficult to accomplish (2).
> I don't know what you mean with "hard storing".

Putting absolute path names as pointers to tables or data areas. I'm
getting the sense I'm in a minority (in a group of 3? ;) in this
discussion, but imho having some decoupling between logical paths in
the database and actual paths outside is A Good Thing. Always has been
a mark of good design in my experience.

- Thomas

--
Thomas Lockhart lockhart(at)alumni(dot)caltech(dot)edu
South Pasadena, California


From: "Richard J(dot) Kuhns" <rjk(at)grauel(dot)com>
To: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: [HACKERS] Re: Question about databases in alternate locations...
Date: 2000-05-19 16:25:35
Message-ID: 14629.27391.756251.94479@sawmill.grauel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

Thomas Lockhart writes:
> So pg_location would hold the full path (absolute or logical) to every
> file resource in every database? Or would it hold only a list of
> allowed paths? Or only a list of resources for each database (~1 row
> per database) and then table-specific info would be stored somewhere
> local to the database itself?
>
Is a list of allowed paths really necessary? If initlocation has already
been run so a directory tree with the proper structure and permissions
exists there'd be no new security hole (ie, I couldn't ask the backend to
create a database on any arbitrary partition; only one that's already been
prepared by the administrator).

I'd like to see a list of resources per database, with any table-specific
info stored locally.

> ALTER TABLE SET LOCATION=...
> and/or
> ALTER DATABASE SET LOCATION=...
> should help administration and scalability.
>
Definitely. Of course, I'd want to make sure any new LOCATION had been
prepared by the administrator.

> But hard to do? If pg_location has 5000 entries, and you've scattered
> tables all over the place (perhaps a bad decision, but we *should*
> have the flexibility to do that) then it might be very error prone
> when working with absolute paths imho.
>
I'd think that a pg_location entry wouldn't be necessary for the majority
of tables -- the default location would be just like it is now, under the
database directory. Creating a database directory in one place and
scattering the tables all over creation would definitely be a Bad Decision,
IMHO, but it would be doable.

> Putting absolute path names as pointers to tables or data areas. I'm
> getting the sense I'm in a minority (in a group of 3? ;) in this
> discussion, but imho having some decoupling between logical paths in
> the database and actual paths outside is A Good Thing. Always has been
> a mark of good design in my experience.
>
How about requiring an absolute path for the data(base) area, and
allowing relative paths for the tables? Actually, if you want
ALTER DATABASE SET LOCATION=...
to move tables, you'd either have to require relative paths for the
tables or ignore tables that have absolute paths, right?

Hmm. And all I originally wanted was an easier way to create a database in
an alternate location :-).

- Rich

--
Richard Kuhns rjk(at)grauel(dot)com
PO Box 6249 Tel: (765)477-6000 \
100 Sawmill Road x319
Lafayette, IN 47903 (800)489-4891 /


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: Richard J Kuhns <rjk(at)grauel(dot)com>, pgsql-general(at)postgresql(dot)org, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Question about databases in alternate locations...
Date: 2000-05-20 13:35:58
Message-ID: Pine.LNX.4.21.0005200047570.489-100000@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers pgsql-sql

Thomas Lockhart writes:

> So pg_location would hold the full path (absolute or logical) to every
> file resource in every database? Or would it hold only a list of
> allowed paths?

The way I imagined it it would hold data like this:

locname | locpath
----------------+-------------------
alt1 | /mnt/foo/db
joes alt store | /home/joe/storage

When I create a database I would then do CREATE DATABASE "my_db" WITH
LOCATION = "alt1"; which would place the database at
/mnt/foo/db/data/base/my_db. Then if I create another that I want at the
same place I do CREATE DATABASE "another" WITH LOCATION =
"alt1";. pg_database would presumably contain a reference to
pg_location.oid instead of the current datpath attribute. So one could say
I'm really just normalizing pg_database.

In some future life you might be able to do CREATE TABLE xxx (...) WITH
LOCATION = "joes alt store" but then we'd have to think about how to
resolve the path. One idea would be to get rid of per-database
subdirectories and just store all heap files in one directory, but I'm
sure Bruce would hate that. :) But that's another day's story.

So yes, it is a list of allowed locations associated with freely choosable
descriptive names. Environment variables do essentially provide a similar
service but I find this much more administration friendly and
flexible. (E.g., "What sort of stuff is being stored at /var/abc/def?" --
use a query)

> > 1. shut down database
> > 2. move data area
> > 3. connect to template1
> > 4. update pg_location
> > 5. connect to the moved database
> > That's not very different.
>
> But hard to do?

ALTER LOCATION "name" SET PATH TO '/new/path';? (Alternatively, use update
pg_location set locpath='/new/path' where locname='name'.) That isn't any
harder than setting environment variables. It might in fact be easier.

> but imho having some decoupling between logical paths in the database
> and actual paths outside is A Good Thing. Always has been a mark of
> good design in my experience.

Sure, that's exactly what this would provide. locname is the logical name
of the "storage location", locpath is the physical path. It's just a
matter of whether you maintain that information in environment variables
(which might get unset, forgotten, require postmaster shutdown, are
subject to certain rules we don't control) or in the database (which comes
with all the conveniences you might imagine).

--
Peter Eisentraut Sernanders väg 10:115
peter_e(at)gmx(dot)net 75262 Uppsala
http://yi.org/peter-e/ Sweden