Quick Links

Re: insert

Lists:	pgsql-performance

From:	alan <alan(dot)miller3(at)gmail(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	insert
Date:	2011-07-23 16:23:48
Message-ID:	2e208bc6-1bb8-4170-809f-77ab63a15bf4@l18g2000yql.googlegroups.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

next question.

I have a product table with a 'category" column that I want to
maintain in a separate table.

CREATE TABLE products (
product_id INTEGER DEFAULT
nextval('product_id_seq'::regclass) NOT NULL,
name VARCHAR(60) NOT NULL,
category SMALLINT NOT NULL,
CONSTRAINT product_id PRIMARY KEY (product_id)
);
CREATE TABLE products (
category_id INTEGER DEFAULT
nextval('category_id_seq'::regclass) NOT NULL,
name VARCHAR(20) NOT NULL,
CONSTRAINT category_id PRIMARY KEY (category_id)
);

Every product must have a category,
Since many (but not all) products have the same category I only want 1
table with unique categories.

To do the insert into the products table I need to retrieve or insert
the category_id in categories first.
Which means more code on my client app (if ($cat_id =
get_cat_id($cat)) }else { $cat_id = insert_cat($cat)})

Can I write a BEFORE ROW trigger for the products table to runs on
INSERT or UPDATE to
1. insert a new category & return the new category_id OR
2. return the existing category_id for the (to be inserted row)

Alan
I donproducts.category to be a foreign key that points to the uniqie
category_id id in the want to keep I need to do get the cate

From:	alan <alan(dot)miller3(at)gmail(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: insert
Date:	2011-07-25 14:34:45
Message-ID:	a3899672-2ff6-4174-bc31-07f5c4c2a113@l37g2000yqd.googlegroups.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

I think I figured it out myself.
If anyone sees issues with this (simple) approach, please let me know.

I changed my table definitions to this:

CREATE SEQUENCE public.product_id_seq
CREATE TABLE products (
product_id INTEGER DEFAULT nextval('product_id_seq'::regclass) NOT
NULL,
name VARCHAR(60) NOT NULL,
category SMALLINT NOT NULL,
CONSTRAINT product_id PRIMARY KEY (product_id)
);
CREATE SEQUENCE public.category_id_seq
CREATE TABLE category (
category_id INTEGER DEFAULT nextval('category_id_seq'::regclass)
NOT NULL,
name VARCHAR(20) NOT NULL,
CONSTRAINT category_id PRIMARY KEY (category_id)
);
ALTER TABLE products ADD CONSTRAINT category_products_fk
FOREIGN KEY (category)
REFERENCES category (category_id)
ON DELETE NO ACTION ON UPDATE CASCADE
;

Then created this function:

CREATE OR REPLACE FUNCTION getid(_table text,_pk text,_name text)
RETURNS integer AS $$
DECLARE _id integer;
BEGIN
EXECUTE 'SELECT '
|| _pk
|| ' FROM '
|| _table::regclass
|| ' WHERE name'
|| ' = '
|| quote_literal(_name)
INTO _id;

IF _id > 0 THEN
return _id;
ELSE
EXECUTE 'INSERT INTO '
|| _table
|| ' VALUES (DEFAULT,' || quote_literal(_name) || ')'
|| ' RETURNING ' || _pk
INTO _id;
return _id;
END IF;
END;
$$
LANGUAGE 'plpgsql' VOLATILE;

Now I can just insert into the products table via:

INSERT INTO products VALUES(DEFAULT,'Postgresql for
Dummies',getid('category','category_id','books'));

For example:

testdb=# select * from products;
product_id | name | category
------------+------+----------
(0 rows)

iims_test=# select * from category;
category_id | name
-------------+------
(0 rows)

testdb=# insert into products values(DEFAULT,'Postgresql for
Dummies',getid('category','category_id','books'));
INSERT 0 1

testdb=# select * from
category;
category_id | name
-------------+-------
1 | books

testdb=# select * from products;
product_id | name | category
------------+------------------------+----------
1 | Postgresql for Dummies | 1

Updating the category_id in category table are also cascaded to the
product table.

testdb=# UPDATE category SET category_id = 2 WHERE category_id = 1;
UPDATE 1

testdb=# SELECT * FROM products;
product_id | name | category
------------+------------------------+----------
1 | Postgresql for Dummies | 2

Alan

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"alan" <alan(dot)miller3(at)gmail(dot)com>,<pgsql-performance(at)postgresql(dot)org>
Subject:	Re: insert
Date:	2011-07-29 20:14:37
Message-ID:	4E32CE5D020000250003F8FF@gw.wicourts.gov
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

alan <alan(dot)miller3(at)gmail(dot)com> wrote:

> Can I write a BEFORE ROW trigger for the products table to runs
> on INSERT or UPDATE to
> 1. insert a new category & return the new category_id OR
> 2. return the existing category_id for the (to be inserted row)

What would you be using to match an existing category? If this
accurately identifies a category, why not use it for the key to the
category table, rather than generating a synthetic key value?

-Kevin

From:	Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc:	alan <alan(dot)miller3(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: insert
Date:	2011-07-30 05:24:21
Message-ID:	4E339585.9080301@archidevsys.co.nz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

On 30/07/11 08:14, Kevin Grittner wrote:
> alan<alan(dot)miller3(at)gmail(dot)com> wrote:
>
>> Can I write a BEFORE ROW trigger for the products table to runs
>> on INSERT or UPDATE to
>> 1. insert a new category& return the new category_id OR
>> 2. return the existing category_id for the (to be inserted row)
>
> What would you be using to match an existing category? If this
> accurately identifies a category, why not use it for the key to the
> category table, rather than generating a synthetic key value?
>
> -Kevin
>
Hi Alan,

This is the way I would define the tables, I think it conforms tom your
requirements, and the definitions look clearer.

I have the convention that the id of the table itself is not prefixed
with the table name, but references to the id field of other tables are
(e.g. category_id). This is not something you need to follow, but it
helps to clearly identify what is a foreign key, and what is the current
table's id! Likewise, I think it is simpler to make the table names
singular, but this again is a bit arbitrary.

I guess, even if you prefer my conventions, it is more important to
follow the standards of the existing database!

CREATE TABLE product
(
id SERIAL PRIMARY KEY,
category_id int REFERENCES category(id),
name VARCHAR(60) NOT NULL
);

CREATE TABLE category
(
id SERIAL PRIMARY KEY,
name VARCHAR(20) UNIQUE NOT NULL
);

Though for the primary key of the category table, it might be better to
explicitly assign the key, then you have more control of the numbers used.

I would be a bit wary of automatically inserting a new category, when
the given category is not already there, you could end up with several
variations of spelling for the same category! I once saw a system with
about 20 variations of spelling, and number of spaces between words, for
the name of the same company!

Possibly your client GUI application could have a drop down list of
available categories, and provision to enter new ones, but then this
might be outside your control.

Cheers,
GAvin

From:	Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>
To:	alan <alan(dot)miller3(at)gmail(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: insert
Date:	2011-08-01 08:52:17
Message-ID:	4E366941.9070507@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

Hello.

Please note that in multitasking environment you may have problems with
your code. Two connections may check if "a" is available and if not (and
both got empty "select" result), try to insert. One will succeed,
another will fail if you have a unique constraint on category name (and
you'd better have one).

Please note that select for update won't help you much, since this is
new record you are looking for, and select don't return (and lock) it. I
am using "lock table <tableName> in SHARE ROW EXCLUSIVE mode" in this case.

But then, if you have multiple lookup dictinaries, you need to ensure
strict order of locking or you will be getting deadlocks. As for me, I
did create a special application-side class to retrieve such values. If
I can't find a value in main connection with simple select, I open new
connection, perform table lock, check if value is in there. If it is
not, add the value and commit. This may produce orphaned dictionary
entries (if dictionary entry is committed an main transaction is rolled
back), but this is usually OK for dictionaries. At the same time I don't
introduce hard locks into main transaction and don't have to worry about
deadlocks.

Best regards, Vitalii Tymchyshyn

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"alan" <alan(dot)miller3(at)gmail(dot)com>, "Vitalii Tymchyshyn" <tivv00(at)gmail(dot)com>
Cc:	<pgsql-performance(at)postgresql(dot)org>
Subject:	Re: insert
Date:	2011-08-01 14:47:16
Message-ID:	4E367624020000250003F93F@gw.wicourts.gov
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

Vitalii Tymchyshyn <tivv00(at)gmail(dot)com> wrote:

> Please note that in multitasking environment you may have problems
> with your code. Two connections may check if "a" is available and
> if not (and both got empty "select" result), try to insert. One
> will succeed, another will fail if you have a unique constraint on
> category name (and you'd better have one).
>
> Please note that select for update won't help you much, since this
> is new record you are looking for, and select don't return (and
> lock) it. I am using "lock table <tableName> in SHARE ROW
> EXCLUSIVE mode" in this case.
>
> But then, if you have multiple lookup dictinaries, you need to
> ensure strict order of locking or you will be getting deadlocks.
> As for me, I did create a special application-side class to
> retrieve such values. If I can't find a value in main connection
> with simple select, I open new connection, perform table lock,
> check if value is in there. If it is not, add the value and
> commit. This may produce orphaned dictionary entries (if
> dictionary entry is committed an main transaction is rolled back),
> but this is usually OK for dictionaries. At the same time I don't
> introduce hard locks into main transaction and don't have to worry
> about deadlocks.

It sounds like you might want to check out the new "truly
serializable" transactions in version 9.1. If you can download the
latest beta version of it and test with
default_transaction_isolation = 'serializable' I would be interested
to hear your results. Note that you can't have deadlocks, but you
can have other types of serialization failures, so your software
needs to be prepared to start a transaction over from the beginning
when the SQLSTATE of a failure is '40001'.

The Wiki page which was used to document and organize the work is:

http://wiki.postgresql.org/wiki/Serializable

This is in a little bit of a funny state because not all of the
wording that was appropriate while the feature was under development
(e.g., future tense verbs) has been changed to something more
appropriate for a finished feature, but it should cover the
theoretical ground pretty well. An overlapping document which was
initially based on parts of the Wiki page and has received more
recent attention is the README-SSI file here:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/backend/storage/lmgr/README-SSI;hb=master

Some examples geared toward programmers and DBAs is at this Wiki
page:

http://wiki.postgresql.org/wiki/SSI

It could use a couple more examples and a bit of language cleanup,
but what is there is fairly sound. The largest omission is that we
need to show more explicitly that serialization failures can occur
at times other than COMMIT. (I got a little carried away trying to
show that there was no blocking and that the "first committer
wins".)

-Kevin