From: johnf <jfabiani(at)yolo(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Does a connection support multiple transactions.
Date: 2006-03-27 19:28:52
Message-ID: 200603271128.52075.jfabiani@yolo.com
Lists: pgsql-novice

Hi,

SUSE 10.0, using the Free Pascal Compiler (FPC), connection version 2.1.1.

I have opened a connection to a database (FPC uses the postgres C interface).
When I attempt to make two queries using the same connection I get an error
that suggests I have to close the first query (it actually sends a 'begin').
The question is NOT how Pascal works, but whether Postgres will accept multiple
queries via one connection or whether queries have to be serial. Will postgres
allow two queries to overlap? I think postgres will allow nested
'begins' (not sure about that). I'm not sure I'm asking the question
correctly. But in the Windows world I only open one connection (normally via
ODBC) and can make several queries, one right after the other and sometimes
in the same query, e.g. 'select * from customers; select * from contacts'.
Each of the queries is live and I'm able to update if required. I have to
save the data with a commit, but I can still have the two queries available.

I'm sure this question is not very clear but I hope some guru will figure it
out.

Thanks

John


From: Srinivas Iyyer <srini_iyyer_bio(at)yahoo(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Time consuming process ...3 million records please help
Date: 2006-03-28 00:59:50
Message-ID: 20060328005950.9895.qmail@web38102.mail.mud.yahoo.com
Lists: pgsql-novice

Dear group,
I have two tables and a temp table into which I uploaded
data using \copy. This temp table has 3,348,107
lines of data.

I wrote a plpgsql function to read each record in
temp_table, take the first column's data (temp_refseq_id),
get the corresponding 'seq_id' from table B, and insert
it into table A.

I started this process 8 hrs ago. It has been running
for the last 8 hrs and is still not finished. The reason
I used this temp table was to speed up the process by
writing a server-side function, but I still have not
won the time battle. If this remains a problem I will
have to look for another db system, since my files from
now on are huge and have over a million records. I am
losing a lot of time in this case.

Could anyone help me write a faster function?

thanks
looking forward to hearing from people.

Temp_table:

temp_refseq_id | temp_imageid | temp_genbankacc
----------------+----------------+-----------------
NM_003604 | IMAGE:1099538 | AA594716
NM_003604 | IMAGE:853047 | AA668250
NM_001008860 | IMAGE:3640970 | BC011775
NM_001008860 | IMAGE:3640970 | BE737509
NM_001008860 | IMAGE:6040319 | BU079001
NM_001008860 | IMAGE:6040319 | BU078725
NM_001008860 | IMAGE:3451448 | BC000957
NM_001008860 | IMAGE:3451448 | BE539334
NM_001008860 | IMAGE:4794135 | BG708105
NM_001008860 | IMAGE:5214087 | BI911674

Table A: (I want to upload data from the temp table here)

 spota_id | seq_id | spota_imageid | spota_genbacc
----------+--------+---------------+---------------
       23 |  54525 | IMAGE:1099538 | AA594716

Table B: This is the seqdump table; seq_id is its key
(referenced as a FK from Table A)

 seq_id | seq_acc      | seq_name
--------+--------------+----------
  54519 | NM_152918    | EMR2
  54520 | NM_001008860 | CGGBP1
  54521 | NM_020040    | TUBB4Q
  54522 | NM_017525    | CDC42BPG
  54523 | NM_032158    | WBSCR20C
  54524 | NM_004047    | ATP6V0B
  54525 | NM_003604    | PLCB3

Function:

CREATE FUNCTION tab_update() RETURNS integer AS '
DECLARE
    referrer_keys RECORD;
BEGIN
    FOR referrer_keys IN SELECT * FROM temp_spotanno LOOP
        INSERT INTO spotanno (seq_id, spotanno_imageid, spotanno_genbankacc)
        VALUES ((SELECT seq_id FROM seqdump
                 WHERE seq_acc = referrer_keys.temp_refseq_id),
                referrer_keys.temp_imageid,
                referrer_keys.temp_genbankacc);
    END LOOP;
    RETURN 0;
END;
' LANGUAGE plpgsql;

Thanks
Sri

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com


From: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
To: Srinivas Iyyer <srini_iyyer_bio(at)yahoo(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Time consuming process ...3 million records please help
Date: 2006-03-28 01:35:34
Message-ID: 442892E6.7010408@mail.nih.gov
Lists: pgsql-novice

Srinivas Iyyer wrote:
> Dear group,
> I have two tables and a temp table where I uploaded
> data using \copy. This temp table has "3,348,107"
> lines of data.
>
> <snip>

Hi, Sri.

I would do this differently; I don't think you need the
function. I would do something like this (untested; note the
temp table is temp_spotanno and the target is spotanno):

insert into spotanno (seq_id, spotanno_imageid, spotanno_genbankacc)
select seqdump.seq_id, temp_spotanno.temp_imageid,
       temp_spotanno.temp_genbankacc
from temp_spotanno, seqdump
where temp_spotanno.temp_refseq_id = seqdump.seq_acc;

Make sure there is an index on seq_acc and temp_refseq_id. This should
be done in seconds to minutes, not hours.

Sean


From: Srinivas Iyyer <srini_iyyer_bio(at)yahoo(dot)com>
To: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Time consuming process ...3 million records please help
Date: 2006-03-28 02:17:16
Message-ID: 20060328021716.44814.qmail@web38105.mail.mud.yahoo.com
Lists: pgsql-novice

Hi Sean,
Thanks for your reply. In my temp table, the
temp_refseq_id column cannot be indexed due to
duplicate values. Will that affect things drastically?
I haven't tested it yet. On top of that, I killed the
process that had been running for 8 hrs on the
tab_update() function.
:-)

thanks
sri

> Hi, Sri.
>
> <snip>

-

<snip>



From: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
To: Srinivas Iyyer <srini_iyyer_bio(at)yahoo(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Time consuming process ...3 million records please help
Date: 2006-03-28 02:31:07
Message-ID: 44289FEB.207@mail.nih.gov
Lists: pgsql-novice

Srinivas Iyyer wrote:
> Hi Sean,
> thanks for your reply. in my temp table,
> temp_refseq_id column cannot be indexed due to
> duplicate values. will that effect drastically.
> I havent tested it yet. on top of that i killed 8 hrs
> process running previously on tab_update() function.
> :-)

Sri,

Columns with duplicate values can be indexed; only if you declare an
index to be "unique" will you then have a problem with inserting
duplicates.

You just need to do:

1. Create your temp table structure.
2. Copy in your data.
3. Create an index for all fields that show up in a "where" or join
clause (THIS IS NECESSARY FOR ANY RDBMS TO WORK EFFICIENTLY!!!)

create index temp_refseq_id_index on TABLENAME(temp_refseq_id);

4. Vacuum analyze the table(s) after copying and creating the index
(thereby telling postgres what is in the table)

vacuum analyze TABLENAME;

5. Do something along the lines of:

insert into ....
select from ....

That should do it, approximately. If you still have problems, then you
will need to be more specific about what the problems are, I think.
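
Putting steps 1-5 together, a rough sketch (untested; table and column
names taken from earlier in the thread, the data filename is made up):

```sql
-- 1-2. Create the temp table and bulk-load it (inside psql):
CREATE TABLE temp_spotanno (
    temp_refseq_id   varchar,
    temp_imageid     varchar,
    temp_genbankacc  varchar
);
\copy temp_spotanno FROM 'data.txt'

-- 3. Index the columns used in the join; duplicate values are fine
--    as long as the index is not declared UNIQUE.
CREATE INDEX temp_refseq_id_index ON temp_spotanno (temp_refseq_id);
CREATE INDEX seq_acc_index ON seqdump (seq_acc);

-- 4. Update the planner statistics.
VACUUM ANALYZE temp_spotanno;

-- 5. One set-based INSERT ... SELECT instead of a row-by-row loop.
INSERT INTO spotanno (seq_id, spotanno_imageid, spotanno_genbankacc)
SELECT s.seq_id, t.temp_imageid, t.temp_genbankacc
FROM temp_spotanno t
JOIN seqdump s ON s.seq_acc = t.temp_refseq_id;
```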

As an aside, I have similar data stored in a postgres database, in
some places on the order of several tens of millions of records;
postgres is not the limiting factor here.

Sean

> thanks
> sri
>
> <snip>


From: Srinivas Iyyer <srini_iyyer_bio(at)yahoo(dot)com>
To: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Non-Unique items
Date: 2006-03-28 22:12:02
Message-ID: 20060328221202.62952.qmail@web38113.mail.mud.yahoo.com
Lists: pgsql-novice

Hi Sean and group,
thank you for your help. It worked.
However, I guess I stepped on a land mine of unique
and non-unique items.

Here is the problem:
(Example data) I have table A:

 seq_id | seq_name
--------+----------
    123 | ITAM3
    234 | ITAR

Table B:

spot_id seq_id image_name
--------------------------------------------
849343 123 IMAGE: 12335
1348238 234 IMAGE: 12335

Table C:

exp_id | spot_id | spot_value
-------|-----------|-----------

Data to insert into Table C
IMAGE: 12335 98.03344

Here is the query log:

arraydb=# SELECT spotanno_id from spotanno
arraydb-# where spotanno_imageid = 'IMAGE:755402';
spotanno_id
-------------
849343
1348238
(2 rows)

arraydb=# select * from spotanno where spotanno_id =
849343;
spotanno_id | seq_id | spotanno_imageid
-------------+--------+------------------
849343 | 75343 | IMAGE:755402
(1 row)

arraydb=# select * from spotanno where spotanno_id =
1348238;
spotanno_id | seq_id | spotanno_imageid
-------------+--------+------------------
1348238 | 50475 | IMAGE:755402
(1 row)

arraydb=# select * from seqdump where seq_id = 50475;
seq_id | seq_acc | seq_name
--------+-----------+----------
50475 | NM_005501 | ITGA3
(1 row)

arraydb=# select * from seqdump where seq_id = 75343;
seq_id | seq_acc | seq_name
--------+-----------+----------
75343 | NM_002204 | ITGA3
(1 row)

An instance of a row of the data file to be
uploaded:

IMAGE:755402 0.299781845119261
12.3638881597060

The question:
When I have a non-unique item (viz. IMAGE:755402),
what approach should one generally take?

Do you have any suggestions or a solution? Please help me.

Thanks again.

-sri

> <snip>



From: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
To: Srinivas Iyyer <srini_iyyer_bio(at)yahoo(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Non-Unique items
Date: 2006-03-29 01:54:46
Message-ID: 4429E8E6.1090508@mail.nih.gov
Lists: pgsql-novice

Srinivas Iyyer wrote:
> Hi Sean and group,
> thank you for your help. It worked.
> However, I guess I stepped on a land mine of unique
> and non-unique items.
>
> <snip>

Sri,

Unfortunately, the biological data you are working with has
one-to-many and many-to-many relationships. While one would like to
believe that such relationships should not exist, they do.
Therefore, you need to store the data in a manner that respects those
manifold relationships. In other words, store the measurement data in
a table keyed by whatever is unique (in this case, it looks like an
IMAGE ID) and store the annotation separately, allowing a one-to-many
relationship between IMAGE ID and gene. There is no way around this;
trying to eliminate these "non-unique" situations won't work in this
particular case. Instead, you have to understand where the data are
coming from and design your database to match, not the other way
around.
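
A rough sketch of what I mean (untested, with made-up table and
column names; seqdump is the annotation table from earlier in the
thread):

```sql
-- One row per measured spot, keyed by the IMAGE clone ID.
CREATE TABLE image_measurement (
    image_id    varchar PRIMARY KEY,   -- e.g. 'IMAGE:755402'
    spot_value  double precision
);

-- Annotation kept separately: one IMAGE ID may map to many genes,
-- so this link table allows multiple rows per image_id.
CREATE TABLE image_gene (
    image_id  varchar REFERENCES image_measurement (image_id),
    seq_id    integer REFERENCES seqdump (seq_id)
);
```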

Sean


From: "Daniel T(dot) Staal" <DStaal(at)usa(dot)net>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Non-Unique items
Date: 2006-03-29 14:17:03
Message-ID: 56990.63.172.115.138.1143641823.squirrel@MageHandbook.com
Lists: pgsql-novice

On Tue, March 28, 2006 8:54 pm, Sean Davis said:

> Unfortunately, the biological data that you are working with has
> one-to-many and many-to-many relationships. While one would like to
> believe that there should not be such relationships, there are.
> Therefore, you need to store the data in a manner that respects those
> manifold relationships. In other words, store the data in a table with
> whatever is the primary key (in this case, it looks like an IMAGE ID)
> and store the annotation separately, allowing for a one-to-many
> relationship between IMAGE ID and gene. There is no way around this and
> to try to eliminate these "non-unique" situations in this particular
> case won't be possible; instead, you have to understand where the data
> are coming from and design your database to match, not the other way
> around.

Summary: It is easier to design databases than to design data. ;)

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author. Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes. This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------


From: george young <gry(at)ll(dot)mit(dot)edu>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Does a connection support multiple transactions.
Date: 2006-03-30 15:09:12
Message-ID: 20060330100912.277aee4f.gry@ll.mit.edu
Lists: pgsql-novice

On Mon, 27 Mar 2006 11:28:52 -0800
johnf <jfabiani(at)yolo(dot)com> threw this fish to the penguins:

> Suse 10.0 using the Free Pascal Compiler connection version 2.1.1.
>
> <snip>

A transaction starts with a "begin" statement and concludes with
either a "commit" or a "rollback". Postgres does not (yet)
support nested transactions, so you must end the current transaction
before starting a new one. Within a transaction you may certainly
execute many selects, updates, inserts, deletes, etc.

I don't know what the Pascal DB API looks like, so it's hard to discern
exactly what your problem is. Could you post some specific code,
explain what your intent is, and say what doesn't work? Also, please
always mention your postgres version and platform to help people help
you.

I seem to recall encountering an API that did not allow multiple ";"-
separated statements in one call, but any API must allow consecutive
calls within a transaction.
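
For example, all of the following is legal on a single connection
(assuming tables like the customers/contacts ones you mentioned;
column names here are made up):

```sql
BEGIN;
SELECT * FROM customers;    -- first query
SELECT * FROM contacts;     -- second query, same transaction
UPDATE customers SET name = 'ACME' WHERE id = 1;
COMMIT;                     -- end this transaction before starting another

BEGIN;                      -- a new transaction on the same connection
SELECT * FROM customers;
COMMIT;
```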

-- George Young

> <snip>

--
"Are the gods not just?" "Oh no, child.
What would become of us if they were?" (CSL)