Re: [SQL] Multiple values for a field

Lists: pgsql-sql
From: "Mike Field" <mike(at)fieldco(dot)com>
To: <pgsql-sql(at)postgreSQL(dot)org>
Subject: Multiple values for a field
Date: 1999-08-13 15:11:00
Message-ID: 008701bee59e$0e951f80$0200a8c0@mike
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Hi-

Does anyone have any ideas on the following?

In my database that categorizes research documents and publications, I have
in some cases more than one author, more than one region, subject, etc. per
document. You mentioned yesterday that the best way to handle that is to
have a table for say, authors like this:

docid | authorid

and have more than one row for a document if that document has more than one
author.

My problem is that I'm pulling the info out of the database on the web, and
I find that I have to use a loop for the fields that have multiliple
values - and since I have several of those fields, many loops starts to bog
down the processor. My objective is to try to get all the data out, for all
documents, in one single query - so that I can make a list of all the
documents and their pertinent data.

Do you see any problem if I put the actual names of the authors, comma
separated, in the author field (for example: Mike Field, Tom Lane, ZZ Top)?
I would do the same for my six fields which may have multiple values
(author, region, subject, related_projects, doc_type, translations) I have
about 550 documents to put into the database.

I know using text fields would take up more space relative to just using a
number to refer to the author... but using a number makes me perform more
loops and thus, more processing time.

Let me know what you think.
Thanks,
Mike


From: Herouth Maoz <herouth(at)oumail(dot)openu(dot)ac(dot)il>
To: "Mike Field" <mike(at)fieldco(dot)com>, <pgsql-sql(at)postgreSQL(dot)org>
Subject: Re: [SQL] Multiple values for a field
Date: 1999-08-15 13:26:31
Message-ID: l03130307b3dc6e665034@[147.233.159.109]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

At 18:11 +0300 on 13/08/1999, Mike Field wrote:

> Do you see any problem if I put the actual names of the authors, comma
> separated, in the author field (for example:  Mike Field, Tom Lane, ZZ Top)?
> I would do the same for my six fields which may have multiple values
> (author, region, subject, related_projects, doc_type, translations) I have
> about 550 documents to put into the database.

Repeating groups are harder to update. What if you found out that you made
a spelling mistake in some region, and you have to go through the entire
table, find it in each of the strings, and replace with the correct
version. What if you wanted to drop an author? You have to select it, fix
and update. Too much work.

> I know using text fields would take up more space relative to just using a
> number to refer to the author... but using a number makes me perform more
> loops and thus, more processing time.

An alternative may be to create an aggregate function that will give you
the comma-separated list you desired upon request:

CREATE AGGREGATE textcat_all
(
basetype = text,
sfunc1 = textcat,
stype1 = text
);

Then you can do something like:

SELECT da.docid, textcat_all( a.Author_name || ',' )
FROM docs_authors da, authors a
WHERE da.authorid = a.id
GROUP BY da.docid;

This will return a comma-separated list of author names for each doc id,
assuming the actual author name is in a separate table. There is an extra
comma at the end, but I don't suppose that's too much of a problem.

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma


From: "Ross J(dot) Reedstrom" <reedstrm(at)wallace(dot)ece(dot)rice(dot)edu>
To: Herouth Maoz <herouth(at)oumail(dot)openu(dot)ac(dot)il>
Cc: Mike Field <mike(at)fieldco(dot)com>, pgsql-sql(at)postgreSQL(dot)org
Subject: Re: [SQL] Multiple values for a field
Date: 1999-08-15 15:36:26
Message-ID: 19990815103626.A19124@wallace.ece.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

On Sun, Aug 15, 1999 at 04:26:31PM +0300, Herouth Maoz wrote:
>
> An alternative may be to create an aggregate function that will give you
> the comma-separated list you desired upon request:
>
> CREATE AGGREGATE textcat_all
> (
> basetype = text,
> sfunc1 = textcat,
> stype1 = text
> );
>
> Then you can do something like:
>
> SELECT da.docid, textcat_all( a.Author_name || ',' )
> FROM docs_authors da, authors a
> WHERE da.authorid = a.id
> GROUP BY da.docid;
>
> This will return a comma-separated list of author names for each doc id,
> assuming the actual author name is in a separate table. There is an extra
> comma at the end, but I don't suppose that's too much of a problem.
>

And can be solved as so:
SELECT da.docid, rtrim(textcat_all( a.Author_name || ',' ), ',') ...

Excellent tip, Herouth. This one goes in the 'keepers' bag.

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm(at)rice(dot)edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005