Re:

Lists: pgsql-general
From: Francisco Reyes <lists(at)natserv(dot)com>
To: pgsql General List <pgsql-general(at)postgresql(dot)org>
Subject:
Date: 2002-07-04 18:06:31
Message-ID: 20020704120156.Q4321-100000@zoraida.natserv.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On the FAQ we have:

CHAR() is best when storing strings that are usually the
same length. VARCHAR() is best when storing variable-length strings,
but you want to limit how long a string can be. TEXT is for strings
of unlimited length, maximum 1 gigabyte. BYTEA is for storing
binary data, particularly values that include NULL bytes.

How much space does "NULL" takes on a varchar.
I have a table where possibly less than 40% of the records will have a
value. I am wondering if I should go with a second table for those that
have the values or if NULL is small enough that it won't be much an
overhead for the possibly 60% instances when the column would be empty.


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Francisco Reyes <lists(at)natserv(dot)com>
Cc: pgsql General List <pgsql-general(at)postgresql(dot)org>
Subject: Re:
Date: 2002-07-04 18:38:31
Message-ID: 200207041838.g64IcVI26416@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Francisco Reyes wrote:
> On the FAQ we have:
>
> CHAR() is best when storing strings that are usually the
> same length. VARCHAR() is best when storing variable-length strings,
> but you want to limit how long a string can be. TEXT is for strings
> of unlimited length, maximum 1 gigabyte. BYTEA is for storing
> binary data, particularly values that include NULL bytes.
>
>
> How much space does "NULL" takes on a varchar.

Zero space. We have a bitmask of NULL columns for every row.

> I have a table where possibly less than 40% of the records will have a
> value. I am wondering if I should go with a second table for those that
> have the values or if NULL is small enough that it won't be much an
> overhead for the possibly 60% instances when the column would be empty.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Francisco Reyes <lists(at)natserv(dot)com>
Cc: pgsql General List <pgsql-general(at)postgresql(dot)org>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Subject: Re:
Date: 2002-07-04 18:45:53
Message-ID: 4773.1025808353@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Francisco Reyes <lists(at)natserv(dot)com> writes:
> How much space does "NULL" takes on a varchar.

This is getting to be a FAQ itself --- Bruce, would you like to
add something along the following lines to the FAQ?

Q. How much space does a NULL take?

A. None; it's not stored. However, if a table row contains any NULLs
then it must include a "nulls bitmap" that shows which columns are NULL.
The bitmap has one bit per table column. The bitmap is omitted if the
row contains no NULLs at all. So, in practice the first NULL you put in
a row costs number-of-columns bits. Any additional NULLs in the same
row are completely free as far as storage space is concerned.

Also, the space effectively occupied by the bitmap depends on alignment
considerations, because the total size of the row header and bitmap
will be rounded up to a MAXALIGN boundary (either 4 or 8 bytes on
most hardware). In 7.3 it's likely that a bitmap for a row of up to
8 columns will be completely free, because the rounded-up header size
will be the same with or without it.

regards, tom lane


From: strange(at)nsk(dot)yi(dot)org
To: Francisco Reyes <lists(at)natserv(dot)com>
Cc: pgsql General List <pgsql-general(at)postgresql(dot)org>
Subject: Re:
Date: 2002-07-04 18:50:16
Message-ID: 20020704195016.A12936@nsk.yi.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Thu, Jul 04, 2002 at 02:06:31PM -0400, Francisco Reyes wrote:
> On the FAQ we have:
>
> CHAR() is best when storing strings that are usually the
> same length. VARCHAR() is best when storing variable-length strings,
> but you want to limit how long a string can be. TEXT is for strings
> of unlimited length, maximum 1 gigabyte. BYTEA is for storing
> binary data, particularly values that include NULL bytes.
>
>
> How much space does "NULL" takes on a varchar.

>From http://developer.postgresql.org/docs/postgres/datatype.html:
The storage requirement for data of these types is 4 bytes plus the
actual string, and in case of character plus the padding.

So at least 4 bytes. Another bit in the table will be used to indicate
that it's a NULL value, AFAIK, but it will exist anyway.

> I have a table where possibly less than 40% of the records will have a
> value. I am wondering if I should go with a second table for those that
> have the values or if NULL is small enough that it won't be much an
> overhead for the possibly 60% instances when the column would be empty.

If your overhead is about space, I don't think you'd win much, as you'd
probably need an indicator of whether the value exists and its place in
the other table anyway.

And you'll lose convenience also.

Regards,
Luciano Rocha


From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: Francisco Reyes <lists(at)natserv(dot)com>
Cc: pgsql General List <pgsql-general(at)postgresql(dot)org>
Subject: Re:
Date: 2002-07-04 18:50:34
Message-ID: l559iuc6manb26ohn53cqu7du2a039ilpc@4ax.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Thu, 4 Jul 2002 14:06:31 -0400 (EDT), Francisco Reyes
<lists(at)natserv(dot)com> wrote:
>How much space does "NULL" takes on a varchar.

Francisco,

the simple answer is: no space.

The more complete answer is: The NULL state of all columns is stored
in a bitmap. If a row does not contain any NULL column, the bitmap is
omitted. So the first NULL in a row causes the bitmap to be created,
more NULLs in the same row are free. In v7.2 the size of the bitmap
is one bit per column (sic!) rounded up to an architecture specific
value, usually 4 or 8 bytes.

>I have a table where possibly less than 40% of the records will have a
>value. I am wondering if I should go with a second table for those that
>have the values or if NULL is small enough that it won't be much an
>overhead for the possibly 60% instances when the column would be empty.

If there are no other reasons for having a second table, don't bother
with it.

Servus
Manfred


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Francisco Reyes <lists(at)natserv(dot)com>, pgsql General List <pgsql-general(at)postgresql(dot)org>
Subject: Re:
Date: 2002-07-04 19:23:36
Message-ID: 200207041923.g64JNaT00892@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Tom Lane wrote:
> Francisco Reyes <lists(at)natserv(dot)com> writes:
> > How much space does "NULL" takes on a varchar.
>
> This is getting to be a FAQ itself --- Bruce, would you like to
> add something along the following lines to the FAQ?
>
> Q. How much space does a NULL take?
>
> A. None; it's not stored. However, if a table row contains any NULLs
> then it must include a "nulls bitmap" that shows which columns are NULL.
> The bitmap has one bit per table column. The bitmap is omitted if the
> row contains no NULLs at all. So, in practice the first NULL you put in
> a row costs number-of-columns bits. Any additional NULLs in the same
> row are completely free as far as storage space is concerned.
>
> Also, the space effectively occupied by the bitmap depends on alignment
> considerations, because the total size of the row header and bitmap
> will be rounded up to a MAXALIGN boundary (either 4 or 8 bytes on
> most hardware). In 7.3 it's likely that a bitmap for a row of up to
> 8 columns will be completely free, because the rounded-up header size
> will be the same with or without it.

I added this to the disk space FAQ item. Let's see how that works:

<P><SMALL>NULL</SMALL>s are stored in bitmaps, so they
use very little space.</P>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Francisco J Reyes <fran(at)natserv(dot)net>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: pgsql General List <pgsql-general(at)postgresql(dot)org>
Subject: Re:
Date: 2002-07-05 16:57:20
Message-ID: 20020705124739.D9157-100000@zoraida.natserv.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Thu, 4 Jul 2002, Bruce Momjian wrote:

> Tom Lane wrote:
> > Q. How much space does a NULL take?
> >
> > A. None; it's not stored. However, if a table row contains any NULLs
> > then it must include a "nulls bitmap" that shows which columns are NULL.
> > The bitmap has one bit per table column. The bitmap is omitted if the
> > row contains no NULLs at all. So, in practice the first NULL you put in
> > a row costs number-of-columns bits. Any additional NULLs in the same
> > row are completely free as far as storage space is concerned.
> >
> > Also, the space effectively occupied by the bitmap depends on alignment
> > considerations, because the total size of the row header and bitmap
> > will be rounded up to a MAXALIGN boundary (either 4 or 8 bytes on
> > most hardware). In 7.3 it's likely that a bitmap for a row of up to
> > 8 columns will be completely free, because the rounded-up header size
> > will be the same with or without it.
>
> I added this to the disk space FAQ item. Let's see how that works:
>
> <P><SMALL>NULL</SMALL>s are stored in bitmaps, so they
> use very little space.</P>

That would work, although we would probably save further questions in the
future if we had the longer explanation on the FAQ or pointed the user to
see the Data Types section of the Docs for more detail.. Then
on the data type section mention (if it already doesn't have it) how the
bitmap is used when there are NULLs.


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Francisco J Reyes <fran(at)natserv(dot)net>
Cc: pgsql General List <pgsql-general(at)postgresql(dot)org>
Subject: Re:
Date: 2002-07-05 18:54:48
Message-ID: 200207051854.g65IsmP01410@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Francisco J Reyes wrote:
> > I added this to the disk space FAQ item. Let's see how that works:
> >
> > <P><SMALL>NULL</SMALL>s are stored in bitmaps, so they
> > use very little space.</P>
>
> That would work, although we would probably save further questions in the
> future if we had the longer explanation on the FAQ or pointed the user to
> see the Data Types section of the Docs for more detail.. Then
> on the data type section mention (if it already doesn't have it) how the
> bitmap is used when there are NULLs.

OK, I will try to get something into the datatypes section about NULL
storage.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Francisco J Reyes <fran(at)natserv(dot)net>, pgsql General List <pgsql-general(at)postgresql(dot)org>
Subject: Re:
Date: 2002-07-06 15:00:07
Message-ID: 22146.1025967607@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
>> That would work, although we would probably save further questions in the
>> future if we had the longer explanation on the FAQ or pointed the user to
>> see the Data Types section of the Docs for more detail.. Then
>> on the data type section mention (if it already doesn't have it) how the
>> bitmap is used when there are NULLs.

> OK, I will try to get something into the datatypes section about NULL
> storage.

I don't think it belongs in Datatypes, but in Performance Tips.
In general, Datatypes is talking about features, not performance or
internal implementation details.

regards, tom lane