Lists: | pgsql-bugs |
---|
From: | "Boris van Schooten" <schooten(at)cs(dot)utwente(dot)nl> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | BUG #1643: dbf2pg broken + quick fix |
Date: | 2005-05-03 17:44:04 |
Message-ID: | 20050503174404.B3508F0B09@svr2.postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
The following bug has been logged online:
Bug reference: 1643
Logged by: Boris van Schooten
Email address: schooten(at)cs(dot)utwente(dot)nl
PostgreSQL version: 7.4.5, 8.0.2
Operating system: FreeBSD
Description: dbf2pg broken + quick fix
Details:
As several people on the mailing lists have already noted, dbf2pg is broken.
It usually fails to enter the numeric values from the dbf file in the sql
tables. If you enable verbose output (-vv) you see "Illegal numeric value
found" errors.
As it turns out, the integer checking code (contrib/dbase/dbf2pg.c, function
isinteger) is broken. It appears the function reports 0 in case it finds a
space in the dbf rather than a digit. I disabled it (always made it return
1). Now, my dbf files import fine.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Boris van Schooten" <schooten(at)cs(dot)utwente(dot)nl> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #1643: dbf2pg broken + quick fix |
Date: | 2005-05-03 21:15:32 |
Message-ID: | 12065.1115154932@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
"Boris van Schooten" <schooten(at)cs(dot)utwente(dot)nl> writes:
> As it turns out, the integer checking code (contrib/dbase/dbf2pg.c, function
> isinteger) is broken. It appears the function reports 0 in case it finds a
> space in the dbf rather than a digit. I disabled it (always made it return
> 1). Now, my dbf files import fine.
Hmm. I know nothing about dbase ... but if the test has any use at all,
I guess it must be to handle NULL values. How does dbase represent a
NULL? Why is this code only checking this for integer columns?
regards, tom lane
From: | Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: BUG #1643: dbf2pg broken + quick fix |
Date: | 2005-05-03 22:32:40 |
Message-ID: | Pine.SOL.4.33.0505032348030.27583-100000@demeter |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
It only does the integer check for numbers of type integer (when # of
decimals = 0), noninteger numbers are not checked in any way. If
isinteger returns 0, the program does give a warning, suggesting it thinks
it detected an illegal field rather than a null field. Contrary to the
isinteger check on dates, which does not give a warning, though it does
enter a null. On dates the check makes sense because dates are always 8
digits long and have no blank padding like numbers. See the spec.
http://www.dbase.com/KnowledgeBase/int/db7_file_fmt.htm
Don't know anything about nulls in dbf though. I am not a dbase expert, I
just run into dbfs often when trying to enter gis data into postgis.
Kind regards,
Boris van Schooten
On Tue, 3 May 2005, Tom Lane wrote:
> "Boris van Schooten" <schooten(at)cs(dot)utwente(dot)nl> writes:
> > As it turns out, the integer checking code (contrib/dbase/dbf2pg.c, function
> > isinteger) is broken. It appears the function reports 0 in case it finds a
> > space in the dbf rather than a digit. I disabled it (always made it return
> > 1). Now, my dbf files import fine.
>
> Hmm. I know nothing about dbase ... but if the test has any use at all,
> I guess it must be to handle NULL values. How does dbase represent a
> NULL? Why is this code only checking this for integer columns?
>
> regards, tom lane
>
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #1643: dbf2pg broken + quick fix |
Date: | 2005-05-04 02:55:18 |
Message-ID: | 13952.1115175318@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl> writes:
> See the spec.
> http://www.dbase.com/KnowledgeBase/int/db7_file_fmt.htm
Thanks for the link. As far as can be told from this, dbase hasn't got
nulls at all -- is that correct, or are they just omitting a ton of
relevant information?
It looks to me like we should just remove the special case for integer
fields altogether. The special case for date fields is wrong in detail
as well: as coded it will accept "date" fields with a leading sign,
which surely is not intentional.
regards, tom lane
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #1643: dbf2pg broken + quick fix |
Date: | 2005-05-04 17:59:17 |
Message-ID: | 24325.1115229557@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl> writes:
> Don't know anything about nulls in dbf though. I am not a dbase expert, I
> just run into dbfs often when trying to enter gis data into postgis.
I'm considering the following patch, which turns around the test: check
for an empty string and if so believe it's a null, otherwise just insert
the value as-is. I dunno if the check for null is actually meaningful,
but I doubt this will break any cases that worked before. Comments anyone?
regards, tom lane
Index: contrib/dbase/dbf2pg.c
===================================================================
RCS file: /cvsroot/pgsql/contrib/dbase/dbf2pg.c,v
retrieving revision 1.21
diff -c -r1.21 dbf2pg.c
*** contrib/dbase/dbf2pg.c 14 Sep 2004 03:28:28 -0000 1.21
--- contrib/dbase/dbf2pg.c 4 May 2005 17:55:29 -0000
***************
*** 63,93 ****
char *convert_charset(char *string);
#endif
void usage(void);
- unsigned int isinteger(char *);
-
- unsigned int
- isinteger(char *buff)
- {
- char *i = buff;
-
- while (*i != '\0')
- {
- if (i == buff)
- if ((*i == '-') ||
- (*i == '+'))
- {
- i++;
- continue;
- }
- if (!isdigit((unsigned char) *i))
- return 0;
- i++;
- }
- return 1;
- }
-
static inline void
strtoupper(char *string)
{
--- 63,70 ----
***************
*** 471,478 ****
/* handle the date first - liuk */
if (fields[h].db_type == 'D')
{
! if ((strlen(foo) == 8) && isinteger(foo))
{
snprintf(pgdate, 11, "%c%c%c%c-%c%c-%c%c",
foo[0], foo[1], foo[2], foo[3],
foo[4], foo[5], foo[6], foo[7]);
--- 448,462 ----
/* handle the date first - liuk */
if (fields[h].db_type == 'D')
{
! if (strlen(foo) == 0)
{
+ /* assume empty string means a NULL */
+ strcat(query, "\\N");
+ }
+ else if (strlen(foo) == 8 &&
+ strspn(foo, "0123456789") == 8)
+ {
+ /* transform YYYYMMDD to Postgres style */
snprintf(pgdate, 11, "%c%c%c%c-%c%c-%c%c",
foo[0], foo[1], foo[2], foo[3],
foo[4], foo[5], foo[6], foo[7]);
***************
*** 480,505 ****
}
else
{
! /*
! * empty field must be inserted as NULL value in
! * this way
! */
! strcat(query, "\\N");
}
}
! else if ((fields[h].db_type == 'N') &&
! (fields[h].db_dec == 0))
{
! if (isinteger(foo))
! strcat(query, foo);
! else
{
strcat(query, "\\N");
- if (verbose)
- fprintf(stderr, "Illegal numeric value found "
- "in record %d, field \"%s\"\n",
- i, fields[h].db_name);
}
}
else
{
--- 464,482 ----
}
else
{
! /* try to insert it as-is */
! strcat(query, foo);
}
}
! else if (fields[h].db_type == 'N')
{
! if (strlen(foo) == 0)
{
+ /* assume empty string means a NULL */
strcat(query, "\\N");
}
+ else
+ strcat(query, foo);
}
else
{
From: | Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: BUG #1643: dbf2pg broken + quick fix |
Date: | 2005-05-05 09:34:06 |
Message-ID: | Pine.SOL.4.33.0505051123190.16840-100000@demeter |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Looks good to me. I'd prefer to have a warning message (if (verbose)
fprintf stderr) for each of the exceptional conditions though. I'm even
so paranoid I prefer to have the verbose switch on by default.
Kinds regards,
Boris van Schooten
On Wed, 4 May 2005, Tom Lane wrote:
> Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl> writes:
> > Don't know anything about nulls in dbf though. I am not a dbase expert, I
> > just run into dbfs often when trying to enter gis data into postgis.
>
> I'm considering the following patch, which turns around the test: check
> for an empty string and if so believe it's a null, otherwise just insert
> the value as-is. I dunno if the check for null is actually meaningful,
> but I doubt this will break any cases that worked before. Comments anyone?
>
> regards, tom lane
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #1643: dbf2pg broken + quick fix |
Date: | 2005-05-05 13:33:56 |
Message-ID: | 22660.1115300036@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl> writes:
> Looks good to me. I'd prefer to have a warning message (if (verbose)
> fprintf stderr) for each of the exceptional conditions though. I'm even
> so paranoid I prefer to have the verbose switch on by default.
Don't really see the need for it. What we are doing here is trusting to
the backend to error-check the input, rather than making a half-baked
attempt to do error checking in dbf2pg.
regards, tom lane
From: | Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: BUG #1643: dbf2pg broken + quick fix |
Date: | 2005-05-06 08:58:12 |
Message-ID: | Pine.SOL.4.33.0505051536320.25790-100000@demeter |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
I don't see this as half baked error checking, but rather, as error
reporting during a "half baked conversion". I prefer to know when the
converter encounters something it doesn't understand, rather than having
it silently enter nulls into the db.
It was the original error reporting that helped me find this bug, I
suppose.
Kind regards,
Boris van Schooten
On Thu, 5 May 2005, Tom Lane wrote:
> Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl> writes:
> > Looks good to me. I'd prefer to have a warning message (if (verbose)
> > fprintf stderr) for each of the exceptional conditions though. I'm even
> > so paranoid I prefer to have the verbose switch on by default.
>
> Don't really see the need for it. What we are doing here is trusting to
> the backend to error-check the input, rather than making a half-baked
> attempt to do error checking in dbf2pg.
>
> regards, tom lane
>
From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl>, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #1643: dbf2pg broken + quick fix |
Date: | 2005-05-27 15:07:30 |
Message-ID: | 200505271507.j4RF7UV02090@candle.pha.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Tom, where are we on this patch? Seems we need to do something.
Here is the thread:
http://archives.postgresql.org/pgsql-bugs/2005-05/msg00008.php
---------------------------------------------------------------------------
Tom Lane wrote:
> Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl> writes:
> > Don't know anything about nulls in dbf though. I am not a dbase expert, I
> > just run into dbfs often when trying to enter gis data into postgis.
>
> I'm considering the following patch, which turns around the test: check
> for an empty string and if so believe it's a null, otherwise just insert
> the value as-is. I dunno if the check for null is actually meaningful,
> but I doubt this will break any cases that worked before. Comments anyone?
>
> regards, tom lane
>
>
> Index: contrib/dbase/dbf2pg.c
> ===================================================================
> RCS file: /cvsroot/pgsql/contrib/dbase/dbf2pg.c,v
> retrieving revision 1.21
> diff -c -r1.21 dbf2pg.c
> *** contrib/dbase/dbf2pg.c 14 Sep 2004 03:28:28 -0000 1.21
> --- contrib/dbase/dbf2pg.c 4 May 2005 17:55:29 -0000
> ***************
> *** 63,93 ****
> char *convert_charset(char *string);
> #endif
> void usage(void);
> - unsigned int isinteger(char *);
>
>
> -
> - unsigned int
> - isinteger(char *buff)
> - {
> - char *i = buff;
> -
> - while (*i != '\0')
> - {
> - if (i == buff)
> - if ((*i == '-') ||
> - (*i == '+'))
> - {
> - i++;
> - continue;
> - }
> - if (!isdigit((unsigned char) *i))
> - return 0;
> - i++;
> - }
> - return 1;
> - }
> -
> static inline void
> strtoupper(char *string)
> {
> --- 63,70 ----
> ***************
> *** 471,478 ****
> /* handle the date first - liuk */
> if (fields[h].db_type == 'D')
> {
> ! if ((strlen(foo) == 8) && isinteger(foo))
> {
> snprintf(pgdate, 11, "%c%c%c%c-%c%c-%c%c",
> foo[0], foo[1], foo[2], foo[3],
> foo[4], foo[5], foo[6], foo[7]);
> --- 448,462 ----
> /* handle the date first - liuk */
> if (fields[h].db_type == 'D')
> {
> ! if (strlen(foo) == 0)
> {
> + /* assume empty string means a NULL */
> + strcat(query, "\\N");
> + }
> + else if (strlen(foo) == 8 &&
> + strspn(foo, "0123456789") == 8)
> + {
> + /* transform YYYYMMDD to Postgres style */
> snprintf(pgdate, 11, "%c%c%c%c-%c%c-%c%c",
> foo[0], foo[1], foo[2], foo[3],
> foo[4], foo[5], foo[6], foo[7]);
> ***************
> *** 480,505 ****
> }
> else
> {
> ! /*
> ! * empty field must be inserted as NULL value in
> ! * this way
> ! */
> ! strcat(query, "\\N");
> }
> }
> ! else if ((fields[h].db_type == 'N') &&
> ! (fields[h].db_dec == 0))
> {
> ! if (isinteger(foo))
> ! strcat(query, foo);
> ! else
> {
> strcat(query, "\\N");
> - if (verbose)
> - fprintf(stderr, "Illegal numeric value found "
> - "in record %d, field \"%s\"\n",
> - i, fields[h].db_name);
> }
> }
> else
> {
> --- 464,482 ----
> }
> else
> {
> ! /* try to insert it as-is */
> ! strcat(query, foo);
> }
> }
> ! else if (fields[h].db_type == 'N')
> {
> ! if (strlen(foo) == 0)
> {
> + /* assume empty string means a NULL */
> strcat(query, "\\N");
> }
> + else
> + strcat(query, foo);
> }
> else
> {
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
Cc: | Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl>, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #1643: dbf2pg broken + quick fix |
Date: | 2005-05-27 15:15:06 |
Message-ID: | 14951.1117206906@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Tom, where are we on this patch? Seems we need to do something.
It's in my to-do pile. I was debating whether it's OK to commit to
8.0 branch, or should I just put it in HEAD.
regards, tom lane
From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Boris van Schooten <schooten(at)cs(dot)utwente(dot)nl>, pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #1643: dbf2pg broken + quick fix |
Date: | 2005-05-27 15:17:33 |
Message-ID: | 200505271517.j4RFHX604219@candle.pha.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Tom, where are we on this patch? Seems we need to do something.
>
> It's in my to-do pile. I was debating whether it's OK to commit to
> 8.0 branch, or should I just put it in HEAD.
I am thinking just HEAD. If someone really needs it we can point them
to CVS HEAD and they can test.
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073