Lists: | pgsql-sql |
---|
From: | Oliver Vecernik <vecernik(at)aon(dot)at> |
---|---|
To: | pgsql-sql <pgsql-sql(at)postgresql(dot)org> |
Subject: | CSV import |
Date: | 2003-01-28 14:46:07 |
Message-ID: | 3E3697AF.2030509@aon.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
Hi again!
After investigating a little bit further my CSV import couldn't work
because of following reasons:
1. CSV files are delimited with CR/LF
2. text fields are surrounded by double quotes
Is there a direct way to import such files into PostgreSQL?
I would like to have something like MySQL provides:
LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name.txt'
[REPLACE | IGNORE]
INTO TABLE tbl_name
[FIELDS
[TERMINATED BY '\t']
[[OPTIONALLY] ENCLOSED BY '']
[ESCAPED BY '\\' ]
]
[LINES TERMINATED BY '\n']
[IGNORE number LINES]
[(col_name,...)]
Has anybody written such a function already?
Regards,
Oliver
--
VECERNIK Datenerfassungssysteme
A-2560 Hernstein, Hofkogelgasse 17
Tel.: +43 2633 47530, Fax: DW 50
http://members.aon.at/vecernik
From: | Achilleus Mantzios <achill(at)matrix(dot)gatewaynet(dot)com> |
---|---|
To: | Oliver Vecernik <vecernik(at)aon(dot)at> |
Cc: | pgsql-sql <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: CSV import |
Date: | 2003-01-28 19:01:23 |
Message-ID: | Pine.LNX.4.44.0301281700010.15743-100000@matrix.gatewaynet.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
On Tue, 28 Jan 2003, Oliver Vecernik wrote:
> Hi again!
>
> After investigating a little bit further my CSV import couldn't work
> because of following reasons:
>
> 1. CSV files are delimited with CR/LF
See below
> 2. text fields are surrounded by double quotes
in vi
:1,$ s/"//g
>
> Is there a direct way to import such files into PostgreSQL?
>
> I would like to have something like MySQL provides:
>
> LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name.txt'
> [REPLACE | IGNORE]
> INTO TABLE tbl_name
> [FIELDS
> [TERMINATED BY '\t']
> [[OPTIONALLY] ENCLOSED BY '']
> [ESCAPED BY '\\' ]
> ]
> [LINES TERMINATED BY '\n']
make it
[LINES TERMINATED BY '\r\n']
> [IGNORE number LINES]
> [(col_name,...)]
>
> Has anybody written such a function already?
>
> Regards,
> Oliver
>
> --
> VECERNIK Datenerfassungssysteme
> A-2560 Hernstein, Hofkogelgasse 17
> Tel.: +43 2633 47530, Fax: DW 50
> http://members.aon.at/vecernik
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>
==================================================================
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel: +30-10-8981112
fax: +30-10-8981877
email: achill(at)matrix(dot)gatewaynet(dot)com
mantzios(at)softlab(dot)ece(dot)ntua(dot)gr
From: | Guy Fraser <guy(at)incentre(dot)net> |
---|---|
To: | pgsql-sql <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: CSV import |
Date: | 2003-01-28 22:13:41 |
Message-ID: | 3E370095.3020200@incentre.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
Hi
You will need two text utilities {dos2unix and sed} to do this in the simplest
way. They are fairly standard text utilities and are probably already on your
machine.
This is how I would do it :
sed "s/\"//g" file_name.txt \
| dos2unix \
| pgsql -c "COPY table_name FROM STDIN USING DELIMITERS ',';" db
Where "file_name.txt" is the csv file you want to import and "table_name" is
the previously created table you want to insert the data into and db is the
database name.
How this works is "sed" {stream editor} removes all the double quote
characters '"' then pipes the output through "dos2unix" which converts all the
CRLF {DOS EOL} sequences into CR {UNIX EOL} characters, then pipes the data to
"pgsql" with a command that does a bulk insert into the table of the database
you have selected.
Guy
Oliver Vecernik wrote:
> Hi again!
>
> After investigating a little bit further my CSV import couldn't work
> because of following reasons:
>
> 1. CSV files are delimited with CR/LF
> 2. text fields are surrounded by double quotes
>
> Is there a direct way to import such files into PostgreSQL?
>
> I would like to have something like MySQL provides:
>
> LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name.txt'
> [REPLACE | IGNORE]
> INTO TABLE tbl_name
> [FIELDS
> [TERMINATED BY '\t']
> [[OPTIONALLY] ENCLOSED BY '']
> [ESCAPED BY '\\' ]
> ]
> [LINES TERMINATED BY '\n']
> [IGNORE number LINES]
> [(col_name,...)]
>
> Has anybody written such a function already?
>
> Regards,
> Oliver
>
From: | Jeff Eckermann <jeff_eckermann(at)yahoo(dot)com> |
---|---|
To: | Oliver Vecernik <vecernik(at)aon(dot)at>, pgsql-sql <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: CSV import |
Date: | 2003-01-28 22:45:33 |
Message-ID: | 20030128224533.71751.qmail@web20809.mail.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
--- Oliver Vecernik <vecernik(at)aon(dot)at> wrote:
> Is there a direct way to import such files into
> PostgreSQL?
>
As I believe others have replied: no, not yet.
If you are absolutely sure that your data will _never_
contain commas, then the simple solution of just
deleting all of the quotes , then using COPY with
comma delimiters, will work. Otherwise, parsing CSV
files gets just too complicated, and you are better
off using an existing solution (like a Perl module) to
preprocess your data.
__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com
From: | Jean-Luc Lachance <jllachan(at)nsd(dot)ca> |
---|---|
To: | Guy Fraser <guy(at)incentre(dot)net> |
Cc: | pgsql-sql <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: CSV import |
Date: | 2003-01-29 01:50:33 |
Message-ID: | 3E373369.52C862DE@nsd.ca |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
You can acheive the same result with:
tr -d '"\015' < file_name.txt | psql {etc...}
Unix EOL is LF not CR.
Guy Fraser wrote:
>
> Hi
>
> You will need two text utilities {dos2unix and sed} to do this in the simplest
> way. They are fairly standard text utilities and are probably already on your
> machine.
>
> This is how I would do it :
>
> sed "s/\"//g" file_name.txt \
> | dos2unix \
> | pgsql -c "COPY table_name FROM STDIN USING DELIMITERS ',';" db
>
> Where "file_name.txt" is the csv file you want to import and "table_name" is
> the previously created table you want to insert the data into and db is the
> database name.
>
> How this works is "sed" {stream editor} removes all the double quote
> characters '"' then pipes the output through "dos2unix" which converts all the
> CRLF {DOS EOL} sequences into CR {UNIX EOL} characters, then pipes the data to
> "pgsql" with a command that does a bulk insert into the table of the database
> you have selected.
>
> Guy
>
> Oliver Vecernik wrote:
> > Hi again!
> >
> > After investigating a little bit further my CSV import couldn't work
> > because of following reasons:
> >
> > 1. CSV files are delimited with CR/LF
> > 2. text fields are surrounded by double quotes
> >
> > Is there a direct way to import such files into PostgreSQL?
> >
> > I would like to have something like MySQL provides:
> >
> > LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name.txt'
> > [REPLACE | IGNORE]
> > INTO TABLE tbl_name
> > [FIELDS
> > [TERMINATED BY '\t']
> > [[OPTIONALLY] ENCLOSED BY '']
> > [ESCAPED BY '\\' ]
> > ]
> > [LINES TERMINATED BY '\n']
> > [IGNORE number LINES]
> > [(col_name,...)]
> >
> > Has anybody written such a function already?
> >
> > Regards,
> > Oliver
> >
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
From: | Oliver Vecernik <vecernik(at)aon(dot)at> |
---|---|
To: | pgsql-sql <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: CSV import |
Date: | 2003-01-29 05:50:42 |
Message-ID: | 3E376BB2.4010302@aon.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
Oliver Vecernik schrieb:
> Hi again!
>
> After investigating a little bit further my CSV import couldn't work
> because of following reasons:
>
> 1. CSV files are delimited with CR/LF
> 2. text fields are surrounded by double quotes
>
> Is there a direct way to import such files into PostgreSQL?
The answer seems to be no. But after googeling a bit a found a wonderful
Python module called csv at:
http://www.object-craft.com.au/projects/csv/
A minimal script called 'csv2tab.py' for conversion to a tab delimited
file could be:
#!/usr/bin/env python
import csv
import sys
def convert(file):
try:
f = open(file, 'r')
lines = f.readlines()
p = csv.parser()
for line in lines:
print '\t'.join(p.parse(line))
except:
print 'Error opening file!'
if __name__ == '__main__':
convert(sys.argv[1]);
Regards,
Oliver
--
VECERNIK Datenerfassungssysteme
A-2560 Hernstein, Hofkogelgasse 17
Tel.: +43 2633 47530, Fax: DW 50
http://members.aon.at/vecernik
From: | Gary Stainburn <gary(dot)stainburn(at)ringways(dot)co(dot)uk> |
---|---|
To: | Oliver Vecernik <vecernik(at)aon(dot)at>, pgsql-sql <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: CSV import |
Date: | 2003-01-29 10:04:48 |
Message-ID: | 200301291004.48919.gary.stainburn@ringways.co.uk |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
On Wednesday 29 January 2003 5:50 am, Oliver Vecernik wrote:
> Oliver Vecernik schrieb:
> > Hi again!
> >
> > After investigating a little bit further my CSV import couldn't work
> > because of following reasons:
> >
> > 1. CSV files are delimited with CR/LF
> > 2. text fields are surrounded by double quotes
> >
> > Is there a direct way to import such files into PostgreSQL?
Here's a simple command that will take
"hello","world","splat","diddle"
"he said "hello world" to ","his mate"
and convert it to the following tab delimited file that can be COPYed using
psql. It even handles quotes inside fields. (^m and ^i are done by typing
CTRL+V CTRL+M and CTRL+V CTRL+I)
hello world splat diddle
he said "hello world" to his mate
sed 's/^"//' <t.txt|sed 's/"^m$//'|sed 's/","/^i/g'>t1.txt
Gary
>
> The answer seems to be no. But after googeling a bit a found a wonderful
> Python module called csv at:
>
> http://www.object-craft.com.au/projects/csv/
>
> A minimal script called 'csv2tab.py' for conversion to a tab delimited
> file could be:
>
> #!/usr/bin/env python
>
> import csv
> import sys
>
> def convert(file):
> try:
> f = open(file, 'r')
> lines = f.readlines()
> p = csv.parser()
> for line in lines:
> print '\t'.join(p.parse(line))
> except:
> print 'Error opening file!'
>
> if __name__ == '__main__':
> convert(sys.argv[1]);
>
> Regards,
> Oliver
--
Gary Stainburn
This email does not contain private or confidential material as it
may be snooped on by interested government parties for unknown
and undisclosed purposes - Regulation of Investigatory Powers Act, 2000
From: | "Chad Thompson" <chad(at)weblinkservices(dot)com> |
---|---|
To: | |
Cc: | "pgsql-sql" <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: CSV import |
Date: | 2003-01-30 20:48:36 |
Message-ID: | 013601c2c8a0$f6ca2280$32021aac@chad |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
>
> Unix EOL is LF not CR.
>
>
Is this the only difference between a dos and unix text file?
Thanks
Chad
From: | Oliver Vecernik <vecernik(at)aon(dot)at> |
---|---|
To: | Chad Thompson <chad(at)weblinkservices(dot)com> |
Cc: | pgsql-sql <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: CSV import |
Date: | 2003-01-30 22:31:30 |
Message-ID: | 3E39A7C2.4050703@aon.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
Chad Thompson schrieb:
>
>
>>Unix EOL is LF not CR.
>>
>>
>>
>>
>
>Is this the only difference between a dos and unix text file?
>
Yes, but to be more precise:
dos: CR + LF
unix: LF
mac: CR
Oliver
--
VECERNIK Datenerfassungssysteme
A-2560 Hernstein, Hofkogelgasse 17
Tel.: +43 2633 47530, Fax: DW 50
http://members.aon.at/vecernik
From: | Jean-Luc Lachance <jllachan(at)nsd(dot)ca> |
---|---|
To: | Chad Thompson <chad(at)weblinkservices(dot)com> |
Cc: | pgsql-sql <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: CSV import |
Date: | 2003-01-31 15:27:47 |
Message-ID: | 3E3A95F3.842C9691@nsd.ca |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
In DOS and Windows, text lines end with <CR><LF>.
In Unix, text lines end with <LF> only.
hex dec oct
<CR>=CTRL-M or 0x0D or 13 or 015
<LF>=CTRL-J or 0x0A or 10 or 012
Chad Thompson wrote:
>
> >
> > Unix EOL is LF not CR.
> >
> >
>
> Is this the only difference between a dos and unix text file?
>
> Thanks
> Chad
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
From: | Guy Fraser <guy(at)incentre(dot)net> |
---|---|
To: | pgsql-sql <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: CSV import |
Date: | 2003-02-01 01:32:08 |
Message-ID: | 3E3B2398.1010101@incentre.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
FYI
In text files on a Mac. the EOL character is a <CR> only.
What a messy thing this whole EOL cruft is.
To convert between these text formats on linux is easy if you have dos2unix.
The dos2unix on linux can perform many format conversions to and from unix,dos
and mac formats.
On BSD you need dos2unix to convert from dos to unix and unix2dos to convert
from unix to dos. You probably need to get the GNU version of dos2unix or
mac2unix to convert to or from mac formatted text.
Guy
Jean-Luc Lachance wrote:
> In DOS and Windows, text lines end with <CR><LF>.
> In Unix, text lines end with <LF> only.
>
> hex dec oct
> <CR>=CTRL-M or 0x0D or 13 or 015
> <LF>=CTRL-J or 0x0A or 10 or 012
>
>
>
> Chad Thompson wrote:
>
>>>Unix EOL is LF not CR.
>>>
>>>
>>
>>Is this the only difference between a dos and unix text file?
>>
>>Thanks
>>Chad
---%<...snip...