How to "unique-ify" HUGE table?



Hi everyone!
I have a very large 2-column table (about 500M records) from which I want to
remove duplicate records.

I have tried many approaches, but they all take forever.

The table has just two short TEXT columns.  It is a temporary table
generated from a query:

CREATE TEMP TABLE huge_table AS SELECT x, y FROM ... ;

Initially I tried

CREATE TEMP TABLE huge_table AS SELECT DISTINCT x, y FROM ... ;

but after waiting for nearly an hour I aborted the query, and repeated it
after getting rid of the DISTINCT clause.

Everything takes forever with this monster!  It's uncanny.  Even printing it
out to a file takes forever, let alone creating an index for it.

Any words of wisdom on how to speed this up would be appreciated.

TIA!

Kynn
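
One avenue that might be worth testing, sketched here under assumptions rather than offered as a known fix: raise work_mem for the session and express the de-duplication as a GROUP BY over both columns, which returns the same rows as SELECT DISTINCT but may be planned differently depending on the PostgreSQL version. The name source_table is a hypothetical stand-in for the elided FROM clause above, huge_table_unique is an illustrative name, and the work_mem value is a guess that has to fit the machine's RAM.

```sql
-- Hypothetical names: source_table stands in for the elided FROM clause,
-- huge_table_unique is only an illustrative target name.

-- Give this session more memory for sorts and hashes.
-- '1GB' is an assumed value; pick one that fits the available RAM.
SET work_mem = '1GB';

-- Look at the chosen plan first, without executing the query.
EXPLAIN
SELECT x, y
FROM source_table
GROUP BY x, y;

-- Grouping by every selected column yields one row per distinct (x, y) pair,
-- the same result set as SELECT DISTINCT x, y.
CREATE TEMP TABLE huge_table_unique AS
SELECT x, y
FROM source_table
GROUP BY x, y;
```

Comparing the EXPLAIN output of the GROUP BY form against the DISTINCT form shows whether the planner treats them differently on a given installation; if both plans are the same, this rewrite alone is unlikely to help.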

