Newsgroups: comp.databases
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!sdd.hp.com!elroy.jpl.nasa.gov!decwrl!fernwood!oracle!news
From: cakers@oracle.com (Charles Akers)
Subject: Re: SQL Duplicate Row Deletion ???
Message-ID: <1991Apr9.223500.19272@oracle.com>
Sender: Charles Akers
Reply-To: cakers@oracle.com (Charles Akers)
Organization: Oracle Corporation, Belmont CA
References: <91091.141528SYSPMZT@GECRDVM1.BITNET> <1991Apr1.163615.56@cim-vax.honeywell.com> <670651675.103469@paladin.owego.ny.us>
Date: Tue, 9 Apr 91 22:35:00 GMT

None of the methods described so far in this thread for deleting
duplicate rows from a SQL table does the job in a single SQL statement.
If the table has a key column that contains duplicate key values, the
statement below deletes, for each key value, every row except the one
with the lowest ROWID (ROWID is an Oracle pseudo-column, so this form
is Oracle-specific).

Assume the table is named table_X and the key column is named keycol:

DELETE FROM table_X t_alias
WHERE ROWID >
 (SELECT MIN(ROWID)
  FROM table_X
  WHERE keycol=t_alias.keycol);

This statement deletes the extra rows for each key value even if their
other (non-key) columns differ, so any data unique to those rows is
lost.  That is not a problem if you have simply inserted the same rows
twice and want only one copy of each row.
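
For anyone who wants to try the idea without an Oracle instance, here
is a minimal sketch using Python's built-in sqlite3 module.  SQLite is
only a stand-in: it happens to expose a rowid as well, but the DELETE
is reworded to alias the inner table (t2) rather than the outer one,
and the val column and sample rows are made up for illustration.

```python
import sqlite3

def delete_duplicate_keys(conn):
    """Keep only the lowest-rowid row for each keycol value."""
    conn.execute(
        """
        DELETE FROM table_X
        WHERE rowid > (SELECT MIN(t2.rowid)
                       FROM table_X AS t2
                       WHERE t2.keycol = table_X.keycol)
        """
    )
    conn.commit()

# Hypothetical sample data: keys 1 and 2 each appear twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_X (keycol INTEGER, val TEXT)")
conn.executemany(
    "INSERT INTO table_X (keycol, val) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (1, "a"), (2, "changed"), (3, "c")],
)
conn.commit()

delete_duplicate_keys(conn)
remaining = conn.execute(
    "SELECT keycol, val FROM table_X ORDER BY keycol"
).fetchall()
print(remaining)
```

Note that the row (2, "changed") is removed even though its non-key
value differs from the surviving row for key 2, which is the behavior
described above.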

If no column in the table was unique before the duplicates were
inserted, the following three statements eliminate rows that are
identical in every column (this requires a SQL dialect that supports
CREATE TABLE ... AS SELECT and a RENAME command).

CREATE TABLE tmp_tab AS
 SELECT DISTINCT *
 FROM table_X;
DROP TABLE table_X;
RENAME tmp_tab TO table_X;
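
The same three steps can be sketched against SQLite from Python, again
only as a stand-in for Oracle.  SQLite has no standalone RENAME
command, so the sketch uses ALTER TABLE ... RENAME TO instead; the val
column and sample rows are hypothetical.

```python
import sqlite3

def rebuild_without_duplicates(conn):
    """Replace table_X with a copy holding one of each distinct row."""
    conn.execute("CREATE TABLE tmp_tab AS SELECT DISTINCT * FROM table_X")
    conn.execute("DROP TABLE table_X")
    # SQLite spelling of the RENAME step (assumption: not Oracle syntax).
    conn.execute("ALTER TABLE tmp_tab RENAME TO table_X")
    conn.commit()

# Hypothetical sample data with fully identical duplicate rows.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE table_X (keycol INTEGER, val TEXT)")
db.executemany(
    "INSERT INTO table_X (keycol, val) VALUES (?, ?)",
    [(1, "a"), (1, "a"), (2, "b"), (2, "changed"), (2, "b")],
)
db.commit()

rebuild_without_duplicates(db)
rows = sorted(db.execute("SELECT keycol, val FROM table_X").fetchall())
print(rows)
```

Unlike the single-statement DELETE, this keeps (2, "changed") because
it is not identical to any other row.  Bear in mind that dropping and
recreating a table generally discards whatever was attached to the
original, such as indexes and grants, so those would have to be
recreated afterwards.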

-------------
Charles Akers
-------------
