Newsgroups: comp.databases
Path: utzoo!utgpu!news-server.csri.toronto.edu!torsqnt!hybrid!scifi!paladin!beal
From: beal@paladin.owego.ny.us (Alan Beal)
Subject: Re: SQL Duplicate Row Deletion ???
Message-ID: <670651675.103469@paladin.owego.ny.us>
Organization: The Design Committee
References: <91091.141528SYSPMZT@GECRDVM1.BITNET> <1991Apr1.163615.56@cim-vax.honeywell.com>
Distribution: na
Date: Wed, 3 Apr 1991 04:07:53 GMT
Lines: 44

tdoyle@cim-vax.honeywell.com writes:

>In article <91091.141528SYSPMZT@GECRDVM1.BITNET>, SYSPMZT@gecrdvm1.crd.ge.com writes:
>> I've made a nice mistake loading data twice over several days into a table, and
>> would like to delete just one of the duplicate rows.
>> 
>> Anyone have a nifty solution to this problem?  The database manager is DB2, but
>> I'd think that any SQL based language would have the same problem.

I often use QMF for this; see below.

>Select identified duplicates into a temporary table (SELECT UNIQUE INTO TEMP).

How about in QMF:
   SELECT DISTINCT * FROM TABLE
Then:
   SAVE DATA AS TEMP
TEMP should now contain the unique rows.  Note that QMF creates a table
called TEMP in the tablespace specified by the SPACE parameter in your
QMF profile.  Typically, QMF's default tablespace is used by everyone,
may contain many tables, and may be in serious need of a reorganization.

>Then delete from the original where the record is the same as in TEMP.

Or, using my example, delete all the records in the original table with:
   DELETE FROM TABLE
You can save time by using the LOAD utility with REPLACE and dummy input
instead.  Of course, be careful if you have multiple tables in the
tablespace, since LOAD REPLACE empties every table in it.  The DELETE
will run faster if the tablespace is segmented.

>This will rid ALL instances of the duplicates (including the original).
>Then add records from TEMP into ORIG to restore one copy.

Then reinsert all the data from TEMP:
   INSERT INTO TABLE
   SELECT * FROM TEMP
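
The three steps above (save the distinct rows, empty the table, reinsert)
can be tried end to end in a small script.  What follows is only a sketch:
it uses Python's sqlite3 module as a stand-in for DB2 and QMF, and the
names dedupe, dedupe_tmp, and orders are made up for illustration.  The
table name is pasted straight into the SQL, so it must come from a trusted
source.

```python
import sqlite3


def dedupe(conn, table):
    """Rebuild `table` so that each distinct row appears exactly once.

    `table` is interpolated into the SQL text, so pass trusted names only.
    """
    # "with conn" commits on success and rolls back if any step fails.
    with conn:
        # Step 1: keep one copy of each row in a scratch table
        # (the QMF equivalent is SELECT DISTINCT plus SAVE DATA AS TEMP).
        conn.execute(
            f"CREATE TEMP TABLE dedupe_tmp AS SELECT DISTINCT * FROM {table}"
        )
        # Step 2: empty the original table.
        conn.execute(f"DELETE FROM {table}")
        # Step 3: put one copy of every row back.
        conn.execute(f"INSERT INTO {table} SELECT * FROM dedupe_tmp")
        conn.execute("DROP TABLE dedupe_tmp")


# Hypothetical sample data: the same batch loaded twice, plus one extra row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, item TEXT)")
rows = [(1, "bolt"), (2, "nut"), (1, "bolt"), (2, "nut"), (3, "gear")]
conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
conn.commit()

dedupe(conn, "orders")
result = sorted(conn.execute("SELECT id, item FROM orders"))
print(result)
```

Doing the delete and the reinsert as one unit of work means a failure
partway through leaves the original rows in place, which is the same
concern behind taking a backup first.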

Of course, it would be wise to back up your table before doing this.  And
finally, I assume you realize that a unique index would have prevented
this problem in the first place.
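
To illustrate the unique index point, here is a small sketch, again with
Python's sqlite3 module standing in for DB2.  The table orders, the index
orders_ux, and the choice of (id, item) as the identifying columns are
assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, item TEXT)")
# A unique index over the columns that identify a row.
conn.execute("CREATE UNIQUE INDEX orders_ux ON orders (id, item)")

batch = [(1, "bolt"), (2, "nut")]
conn.executemany("INSERT INTO orders VALUES (?, ?)", batch)
conn.commit()

try:
    # Accidental second load of the same batch.
    conn.executemany("INSERT INTO orders VALUES (?, ?)", batch)
    conn.commit()
    second_load_rejected = False
except sqlite3.IntegrityError:
    # The index refuses the duplicate rows; undo the failed load.
    conn.rollback()
    second_load_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(second_load_rejected, row_count)
```

With the index in place the second load fails instead of silently
doubling the table, so there is nothing to clean up afterwards.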
-- 
Alan Beal
Internet: beal@paladin.Owego.NY.US
USENET:   {uunet,uunet!bywater!scifi}!paladin!beal
