From serged@ix.netcom.com Thu Apr 29 12:18:34 1999
From: "Serge Dumoulin"
To: "Ric Skinner", "WAPHGIS"
Subject: RE: (GIS-L) Data Hygiene for Geocoding
Date: Thu, 29 Apr 1999 14:14:38 -0500
Message-ID: <000a01be9274$86d1a360$3301a8c0@RESEARCH.safeguardit.com>
In-Reply-To: <3728B0EB.32171AA0@fast.net>

One more thing, in case it wasn't obvious in my last post: I haven't been able to find a product that cleans up MapMarker's output any further, so it does a very good job of cleaning your data before assigning it lat/long values.

-----Original Message-----
From: owner-gis-l@geoint.com [mailto:owner-gis-l@geoint.com] On Behalf Of Ric Skinner
Sent: Thursday, April 29, 1999 14:20
To: GIS-L@geoint.com; WAPHGIS
Subject: (GIS-L) Data Hygiene for Geocoding

We have a very large (1 million records) address file that contains dirty and missing values. The vast majority of the dirty values are misspelled street names, misspelled city names, and incorrect ZIP codes. We envision that data hygiene will correct the dirty data and fill in the missing values.

We propose to develop an MS Access data hygiene application and pass the file through it prior to geocoding. We are considering using data files from USPS ('City State File'; 'Five Digit ZIP Code File'; http://www.usps.gov/ncsc/products/) and Semaphore Corp.'s 'ZIP4' software (http://www.semaphorecorp.com/cgi/zp4.html).

Has anyone used any of these products for data hygiene in preparation for geocoding? What other methods are used for data hygiene of large files?

I will summarize informative responses that may be of interest to others.

--
Ric Skinner
Research Scientist -- GIS
NJ Dept. of Health & Senior Services
Cancer Epidemiology Services
3635 Quakerbridge Rd.
P.O. Box 369
Trenton, NJ 08625-0369
Phone: 609-588-3500
FAX: 609-588-3638
wskinner@fast.net

and

Co-Chair
2nd International Health Geographics Conference (in planning)
1st IHGC website: www.jhsph.edu/ihgc
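
For readers following the thread, the kind of hygiene pass described in the message above (correcting misspelled city names and filling in or flagging ZIP codes against a reference table) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's Access application or the behavior of any product named in the thread: the `REFERENCE` table is a tiny hypothetical stand-in for a real city/ZIP lookup, the function names and the 0.8 similarity cutoff are invented for the example, and the fuzzy matching uses Python's standard `difflib` module.

```python
import difflib

# Hypothetical reference data standing in for a city/ZIP lookup table.
# A real table would be loaded from a reference file, not hard-coded.
REFERENCE = {
    "08625": "TRENTON",
    "08540": "PRINCETON",
    "07102": "NEWARK",
}
CITIES = sorted(set(REFERENCE.values()))


def clean_city(raw, cutoff=0.8):
    """Return the closest reference city name, or None if nothing is close enough."""
    name = " ".join(raw.upper().split())  # normalize case and whitespace
    if name in CITIES:
        return name
    matches = difflib.get_close_matches(name, CITIES, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def clean_record(city, zip_code):
    """Correct the city spelling, then fill in the city or flag the ZIP code."""
    fixed_city = clean_city(city) if city else None
    zip5 = (zip_code or "").strip()[:5]
    if zip5 in REFERENCE:
        # A known ZIP can fill in a missing or unmatched city.
        if fixed_city is None:
            fixed_city = REFERENCE[zip5]
        status = "ok" if REFERENCE[zip5] == fixed_city else "city_zip_mismatch"
    else:
        status = "unknown_zip"
    return {"city": fixed_city, "zip": zip5, "status": status}
```

Records that come back with a status other than "ok" would be set aside for manual review rather than corrected automatically. With a file of about a million records, the cutoff would need tuning against a hand-checked sample before trusting the automatic corrections.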