From: changym@email.uc.edu
Date: Thu, 08 Jul 1999 17:32:28 -0400
To: waphgis@u.washington.edu
Subject: multiple entries per person

Dear listers,

My research has run into a challenging issue: "multiple entries per person". In the health data I am working with, a considerable proportion of patients requested health services more than once for the same disease or symptom (verified by ICD-9 code). Many of them moved from location A to location B, and possibly to C. Geographic features are important parameters to be tested, and I will later need to geocode the addresses using GIS, so this becomes a problem.

For patients who did not change address, randomly choosing one visit per patient is technically easy to do in SAS. I am struggling, however, to find the best strategy for patients with multiple visits and relocations. What would be the more acceptable methods for handling this situation without introducing skewed weighting (or missing the point) in the later modeling? Or should I keep every entry for each patient when the model is tested?

Your input would be greatly appreciated. Thanks in advance.

Yu-mei Chang