Well, sed and awk by Dale Dougherty and Arnold Robbins was a nice Christmas present, and I have already used it to help with the sdf data harvest problem! I used sed to fix the harvest and eliminate the repetitions. Jughead now has a late December database built and being served. These new databases are smaller and no longer repeat items, yet serve more info.

data_dec_late = 10.2M, 97,818 selectors
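
For anyone curious what eliminating repetitions with sed can look like, here is a minimal sketch. It is not the actual command used on the harvest; it is the well-known sed idiom for dropping adjacent duplicate lines, and it assumes the repeated entries sit next to each other (sort the file first if they do not). The function name dedupe_adjacent and the sample lines are made up for illustration.

```shell
#!/bin/sh
# dedupe_adjacent: hypothetical helper, not the command actually run on the harvest.
# Prints its input with runs of identical adjacent lines collapsed to a single line.
dedupe_adjacent() {
    # $!N  : unless on the last line, append the next input line to the pattern space
    # /^\(.*\)\n\1$/!P : if the two lines differ, print the first one
    # D    : delete the first line and restart the cycle with what remains
    sed '$!N; /^\(.*\)\n\1$/!P; D' "$@"
}

# Made-up sample input: the repeated lines are collapsed.
printf 'about\nabout\nfaq\nnews\nnews\n' | dedupe_adjacent
# prints:
# about
# faq
# news
```

The same function also takes file names as arguments, since they are passed straight through to sed.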