Subj : Re: Naive Bayes Algorithm?
To   : comp.programming
From : Richard Heathfield
Date : Thu Jul 21 2005 11:58 pm

Mike wrote:

> Ah, I see. I'm not using, wanting, the algorithm for studying spam/ham.
> I'm playing with data from work fed into a bayes algorithm (that's my
> goal) to see if I can use the bayes theorems to help with analysis
> at work.

How DARE you get creative!?!? :-)

> For the numbers is there a way to distinguish where some event is good
> and has a number of 7.1 versus an event that is bad with a number of 7.2?

I don't quite see what you're getting at. The whole point of the Bayesian
algorithm is that, at the marking stage, you simply identify "good" and
"bad", without worrying about the data itself. After "training", the
algorithm calculates a probability of "goodness" (or "badness", if you
prefer).

Where you draw the line between "good" and "bad" is up to you; you will
have a continuum of goodness/badness, so a cut-off point is bound to be
arbitrary. You might want to graph out a probability (X) vs frequency (Y)
curve, and see if there's any point that jumps out as a cut-off (e.g. if
you get a "hockey stick" curve, cutting somewhere near the "blade"
probably makes a lot of sense).

> Treating the data as only strings does greatly simplify things.

Yes.

-- 
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
mail: rjh at above domain
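
For anyone who wants to try the mark / train / score / graph steps
described above, here is a minimal sketch in Python. It is one possible
reading of the approach, not a reference implementation. Everything in it
is an assumption made for illustration: the function names (train, p_good,
histogram), the made-up "training" records, the add-one smoothing, and the
choice of ten histogram bins. Readings such as 7.1 and 7.2 are treated
purely as strings, as discussed in the thread.

```python
import math
from collections import Counter

def train(samples):
    """Count tokens per label. samples: (tokens, label) pairs, label 'good' or 'bad'."""
    counts = {"good": Counter(), "bad": Counter()}
    docs = Counter()
    for tokens, label in samples:
        counts[label].update(tokens)
        docs[label] += 1
    return counts, docs

def p_good(tokens, counts, docs):
    """Naive Bayes estimate of P(good | tokens), with add-one smoothing (an assumption)."""
    vocab = set(counts["good"]) | set(counts["bad"])
    total_docs = docs["good"] + docs["bad"]
    log_score = {}
    for label in ("good", "bad"):
        n_tokens = sum(counts[label].values())
        # Smoothed prior for the label.
        score = math.log((docs[label] + 1) / (total_docs + 2))
        for tok in tokens:
            # The extra +1 in the denominator reserves a slot for unseen tokens.
            score += math.log((counts[label][tok] + 1) / (n_tokens + len(vocab) + 1))
        log_score[label] = score
    # Normalise the two log scores into a probability.
    top = max(log_score.values())
    e_good = math.exp(log_score["good"] - top)
    e_bad = math.exp(log_score["bad"] - top)
    return e_good / (e_good + e_bad)

def histogram(probs, bins=10):
    """Frequency (Y) of scores per probability bin (X), for eyeballing a cut-off."""
    freq = [0] * bins
    for p in probs:
        freq[min(int(p * bins), bins - 1)] += 1
    return freq

# Hypothetical hand-marked records: each is a list of string tokens plus a label.
training = [
    (["7.1", "line-a"], "good"),
    (["7.1", "line-b"], "good"),
    (["7.2", "line-a"], "bad"),
    (["7.2", "line-b"], "bad"),
]

counts, docs = train(training)
scores = [p_good(tokens, counts, docs) for tokens, _ in training]

# Crude text plot: one row per probability bin, one '#' per record in that bin.
for i, n in enumerate(histogram(scores)):
    print(f"{i / 10:.1f}-{(i + 1) / 10:.1f} {'#' * n}")
```

With real data you would score records that were not used for training,
then look at the printed rows for a gap or a sharp rise (the "blade" of
the hockey stick) before settling on a cut-off.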