Subj : Re: Naive Bayes Algorithm?
To   : comp.programming
From : moi
Date : Fri Jul 22 2005 02:21 am

Richard Heathfield wrote:
> Mike wrote:
>
>
>>Ah, I see. I'm not using, wanting, the algorithm for studying spam/ham.
>>I'm playing with data from work fed into a bayes algorithm (that's my
>>goal) to see if I can use the bayes theorems to help with analysis
>>at work.
>
>
> How DARE you get creative!?!? :-)
>
:-)
>
>>For the numbers is there a way to distinguish where some event is good
>>and has a number of 7.1 versus an event that is bad with a number of 7.2?
>
>
> I don't quite see what you're getting at. The whole point of the Bayesian
> algorithm is that, at the marking stage, you simply identify "good" and
> "bad", without worrying about the data itself. After "training", the
> algorithm calculates a probability of "goodness" (or "badness", if you
> prefer). Where you draw the line between "good" and "bad" is up to you; you
> will have a continuum of goodness/badness, so a cut-off point is bound to
> be arbitrary.
>
> You might want to graph out a probability (X) vs frequency (Y) curve, and
> see if there's any point that jumps out as a cut-off (e.g. if you get a
> "hockey stick" curve, cutting somewhere near the "blade" probably makes a
> lot of sense).

The sensitivity-vs-specificity trade-off (a.k.a. "type I vs type II
errors") is usually referred to as receiver operating characteristic
(ROC) curve analysis. It just means you'll have to quantify the amount of
shit (false negatives OR false positives) you are willing to take.
But: shit will happen, anyway.

HTH,
AvK

BTW: excellent example!
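
To make the cut-off trade-off concrete, here is a minimal sketch of the threshold sweep behind a ROC curve. It assumes you already have, per event, a predicted probability of "bad" plus the known good/bad label. The function name roc_points and the sample numbers are made up for illustration; nothing here comes from the posters' actual data.

```python
def roc_points(scores, labels):
    """Return (threshold, false_positive_rate, true_positive_rate) tuples.

    scores: predicted probability that each event is "bad" (the positive class).
    labels: True where the event really was bad, False where it was good.
    An event is flagged as bad when its score is >= the threshold.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one good and one bad event")

    points = []
    # Try every observed score as a candidate cut-off, strictest first.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
        points.append((threshold, fp / negatives, tp / positives))
    return points


# Hypothetical classifier output, invented for this example.
scores = [0.95, 0.80, 0.70, 0.40, 0.30, 0.10]
labels = [True, True, False, True, False, False]

for threshold, fpr, tpr in roc_points(scores, labels):
    print(f"cut-off {threshold:.2f}: "
          f"false-positive rate {fpr:.2f}, true-positive rate {tpr:.2f}")
```

Lowering the cut-off catches more of the bad events but also flags more good ones. Pick the row whose mix of false alarms and misses you can live with; there is no row with neither unless the two groups separate perfectly.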