Subj : Re: Naive Bayes Algorithm?
To   : comp.programming
From : moi
Date : Fri Jul 22 2005 02:21 am

Richard Heathfield wrote:
> Mike wrote:
> 
> 
>>Ah, I see. I'm not using, wanting, the algorithm for studying spam/ham.
>>I'm playing with data from work fed into a bayes algorithm (that's my
>>goal) to see if I can use the bayes theorems to help with analysis
>>at work.
> 
> 
> How DARE you get creative!?!? :-)
> 

:-)

> 
>>For the numbers is there a way to distinguish where some event is good
>>and has a number of 7.1 versus an event that is bad with a number of 7.2?
> 
> 
> I don't quite see what you're getting at. The whole point of the Bayesian 
> algorithm is that, at the marking stage, you simply identify "good" and 
> "bad", without worrying about the data itself. After "training", the 
> algorithm calculates a probability of "goodness" (or "badness", if you 
> prefer). Where you draw the line between "good" and "bad" is up to you; you 
> will have a continuum of goodness/badness, so a cut-off point is bound to 
> be arbitrary.
> 
> You might want to graph out a probability (X) vs frequency (Y) curve, and 
> see if there's any point that jumps out as a cut-off (e.g. if you get a 
> "hockey stick" curve, cutting somewhere near the "blade" probably makes a 
> lot of sense).
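
A rough sketch of that graph, in case it helps: bucket the probabilities the classifier produced and look at how many events land in each bucket. The numbers below are made up, and histogram() is just a hypothetical helper, not part of any Bayes library.

```python
def histogram(probabilities, bins=10):
    """Count how many probabilities fall into each of `bins` equal-width buckets."""
    counts = [0] * bins
    for p in probabilities:
        # p == 1.0 would index one past the end, so clamp it into the last bucket.
        index = min(int(p * bins), bins - 1)
        counts[index] += 1
    return counts


# Made-up "badness" probabilities, standing in for real classifier output.
probs = [0.02, 0.05, 0.08, 0.11, 0.15, 0.52, 0.91, 0.94, 0.97, 0.99]

# Crude text plot: probability bucket on the left, frequency as a bar.
for i, count in enumerate(histogram(probs)):
    print(f"{i / 10:.1f}-{(i + 1) / 10:.1f} {'#' * count}")
```

If the bars pile up at both ends with a near-empty stretch in between, any cut-off inside that stretch gives about the same answer.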

The sensitivity vs. specificity thingy (type I vs. type II errors) is
usually handled with what is called
Receiver Operating Characteristic curve (ROC curve) analysis.
Moving the cut-off trades one kind of error for the other.
It just means you'll have to quantify the amount of shit
(false negatives or false positives) you are willing to take.
But: shit will happen, anyway.
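
To make the trade-off concrete, here is a minimal sketch that sweeps a cut-off over some scores and reports sensitivity and specificity at each one. The scores and labels are invented for illustration, and confusion_at() and roc_points() are hypothetical names of my own.

```python
def confusion_at(scores, labels, threshold):
    """Return (tp, fp, tn, fn) when score >= threshold is called 'bad'."""
    tp = fp = tn = fn = 0
    for score, is_bad in zip(scores, labels):
        predicted_bad = score >= threshold
        if predicted_bad and is_bad:
            tp += 1          # bad event, correctly flagged
        elif predicted_bad:
            fp += 1          # good event, wrongly flagged
        elif is_bad:
            fn += 1          # bad event, missed
        else:
            tn += 1          # good event, correctly passed
    return tp, fp, tn, fn


def roc_points(scores, labels, thresholds):
    """Return (threshold, sensitivity, specificity) for each threshold."""
    points = []
    for t in thresholds:
        tp, fp, tn, fn = confusion_at(scores, labels, t)
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        points.append((t, sensitivity, specificity))
    return points


# Invented data: probability of "bad" per event, and whether it really was bad.
scores = [0.05, 0.20, 0.35, 0.60, 0.55, 0.80, 0.90, 0.95]
labels = [False, False, False, False, True, True, True, True]

for t, sens, spec in roc_points(scores, labels, [0.3, 0.5, 0.7]):
    print(f"cut-off {t:.1f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

With this toy data, raising the cut-off from 0.5 to 0.7 gets rid of the one false positive but lets one bad event slip through. Which of the two hurts more is for you to decide.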

HTH,
AvK

BTW: excellent example!
