Subj : Re: Naive Bayes Algorithm?
To   : comp.programming
From : Richard Heathfield
Date : Thu Jul 21 2005 11:58 pm

Mike wrote:

> Ah, I see. I'm not using, wanting, the algorithm for studying spam/ham.
> I'm playing with data from work fed into a bayes algorithm (that's my
> goal) to see if I can use the bayes theorems to help with analysis
> at work.

How DARE you get creative!?!? :-)

> For the numbers is there a way to distinguish where some event is good
> and has a number of 7.1 versus an event that is bad with a number of 7.2?

I don't quite see what you're getting at. The whole point of the Bayesian 
algorithm is that, at the marking stage, you simply identify "good" and 
"bad", without worrying about the data itself. After "training", the 
algorithm calculates a probability of "goodness" (or "badness", if you 
prefer). Where you draw the line between "good" and "bad" is up to you; you 
will have a continuum of goodness/badness, so a cut-off point is bound to 
be arbitrary.
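
For what it's worth, here is a rough sketch of that train-then-score idea in
Python. It is not taken from any particular library: the names train and
p_good, the add-one smoothing, and the sample data are all my own assumptions
for illustration. The readings are treated purely as strings, so "7.1" and
"7.2" are simply different tokens.

```python
import math
from collections import Counter

def train(samples):
    """samples: iterable of (tokens, label) pairs, label is 'good' or 'bad'."""
    counts = {"good": Counter(), "bad": Counter()}
    docs = Counter()
    for tokens, label in samples:
        docs[label] += 1              # how many examples carry this label
        counts[label].update(tokens)  # how often each token appears under it
    return counts, docs

def p_good(tokens, counts, docs):
    """Return the estimated probability that the tokens describe a 'good' event."""
    vocab = set(counts["good"]) | set(counts["bad"])
    n_docs = sum(docs.values())
    log_score = {}
    for label in ("good", "bad"):
        total = sum(counts[label].values())
        # Smoothed prior (assumption: add-one smoothing throughout).
        score = math.log((docs[label] + 1) / (n_docs + 2))
        for tok in tokens:
            score += math.log((counts[label][tok] + 1) / (total + len(vocab)))
        log_score[label] = score
    # Turn the two log scores into a probability of "good".
    top = max(log_score.values())
    weight = {k: math.exp(v - top) for k, v in log_score.items()}
    return weight["good"] / (weight["good"] + weight["bad"])

# Made-up training data: each event is a list of string tokens plus a mark.
samples = [
    (["7.1", "line-A"], "good"),
    (["7.1", "line-B"], "good"),
    (["7.2", "line-A"], "bad"),
    (["7.2", "line-B"], "bad"),
]
counts, docs = train(samples)
print(p_good(["7.1", "line-A"], counts, docs))
print(p_good(["7.2", "line-A"], counts, docs))
```

The marking stage is just the "good"/"bad" label on each sample; the code
never looks at what the tokens mean.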

You might want to graph out a probability (X) vs frequency (Y) curve (in
effect a histogram of the computed probabilities), and see if there's any
point that jumps out as a cut-off (e.g. if you get a "hockey stick" curve,
cutting somewhere near the "blade" probably makes a lot of sense).
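
A quick way to eyeball that curve without a plotting package is a text
histogram. The sketch below is only an illustration: the function names, the
ten-bin choice, and the scores are all invented, so substitute the
probabilities your own classifier produces.

```python
def histogram(probs, bins=10):
    """Count how many probabilities fall into each equal-width bin on [0, 1]."""
    counts = [0] * bins
    for p in probs:
        # A probability of exactly 1.0 goes into the last bin.
        i = min(int(p * bins), bins - 1)
        counts[i] += 1
    return counts

def print_histogram(counts):
    """Print one row per bin: probability range, a bar of '#', and the count."""
    bins = len(counts)
    for i, c in enumerate(counts):
        lo = i / bins
        hi = (i + 1) / bins
        print(f"{lo:.1f}-{hi:.1f} | {'#' * c} {c}")

# Made-up "goodness" probabilities, purely for demonstration.
scores = [0.02, 0.03, 0.05, 0.08, 0.11, 0.12, 0.31, 0.55, 0.91, 0.97]
print_histogram(histogram(scores))
```

If the rows show a tall cluster at one end and a long thin tail, the place
where the tail begins is a candidate cut-off, though the choice is still
yours to make.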

> Treating the data as only strings does greatly simplify things.

Yes.


-- 
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
mail: rjh at above domain
