- Newsgroups: comp.ai
- Path: sparky!uunet!charon.amdahl.com!pacbell.com!sgiblab!spool.mu.edu!news.nd.edu!mentor.cc.purdue.edu!pop.stat.purdue.edu!hrubin
- From: hrubin@pop.stat.purdue.edu (Herman Rubin)
- Subject: Re: Looking for Good Intro to Bayesian Classifiers
- Message-ID: <BxvGts.FLr@mentor.cc.purdue.edu>
- Sender: news@mentor.cc.purdue.edu (USENET News)
- Organization: Purdue University Statistics Department
- References: <19293@ucdavis.ucdavis.edu>
- Distribution: usa
- Date: Tue, 17 Nov 1992 18:03:27 GMT
- Lines: 44
-
- In article <19293@ucdavis.ucdavis.edu> f175003@wilma.ucdavis.edu writes:
- >I am looking for a book or paper with a good introduction to using
- >Bayesian Classifiers preferably with a few examples showing how to
- >determine the probabilities. I understand the idea behind a Bayesian
- >Classifier, but I have looked at a couple of examples of Bayesian
- >Classifiers and they appeared to use maximum likelihood estimators to
- >determine the values for the various probabilities. According to my
- >statistics book, this method provides excellent estimates if the sample
- >size is large, but in the cases I hope to be dealing with I will only
- >have a few samples and want to get as much information from them as I can.
-
- >There has also been some discussion here about whether
- >there exist some values for the probabilities of the classes and the
- >probabilities of the various attributes given the class, which tells
- >you as much as possible from the given samples. The methods I have
- >seen require a prior distribution and show you how to determine the
- >posterior distribution after seeing each example. One then chooses a value
- >from this distribution using their favorite function, such as mean, median,
- >or mode, or equivalently by using some loss function. It seems that there
- >is no value for the probabilities which will give you as much
- >information as possible about the samples without choosing some
- >type of loss function. Is this correct, and is there some way to
- >prove whether it is or is not correct?
-
- To do Bayesian statistics "correctly" requires that one set up a prior
- measure on the collection of states of nature and a loss function, and
- then use the procedure which minimizes the expected loss. Now it is
- a physical impossibility to do this accurately. Fortunately, there are
- robustness results which point out that certain approximations may do a
- good job in certain situations. Using maximum likelihood estimators to
- approximate parameters may or may not be robust.
-
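As a minimal sketch of the prior-plus-loss recipe just described (an illustration, not something from the original post), consider a single Bernoulli probability. Everything specific here is an assumption made for the example: the uniform Beta(1, 1) prior, the hypothetical sample of three successes in three trials, the grid approximation, and the helper names `posterior_grid` and `bayes_estimate`. The code compares the maximum likelihood estimate with the estimates that minimize posterior expected loss under two different loss functions.

```python
import math


def posterior_grid(successes, failures, a=1.0, b=1.0, n=401):
    """Grid approximation to the Beta(a + successes, b + failures) posterior.

    The Beta(a, b) prior is an illustrative assumption, not a recommendation.
    Returns (thetas, weights) with weights summing to 1.
    """
    # Midpoints of n equal cells, so 0 < theta < 1 and the logs are finite.
    thetas = [(i + 0.5) / n for i in range(n)]
    alpha, beta = a + successes, b + failures
    log_w = [(alpha - 1) * math.log(t) + (beta - 1) * math.log(1 - t)
             for t in thetas]
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return thetas, [v / total for v in w]


def bayes_estimate(thetas, weights, loss):
    """Return the grid point that minimizes posterior expected loss."""
    def expected_loss(guess):
        return sum(w * loss(guess, t) for t, w in zip(thetas, weights))
    return min(thetas, key=expected_loss)


successes, failures = 3, 0                    # hypothetical small sample
mle = successes / (successes + failures)      # maximum likelihood estimate
thetas, weights = posterior_grid(successes, failures)

est_squared = bayes_estimate(thetas, weights, lambda g, t: (g - t) ** 2)
est_absolute = bayes_estimate(thetas, weights, lambda g, t: abs(g - t))

print("MLE:", mle)
print("squared-error loss:", round(est_squared, 3))
print("absolute-error loss:", round(est_absolute, 3))
```

Under these assumptions the maximum likelihood estimate is 1.0, while squared-error loss picks a value near the posterior mean (about 0.80) and absolute-error loss picks a value near the posterior median (about 0.84). The same posterior yields different point estimates under different losses, and a different prior would shift all of them.
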
- With a sufficiently large sample size, one can always do a better job.
- But even the largest sample is no better than knowing the distribution itself.
-
- The prior measure and the loss function come from the user, and there is
- no dictum which can be given as to how to set these up. Now there are
- packages on the market which claim to do this for the user with little
- input; they should be used only with the greatest of suspicion.
- --
- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
- Phone: (317)494-6054
- hrubin@snap.stat.purdue.edu (Internet, bitnet)
- {purdue,pur-ee}!snap.stat!hrubin (UUCP)
-