How to Write Bayesian Classifiers in Python
Today, I worked on porting the Bayesian classifier from Ruby to Python. Why port between languages when languages don't matter? Because it's a good way to check whether one understands the idea behind the code, which is the more important part of any exercise. It's not a complete port, but what I have right now is pasted after the flip.
import math


class Bayes(object):
    def __init__(self, categories):
        # Map each category name to its own word -> count table.
        self.counts = dict((c, {}) for c in categories)
        self.total_words = 0

    def train(self, category, text):
        # Count every word of the text under the given category.
        table = self.counts.setdefault(category, {})
        words = text.split()
        for word in words:
            table[word] = table.get(word, 0) + 1
        self.total_words += len(words)

    def untrain(self, category, text):
        # Undo an earlier train() call for the same category and text.
        table = self.counts[category]
        for word in text.split():
            if table.get(word, 0) > 0:
                table[word] -= 1
                self.total_words -= 1
                if table[word] == 0:
                    del table[word]

    def classifications(self, text):
        # Score each category as the sum of the log word likelihoods.
        score = dict()
        words = text.split()
        for cat, table in self.counts.items():
            total = sum(table.values()) or 1
            score[cat] = 0.0
            for word in words:
                # Unseen words get a small pseudo-count so log() stays defined.
                s = table.get(word, 0.1)
                score[cat] += math.log(s / float(total))
        return score

    def classify(self, text):
        # Return the name of the highest-scoring category.
        score = self.classifications(text)
        return max(score, key=score.get)

    def categories(self):
        return list(self.counts.keys())
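To check that the idea holds together, here is a minimal, self-contained sketch of the same word-count scoring written as plain functions. The function names, the `unseen` pseudo-count parameter, and the toy training sentences are all made up for illustration; this is not the Ruby library's API, just one way the approach could look.

```python
import math


def train(counts, category, text):
    """Add the words of `text` to the count table for `category`."""
    table = counts.setdefault(category, {})
    for word in text.split():
        table[word] = table.get(word, 0) + 1


def scores(counts, text, unseen=0.1):
    """Return {category: sum of log(word count / words in category)}."""
    result = {}
    for category, table in counts.items():
        total = float(sum(table.values())) or 1.0
        # Words never seen in this category fall back to the `unseen`
        # pseudo-count (an assumed value) so that log() stays defined.
        result[category] = sum(
            math.log(table.get(word, unseen) / total) for word in text.split()
        )
    return result


def classify(counts, text):
    """Pick the category with the highest score."""
    s = scores(counts, text)
    return max(s, key=s.get)


# Toy data, invented for this example.
counts = {}
train(counts, "spam", "buy cheap pills now")
train(counts, "ham", "lunch at noon tomorrow")
print(classify(counts, "cheap pills"))  # prints "spam"
```

Summing logs rather than multiplying raw probabilities keeps the scores from underflowing to zero on longer texts, which is presumably why the original takes `math.log` of each ratio.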
Giving Users a Stake in Prsnl
Jameel and TQ mentioned that they want so many new expert systems in prsnl (currency support, € to US$ conversion, company identification, street addresses, phone numbers, etc.) that I've decided to give up on writing all the expert systems myself. Instead, I've written an interface and will port the existing expert systems to the new platform. In doing so, I'll release the source of the two existing systems as tutorials on how to use the interface to implement your own expert systems for whatever you'd dream of having prsnl match automatically. I'll have a new drop up Monday and look forward to putting up a repository for user-submitted expert systems over the next week, complete with digg-like ranking.
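To give a rough feel for what a user-written expert system might look like, here is a purely hypothetical sketch. The actual prsnl interface is not shown in this post, so `ExpertSystem`, `match`, `CurrencySystem`, and `run_all` are all invented names, and the currency pattern is an assumed format, not the real one.

```python
import re


class ExpertSystem(object):
    """Hypothetical base class: one recogniser for one kind of text."""
    name = "unnamed"

    def match(self, text):
        """Return a list of (start, end, value) tuples found in `text`."""
        raise NotImplementedError


class CurrencySystem(ExpertSystem):
    """Toy recogniser for amounts written like €12.50 or US$3."""
    name = "currency"
    # Assumed format: a currency marker, optional space, then an amount.
    pattern = re.compile(r"(?:€|US\$)\s?\d+(?:\.\d{2})?")

    def match(self, text):
        return [(m.start(), m.end(), m.group())
                for m in self.pattern.finditer(text)]


def run_all(systems, text):
    """Apply every system to the text and group the hits by system name."""
    return {system.name: system.match(text) for system in systems}


hits = run_all([CurrencySystem()], "Lunch was €12.50, coffee US$3")
print([value for _, _, value in hits["currency"]])
```

The point of a small base class like this is that a new recogniser only has to supply a name and a `match` method, so user-submitted systems could be dropped in without touching the rest of the program.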
Prsnl, Now With AI Support 2
prsnl now has artificial intelligence support. For now, it recognises email addresses and web URLs. I'd like to add street address and phone number support as well. Try it out and let me know what you think.
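As an illustration of the kind of recognition involved, here is a minimal sketch using regular expressions. It is not prsnl's actual code: the `recognise` function and both patterns are assumptions made for this example, and the patterns are deliberately loose, so they will miss or mis-match plenty of real-world addresses and URLs.

```python
import re

# Loose, illustrative patterns (assumptions, not prsnl's real rules).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
URL = re.compile(r"https?://[^\s<>\"]+")


def recognise(text):
    """Return the email addresses and URLs found in `text`."""
    return {
        "emails": EMAIL.findall(text),
        "urls": URL.findall(text),
    }


found = recognise(
    "Mail me at jane@example.com or see http://example.com/about for details"
)
print(found["emails"])  # ['jane@example.com']
print(found["urls"])    # ['http://example.com/about']
```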
