Thursday, September 27, 2007

 

'Programming Collective Intelligence' in Common Lisp, Chapter 3 - Hierarchical Clusters

I have Common Lisp code for Chapter 3 up to Hierarchical Clusters. It's longer and not that interesting, so I'll just put up a link to the code. Once again, map and reduce are taking the place of list comprehensions, but other than that it's not particularly Lispy, no macrology or anything. Maybe next time.

Last time's tangent was a shortcut to hash syntax. This time I'm not happy with deleting items out of the middle of adjustable arrays. The code puts items on the end of an array and deletes them out of the middle, from a particular position.
In Python it's
del clust[lowestpair[0]]
What I have is

(defun truep (x) (declare (ignore x)) t)
(setf clust (delete-if #'truep clust :start (car lowestpair) :end (1+ (car lowestpair))))


Instead of PCI's use of the Python Imaging Library or some Lisp equivalent, I used cl-pdf, which was easy to use to make the simple graphs, lines and text. This is the full hierarchical cluster, then a zoom view.



Wednesday, September 12, 2007

 

'Programming Collective Intelligence' in Common Lisp, Chapter 2

Like many others, I've been reading Toby Segaran's Programming Collective Intelligence. Toby's examples are in Python. Inspired by loucal's posting of code examples in Ruby, I've decided to put up my own Common Lisp examples. These are from Chapter 2, going up to page 15, "Ranking the Critics".

In order to just have the recommendations in the file, I used assoc lists instead of hashes. One place where Python (and Ruby) has it over Lisp is in hash syntax, just critics[person] instead of (gethash person critics) or (cdr (assoc person critics :test #'equalp)). I made a 'critics' function to keep down the verbosity. Is there a good way to change the syntax for hash lookup?

The other big difference is my use of mapcar and reduce everywhere instead of Python's list comprehensions, with the occaisional intersection thrown in.

If the Python source code gets posted, I may try some benchmarks with later examples.


(defparameter *RECOMMENDATIONS*
'(
("Lisa Rose" . (("Lady in the Water" . 2.5) ("Snakes on a Plane" . 3.5) ("Just My Luck" . 3.0)
("Superman Returns" . 3.5) ("You, Me and Dupree" . 2.5) ("The Night Listener" . 3.0)))
("Gene Seymour" . (("Lady in the Water" . 3.0) ("Snakes on a Plane" . 3.5) ("Just My Luck" . 1.5)
("Superman Returns" . 5.0) ("The Night Listener" . 3.0) ("You, Me and Dupree" . 3.5)))
("Michael Phillips" . (("Lady in the Water" . 2.5) ("Snakes on a Plane" . 3.0)
("Superman Returns" . 3.5) ("The Night Listener" . 4.0)))
("Claudia Puig" . (("Snakes on a Plane" . 3.5) ("Just My Luck" . 3.0) ("The Night Listener" . 4.5)
("Superman Returns" . 4.0) ("You, Me and Dupree" . 2.5)))
("Mick LaSalle" . (("Lady in the Water" . 3.0) ("Snakes on a Plane" . 4.0) ("Just My Luck" . 2.0)
("Superman Returns" . 3.0) ("The Night Listener" . 3.0) ("You, Me and Dupree" . 2.0)))
("Jack Matthews" . (("Lady in the Water" . 3.0) ("Snakes on a Plane" . 4.0) ("The Night Listener" . 3.0)
("Superman Returns" . 5.0) ("You, Me and Dupree" . 3.5)))
("Toby" . (("Snakes on a Plane" . 4.5) ("You, Me and Dupree" . 1.0)
("Superman Returns" . 4.0)))))


(defun critics (reviewer &optional movie)
(labels ((get-movie (ms m)
(cdr (assoc m ms :test #'equalp))))
(let ((movies (cdr (assoc reviewer *RECOMMENDATIONS* :test #'equalp))))
(if movie (get-movie movies movie) movies))))

(defun similar (person1 person2 distance)
(let* ((movies1 (critics person1))
(movies2 (critics person2))
(common-movies (mapcar #'car (intersection movies1 movies2
:test #'(lambda (x y) (equalp (car x) (car y)))))))
(if (null common-movies)
nil
(funcall distance person1 person2 common-movies))))

(defun euclidean-distance (person1 person2 common-movies)
(let* ((sum-of-squares (reduce #'+ (mapcar
#'(lambda (cm)
(expt (- (critics person1 cm) (critics person2 cm)) 2))
common-movies)))
(distance (/ 1 (1+ sum-of-squares))))
distance))

(defun sim-distance (person1 person2)
(similar person1 person2 #'euclidean-distance))


(defun pearson-distance (person1 person2 common-movies)
(let* ((n (length common-movies))
(scores1 (mapcar #'(lambda (x) (critics person1 x)) common-movies))
(scores2 (mapcar #'(lambda (x) (critics person2 x)) common-movies))
(sum1 (reduce #'+ scores1))
(sum2 (reduce #'+ scores2))
(sum1-sq (reduce #'+ (mapcar #'(lambda (x) (* x x)) scores1)))
(sum2-sq (reduce #'+ (mapcar #'(lambda (x) (* x x)) scores2)))
(psum (reduce #'+ (mapcar #'* scores1 scores2)))
(num (- psum (/ (* sum1 sum2) n)))
(den (sqrt (* (- sum1-sq (/ (expt sum1 2) n)) (- sum2-sq (/ (expt sum2 2) n))))))
(if (zerop den) 0 (/ num den))))

(defun sim-pearson (person1 person2)
(similar person1 person2 #'pearson-distance))

(defun top-matches (person &optional (n 5) (similarity #'sim-pearson))
(let* ((scores (mapcar #'(lambda (x) (cons (funcall similarity person x) x))
(remove-if #'(lambda (x) (equalp x person)) (mapcar #'car *RECOMMENDATIONS*))))
(sorted-scores (sort scores #'> :key #'car))
(len (length sorted-scores)))
(if (<= len n)
sorted-scores
(butlast sorted-scores (- len n)))))


This page is powered by Blogger. Isn't yours?