Sunday, September 04, 2005
Fuzzy Queries on top of CLSQL
I originally planned to incorporate Fuzzy SQL more into CLSQL, but I don't think it's necessary for my purposes. With CLSQL's def-view-class extension of CLOS I can use generic methods, and it will look like the rest of the class. Later in the book I'm using, Fuzzy Modeling and Genetic Algorithms for Data Mining and Exploration there's a chapter on Genetic Tuning of Fuzzy Models, which I think will work better with external functions and methods as opposed to MOP or macro-based solutions inside CLSQL. Besides, I couldn't even get db-reader to work with my classes, so I'll just leave well enough alone.
This will be based on my previous big and tall entry. My data setup is
and I have a corresponding CLSQL view-class
The key is each class will have a qcix (query compatibility index) method that will be the overall membership function, doing some combination of individual membership functions of individual columns in the database. For our "big and tall" prospects we'll have a tall method for membership based on height, and a heavy method for membership based on weight. These functions will return a value from 0.0 to 1.0, with a linear scale from the 0 cutoff to the 1 cutoff. For overall membership we'll average the two other membership function.
So far, so good. Now we want to select only some records that have a qcix over a particular threshold. We can use CLSQL's addition to loop:
Oops! Perhaps another reason not to delve into the innards of CLSQL. Plan B:
So, starting with a CLSQL def-view-class we can externally add generic methods to support fuzzy queries.
This will be based on my previous big and tall entry. My data setup is
rrc=# select * from btprospects;
name | height | weight | age
----------+--------+--------+-----
Saunders | 74 | 215 | 52
Cassey | 73 | 188 | 40
Miller | 71 | 157 | 25
Freeman | 70 | 202 | 34
OMalley | 65 | 163 | 48
Jackson | 63 | 170 | 38
and I have a corresponding CLSQL view-class
(clsql:def-view-class btprospects ()
((name :db-kind :key
:type (string 20)
:initarg :name
:accessor name)
(height :type integer
:initarg :height
:accessor height)
(weight :type integer
:initarg :weight
:accessor weight)
(age :type integer
:initarg :age
:accessor age)))
The key is each class will have a qcix (query compatibility index) method that will be the overall membership function, doing some combination of individual membership functions of individual columns in the database. For our "big and tall" prospects we'll have a tall method for membership based on height, and a heavy method for membership based on weight. These functions will return a value from 0.0 to 1.0, with a linear scale from the 0 cutoff to the 1 cutoff. For overall membership we'll average the two other membership function.
(defmethod tall ((prospect btprospects))
(let ((raw-height (slot-value prospect 'height)))
(cond ((<= raw-height 54) 0.0)
((>= raw-height 72) 1.0)
(t (float (/ (- raw-height 54) (- 72 54)))))))
(defmethod heavy ((prospect btprospects))
(let ((raw-weight (slot-value prospect 'weight)))
(cond ((<= raw-weight 182) 0.0)
((>= raw-weight 220) 1.0)
(t (float (/ (- raw-weight 182) (- 220 182)))))))
(defmethod qcix ((prospect btprospects))
(float (/ (+ (tall prospect) (heavy prospect)) 2)))
CL-USER> (clsql:map-query 'list #'tall [clsql:select 'btprospects])
(1.0 1.0 0.9444444 0.8888889 0.6111111 0.5)
CL-USER> (clsql:map-query 'list #'heavy [clsql:select 'btprospects])
(0.8684211 0.15789473 0.0 0.5263158 0.0 0.0)
CL-USER> (clsql:map-query 'list #'qcix [clsql:select 'btprospects])
(0.93421054 0.57894737 0.4722222 0.7076024 0.30555555 0.25)
So far, so good. Now we want to select only some records that have a qcix over a particular threshold. We can use CLSQL's addition to loop:
(defun fuzzy-select (class-name cutoff)
(loop for (rec) being the records in (clsql:select class-name)
when (> (qcix rec) cutoff) collect rec))
CL-USER> (fuzzy-select 'btprospects 0.5)
; Warning: database-query-result-set not implemented for database type POSTGRESQL-SOCKET.
; While executing: #
Oops! Perhaps another reason not to delve into the innards of CLSQL. Plan B:
(defun fuzzy-select (class-name cutoff)
(let ((result '()))
(clsql:do-query ((rec) [clsql:select class-name])
(when (>= (qcix rec) cutoff)
(push rec result)))
result))
CL-USER> (mapcar #'name (fuzzy-select 'btprospects .5))
("Freeman" "Cassey" "Saunders")
So, starting with a CLSQL def-view-class we can externally add generic methods to support fuzzy queries.