Things to do:

 - SLIM: the Gaussian seems to pick up on some things it intuitively
 shouldn't, e.g. swainsons and wilsons are scored as pretty similar.
 Perhaps I should use a Bayesian prior to keep the Gaussian's mean
 close to zero and its variance close to one?
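
 A minimal sketch of the prior idea above, not SLIM's actual code. It
 assumes a conjugate normal-inverse-chi-squared prior, so the estimates
 are the standard posterior parameters. The names ShrunkGaussian, fit,
 kappa0 and nu0 are hypothetical; kappa0 and nu0 act as pseudo-counts
 that say how strongly the mean and variance are pulled toward the
 prior values.

```java
// Hypothetical helper: Gaussian estimates shrunk toward a prior mean and variance.
class ShrunkGaussian {
    /**
     * Returns {mean, variance} under a normal-inverse-chi-squared prior.
     * kappa0 (> 0) is the pseudo-count behind priorMean,
     * nu0 (> 0) is the pseudo-count behind priorVar.
     */
    static double[] fit(double[] x, double priorMean, double priorVar,
                        double kappa0, double nu0) {
        int n = x.length;
        double sum = 0.0;
        for (double v : x) sum += v;
        // With no data, fall back to the prior mean.
        double xbar = n > 0 ? sum / n : priorMean;

        // Sum of squared deviations around the sample mean.
        double ss = 0.0;
        for (double v : x) ss += (v - xbar) * (v - xbar);

        double kappaN = kappa0 + n;
        double nuN = nu0 + n;
        // Posterior mean: weighted average of prior mean and sample mean.
        double mean = (kappa0 * priorMean + n * xbar) / kappaN;
        // Posterior scale: prior variance, data scatter, and a penalty
        // for the sample mean sitting far from the prior mean.
        double shift = xbar - priorMean;
        double var = (nu0 * priorVar + ss + (kappa0 * n / kappaN) * shift * shift) / nuN;
        return new double[] { mean, var };
    }
}
```

 Calling ShrunkGaussian.fit(scores, 0.0, 1.0, kappa0, nu0) with larger
 kappa0 and nu0 forces the result closer to mean zero and variance one;
 with no data at all it returns exactly the prior values.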

 - look at other LM/IR types of measures (talk to John and Cheng?)
 - move in Pradeep's adaptive string-matching code
 - EM learning
   -- model M vs U as mixture of multinomials
   -- model each cluster using smoothed document models
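
 A rough sketch of what the EM item above could look like, reading M
 and U as the match and non-match classes (an assumption). Each record
 pair is a vector of feature counts, and each class is a multinomial
 over features with additive smoothing. MultinomialMixtureEM and step
 are hypothetical names, and the initial theta values must all be
 positive.

```java
// Hypothetical sketch: EM for a mixture of multinomials (e.g. M vs U).
class MultinomialMixtureEM {
    final double[] prior;    // prior[c] = mixing weight of component c
    final double[][] theta;  // theta[c][j] = P(feature j | component c)

    MultinomialMixtureEM(double[] prior, double[][] theta) {
        this.prior = prior;
        this.theta = theta;
    }

    /** Runs one EM iteration; returns responsibilities resp[i][c]. */
    double[][] step(int[][] counts, double smoothing) {
        int n = counts.length, k = prior.length, d = theta[0].length;
        double[][] resp = new double[n][k];

        // E-step: posterior of each component per example, in log space.
        // The multinomial coefficient is the same for every component,
        // so it cancels and is omitted.
        for (int i = 0; i < n; i++) {
            double[] logp = new double[k];
            double max = Double.NEGATIVE_INFINITY;
            for (int c = 0; c < k; c++) {
                logp[c] = Math.log(prior[c]);
                for (int j = 0; j < d; j++) {
                    logp[c] += counts[i][j] * Math.log(theta[c][j]);
                }
                max = Math.max(max, logp[c]);
            }
            double z = 0.0;
            for (int c = 0; c < k; c++) {
                resp[i][c] = Math.exp(logp[c] - max);
                z += resp[i][c];
            }
            for (int c = 0; c < k; c++) resp[i][c] /= z;
        }

        // M-step: re-estimate weights and smoothed multinomials.
        for (int c = 0; c < k; c++) {
            double weight = 0.0, total = 0.0;
            double[] expected = new double[d];
            for (int i = 0; i < n; i++) {
                weight += resp[i][c];
                for (int j = 0; j < d; j++) {
                    expected[j] += resp[i][c] * counts[i][j];
                }
            }
            for (int j = 0; j < d; j++) {
                expected[j] += smoothing;  // additive smoothing
                total += expected[j];
            }
            for (int j = 0; j < d; j++) theta[c][j] = expected[j] / total;
            prior[c] = weight / n;
        }
        return resp;
    }
}
```

 Calling step repeatedly until the responsibilities stop changing gives
 the usual EM loop; the smoothing argument keeps every theta entry
 above zero so the log in the E-step stays finite.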

Little things to do:

 - stemming
 - blockers could be smarter about whether to cluster or not
 - experimenter should keep a "notebook" of all experiments, and have a
   GUI browser of previously-run experiments
 - experiment with blockers, SLIM
 - add TunedDistance marker interface, for ones that I recommend
 - add ScaledDistance marker interface, for ones scaled between zero and one?
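
 The two marker interfaces above could be as small as this sketch.
 SimpleDistance is a hypothetical stand-in for the package's real
 distance interface, and ExactMatch and MarkerCheck exist only to show
 how a marker is attached and tested.

```java
// Hypothetical stand-in for the package's distance interface.
interface SimpleDistance {
    double score(String s, String t);
}

// Marker: a distance whose settings are recommended as-is.
interface TunedDistance {}

// Marker: a distance whose scores are assumed to lie between zero and one.
interface ScaledDistance {}

// Toy distance used only to show how a marker is attached.
class ExactMatch implements SimpleDistance, ScaledDistance {
    public double score(String s, String t) {
        return s.equals(t) ? 1.0 : 0.0;
    }
}

// Callers test for a marker with instanceof; markers declare no methods.
class MarkerCheck {
    static boolean isScaled(SimpleDistance d) { return d instanceof ScaledDistance; }
    static boolean isTuned(SimpleDistance d) { return d instanceof TunedDistance; }
}
```

 A marker carries no behavior, so nothing enforces the zero-to-one
 range; ScaledDistance only records the claim.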

 - write a DistanceFactory that converts a class/class/class spec to
 a distance, e.g. SoftTFIDF/WinklerVariant/Jaro would build
 JaroWinklerTFIDF.
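
 One possible shape for such a factory, as a sketch only. It assumes
 the last name in the spec is the base distance and each name to its
 left wraps the result. It uses a hand-filled registry rather than
 looking classes up by name. Dist, DistanceFactorySketch and demo are
 hypothetical, and the registered entries only describe themselves
 instead of computing real scores.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for a distance; it only reports how it was composed.
interface Dist {
    String describe();
}

class DistanceFactorySketch {
    private final Map<String, Supplier<Dist>> bases = new HashMap<>();
    private final Map<String, Function<Dist, Dist>> wrappers = new HashMap<>();

    void registerBase(String name, Supplier<Dist> s) { bases.put(name, s); }
    void registerWrapper(String name, Function<Dist, Dist> w) { wrappers.put(name, w); }

    /** Builds "Outer/Middle/Inner": the last name is the base, the rest wrap it. */
    Dist build(String spec) {
        String[] names = spec.split("/");
        String baseName = names[names.length - 1];
        Supplier<Dist> base = bases.get(baseName);
        if (base == null) {
            throw new IllegalArgumentException("unknown base distance: " + baseName);
        }
        Dist d = base.get();
        // Apply wrappers from right to left so the leftmost name is outermost.
        for (int i = names.length - 2; i >= 0; i--) {
            Function<Dist, Dist> w = wrappers.get(names[i]);
            if (w == null) {
                throw new IllegalArgumentException("unknown wrapper: " + names[i]);
            }
            d = w.apply(d);
        }
        return d;
    }

    /** Factory pre-loaded with the three names from the example spec. */
    static DistanceFactorySketch demo() {
        DistanceFactorySketch f = new DistanceFactorySketch();
        f.registerBase("Jaro", () -> () -> "Jaro");
        f.registerWrapper("WinklerVariant",
            inner -> () -> "WinklerVariant(" + inner.describe() + ")");
        f.registerWrapper("SoftTFIDF",
            inner -> () -> "SoftTFIDF(" + inner.describe() + ")");
        return f;
    }
}
```

 With this layout, DistanceFactorySketch.demo().build("SoftTFIDF/WinklerVariant/Jaro")
 yields a SoftTFIDF wrapper around a Winkler wrapper around Jaro, and an
 unregistered name fails with an IllegalArgumentException.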

Activities:

 3/20

 - SoftTFIDF is abstract; use JWSoftTFIDF or SLIMSoftTFIDF for concrete distances

 3/13
 - add package-level documentation
 - reorganize directories, set up Ant build file, javadocs, test cases
 - don't use MatchData in main package (except in SoftTokenFelligiSunter)
 - test cases for AbstractStatisticalTokenDistance subclasses
 - add default packages for Blockers, Distances in expt 

 3/20
 - checked in working version of SLIM.java
 - got rid of class directory from CVS
 - refactored Winkler variant of Jaro as separate class
