Source: [original](http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets)
This is the poker dataset, retrieved 2013-11-14 from the LibSVM site. In addition to the preprocessing done there (see the LibSVM site for details), this dataset was created as follows:
-join test and train datasets (non-scaled versions)
-relabel classes: 0 = positive class; 1, 2, ..., 9 = negative class
-normalize each file columnwise according to the following rules:
-If a column contains only one value (constant feature), it is set to zero and thus dropped from the sparse representation.
-If a column contains two values (binary feature), the value occurring more often is set to zero, the other to one.
-If a column contains more than two values (multinary/real feature), the column is divided by its standard deviation.
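The relabeling and column-wise rules above can be sketched in a few lines of Python. This is an illustrative sketch for small dense data, not the script that produced the dataset. The function names are hypothetical, and several details are assumptions because the description does not state them: the positive/negative classes are encoded as +1/-1, the population standard deviation is used, and ties in a binary column are broken by whichever value appears first.

```python
from collections import Counter
from statistics import pstdev


def normalize_column(col):
    """Apply the three normalization rules to one column (a list of numbers)."""
    counts = Counter(col)
    if len(counts) == 1:
        # Constant feature: set to zero, so it vanishes in a sparse format.
        return [0.0] * len(col)
    if len(counts) == 2:
        # Binary feature: more frequent value -> 0, the other -> 1.
        # Assumption: on a tie, the value seen first counts as the majority.
        majority = counts.most_common(1)[0][0]
        return [0.0 if v == majority else 1.0 for v in col]
    # Multinary/real feature: divide by the standard deviation.
    # Assumption: population standard deviation (not the sample estimate).
    sd = pstdev(col)
    return [v / sd for v in col]


def normalize_columns(rows):
    """Normalize a row-major table column by column."""
    cols = [normalize_column(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def relabel(labels):
    """Map class 0 to the positive class, classes 1..9 to the negative class.

    Assumption: positive/negative are encoded as +1/-1.
    """
    return [1 if label == 0 else -1 for label in labels]
```

Note that the multinary rule only rescales the column; it does not center it, so zero entries stay zero and sparsity is preserved.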
NOTE: Please keep in mind that poker has mild redundancy: roughly 0.2% of the data points within each file (train, test) are duplicates. These duplicated points have not been removed!
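To gauge the redundancy in a given file, a duplicate fraction can be estimated along these lines. This is a minimal sketch with a hypothetical helper name. It assumes a "duplicate" means any extra copy of a row beyond its first occurrence, and that each row holds whatever should be compared (features only, or features plus label); the description does not specify either.

```python
from collections import Counter


def duplicate_fraction(rows):
    """Return the share of rows that repeat an earlier, identical row."""
    if not rows:
        return 0.0
    counts = Counter(tuple(r) for r in rows)
    # Every occurrence after the first counts as one duplicate.
    extra = sum(c - 1 for c in counts.values())
    return extra / len(rows)
```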