1241
codrnaNorm
1
Normalized form of codrna (351)
**Author**: Andrew V Uzilov, Joshua M Keegan, David H Mathews
**Source**: [original](http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets)
**Please cite**: [AVU06a]
Andrew V Uzilov, Joshua M Keegan, and David H Mathews.
Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change.
BMC Bioinformatics, 7(173), 2006.
This is the cod-rna dataset, retrieved 2014-11-14 from the LibSVM site. In addition to the preprocessing done there (see the LibSVM site for details), this dataset was created as follows:
- Join the test, train and rest datasets.
- Normalize each file column-wise according to the following rules:
  - If a column contains only one value (constant feature), it is set to zero and thus removed by sparsity.
  - If a column contains two values (binary feature), the value occurring more often is set to zero, the other to one.
  - If a column contains more than two values (multinary/real feature), the column is divided by its standard deviation.
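
The three column rules above can be sketched in a few lines of Python. This is an illustrative reconstruction only, not the script used to build the dataset: the helper names `normalize_column` and `normalize_columns` are hypothetical, the choice of the population standard deviation is an assumption (the description does not say which variant was used), and ties in a binary column are resolved arbitrarily in favour of the value seen first.

```python
from collections import Counter
from statistics import pstdev


def normalize_column(values):
    """Normalize one feature column following the three rules listed above."""
    counts = Counter(values)
    if len(counts) == 1:
        # Constant feature: every entry becomes zero.
        return [0.0] * len(values)
    if len(counts) == 2:
        # Binary feature: the more frequent value maps to 0, the other to 1.
        # Assumption: on a tie, the value seen first counts as the majority.
        majority = counts.most_common(1)[0][0]
        return [0.0 if v == majority else 1.0 for v in values]
    # Real-valued feature: divide by the standard deviation (no centering).
    # Assumption: population standard deviation; the source does not specify.
    sd = pstdev(values)
    return [v / sd for v in values]


def normalize_columns(rows):
    """Apply normalize_column to each column of a list of equal-length rows."""
    columns = [normalize_column(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]
```

Note that the third rule only rescales the column; it does not subtract the mean, so zero entries stay zero and the sparse representation is preserved.
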
NOTE: Please keep in mind that cod-rna has many duplicated data points, both within each file (train, test, rest) and across these files. These duplicated points have not been removed!
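
Because the duplicates are still present, it may be worth measuring them before splitting the data for evaluation. The following is a minimal sketch, assuming the rows have already been loaded into memory as sequences of feature values; `count_duplicates` is a hypothetical helper, and whether the class label should be part of the comparison is left to the user.

```python
from collections import Counter


def count_duplicates(rows):
    """Return how many rows are exact repeats of an earlier row."""
    # Rows are converted to tuples so they can be used as dictionary keys.
    counts = Counter(tuple(row) for row in rows)
    # Each distinct row contributes (occurrences - 1) redundant copies.
    return sum(c - 1 for c in counts.values())
```
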
1
Sparse_ARFF
2015-02-14T02:50:40
Public https://api.openml.org/data/v1/download/522888/codrnaNorm.sparse_arff
522888 codrna_Y Chemistry, Life Science, study_16 public active
2020-11-20 21:09:08 7e0d78d35b49661e86f80439ce2f6380