
vehicle_sensIT

Status: active. Format: Sparse_ARFF. Visibility: public. Uploaded 29-08-2014 by aydin demircioglu.
Tags: concept_drift, mythbusting_1, study_1, study_15, study_20, study_41


Author: M. Duarte, Y. H. Hu
Source: [original](http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets) - 2013-11-14
Please cite: M. Duarte and Y. H. Hu. Vehicle classification in distributed sensor networks. Journal of Parallel and Distributed Computing, 64(7):826-838, July 2004.

This is the SensIT Vehicle (combined) dataset, retrieved 2013-11-14 from the LibSVM site. In addition to the preprocessing done there (see the LibSVM site for details), this dataset was created as follows:

- Join the test and train datasets (2 files, already pre-combined).
- Relabel the classes: 1 and 2 become the positive class, 3 becomes the negative class.
- Normalize each file column-wise according to the following rules:
  - If a column contains only one value (constant feature), it is set to zero and thus removed by sparsity.
  - If a column contains two values (binary feature), the value occurring more often is set to zero, the other to one.
  - If a column contains more than two values (multinary/real feature), the column is divided by its standard deviation.
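The relabeling and column-wise normalization rules above can be sketched in a few lines of Python. This is an illustrative reading of the description, not the uploader's actual script. The function names, the +1/-1 label encoding, the use of the population standard deviation, and the tie-breaking for binary columns are all assumptions that the description does not settle.

```python
from collections import Counter
from statistics import pstdev


def relabel(original_class):
    """Map classes 1 and 2 to the positive class and 3 to the negative class.

    Assumption: +1 / -1 are used as the two labels; the source only says
    'positive' and 'negative'.
    """
    return 1 if original_class in (1, 2) else -1


def normalize_column(values):
    """Apply the three column-wise rules to one feature column."""
    distinct = set(values)
    if len(distinct) == 1:
        # Constant feature: set to zero, so it disappears in a sparse encoding.
        return [0.0] * len(values)
    if len(distinct) == 2:
        # Binary feature: the more frequent value becomes 0, the other 1.
        # Assumption: on a tie, the value seen first counts as "more frequent".
        most_common = Counter(values).most_common(1)[0][0]
        return [0.0 if v == most_common else 1.0 for v in values]
    # Multinary/real feature: divide by the standard deviation.
    # Assumption: population standard deviation; the estimator is not stated.
    sd = pstdev(values)
    return [v / sd for v in values]


def normalize_columns(rows):
    """Normalize a dense list of rows column by column."""
    columns = [list(col) for col in zip(*rows)]
    normalized = [normalize_column(col) for col in columns]
    return [list(row) for row in zip(*normalized)]
```

After this step every non-constant real-valued column has standard deviation 1 under the chosen estimator, which would be consistent with the standard-deviation properties of 1 reported for the numeric attributes.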

101 features

Y (target): nominal, 2 unique values, 0 missing
X1: numeric, 90612 unique values, 0 missing
X2: numeric, 83667 unique values, 0 missing
X3: numeric, 93241 unique values, 0 missing
X4: numeric, 91467 unique values, 0 missing
X5: numeric, 78817 unique values, 0 missing
X6: numeric, 84543 unique values, 0 missing
X7: numeric, 84445 unique values, 0 missing
X8: numeric, 92565 unique values, 0 missing
X9: numeric, 94507 unique values, 0 missing
X10: numeric, 95144 unique values, 0 missing
X11: numeric, 95227 unique values, 0 missing
X12: numeric, 96721 unique values, 0 missing
X13: numeric, 97419 unique values, 0 missing
X14: numeric, 97866 unique values, 0 missing
X15: numeric, 97836 unique values, 0 missing
X16: numeric, 97807 unique values, 0 missing
X17: numeric, 97734 unique values, 0 missing
X18: numeric, 97799 unique values, 0 missing
X19: numeric, 97732 unique values, 0 missing
X20: numeric, 97777 unique values, 0 missing
X21: numeric, 97762 unique values, 0 missing
X22: numeric, 97541 unique values, 0 missing
X23: numeric, 97438 unique values, 0 missing
X24: numeric, 97227 unique values, 0 missing
X25: numeric, 97271 unique values, 0 missing
X26: numeric, 97280 unique values, 0 missing
X27: numeric, 97346 unique values, 0 missing
X28: numeric, 97276 unique values, 0 missing
X29: numeric, 97273 unique values, 0 missing
X30: numeric, 97328 unique values, 0 missing
X31: numeric, 97475 unique values, 0 missing
X32: numeric, 97429 unique values, 0 missing
X33: numeric, 97414 unique values, 0 missing
X34: numeric, 97452 unique values, 0 missing
X35: numeric, 97445 unique values, 0 missing
X36: numeric, 97266 unique values, 0 missing
X37: numeric, 97316 unique values, 0 missing
X38: numeric, 97198 unique values, 0 missing
X39: numeric, 97291 unique values, 0 missing
X40: numeric, 97359 unique values, 0 missing
X41: numeric, 97194 unique values, 0 missing
X42: numeric, 97219 unique values, 0 missing
X43: numeric, 97187 unique values, 0 missing
X44: numeric, 97212 unique values, 0 missing
X45: numeric, 97250 unique values, 0 missing
X46: numeric, 97178 unique values, 0 missing
X47: numeric, 97180 unique values, 0 missing
X48: numeric, 97239 unique values, 0 missing
X49: numeric, 97171 unique values, 0 missing
X50: numeric, 97155 unique values, 0 missing
X51: numeric, 97873 unique values, 0 missing
X52: numeric, 73164 unique values, 0 missing
X53: numeric, 72925 unique values, 0 missing
X54: numeric, 86238 unique values, 0 missing
X55: numeric, 92160 unique values, 0 missing
X56: numeric, 95063 unique values, 0 missing
X57: numeric, 94849 unique values, 0 missing
X58: numeric, 96214 unique values, 0 missing
X59: numeric, 96389 unique values, 0 missing
X60: numeric, 96484 unique values, 0 missing
X61: numeric, 96837 unique values, 0 missing
X62: numeric, 96845 unique values, 0 missing
X63: numeric, 97026 unique values, 0 missing
X64: numeric, 97039 unique values, 0 missing
X65: numeric, 97039 unique values, 0 missing
X66: numeric, 97126 unique values, 0 missing
X67: numeric, 97132 unique values, 0 missing
X68: numeric, 97107 unique values, 0 missing
X69: numeric, 97162 unique values, 0 missing
X70: numeric, 97157 unique values, 0 missing
X71: numeric, 96984 unique values, 0 missing
X72: numeric, 96817 unique values, 0 missing
X73: numeric, 96994 unique values, 0 missing
X74: numeric, 97019 unique values, 0 missing
X75: numeric, 97082 unique values, 0 missing
X76: numeric, 97111 unique values, 0 missing
X77: numeric, 97269 unique values, 0 missing
X78: numeric, 97320 unique values, 0 missing
X79: numeric, 97220 unique values, 0 missing
X80: numeric, 97392 unique values, 0 missing
X81: numeric, 97384 unique values, 0 missing
X82: numeric, 97386 unique values, 0 missing
X83: numeric, 97449 unique values, 0 missing
X84: numeric, 97365 unique values, 0 missing
X85: numeric, 97410 unique values, 0 missing
X86: numeric, 97316 unique values, 0 missing
X87: numeric, 97361 unique values, 0 missing
X88: numeric, 97371 unique values, 0 missing
X89: numeric, 97368 unique values, 0 missing
X90: numeric, 97311 unique values, 0 missing
X91: numeric, 97315 unique values, 0 missing
X92: numeric, 97370 unique values, 0 missing
X93: numeric, 97377 unique values, 0 missing
X94: numeric, 97318 unique values, 0 missing
X95: numeric, 97387 unique values, 0 missing
X96: numeric, 97361 unique values, 0 missing
X97: numeric, 97373 unique values, 0 missing
X98: numeric, 97344 unique values, 0 missing
X99: numeric, 97282 unique values, 0 missing
X100: numeric, 97340 unique values, 0 missing

107 properties

Each line gives the value, then the property it belongs to. Properties for which the page shows no value are marked "(no value)".

98528 - Number of instances (rows) of the dataset.
101 - Number of attributes (columns) of the dataset.
2 - Number of distinct values of the target attribute (if it is nominal).
0 - Number of missing values in the dataset.
0 - Number of instances with at least one value missing.
100 - Number of numeric attributes.
1 - Number of nominal attributes.
0.19 - Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes
15.01 - First quartile of kurtosis among attributes of the numeric type.
1 - Third quartile of standard deviation of attributes of the numeric type.
0.84 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.78 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
0.17 - Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001
-0.35 - Mean of means among attributes of the numeric type.
0.61 - Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes
-0.85 - First quartile of means among attributes of the numeric type.
0.89 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1
0.16 - Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.22 - Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
0.67 - Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001
(no value) - Average mutual information between the nominal attributes and the target attribute.
1 - Number of binary attributes.
(no value) - First quartile of mutual information between the nominal attributes and the target attribute.
0.15 - Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1
0.68 - Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.57 - Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
0.82 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001
(no value) - An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.
2.41 - First quartile of skewness among attributes of the numeric type.
0.69 - Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1
0.84 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0 - Standard deviation of the number of distinct values among attributes of the nominal type.
0.17 - Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001
2 - Average number of distinct values among the attributes of the nominal type.
1 - First quartile of standard deviation of attributes of the numeric type.
0.89 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2
0.16 - Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.74 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk
0.67 - Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001
3.37 - Mean skewness among attributes of the numeric type.
(no value) - Second quartile (Median) of entropy among attributes.
0.15 - Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2
0.68 - Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.26 - Error rate achieved by the landmarker weka.classifiers.lazy.IBk
50 - Percentage of instances belonging to the most frequent class.
1 - Mean standard deviation of attributes of the numeric type.
19.02 - Second quartile (Median) of kurtosis among attributes of the numeric type.
0.69 - Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2
1 - Entropy of the target attribute values.
0.48 - Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk
49264 - Number of instances belonging to the most frequent class.
(no value) - Minimal entropy among attributes.
-0.24 - Second quartile (Median) of means among attributes of the numeric type.
0.89 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3
0.79 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump
(no value) - Maximum entropy among attributes.
-1.44 - Minimum kurtosis among attributes of the numeric type.
(no value) - Second quartile (Median) of mutual information between the nominal attributes and the target attribute.
0.15 - Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3
0.21 - Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump
119.15 - Maximum kurtosis among attributes of the numeric type.
-1.47 - Minimum of means among attributes of the numeric type.
3.31 - Second quartile (Median) of skewness among attributes of the numeric type.
0.69 - Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3
0.58 - Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump
0.91 - Maximum of means among attributes of the numeric type.
(no value) - Minimal mutual information between the nominal attributes and the target attribute.
0.99 - Percentage of binary attributes.
1 - Second quartile (Median) of standard deviation of attributes of the numeric type.
0.78 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
0 - Number of attributes divided by the number of instances.
(no value) - Maximum mutual information between the nominal attributes and the target attribute.
2 - The minimal number of distinct values among attributes of the nominal type.
0 - Percentage of instances having missing values.
(no value) - Third quartile of entropy among attributes.
0.22 - Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
(no value) - Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.
2 - The maximum number of distinct values among attributes of the nominal type.
-1.43 - Minimum skewness among attributes of the numeric type.
0 - Percentage of missing values.
50.91 - Third quartile of kurtosis among attributes of the numeric type.
0.5 - Average class difference between consecutive instances.
0.57 - Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
0.82 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001
7.98 - Maximum skewness among attributes of the numeric type.
1 - Minimum standard deviation of attributes of the numeric type.
99.01 - Percentage of numeric attributes.
-0.09 - Third quartile of means among attributes of the numeric type.
0.84 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.78 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
0.17 - Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001
1 - Maximum standard deviation of attributes of the numeric type.
50 - Percentage of instances belonging to the least frequent class.
0.99 - Percentage of nominal attributes.
(no value) - Third quartile of mutual information between the nominal attributes and the target attribute.
0.16 - Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.22 - Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
0.67 - Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001
(no value) - Average entropy of the attributes.
49264 - Number of instances belonging to the least frequent class.
(no value) - First quartile of entropy among attributes.
5 - Third quartile of skewness among attributes of the numeric type.
0.68 - Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.57 - Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
0.82 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001
34.02 - Mean kurtosis among attributes of the numeric type.
0.85 - Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes
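
Several of the simpler properties above follow directly from the counts on this page, and two others are defined by the formulas quoted in their descriptions. The sketch below recomputes the former and writes the latter as functions. It assumes that entropy is measured in bits (base-2 logarithm) and that the listed values are rounded to two decimals; the variable and function names are illustrative.

```python
from math import log2

# Counts taken from the property list.
n_instances = 98528
n_attributes = 101            # 100 numeric features plus the nominal target
n_numeric = 100
class_counts = [49264, 49264]


def entropy_bits(counts):
    """Shannon entropy (base 2) of a discrete distribution given as counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def equivalent_number_of_attributes(class_entropy, mean_mutual_information):
    """ClassEntropy divided by MeanMutualInformation."""
    return class_entropy / mean_mutual_information


def noise_to_signal_ratio(mean_attribute_entropy, mean_mutual_information):
    """(MeanAttributeEntropy - MeanMutualInformation) / MeanMutualInformation."""
    return (mean_attribute_entropy - mean_mutual_information) / mean_mutual_information


class_entropy = entropy_bits(class_counts)
majority_pct = 100 * max(class_counts) / n_instances
numeric_pct = 100 * n_numeric / n_attributes
dimensionality = n_attributes / n_instances

print(class_entropy, majority_pct, round(numeric_pct, 2), round(dimensionality, 2))
```

With a perfectly balanced target, the class entropy is exactly 1 bit and the majority class covers 50% of the instances. The attribute-to-instance ratio is about 0.001, which is why it appears as 0 at two-decimal precision.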

7 tasks

233 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: Y
129 runs - estimation_procedure: 10 times 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: Y
0 runs - estimation_procedure: 33% Holdout set - evaluation_measure: predictive_accuracy - target_feature: Y
41 runs - estimation_procedure: Interleaved Test then Train - target_feature: Y
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
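
Two of the estimation procedures named above, 10-fold cross-validation and interleaved test-then-train, differ in how each instance is used. The sketch below illustrates the general idea of each in plain Python. It is not the platform's implementation: the real tasks may shuffle, stratify, or use predefined splits, and the toy `MajorityClass` learner and all function names are hypothetical.

```python
def k_fold_indices(n, k=10):
    """Yield (train_indices, test_indices); every index is tested exactly once.

    Assumption: folds are assigned round-robin, without shuffling or
    stratification.
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


class MajorityClass:
    """Toy incremental learner: predicts the label seen most often so far."""

    def __init__(self):
        self.counts = {}

    def predict(self, x):
        if not self.counts:
            return None  # nothing seen yet, so no prediction
        return max(self.counts, key=self.counts.get)

    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


def interleaved_test_then_train(stream, model):
    """Each instance is first used for testing, then for training."""
    correct = total = 0
    for x, y in stream:
        correct += model.predict(x) == y  # test on the unseen instance
        model.learn(x, y)                 # then update the model with it
        total += 1
    return correct / total if total else 0.0
```

In cross-validation the model is retrained once per fold on a fixed split, whereas in the interleaved procedure a single model is updated instance by instance, which suits stream settings such as the concept_drift tag suggests.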