Data
chscase: A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs - 0 likes - 1 downloads - 1 reach - 7 impact
185 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs - 0 likes - 1 downloads - 1 reach - 7 impact
321 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs - 0 likes - 1 downloads - 1 reach - 7 impact
410 instances - 10937 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1000000 instances - 33 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1000000 instances - 14 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1000000 instances - 16 features - 0 classes - 0 missing values
No data.
50 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 18 features - 22 classes - 0 missing values
1. Title: Employee Rejection\Acceptance (Ordinal ERA) 2. Source Information: Donor: Arie Ben David, MIS, Dept. of Technology Management, Holon Academic Inst. of Technology, 52 Golomb St., Holon 58102, Israel…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 5 features - 0 classes - 0 missing values
1. Title: Ozone Level Detection 2. Source: Kun Zhang, zhang.kun05 '@' gmail.com, Department of Computer Science, Xavier University of Louisiana; Wei Fan, wei.fan '@' gmail.com, IBM T.J.Watson Research…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
2536 instances - 73 features - 0 classes - 0 missing values
Pittsburgh bridges. This version is derived from version 1 by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not a classification dataset,…
31 runs - 0 likes - 1 downloads - 1 reach - 7 impact
105 instances - 13 features - 6 classes - 61 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 26 features - 0 classes - 0 missing values
Systematic determination of genetic network architecture. Nature Genetics, 1999 Jul;22(3):281-5. Data also used in Biclustering of Expression Data, by Yizong Cheng and George M. Church (web…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
17 instances - 2884 features - 0 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
11 runs - 0 likes - 1 downloads - 1 reach - 6 impact
283 instances - 54622 features - 3 classes - 0 missing values
Datasets from the Data And Story Library (DASL), a project illustrating the use of basic statistical methods, converted to ARFF format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
47 instances - 14 features - 0 classes - 0 missing values
Datasets from the Data And Story Library (DASL), a project illustrating the use of basic statistical methods, converted to ARFF format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
2 runs - 0 likes - 1 downloads - 1 reach - 7 impact
53 instances - 12 features - 0 classes - 0 missing values
Datasets from the Data And Story Library (DASL), a project illustrating the use of basic statistical methods, converted to ARFF format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
39 instances - 5 features - 0 classes - 0 missing values
Datasets from the Data And Story Library (DASL), a project illustrating the use of basic statistical methods, converted to ARFF format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
3 runs - 0 likes - 1 downloads - 1 reach - 5 impact
50 instances - 6 features - 0 classes - 0 missing values
Datasets from the Data And Story Library (DASL), a project illustrating the use of basic statistical methods, converted to ARFF format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
200 instances - 21 features - 0 classes - 0 missing values
Datasets from the Data And Story Library (DASL), a project illustrating the use of basic statistical methods, converted to ARFF format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
150 instances - 5 features - 0 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs - 0 likes - 1 downloads - 1 reach - 6 impact
95 instances - 22278 features - 5 classes - 0 missing values
17x17x2x2 tables of counts in GLIM-ready format used for the analyses in Biblarz, Timothy J., and Adrian E. Raftery. 1993. "The Effects of Family Disruption on Social Mobility." American Sociological…
3 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1156 instances - 6 features - 0 classes - 0 missing values
This is the pollution data so loved by writers of papers on ridge regression. Source: McDonald, G.C. and Schwing, R.C. (1973) 'Instabilities of regression estimates relating air pollution to…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
60 instances - 16 features - 0 classes - 0 missing values
Time series used in "Long-Memory Processes, the Allan Variance and Wavelets" by D. B. Percival and P. Guttorp, a chapter…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
6875 instances - 1 features - 0 classes - 0 missing values
The data are a subsample of 500 observations from a data set that originates in a study where air pollution at a road is related to traffic volume and meteorological variables, collected by the…
2 runs - 0 likes - 1 downloads - 1 reach - 5 impact
500 instances - 8 features - 0 classes - 0 missing values
No data.
32 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
10 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
7 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
7 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
6 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
7 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 37 features - 2 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
500 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 26 features - 0 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
32 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
27 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
27 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
27 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 19 features - 4 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
32561 instances - 124 features - 0 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
32561 instances - 124 features - 0 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
32561 instances - 124 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
1000 instances - 6 features - 0 classes - 0 missing values
No data.
47 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1000000 instances - 45 features - 2 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
177147 instances - 11 features - 0 classes - 0 missing values
No data.
44 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1000000 instances - 13 features - 11 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1000000 instances - 33 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
144 instances - 77 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
17496 instances - 10 features - 0 classes - 0 missing values
analcatdata: A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
365 instances - 4 features - 0 classes - 30 missing values
Information about the dataset: CLASSTYPE: numeric, CLASSINDEX: last
2 runs - 0 likes - 1 downloads - 1 reach - 5 impact
559 instances - 5 features - 0 classes - 0 missing values
analcatdata: A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs - 0 likes - 1 downloads - 1 reach - 3 impact
228 instances - 8 features - classes - 20 missing values
Geographical Analysis Spatial Data. This georeferenced data set was used in: Pace, R. Kelley, and Ronald Barry, Quick Computation of Regressions with a Spatially Autoregressive Dependent Variable,…
4 runs - 1 likes - 1 downloads - 2 reach - 5 impact
3107 instances - 7 features - 0 classes - 0 missing values
The data are a subsample of 500 observations from a data set that originates in a study where air pollution at a road is related to traffic volume and meteorological variables, collected by the…
2 runs - 0 likes - 1 downloads - 1 reach - 5 impact
500 instances - 8 features - 0 classes - 0 missing values
One of the data sets used in the book "Analyzing Categorical Data" by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on statistical…
0 runs - 0 likes - 1 downloads - 1 reach - 3 impact
31 instances - 16 features - classes - 150 missing values
Information about the dataset: CLASSTYPE: numeric, CLASSINDEX: last
2 runs - 0 likes - 1 downloads - 1 reach - 5 impact
559 instances - 5 features - 0 classes - 0 missing values
Information about the dataset: CLASSTYPE: numeric, CLASSINDEX: last
2 runs - 0 likes - 1 downloads - 1 reach - 5 impact
559 instances - 5 features - 0 classes - 0 missing values
This analysis describes and summarizes the relationships between 1987 salaries of major league baseball players and the players' performance. The salary data were taken from Sports Illustrated, April…
0 runs - 1 likes - 1 downloads - 2 reach - 5 impact
26 instances - 9 features - 0 classes - 0 missing values
Veteran's Administration Lung Cancer Trial. Taken from Kalbfleisch and Prentice, pages 223-224. Variables: Treatment: 1=standard, 2=test; Celltype: 1=squamous, 2=smallcell, 3=adeno, 4=large; Survival in days…
2 runs - 0 likes - 1 downloads - 1 reach - 5 impact
137 instances - 8 features - 0 classes - 0 missing values
analcatdata: A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs - 0 likes - 1 downloads - 1 reach - 5 impact
468 instances - 4 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 30016, and it has 97 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
97 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 100872, and it has 533 rows and 1026 features…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
533 instances - 1026 features - 0 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 101433, and it has 81 rows and 1026 features (including…
1 runs - 0 likes - 1 downloads - 1 reach - 3 impact
81 instances - 1026 features - 0 classes - 0 missing values