Data
Filter results by:
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes2 downloads2 reach5 impact
100 instances - 51 features - 0 classes - 0 missing values
Systematic determination of genetic network architecture. Nature Genetics, 1999 Jul;22(3):281-5. Data also used in Biclustering of Expression Data, by Yizong Cheng and George M. Church (web…
0 runs0 likes1 downloads1 reach5 impact
17 instances - 2884 features - 0 classes - 0 missing values
Datasets of Data And Story Library, project illustrating use of basic statistic methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs0 likes1 downloads1 reach5 impact
47 instances - 14 features - 0 classes - 0 missing values
Datasets of Data And Story Library, project illustrating use of basic statistic methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs0 likes1 downloads1 reach5 impact
39 instances - 5 features - 0 classes - 0 missing values
Datasets of Data And Story Library, project illustrating use of basic statistic methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs0 likes1 downloads1 reach5 impact
200 instances - 21 features - 0 classes - 0 missing values
Datasets of Data And Story Library, project illustrating use of basic statistic methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs0 likes2 downloads2 reach5 impact
48 instances - 8 features - 0 classes - 0 missing values
Datasets of Data And Story Library, project illustrating use of basic statistic methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs0 likes1 downloads1 reach5 impact
150 instances - 5 features - 0 classes - 0 missing values
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
0 runs0 likes0 downloads0 reach5 impact
31 instances - 31 features - 0 classes - 33 missing values
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
0 runs0 likes0 downloads0 reach5 impact
145 instances - 95 features - 0 classes - 0 missing values
Datasets of Data And Story Library, project illustrating use of basic statistic methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs0 likes0 downloads0 reach6 impact
Datasets of Data And Story Library, project illustrating use of basic statistic methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs0 likes0 downloads0 reach5 impact
40 instances - 7 features - 0 classes - 3 missing values
This is the pollution data so loved by writers of papers on ridge regression. Source: McDonald, G.C. and Schwing, R.C. (1973) 'Instabilities of regression estimates relating air pollution to…
0 runs0 likes1 downloads1 reach5 impact
60 instances - 16 features - 0 classes - 0 missing values
DATA FILE: Data on patient deaths within 30 days of surgery in 131 U.S. hospitals. See Christiansen and Morris, Bayesian Biostatistics, D. Berry and D. Stangl, editors, 1996, Marcel Dekker, Inc. Data…
0 runs0 likes0 downloads0 reach5 impact
131 instances - 4 features - 0 classes - 0 missing values
------------------------------------------------------------------------------- TIME SERIES USED IN LONG-MEMORY PROCESSES, THE ALLAN VARIANCE AND WAVELETS BY D. B. PERCIVAL AND P. GUTTORP, A CHAPTER…
0 runs0 likes1 downloads1 reach5 impact
6875 instances - 1 features - 0 classes - 0 missing values
The data consist of annual observations on the level of strike volume (days lost due to industrial disputes per 1000 wage salary earners), and their covariates in 18 OECD countries from 1951-1985. The…
0 runs0 likes2 downloads2 reach5 impact
625 instances - 7 features - 0 classes - 0 missing values
File README ----------- smoothmeth A collection of the data sets used in the book "Smoothing Methods in Statistics," by Jeffrey S. Simonoff, Springer-Verlag, New York, 1996. Submitted by Jeff Simonoff…
0 runs0 likes0 downloads0 reach5 impact
2178 instances - 4 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
500 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 26 features - 0 classes - 0 missing values
libSVM","AAD group Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Cell Biology, 96:6745-6750, 1999. #Dataset from…
0 runs0 likes3 downloads3 reach5 impact
62 instances - 2001 features - 0 classes - 0 missing values
libSVM","AAD group A practical guide to support vector classification. Technical report, Department of Computer Science, National Taiwan University, 2003. #Dataset from the LIBSVM data repository…
0 runs0 likes0 downloads0 reach5 impact
7089 instances - 5 features - 0 classes - 0 missing values
libSVM","AAD group A simple and efficient algorithm for gene selection using sparse logistic regression. Bioinformatics, 19(17):2246-2253, 2003. #Dataset from the LIBSVM data repository.…
0 runs0 likes2 downloads2 reach5 impact
86 instances - 7130 features - 0 classes - 0 missing values
Building projectable classifiers of arbitrary complexity. In Proceedings of the 13th International Conference on Pattern Recognition, pages 880-885, Vienna, Austria, August 1996. #Dataset from the…
0 runs0 likes3 downloads3 reach5 impact
862 instances - 3 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach5 impact
1000 instances - 25 features - 0 classes - 0 missing values
libSVM","AAD group #Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs0 likes1 downloads1 reach5 impact
32561 instances - 124 features - 0 classes - 0 missing values
libSVM","AAD group #Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs0 likes1 downloads1 reach5 impact
32561 instances - 124 features - 0 classes - 0 missing values
libSVM","AAD group #Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs0 likes2 downloads2 reach5 impact
32561 instances - 124 features - 0 classes - 0 missing values
libSVM","AAD group #Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs0 likes0 downloads0 reach5 impact
32561 instances - 124 features - 0 classes - 0 missing values
libSVM","AAD group #Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs0 likes1 downloads1 reach5 impact
32561 instances - 124 features - 0 classes - 0 missing values
libSVM","AAD group #Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs0 likes0 downloads0 reach5 impact
32561 instances - 124 features - 0 classes - 0 missing values
Michel Lang fRMA-normalized. Only "Kratz-genes"*. \* (see: A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international…
0 runs0 likes7 downloads7 reach5 impact
226 instances - 24 features - 2 classes - 0 missing values
Modified version of the training dataset of the Bike Sharing Demand challenge running on Kaggle (http://www.kaggle.com/c/bike-sharing-demand/) If you use the problem in publication, please cite:…
0 runs0 likes3 downloads3 reach4 impact
10886 instances - 12 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach3 impact
24 instances - 5 features - classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach5 impact
209 instances - 8 features - classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes2 downloads2 reach5 impact
1000 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 101 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
78732 instances - 11 features - 0 classes - 0 missing values
Data on predicting clicks on ads in a search engine.
0 runs0 likes7 downloads7 reach5 impact
1496391 instances - 12 features - 2 classes - 0 missing values
Even smaller sample of version 1
0 runs0 likes3 downloads3 reach4 impact
149639 instances - 12 features - 2 classes - 0 missing values
No data.
0 runs0 likes3 downloads3 reach1 impact
116640 instances - 10 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
1000000 instances - 13 features - 0 classes - 0 missing values
No data.
0 runs0 likes1 downloads1 reach1 impact
177147 instances - 11 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
531441 instances - 12 features - 0 classes - 0 missing values
No data.
0 runs0 likes2 downloads2 reach1 impact
1000000 instances - 37 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
1000000 instances - 41 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
1000000 instances - 91 features - 0 classes - 0 missing values
No data.
0 runs0 likes1 downloads1 reach1 impact
1000000 instances - 33 features - 0 classes - 0 missing values
This data is derived from the 2012 KDD Cup. The data is subsampled to 1% of the original number of instances, downsampling the majority class (click=0) so that the target feature is reasonably…
0 runs0 likes2 downloads2 reach2 impact
798964 instances - 12 features - 3 classes - 399482 missing values
No data.
0 runs0 likes1 downloads1 reach1 impact
144 instances - 77 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
1000000 instances - 19 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
1000000 instances - 26 features - 0 classes - 0 missing values
No data.
0 runs0 likes2 downloads2 reach1 impact
31104 instances - 10 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
1000000 instances - 14 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
1000000 instances - 16 features - 0 classes - 0 missing values
Juan J. Rodriguez, Ludmila I. Kuncheva, Carlos J. Alonso (2006). Rotation Forest: A new classifier ensemble method. IEEE Transactions on Pattern Analysis and Machine Intelligence. 28(10):1619-1630.…
0 runs0 likes0 downloads0 reach1 impact
1000000 instances - 12 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
177147 instances - 11 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
1000000 instances - 19 features - 0 classes - 0 missing values
No data.
0 runs0 likes1 downloads1 reach1 impact
17496 instances - 10 features - 0 classes - 0 missing values
No data.
0 runs0 likes3 downloads3 reach1 impact
59049 instances - 10 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs0 likes1 downloads1 reach5 impact
365 instances - 4 features - 0 classes - 30 missing values
Relationship between IQ and Brain Size Summary: Monozygotic twins share numerous physical, psychological, and pathological traits. Recent advances in in vivo brain image acquisition and analysis have…
0 runs0 likes0 downloads0 reach5 impact
20 instances - 9 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs0 likes0 downloads0 reach5 impact
48 instances - 5 features - 0 classes - 0 missing values
One of two multivariate regression data sets from paper industry, from an experiment at the paper plant Norske Skog, Skogn, Norway. They have been described and analysed in: Aldrin, M. (1996),…
0 runs0 likes2 downloads2 reach5 impact
One of two multivariate regression data sets from paper industry, from an experiment at the paper plant Saugbruksforeningen, Norway. They have been described and analysed in: Aldrin, M. (1996),…
0 runs0 likes0 downloads0 reach5 impact
30 instances - 41 features - 0 classes - 0 missing values
This is the hip measurement data from Table B.13 in Chatfield's Problem Solving (1995, 2nd edn, Chapman and Hall). It is given in 8 columns. First 4 columns are for Control Group. Last 4 columns are…
0 runs0 likes0 downloads0 reach3 impact
54 instances - 8 features - classes - 120 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs0 likes2 downloads2 reach3 impact
100 instances - 10 features - classes - 0 missing values