Data
Filter results by:
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs1 likes1 downloads2 reach13 impact
4450 instances - 203 features - 0 classes - 0 missing values
Pittsburgh bridges This version is derived from version 1 by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not a classification dataset,…
34 runs0 likes1 downloads1 reach15 impact
105 instances - 12 features - 6 classes - 61 missing values
The problem is to learn a regression equation/rule/tree to predict the activity from the descriptive structural attributes. The data and methodology is described in detail in: - King, Ross .D., Hurst,…
5 runs0 likes1 downloads1 reach9 impact
186 instances - 61 features - 0 classes - 0 missing values
This is the pollution data so loved by writers of papers on ridge regression. Source: McDonald, G.C. and Schwing, R.C. (1973) 'Instabilities of regression estimates relating air pollution to…
0 runs0 likes1 downloads1 reach13 impact
60 instances - 16 features - 0 classes - 0 missing values
Veteran's Administration Lung Cancer Trial Taken from Kalbfleisch and Prentice, pages 223-224 Variables Treatment 1=standard, 2=test Celltype 1=squamous, 2=smallcell, 3=adeno, 4=large Survival in days…
2 runs0 likes1 downloads1 reach13 impact
137 instances - 8 features - 0 classes - 0 missing values
Information about the dataset CLASSTYPE: numeric CLASSINDEX: last
2 runs0 likes1 downloads1 reach13 impact
559 instances - 5 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs0 likes1 downloads1 reach13 impact
468 instances - 4 features - 0 classes - 0 missing values
17x17x2x2 tables of counts in GLIM-ready format used for the analyses in Biblarz, Timothy J., and Adrian E. Raftery. 1993. "The Effects of Family Disruption on Social Mobility." American Sociological…
3 runs0 likes1 downloads1 reach14 impact
1156 instances - 6 features - 0 classes - 0 missing values
One of the data sets used in the book "Analyzing Categorical Data" by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on statistical…
0 runs0 likes1 downloads1 reach11 impact
31 instances - 16 features - classes - 150 missing values
This S dump contains 22 data sets from the book Visualizing Data published by Hobart Press (books@hobart.com). The dump was created by data.dump() and can be read back into S by data.restore(). The…
2 runs0 likes1 downloads1 reach13 impact
8641 instances - 5 features - 0 classes - 0 missing values
1. Title: Ozone Level Detection 2. Source: Kun Zhang zhang.kun05 '@' gmail.com Department of Computer Science, Xavier University of Lousiana Wei Fan wei.fan '@' gmail.com IBM T.J.Watson Research…
0 runs0 likes1 downloads1 reach13 impact
2536 instances - 73 features - 0 classes - 0 missing values
No data.
0 runs0 likes1 downloads1 reach9 impact
144 instances - 77 features - 0 classes - 0 missing values
Automated file upload of BNG(optdigits)
100 runs1 likes1 downloads2 reach11 impact
1000000 instances - 65 features - 10 classes - 0 missing values
Automated file upload of BNG(segment)
99 runs0 likes1 downloads1 reach11 impact
1000000 instances - 20 features - 7 classes - 0 missing values
Multi-label dataset. The image benchmark dataset consists of 2000 natural scene images. Zhou and Zhang (2007) extracted 135 features for each image and made it publicly available as processed image…
0 runs1 likes1 downloads2 reach9 impact
2000 instances - 140 features - classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
10 runs0 likes1 downloads1 reach19 impact
20000 instances - 4297 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
13 runs0 likes1 downloads1 reach18 impact
58310 instances - 181 features - 10 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
12 runs0 likes1 downloads1 reach18 impact
83733 instances - 55 features - 4 classes - 0 missing values
Sensor data measurements of one Boiler, containing WaterInput/SteamOutput (flow, temperature, pressure) for one month, which is measured every minute.
0 runs0 likes1 downloads1 reach9 impact
44643 instances - 8 features - classes - 44643 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
3 runs0 likes1 downloads1 reach18 impact
5418 instances - 1637 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs0 likes1 downloads1 reach16 impact
425240 instances - 79 features - 2 classes - 2734000 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
1 runs0 likes1 downloads1 reach16 impact
4147 instances - 49 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs0 likes1 downloads1 reach15 impact
100 instances - 10001 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
11 runs1 likes1 downloads2 reach19 impact
20000 instances - 4297 features - 2 classes - 0 missing values
This is the dataset used for the 2016 IDA Industrial Challenge, courtesy of Scania. For a full description, see http://archive.ics.uci.edu/ml/datasets/IDA2016Challenge . This dataset contains both the…
7 runs0 likes1 downloads1 reach17 impact
76000 instances - 171 features - 2 classes - 1078695 missing values
These weekly averages are ultimately based on measurements of 4 air samples per hour taken atop intake lines on several towers during steady periods of CO2 concentration of not less than 6 hours per…
0 runs1 likes1 downloads2 reach9 impact
2225 instances - 7 features - 0 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes1 downloads1 reach15 impact
321 instances - 10936 features - 2 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach11 impact
1000000 instances - 37 features - 2 classes - 0 missing values
1. Title: Social Workers Decisions (Ordinal SWD) 2. Source Informaion: Donor: Arie Ben David MIS, Dept. of Technology Management Holon Academic Inst. of Technology 52 Golomb St. Holon 58102 Israel…
0 runs0 likes1 downloads1 reach13 impact
1000 instances - 11 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
1000 instances - 6 features - 0 classes - 0 missing values
No data.
30 runs0 likes1 downloads1 reach12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
7 runs0 likes1 downloads1 reach12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
0 runs0 likes1 downloads1 reach9 impact
1000000 instances - 33 features - 0 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach12 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
47 runs0 likes1 downloads1 reach9 impact
1000000 instances - 45 features - 2 classes - 0 missing values
No data.
30 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
1000 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
1000 instances - 51 features - 0 classes - 0 missing values
No data.
0 runs0 likes1 downloads1 reach12 impact
177147 instances - 11 features - 0 classes - 0 missing values
No data.
7 runs0 likes1 downloads1 reach12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
27 runs0 likes1 downloads1 reach12 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach12 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach12 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
32 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
1000 instances - 26 features - 0 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
31 runs0 likes1 downloads1 reach12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
11 runs0 likes1 downloads1 reach14 impact
283 instances - 54622 features - 3 classes - 0 missing values
No data.
30 runs0 likes1 downloads1 reach12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
libSVM","AAD group #Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs0 likes1 downloads1 reach16 impact
32561 instances - 124 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
500 instances - 6 features - 0 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
27 runs0 likes1 downloads1 reach12 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach11 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
31 runs0 likes1 downloads1 reach11 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
0 runs0 likes1 downloads1 reach9 impact
1000000 instances - 33 features - 0 classes - 0 missing values
No data.
30 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
31 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
1 runs0 likes1 downloads1 reach13 impact
500 instances - 26 features - 0 classes - 0 missing values
1. Title: Employee Rejection\Acceptance (Orinal ERA) 2. Source Informaion: Donor: Arie Ben David MIS, Dept. of Technology Management Holon Academic Inst. of Technology 52 Golomb St. Holon 58102 Israel…
0 runs0 likes1 downloads1 reach13 impact
1000 instances - 5 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes1 downloads1 reach13 impact
20 instances - 10 features - 0 classes - 0 missing values
No data.
32 runs0 likes1 downloads1 reach10 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
7 runs0 likes1 downloads1 reach12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
1000 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
1000 instances - 51 features - 0 classes - 0 missing values
MyExampleIris
32 runs0 likes1 downloads1 reach20 impact
150 instances - 5 features - 3 classes - 0 missing values
No data.
27 runs0 likes1 downloads1 reach12 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
0 runs0 likes1 downloads1 reach9 impact
1000000 instances - 16 features - 0 classes - 0 missing values
No data.
0 runs0 likes1 downloads1 reach9 impact
1000000 instances - 14 features - 0 classes - 0 missing values
Systematic determination of genetic network architecture. Nature Genetics, 1999 Jul;22(3):281-5. Data also used in Biclustering of Expression Data, by Yizong Cheng and George M. Church (web…
0 runs0 likes1 downloads1 reach13 impact
17 instances - 2884 features - 0 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
1000 instances - 51 features - 0 classes - 0 missing values
No data.
0 runs0 likes1 downloads1 reach12 impact
17496 instances - 10 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
1000 instances - 11 features - 0 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach11 impact
1000000 instances - 37 features - 2 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
250 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
500 instances - 6 features - 0 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach12 impact
1000000 instances - 19 features - 4 classes - 0 missing values
Datasets of Data And Story Library, project illustrating use of basic statistic methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs0 likes1 downloads1 reach13 impact
200 instances - 20 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach13 impact
1000 instances - 6 features - 0 classes - 0 missing values
No data.
10 runs0 likes1 downloads1 reach12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs0 likes1 downloads1 reach14 impact
105 instances - 22284 features - 3 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs1 likes1 downloads2 reach14 impact
105 instances - 22284 features - 3 classes - 0 missing values
libSVM","AAD group #Dataset from the LIBSVM data repository. Preprocessing: The original Adult data set has 14 features, among which six are continuous and eight are categorical. In this data set,…
0 runs0 likes1 downloads1 reach16 impact
32561 instances - 124 features - 0 classes - 0 missing values
libSVM","AAD group #Dataset from the LIBSVM data repository.
0 runs0 likes1 downloads1 reach16 impact
49749 instances - 301 features - 0 classes - 0 missing values
This is a test dataset
0 runs0 likes0 downloads0 reach0 impact
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs0 likes0 downloads0 reach20 impact
70 instances - 3 features - 3 classes - 0 missing values
This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages are on four separate subjects (Bands- recording artists; Goats; Sheep; and…
0 runs0 likes0 downloads0 reach20 impact
61 instances - 3 features - 3 classes - 0 missing values
This analysis describes and summarizes the relationships between 1987 salaries of major league baseball players and the player's performance. The salary data were taken from Sports Illustrated, April…
0 runs0 likes0 downloads0 reach13 impact
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs0 likes0 downloads0 reach13 impact
67 instances - 16 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach13 impact
468 instances - 3 features - 0 classes - 0 missing values
Data Used in "A BAYESIAN APPROACH TO DATA DISCLOSURE: OPTIMAL INTRUDER BEHAVIOR FOR CONTINUOUS DATA" by Stephen E. Fienberg, Udi E. Makov, and Ashish P. Sanil Background: ========== In this paper we…
0 runs0 likes0 downloads0 reach13 impact
662 instances - 4 features - 0 classes - 0 missing values