Data
Filter results by:
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
686 runs0 likes5 downloads5 reach15 impact
782 instances - 9 features - 2 classes - 466 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
683 runs0 likes5 downloads5 reach14 impact
60 instances - 11 features - 2 classes - 14 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
700 runs0 likes5 downloads5 reach14 impact
67 instances - 16 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
111 runs0 likes5 downloads5 reach14 impact
52 instances - 3 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
737 runs0 likes5 downloads5 reach14 impact
47 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
740 runs0 likes5 downloads5 reach14 impact
51 instances - 7 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
106 runs0 likes5 downloads5 reach14 impact
76 instances - 45 features - 2 classes - 22 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
721 runs0 likes5 downloads5 reach15 impact
226 instances - 70 features - 2 classes - 317 missing values
No data.
948 runs0 likes5 downloads5 reach12 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
203 runs0 likes5 downloads5 reach21 impact
878 instances - 7455 features - 10 classes - 0 missing values
A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on…
50 runs0 likes5 downloads5 reach14 impact
95 instances - 8 features - 5 classes - 9 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
455 runs0 likes5 downloads5 reach14 impact
108 instances - 4 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
730 runs0 likes5 downloads5 reach14 impact
93 instances - 23 features - 2 classes - 14 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
721 runs0 likes5 downloads5 reach14 impact
60 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
687 runs0 likes5 downloads5 reach14 impact
52 instances - 24 features - 2 classes - 39 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
717 runs0 likes5 downloads5 reach15 impact
303 instances - 14 features - 2 classes - 7 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
103 runs0 likes5 downloads5 reach14 impact
107 instances - 12 features - 2 classes - 71 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
732 runs0 likes5 downloads5 reach14 impact
63 instances - 32 features - 2 classes - 0 missing values
Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/
731 runs0 likes5 downloads5 reach23 impact
151 instances - 7 features - 3 classes - 0 missing values
--Title: AR3 /Software Defect Prediction --Date: February, 4th, 2009 --Data from a Turkish white-goods manufacturer --Donated by: Software Research Laboratory (Softlab), Bogazici University, Istanbul,…
718 runs0 likes5 downloads5 reach14 impact
63 instances - 30 features - 2 classes - 0 missing values
Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/
180 runs0 likes5 downloads5 reach24 impact
294 instances - 11 features - 2 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: D4 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
121 runs0 likes4 downloads4 reach14 impact
8654 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: E5 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
105 runs0 likes4 downloads4 reach14 impact
1112 instances - 4 features - 5 classes - 0 missing values
* Abstract: A 3-class version of abalone dataset. * Sources: (a) Original owners of database: Marine Resources Division Marine Research Laboratories - Taroona Department of Primary Industry and…
176 runs0 likes4 downloads4 reach14 impact
4177 instances - 9 features - 3 classes - 0 missing values
A 4-class version of breast-tissue dataset.
299 runs0 likes4 downloads4 reach13 impact
106 instances - 10 features - 4 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B1 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
119 runs0 likes4 downloads4 reach14 impact
10176 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B3 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
119 runs0 likes4 downloads4 reach14 impact
10386 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: A1 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
273 runs0 likes4 downloads4 reach14 impact
3252 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: A2 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
119 runs0 likes4 downloads4 reach14 impact
1623 instances - 4 features - 5 classes - 0 missing values
Kung chi
1 runs0 likes4 downloads4 reach12 impact
123 instances - 40 features - 2 classes - 0 missing values
No data.
33 runs0 likes4 downloads4 reach12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
68 runs0 likes4 downloads4 reach9 impact
20000 instances - 17 features - 3 classes - 10000 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
11 runs1 likes3 downloads4 reach14 impact
220 instances - 22284 features - 3 classes - 0 missing values
No data.
292 runs0 likes4 downloads4 reach11 impact
1000000 instances - 37 features - 6 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes4 downloads4 reach15 impact
468 instances - 10937 features - 2 classes - 0 missing values
Citation Request: This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
66 runs0 likes4 downloads4 reach14 impact
277 instances - 10 features - 2 classes - 0 missing values
This is a sesnor data for test it is not complete.
0 runs0 likes4 downloads4 reach11 impact
127591 instances - 27 features - classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10907, and it has 1443 rows and 1026 features…
1 runs0 likes4 downloads4 reach11 impact
1443 instances - 1026 features - 0 classes - 0 missing values
Predicting the Geographical Origin of Music, ICDM, 2014 Abstract: Instances in this dataset contain audio features extracted from 1059 wave files. The task associated with the data is to predict the…
0 runs0 likes4 downloads4 reach13 impact
1059 instances - 118 features - 0 classes - 0 missing values
USDA, NRCS. 2008. The PLANTS Database ([Web Link], 31 December 2008). National Plant Data Center, Baton Rouge, LA 70874-4490 USA. Abstract: Data has been extracted from the USDA plants database. It…
0 runs0 likes4 downloads4 reach11 impact
Concrete is the most important material in civil engineering. The concrete compressive strength is a highly nonlinear function of age and ingredients. These ingredients include cement, blast furnace…
3 runs1 likes3 downloads4 reach13 impact
1030 instances - 9 features - classes - 0 missing values
Author: Gregory Gay, Tim Menzies, Misty Davies, Karen Gundy-Burlet Source: [Zenodo](https://zenodo.org/record/322475) Please cite: Misty Davies. (2009). bike [Data set]. Zenodo. DOI:…
0 runs0 likes4 downloads4 reach10 impact
4435 instances - 11 features - classes - 0 missing values
The goal is to predict the Fare. Variable description: pclass: A proxy for socio-economic status (SES) 1st = Upper 2nd = Middle 3rd = Lower age: Age is fractional if less than 1. If the age is…
0 runs0 likes4 downloads4 reach10 impact
1307 instances - 8 features - 0 classes - 0 missing values
This dataset is gather to detect whether a person is running or walking based on deep neural networks and sensor data collected from iOS devices. The dataset represents 88588 sensor data samples…
1 runs0 likes4 downloads4 reach14 impact
88588 instances - 7 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes4 downloads4 reach15 impact
470 instances - 10936 features - 2 classes - 0 missing values
No data.
326 runs0 likes4 downloads4 reach11 impact
1000000 instances - 16 features - 2 classes - 0 missing values
No data.
68 runs0 likes4 downloads4 reach11 impact
1000000 instances - 21 features - 2 classes - 0 missing values
No data.
332 runs0 likes4 downloads4 reach11 impact
1000000 instances - 17 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs0 likes4 downloads4 reach14 impact
138 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
66 runs0 likes4 downloads4 reach15 impact
195 instances - 10936 features - 2 classes - 0 missing values
No data.
65 runs0 likes4 downloads4 reach10 impact
1000000 instances - 40 features - 2 classes - 0 missing values
This database was designed on the basis of data provided by US Census Bureau [http://www.census.gov] (under Lookup Access [http://www.census.gov/cdrom/lookup]: Summary Tape File 1). The data were…
2 runs1 likes3 downloads4 reach9 impact
22784 instances - 9 features - 0 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes4 downloads4 reach15 impact
355 instances - 10936 features - 2 classes - 0 missing values
No data.
69 runs0 likes4 downloads4 reach9 impact
1000000 instances - 20 features - 2 classes - 0 missing values
No data.
63 runs0 likes4 downloads4 reach11 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
291 runs0 likes4 downloads4 reach9 impact
1000000 instances - 17 features - 7 classes - 0 missing values
No data.
143 runs0 likes4 downloads4 reach11 impact
1000000 instances - 39 features - 6 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2841 runs0 likes4 downloads4 reach24 impact
630 instances - 10936 features - 2 classes - 0 missing values
No data.
230 runs0 likes4 downloads4 reach11 impact
1000000 instances - 35 features - 2 classes - 0 missing values
No data.
219 runs0 likes4 downloads4 reach11 impact
1000000 instances - 58 features - 2 classes - 0 missing values
No data.
310 runs0 likes4 downloads4 reach11 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
326 runs0 likes4 downloads4 reach11 impact
1000000 instances - 14 features - 2 classes - 0 missing values
No data.
68 runs0 likes4 downloads4 reach9 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
334 runs0 likes4 downloads4 reach11 impact
1000000 instances - 33 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes4 downloads4 reach15 impact
193 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2834 runs0 likes4 downloads4 reach24 impact
1545 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2855 runs0 likes4 downloads4 reach24 impact
1545 instances - 10936 features - 2 classes - 0 missing values
No data.
27 runs1 likes3 downloads4 reach10 impact
1000000 instances - 26 features - 7 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: D1 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
125 runs0 likes4 downloads4 reach14 impact
8753 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B2 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
120 runs0 likes4 downloads4 reach14 impact
10668 instances - 4 features - 5 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs0 likes4 downloads4 reach15 impact
421 instances - 10936 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes4 downloads4 reach15 impact
484 instances - 10936 features - 2 classes - 0 missing values
No data.
29 runs0 likes4 downloads4 reach10 impact
1000000 instances - 26 features - 7 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs0 likes4 downloads4 reach14 impact
92 instances - 59005 features - 5 classes - 0 missing values
Even smaller sample of version 1
0 runs0 likes4 downloads4 reach12 impact
149639 instances - 12 features - 2 classes - 0 missing values
No data.
33 runs0 likes4 downloads4 reach10 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
117 runs0 likes4 downloads4 reach9 impact
1000000 instances - 20 features - 5 classes - 0 missing values
No data.
306 runs0 likes4 downloads4 reach11 impact
1000000 instances - 4 features - 2 classes - 0 missing values
__Changes w.r.t. version 1: included one target factor with 7 levels as target variable for the classification. Also deleted the previous 7 binary target variables.__ A dataset of steel plates'…
9007 runs1 likes3 downloads4 reach15 impact
1941 instances - 28 features - 7 classes - 0 missing values
This is a commercial application described in Weiss & Indurkhya (1995). The data describes a telecommunication problem. No further information is available. Characteristics: (10000+5000) cases, 49…
2 runs0 likes4 downloads4 reach11 impact
15000 instances - 49 features - 0 classes - 0 missing values
This is the Tecator data set: The task is to predict the fat content of a meat sample on the basis of its near infrared absorbance spectrum. 1. Statement of permission from Tecator (the original data…
0 runs0 likes4 downloads4 reach14 impact
240 instances - 125 features - 0 classes - 0 missing values
shuttle-pmlb
10 runs0 likes4 downloads4 reach23 impact
58000 instances - 10 features - 7 classes - 0 missing values
### Description ### This dataset is part of a collection datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
6890 runs0 likes4 downloads4 reach17 impact
44819 instances - 7 features - 3 classes - 0 missing values
This dataset is taken from the MiniBooNE experiment and is used to distinguish electron neutrinos (signal) from muon neutrinos (background). This dataset is ordered. It first contains all signal…
12 runs0 likes4 downloads4 reach13 impact
130064 instances - 51 features - 2 classes - 0 missing values
The instances were drawn randomly from a database of 7 outdoor images. The images were hand-segmented to create a classification for every pixel. Each instance is a 3x3 region. __Major changes w.r.t.…
9931 runs0 likes4 downloads4 reach24 impact
2310 instances - 20 features - 7 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs1 likes3 downloads4 reach14 impact
95 instances - 22278 features - 5 classes - 0 missing values
* Dataset Title: Robot Execution Failures Data Set * Abstract: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque…
129 runs0 likes4 downloads4 reach13 impact
117 instances - 91 features - 3 classes - 0 missing values
this is titanic survival prediction
0 runs0 likes4 downloads4 reach7 impact
891 instances - 8 features - 0 classes - 0 missing values
No data.
108 runs0 likes4 downloads4 reach22 impact
927 instances - 10129 features - 7 classes - 0 missing values
Synthetic dataset. Almost identical to [dataset 152](https://www.openml.org/d/153/edit)
319 runs0 likes4 downloads4 reach11 impact
1000000 instances - 11 features - 2 classes - 0 missing values
No data.
310 runs0 likes4 downloads4 reach11 impact
1000000 instances - 11 features - 2 classes - 0 missing values
Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's…
0 runs0 likes4 downloads4 reach8 impact
1460 instances - 81 features - 0 classes - 6965 missing values
Wisconsin Prognostic Breast Cancer (WPBC) Various versions of this data have been used in the following publications: - W. N. Street, O. L. Mangasarian, and W.H. Wolberg. An inductive learning…
5 runs0 likes4 downloads4 reach9 impact
194 instances - 33 features - 0 classes - 0 missing values
University of Sao Paulo, School of Art, Sciences and Humanities, Sao Paulo, SP, Brazil ### LIBRAS Movement Database LIBRAS, acronym of the Portuguese name "LIngua BRAsileira de Sinais", is the…
0 runs0 likes4 downloads4 reach19 impact
360 instances - 91 features - 0 classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
12 runs0 likes4 downloads4 reach15 impact
8192 instances - 13 features - 0 classes - 0 missing values
El Nino Data Data Type spatio-temporal Abstract The data set contains oceanographic and surface meteorological readings taken from a series of buoys positioned throughout the equatorial Pacific. The…
0 runs1 likes3 downloads4 reach13 impact
1. U. S. Department of Commerce, Bureau of the Census, Census Of Population And Housing 1990 United States: Summary Tape File 1a & 3a (Computer Files), 2. U.S. Department Of Commerce, Bureau Of The…
0 runs1 likes3 downloads4 reach13 impact
1994 instances - 128 features - 0 classes - 39202 missing values
University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D. V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D. Heart Disease Databases Cholesterol…
160 runs0 likes4 downloads4 reach9 impact
303 instances - 14 features - 0 classes - 6 missing values
Primary Biliary Cirrhosis The data set found in appendix D of Fleming and Harrington, Counting Processes and Survival Analysis, Wiley, 1991. The only differences are: age is in days status is coded as…
18 runs1 likes3 downloads4 reach14 impact
418 instances - 20 features - 0 classes - 1033 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
672 runs0 likes4 downloads4 reach15 impact
158 instances - 8 features - 2 classes - 87 missing values