Data
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
118 runs - 0 likes - 3 downloads - 3 reach - 8 impact
228 instances - 10 features - 2 classes - 20 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
102 runs - 0 likes - 3 downloads - 3 reach - 8 impact
527 instances - 39 features - 2 classes - 560 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
113 runs - 0 likes - 3 downloads - 3 reach - 8 impact
366 instances - 6 features - 2 classes - 1 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
701 runs - 0 likes - 3 downloads - 3 reach - 8 impact
736 instances - 20 features - 2 classes - 448 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
104 runs - 0 likes - 3 downloads - 3 reach - 7 impact
57 instances - 12 features - 2 classes - 1 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au6-cd1-400 * Abstract: AutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity…
144 runs - 0 likes - 3 downloads - 3 reach - 6 impact
400 instances - 41 features - 8 classes - 0 missing values
* Abstract: A 3-class version of abalone dataset. * Sources: (a) Original owners of database: Marine Resources Division Marine Research Laboratories - Taroona Department of Primary Industry and…
105 runs - 0 likes - 3 downloads - 3 reach - 7 impact
4177 instances - 9 features - 3 classes - 0 missing values
* Dataset Title: Wall-Following Robot Navigation Data Data Set (version with 2 Attributes) * Abstract: The data were collected as the SCITOS G5 robot navigates through the room following the wall in a…
109 runs - 0 likes - 3 downloads - 3 reach - 7 impact
5456 instances - 3 features - 4 classes - 0 missing values
* Dataset Title: Robot Execution Failures Data Set * Abstract: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque…
129 runs - 0 likes - 3 downloads - 3 reach - 6 impact
117 instances - 91 features - 3 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B4 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
123 runs - 0 likes - 3 downloads - 3 reach - 7 impact
10190 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: D2 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
118 runs - 0 likes - 3 downloads - 3 reach - 7 impact
9172 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: D3 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
126 runs - 0 likes - 3 downloads - 3 reach - 7 impact
9285 instances - 4 features - 5 classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
12 runs - 0 likes - 4 downloads - 4 reach - 6 impact
8192 instances - 13 features - 0 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2855 runs - 0 likes - 4 downloads - 4 reach - 16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
Newsweeder: Learning to filter netnews. In Proceedings of the Twelfth International Conference on Machine Learning, pages 331-339, 1995. #Dataset from the LIBSVM data repository. Preprocessing: First…
0 runs - 0 likes - 4 downloads - 4 reach - 5 impact
19928 instances - 62062 features - 0 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2834 runs - 0 likes - 4 downloads - 4 reach - 16 impact
1545 instances - 10937 features - 2 classes - 0 missing values
No data.
292 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 37 features - 6 classes - 0 missing values
No data.
312 runs - 0 likes - 4 downloads - 4 reach - 3 impact
1000000 instances - 14 features - 3 classes - 0 missing values
No data.
117 runs - 0 likes - 4 downloads - 4 reach - 1 impact
1000000 instances - 20 features - 5 classes - 0 missing values
No data.
33 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
29 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs - 1 likes - 4 downloads - 5 reach - 2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
306 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 4 features - 2 classes - 0 missing values
This is the Tecator data set: The task is to predict the fat content of a meat sample on the basis of its near infrared absorbance spectrum. 1. Statement of permission from Tecator (the original data…
0 runs - 0 likes - 4 downloads - 4 reach - 5 impact
240 instances - 125 features - 0 classes - 0 missing values
Primary Biliary Cirrhosis This data set is a follow-up to the original PBC data set, as discussed in appendix D of Fleming and Harrington, Counting Processes and Survival Analysis, Wiley, 1991. An…
0 runs - 0 likes - 4 downloads - 4 reach - 5 impact
1945 instances - 19 features - 0 classes - 1133 missing values
Predicting the Geographical Origin of Music, ICDM, 2014 Abstract: Instances in this dataset contain audio features extracted from 1059 wave files. The task associated with the data is to predict the…
0 runs - 0 likes - 4 downloads - 4 reach - 4 impact
1059 instances - 118 features - 0 classes - 0 missing values
USDA, NRCS. 2008. The PLANTS Database ([Web Link], 31 December 2008). National Plant Data Center, Baton Rouge, LA 70874-4490 USA. Abstract: Data has been extracted from the USDA plants database. It…
0 runs - 0 likes - 4 downloads - 4 reach - 3 impact
This is sensor data for testing; it is not complete.
0 runs - 0 likes - 4 downloads - 4 reach - 3 impact
127591 instances - 27 features - classes - 0 missing values
No data.
68 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 21 features - 2 classes - 0 missing values
No data.
65 runs - 0 likes - 4 downloads - 4 reach - 1 impact
1000000 instances - 40 features - 2 classes - 0 missing values
No data.
63 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
48 runs - 1 likes - 4 downloads - 5 reach - 2 impact
1000000 instances - 77 features - 10 classes - 0 missing values
1. Title: Wisconsin Prognostic Breast Cancer (WPBC) 2. Source Information a) Creators: Dr. William H. Wolberg, General Surgery Dept., University of Wisconsin, Clinical Sciences Center, Madison, WI…
5 runs - 0 likes - 4 downloads - 4 reach - 1 impact
194 instances - 33 features - 0 classes - 0 missing values
This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics; (b) its assigned insurance risk rating; (c) its normalized losses in use as…
7 runs - 1 likes - 4 downloads - 5 reach - 1 impact
159 instances - 16 features - 0 classes - 0 missing values
This is an artificial data set described in Breiman et al. (1984,p.238) (with variance 1 instead of 2). Generate the values of the 10 attributes independently using the following probabilities: P(X_1…
2 runs - 1 likes - 4 downloads - 5 reach - 2 impact
40768 instances - 11 features - 0 classes - 0 missing values
No data.
230 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 35 features - 2 classes - 0 missing values
No data.
310 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 11 features - 2 classes - 0 missing values
Synthetic dataset. Almost identical to [dataset 152](https://www.openml.org/d/153/edit)
319 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 11 features - 2 classes - 0 missing values
No data.
90 runs - 0 likes - 4 downloads - 4 reach - 1 impact
137781 instances - 10 features - 7 classes - 0 missing values
No data.
219 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 58 features - 2 classes - 0 missing values
No data.
334 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 33 features - 2 classes - 0 missing values
No data.
108 runs - 0 likes - 4 downloads - 4 reach - 10 impact
927 instances - 10129 features - 7 classes - 0 missing values
No data.
332 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 17 features - 2 classes - 0 missing values
No data.
310 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
68 runs - 0 likes - 4 downloads - 4 reach - 1 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
326 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 16 features - 2 classes - 0 missing values
No data.
326 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 14 features - 2 classes - 0 missing values
University of Sao Paulo, School of Art, Sciences and Humanities, Sao Paulo, SP, Brazil ### LIBRAS Movement Database LIBRAS, acronym of the Portuguese name "LIngua BRAsileira de Sinais", is the…
0 runs - 0 likes - 4 downloads - 4 reach - 7 impact
360 instances - 91 features - 0 classes - 0 missing values
No data.
291 runs - 0 likes - 4 downloads - 4 reach - 1 impact
1000000 instances - 18 features - 7 classes - 0 missing values
No data.
69 runs - 0 likes - 4 downloads - 4 reach - 1 impact
1000000 instances - 20 features - 2 classes - 0 missing values
No data.
51 runs - 1 likes - 4 downloads - 5 reach - 2 impact
1000000 instances - 48 features - 10 classes - 0 missing values
No data.
33 runs - 0 likes - 4 downloads - 4 reach - 3 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
143 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 39 features - 6 classes - 0 missing values
Automated file upload of BNG(ionosphere)
99 runs - 1 likes - 4 downloads - 5 reach - 3 impact
1000000 instances - 35 features - 2 classes - 0 missing values
The langLog dataset includes 1004 textual predictors and was originally compiled in the doctoral thesis of Read (2010). It consists of 956 text samples that can be assigned to one or more topics such…
0 runs - 0 likes - 4 downloads - 4 reach - 3 impact
1460 instances - 1079 features - 2 classes - 0 missing values
Data from https://doi.org/10.5281/zenodo.269636
0 runs - 0 likes - 4 downloads - 4 reach - 5 impact
4758 instances - 39 features - classes - 0 missing values
This is a 20,000 instance sample of the original CIFAR-10 dataset. Sampled randomly and stratified, with 2000 examples per class. Training and test set are merged. Find the corresponding task for the…
380 runs - 0 likes - 4 downloads - 4 reach - 12 impact
20000 instances - 3073 features - 10 classes - 0 missing values
This dataset contains QSAR data (from ChEMBL version 17) showing activity values (unit is pseudo-pCI50) of several compounds on drug target TID: 10907, and it has 1443 rows and 1026 features…
1 runs - 0 likes - 4 downloads - 4 reach - 3 impact
1443 instances - 1026 features - 0 classes - 0 missing values
Kung chi
1 runs - 0 likes - 4 downloads - 4 reach - 4 impact
123 instances - 40 features - 2 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B2 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
120 runs - 0 likes - 4 downloads - 4 reach - 6 impact
10668 instances - 4 features - 5 classes - 0 missing values
0. airplane 1. automobile 2. bird 3. cat 4. deer 5. dog 6. frog 7. horse 8. ship 9. truck CIFAR-10 contains 6000 images per class. The original train-test split randomly divided these into 5000 train…
143 runs - 0 likes - 4 downloads - 4 reach - 12 impact
60000 instances - 3073 features - 10 classes - 0 missing values
This dataset is taken from the MiniBooNE experiment and is used to distinguish electron neutrinos (signal) from muon neutrinos (background). This dataset is ordered. It first contains all signal…
6 runs - 0 likes - 4 downloads - 4 reach - 5 impact
130064 instances - 51 features - 2 classes - 0 missing values
Small dataset with time series of RAM prices over the years.
0 runs - 1 likes - 4 downloads - 5 reach - 4 impact
333 instances - 3 features - 0 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: B3 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
119 runs - 0 likes - 4 downloads - 4 reach - 7 impact
10386 instances - 4 features - 5 classes - 0 missing values
libSVM, AAD group. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Cell Biology, 96:6745-6750, 1999. #Dataset from…
0 runs - 0 likes - 4 downloads - 4 reach - 6 impact
62 instances - 2001 features - 0 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs - 0 likes - 4 downloads - 4 reach - 7 impact
138 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs - 0 likes - 4 downloads - 4 reach - 8 impact
421 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs - 0 likes - 4 downloads - 4 reach - 8 impact
470 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
65 runs - 0 likes - 4 downloads - 4 reach - 8 impact
468 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs - 0 likes - 4 downloads - 4 reach - 8 impact
484 instances - 10937 features - 2 classes - 0 missing values
Citation Request: This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
66 runs - 0 likes - 4 downloads - 4 reach - 7 impact
277 instances - 10 features - 2 classes - 0 missing values
* Dataset Title: MicroMass - Mixed (mixed spectra version) * Abstract: A dataset to explore machine learning approaches for the identification of microorganisms from mass-spectrometry data. * Source:…
64 runs - 1 likes - 4 downloads - 5 reach - 6 impact
360 instances - 1301 features - 10 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2841 runs - 0 likes - 4 downloads - 4 reach - 17 impact
630 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs - 0 likes - 4 downloads - 4 reach - 8 impact
355 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
66 runs - 0 likes - 4 downloads - 4 reach - 8 impact
195 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs - 0 likes - 4 downloads - 4 reach - 8 impact
193 instances - 10937 features - 2 classes - 0 missing values
Source: The dataset was created by Angeliki Xifara (angxifara @ gmail.com, Civil/Structural Engineer) and was processed by Athanasios Tsanas (tsanasthanasis @ gmail.com, Oxford Centre for Industrial…
103 runs - 1 likes - 4 downloads - 5 reach - 6 impact
768 instances - 10 features - 37 classes - 0 missing values
This is a commercial application described in Weiss & Indurkhya (1995). The data describes a telecommunication problem. No further information is available. Characteristics: (10000+5000) cases, 49…
2 runs - 0 likes - 4 downloads - 4 reach - 2 impact
15000 instances - 49 features - 0 classes - 0 missing values
Abstract: CART book's waveform domains Source: Original Owners: Breiman,L., Friedman,J.H., Olshen,R.A., & Stone,C.J. (1984). Classification and Regression Trees. Wadsworth International Group:…
0 runs - 1 likes - 4 downloads - 5 reach - 4 impact
5000 instances - 22 features - classes - 0 missing values
Squash Harvest Stored Data source: Winna Harvey Crop and Food Research, Christchurch, New Zealand The purpose of the research was to determine the changes taking place in squash fruit during the…
867 runs - 0 likes - 4 downloads - 4 reach - 8 impact
52 instances - 25 features - 3 classes - 7 missing values
Squash Harvest Unstored Data source: Winna Harvey Crop and Food Research, Christchurch, New Zealand The purpose of the research was to determine the changes taking place in squash fruit during the…
876 runs - 0 likes - 4 downloads - 4 reach - 8 impact
52 instances - 24 features - 3 classes - 39 missing values
No data.
211 runs - 0 likes - 4 downloads - 4 reach - 11 impact
313 instances - 5805 features - 8 classes - 0 missing values
County data from the 2000 Presidential Election in Florida. Compiled by Brett Presnell Department of Statistics, University of Florida These data are derived from three sources, described below. As…
32 runs - 0 likes - 4 downloads - 4 reach - 7 impact
67 instances - 17 features - 5 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
103 runs - 0 likes - 4 downloads - 4 reach - 7 impact
92 instances - 11 features - 2 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
490 runs - 0 likes - 4 downloads - 4 reach - 6 impact
364 instances - 33 features - 6 classes - 101 missing values
CODING: ITEM 1 = BUSINESS CONDITIONS 6 MONTHS FROM NOW (CONFERENCE BOARD) ITEM 2 = JOBS 6 MONTHS FROM NOW (CONFERENCE BOARD) ITEM 3 = FAMILY INCOME 6 MONTHS FROM NOW (CONFERENCE BOARD) ITEM 4 =…
560 runs - 0 likes - 4 downloads - 4 reach - 7 impact
72 instances - 4 features - 6 classes - 0 missing values
Hayes-Roth Database This is a merged version of the separate train and test set which are usually distributed. On OpenML this train-test split can be found as one of the possible tasks. Source…
384 runs - 0 likes - 4 downloads - 4 reach - 18 impact
160 instances - 5 features - 3 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
537 runs - 0 likes - 4 downloads - 4 reach - 7 impact
285 instances - 8 features - 7 classes - 27 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
104 runs - 0 likes - 4 downloads - 4 reach - 7 impact
52 instances - 10 features - 2 classes - 0 missing values
No data.
949 runs - 0 likes - 4 downloads - 4 reach - 4 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
996 runs - 0 likes - 4 downloads - 4 reach - 4 impact
74 instances - 63 features - 4 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
755 runs - 0 likes - 4 downloads - 4 reach - 7 impact
54 instances - 8 features - 2 classes - 120 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
670 runs - 0 likes - 4 downloads - 4 reach - 7 impact
62 instances - 8 features - 2 classes - 8 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
700 runs - 0 likes - 4 downloads - 4 reach - 8 impact
294 instances - 14 features - 2 classes - 782 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
733 runs - 0 likes - 4 downloads - 4 reach - 7 impact
87 instances - 11 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
708 runs - 0 likes - 4 downloads - 4 reach - 9 impact
286 instances - 10 features - 2 classes - 9 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
672 runs - 0 likes - 4 downloads - 4 reach - 8 impact
158 instances - 8 features - 2 classes - 87 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
102 runs - 0 likes - 4 downloads - 4 reach - 7 impact
67 instances - 16 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
714 runs - 0 likes - 4 downloads - 4 reach - 8 impact
303 instances - 14 features - 2 classes - 6 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
117 runs - 0 likes - 4 downloads - 4 reach - 7 impact
50 instances - 6 features - 2 classes - 0 missing values
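
Several entries in this listing describe binarized versions of an original data set, built either by thresholding a numeric target at its mean or by re-labeling the majority class as positive ('P'). The following is a minimal Python sketch of both ideas, not the code used to produce these data sets. The function names are hypothetical. Because the descriptions are truncated, the sketch assumes that the non-positive label is 'N', and it exposes the side of the mean that counts as positive as a parameter rather than guessing it.

```python
from collections import Counter
from statistics import mean


def binarize_by_mean(values, positive_below_mean=True):
    """Map numeric targets to 'P'/'N' using the mean as the threshold.

    Assumption: the truncated descriptions do not show which side of the
    mean is labeled positive, so the direction is a parameter here.
    Values exactly equal to the mean are treated as "not below".
    """
    threshold = mean(values)
    labels = []
    for v in values:
        below = v < threshold
        labels.append("P" if below == positive_below_mean else "N")
    return labels


def binarize_by_majority(labels):
    """Relabel the most frequent class as 'P' and every other class as 'N'.

    Assumption: 'N' for the remaining classes is not confirmed by the
    truncated descriptions. On ties, the class seen first wins.
    """
    majority, _ = Counter(labels).most_common(1)[0]
    return ["P" if label == majority else "N" for label in labels]


if __name__ == "__main__":
    print(binarize_by_mean([1, 2, 3, 10]))
    print(binarize_by_majority(["a", "b", "a", "c"]))
```

Either function produces a two-class nominal target, which matches the "2 classes" figure reported for the binarized entries.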