OpenML
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
135 runs - 0 likes - 9 downloads - 9 reach - 15 impact
3190 instances - 61 features - 2 classes - 0 missing values
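The binarization described in the entry above can be sketched in a few lines of Python. This is a minimal sketch, not OpenML's actual conversion code: the function name `binarize_by_majority` is hypothetical, and the negative label 'N' for all non-majority classes is an assumption, because the description is cut off right after the positive label. How ties between equally frequent classes are broken is also an assumption (here, the class seen first wins).

```python
from collections import Counter


def binarize_by_majority(labels, positive="P", negative="N"):
    """Relabel the most frequent class as `positive`, all others as `negative`.

    `negative="N"` is an assumed label; the source description is truncated.
    """
    if not labels:
        return []
    # most_common(1) returns [(label, count)]; ties go to the first label seen.
    majority, _ = Counter(labels).most_common(1)[0]
    return [positive if y == majority else negative for y in labels]


if __name__ == "__main__":
    original = ["a", "b", "a", "c", "a", "b"]
    print(binarize_by_majority(original))
```

The result always has exactly two classes, which matches the "2 classes" reported for the binarized entries in this listing.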
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: A4 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
136 runs - 0 likes - 5 downloads - 5 reach - 14 impact
1515 instances - 4 features - 5 classes - 0 missing values
* Dataset Title: Wall-Following Robot Navigation Data Data Set (version with 4 Attributes) * Abstract: The data were collected as the SCITOS G5 robot navigates through the room following the wall in a…
138 runs - 1 likes - 7 downloads - 8 reach - 15 impact
5456 instances - 5 features - 4 classes - 0 missing values
* Dataset: This is a reprocessed version of heart-h (hungarian), the heart disease reprocessed hungarian dataset from UCI.
138 runs - 0 likes - 8 downloads - 8 reach - 13 impact
294 instances - 14 features - 5 classes - 0 missing values
Compilation of promoters with known transcriptional start points for E. coli genes. The task is to recognize promoters in strings that represent nucleotides (one of A, G, T, or C). A promoter is a…
138 runs - 1 likes - 9 downloads - 10 reach - 12 impact
106 instances - 58 features - 2 classes - 0 missing values
Yeast dataset Past Usage: André Elisseeff and Jason Weston. A kernel method for multi-labelled classification. In Thomas G. Dietterich, Susan Becker, and Zoubin Ghahramani, editors, Advances in…
139 runs - 0 likes - 8 downloads - 8 reach - 14 impact
2417 instances - 117 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
140 runs - 0 likes - 6 downloads - 6 reach - 15 impact
194 instances - 29 features - 2 classes - 0 missing values
* Title: Planning Relax Data Set * Abstract: The dataset concerns the classification of two mental stages from recorded EEG signals: Planning (during imagination of a motor act) and Relax state. *…
141 runs - 0 likes - 8 downloads - 8 reach - 14 impact
182 instances - 13 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
141 runs - 0 likes - 7 downloads - 7 reach - 14 impact
500 instances - 23 features - 2 classes - 0 missing values
No data.
143 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
143 runs - 1 likes - 11 downloads - 12 reach - 15 impact
531 instances - 102 features - 2 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au6-cd1-400 * Abstract: AutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity…
144 runs - 0 likes - 3 downloads - 3 reach - 13 impact
400 instances - 41 features - 8 classes - 0 missing values
* Title: Thoracic Surgery Data Data Set * Abstract: The data is dedicated to a classification problem related to post-operative life expectancy in lung cancer patients: class 1 - death within…
145 runs - 0 likes - 8 downloads - 8 reach - 14 impact
470 instances - 17 features - 2 classes - 0 missing values
* Abstract: Predict bankruptcy from qualitative parameters from experts. * Source: Source Information -- Creator : Mr.A.Martin(jayamartin '@' yahoo.com) Mr.J.Uthayakumar (uthayakumar17691 '@'…
147 runs - 0 likes - 11 downloads - 11 reach - 14 impact
250 instances - 7 features - 2 classes - 0 missing values
* Title: seismic-bumps Data Set * Abstract: The data describe the problem of forecasting high energy (higher than 10^4 J) seismic bumps in a coal mine. Data come from two longwalls located in a…
152 runs - 0 likes - 37 downloads - 37 reach - 13 impact
210 instances - 8 features - 3 classes - 0 missing values
* Title: User Knowledge Modeling Data Set * Abstract: This is a real dataset about students' knowledge status on the subject of Electrical DC Machines. The dataset had been obtained from Ph.D.…
153 runs - 1 likes - 8 downloads - 9 reach - 13 impact
403 instances - 6 features - 5 classes - 0 missing values
Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/
153 runs - 0 likes - 8 downloads - 8 reach - 14 impact
81 instances - 12 features - 3 classes - 0 missing values
* Dataset Title: Vertebra Column - 3 classes * Abstract: Data set containing values for six biomechanical features used to classify orthopaedic patients into 3 classes (normal, disk hernia or…
154 runs - 0 likes - 5 downloads - 5 reach - 13 impact
310 instances - 7 features - 3 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
154 runs - 0 likes - 9 downloads - 9 reach - 15 impact
2001 instances - 2 features - 2 classes - 0 missing values
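The mean-threshold binarization described in the entry above might look like the following. This is only a sketch under stated assumptions: the description is truncated before it says which side of the mean receives which label, so the function name `binarize_by_mean`, the default labels, and the choice to send values equal to the mean to the "above" side are all hypothetical.

```python
from statistics import mean


def binarize_by_mean(values, below_label="P", above_label="N"):
    """Turn a numeric target into a two-class nominal target around its mean.

    Which side maps to 'P' or 'N' is an assumption (the source is truncated),
    so both labels are exposed as parameters.
    """
    threshold = mean(values)
    # Values strictly below the mean get `below_label`; the rest get `above_label`.
    return [below_label if v < threshold else above_label for v in values]


if __name__ == "__main__":
    target = [1.0, 2.0, 3.0, 10.0]  # mean is 4.0
    print(binarize_by_mean(target))
```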
* Title: South Africa Heart Disease Dataset * Description: A retrospective sample of males in a heart-disease high-risk region of the Western Cape, South Africa. There are roughly two controls per case…
155 runs - 0 likes - 14 downloads - 14 reach - 14 impact
462 instances - 10 features - 2 classes - 0 missing values
No data.
159 runs - 0 likes - 11 downloads - 11 reach - 22 impact
1657 instances - 3759 features - 25 classes - 0 missing values
* Donor: David W. Aha (aha '@' ics.uci.edu) (714) 856-8779 * Data Set Information: This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In…
159 runs - 1 likes - 5 downloads - 6 reach - 13 impact
200 instances - 14 features - 5 classes - 0 missing values
Cholesterol treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
160 runs - 0 likes - 4 downloads - 4 reach - 9 impact
303 instances - 14 features - 0 classes - 6 missing values
0. airplane 1. automobile 2. bird 3. cat 4. deer 5. dog 6. frog 7. horse 8. ship 9. truck CIFAR-10 contains 6000 images per class. The original train-test split randomly divided these into 5000 train…
160 runs - 0 likes - 6 downloads - 6 reach - 21 impact
60000 instances - 3073 features - 10 classes - 0 missing values
* Title: Wholesale customers Data Set * Abstract: The data set refers to clients of a wholesale distributor. It includes the annual spending in monetary units (m.u.) on diverse product categories *…
161 runs - 0 likes - 11 downloads - 11 reach - 14 impact
440 instances - 9 features - 2 classes - 0 missing values
Dataset title: LSVT Voice Rehabilitation Data Set. Source: The dataset was created by Athanasios Tsanas (tsanasthanasis '@' gmail.com) of the University of Oxford. Abstract: 126 samples from 14…
162 runs - 0 likes - 5 downloads - 5 reach - 13 impact
126 instances - 311 features - 2 classes - 0 missing values
No data.
163 runs - 0 likes - 5 downloads - 5 reach - 9 impact
1000000 instances - 28 features - 2 classes - 0 missing values
No data.
163 runs - 0 likes - 13 downloads - 13 reach - 22 impact
1560 instances - 8461 features - 20 classes - 0 missing values
An artificial data set where instances belong to several clusters with a banana shape. There are two attributes At1 and At2 corresponding to the x and y axes, respectively. The class label (-1 and 1)…
163 runs - 2 likes - 17 downloads - 19 reach - 14 impact
5300 instances - 3 features - 2 classes - 0 missing values
The Committee on Statistical Graphics of the American Statistical Association (ASA) invites you to participate in its Second (1983) Exposition of Statistical Graphics Technology. The purposes of the…
164 runs - 0 likes - 4 downloads - 4 reach - 14 impact
406 instances - 8 features - 3 classes - 14 missing values
No data.
167 runs - 0 likes - 9 downloads - 9 reach - 12 impact
399940 instances - 1002 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
169 runs - 0 likes - 8 downloads - 8 reach - 16 impact
600 instances - 61 features - 2 classes - 0 missing values
* Donor: David W. Aha (aha '@' ics.uci.edu) (714) 856-8779 * Data Set Information: This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In…
170 runs - 0 likes - 9 downloads - 9 reach - 13 impact
123 instances - 13 features - 5 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
173 runs - 0 likes - 6 downloads - 6 reach - 24 impact
106 instances - 58 features - 2 classes - 0 missing values
Data file: This data from "Problem-Solving" on "backache in pregnancy" is in a somewhat different format from that listed in the book. Each integer is preceded by a space. This makes it easier to read.…
174 runs - 0 likes - 6 downloads - 6 reach - 15 impact
180 instances - 32 features - 2 classes - 0 missing values
A simple database containing 17 Boolean-valued attributes describing animals. The "type" attribute appears to be the class attribute. Notes: * I find it unusual that there are 2 instances of "frog"…
175 runs - 3 likes - 19 downloads - 22 reach - 9 impact
101 instances - 17 features - 7 classes - 0 missing values
* Abstract: A 3-class version of abalone dataset. * Sources: (a) Original owners of database: Marine Resources Division Marine Research Laboratories - Taroona Department of Primary Industry and…
176 runs - 0 likes - 4 downloads - 4 reach - 14 impact
4177 instances - 9 features - 3 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
176 runs - 0 likes - 7 downloads - 7 reach - 14 impact
101 instances - 17 features - 2 classes - 0 missing values
* Abstract: Oxford Parkinson's Disease Detection Dataset * Source: The dataset was created by Max Little of the University of Oxford, in collaboration with the National Centre for Voice and Speech,…
179 runs - 1 likes - 15 downloads - 16 reach - 15 impact
195 instances - 23 features - 2 classes - 0 missing values
Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/
180 runs - 0 likes - 5 downloads - 5 reach - 24 impact
294 instances - 11 features - 2 classes - 0 missing values
Mega watt
183 runs - 0 likes - 8 downloads - 8 reach - 15 impact
253 instances - 38 features - 2 classes - 0 missing values
Pizza cutter 3
188 runs - 0 likes - 7 downloads - 7 reach - 14 impact
1043 instances - 38 features - 2 classes - 0 missing values
* Title: seeds Data Set * Abstract: Measurements of geometrical properties of kernels belonging to three different varieties of wheat. A soft X-ray technique and GRAINS package were used to construct…
190 runs - 0 likes - 5 downloads - 5 reach - 13 impact
210 instances - 8 features - 3 classes - 0 missing values
The first 5 variables are all blood tests which are thought to be sensitive to liver disorders that might arise from excessive alcohol consumption. Each line in the dataset constitutes the record of a…
191 runs - 2 likes - 30 downloads - 32 reach - 11 impact
345 instances - 6 features - 0 classes - 0 missing values
No data.
194 runs - 0 likes - 3 downloads - 3 reach - 12 impact
1000000 instances - 65 features - 10 classes - 0 missing values
Pizza cutter
197 runs - 0 likes - 8 downloads - 8 reach - 14 impact
661 instances - 38 features - 2 classes - 0 missing values
No data.
203 runs - 0 likes - 5 downloads - 5 reach - 21 impact
878 instances - 7455 features - 10 classes - 0 missing values
Oil dataset Past Usage: 1. Kubat, M., Holte, R.,
204 runs - 3 likes - 19 downloads - 22 reach - 25 impact
937 instances - 50 features - 2 classes - 0 missing values
No data.
206 runs - 0 likes - 3 downloads - 3 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
211 runs - 0 likes - 3 downloads - 3 reach - 12 impact
1000000 instances - 20 features - 7 classes - 0 missing values
No data.
211 runs - 0 likes - 4 downloads - 4 reach - 21 impact
313 instances - 5805 features - 8 classes - 0 missing values
No data.
215 runs - 0 likes - 7 downloads - 7 reach - 21 impact
204 instances - 5833 features - 6 classes - 0 missing values
No data.
216 runs - 0 likes - 12 downloads - 12 reach - 63 impact
11162 instances - 11466 features - 10 classes - 0 missing values
Predicting forest cover type from cartographic variables only (no remotely sensed data). The actual forest cover type for a given observation (30 x 30 meter cell) was determined from US Forest Service…
216 runs - 0 likes - 11 downloads - 11 reach - 12 impact
110393 instances - 55 features - 7 classes - 0 missing values
Mammography dataset Past Usage: 1. Woods, K., Doss, C., Bowyer, K., Solka, J., Priebe, C.,
218 runs - 5 likes - 48 downloads - 53 reach - 25 impact
11183 instances - 7 features - 2 classes - 0 missing values
No data.
219 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 58 features - 2 classes - 0 missing values
No data.
219 runs - 0 likes - 5 downloads - 5 reach - 21 impact
414 instances - 6430 features - 9 classes - 0 missing values
No data.
220 runs - 0 likes - 7 downloads - 7 reach - 21 impact
336 instances - 7903 features - 6 classes - 0 missing values
No data.
222 runs - 0 likes - 11 downloads - 11 reach - 15 impact
1504 instances - 2887 features - 13 classes - 0 missing values
Datasets from ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php) KDD Cup 2009 http://www.kddcup-orange.com Converted to ARFF format by TunedIT Customer Relationship Management (CRM) is a key element…
223 runs - 0 likes - 18 downloads - 18 reach - 18 impact
50000 instances - 231 features - 2 classes - 8024152 missing values
No data.
225 runs - 0 likes - 7 downloads - 7 reach - 12 impact
1000000 instances - 21 features - 2 classes - 0 missing values
No data.
230 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 35 features - 2 classes - 0 missing values
Donor: Will Taylor (taylor@pluto.arc.nasa.gov) Database of surgeries on horses. Possible class attributes: 24 (whether lesion is surgical), others include: 23, 25, 26, and 27 Notes: * Hospital_Number…
236 runs - 0 likes - 9 downloads - 9 reach - 9 impact
368 instances - 27 features - 2 classes - 1927 missing values
No data.
253 runs - 0 likes - 9 downloads - 9 reach - 9 impact
1076790 instances - 30 features - 2 classes - 7275 missing values
No data.
264 runs - 0 likes - 11 downloads - 11 reach - 47 impact
3204 instances - 13196 features - 6 classes - 0 missing values
No data.
268 runs - 0 likes - 9 downloads - 9 reach - 47 impact
3075 instances - 12433 features - 6 classes - 0 missing values
* Dataset Title: Volcanoes on Venus - JARtool experiment Data Set Experiment: A1 * Source: Michael C. Burl MS 126-347, JPL 4800 Oak Grove Drive Pasadena, CA 91109 (818) 393-5345 Michael.C.Burl '@'…
273 runs - 0 likes - 4 downloads - 4 reach - 14 impact
3252 instances - 4 features - 5 classes - 0 missing values
* Source: JP Marques de Sá, INEB-Instituto de Engenharia Biomédica, Porto, Portugal; e-mail: jpmdesa '@' gmail.com J Jossinet, inserm, Lyon, France * Data Set Information: Impedance measurements…
280 runs - 0 likes - 5 downloads - 5 reach - 13 impact
106 instances - 10 features - 6 classes - 0 missing values
No data.
283 runs - 0 likes - 7 downloads - 7 reach - 23 impact
96 instances - 4027 features - 11 classes - 19667 missing values
No data.
288 runs - 0 likes - 2 downloads - 2 reach - 10 impact
1000000 instances - 15 features - 9 classes - 0 missing values
No data.
290 runs - 0 likes - 5 downloads - 5 reach - 12 impact
1000000 instances - 77 features - 10 classes - 0 missing values
No data.
291 runs - 0 likes - 4 downloads - 4 reach - 9 impact
1000000 instances - 17 features - 7 classes - 0 missing values
Airlines Dataset. Inspired by the regression dataset from Elena Ikonomovska. The task is to predict whether a given flight will be delayed, given the information of the scheduled departure.
291 runs - 0 likes - 31 downloads - 31 reach - 16 impact
539383 instances - 8 features - 2 classes - 0 missing values
No data.
292 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 37 features - 6 classes - 0 missing values
No data.
293 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 17 features - 10 classes - 0 missing values
No data.
296 runs - 0 likes - 7 downloads - 7 reach - 9 impact
1000000 instances - 61 features - 2 classes - 0 missing values
No data.
296 runs - 0 likes - 6 downloads - 6 reach - 23 impact
96 instances - 4027 features - 9 classes - 19667 missing values
No data.
298 runs - 0 likes - 3 downloads - 3 reach - 12 impact
1000000 instances - 11 features - 5 classes - 0 missing values
A 4-class version of breast-tissue dataset.
299 runs - 0 likes - 4 downloads - 4 reach - 13 impact
106 instances - 10 features - 4 classes - 0 missing values
No data.
304 runs - 0 likes - 7 downloads - 7 reach - 12 impact
1000000 instances - 25 features - 10 classes - 0 missing values
No data.
304 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
305 runs - 0 likes - 3 downloads - 3 reach - 12 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
305 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
306 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
306 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 13 features - 6 classes - 0 missing values
No data.
307 runs - 0 likes - 5 downloads - 5 reach - 12 impact
1000000 instances - 4 features - 2 classes - 0 missing values
No data.
307 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
307 runs - 0 likes - 3 downloads - 3 reach - 12 impact
1000000 instances - 41 features - 3 classes - 0 missing values
No data.
308 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 11 features - 5 classes - 0 missing values
Normalized form of codrna (351) Andrew V Uzilov, Joshua M Keegan, and David H Mathews. Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC…
309 runs - 0 likes - 5 downloads - 5 reach - 9 impact
488565 instances - 9 features - 2 classes - 0 missing values
No data.
309 runs - 0 likes - 3 downloads - 3 reach - 12 impact
1000000 instances - 11 features - 5 classes - 0 missing values
No data.
309 runs - 0 likes - 6 downloads - 6 reach - 12 impact
1000000 instances - 35 features - 6 classes - 0 missing values
No data.
310 runs - 0 likes - 2 downloads - 2 reach - 9 impact
1000000 instances - 14 features - 5 classes - 0 missing values
No data.
310 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
310 runs - 0 likes - 5 downloads - 5 reach - 12 impact
1000000 instances - 11 features - 2 classes - 0 missing values
No data.
311 runs - 0 likes - 5 downloads - 5 reach - 12 impact
1000000 instances - 10 features - 2 classes - 0 missing values
No data.
311 runs - 0 likes - 3 downloads - 3 reach - 12 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
312 runs - 2 likes - 8 downloads - 10 reach - 13 impact
1000000 instances - 14 features - 3 classes - 0 missing values
No data.
313 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 23 features - 2 classes - 0 missing values
This data is derived from the 2012 KDD Cup. The data is subsampled to 1% of the original number of instances, downsampling the majority class (click=0) so that the target feature is reasonably…
313 runs - 0 likes - 36 downloads - 36 reach - 15 impact
399482 instances - 12 features - 2 classes - 0 missing values
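The subsampling described in the entry above (shrinking the data to a fraction of its size while downsampling the majority class) can be illustrated as follows. The exact procedure used for this dataset is not given in the listing, so this is only a sketch: the function name, the `minority_share` parameter, and its default value are hypothetical, and only the 1% fraction and the `click=0` majority class come from the description.

```python
import random


def subsample_with_majority_downsampling(
    rows, target_key="click", majority_value=0,
    fraction=0.01, minority_share=0.2, seed=0,
):
    """Keep about `fraction` of the rows, under-sampling the majority class.

    `minority_share` (assumed, not from the source) is the desired share of
    minority rows in the result.
    """
    rng = random.Random(seed)
    target_size = max(1, round(len(rows) * fraction))
    majority = [r for r in rows if r[target_key] == majority_value]
    minority = [r for r in rows if r[target_key] != majority_value]
    # Take as many minority rows as the desired share allows, then fill the
    # remainder of the budget from the majority class.
    n_minority = min(len(minority), round(target_size * minority_share))
    n_majority = min(len(majority), target_size - n_minority)
    kept = rng.sample(minority, n_minority) + rng.sample(majority, n_majority)
    rng.shuffle(kept)
    return kept


if __name__ == "__main__":
    # Toy data: 5% clicks, 95% non-clicks.
    data = [{"click": 1} for _ in range(50)] + [{"click": 0} for _ in range(950)]
    sample = subsample_with_majority_downsampling(data, fraction=0.1, minority_share=0.3)
    print(len(sample), sum(r["click"] for r in sample))
```

Raising `minority_share` above the minority's natural frequency is what makes the resulting target more balanced than in the full data.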