OpenML
Filter results by:
Donor: David W. Aha (aha@ics.uci.edu) This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one…
37 runs0 likes5 downloads5 reach9 impact
303 instances - 14 features - 0 classes - 6 missing values
No data.
326 runs0 likes4 downloads4 reach11 impact
1000000 instances - 14 features - 2 classes - 0 missing values
No data.
312 runs1 likes5 downloads6 reach12 impact
1000000 instances - 14 features - 3 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach9 impact
1000000 instances - 14 features - 0 classes - 0 missing values
This database contains 13 attributes (which have been extracted from a larger set of 75) Attribute Information: ------------------------ -- 1. age -- 2. sex -- 3. chest pain type (4 values) -- 4.…
3214 runs0 likes19 downloads19 reach11 impact
270 instances - 14 features - 2 classes - 0 missing values
1. Title of Database: Wine recognition data Updated Sept 21, 1998 by C.Blake : Added attribute information 2. Sources: (a) Forina, M. et al, PARVUS - An Extendible Package for Data Exploration,…
1187 runs1 likes20 downloads21 reach11 impact
178 instances - 14 features - 3 classes - 0 missing values
Publication Request: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This file describes the contents of the heart-disease directory. This directory contains 4 databases…
1789 runs0 likes12 downloads12 reach9 impact
294 instances - 14 features - 2 classes - 782 missing values
Publication Request: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This file describes the contents of the heart-disease directory. This directory contains 4 databases…
1763 runs0 likes10 downloads10 reach10 impact
303 instances - 14 features - 2 classes - 7 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
714 runs0 likes4 downloads4 reach15 impact
303 instances - 14 features - 2 classes - 6 missing values
No data.
0 runs0 likes1 downloads1 reach9 impact
1000000 instances - 14 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
737 runs0 likes5 downloads5 reach15 impact
303 instances - 14 features - 2 classes - 6 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
762 runs0 likes9 downloads9 reach15 impact
315 instances - 14 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
700 runs0 likes4 downloads4 reach15 impact
294 instances - 14 features - 2 classes - 782 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
734 runs0 likes10 downloads10 reach15 impact
506 instances - 14 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
688 runs0 likes4 downloads4 reach14 impact
294 instances - 14 features - 2 classes - 782 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
717 runs0 likes5 downloads5 reach15 impact
303 instances - 14 features - 2 classes - 7 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
780 runs0 likes9 downloads9 reach15 impact
178 instances - 14 features - 2 classes - 0 missing values
Prediction task is to determine whether a person makes over 50K a year. Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the…
14257 runs1 likes25 downloads26 reach37 impact
48842 instances - 15 features - 2 classes - 6465 missing values
* Title of Database: Spoken Arabic Digit * Abstract: This dataset contains time series of mel-frequency cepstrum coefficients (MFCCs) corresponding to spoken Arabic digits. Includes data from 44 males…
1 runs0 likes8 downloads8 reach15 impact
263256 instances - 15 features - 10 classes - 0 missing values
All data is from one continuous EEG measurement with the Emotiv EEG Neuroheadset. The duration of the measurement was 117 seconds. The eye state was detected via a camera during the EEG measurement…
165839 runs3 likes94 downloads97 reach29 impact
14980 instances - 15 features - 2 classes - 0 missing values
No data.
51 runs0 likes2 downloads2 reach9 impact
1000000 instances - 15 features - 2 classes - 0 missing values
In the early 2000s, Billy Beane and Paul DePodesta worked for the Oakland Athletics. While there, they literally changed the game of baseball. They didn't do it using a bat or glove, and they…
0 runs0 likes7 downloads7 reach12 impact
1232 instances - 15 features - 0 classes - 3600 missing values
This dataset was retrieved 2014-11-14 from the UCI site and converted to the ARFF format. __Major changes w.r.t. version 3: dataset from UCI that matches description and data types__ ### Feature…
4202 runs0 likes6 downloads6 reach14 impact
690 instances - 15 features - 2 classes - 0 missing values
Zurich public transport delay data 2016-10-30 03:30:00 CET - 2016-11-27 01:20:00 CET cleaned and prepared at Open Data Day 2017.
0 runs0 likes2 downloads2 reach12 impact
5465575 instances - 15 features - 0 classes - 132617 missing values
hmeq_p,BAD,binary
0 runs0 likes0 downloads0 reach8 impact
5960 instances - 15 features - classes - 5271 missing values
Short Summary: Lists estimates of the percentage of body fat determined by underwater weighing and various body circumference measurements for 252 men. Classroom use of this data set: This data set…
25 runs0 likes5 downloads5 reach19 impact
252 instances - 15 features - 0 classes - 0 missing values
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 1. Title: Assessing the Reliability of a Human Estimator…
0 runs0 likes0 downloads0 reach13 impact
75 instances - 15 features - 0 classes - 0 missing values
analysis of stocks
0 runs0 likes0 downloads0 reach8 impact
245 instances - 15 features - classes - 0 missing values
Trip Record Data provided by the New York City Taxi and Limousine Commission (TLC) [http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml]. The dataset includes TLC trips of the green line in…
0 runs0 likes0 downloads0 reach9 impact
581835 instances - 15 features - 0 classes - 0 missing values
Dataset sales
0 runs0 likes0 downloads0 reach10 impact
10738 instances - 15 features - 0 classes - 0 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch) Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php Modified by TunedIT (converted to ARFF…
778 runs0 likes8 downloads8 reach16 impact
4562 instances - 15 features - 2 classes - 88 missing values
Experiment data obtained by running random configurations of an SVM through mlr on 106 different classification tasks from openml.
0 runs0 likes0 downloads0 reach7 impact
540576 instances - 15 features - classes - 658962 missing values
wind daily average wind speeds for 1961-1978 at 12 synoptic meteorological stations in the Republic of Ireland (Haslett and raftery 1989). These data were analyzed in detail in the following article:…
0 runs0 likes6 downloads6 reach13 impact
6574 instances - 15 features - 0 classes - 0 missing values
No data.
44 runs0 likes3 downloads3 reach11 impact
1000000 instances - 15 features - 2 classes - 0 missing values
No data.
288 runs0 likes2 downloads2 reach10 impact
1000000 instances - 15 features - 9 classes - 0 missing values
This dataset records 640 time series of 12 LPC cepstrum coefficients taken from nine male speakers. The data was collected for examining our newly developed classifier for multidimensional curves…
23157 runs0 likes12 downloads12 reach54 impact
9961 instances - 15 features - 9 classes - 0 missing values
Schizophrenic Eye-Tracking Data in Rubin and Wu (1997) Biometrics. Yingnian Wu (wu@hustat.harvard.edu) [14/Oct/97] Information about the dataset CLASSTYPE: nominal CLASSINDEX: last
748 runs0 likes7 downloads7 reach23 impact
340 instances - 15 features - 2 classes - 834 missing values
Prediction task is to determine whether a person makes over 50K a year. Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the…
2671 runs1 likes32 downloads33 reach11 impact
48842 instances - 15 features - 2 classes - 6465 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
769 runs0 likes12 downloads12 reach15 impact
252 instances - 15 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
102 runs0 likes4 downloads4 reach14 impact
67 instances - 15 features - 2 classes - 0 missing values
The AAUP dataset for the ASA Statistical Graphics Section's 1995 Data Analysis Exposition contains information on faculty salaries for 1161 American colleges and universities. The data may be obtained…
32 runs0 likes3 downloads3 reach14 impact
1161 instances - 15 features - 4 classes - 256 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
760 runs0 likes13 downloads13 reach15 impact
6574 instances - 15 features - 2 classes - 0 missing values
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
485 runs0 likes5 downloads5 reach13 impact
76 instances - 15 features - 7 classes - 37 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
857 runs0 likes13 downloads13 reach17 impact
9961 instances - 15 features - 2 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs0 likes0 downloads0 reach13 impact
67 instances - 16 features - 0 classes - 0 missing values
Abstract: This dataset consists in a collection of shape and texture features extracted from digital images of leaf specimens originating from a total of 40 different plant species. Source: This…
112 runs0 likes10 downloads10 reach13 impact
340 instances - 16 features - 30 classes - 0 missing values
No data.
73 runs0 likes5 downloads5 reach11 impact
1000000 instances - 16 features - 2 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach12 impact
1000000 instances - 16 features - 0 classes - 0 missing values
test
0 runs0 likes0 downloads0 reach3 impact
20058 instances - 16 features - classes - 0 missing values
No data.
326 runs0 likes4 downloads4 reach11 impact
1000000 instances - 16 features - 2 classes - 0 missing values
Graeme D. Hutcheson and Nick Sofroniou 1999 The Multivariate Social Scientist: Introductory Statistics Using Generalized Linear Models. SAGE Publications. Copyright: Graeme D. Hutcheson & Nick…
0 runs0 likes0 downloads0 reach13 impact
42 instances - 16 features - 0 classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! All nominal attributes and instances with missing values are deleted. Price treated as the class attribute. As used by…
2 runs0 likes0 downloads0 reach12 impact
159 instances - 16 features - 0 classes - 0 missing values
This file contains the Economic data information of USA from 01/04/1980 to 02/04/2000 on a weekly basis. From given features, the goal is to predict 1 Month CD Rate. 1. 1Y-CMaturityRate real [77.055,…
0 runs0 likes0 downloads0 reach8 impact
1049 instances - 16 features - 0 classes - 0 missing values
Datasets of Data And Story Library, project illustrating use of basic statistic methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
2 runs0 likes0 downloads0 reach14 impact
59 instances - 16 features - 0 classes - 0 missing values
Experiment data obtained by running random configurations of ranger through mlr on 119 different classification tasks from openml.
0 runs0 likes0 downloads0 reach7 impact
278863 instances - 16 features - classes - 138965 missing values
This is the pollution data so loved by writers of papers on ridge regression. Source: McDonald, G.C. and Schwing, R.C. (1973) 'Instabilities of regression estimates relating air pollution to…
0 runs0 likes1 downloads1 reach13 impact
60 instances - 16 features - 0 classes - 0 missing values
This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics; (b) its assigned insurance risk rating,; (c) its normalized losses in use as…
11 runs1 likes4 downloads5 reach10 impact
159 instances - 16 features - 0 classes - 0 missing values
One of the data sets used in the book "Analyzing Categorical Data" by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on statistical…
0 runs0 likes1 downloads1 reach11 impact
31 instances - 16 features - classes - 150 missing values
County data from the 2000 Presidential Election in Florida. Compiled by Brett Presnell Department of Statistics, University of Florida These data are derived from three sources, described below. As…
32 runs0 likes4 downloads4 reach14 impact
67 instances - 16 features - 5 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
100 runs0 likes3 downloads3 reach14 impact
31 instances - 16 features - 2 classes - 150 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
790 runs0 likes10 downloads10 reach15 impact
159 instances - 16 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
718 runs0 likes6 downloads6 reach15 impact
159 instances - 16 features - 2 classes - 0 missing values
No data.
0 runs0 likes1 downloads1 reach9 impact
1000000 instances - 16 features - 0 classes - 0 missing values
This file concerns credit card applications. All attribute names and values have been changed to meaningless symbols to protect the confidentiality of the data. This dataset is interesting because…
25075 runs1 likes33 downloads34 reach11 impact
690 instances - 16 features - 2 classes - 67 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
114 runs0 likes5 downloads5 reach14 impact
42 instances - 16 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
101 runs0 likes5 downloads5 reach15 impact
1161 instances - 16 features - 2 classes - 256 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
754 runs0 likes10 downloads10 reach14 impact
60 instances - 16 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
700 runs0 likes5 downloads5 reach14 impact
67 instances - 16 features - 2 classes - 0 missing values
Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/
68 runs0 likes7 downloads7 reach24 impact
32561 instances - 16 features - 2 classes - 4262 missing values
* Dataset: Reduced version (10 % of the examples) of bank-marketing dataset.
1254 runs1 likes17 downloads18 reach15 impact
4521 instances - 17 features - 2 classes - 0 missing values
No data.
68 runs0 likes4 downloads4 reach9 impact
20000 instances - 17 features - 3 classes - 10000 missing values
Abstract: The data set is composed of 60 chorales (5665 events) by J.S. Bach (1675-1750). Each event of each chorale is labelled using 1 among 101 chord labels and described through 14 features.…
31 runs0 likes2 downloads2 reach13 impact
5665 instances - 17 features - 102 classes - 0 missing values
The data was collected retrospectively at Wroclaw Thoracic Surgery Centre for patients who underwent major lung resections for primary lung cancer in the years 2007 - 2011. The Centre is associated…
31 runs0 likes5 downloads5 reach12 impact
470 instances - 17 features - 2 classes - 0 missing values
#study_1
0 runs0 likes0 downloads0 reach10 impact
944 instances - 17 features - classes - 0 missing values
uci adult partitioned
0 runs0 likes0 downloads0 reach8 impact
48844 instances - 17 features - classes - 6495 missing values
No data.
332 runs0 likes4 downloads4 reach11 impact
1000000 instances - 17 features - 2 classes - 0 missing values
No data.
60 runs0 likes2 downloads2 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
This dataset contains all Premier League matches, with player statistic take from Sofifa, from 2008 to 2016
0 runs0 likes0 downloads0 reach8 impact
2961 instances - 17 features - classes - 0 missing values
No data.
67 runs0 likes2 downloads2 reach11 impact
1000000 instances - 17 features - 10 classes - 0 missing values
No data.
311 runs0 likes3 downloads3 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach9 impact
1000000 instances - 17 features - classes - 0 missing values
No data.
356 runs0 likes8 downloads8 reach9 impact
131072 instances - 17 features - 2 classes - 0 missing values
No data.
71 runs0 likes5 downloads5 reach11 impact
1000000 instances - 17 features - 2 classes - 0 missing values
No data.
291 runs0 likes4 downloads4 reach9 impact
1000000 instances - 17 features - 7 classes - 0 missing values
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
519 runs0 likes7 downloads7 reach14 impact
203 instances - 17 features - 11 classes - 0 missing values
Identify jets of particles from the LHC, created for the study of ultra low latency inference with hls4ml. Use 16 high level features to identify the 5 jet classes: quark (q), gluon (g), W boson (w),…
0 runs0 likes0 downloads0 reach6 impact
830000 instances - 17 features - 5 classes - 0 missing values
No data.
293 runs0 likes2 downloads2 reach11 impact
1000000 instances - 17 features - 10 classes - 0 missing values
This database was designed on the basis of data provided by US Census Bureau [http://www.census.gov] (under Lookup Access [http://www.census.gov/cdrom/lookup]: Summary Tape File 1). The data were…
0 runs1 likes6 downloads7 reach14 impact
22784 instances - 17 features - 0 classes - 0 missing values
No data.
30 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
28 runs0 likes2 downloads2 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
32 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
34 runs0 likes2 downloads2 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
30 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
31 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
%-*- text -*- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage…
2 runs0 likes2 downloads2 reach13 impact
60 instances - 17 features - 0 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach11 impact
1000000 instances - 17 features - 26 classes - 0 missing values
* Title: Thoracic Surgery Data Data Set * Abstract: The data is dedicated to classification problem related to the post-operative life expectancy in the lung cancer patients: class 1 - death within…
145 runs0 likes8 downloads8 reach14 impact
470 instances - 17 features - 2 classes - 0 missing values
1. Title: 1984 United States Congressional Voting Records Database 2. Source Information: (a) Source: Congressional Quarterly Almanac, 98th Congress, 2nd session 1984, Volume XL: Congressional…
2262 runs0 likes17 downloads17 reach9 impact
435 instances - 17 features - 2 classes - 392 missing values