OpenML
UCI Thyroid allbp dataset.
97 runs - 0 likes - 10 downloads - 10 reach - 14 impact
2800 instances - 27 features - 5 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files: This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
92 runs - 0 likes - 6 downloads - 6 reach - 14 impact
2800 instances - 27 features - 5 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files: This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
32 runs - 0 likes - 8 downloads - 8 reach - 13 impact
2800 instances - 27 features - 5 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files: This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
31 runs - 1 likes - 9 downloads - 10 reach - 13 impact
2800 instances - 27 features - 5 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files: This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
31 runs - 1 likes - 10 downloads - 11 reach - 13 impact
2800 instances - 27 features - 5 classes - 0 missing values
These weekly averages are ultimately based on measurements of 4 air samples per hour taken atop intake lines on several towers during steady periods of CO2 concentration of not less than 6 hours per…
0 runs - 1 likes - 1 downloads - 2 reach - 9 impact
2225 instances - 7 features - 0 classes - 0 missing values
No data.
311 runs - 0 likes - 5 downloads - 5 reach - 11 impact
1000000 instances - 10 features - 2 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
1000 instances - 6 features - 0 classes - 0 missing values
No data.
7 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
250 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
100 instances - 6 features - 0 classes - 0 missing values
No data.
7 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
250 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
500 instances - 6 features - 0 classes - 0 missing values
No data.
6 runs - 0 likes - 3 downloads - 3 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
100 instances - 6 features - 0 classes - 0 missing values
No data.
7 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
500 instances - 6 features - 0 classes - 0 missing values
No data.
29 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
9 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
100 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
500 instances - 6 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting. The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
1000 instances - 6 features - 0 classes - 0 missing values
No data.
10 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 39 features - 6 classes - 0 missing values
The original Titanic dataset, describing the survival status of individual passengers on the Titanic. The Titanic data does not contain information from the crew, but it does contain actual ages of…
0 runs - 2 likes - 30 downloads - 32 reach - 12 impact
1309 instances - 14 features - 2 classes - 3855 missing values
SK daily COVID19
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
280 instances - 7 features - classes - 0 missing values
Customer purchases on Black Friday
0 runs - 0 likes - 1 downloads - 1 reach - 12 impact
166821 instances - 10 features - 0 classes - 0 missing values
Prediction task is to determine whether a person makes over 50K a year. Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the…
14257 runs - 1 likes - 25 downloads - 26 reach - 37 impact
48842 instances - 15 features - 2 classes - 6465 missing values
This dataset was retrieved 2014-11-14 from the UCI site and converted to the ARFF format. __Major changes w.r.t. version 3: dataset from UCI that matches description and data types__ ### Feature…
4202 runs - 0 likes - 6 downloads - 6 reach - 14 impact
690 instances - 15 features - 2 classes - 0 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
6890 runs - 0 likes - 4 downloads - 4 reach - 17 impact
44819 instances - 7 features - 3 classes - 0 missing values
The original Annealing dataset from UCI. The exact meaning of the features and classes is largely unknown. Annealing, in metallurgy and materials science, is a heat treatment that alters the physical…
13774 runs - 0 likes - 16 downloads - 16 reach - 12 impact
898 instances - 39 features - 5 classes - 22175 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
35343 runs - 0 likes - 17 downloads - 17 reach - 11 impact
2000 instances - 7 features - 10 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
780 runs - 0 likes - 10 downloads - 10 reach - 15 impact
625 instances - 7 features - 2 classes - 0 missing values
Mammography dataset Past Usage: 1. Woods, K., Doss, C., Bowyer, K., Solka, J., Priebe, C.,
218 runs - 5 likes - 48 downloads - 53 reach - 24 impact
11183 instances - 7 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
856 runs - 0 likes - 11 downloads - 11 reach - 15 impact
209 instances - 7 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
714 runs - 0 likes - 4 downloads - 4 reach - 15 impact
303 instances - 14 features - 2 classes - 6 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
740 runs - 0 likes - 5 downloads - 5 reach - 14 impact
51 instances - 7 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
800 runs - 0 likes - 10 downloads - 10 reach - 15 impact
209 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
737 runs - 0 likes - 5 downloads - 5 reach - 15 impact
303 instances - 14 features - 2 classes - 6 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
700 runs - 0 likes - 4 downloads - 4 reach - 15 impact
294 instances - 14 features - 2 classes - 782 missing values
Publication Request: This file describes the contents of the heart-disease directory. This directory contains 4 databases…
1789 runs - 0 likes - 12 downloads - 12 reach - 9 impact
294 instances - 14 features - 2 classes - 782 missing values
Publication Request: This file describes the contents of the heart-disease directory. This directory contains 4 databases…
1763 runs - 0 likes - 10 downloads - 10 reach - 10 impact
303 instances - 14 features - 2 classes - 7 missing values
This file concerns credit card applications. All attribute names and values have been changed to meaningless symbols to protect the confidentiality of the data. This dataset is interesting because…
25075 runs - 1 likes - 33 downloads - 34 reach - 11 impact
690 instances - 16 features - 2 classes - 67 missing values
1. Title: Hepatitis Domain 2. Sources: (a) unknown (b) Donor: G.Gong (Carnegie-Mellon University) via Bojan Cestnik Jozef Stefan Institute Jamova 39 61000 Ljubljana Yugoslavia (tel.: (38)(+61) 214-399…
2134 runs - 1 likes - 12 downloads - 13 reach - 9 impact
155 instances - 20 features - 2 classes - 167 missing values
analcatdata: A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
103 runs - 0 likes - 4 downloads - 4 reach - 14 impact
92 instances - 10 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
686 runs - 0 likes - 5 downloads - 5 reach - 15 impact
782 instances - 9 features - 2 classes - 466 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
745 runs - 0 likes - 11 downloads - 11 reach - 15 impact
3107 instances - 7 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
774 runs - 0 likes - 14 downloads - 14 reach - 15 impact
9517 instances - 7 features - 2 classes - 0 missing values
The Committee on Statistical Graphics of the American Statistical Association (ASA) invites you to participate in its Second (1983) Exposition of Statistical Graphics Technology. The purposes of the…
164 runs - 0 likes - 3 downloads - 3 reach - 14 impact
406 instances - 8 features - 3 classes - 14 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
721 runs - 0 likes - 5 downloads - 5 reach - 14 impact
60 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
120 runs - 0 likes - 5 downloads - 5 reach - 14 impact
50 instances - 7 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
817 runs - 0 likes - 8 downloads - 8 reach - 15 impact
400 instances - 7 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
712 runs - 0 likes - 8 downloads - 8 reach - 15 impact
898 instances - 39 features - 2 classes - 22175 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
717 runs - 0 likes - 5 downloads - 5 reach - 15 impact
303 instances - 14 features - 2 classes - 7 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
744 runs - 0 likes - 5 downloads - 5 reach - 14 impact
130 instances - 10 features - 2 classes - 97 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
718 runs - 0 likes - 6 downloads - 6 reach - 15 impact
406 instances - 9 features - 2 classes - 14 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
688 runs - 0 likes - 4 downloads - 4 reach - 14 impact
294 instances - 14 features - 2 classes - 782 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
773 runs - 0 likes - 9 downloads - 9 reach - 15 impact
2000 instances - 7 features - 2 classes - 0 missing values
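The two binarization schemes named in the "Binarized version" entries above can be sketched roughly as follows. This is an illustrative sketch, not OpenML's actual conversion code: the function names are hypothetical, the negative label 'N' is an assumption, and because the listing truncates the mean-based rule, which side of the mean counts as positive is left as a parameter rather than stated as fact.

```python
from collections import Counter
from statistics import mean


def binarize_by_mean(targets, positive="P", negative="N", positive_below_mean=True):
    """Split numeric targets into two nominal labels at their mean.

    Assumption: the source text is cut off before it says which side of the
    mean is 'positive', so the caller chooses via positive_below_mean.
    Values exactly equal to the mean are treated as not below it.
    """
    threshold = mean(targets)
    labels = []
    for value in targets:
        below = value < threshold
        labels.append(positive if below == positive_below_mean else negative)
    return labels


def binarize_by_majority(targets, positive="P", negative="N"):
    """Relabel the most frequent class as positive and all other classes as negative.

    Assumption: the 'N' label for the remaining classes is hypothetical.
    """
    majority_class, _ = Counter(targets).most_common(1)[0]
    return [positive if t == majority_class else negative for t in targets]


# Toy data, invented for illustration only.
numeric_labels = binarize_by_mean([1.0, 2.0, 3.0, 10.0])
nominal_labels = binarize_by_majority(["a", "b", "a", "c"])
print(numeric_labels)
print(nominal_labels)
```

Either function returns a list of labels in the same order as the input, so the result can replace the original target column directly.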
Datasets of the Data And Story Library, a project illustrating the use of basic statistical methods, converted to ARFF format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
150 instances - 5 features - 0 classes - 0 missing values
* Title: User Knowledge Modeling Data Set * Abstract: It is the real dataset about the students' knowledge status about the subject of Electrical DC Machines. The dataset had been obtained from Ph.D.…
153 runs - 1 likes - 8 downloads - 9 reach - 13 impact
403 instances - 6 features - 5 classes - 0 missing values
libSVM, AAD group. A practical guide to support vector classification. Technical report, Department of Computer Science, National Taiwan University, 2003. #Dataset from the LIBSVM data repository…
0 runs - 0 likes - 0 downloads - 0 reach - 16 impact
7089 instances - 5 features - 0 classes - 0 missing values
simple engine data
52 runs - 0 likes - 6 downloads - 6 reach - 12 impact
383 instances - 6 features - 3 classes - 0 missing values
cleve-pmlb
32 runs - 0 likes - 1 downloads - 1 reach - 20 impact
303 instances - 14 features - 2 classes - 0 missing values
new-thyroid-pmlb
31 runs - 0 likes - 2 downloads - 2 reach - 20 impact
215 instances - 6 features - 3 classes - 0 missing values
b gtrg
0 runs - 0 likes - 0 downloads - 0 reach - 7 impact
4 instances - 7 features - classes - 0 missing values
General Description 2015-current: greater than $200.00. The Commission categorizes contributions from individuals using the calendar year-to-date amount for political action committee (PAC) and party…
0 runs - 0 likes - 2 downloads - 2 reach - 8 impact
3348209 instances - 21 features - 0 classes - 10786577 missing values
Estimated article influence scores in 2015
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
3615 instances - 7 features - 3169 classes - 48 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
8553 instances - 10 features - classes - 18454 missing values
Identifier attribute deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based…
2 runs - 0 likes - 2 downloads - 2 reach - 12 impact
398 instances - 8 features - 0 classes - 6 missing values
Normalized version of the pokerhand data set. Automated file upload of pokerhand-normalized.arff
314 runs - 0 likes - 12 downloads - 12 reach - 11 impact
829201 instances - 11 features - 10 classes - 0 missing values
This S dump contains 22 data sets from the book Visualizing Data published by Hobart Press (books@hobart.com). The dump was created by data.dump() and can be read back into S by data.restore(). The…
0 runs - 0 likes - 1 downloads - 1 reach - 15 impact
323 instances - 5 features - 0 classes - 0 missing values
A shar archive of data from the book Data Analysis: An Introduction (1992), Prentice Hall, by Jeff Witmer. Submitted by Jeff Witmer (fwitmer@ocvaxa.cc.oberlin.edu) [28/Jun/94] (29 kbytes) Note:…
2 runs - 0 likes - 0 downloads - 0 reach - 13 impact
50 instances - 5 features - 0 classes - 0 missing values
analcatdata: A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
53 runs - 0 likes - 2 downloads - 2 reach - 17 impact
92 instances - 6 features - 0 classes - 26 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Points scored per minute is being treated as…
2 runs - 0 likes - 0 downloads - 0 reach - 9 impact
96 instances - 5 features - 0 classes - 0 missing values
analcatdata: A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
60 instances - 11 features - 0 classes - 14 missing values
Contains 110 data sets from the book 'The Statistical Sleuth' by Fred Ramsey and Dan Schafer; Duxbury Press, 1997. (schafer@stat.orst.edu) [14/Oct/97] (172k) Note: description taken from this web…
2 runs - 0 likes - 0 downloads - 0 reach - 13 impact
93 instances - 7 features - 0 classes - 0 missing values
This analysis describes and summarizes the relationships between 1987 salaries of major league baseball players and the player's performance. The salary data were taken from Sports Illustrated, April…
0 runs - 1 likes - 1 downloads - 2 reach - 13 impact
26 instances - 8 features - 0 classes - 0 missing values
1. Title: Employee Selection (Ordinal ESL) 2. Source Information: Donor: Arie Ben David MIS, Dept. of Technology Management Holon Academic Inst. of Technology 52 Golomb St. Holon 58102 Israel…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
488 instances - 5 features - 0 classes - 0 missing values
1. Title: Employee Selection (Ordinal ESL) 2. Source Information: Donor: Arie Ben David MIS, Dept. of Technology Management Holon Academic Inst. of Technology 52 Golomb St. Holon 58102 Israel…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
488 instances - 5 features - 0 classes - 0 missing values
Electrical-Maintenance data set This problem consists of four input variables and the available data set is comprised of a representative number of well distributed examples. In this case, the…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1056 instances - 5 features - 0 classes - 0 missing values
This data set was originally a univariate time record of a single observed quantity, recorded from a Far-Infrared-Laser in a chaotic state. The original set of 1000 points has been adapted for regression…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
993 instances - 5 features - 0 classes - 0 missing values
Data has been taken from various sources, such as data.gov and other websites, and has been preprocessed for analysis.
0 runs - 0 likes - 0 downloads - 0 reach - 7 impact
204 instances - 5 features - classes - 0 missing values
analcatdata: A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
18 runs - 0 likes - 0 downloads - 0 reach - 14 impact
159 instances - 10 features - 0 classes - 6 missing values
This dataset is synthetic. It was generated by David Coleman at RCA Laboratories in Princeton, N.J. For convenience, we will refer to it as the POLLEN DATA. The first three variables are the lengths…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
3848 instances - 5 features - 0 classes - 0 missing values
newtest3
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
2 instances - 6 features - classes - 0 missing values
This is the same data as version 5 (OpenML ID = 1220) with '_id' features coded as nominal factor variables.
0 runs - 0 likes - 0 downloads - 0 reach - 10 impact
39948 instances - 12 features - 2 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Gasoline consumption is being treated as…
2 runs - 0 likes - 0 downloads - 0 reach - 9 impact
27 instances - 5 features - 0 classes - 0 missing values
DATA-SETS FROM DIGGLE, P.J. (1990). TIME SERIES : A BIOSTATISTICAL INTRODUCTION. Oxford University Press. Table: Table A1 Luteinizing hormone Information about the dataset CLASSTYPE: numeric…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
48 instances - 5 features - 0 classes - 0 missing values
Dataset Title: Localization Data for Person Activity Data Set Abstract: Data contains recordings of five people performing different activities. Each person wore four sensors (tags) while performing…
6 runs - 0 likes - 6 downloads - 6 reach - 15 impact
164860 instances - 8 features - 11 classes - 0 missing values
1. Title: Lecturers Evaluation (Ordinal LEV) 2. Source Information: Donor: Arie Ben David MIS, Dept. of Technology Management Holon Academic Inst. of Technology 52 Golomb St. Holon 58102 Israel…
0 runs - 1 likes - 2 downloads - 3 reach - 13 impact
1000 instances - 5 features - 0 classes - 0 missing values
1. Title: Employee Rejection\Acceptance (Ordinal ERA) 2. Source Information: Donor: Arie Ben David MIS, Dept. of Technology Management Holon Academic Inst. of Technology 52 Golomb St. Holon 58102 Israel…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
1000 instances - 5 features - 0 classes - 0 missing values
Datasets of the Data And Story Library, a project illustrating the use of basic statistical methods, converted to ARFF format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
3 runs - 0 likes - 3 downloads - 3 reach - 13 impact
50 instances - 5 features - 0 classes - 0 missing values
__Changes w.r.t. version 1: renamed variables such that they match description.__ ### Dataset: Wilt Data Set ### Abstract: High-resolution Remote Sensing data set (Quickbird). Small number of training…
10946 runs - 0 likes - 2 downloads - 2 reach - 20 impact
4839 instances - 6 features - 2 classes - 0 missing values
This is the same data as version 5 (OpenML ID = 1220) with '_id' features coded as nominal factor variables.
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
39948 instances - 12 features - 2 classes - 0 missing values