OpenML
No data.
29 runs0 likes1 downloads1 reach0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
29 runs0 likes2 downloads2 reach0 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach0 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
29 runs0 likes1 downloads1 reach0 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
28 runs0 likes2 downloads2 reach0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 17 features - 26 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
28 runs0 likes3 downloads3 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
28 runs0 likes1 downloads1 reach0 impact
1000000 instances - 37 features - 2 classes - 0 missing values
No data.
27 runs1 likes4 downloads5 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs1 likes3 downloads4 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs0 likes5 downloads5 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs0 likes2 downloads2 reach0 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs0 likes1 downloads1 reach0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
27 runs0 likes1 downloads1 reach0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
27 runs0 likes1 downloads1 reach0 impact
1000000 instances - 19 features - 4 classes - 0 missing values
postoperative-patient-data-pmlb
26 runs0 likes1 downloads1 reach8 impact
88 instances - 9 features - 2 classes - 0 missing values
This is a 10% stratified subsample of the data from the 1999 ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php). Modified by TunedIT (converted to ARFF format)…
25 runs1 likes33 downloads34 reach5 impact
494020 instances - 42 features - 23 classes - 0 missing values
Abstract: A chess endgame data set representing the positions on the board of the white king, the white rook, and the black king. The task is to determine the optimum number of turns required for white…
25 runs0 likes5 downloads5 reach5 impact
28056 instances - 7 features - 18 classes - 0 missing values
This is the poker dataset, retrieved 2013-11-14 from the libSVM site. In addition to the preprocessing done there (see LibSVM site for details), this dataset was created as follows: -join test and…
23 runs0 likes17 downloads17 reach6 impact
1025010 instances - 11 features - 2 classes - 0 missing values
File README: chscase. A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
22 runs0 likes2 downloads2 reach4 impact
400 instances - 8 features - 0 classes - 0 missing values
This is the famous covertype dataset in its binary version, retrieved 2013-11-13 from the libSVM site (called covtype.binary there). In addition to the preprocessing done there (see LibSVM site for…
22 runs0 likes7 downloads7 reach6 impact
581012 instances - 55 features - 2 classes - 0 missing values
Cholesterol treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
20 runs0 likes3 downloads3 reach0 impact
303 instances - 14 features - 0 classes - 6 missing values
This data set is concerned with the forward kinematics of an 8 link robot arm. Among the existing variants of this data set we have used the variant 8nm, which is known to be highly non-linear and…
19 runs0 likes7 downloads7 reach0 impact
8192 instances - 9 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
18 runs0 likes0 downloads0 reach4 impact
159 instances - 10 features - 0 classes - 6 missing values
Primary Biliary Cirrhosis. The data set found in appendix D of Fleming and Harrington, Counting Processes and Survival Analysis,…
18 runs0 likes2 downloads2 reach4 impact
418 instances - 20 features - 0 classes - 1033 missing values
ARCENE's task is to distinguish cancer versus normal patterns from mass-spectrometric data. This is a two-class classification problem with continuous input variables. This dataset is one of 5…
16 runs0 likes8 downloads8 reach5 impact
200 instances - 10001 features - 2 classes - 0 missing values
* Title: Skin Segmentation Data Set * Abstract: The Skin Segmentation dataset is constructed over B, G, R color space. Skin and Nonskin dataset is generated using skin textures from face images of…
15 runs1 likes9 downloads10 reach5 impact
245057 instances - 4 features - 2 classes - 0 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
15 runs0 likes0 downloads0 reach0 impact
4704 instances - 47 features - 3 classes - 0 missing values
Determinants of Plasma Retinol and Beta-Carotene Levels Summary: Observational studies have suggested that low dietary intake or low plasma concentrations of retinol, beta-carotene, or other…
14 runs0 likes0 downloads0 reach4 impact
315 instances - 14 features - 0 classes - 0 missing values
File README: chscase. A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
14 runs0 likes0 downloads0 reach4 impact
526 instances - 6 features - 0 classes - 0 missing values
Survival treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
12 runs0 likes2 downloads2 reach0 impact
130 instances - 10 features - 0 classes - 97 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
12 runs0 likes0 downloads0 reach0 impact
4704 instances - 47 features - 3 classes - 0 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
12 runs0 likes0 downloads0 reach0 impact
2351 instances - 47 features - 2 classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
12 runs0 likes3 downloads3 reach4 impact
8192 instances - 13 features - 0 classes - 0 missing values
Data-Description: COIL 1999 Competition Data. Data Type: multivariate. Abstract: This data set is from the 1999 Computational Intelligence and Learning (COIL)…
12 runs0 likes0 downloads0 reach4 impact
316 instances - 12 features - 0 classes - 56 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
12 runs0 likes0 downloads0 reach4 impact
100 instances - 26 features - 0 classes - 0 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
11 runs0 likes0 downloads0 reach0 impact
4704 instances - 47 features - 3 classes - 0 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
11 runs0 likes0 downloads0 reach0 impact
44819 instances - 47 features - 3 classes - 10584 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
11 runs0 likes0 downloads0 reach0 impact
5880 instances - 47 features - 3 classes - 3528 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
11 runs0 likes0 downloads0 reach0 impact
5880 instances - 47 features - 3 classes - 3528 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
11 runs0 likes0 downloads0 reach0 impact
4704 instances - 47 features - 3 classes - 0 missing values
This file contains 9 sets of sanitized user data drawn from the command histories of 8 UNIX computer users at Purdue over the course of up to 2 years (USER0 and USER1 were generated by the same…
11 runs0 likes8 downloads8 reach5 impact
9100 instances - 3 features - 9 classes - 0 missing values
Case number deleted. X treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric…
10 runs0 likes1 downloads1 reach0 impact
418 instances - 19 features - 0 classes - 1239 missing values
Case number deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning…
10 runs1 likes2 downloads3 reach0 impact
195 instances - 12 features - 0 classes - 2 missing values
Weight treated as the class attribute. Identifier deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric…
10 runs0 likes2 downloads2 reach0 impact
158 instances - 8 features - 0 classes - 87 missing values
No data.
10 runs0 likes1 downloads1 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
10 runs0 likes2 downloads2 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
10 runs0 likes0 downloads0 reach0 impact
3660 instances - 47 features - 2 classes - 0 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
10 runs0 likes0 downloads0 reach0 impact
5880 instances - 47 features - 3 classes - 3528 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
10 runs0 likes0 downloads0 reach0 impact
2352 instances - 47 features - 2 classes - 0 missing values
Publication Request: This file describes the contents of the heart-disease directory. This directory contains 4 databases…
10 runs0 likes0 downloads0 reach0 impact
294 instances - 14 features - 0 classes - 782 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
9 runs0 likes1 downloads1 reach5 impact
283 instances - 54622 features - 3 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs0 likes0 downloads0 reach5 impact
105 instances - 22284 features - 3 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs0 likes0 downloads0 reach5 impact
105 instances - 22284 features - 3 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs0 likes1 downloads1 reach5 impact
95 instances - 22278 features - 5 classes - 0 missing values
No data.
9 runs0 likes2 downloads2 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
9 runs0 likes2 downloads2 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
Over 92 thousand images (32x32 pixels) of 46 characters from Devanagari script. Includes the alphabet as well as the numbers. Devanagari is an Indic script and forms a basis for over 100 languages…
9 runs1 likes6 downloads7 reach3 impact
92000 instances - 1025 features - 46 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs0 likes0 downloads0 reach5 impact
89 instances - 54614 features - 4 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
9 runs0 likes3 downloads3 reach5 impact
92 instances - 59005 features - 5 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
9 runs0 likes3 downloads3 reach5 impact
220 instances - 22284 features - 3 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
9 runs0 likes3 downloads3 reach5 impact
214 instances - 45102 features - 7 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. All datasets contain between 100 and 400 samples, characterized by values of 20,000 - 65,000 attributes. Samples are assigned to several (2-10) classes.…
8 runs0 likes2 downloads2 reach5 impact
283 instances - 54622 features - 3 classes - 0 missing values
Data from the RSCTC 2010 Discovery Challenge. Example datasets for 6 different problems of DNA microarray data analysis and classification. All datasets contain gene expression data characterized by…
8 runs0 likes2 downloads2 reach5 impact
113 instances - 54676 features - 5 classes - 0 missing values
No data.
7 runs0 likes1 downloads1 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
7 runs0 likes1 downloads1 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
7 runs0 likes1 downloads1 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
Contains 110 data sets from the book 'The Statistical Sleuth' by Fred Ramsey and Dan Schafer; Duxbury Press, 1997. (schafer@stat.orst.edu) [14/Oct/97] (172k) Note: description taken from this web…
7 runs0 likes2 downloads2 reach4 impact
50 instances - 8 features - 0 classes - 0 missing values
This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics; (b) its assigned insurance risk rating; (c) its normalized losses in use as…
6 runs1 likes4 downloads5 reach0 impact
159 instances - 16 features - 0 classes - 0 missing values
No data.
6 runs0 likes3 downloads3 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
No data.
6 runs0 likes1 downloads1 reach0 impact
1000000 instances - 39 features - 6 classes - 0 missing values
Dataset Title: Localization Data for Person Activity Data Set Abstract: Data contains recordings of five people performing different activities. Each person wore four sensors (tags) while performing…
6 runs0 likes4 downloads4 reach5 impact
164860 instances - 8 features - 11 classes - 0 missing values
1. Title: Wisconsin Prognostic Breast Cancer (WPBC) 2. Source Information a) Creators: Dr. William H. Wolberg, General Surgery Dept., University of Wisconsin, Clinical Sciences Center, Madison, WI…
5 runs0 likes4 downloads4 reach0 impact
194 instances - 33 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
4 runs0 likes1 downloads1 reach0 impact
61 instances - 3 features - 0 classes - 0 missing values
Identifier attribute deleted. NAME: Sexual activity and the lifespan of male fruitflies TYPE: Designed (almost factorial)…
4 runs0 likes1 downloads1 reach0 impact
125 instances - 5 features - 0 classes - 0 missing values
Identification code deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based…
4 runs0 likes0 downloads0 reach0 impact
189 instances - 10 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Electricity usage is being treated as the…
4 runs0 likes0 downloads0 reach0 impact
55 instances - 3 features - 0 classes - 0 missing values
Datasets from ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php) Data set for KDD Cup 1999 Modified by TunedIT (converted to ARFF format)…
4 runs0 likes18 downloads18 reach5 impact
4898431 instances - 42 features - 23 classes - 0 missing values
Data from StatLib (ftp stat.cmu.edu/datasets) SUMMARY: Data from an experiment on the effects of machine adjustments on the time to count bolts. Data appear as the STATS (Issue 10) Challenge. DATA:…
4 runs0 likes0 downloads0 reach0 impact
40 instances - 8 features - 0 classes - 0 missing values
### Description __Changes to version 1:__ all categorical features transformed as such. This dataset represents a set of possible advertisements on Internet pages. ### Sources (a) Creator and donor:…
4 runs0 likes2 downloads2 reach6 impact
3279 instances - 1559 features - 2 classes - 0 missing values
Data from StatLib (ftp stat.cmu.edu/datasets) The infamous Longley data, "An appraisal of least-squares programs from the point of view of the user", JASA, 62(1967) p819-841. Variables are: Number of…
3 runs0 likes1 downloads1 reach0 impact
16 instances - 7 features - 0 classes - 0 missing values
* Dataset: DBworld e-mails data set Task: dbworld-bodies * Source: Michele Filannino, PhD University of Manchester Centre for Doctoral Training Email: filannim_AT_cs.man.ac.uk * Data Set Information:…
3 runs0 likes6 downloads6 reach4 impact
64 instances - 4703 features - 2 classes - 0 missing values
Michel Lang fRMA-normalized. Only "Kratz-genes"*. \* (see: A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international…
3 runs0 likes3 downloads3 reach3 impact
442 instances - 24 features - 0 classes - 0 missing values
The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic prices and the demand for clean air', J. Environ. Economics & Management, vol.5, 81-102, 1978. Used in Belsley, Kuh & Welsch,…
3 runs0 likes4 downloads4 reach8 impact
506 instances - 14 features - 0 classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
2 runs1 likes1 downloads2 reach0 impact
8192 instances - 22 features - 0 classes - 0 missing values
This data set is also obtained from the task of controlling the ailerons of an F16 aircraft, although the target variable and attributes are different from the ailerons domain. The target variable here…
2 runs0 likes3 downloads3 reach0 impact
9517 instances - 7 features - 0 classes - 0 missing values
This is a commercial application described in Weiss & Indurkhya (1995). The data describes a telecommunication problem. No further information is available. Characteristics: (10000+5000) cases, 49…
2 runs0 likes3 downloads3 reach0 impact
15000 instances - 49 features - 0 classes - 0 missing values
The problem is to learn a regression equation/rule/tree to predict the activity from the descriptive structural attributes. The data and methodology are described in detail in: - King, Ross D., Hurst,…
2 runs0 likes1 downloads1 reach0 impact
186 instances - 61 features - 0 classes - 0 missing values