OpenML
Forecasting skewed biased stochastic ozone days: analyses, solutions and beyond, Knowledge and Information Systems, Vol. 14, No. 3, 2008. 1. Abstract: Two ground ozone level data sets are included in…
187955 runs - 0 likes - 16 downloads - 16 reach - 21 impact
2534 instances - 73 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
801 runs - 0 likes - 8 downloads - 8 reach - 13 impact
841 instances - 71 features - 2 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
28846 runs - 0 likes - 7 downloads - 7 reach - 32 impact
841 instances - 71 features - 4 classes - 0 missing values
price col is int now. autoHorse dataset
11 runs - 0 likes - 0 downloads - 0 reach - 1 impact
201 instances - 69 features - 0 classes - 0 missing values
Fixed dataset for autoHorse.csv I suggest...
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
201 instances - 69 features - 186 classes - 0 missing values
The experiments were carried out with a group of 30 volunteers within an age bracket of 19-48 years. They performed a protocol of activities composed of six basic activities: three static postures…
83 runs - 0 likes - 9 downloads - 9 reach - 5 impact
180 instances - 68 features - 6 classes - 0 missing values
No data.
52 runs - 0 likes - 2 downloads - 2 reach - 1 impact
1000000 instances - 65 features - 10 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Margin). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143050 runs - 1 likes - 16 downloads - 17 reach - 411 impact
1600 instances - 65 features - 100 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Shape). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143288 runs - 1 likes - 37 downloads - 38 reach - 414 impact
1600 instances - 65 features - 100 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
38639 runs - 0 likes - 19 downloads - 19 reach - 9 impact
2000 instances - 65 features - 10 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
794 runs - 0 likes - 9 downloads - 9 reach - 13 impact
2000 instances - 65 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
765 runs - 0 likes - 12 downloads - 12 reach - 13 impact
5620 instances - 65 features - 2 classes - 0 missing values
1. Title of Database: Optical Recognition of Handwritten Digits 2. Source: E. Alpaydin, C. Kaynak Department of Computer Engineering Bogazici University, 80815 Istanbul Turkey alpaydin@boun.edu.tr…
35798 runs - 3 likes - 22 downloads - 25 reach - 9 impact
5620 instances - 65 features - 10 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Texture). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143077 runs - 2 likes - 64 downloads - 66 reach - 416 impact
1599 instances - 65 features - 100 classes - 0 missing values
CD4 count prediction date
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
16484 instances - 62 features - classes - 0 missing values
This work was partially supported by national funds through FCT and IST through the UID/EEA/50009/2013 project, BL89/2017-IST-ID grant. In this dataset, we present usability (SUS), workload…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
31 instances - 62 features - classes - 0 missing values
No data.
948 runs - 0 likes - 5 downloads - 5 reach - 9 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
949 runs - 0 likes - 4 downloads - 4 reach - 9 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
996 runs - 0 likes - 4 downloads - 4 reach - 9 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
882 runs - 0 likes - 6 downloads - 6 reach - 9 impact
71 instances - 63 features - 6 classes - 0 missing values
The problem is to learn a regression equation/rule/tree to predict the activity from the descriptive structural attributes. The data and methodology are described in detail in: - King, Ross D., Hurst,…
5 runs - 0 likes - 1 downloads - 1 reach - 1 impact
186 instances - 61 features - 0 classes - 0 missing values
Test dataset
0 runs - 0 likes - 1 downloads - 1 reach - 3 impact
15547 instances - 61 features - 0 classes - 280 missing values
Test dataset
0 runs - 0 likes - 1 downloads - 1 reach - 3 impact
15547 instances - 61 features - 0 classes - 280 missing values
Test dataset
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
15547 instances - 61 features - 0 classes - 280 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: scaled to [-1,1]
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
3175 instances - 61 features - 0 classes - 0 missing values
No data.
296 runs - 0 likes - 7 downloads - 7 reach - 1 impact
1000000 instances - 61 features - 2 classes - 0 missing values
Test dataset
3 runs - 0 likes - 0 downloads - 0 reach - 7 impact
15547 instances - 61 features - 2 classes - 280 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
416188 instances - 61 features - 355 classes - 0 missing values
This dataset summarizes a heterogeneous set of features about articles published by Mashable in a period of two years. The goal is to predict the number of shares in social networks (popularity). *…
0 runs - 0 likes - 4 downloads - 4 reach - 5 impact
39644 instances - 61 features - 0 classes - 0 missing values
### Description Synthetic Control Chart Time Series. This is actually time series classification. ### Sources ``` * Original Owner and Donor Dr Robert Alcock rob@skyblue.csd.auth.gr ``` ### Dataset…
20355 runs - 0 likes - 10 downloads - 10 reach - 47 impact
600 instances - 62 features - 6 classes - 0 missing values
NAME: Sonar, Mines vs. Rocks SUMMARY: This is the data set used by Gorman and Sejnowski in their study of the classification of sonar signals using a neural network [1]. The task is to train a network…
2366 runs - 1 likes - 25 downloads - 26 reach - 7 impact
208 instances - 61 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
169 runs - 0 likes - 8 downloads - 8 reach - 14 impact
600 instances - 62 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
806 runs - 0 likes - 8 downloads - 8 reach - 13 impact
186 instances - 61 features - 2 classes - 0 missing values
This data was gathered from participants in experimental speed dating events from 2002-2004. During the events, the attendees would have a four-minute "first date" with every other participant of the…
28060 runs - 19 likes - 158 downloads - 177 reach - 31 impact
8378 instances - 123 features - 2 classes - 18372 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au4-2500 * Abstract: AutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity of…
4222 runs - 0 likes - 7 downloads - 7 reach - 20 impact
2500 instances - 101 features - 3 classes - 0 missing values
SPAM E-mail Database The "spam" concept is diverse: advertisements for products/websites, make money fast schemes, chain letters, pornography... Our collection of spam e-mails came from our postmaster…
161528 runs - 4 likes - 88 downloads - 92 reach - 9 impact
4601 instances - 58 features - 2 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 5 impact
31 instances - 54 features - 0 classes - 0 missing values
This is the famous covertype dataset in its binary version, retrieved 2013-11-13 from the libSVM site (called covtype.binary there). In addition to the preprocessing done there (see LibSVM site for…
22 runs - 0 likes - 9 downloads - 9 reach - 7 impact
581012 instances - 55 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
12 runs - 0 likes - 1 downloads - 1 reach - 10 impact
83733 instances - 55 features - 4 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 5 impact
14 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
100 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 6 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 6 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 6 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
250 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 2 downloads - 2 reach - 6 impact
100 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
500 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
500 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
250 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
100 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
500 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
100 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
250 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
250 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
500 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
250 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
100 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
500 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 6 impact
1000 instances - 51 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 1 downloads - 1 reach - 6 impact
1000 instances - 51 features - 0 classes - 0 missing values
Source: James P Bridge, Sean B Holden and Lawrence C Paulson University of Cambridge Computer Laboratory William Gates Building 15 JJ Thomson Avenue Cambridge CB3 0FD UK +44 (0)1223 763500…
26323 runs - 1 likes - 20 downloads - 21 reach - 36 impact
6118 instances - 52 features - 6 classes - 0 missing values
This dataset is taken from the MiniBooNE experiment and is used to distinguish electron neutrinos (signal) from muon neutrinos (background). This dataset is ordered. It first contains all signal…
12 runs - 0 likes - 4 downloads - 4 reach - 5 impact
130064 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
646 runs - 0 likes - 9 downloads - 9 reach - 13 impact
1000 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
784 runs - 0 likes - 7 downloads - 7 reach - 13 impact
500 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
621 runs - 0 likes - 8 downloads - 8 reach - 13 impact
1000 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
748 runs - 0 likes - 6 downloads - 6 reach - 13 impact
250 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
620 runs - 0 likes - 10 downloads - 10 reach - 13 impact
1000 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
614 runs - 0 likes - 9 downloads - 9 reach - 13 impact
1000 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
801 runs - 0 likes - 9 downloads - 9 reach - 13 impact
500 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
781 runs - 0 likes - 8 downloads - 8 reach - 13 impact
500 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
755 runs - 0 likes - 6 downloads - 6 reach - 13 impact
250 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
764 runs - 0 likes - 6 downloads - 6 reach - 12 impact
100 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
810 runs - 0 likes - 6 downloads - 6 reach - 12 impact
100 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
788 runs - 0 likes - 7 downloads - 7 reach - 12 impact
100 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
786 runs - 0 likes - 6 downloads - 6 reach - 13 impact
250 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
807 runs - 0 likes - 7 downloads - 7 reach - 13 impact
500 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
766 runs - 0 likes - 7 downloads - 7 reach - 12 impact
100 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
636 runs - 0 likes - 8 downloads - 8 reach - 13 impact
1000 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
775 runs - 0 likes - 6 downloads - 6 reach - 13 impact
250 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
772 runs - 0 likes - 7 downloads - 7 reach - 13 impact
500 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
817 runs - 0 likes - 7 downloads - 7 reach - 13 impact
250 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
773 runs - 0 likes - 6 downloads - 6 reach - 12 impact
100 instances - 51 features - 2 classes - 0 missing values
This is a commercial application described in Weiss & Indurkhya (1995). The data describes a telecommunication problem. No further information is available. Characteristics: (10000+5000) cases, 49…
2 runs - 0 likes - 4 downloads - 4 reach - 2 impact
15000 instances - 49 features - 0 classes - 0 missing values
efe def
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 49 features - classes - 0 missing values
Oil dataset Past Usage: 1. Kubat, M., Holte, R.,
204 runs - 3 likes - 19 downloads - 22 reach - 22 impact
937 instances - 50 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
1 runs - 0 likes - 1 downloads - 1 reach - 9 impact
4147 instances - 49 features - 2 classes - 0 missing values
rrvrf 4rr
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 49 features - classes - 0 missing values
ef f
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 49 features - classes - 0 missing values
sd vfv
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 50 features - 2 classes - 0 missing values
r rg
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 50 features - classes - 0 missing values
dd ref
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 50 features - classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
624 runs - 0 likes - 10 downloads - 10 reach - 13 impact
15000 instances - 49 features - 2 classes - 0 missing values
These data are estimated correlations between daily 3 p.m. wind measurements during September and October 1997 for a network of 45 stations in the Sydney region. The first column below gives a list of…
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
45 instances - 47 features - classes - 0 missing values
No data.
52 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 48 features - 10 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
34558 runs - 0 likes - 21 downloads - 21 reach - 9 impact
2000 instances - 48 features - 10 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
792 runs - 0 likes - 8 downloads - 8 reach - 13 impact
2000 instances - 48 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
729 runs - 0 likes - 9 downloads - 9 reach - 12 impact
45 instances - 47 features - 2 classes - 0 missing values
No data.
43 runs - 0 likes - 2 downloads - 2 reach - 1 impact
1000000 instances - 45 features - 2 classes - 0 missing values
No data.
47 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1000000 instances - 45 features - 2 classes - 0 missing values
This is a corrected version of the previous data file in version 1, which contained a dataset (349 instances) incorrectly merged from the original training and test sets available on UCI (there are…
0 runs - 0 likes - 3 downloads - 3 reach - 5 impact
267 instances - 45 features - 2 classes - 0 missing values