Data
Filter results by:
GAMETES_Epistasis_3-Way_20atts_0.2H_EDM-1_1-pmlb
31 runs0 likes0 downloads0 reach12 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_50_EDM-2_001-pmlb
0 runs0 likes0 downloads0 reach12 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_75_EDM-2_001-pmlb
31 runs0 likes0 downloads0 reach12 impact
1600 instances - 21 features - 2 classes - 0 missing values
A dataset relating characteristics of telephony account features and usage and whether or not the customer churned. Originally used in [Discovering Knowledge in Data: An Introduction to Data…
6658 runs1 likes5 downloads6 reach15 impact
5000 instances - 21 features - 2 classes - 0 missing values
__Major changes w.r.t. version 1: deactivated first two variables as they describe the batch of the experiments and should not be used for prediction. Also transformed the target from numeric to…
6501 runs0 likes3 downloads3 reach5 impact
540 instances - 21 features - 2 classes - 0 missing values
microaggregation2_nominal
0 runs0 likes0 downloads0 reach3 impact
20000 instances - 21 features - 5 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au1-1000 * Abstract: AutoUniv is an advanced data generator for classifications tasks. The aim is to reflect the nuances and heterogeneity of…
3255 runs0 likes8 downloads8 reach15 impact
1000 instances - 21 features - 2 classes - 0 missing values
Lucas, D. D., Klein, R., Tannahill, J., Ivanova, D., Brandon, S., Domyancic, D., and Zhang, Y.: Failure analysis of parameter-induced simulation crashes in climate models, Geosci. Model Dev. Discuss.,…
162436 runs0 likes20 downloads20 reach17 impact
540 instances - 21 features - 2 classes - 0 missing values
* Twonorm dataset This is an implementation of Leo Breiman's twonorm example[1]. It is a 20 dimensional, 2 class classification example. Each class is drawn from a multivariate normal distribution…
118 runs0 likes5 downloads5 reach6 impact
7400 instances - 21 features - 2 classes - 0 missing values
1: Abstract: This is a 20 dimensional, 2 class classification problem. Each class is drawn from a multivariate normal distribution. Class 1 has mean zero and covariance 4 times the identity. Class 2…
120 runs0 likes8 downloads8 reach6 impact
7400 instances - 21 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
720 runs0 likes8 downloads8 reach7 impact
506 instances - 21 features - 2 classes - 0 missing values
General Description 2015-current: greater than $200.00. The Commission categorizes contributions from individuals using the calendar year-to-date amount for political action committee (PAC) and party…
0 runs0 likes1 downloads1 reach0 impact
3348209 instances - 21 features - 0 classes - 10786577 missing values
This dataset classifies people described by a set of attributes as good or bad credit risks. This dataset comes with a cost matrix: ``` Good Bad (predicted) Good 0 1 (actual) Bad 5 0 ``` It is worse…
504985 runs15 likes177 downloads192 reach9 impact
1000 instances - 21 features - 2 classes - 0 missing values
1. Title: meta-data 2. Sources: (a) Creator: LIACC - University of Porto R.Campo Alegre 823 4150 PORTO (b) Donor: P.B.Brazdil or J.Gama Tel.: +351 600 1672 LIACC, University of Porto Fax.: +351 600…
32 runs0 likes2 downloads2 reach7 impact
528 instances - 22 features - 0 classes - 504 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
1000000 instances - 22 features - 0 classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
0 runs0 likes6 downloads6 reach5 impact
8192 instances - 22 features - 0 classes - 0 missing values
Abstract: CART book's waveform domains Source: Original Owners: Breiman,L., Friedman,J.H., Olshen,R.A., & Stone,C.J. (1984). Classification and Regression Trees. Wadsworth International Group:…
0 runs1 likes3 downloads4 reach3 impact
5000 instances - 22 features - classes - 0 missing values
Source: The dataset was created by Athanasios Tsanas (tsanasthanasis '@' gmail.com) and Max Little (littlem '@' physics.ox.ac.uk) of the University of Oxford, in collaboration with 10 medical centers…
0 runs1 likes2 downloads3 reach3 impact
5875 instances - 22 features - classes - 0 missing values
The Computer Activity databases are a collection of computer systems activity measures. The data was collected from a Sun Sparcstation 20/712 with 128 Mbytes of memory running in a multi-user…
2 runs1 likes1 downloads2 reach1 impact
8192 instances - 22 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes0 downloads0 reach5 impact
13 instances - 22 features - 0 classes - 0 missing values
The data is cleaned, regularized and encrypted global equity data. The first 21 columns (feature1 - feature21) are features, and target is the binary class you’re trying to predict.
858 runs1 likes1 downloads2 reach6 impact
96320 instances - 22 features - 2 classes - 0 missing values
This directory contains Thyroid datasets. "ann-train.data" contains 3772 learning examples and "ann-test.data" contains 3428 testing examples. I have obtained this data from…
31 runs0 likes2 downloads2 reach6 impact
3772 instances - 22 features - 3 classes - 0 missing values
car-evaluation-pmlb
31 runs0 likes1 downloads1 reach11 impact
1728 instances - 22 features - 4 classes - 0 missing values
This is a PROMISE data set made publicly available in order to encourage repeatable, verifiable, refutable, and/or improvable predictive models of software engineering. If you publish material based…
19125 runs0 likes18 downloads18 reach19 impact
10885 instances - 22 features - 2 classes - 25 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
691 runs0 likes6 downloads6 reach7 impact
528 instances - 22 features - 2 classes - 504 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
760 runs0 likes10 downloads10 reach7 impact
8192 instances - 22 features - 2 classes - 0 missing values
One of the NASA Metrics Data Program defect data sets. Data from software for storage management for receiving and processing ground data. Data comes from McCabe and Halstead features extractors of…
159209 runs2 likes22 downloads24 reach19 impact
2109 instances - 22 features - 2 classes - 0 missing values
One of the NASA Metrics Data Program defect data sets. Data from software for science data processing. Data comes from McCabe and Halstead features extractors of source code. These features were…
174308 runs0 likes21 downloads21 reach18 impact
522 instances - 22 features - 2 classes - 0 missing values
One of the NASA Metrics Data Program defect data sets. Data from flight software for earth orbiting satellite. Data comes from McCabe and Halstead features extractors of source code. These features…
147871 runs0 likes25 downloads25 reach19 impact
1109 instances - 22 features - 2 classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Attributes 2,4, and 6 deleted. Midrange price treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M.…
0 runs0 likes0 downloads0 reach7 impact
93 instances - 23 features - 0 classes - 14 missing values
libSVM","AAD group #Dataset from the LIBSVM data repository. Preprocessing: Original data: someone from Germany working with the car industry.
0 runs0 likes1 downloads1 reach5 impact
1243 instances - 23 features - 0 classes - 0 missing values
libSVM","AAD group IJCNN 2001 neural network competition. Slide presentation in IJCNN'01, Ford Research Laboratory, 2001. http://www.geocities.com/ijcnn/nnc_ijcnn01.pdf . #Dataset from the LIBSVM data…
0 runs0 likes8 downloads8 reach6 impact
191681 instances - 23 features - 0 classes - 0 missing values
No data.
45 runs0 likes2 downloads2 reach1 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
313 runs0 likes3 downloads3 reach1 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
72 runs0 likes3 downloads3 reach1 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
326 runs1 likes4 downloads5 reach2 impact
1000000 instances - 23 features - 2 classes - 0 missing values
No data.
68 runs0 likes4 downloads4 reach1 impact
1000000 instances - 23 features - 2 classes - 0 missing values
source: An Algorithm Selection Benchmark for the Container Pre-Marshalling Problem (CPMP) authors: K. Tierney and Y. Malitsky (features) / K. Tierney and D. Pacino and S. Voss (algorithms) translator…
13 runs0 likes0 downloads0 reach0 impact
527 instances - 23 features - 4 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs0 likes0 downloads0 reach5 impact
31406 instances - 23 features - 2 classes - 29756 missing values
* Abstract: Oxford Parkinson's Disease Detection Dataset * Source: The dataset was created by Max Little of the University of Oxford, in collaboration with the National Centre for Voice and Speech,…
179 runs1 likes14 downloads15 reach6 impact
195 instances - 23 features - 2 classes - 0 missing values
### Description This dataset describes mushrooms in terms of their physical characteristics. They are classified into: poisonous or edible. ### Source ``` (a) Origin: Mushroom records are drawn from…
16392 runs1 likes40 downloads41 reach3 impact
8124 instances - 23 features - 2 classes - 2480 missing values
Donor: Will Taylor (taylor@pluto.arc.nasa.gov) In this version (version 2), some features were removed. It is unclear why of how this was done.
1883 runs0 likes9 downloads9 reach1 impact
368 instances - 23 features - 2 classes - 1927 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
730 runs0 likes5 downloads5 reach6 impact
93 instances - 23 features - 2 classes - 14 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
698 runs0 likes5 downloads5 reach6 impact
36 instances - 23 features - 2 classes - 0 missing values
SPECT heart data This is a merged version of the separate train and test set which are usually distributed. On OpenML this train-test split can be found as one of the possible tasks. Sources: --…
1296 runs1 likes12 downloads13 reach8 impact
267 instances - 23 features - 2 classes - 0 missing values
Pasture Production Data source: Dave Barker AgResearch Grasslands, Palmerston North, New Zealand The objective was to predict pasture production from a variety of biophysical factors. Vegetation and…
878 runs0 likes6 downloads6 reach7 impact
36 instances - 23 features - 3 classes - 0 missing values
%-*- text -*- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage…
2 runs0 likes2 downloads2 reach5 impact
93 instances - 24 features - 0 classes - 0 missing values
Michel Lang fRMA-normalized. Only "Kratz-genes"*. \* (see: A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international…
3 runs0 likes3 downloads3 reach4 impact
442 instances - 24 features - 0 classes - 0 missing values
Michel Lang fRMA-normalized. Only "Kratz-genes"*. \* (see: A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international…
0 runs0 likes7 downloads7 reach5 impact
226 instances - 24 features - 2 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes0 downloads0 reach5 impact
16 instances - 24 features - 0 classes - 0 missing values
source: An Algorithm Selection Benchmark for the Container Pre-Marshalling Problem (CPMP) authors: K. Tierney and Y. Malitsky (features) / K. Tierney and D. Pacino and S. Voss (algorithms) translator…
0 runs0 likes1 downloads1 reach0 impact
2108 instances - 24 features - 0 classes - 0 missing values
Data used in an analysis of the Brown and Frown corpora for my doctoral dissertation titled ``Variations in Written English: Characterizing Authors' Rhetorical Language Choices Across Corpora of…
2046 runs0 likes0 downloads0 reach4 impact
1000 instances - 24 features - 30 classes - 0 missing values
Squash Harvest Unstored Data source: Winna Harvey Crop and Food Research, Christchurch, New Zealand The purpose of the research was to determine the changes taking place in squash fruit during the…
876 runs0 likes4 downloads4 reach7 impact
52 instances - 24 features - 3 classes - 39 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
141 runs0 likes7 downloads7 reach6 impact
500 instances - 24 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
687 runs0 likes5 downloads5 reach6 impact
52 instances - 24 features - 2 classes - 39 missing values
No data.
0 runs0 likes0 downloads0 reach5 impact
1000 instances - 25 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach5 impact
1000 instances - 25 features - 0 classes - 0 missing values
No data.
304 runs0 likes6 downloads6 reach2 impact
1000000 instances - 25 features - 10 classes - 0 missing values
led24-pmlb
31 runs0 likes1 downloads1 reach12 impact
3200 instances - 25 features - 10 classes - 0 missing values
The data were collected as the SCITOS G5 robot navigates through the room following the wall in a clockwise direction, for 4 rounds, using 24 ultrasound sensors arranged circularly around its 'waist'.…
22309 runs0 likes20 downloads20 reach26 impact
5456 instances - 25 features - 4 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
707 runs0 likes5 downloads5 reach6 impact
52 instances - 25 features - 2 classes - 7 missing values
Squash Harvest Stored Data source: Winna Harvey Crop and Food Research, Christchurch, New Zealand The purpose of the research was to determine the changes taking place in squash fruit during the…
867 runs0 likes4 downloads4 reach7 impact
52 instances - 25 features - 3 classes - 7 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
1 runs0 likes1 downloads1 reach5 impact
500 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes2 downloads2 reach5 impact
1000 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
250 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
500 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
21 runs0 likes0 downloads0 reach5 impact
100 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
100 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 26 features - 0 classes - 0 missing values
No data.
32 runs0 likes1 downloads1 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
33 runs0 likes4 downloads4 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
29 runs0 likes6 downloads6 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
29 runs0 likes4 downloads4 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs1 likes4 downloads5 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs0 likes5 downloads5 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
28 runs0 likes3 downloads3 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs0 likes2 downloads2 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 26 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
1000000 instances - 26 features - 0 classes - 0 missing values
No data.
63 runs0 likes3 downloads3 reach1 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
65 runs0 likes8 downloads8 reach1 impact
1000000 instances - 26 features - 7 classes - 0 missing values
This dataset reflects incidents of crime in the City of Los Angeles dating back to 2010. This data is transcribed from original crime reports that are typed on paper and therefore there may be some…
0 runs0 likes0 downloads0 reach0 impact
1468825 instances - 26 features - 0 classes - 7881776 missing values
No data.
27 runs1 likes3 downloads4 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : This is a pre-processed version of the dataset used in Kaggles See Click Predict Fix competition…
0 runs0 likes0 downloads0 reach1 impact
1137 instances - 26 features - classes - 9255 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : This is a pre-processed version of the dataset used in Kaggles See Click Predict Fix competition…
0 runs0 likes0 downloads0 reach1 impact
1137 instances - 26 features - classes - 9255 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
604 runs0 likes9 downloads9 reach7 impact
1000 instances - 26 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
608 runs0 likes9 downloads9 reach7 impact
1000 instances - 26 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
812 runs0 likes7 downloads7 reach7 impact
250 instances - 26 features - 2 classes - 0 missing values