Data
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
733 runs - 0 likes - 9 downloads - 9 reach - 10 impact
7485 instances - 56 features - 2 classes - 32427 missing values
1. Title: Lung Cancer Data 2. Source Information: - Data was published in: Hong, Z.Q. and Yang, J.Y. "Optimal Discriminant Plane for a Small Number of Samples and Design Method of Classifier on the…
1238 runs - 0 likes - 18 downloads - 18 reach - 6 impact
32 instances - 57 features - 3 classes - 5 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
754 runs - 0 likes - 10 downloads - 10 reach - 10 impact
8844 instances - 57 features - 2 classes - 34843 missing values
No data.
219 runs - 0 likes - 4 downloads - 4 reach - 2 impact
1000000 instances - 58 features - 2 classes - 0 missing values
Automated file upload of BNG(spambase)
98 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 58 features - 2 classes - 0 missing values
SPAM E-mail Database The "spam" concept is diverse: advertisements for products/websites, make money fast schemes, chain letters, pornography... Our collection of spam e-mails came from our postmaster…
159122 runs - 4 likes - 85 downloads - 89 reach - 5 impact
4601 instances - 58 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
173 runs - 0 likes - 6 downloads - 6 reach - 17 impact
106 instances - 59 features - 2 classes - 0 missing values
Compilation of promoters with known transcriptional start points for E. coli genes. The task is to recognize promoters in strings that represent nucleotides (one of A, G, T, or C). A promoter is a…
138 runs - 1 likes - 9 downloads - 10 reach - 5 impact
106 instances - 59 features - 2 classes - 0 missing values
The problem is to learn a regression equation/rule/tree to predict the activity from the descriptive structural attributes. The data and methodology are described in detail in: - King, Ross D., Hurst,…
5 runs - 0 likes - 1 downloads - 1 reach - 1 impact
186 instances - 61 features - 0 classes - 0 missing values
No data.
296 runs - 0 likes - 7 downloads - 7 reach - 1 impact
1000000 instances - 61 features - 2 classes - 0 missing values
No data.
50 runs - 0 likes - 3 downloads - 3 reach - 1 impact
1000000 instances - 61 features - 2 classes - 0 missing values
Test dataset
0 runs - 0 likes - 1 downloads - 1 reach - 3 impact
15547 instances - 61 features - 0 classes - 280 missing values
Test dataset
0 runs - 0 likes - 1 downloads - 1 reach - 3 impact
15547 instances - 61 features - 0 classes - 280 missing values
Test dataset
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
15547 instances - 61 features - 0 classes - 280 missing values
Test dataset
3 runs - 0 likes - 0 downloads - 0 reach - 7 impact
15547 instances - 61 features - 2 classes - 280 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: scaled to [-1,1]
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
3175 instances - 61 features - 0 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
416188 instances - 61 features - 355 classes - 0 missing values
This dataset summarizes a heterogeneous set of features about articles published by Mashable in a period of two years. The goal is to predict the number of shares in social networks (popularity). *…
0 runs - 0 likes - 4 downloads - 4 reach - 5 impact
39644 instances - 61 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
806 runs - 0 likes - 8 downloads - 8 reach - 9 impact
186 instances - 61 features - 2 classes - 0 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
434 runs - 0 likes - 10 downloads - 10 reach - 8 impact
7019 instances - 61 features - 8 classes - 48089 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
366 runs - 0 likes - 10 downloads - 10 reach - 8 impact
8844 instances - 61 features - 7 classes - 51515 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
354 runs - 0 likes - 7 downloads - 7 reach - 8 impact
7485 instances - 61 features - 7 classes - 52048 missing values
Primate splice-junction gene sequences (DNA) with associated imperfect domain theory. Splice junctions are points on a DNA sequence at which 'superfluous' DNA is removed during the process of protein…
23613 runs - 1 likes - 16 downloads - 17 reach - 3 impact
3190 instances - 61 features - 3 classes - 0 missing values
NAME: Sonar, Mines vs. Rocks SUMMARY: This is the data set used by Gorman and Sejnowski in their study of the classification of sonar signals using a neural network [1]. The task is to train a network…
2366 runs - 1 likes - 25 downloads - 26 reach - 3 impact
208 instances - 61 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
744 runs - 0 likes - 8 downloads - 8 reach - 10 impact
7019 instances - 61 features - 2 classes - 43814 missing values
CD4 count prediction date
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
16484 instances - 62 features - classes - 0 missing values
This work was partially supported by national funds through FCT and IST through the UID/EEA/50009/2013 project, BL89/2017-IST-ID grant. In this dataset, we present usability (SUS), workload…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
31 instances - 62 features - classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
135 runs - 0 likes - 9 downloads - 9 reach - 9 impact
3190 instances - 62 features - 2 classes - 0 missing values
### Description Synthetic Control Chart Time Series. This is actually time series classification. ### Sources ``` * Original Owner and Donor Dr Robert Alcock rob@skyblue.csd.auth.gr ``` ### Dataset…
20355 runs - 0 likes - 10 downloads - 10 reach - 43 impact
600 instances - 62 features - 6 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
169 runs - 0 likes - 8 downloads - 8 reach - 10 impact
600 instances - 62 features - 2 classes - 0 missing values
No data.
996 runs - 0 likes - 4 downloads - 4 reach - 5 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
882 runs - 0 likes - 6 downloads - 6 reach - 5 impact
71 instances - 63 features - 6 classes - 0 missing values
No data.
948 runs - 0 likes - 5 downloads - 5 reach - 5 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
949 runs - 0 likes - 4 downloads - 4 reach - 5 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
194 runs - 0 likes - 3 downloads - 3 reach - 2 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
50 runs - 0 likes - 1 downloads - 1 reach - 2 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
52 runs - 0 likes - 2 downloads - 2 reach - 1 impact
1000000 instances - 65 features - 10 classes - 0 missing values
Automated file upload of BNG(optdigits)
100 runs - 1 likes - 1 downloads - 2 reach - 2 impact
1000000 instances - 65 features - 10 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Texture). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143077 runs - 2 likes - 63 downloads - 65 reach - 411 impact
1599 instances - 65 features - 100 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Margin). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143050 runs - 1 likes - 16 downloads - 17 reach - 411 impact
1600 instances - 65 features - 100 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Shape). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143288 runs - 1 likes - 35 downloads - 36 reach - 409 impact
1600 instances - 65 features - 100 classes - 0 missing values
1. Title of Database: Optical Recognition of Handwritten Digits 2. Source: E. Alpaydin, C. Kaynak Department of Computer Engineering Bogazici University, 80815 Istanbul Turkey alpaydin@boun.edu.tr…
34399 runs - 3 likes - 22 downloads - 25 reach - 5 impact
5620 instances - 65 features - 10 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
765 runs - 0 likes - 12 downloads - 12 reach - 9 impact
5620 instances - 65 features - 2 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
36821 runs - 0 likes - 19 downloads - 19 reach - 5 impact
2000 instances - 65 features - 10 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
794 runs - 0 likes - 9 downloads - 9 reach - 9 impact
2000 instances - 65 features - 2 classes - 0 missing values
The experiments were carried out with a group of 30 volunteers within an age bracket of 19-48 years. They performed a protocol of activities composed of six basic activities: three static postures…
83 runs - 0 likes - 9 downloads - 9 reach - 5 impact
180 instances - 68 features - 6 classes - 0 missing values
Fixed dataset for autoHorse.csv I suggest...
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
201 instances - 69 features - 186 classes - 0 missing values
price col is int now. autoHorse dataset
11 runs - 0 likes - 0 downloads - 0 reach - 1 impact
201 instances - 69 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
622 runs - 0 likes - 6 downloads - 6 reach - 11 impact
10108 instances - 69 features - 2 classes - 2699 missing values
No data.
37 runs - 0 likes - 2 downloads - 2 reach - 3 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
30 runs - 0 likes - 2 downloads - 2 reach - 3 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 3 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
33 runs - 0 likes - 4 downloads - 4 reach - 3 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
7303 runs - 0 likes - 12 downloads - 12 reach - 5 impact
226 instances - 70 features - 24 classes - 317 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
721 runs - 0 likes - 5 downloads - 5 reach - 9 impact
226 instances - 70 features - 2 classes - 317 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
801 runs - 0 likes - 8 downloads - 8 reach - 9 impact
841 instances - 71 features - 2 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
26477 runs - 0 likes - 7 downloads - 7 reach - 28 impact
841 instances - 71 features - 4 classes - 0 missing values
Internet Usage Data Data Type multivariate Abstract This data contains general demographic information on internet users in 1997. Sources Original Owner [1]Graphics, Visualization, & Usability Center…
0 runs - 1 likes - 5 downloads - 6 reach - 3 impact
10108 instances - 72 features - 46 classes - 2699 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The river flow datasets concern the prediction of river network flows for 48 h in the future at…
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
9125 instances - 72 features - classes - 3264 missing values
1. Title: Ozone Level Detection 2. Source: Kun Zhang zhang.kun05 '@' gmail.com Department of Computer Science, Xavier University of Louisiana Wei Fan wei.fan '@' gmail.com IBM T.J.Watson Research…
0 runs - 0 likes - 1 downloads - 1 reach - 5 impact
2536 instances - 73 features - 0 classes - 0 missing values
Dataset created to study concept drift in stream mining. It is constructed by combining the Covertype, Poker-Hand, and Electricity datasets. More details can be found in: Albert Bifet, Geoff Holmes,…
332 runs - 0 likes - 27 downloads - 27 reach - 4 impact
1455525 instances - 73 features - 10 classes - 0 missing values
Forecasting skewed biased stochastic ozone days: analyses, solutions and beyond, Knowledge and Information Systems, Vol. 14, No. 3, 2008. 1. Abstract: Two ground ozone level data sets are included in…
184632 runs - 0 likes - 16 downloads - 16 reach - 21 impact
2534 instances - 73 features - 2 classes - 0 missing values
Public procurement data for the European Economic Area, Switzerland, and Macedonia, 2015.
0 runs - 0 likes - 1 downloads - 1 reach - 2 impact
565163 instances - 75 features - 0 classes - 15247061 missing values
No data.
405 runs - 0 likes - 7 downloads - 7 reach - 5 impact
45164 instances - 75 features - 11 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
144 instances - 77 features - 0 classes - 0 missing values
No data.
48 runs - 1 likes - 4 downloads - 5 reach - 2 impact
1000000 instances - 77 features - 10 classes - 0 missing values
No data.
290 runs - 0 likes - 5 downloads - 5 reach - 2 impact
1000000 instances - 77 features - 10 classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Supply Chain Management datasets are derived from the Trading Agent Competition in Supply…
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
8966 instances - 77 features - classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
758 runs - 0 likes - 10 downloads - 10 reach - 9 impact
2000 instances - 77 features - 2 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
36068 runs - 0 likes - 11 downloads - 11 reach - 5 impact
2000 instances - 77 features - 10 classes - 0 missing values
Abstract: This data-set contains examples of buzz events from two different social networks: Twitter, and Tom's Hardware, a forum network focusing on new technology with more conservative dynamics.…
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
583250 instances - 78 features - 0 classes - 0 missing values
Multi-label dataset. Audio dataset (emotions) consists of 593 musical files with 6 clustered emotional labels and 72 predictors. Each song can be labeled with one or more of the labels…
0 runs - 2 likes - 5 downloads - 7 reach - 3 impact
593 instances - 78 features - 2 classes - 0 missing values
Multi-label dataset. Audio dataset (emotions) consists of 593 musical files with 6 clustered emotional labels and 72 predictors. Each song can be labeled with one or more of the labels…
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
593 instances - 78 features - classes - 0 missing values
Multi-label dataset. Audio dataset (emotions) consists of 593 musical files with 6 clustered emotional labels and 72 predictors. Each song can be labeled with one or more of the labels…
0 runs - 0 likes - 1 downloads - 1 reach - 3 impact
593 instances - 78 features - classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
425240 instances - 79 features - 2 classes - 2734000 missing values
Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1460 instances - 80 features - 0 classes - 6965 missing values
This is a meta-dataset which describes the SVM hyperparameter tuning problem. The target attribute indicates whether tuning is required or default hyperparameter values are enough for each dataset…
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
156 instances - 81 features - 2 classes - 0 missing values
This is a meta-dataset which describes the SVM hyperparameter tuning problem. The target attribute indicates whether tuning is required or default hyperparameter values are enough for each dataset…
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
156 instances - 81 features - 2 classes - 0 missing values
Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's…
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1460 instances - 81 features - 0 classes - 6965 missing values
Two colour spotted cDNA array data set of a series of experiments to identify which genes in Yeast are cell cycle regulated.
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
6178 instances - 82 features - classes - 59017 missing values
Expression levels of 77 proteins measured in the cerebral cortex of 8 classes of control and Down syndrome mice exposed to context fear conditioning, a task used to assess associative learning. The…
7062 runs - 0 likes - 0 downloads - 0 reach - 12 impact
1080 instances - 82 features - 8 classes - 1396 missing values
Information about customers consists of 86 variables and includes product usage data and socio-demographic data derived from zip area codes. The data was supplied by the Dutch data mining company…
0 runs - 0 likes - 3 downloads - 3 reach - 6 impact
9822 instances - 86 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
1000000 instances - 91 features - 0 classes - 0 missing values
* Dataset Title: Robot Execution Failures Data Set * Abstract: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque…
71 runs - 0 likes - 2 downloads - 2 reach - 5 impact
47 instances - 91 features - 5 classes - 0 missing values
* Dataset Title: Robot Execution Failures Data Set * Abstract: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque…
71 runs - 0 likes - 0 downloads - 0 reach - 5 impact
47 instances - 91 features - 4 classes - 0 missing values
This is a meta-dataset which describes the SVM hyperparameter tuning problem. The target attribute indicates whether tuning is required or default hyperparameter values are enough for each dataset…
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
156 instances - 91 features - 2 classes - 0 missing values
* Dataset Title: Robot Execution Failures Data Set * Abstract: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque…
71 runs - 0 likes - 2 downloads - 2 reach - 6 impact
88 instances - 91 features - 4 classes - 0 missing values
University of Sao Paulo, School of Art, Sciences and Humanities, Sao Paulo, SP, Brazil ### LIBRAS Movement Database LIBRAS, acronym of the Portuguese name "LIngua BRAsileira de Sinais", is the…
0 runs - 0 likes - 4 downloads - 4 reach - 11 impact
360 instances - 91 features - 0 classes - 0 missing values
* Dataset Title: Robot Execution Failures Data Set * Abstract: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque…
129 runs - 0 likes - 3 downloads - 3 reach - 6 impact
117 instances - 91 features - 3 classes - 0 missing values
* Dataset Title: Robot Execution Failures Data Set * Abstract: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque…
130 runs - 0 likes - 6 downloads - 6 reach - 6 impact
164 instances - 91 features - 5 classes - 0 missing values
This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
0 runs - 0 likes - 0 downloads - 0 reach - 5 impact
145 instances - 95 features - 0 classes - 0 missing values
This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
747 runs - 0 likes - 7 downloads - 7 reach - 8 impact
145 instances - 95 features - 2 classes - 0 missing values
This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
765 runs - 0 likes - 7 downloads - 7 reach - 8 impact
145 instances - 95 features - 2 classes - 0 missing values
Source: Creators : François Kawala (1,2) Ahlame Douzal (1) Eric Gaussier (1) Eustache Diemert (2) Institutions : (1) Université Joseph Fourier (Grenoble I) Laboratoire d'informatique de…
0 runs - 0 likes - 1 downloads - 1 reach - 3 impact
28179 instances - 97 features - classes - 0 missing values
This data was collected from combined primary and secondary sources, through questionnaires, verbal interviews, and part of the hospital’s record department’s data, from the selected…
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
281 instances - 98 features - 2 classes - 2 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 5 impact
100 instances - 101 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 5 impact
250 instances - 101 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs - 0 likes - 0 downloads - 0 reach - 5 impact
500 instances - 101 features - 0 classes - 0 missing values
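
Many entries in this listing are described as binarized versions whose multi-class target is re-labeled so that the majority class becomes positive ('P'). The following is a minimal Python sketch of that re-labeling, not the code used to produce these datasets. The function name `binarize_majority` is hypothetical, and mapping every other class to 'N' is an assumption, since the descriptions are truncated before they name the negative label.

```python
from collections import Counter


def binarize_majority(labels):
    """Relabel the most frequent class as 'P' and all other classes as 'N'.

    The 'N' label is an assumption; the listing only confirms 'P'.
    Ties are broken in favour of the class seen first in `labels`.
    """
    if not labels:
        return []
    # most_common(1) returns a list with one (label, count) pair.
    majority, _count = Counter(labels).most_common(1)[0]
    return ["P" if y == majority else "N" for y in labels]


if __name__ == "__main__":
    # Toy multi-class target with three classes; 'a' is the majority.
    target = ["a", "b", "a", "c", "a", "b"]
    print(binarize_majority(target))
```

A binarized target produced this way always has at most two classes, which is consistent with the "2 classes" figure shown for the binarized entries.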