Data
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
817 runs - 0 likes - 7 downloads - 7 reach - 15 impact
250 instances - 51 features - 2 classes - 0 missing values
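The mean-based binarization named in these descriptions could be sketched roughly as below. The description is truncated, so the direction of the split (values above the mean become the positive class), the treatment of ties, and the label names `P` and `N` are assumptions made for illustration, not the procedure actually used to build these datasets.

```python
from statistics import mean


def binarize_by_mean(targets, high_label="P", low_label="N"):
    """Turn a numeric target into a two-class nominal target.

    Assumption: values strictly above the mean get `high_label`,
    everything else (including ties) gets `low_label`.
    """
    threshold = mean(targets)
    return [high_label if t > threshold else low_label for t in targets]


# The mean of this toy target is 4.0, so only the last value is above it.
labels = binarize_by_mean([1.0, 2.0, 3.0, 10.0])
print(labels)
```

A split at the mean can give unbalanced classes when the target is skewed, as in the toy input here.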
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
773 runs - 0 likes - 6 downloads - 6 reach - 14 impact
100 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
764 runs - 0 likes - 6 downloads - 6 reach - 14 impact
100 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
807 runs - 0 likes - 7 downloads - 7 reach - 15 impact
500 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
766 runs - 0 likes - 7 downloads - 7 reach - 14 impact
100 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
646 runs - 0 likes - 9 downloads - 9 reach - 15 impact
1000 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
781 runs - 0 likes - 8 downloads - 8 reach - 15 impact
500 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
786 runs - 0 likes - 6 downloads - 6 reach - 15 impact
250 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
755 runs - 0 likes - 6 downloads - 6 reach - 15 impact
250 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
636 runs - 0 likes - 8 downloads - 8 reach - 15 impact
1000 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
614 runs - 0 likes - 9 downloads - 9 reach - 15 impact
1000 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
801 runs - 0 likes - 9 downloads - 9 reach - 15 impact
500 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
748 runs - 0 likes - 6 downloads - 6 reach - 15 impact
250 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
784 runs - 0 likes - 7 downloads - 7 reach - 15 impact
500 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
621 runs - 0 likes - 8 downloads - 8 reach - 15 impact
1000 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
775 runs - 0 likes - 6 downloads - 6 reach - 15 impact
250 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
788 runs - 0 likes - 7 downloads - 7 reach - 14 impact
100 instances - 51 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
810 runs - 0 likes - 6 downloads - 6 reach - 14 impact
100 instances - 51 features - 2 classes - 0 missing values
uci
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
101766 instances - 52 features - classes - 192849 missing values
Source: James P Bridge, Sean B Holden and Lawrence C Paulson University of Cambridge Computer Laboratory William Gates Building 15 JJ Thomson Avenue Cambridge CB3 0FD UK +44 (0)1223 763500…
26323 runs - 1 likes - 21 downloads - 22 reach - 43 impact
6118 instances - 52 features - 6 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
31 instances - 54 features - 0 classes - 0 missing values
This is the original version of the famous covertype dataset in ARFF format. Predicting forest cover type from cartographic variables only (no remotely sensed data). The actual forest cover type for a…
9 runs - 1 likes - 14 downloads - 15 reach - 23 impact
581012 instances - 55 features - 7 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
12 runs - 0 likes - 1 downloads - 1 reach - 18 impact
83733 instances - 55 features - 4 classes - 0 missing values
### Data Set Information: Predicting forest cover type from cartographic variables only (no remotely sensed data). The actual forest cover type for a given observation (30 x 30 meter cell) was…
342 runs - 1 likes - 39 downloads - 40 reach - 12 impact
581012 instances - 55 features - 7 classes - 0 missing values
Predicting forest cover type from cartographic variables only (no remotely sensed data). The actual forest cover type for a given observation (30 x 30 meter cell) was determined from US Forest Service…
216 runs - 0 likes - 11 downloads - 11 reach - 12 impact
110393 instances - 55 features - 7 classes - 0 missing values
This is the famous covertype dataset in its binary version, retrieved 2013-11-13 from the libSVM site (called covtype.binary there). In addition to the preprocessing done there (see LibSVM site for…
22 runs - 0 likes - 9 downloads - 9 reach - 15 impact
581012 instances - 55 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
733 runs - 0 likes - 9 downloads - 9 reach - 16 impact
7485 instances - 56 features - 2 classes - 32427 missing values
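The majority-class binarization named in these descriptions could be sketched roughly as below. The label `P` for the majority class comes from the description; the description is truncated after that, so labeling every other class `N` and breaking ties by first appearance are assumptions made for illustration only.

```python
from collections import Counter


def binarize_by_majority(labels, positive="P", negative="N"):
    """Turn a multi-class target into a two-class nominal target.

    The most frequent class becomes `positive`. Assumption: all
    remaining classes are merged into a single `negative` class.
    """
    # most_common(1) returns [(label, count)] for the most frequent label;
    # among equal counts the label seen first wins.
    majority, _ = Counter(labels).most_common(1)[0]
    return [positive if y == majority else negative for y in labels]


# "a" occurs twice and is the majority class in this toy target.
print(binarize_by_majority(["a", "b", "a", "c"]))
```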
Number of Samples and Design Method of Classifier on the Plane", Pattern Recognition, Vol. 24, No. 4, pp. 317-324, 1991. Lung Cancer Data * Past Usage: - Hong, Z.Q. and Yang, J.Y. "Optimal…
1238 runs - 0 likes - 19 downloads - 19 reach - 13 impact
32 instances - 57 features - 3 classes - 5 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
754 runs - 0 likes - 10 downloads - 10 reach - 16 impact
8844 instances - 57 features - 2 classes - 34843 missing values
No data.
219 runs - 0 likes - 4 downloads - 4 reach - 11 impact
1000000 instances - 58 features - 2 classes - 0 missing values
Automated file upload of BNG(spambase)
98 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 58 features - 2 classes - 0 missing values
Training dataset of the "Porto Seguro's Safe Driver Prediction" Kaggle challenge [https://www.kaggle.com/c/porto-seguro-safe-driver-prediction]. The goal was to predict whether a driver will file an…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
595212 instances - 58 features - 2 classes - 846458 missing values
SPAM E-mail Database The "spam" concept is diverse: advertisements for products/websites, make money fast schemes, chain letters, pornography... Our collection of spam e-mails came from our postmaster…
161528 runs - 5 likes - 89 downloads - 94 reach - 12 impact
4601 instances - 58 features - 2 classes - 0 missing values
Compilation of promoters with known transcriptional start points for E. coli genes. The task is to recognize promoters in strings that represent nucleotides (one of A, G, T, or C). A promoter is a…
138 runs - 1 likes - 9 downloads - 10 reach - 12 impact
106 instances - 58 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
173 runs - 0 likes - 6 downloads - 6 reach - 24 impact
106 instances - 58 features - 2 classes - 0 missing values
Version with url set as row id, creator data missing due to bad formatting. **Author**: Kelwin Fernandes (INESC TEC, Universidade do Porto), Pedro Vinagre (ALGORITMI Research Centre, Universidade do…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
39644 instances - 60 features - 0 classes - 0 missing values
Dataset from the LIBSVM data repository (libSVM, AAD group). Preprocessing: scaled to [-1,1]
0 runs - 0 likes - 0 downloads - 0 reach - 16 impact
3175 instances - 61 features - 0 classes - 0 missing values
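Scaling to [-1,1], as mentioned in the entry above, is commonly done with a linear per-feature min-max transform. The sketch below assumes that approach; the listing does not say how the scaling was actually carried out, and the function name and the choice of 0.0 for constant columns are hypothetical.

```python
def scale_to_unit_range(column):
    """Linearly map one feature column onto [-1, 1].

    Assumption: per-feature min-max scaling, where the column
    minimum maps to -1 and the column maximum maps to 1.
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread; map it to 0.0 (a choice made here).
        return [0.0 for _ in column]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in column]


print(scale_to_unit_range([0.0, 5.0, 10.0]))
```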
This dataset summarizes a heterogeneous set of features about articles published by Mashable in a period of two years. The goal is to predict the number of shares in social networks (popularity). *…
0 runs - 0 likes - 5 downloads - 5 reach - 11 impact
39644 instances - 61 features - 0 classes - 0 missing values
Test dataset
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
15547 instances - 61 features - 0 classes - 280 missing values
Test dataset
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
15547 instances - 61 features - 0 classes - 280 missing values
Test dataset
0 runs - 0 likes - 0 downloads - 0 reach - 13 impact
15547 instances - 61 features - 0 classes - 280 missing values
Test dataset
3 runs - 0 likes - 0 downloads - 0 reach - 15 impact
15547 instances - 61 features - 2 classes - 280 missing values
No data.
296 runs - 0 likes - 7 downloads - 7 reach - 9 impact
1000000 instances - 61 features - 2 classes - 0 missing values
No data.
50 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 61 features - 2 classes - 0 missing values
The problem is to learn a regression equation/rule/tree to predict the activity from the descriptive structural attributes. The data and methodology are described in detail in: - King, Ross D., Hurst,…
5 runs - 0 likes - 1 downloads - 1 reach - 9 impact
186 instances - 61 features - 0 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 0 downloads - 0 reach - 16 impact
416188 instances - 61 features - 355 classes - 0 missing values
Primate splice-junction gene sequences (DNA) with associated imperfect domain theory. Splice junctions are points on a DNA sequence at which 'superfluous' DNA is removed during the process of protein…
24188 runs - 1 likes - 17 downloads - 18 reach - 9 impact
3190 instances - 61 features - 3 classes - 0 missing values
NAME: Sonar, Mines vs. Rocks SUMMARY: This is the data set used by Gorman and Sejnowski in their study of the classification of sonar signals using a neural network [1]. The task is to train a network…
2366 runs - 1 likes - 25 downloads - 26 reach - 9 impact
208 instances - 61 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
744 runs - 0 likes - 8 downloads - 8 reach - 16 impact
7019 instances - 61 features - 2 classes - 43814 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
806 runs - 0 likes - 8 downloads - 8 reach - 15 impact
186 instances - 61 features - 2 classes - 0 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
354 runs - 0 likes - 7 downloads - 7 reach - 14 impact
7485 instances - 61 features - 7 classes - 52048 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
135 runs - 0 likes - 9 downloads - 9 reach - 15 impact
3190 instances - 61 features - 2 classes - 0 missing values
### Description Synthetic Control Chart Time Series. This is actually time series classification. ### Sources ``` * Original Owner and Donor Dr Robert Alcock rob@skyblue.csd.auth.gr ``` ### Dataset…
20355 runs - 0 likes - 10 downloads - 10 reach - 50 impact
600 instances - 61 features - 6 classes - 0 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
366 runs - 0 likes - 10 downloads - 10 reach - 14 impact
8844 instances - 61 features - 7 classes - 51515 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
434 runs - 0 likes - 10 downloads - 10 reach - 14 impact
7019 instances - 61 features - 8 classes - 48089 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
169 runs - 0 likes - 8 downloads - 8 reach - 16 impact
600 instances - 61 features - 2 classes - 0 missing values
CD4 count prediction date
0 runs - 0 likes - 0 downloads - 0 reach - 10 impact
16484 instances - 62 features - classes - 0 missing values
This work was partially supported by national funds through FCT and IST through the UID/EEA/50009/2013 project, BL89/2017-IST-ID grant. In this dataset, we present usability (SUS), workload…
0 runs - 0 likes - 0 downloads - 0 reach - 7 impact
31 instances - 62 features - classes - 0 missing values
No data.
996 runs - 0 likes - 4 downloads - 4 reach - 12 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
948 runs - 0 likes - 5 downloads - 5 reach - 12 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
949 runs - 0 likes - 4 downloads - 4 reach - 12 impact
74 instances - 63 features - 4 classes - 0 missing values
No data.
882 runs - 0 likes - 6 downloads - 6 reach - 12 impact
71 instances - 63 features - 6 classes - 0 missing values
No data.
194 runs - 0 likes - 3 downloads - 3 reach - 11 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
52 runs - 0 likes - 2 downloads - 2 reach - 10 impact
1000000 instances - 65 features - 10 classes - 0 missing values
No data.
50 runs - 0 likes - 1 downloads - 1 reach - 11 impact
1000000 instances - 65 features - 10 classes - 0 missing values
Automated file upload of BNG(optdigits)
100 runs - 1 likes - 1 downloads - 2 reach - 11 impact
1000000 instances - 65 features - 10 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Shape). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143288 runs - 1 likes - 39 downloads - 40 reach - 416 impact
1600 instances - 65 features - 100 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Margin). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143050 runs - 1 likes - 17 downloads - 18 reach - 418 impact
1600 instances - 65 features - 100 classes - 0 missing values
### Description One-hundred plant species leaves dataset (Class = Texture). ### Sources ``` (a) Original owners of colour Leaves Samples: James Cope, Thibaut Beghin, Paolo Remagnino, Sarah Barman. The…
143077 runs - 2 likes - 66 downloads - 68 reach - 418 impact
1599 instances - 65 features - 100 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
38639 runs - 0 likes - 20 downloads - 20 reach - 12 impact
2000 instances - 65 features - 10 classes - 0 missing values
1. Title of Database: Optical Recognition of Handwritten Digits 2. Source: E. Alpaydin, C. Kaynak Department of Computer Engineering Bogazici University, 80815 Istanbul Turkey alpaydin@boun.edu.tr…
35799 runs - 3 likes - 22 downloads - 25 reach - 12 impact
5620 instances - 65 features - 10 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
765 runs - 0 likes - 12 downloads - 12 reach - 15 impact
5620 instances - 65 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
794 runs - 0 likes - 9 downloads - 9 reach - 15 impact
2000 instances - 65 features - 2 classes - 0 missing values
Data reported to the police about the circumstances of personal injury road accidents in Great Britain from 1979, and the make and model information of vehicles involved in the respective accident.…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
363243 instances - 67 features - 3 classes - 2181757 missing values
The experiments were carried out with a group of 30 volunteers within an age bracket of 19-48 years. They performed a protocol of activities composed of six basic activities: three static postures…
83 runs - 0 likes - 9 downloads - 9 reach - 12 impact
180 instances - 68 features - 6 classes - 0 missing values
Fixed dataset for autoHorse.csv I suggest...
0 runs - 0 likes - 0 downloads - 0 reach - 11 impact
201 instances - 69 features - 186 classes - 0 missing values
price col is int now. autoHorse dataset
15 runs - 0 likes - 0 downloads - 0 reach - 12 impact
201 instances - 69 features - 0 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
622 runs - 0 likes - 6 downloads - 6 reach - 17 impact
10108 instances - 69 features - 2 classes - 2699 missing values
No data.
33 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
30 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
31 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
No data.
37 runs - 0 likes - 2 downloads - 2 reach - 12 impact
1000000 instances - 70 features - 24 classes - 0 missing values
This database is a standardized version of the original audiology database (see audiology.* in this directory). The non-standard set of attributes have been converted to a standard set of attributes…
7303 runs - 0 likes - 12 downloads - 12 reach - 12 impact
226 instances - 70 features - 24 classes - 317 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
721 runs - 0 likes - 5 downloads - 5 reach - 15 impact
226 instances - 70 features - 2 classes - 317 missing values
A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on…
28846 runs - 0 likes - 8 downloads - 8 reach - 34 impact
841 instances - 71 features - 4 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
801 runs - 0 likes - 8 downloads - 8 reach - 15 impact
841 instances - 71 features - 2 classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The river flow datasets concern the prediction of river network flows for 48 h in the future at…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
9125 instances - 72 features - classes - 3264 missing values
### Internet Usage Data #### Data Type multivariate #### Abstract This data contains general demographic information on internet users in 1997. ### Data Characteristics This data comes from a survey…
0 runs - 1 likes - 6 downloads - 7 reach - 12 impact
10108 instances - 72 features - 46 classes - 2699 missing values
Dataset created to study concept drift in stream mining. It is constructed by combining the Covertype, Poker-Hand, and Electricity datasets. More details can be found in: Albert Bifet, Geoff Holmes,…
332 runs - 0 likes - 27 downloads - 27 reach - 12 impact
1455525 instances - 73 features - 10 classes - 0 missing values
1. Title: Ozone Level Detection 2. Source: Kun Zhang zhang.kun05 '@' gmail.com Department of Computer Science, Xavier University of Louisiana Wei Fan wei.fan '@' gmail.com IBM T.J.Watson Research…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
2536 instances - 73 features - 0 classes - 0 missing values
Forecasting skewed biased stochastic ozone days: analyses, solutions and beyond, Knowledge and Information Systems, Vol. 14, No. 3, 2008. 1. Abstract: Two ground ozone level data sets are included in…
187955 runs - 1 likes - 19 downloads - 20 reach - 28 impact
2534 instances - 73 features - 2 classes - 0 missing values
Public procurement data for the European Economic Area, Switzerland, and Macedonia, 2015.
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
565163 instances - 75 features - 0 classes - 15247061 missing values
No data.
405 runs - 0 likes - 7 downloads - 7 reach - 12 impact
45164 instances - 75 features - 11 classes - 0 missing values
No data.
290 runs - 0 likes - 5 downloads - 5 reach - 11 impact
1000000 instances - 77 features - 10 classes - 0 missing values
No data.
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
144 instances - 77 features - 0 classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Supply Chain Management datasets are derived from the Trading Agent Competition in Supply…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
8966 instances - 77 features - classes - 0 missing values
No data.
48 runs - 1 likes - 4 downloads - 5 reach - 12 impact
1000000 instances - 77 features - 10 classes - 0 missing values
One of a set of 6 datasets describing features of handwritten numerals (0 - 9) extracted from a collection of Dutch utility maps. Corresponding patterns in different datasets correspond to the same…
38026 runs - 0 likes - 12 downloads - 12 reach - 12 impact
2000 instances - 77 features - 10 classes - 0 missing values