Data
Filter results by:
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
736 runs0 likes6 downloads6 reach9 impact
364 instances - 33 features - 2 classes - 80 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
490 runs0 likes4 downloads4 reach7 impact
364 instances - 33 features - 6 classes - 101 missing values
Data file: This data from "Problem-Solving" on "backache in pregnancy" is in somewhat different format from that listed in the book. Each integer is preceded by a space. This makes it easier to read.…
174 runs0 likes6 downloads6 reach9 impact
180 instances - 33 features - 2 classes - 0 missing values
#sbox
0 runs0 likes0 downloads0 reach0 impact
10000 instances - 32 features - classes - 0 missing values
White Clover Persistence Trials Data source: Ian Tarbotton AgResearch, Whatawhata Research Centre, Hamilton, New Zealand The objective was to determine the mechanisms which influence the persistence…
858 runs0 likes5 downloads5 reach9 impact
63 instances - 32 features - 4 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
732 runs0 likes5 downloads5 reach8 impact
63 instances - 32 features - 2 classes - 0 missing values
Om algos te testen
74 runs0 likes5 downloads5 reach7 impact
14240 instances - 31 features - 2 classes - 0 missing values
The YouTube personality dataset consists of a collection of behavorial features, speech transcriptions, and personality impression scores for a set of 404 YouTube vloggers that explicitly show…
0 runs0 likes0 downloads0 reach2 impact
404 instances - 31 features - classes - 0 missing values
The YouTube personality dataset consists of a collection of behavorial features, speech transcriptions, and personality impression scores for a set of 404 YouTube vloggers that explicitly show…
0 runs0 likes0 downloads0 reach2 impact
404 instances - 31 features - classes - 0 missing values
Anonymized data of dating profiles from OkCupid
0 runs0 likes1 downloads1 reach1 impact
59946 instances - 31 features - 0 classes - 273249 missing values
The datasets contains transactions made by credit cards in September 2013 by european cardholders. This dataset present transactions that occurred in two days, where we have 492 frauds out of 284,807…
355 runs0 likes55 downloads55 reach12 impact
284807 instances - 31 features - 2 classes - 0 missing values
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
0 runs0 likes0 downloads0 reach6 impact
31 instances - 31 features - 0 classes - 33 missing values
Source: Rami Mustafa A Mohammad ( University of Huddersfield, rami.mohammad '@' hud.ac.uk, rami.mustafa.a '@' gmail.com) Lee McCluskey (University of Huddersfield,t.l.mccluskey '@' hud.ac.uk ) Fadi…
51012 runs1 likes21 downloads22 reach19 impact
11055 instances - 31 features - 2 classes - 0 missing values
Context It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. Content The…
0 runs0 likes0 downloads0 reach1 impact
284807 instances - 31 features - 2 classes - 0 missing values
Current dataset was adapted to ARFF format from the UCI version. Sample code ID's were removed. ! Note that there is also a related Breast Cancer Wisconsin (Original) Data Set with a different set of…
225296 runs3 likes35 downloads38 reach21 impact
569 instances - 31 features - 2 classes - 0 missing values
Context It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. Content The…
0 runs0 likes4 downloads4 reach2 impact
284807 instances - 31 features - 0 classes - 0 missing values
No data.
73 runs0 likes5 downloads5 reach1 impact
1000000 instances - 30 features - 2 classes - 0 missing values
No data.
65 runs0 likes5 downloads5 reach1 impact
1000000 instances - 30 features - 4 classes - 0 missing values
allbp-pmlb
31 runs0 likes1 downloads1 reach11 impact
3772 instances - 30 features - 3 classes - 0 missing values
allrep-pmlb
31 runs0 likes0 downloads0 reach11 impact
3772 instances - 30 features - 4 classes - 0 missing values
dis-pmlb
31 runs0 likes0 downloads0 reach12 impact
3772 instances - 30 features - 2 classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Water Quality dataset (Dzeroski et al. 2000) has 14 target attributes that refer to the…
0 runs0 likes0 downloads0 reach2 impact
1060 instances - 30 features - classes - 0 missing values
Embedding of atoms for HIV inhibitors dataser
0 runs0 likes0 downloads0 reach0 impact
1069964 instances - 30 features - classes - 0 missing values
Embedding of molecules bonds in HIV inhibitors dataset
0 runs0 likes0 downloads0 reach0 impact
1151940 instances - 30 features - classes - 0 missing values
The sick dataset from the OpenCC18 with all categorical data label encoded so all data is numeric
0 runs0 likes0 downloads0 reach1 impact
3772 instances - 30 features - classes - 0 missing values
Sick dataset from the opencc18 with all textual binary variables label encoded.
1 runs0 likes0 downloads0 reach2 impact
3772 instances - 30 features - 2 classes - 0 missing values
Re-upload of the dataset as it is present in the Penn ML Benchmark (https://github.com/EpistasisLab/penn-ml-benchmarks/tree/master/datasets/classification/fars). It's a dataset on traffic accidents,…
1 runs0 likes1 downloads1 reach15 impact
100968 instances - 30 features - 8 classes - 0 missing values
No data.
253 runs0 likes9 downloads9 reach2 impact
1076790 instances - 30 features - 2 classes - 7275 missing values
ARFF version of UCI dataset 'flags'. Creators: Collected primarily from the "Collins Gem Guide to Flags": Collins Publishers (1986). Donor: Richard S. Forsyth. Date 5/15/1990 This data file contains…
108 runs0 likes8 downloads8 reach11 impact
194 instances - 30 features - 8 classes - 0 missing values
; ; Thyroid disease records supplied by the Garavan Institute and J. Ross ; Quinlan, New South Wales Institute, Syndney, Australia. ; ; 1987. ; hypothyroid, primary hypothyroid, compensated…
883 runs0 likes11 downloads11 reach3 impact
3772 instances - 30 features - 4 classes - 6064 missing values
Attribute information: ``` sick, negative. | classes age: continuous. sex: M, F. on thyroxine: f, t. query on thyroxine: f, t. on antithyroid medication: f, t. sick: f, t. pregnant: f, t. thyroid…
19940 runs0 likes31 downloads31 reach3 impact
3772 instances - 30 features - 2 classes - 6064 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
140 runs0 likes6 downloads6 reach9 impact
194 instances - 30 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
737 runs0 likes9 downloads9 reach9 impact
3772 instances - 30 features - 2 classes - 6064 missing values
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %% This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
756 runs0 likes8 downloads8 reach8 impact
121 instances - 30 features - 2 classes - 0 missing values
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %% This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
789 runs0 likes9 downloads9 reach8 impact
101 instances - 30 features - 2 classes - 0 missing values
No data.
794 runs1 likes13 downloads14 reach8 impact
107 instances - 30 features - 2 classes - 0 missing values
No data.
726 runs0 likes9 downloads9 reach8 impact
36 instances - 30 features - 2 classes - 0 missing values
No data.
718 runs0 likes5 downloads5 reach8 impact
63 instances - 30 features - 2 classes - 0 missing values
Data Set Information: The data has been produced using Monte Carlo simulations. The first 21 features (columns 2-22) are kinematic properties measured by the particle detectors in the accelerator. The…
0 runs1 likes5 downloads6 reach5 impact
98050 instances - 29 features - 0 classes - 9 missing values
Source: 1. Muhammad Naeem, Centre of Research in Data Engineering(CORDE) & Department of Computer Science, MAJU Islamabad Pakistan(naeems.naeem '@' gmail.com). 2. Sohail Asghar, Director/Associate…
0 runs0 likes1 downloads1 reach3 impact
65554 instances - 29 features - classes - 0 missing values
Source: 1. Olcay KURSUN, PhD., Istanbul University, Department of Computer Engineering, 34320, Istanbul, Turkey Phone: +90 (212) 473 7070 - 17827 Email: okursun '@' istanbul.edu.tr 2. Betul ERDOGDU…
0 runs0 likes3 downloads3 reach3 impact
1039 instances - 29 features - classes - 0 missing values
### Attribute Information * The first column is the class label (1 for signal, 0 for background) * 21 low-level features (kinematic properties): lepton pT, lepton eta, lepton phi, missing energy…
14236 runs1 likes9 downloads10 reach19 impact
98050 instances - 29 features - 2 classes - 9 missing values
No data.
70 runs0 likes3 downloads3 reach1 impact
1000000 instances - 28 features - 2 classes - 0 missing values
No data.
163 runs0 likes5 downloads5 reach1 impact
1000000 instances - 28 features - 2 classes - 0 missing values
__Changes w.r.t. version 1: included one target factor with 7 levels as target variable for the classification. Also deleted the previous 7 binary target variables.__ A dataset of steel plates'…
7374 runs1 likes2 downloads3 reach8 impact
1941 instances - 28 features - 7 classes - 0 missing values
The task consists of Learning Quantitative Structure Activity Relationships (QSARs). The Inhibition of Dihydrofolate Reductase by Pyrimidines.The data are described in: King, Ross .D., Muggleton,…
6 runs0 likes2 downloads2 reach2 impact
74 instances - 28 features - 0 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
10 runs0 likes1 downloads1 reach9 impact
65196 instances - 28 features - 100 classes - 0 missing values
The midwest survey dataset contain individual responses from surveys about regional identification conducted for FiveThirtyEight by SurveyMonkey.
0 runs0 likes0 downloads0 reach0 impact
2778 instances - 28 features - 10 classes - 1744 missing values
The midwest survey dataset contain individual responses from surveys about regional identification conducted for FiveThirtyEight by SurveyMonkey.
0 runs0 likes0 downloads0 reach0 impact
2778 instances - 28 features - 10 classes - 1744 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs0 likes0 downloads0 reach5 impact
163 instances - 28 features - 5 classes - 9 missing values
Jarkko Salojarvi, Kai Puolamaki, Jaana Simola, Lauri Kovanen, Ilpo Kojo, Samuel Kaski. Inferring Relevance from Eye Movements: Feature Extraction. Helsinki University of Technology, Publications in…
440 runs0 likes11 downloads11 reach9 impact
10936 instances - 28 features - 3 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
734 runs0 likes6 downloads6 reach8 impact
74 instances - 28 features - 2 classes - 0 missing values
This is a sesnor data for test it is not complete.
0 runs0 likes4 downloads4 reach3 impact
127591 instances - 27 features - classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
32 runs0 likes7 downloads7 reach5 impact
2800 instances - 27 features - 5 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
31 runs1 likes8 downloads9 reach5 impact
2800 instances - 27 features - 5 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
31 runs1 likes8 downloads9 reach5 impact
2800 instances - 27 features - 5 classes - 0 missing values
source: An Algorithm Selection Benchmark for the Container Pre-Marshalling Problem (CPMP) authors: K. Tierney and Y. Malitsky (features) / K. Tierney and D. Pacino and S. Voss (algorithms) translator…
20 runs0 likes0 downloads0 reach3 impact
2108 instances - 27 features - 0 classes - 0 missing values
source: An Algorithm Selection Benchmark for the Container Pre-Marshalling Problem (CPMP) authors: K. Tierney and Y. Malitsky (features) / K. Tierney and D. Pacino and S. Voss (algorithms) translator…
14 runs0 likes0 downloads0 reach2 impact
527 instances - 27 features - 4 classes - 0 missing values
uci
0 runs0 likes0 downloads0 reach1 impact
30000 instances - 27 features - classes - 0 missing values
UCI Thyroid allbp dataset.
97 runs0 likes9 downloads9 reach7 impact
2800 instances - 27 features - 5 classes - 0 missing values
General Description of Thyroid Disease Databases and Related Files This directory contains 6 databases, corresponding test set, and corresponding documentation. They were left at the University of…
92 runs0 likes5 downloads5 reach7 impact
2800 instances - 27 features - 5 classes - 0 missing values
Donor: Will Taylor (taylor@pluto.arc.nasa.gov) Database of surgeries on horses. Possible class attributes: 24 (whether lesion is surgical), others include: 23, 25, 26, and 27 Notes: * Hospital_Number…
236 runs0 likes9 downloads9 reach3 impact
368 instances - 27 features - 2 classes - 1927 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach5 impact
250 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach5 impact
1000 instances - 26 features - 0 classes - 0 missing values
No data.
32 runs0 likes1 downloads1 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
33 runs0 likes4 downloads4 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
29 runs0 likes6 downloads6 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
29 runs0 likes4 downloads4 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs1 likes4 downloads5 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs0 likes5 downloads5 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
28 runs0 likes3 downloads3 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs0 likes2 downloads2 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach1 impact
1000000 instances - 26 features - 0 classes - 0 missing values
No data.
63 runs0 likes3 downloads3 reach1 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
65 runs0 likes8 downloads8 reach1 impact
1000000 instances - 26 features - 7 classes - 0 missing values
No data.
27 runs1 likes3 downloads4 reach2 impact
1000000 instances - 26 features - 7 classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : This is a pre-processed version of the dataset used in Kaggles See Click Predict Fix competition…
0 runs0 likes0 downloads0 reach2 impact
1137 instances - 26 features - classes - 9255 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : This is a pre-processed version of the dataset used in Kaggles See Click Predict Fix competition…
0 runs0 likes0 downloads0 reach2 impact
1137 instances - 26 features - classes - 9255 missing values
This dataset is an artificial simulation of the Duffing system with random changes from the chaotic to the non-chaotic regime at different noise levels.
0 runs0 likes0 downloads0 reach1 impact
2493200 instances - 26 features - classes - 0 missing values
This dataset reflects incidents of crime in the City of Los Angeles dating back to 2010. This data is transcribed from original crime reports that are typed on paper and therefore there may be some…
0 runs0 likes0 downloads0 reach1 impact
1468825 instances - 26 features - 0 classes - 7881776 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
1 runs0 likes1 downloads1 reach6 impact
500 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach6 impact
500 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach6 impact
500 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes2 downloads2 reach6 impact
1000 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach6 impact
1000 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach6 impact
1000 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach6 impact
250 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach6 impact
250 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach6 impact
100 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach6 impact
100 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach6 impact
500 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach6 impact
250 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
21 runs0 likes0 downloads0 reach7 impact
100 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach6 impact
500 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach6 impact
100 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes1 downloads1 reach6 impact
1000 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach6 impact
100 instances - 26 features - 0 classes - 0 missing values
The Friedman datasets are 80 artificially generated datasets originating from: J.H. Friedman (1999). Stochastic Gradient Boosting The dataset names are coded as…
0 runs0 likes0 downloads0 reach6 impact
250 instances - 26 features - 0 classes - 0 missing values
This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics, (b) its assigned insurance risk rating, (c) its normalized losses in use as…
3252 runs2 likes26 downloads28 reach4 impact
205 instances - 26 features - 6 classes - 59 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
770 runs0 likes8 downloads8 reach8 impact
100 instances - 26 features - 2 classes - 0 missing values