Data
test
0 runs0 likes0 downloads0 reach0 impact
270 instances - 14 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
270 instances - 14 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
270 instances - 14 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
270 instances - 14 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
4324 instances - 9 features - classes - 3360 missing values
Pittsburgh bridges. This version is derived from version 2 (the discretized version) by removing all instances with missing values in the last (target) attribute. The bridges dataset is originally not…
31 runs0 likes3 downloads3 reach12 impact
105 instances - 13 features - 6 classes - 61 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
2 runs0 likes2 downloads2 reach6 impact
2178 instances - 4 features - 0 classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
60197 instances - 6 features - classes - 128136 missing values
Salary Emp
0 runs0 likes0 downloads0 reach0 impact
31 instances - 2 features - classes - 0 missing values
"The speech dataset was also provided by (see citation request) and contains real world data from recorded English language. The normal class contains data from persons having an American accent…
1599 runs0 likes6 downloads6 reach15 impact
3686 instances - 401 features - 2 classes - 0 missing values
DESCRIPTIVE ABSTRACT: The data set contains the oral, written and combined test scores for 2003 New Haven Fire Department promotion exams. The Race and Position for each test taker are also given.…
0 runs0 likes0 downloads0 reach0 impact
118 instances - 6 features - 2 classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
101 instances - 18 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
2178 instances - 4 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
336 instances - 8 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach0 impact
8124 instances - 23 features - classes - 2480 missing values
URL dataset
0 runs0 likes0 downloads0 reach0 impact
121001 instances - 501 features - 0 classes - 0 missing values
URL dataset 3
0 runs0 likes0 downloads0 reach0 impact
18982 instances - 80 features - 5 classes - 0 missing values
Email dataset 1a
0 runs0 likes0 downloads0 reach0 impact
4585 instances - 4 features - 0 classes - 0 missing values
Email dataset 1c
0 runs0 likes0 downloads0 reach0 impact
4585 instances - 792 features - 0 classes - 0 missing values
Email dataset 1b
0 runs0 likes0 downloads0 reach0 impact
4585 instances - 24 features - 0 classes - 161 missing values
Phishing website 1
0 runs0 likes0 downloads0 reach0 impact
11055 instances - 31 features - 0 classes - 0 missing values
URL dataset 2
0 runs0 likes0 downloads0 reach0 impact
95911 instances - 13 features - 0 classes - 0 missing values
Email dataset 1d
0 runs0 likes0 downloads0 reach0 impact
4585 instances - 11 features - 0 classes - 0 missing values
Multiclass cancer diagnosis using 16063 tumor gene expression signatures. PNAS, VOL 98, no 26, pp. 15149-15154, December 18, 2001. S. Ramaswamy, P. Tamayo, R. Rifkin, S. Mukherjee, C.-H. Yeang, M.…
116 runs0 likes8 downloads8 reach20 impact
190 instances - 16064 features - 14 classes - 0 missing values
Balanced version of click prediction data
36 runs0 likes14 downloads14 reach11 impact
1997410 instances - 12 features - 2 classes - 0 missing values
Abstract: 9-class version of the poker-hand dataset, with the minority class removed.
1 runs0 likes3 downloads3 reach12 impact
1025000 instances - 11 features - 9 classes - 0 missing values
Data used in an analysis of the Brown and Frown corpora for my doctoral dissertation titled ``Variations in Written English: Characterizing Authors' Rhetorical Language Choices Across Corpora of…
2048 runs0 likes1 downloads1 reach10 impact
1000 instances - 24 features - 30 classes - 0 missing values
Tumor-size treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
0 runs0 likes3 downloads3 reach10 impact
286 instances - 10 features - 0 classes - 9 missing values
Dataset Title: Localization Data for Person Activity Data Set. Abstract: Data contains recordings of five people performing different activities. Each person wore four sensors (tags) while performing…
6 runs0 likes6 downloads6 reach13 impact
164860 instances - 8 features - 11 classes - 0 missing values
Email dataset 1e
0 runs0 likes0 downloads0 reach0 impact
4585 instances - 580 features - 0 classes - 0 missing values
Title: South Africa Heart Disease Dataset. Description: A retrospective sample of males in a heart-disease high-risk region of the Western Cape, South Africa. There are roughly two controls per case…
155 runs0 likes14 downloads14 reach12 impact
462 instances - 10 features - 2 classes - 0 missing values
Re-upload of the dataset as it is present in the Penn ML Benchmark (https://github.com/EpistasisLab/penn-ml-benchmarks/tree/master/datasets/classification/fars). It's a dataset on traffic accidents,…
1 runs0 likes2 downloads2 reach20 impact
100968 instances - 30 features - 8 classes - 0 missing values
Email dataset 2
0 runs0 likes0 downloads0 reach0 impact
11507 instances - 4 features - 0 classes - 0 missing values
Testing dataset
0 runs0 likes0 downloads0 reach0 impact
134731 instances - 31 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
10 runs0 likes2 downloads2 reach15 impact
65196 instances - 28 features - 100 classes - 0 missing values
This data is derived from the 2012 KDD Cup. The data is subsampled to 0.1% of the original number of instances, downsampling the majority class (click=0) so that the target feature is reasonably…
63420 runs0 likes16 downloads16 reach23 impact
39948 instances - 12 features - 2 classes - 0 missing values
Human Activity Recognition (HAR) database built from the recordings of 30 subjects performing activities of daily living (ADL) while carrying a waist-mounted smartphone with embedded inertial sensors.…
24372 runs0 likes25 downloads25 reach40 impact
10299 instances - 562 features - 6 classes - 0 missing values
1. Summary: This database was generated by the Laboratory of Image Processing and Pattern Recognition (INPG-LTIRF) in the development of the Esprit project ELENA No. 6891 and the Esprit working…
20229 runs0 likes12 downloads12 reach16 impact
5500 instances - 41 features - 11 classes - 0 missing values
Data from Yahoo Finance
0 runs0 likes0 downloads0 reach0 impact
1259 instances - 7 features - classes - 0 missing values
MY Dataset
0 runs0 likes0 downloads0 reach0 impact
120 instances - 7 features - classes - 0 missing values
This is weather data in ARFF format
0 runs0 likes0 downloads0 reach0 impact
14 instances - 5 features - classes - 0 missing values
Multi-label dataset. The UC Berkeley enron4 dataset represents a subset of the original enron5 dataset and consists of 1684 cases of emails with 21 labels and 1001 predictor variables.
1 runs0 likes4 downloads4 reach12 impact
1702 instances - 1054 features - 2 classes - 0 missing values
this is test data
0 runs0 likes0 downloads0 reach0 impact
5 instances - 5 features - classes - 0 missing values
test data
0 runs0 likes0 downloads0 reach0 impact
2 instances - 5 features - classes - 0 missing values
sample
0 runs0 likes0 downloads0 reach0 impact
14 instances - 5 features - classes - 0 missing values
AutoML challenge 2014. Original task: regression. Test and validation sets can be obtained on the ChaLearn website: https://automl.chalearn.org/data
0 runs0 likes0 downloads0 reach0 impact
99 instances - 200001 features - 0 classes - 0 missing values
AutoML challenge 2014. Original task: regression. Test and validation sets can be obtained on the ChaLearn website: https://automl.chalearn.org/data
0 runs0 likes0 downloads0 reach0 impact
400000 instances - 101 features - 0 classes - 0 missing values
Coal mining requires working in hazardous conditions. Miners in an underground coal mine can face several threats, such as methane explosions or rock bursts. To provide protection for people…
0 runs0 likes0 downloads0 reach0 impact
test3
0 runs0 likes0 downloads0 reach0 impact
2 instances - 8 features - classes - 0 missing values
newtest3
0 runs0 likes0 downloads0 reach0 impact
2 instances - 6 features - classes - 0 missing values
This work was partially supported by national funds through FCT and IST through the UID/EEA/50009/2013 project and the BL89/2017-IST-ID grant. In this dataset, we present usability (SUS), workload…
0 runs0 likes0 downloads0 reach5 impact
31 instances - 62 features - classes - 0 missing values
Global soil saturated hydraulic conductivity measurements for geoscience applications. Total of 1,832 sites with 13,072 Ksat measurements were assembled from published literature and other sources and…
0 runs0 likes1 downloads1 reach5 impact
This dataset contains 10962 houses to rent with 13 different features. Some values in the dataset can be considered outliers for further analyses. Bear in mind that the Web Crawler was used only to…
0 runs0 likes0 downloads0 reach0 impact
10692 instances - 13 features - 0 classes - 0 missing values
Context: It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. Content: The…
0 runs0 likes7 downloads7 reach6 impact
284807 instances - 31 features - 0 classes - 0 missing values
Context: It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. Content: The…
0 runs0 likes1 downloads1 reach6 impact
284807 instances - 31 features - 2 classes - 0 missing values
Global soil hydraulic properties (Ksat, Water Content 33 kPa <2mm, Water Content 1500 kPa <2mm) for geoscience applications. Total of 155,649 measurements were assembled from published…
0 runs0 likes1 downloads1 reach5 impact
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach0 impact
150 instances - 5 features - 3 classes - 0 missing values
This data is used to test water contamination
0 runs0 likes0 downloads0 reach0 impact
26 instances - 8 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach0 impact
150 instances - 5 features - classes - 0 missing values
test4
0 runs0 likes0 downloads0 reach0 impact
26 instances - 8 features - classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
67 runs0 likes2 downloads2 reach13 impact
458 instances - 10937 features - 2 classes - 0 missing values
Title: Flora (source: https://automl.chalearn.org/data). Dataset from the first ChaLearn AutoML challenge (2014). Only the training data is included, as there were no labels for validation and…
0 runs0 likes0 downloads0 reach0 impact
15000 instances - 200001 features - 0 classes - 0 missing values
Description: The data consists of real historical data collected from 2010 & 2011. Employees are manually allowed or denied access to resources over time. The data is used to create an algorithm…
35323 runs0 likes19 downloads19 reach25 impact
32769 instances - 10 features - 2 classes - 0 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch) Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php Modified by TunedIT (converted to ARFF…
778 runs0 likes8 downloads8 reach14 impact
4562 instances - 15 features - 2 classes - 88 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2862 runs0 likes5 downloads5 reach22 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2866 runs0 likes8 downloads8 reach22 impact
546 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes4 downloads4 reach13 impact
355 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs0 likes3 downloads3 reach13 impact
363 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes5 downloads5 reach13 impact
250 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach13 impact
203 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach13 impact
329 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs0 likes3 downloads3 reach12 impact
130 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2862 runs0 likes8 downloads8 reach22 impact
1545 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
2860 runs0 likes7 downloads7 reach22 impact
604 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach13 impact
337 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes4 downloads4 reach13 impact
484 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
79 runs0 likes3 downloads3 reach13 impact
322 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes5 downloads5 reach13 impact
275 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
76 runs0 likes5 downloads5 reach13 impact
187 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach13 impact
413 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
78 runs0 likes4 downloads4 reach13 impact
421 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
82 runs0 likes5 downloads5 reach13 impact
405 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes2 downloads2 reach13 impact
384 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes3 downloads3 reach13 impact
201 instances - 10937 features - 2 classes - 0 missing values
GEMLeR provides a collection of gene expression datasets that can be used for benchmarking gene expression oriented machine learning algorithms. They can be used for estimation of different quality…
77 runs0 likes4 downloads4 reach13 impact
193 instances - 10937 features - 2 classes - 0 missing values
No data.
697 runs0 likes7 downloads7 reach13 impact
320 instances - 9 features - 2 classes - 0 missing values
Datasets from the Agnostic Learning vs. Prior Knowledge Challenge (http://www.agnostic.inf.ethz.ch) Dataset from: http://www.agnostic.inf.ethz.ch/datasets.php Modified by TunedIT (converted to ARFF…
548 runs0 likes9 downloads9 reach14 impact
3468 instances - 785 features - 2 classes - 0 missing values
test
0 runs0 likes1 downloads1 reach6 impact
1994 instances - 127 features - 0 classes - 39202 missing values
A subset of the 3D dataset from Princeton's COS 429 Computer Vision course. The dataset consists of 40 models organised into 4 classes of 10 objects each.
0 runs0 likes0 downloads0 reach0 impact
16000 instances - 4 features - classes - 0 missing values
General Description 2015-current: greater than $200.00. The Commission categorizes contributions from individuals using the calendar year-to-date amount for political action committee (PAC) and party…
0 runs0 likes2 downloads2 reach6 impact
3348209 instances - 21 features - 0 classes - 10786577 missing values
Anonymized data of dating profiles from OkCupid
0 runs0 likes3 downloads3 reach6 impact
59946 instances - 31 features - 0 classes - 273249 missing values
According to Epsilon research, 80% of customers are more likely to do business with you if you provide personalized service. Banking is no exception. The digitalization of everyday lives means that…
0 runs0 likes1 downloads1 reach5 impact
4459 instances - 4992 features - 0 classes - 0 missing values
source: http://www.cs.ubc.ca/labs/beta/Projects/SATzilla/ authors: L. Xu, F. Hutter, H. Hoos, K. Leyton-Brown translator in coseal format: M. Lindauer with the help of Alexandre Frechette the data do…
0 runs0 likes1 downloads1 reach6 impact
4440 instances - 117 features - 0 classes - 27150 missing values
Modified version of the training dataset of the Bike Sharing Demand challenge running on Kaggle (http://www.kaggle.com/c/bike-sharing-demand/) If you use the problem in publication, please cite:…
0 runs0 likes3 downloads3 reach4 impact
10886 instances - 12 features - 0 classes - 0 missing values
Bike sharing systems are a new generation of traditional bike rentals in which the whole process, from membership to rental and return, has become automatic. Through these systems, a user is able to easily rent…
0 runs0 likes0 downloads0 reach0 impact
17379 instances - 13 features - 0 classes - 0 missing values
Bike sharing systems are a new generation of traditional bike rentals in which the whole process, from membership to rental and return, has become automatic. Through these systems, a user is able to easily rent…
0 runs0 likes0 downloads0 reach0 impact
17379 instances - 13 features - 0 classes - 0 missing values
Annual salary information including gross pay and overtime pay for all active, permanent employees of Montgomery County, MD paid in calendar year 2016. This information will be published annually each…
0 runs0 likes3 downloads3 reach6 impact
9228 instances - 13 features - 0 classes - 11169 missing values
Titanic survival prediction
0 runs0 likes3 downloads3 reach5 impact
891 instances - 8 features - 0 classes - 0 missing values
Titanic survival prediction
0 runs0 likes2 downloads2 reach5 impact
891 instances - 8 features - 0 classes - 0 missing values
Thyroid disease records supplied by the Garavan Institute and J. Ross Quinlan, New South Wales Institute, Sydney, Australia. 1987. hypothyroid, primary hypothyroid, compensated…
883 runs0 likes11 downloads11 reach7 impact
3772 instances - 30 features - 4 classes - 6064 missing values