Data
This data is derived from the 2012 KDD Cup. The data is subsampled to 1% of the original number of instances, downsampling the majority class (click=0) so that the target feature is reasonably…
313 runs - 0 likes - 36 downloads - 36 reach - 7 impact
399482 instances - 12 features - 2 classes - 0 missing values
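The majority-class downsampling mentioned in this entry can be sketched roughly as follows. This is only an illustration, not the procedure actually used to build the dataset: the helper name `downsample_majority`, the `keep_fraction` value, and the toy rows are assumptions; only the label name `click` and the majority value 0 come from the description.

```python
import random


def downsample_majority(rows, label_key="click", majority=0,
                        keep_fraction=0.1, seed=0):
    """Keep every minority row and a random fraction of the majority rows.

    All parameter names and defaults are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    kept = []
    for row in rows:
        # Minority rows are always kept; majority rows survive with
        # probability keep_fraction.
        if row[label_key] != majority or rng.random() < keep_fraction:
            kept.append(row)
    return kept


# Toy data: 1000 non-clicks and 50 clicks (made-up numbers).
rows = [{"click": 0} for _ in range(1000)] + [{"click": 1} for _ in range(50)]
sample = downsample_majority(rows)
clicks = sum(r["click"] for r in sample)
print(len(sample), clicks)
```

Because only the majority class is thinned, the class ratio in the sample is far more balanced than in the raw data, which is presumably what the description means by a reasonably balanced target.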
Normalized version of vehicle dataset (http://www.openml.org/d/54) NAME vehicle silhouettes PURPOSE to classify a given silhouette as one of four types of vehicle, using a set of features extracted…
372 runs - 0 likes - 10 downloads - 10 reach - 3 impact
98528 instances - 101 features - 2 classes - 0 missing values
Multiclass from binary: Expanding one-vs-all, one-vs-one and ECOC-based approaches. Dataset taken from LIBSVM: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html In this dataset…
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
108000 instances - 129 features - 1000 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
3 runs - 0 likes - 1 downloads - 1 reach - 9 impact
5418 instances - 1637 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
3 runs - 0 likes - 2 downloads - 2 reach - 9 impact
10000 instances - 2001 features - 5 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
10 runs - 0 likes - 1 downloads - 1 reach - 10 impact
20000 instances - 4297 features - 2 classes - 0 missing values
Much of machine learning research focuses on producing models which perform well on benchmark tasks, in turn improving our understanding of the challenges associated with those tasks. From the…
1 runs - 0 likes - 0 downloads - 0 reach - 4 impact
270912 instances - 785 features - 49 classes - 0 missing values
The German Traffic Sign Benchmark is a multi-class, single-image classification challenge held at the International Joint Conference on Neural Networks (IJCNN) 2011. We cordially invite researchers…
1 runs - 0 likes - 0 downloads - 0 reach - 4 impact
51839 instances - 2917 features - 43 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
416188 instances - 61 features - 355 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
425240 instances - 79 features - 2 classes - 2734000 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
11 runs - 0 likes - 0 downloads - 0 reach - 10 impact
10000 instances - 7201 features - 10 classes - 0 missing values
This dataset is taken from the MiniBooNE experiment and is used to distinguish electron neutrinos (signal) from muon neutrinos (background). This dataset is ordered. It first contains all signal…
12 runs - 0 likes - 4 downloads - 4 reach - 5 impact
130064 instances - 51 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
3 runs - 0 likes - 0 downloads - 0 reach - 9 impact
8237 instances - 801 features - 7 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
4 runs - 0 likes - 2 downloads - 2 reach - 9 impact
2984 instances - 145 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
4 runs - 0 likes - 1 downloads - 1 reach - 9 impact
5124 instances - 21 features - 2 classes - 0 missing values
Airlines dataset, inspired by the regression dataset from Elena Ikonomovska. The task is to predict whether a given flight will be delayed, given the information of the scheduled departure.
291 runs - 0 likes - 29 downloads - 29 reach - 8 impact
539383 instances - 8 features - 2 classes - 0 missing values
Klaverjas is an example of the Jack-Nine card games, which are characterized as trick-taking games where the Jack and nine of the trump suit are the highest-ranking trumps, and the tens and aces…
0 runs - 0 likes - 1 downloads - 1 reach - 2 impact
981541 instances - 33 features - 2 classes - 0 missing values
This is the dataset used for the 2016 IDA Industrial Challenge, courtesy of Scania. For a full description, see http://archive.ics.uci.edu/ml/datasets/IDA2016Challenge . This dataset contains both the…
7 runs - 0 likes - 1 downloads - 1 reach - 9 impact
76000 instances - 171 features - 2 classes - 1078695 missing values
Training dataset of the 'Porto Seguro's Safe Driver Prediction' Kaggle challenge [https://www.kaggle.com/c/porto-seguro-safe-driver-prediction]. The goal was to predict whether a driver will file an…
2 runs - 0 likes - 0 downloads - 0 reach - 4 impact
595212 instances - 38 features - 2 classes - 846458 missing values
# Data Description This is the historical price data of the FOREX USD/CHF from Dukascopy. One instance (row) is one candlestick of one minute. The whole dataset has the data range from 1-1-2018 to…
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
375840 instances - 12 features - 2 classes - 0 missing values
Datasets from ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php) KDD Cup 2009 http://www.kddcup-orange.com Converted to ARFF format by TunedIT Customer Relationship Management (CRM) is a key element…
223 runs - 0 likes - 17 downloads - 17 reach - 10 impact
50000 instances - 231 features - 2 classes - 8024152 missing values
No data.
353 runs - 0 likes - 17 downloads - 17 reach - 4 impact
120919 instances - 1002 features - 2 classes - 0 missing values
1. Data set title: Nomao Data Set 2. Abstract: Nomao collects data about places (name, phone, localization...) from many sources. Deduplication consists in detecting what data refer to the same place.…
67106 runs - 0 likes - 16 downloads - 16 reach - 19 impact
34465 instances - 119 features - 2 classes - 0 missing values
* Title of Database: Spoken Arabic Digit * Abstract: This dataset contains time series of mel-frequency cepstrum coefficients (MFCCs) corresponding to spoken Arabic digits. Includes data from 44 males…
1 runs - 0 likes - 8 downloads - 8 reach - 7 impact
263256 instances - 15 features - 10 classes - 0 missing values
* Abstract: Purpose is to predict poker hands * Source - Creators: Robert Cattral (cattral '@' gmail.com) Franz Oppacher (oppacher '@' scs.carleton.ca) Carleton University, Department of Computer…
1 runs - 0 likes - 5 downloads - 5 reach - 7 impact
1025009 instances - 11 features - 10 classes - 0 missing values
The dataset collects data from an Android smartphone positioned in the chest pocket. Accelerometer Data are collected from 22 participants walking in the wild over a predefined path. The dataset is…
80 runs - 0 likes - 8 downloads - 8 reach - 7 impact
149332 instances - 5 features - 22 classes - 0 missing values
Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a…
450 runs - 0 likes - 11 downloads - 11 reach - 16 impact
70000 instances - 785 features - 10 classes - 0 missing values
### Description ### This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
5083 runs - 0 likes - 2 downloads - 2 reach - 8 impact
44819 instances - 7 features - 3 classes - 0 missing values
shuttle-pmlb
10 runs - 0 likes - 3 downloads - 3 reach - 15 impact
58000 instances - 10 features - 7 classes - 0 missing values
This dataset was retrieved 2014-11-14 from the UCI site and converted to the ARFF format. __Major changes w.r.t. version 3: dataset from UCI that matches description and data types__ ### Feature…
4202 runs - 0 likes - 6 downloads - 6 reach - 6 impact
690 instances - 15 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
13 runs - 0 likes - 1 downloads - 1 reach - 9 impact
58310 instances - 181 features - 10 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
12 runs - 0 likes - 1 downloads - 1 reach - 10 impact
83733 instances - 55 features - 4 classes - 0 missing values
titanic survival prediction
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
891 instances - 8 features - 0 classes - 0 missing values
titanic survival prediction
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
891 instances - 8 features - classes - 0 missing values
titanic survival prediction
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
891 instances - 8 features - classes - 0 missing values
titanic survival prediction
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
891 instances - 8 features - classes - 0 missing values
titanic survival prediction
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
891 instances - 8 features - classes - 0 missing values
Original data from https://github.com/propublica/compas-analysis/ by ProPublica. The data was subsequently preprocessed and reduced to relevant features for classification. The target variable is…
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
5278 instances - 14 features - 2 classes - 0 missing values
No data.
67 runs - 0 likes - 12 downloads - 12 reach - 14 impact
9558 instances - 26833 features - 44 classes - 0 missing values
Chocolate Bar Ratings. Expert ratings of over 1,700 chocolate bars. Each chocolate is evaluated from a combination of both objective qualities and subjective interpretation. A rating here only…
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1795 instances - 9 features - 42 classes - 1 missing values
Chocolate Bar Ratings. Expert ratings of over 1,700 chocolate bars. Each chocolate is evaluated from a combination of both objective qualities and subjective interpretation. A rating here only…
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
1794 instances - 9 features - 41 classes - 0 missing values
At Santander our mission is to help people and businesses prosper. We are always looking for ways to help our customers understand their financial health and identify which products and services might…
0 runs - 0 likes - 1 downloads - 1 reach - 1 impact
200000 instances - 202 features - 2 classes - 0 missing values
Experiment data obtained by running random configurations of an SVM through mlr on 106 different classification tasks from openml.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
540576 instances - 15 features - classes - 658962 missing values
Experiment data obtained by running random configurations of the hnsw kNN through mlr on 116 different classification tasks from openml.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
111753 instances - 13 features - classes - 0 missing values
Experiment data obtained by running random configurations of rpart through mlr on 115 different classification tasks from openml.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
92067 instances - 12 features - classes - 0 missing values
Experiment data obtained by running random configurations of glmnet through mlr on 114 different classification tasks from openml.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
104820 instances - 10 features - classes - 0 missing values
r fgtgt
0 runs - 0 likes - 1 downloads - 1 reach - 0 impact
2 instances - 8 features - classes - 0 missing values
Experiment data obtained by running random configurations of xgboost through mlr on 118 different classification tasks from openml. Parameter descriptions:…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2955210 instances - 21 features - classes - 7051006 missing values
Experiment data obtained by running random configurations of ranger through mlr on 119 different classification tasks from openml.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
278863 instances - 16 features - classes - 138965 missing values
dataset for bme
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
63 instances - 12 features - classes - 52 missing values
Multi-label dataset for text-classification. It consists of article titles and partial blurbs. Blurbs can be assigned to several categories (e.g. Science, News, Games) based on word predictors.
0 runs - 0 likes - 2 downloads - 2 reach - 5 impact
3782 instances - 1101 features - classes - 0 missing values
dd fgrfg
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2 instances - 3 features - classes - 0 missing values
The database covers all the international short track games in the last 5 years. Currently it contains only men's 500m. Detailed lap data including personal time and ranking in each game from seasons…
0 runs - 0 likes - 1 downloads - 1 reach - 4 impact
cars1-pmlb
31 runs - 0 likes - 3 downloads - 3 reach - 14 impact
392 instances - 8 features - 3 classes - 0 missing values
Identify jets of particles from the LHC, created for the study of ultra low latency inference with hls4ml. Use 16 high level features to identify the 5 jet classes: quark (q), gluon (g), W boson (w),…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
830000 instances - 17 features - 5 classes - 0 missing values
efef ffrf
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
9 instances - 3 features - classes - 0 missing values
ssc vdv
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1556 instances - 2 features - classes - 0 missing values
ssf
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
2 instances - 2 features - classes - 0 missing values
Data Set Information: This research aimed at the case of customers’ default payments in Taiwan and compares the predictive accuracy of probability of default among six data mining methods. From…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
30000 instances - 24 features - 2 classes - 0 missing values
This data approaches student achievement in secondary education at two Portuguese schools. The data attributes include student grades and demographic, social, and school-related features, and it was…
0 runs - 0 likes - 1 downloads - 1 reach - 2 impact
395 instances - 33 features - 0 classes - 0 missing values
e fvr
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2 instances - 11 features - classes - 0 missing values
This dataset summarizes a heterogeneous set of features about articles published by Mashable in a period of two years. The goal is to predict the number of shares in social networks (popularity). *…
0 runs - 0 likes - 4 downloads - 4 reach - 5 impact
39644 instances - 61 features - 0 classes - 0 missing values
efe rgrg
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
sdsw frfr
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1556 instances - 3 features - classes - 0 missing values
swd dced
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
589 instances - 3 features - classes - 0 missing values
frf r
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2 instances - 3 features - classes - 0 missing values
e3r4vr t4r
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2 instances - 5 features - classes - 0 missing values
e eded
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2 instances - 4 features - classes - 0 missing values
Zurich public transport delay data 2016-10-30 03:30:00 CET - 2016-11-27 01:20:00 CET cleaned and prepared at Open Data Day 2017. For this version, the task was downsampled to 0.5 percent. Some…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
27327 instances - 18 features - 0 classes - 657 missing values
This data represents crime reported to the Seattle Police Department (SPD). Each row contains the record of a unique event where at least one criminal offense was reported by a member of the community…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
52358 instances - 8 features - 0 classes - 650 missing values
Airlines dataset, inspired by the regression dataset from Elena Ikonomovska. The task is to predict whether a given flight will be delayed, given the information of the scheduled departure. For this…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
26969 instances - 8 features - 2 classes - 0 missing values
b gtrg
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 7 features - classes - 0 missing values
sqs efrf
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 5 features - classes - 0 missing values
f fr
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2 instances - 5 features - classes - 0 missing values
dd efrg
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1556 instances - 5629 features - classes - 0 missing values
The midwest survey dataset contains individual responses from surveys about regional identification conducted for FiveThirtyEight by SurveyMonkey.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2778 instances - 28 features - 10 classes - 1744 missing values
The midwest survey dataset contains individual responses from surveys about regional identification conducted for FiveThirtyEight by SurveyMonkey.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
2778 instances - 28 features - 10 classes - 1744 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
150 instances - 5 features - classes - 0 missing values
Each record represents 100 points on a two-dimensional graph. When plotted in order (from 1 through 100) as the Y coordinate, the points will create either a Hill (a “bump” in the terrain) or a…
183264 runs - 0 likes - 23 downloads - 23 reach - 19 impact
1212 instances - 101 features - 2 classes - 0 missing values
Wind: daily average wind speeds for 1961-1978 at 12 synoptic meteorological stations in the Republic of Ireland (Haslett and Raftery 1989). These data were analyzed in detail in the following article:…
0 runs - 0 likes - 6 downloads - 6 reach - 7 impact
6574 instances - 15 features - 0 classes - 0 missing values
This is the Tecator data set: The task is to predict the fat content of a meat sample on the basis of its near infrared absorbance spectrum. 1. Statement of permission from Tecator (the original data…
0 runs - 0 likes - 4 downloads - 4 reach - 7 impact
240 instances - 125 features - 0 classes - 0 missing values
Human Development Index [DATA] United Nations Development Program compiled an Index of Human Development. Column 1: Country(character) 2: Index 3: GNP GNP PER CAPITA RANK RANK - RANK HDI 1987 GNP RANK…
2 runs - 0 likes - 0 downloads - 0 reach - 7 impact
130 instances - 4 features - 0 classes - 0 missing values
The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic prices and the demand for clean air', J. Environ. Economics & Management, vol.5, 81-102, 1978. Used in Belsley, Kuh & Welsch,…
6 runs - 0 likes - 5 downloads - 5 reach - 11 impact
506 instances - 14 features - 0 classes - 0 missing values
File README ----------- smoothmeth A collection of the data sets used in the book "Smoothing Methods in Statistics," by Jeffrey S. Simonoff, Springer-Verlag, New York, 1996. Submitted by Jeff Simonoff…
0 runs - 0 likes - 0 downloads - 0 reach - 7 impact
2178 instances - 4 features - 0 classes - 0 missing values
A family of datasets synthetically generated from a simulation of how bank-customers choose their banks. Tasks are based on predicting the fraction of bank customers who leave the bank because of full…
0 runs - 0 likes - 2 downloads - 2 reach - 7 impact
8192 instances - 33 features - 0 classes - 0 missing values
17x17x2x2 tables of counts in GLIM-ready format used for the analyses in Biblarz, Timothy J., and Adrian E. Raftery. 1993. "The Effects of Family Disruption on Social Mobility." American Sociological…
3 runs - 0 likes - 1 downloads - 1 reach - 7 impact
1156 instances - 6 features - 0 classes - 0 missing values
Data for the sensory evaluation experiment in Brien, C.J. and Payne, R.W. (1996) Tiers, structure formulae and the analysis of complicated experiments. submitted for publication. The experiment…
2 runs - 0 likes - 2 downloads - 2 reach - 7 impact
576 instances - 12 features - 0 classes - 0 missing values
The data are a subsample of 500 observations from a data set that originates in a study where air pollution at a road is related to traffic volume and meteorological variables, collected by the…
2 runs - 0 likes - 1 downloads - 1 reach - 7 impact
500 instances - 8 features - 0 classes - 0 missing values
The data consist of annual observations on the level of strike volume (days lost due to industrial disputes per 1000 wage salary earners), and their covariates in 18 OECD countries from 1951-1985. The…
0 runs - 0 likes - 2 downloads - 2 reach - 7 impact
625 instances - 7 features - 0 classes - 0 missing values
This data set is also obtained from the task of controlling an F16 aircraft, although the target variable and attributes are different from the ailerons domain. In this case the goal variable is…
2 runs - 0 likes - 7 downloads - 7 reach - 3 impact
16599 instances - 19 features - 0 classes - 0 missing values
S&P Letters Data We collected information on the variables using all the block groups in California from the 1990 Census. In this sample a block group on average includes 1425.5 individuals living in…
0 runs - 0 likes - 6 downloads - 6 reach - 7 impact
20640 instances - 9 features - 0 classes - 0 missing values
Veteran's Administration Lung Cancer Trial Taken from Kalbfleisch and Prentice, pages 223-224 Variables Treatment 1=standard, 2=test Celltype 1=squamous, 2=smallcell, 3=adeno, 4=large Survival in days…
2 runs - 0 likes - 1 downloads - 1 reach - 7 impact
137 instances - 8 features - 0 classes - 0 missing values
eevrr der
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1557 instances - 5629 features - classes - 0 missing values
ede wey
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
589 instances - 2909 features - classes - 0 missing values
This database contains all legal 8-ply positions in the game of connect-4 in which neither player has won yet, and in which the next move is not forced. Attributes represent board positions on a 6x7…
9329 runs - 0 likes - 8 downloads - 8 reach - 19 impact
67557 instances - 43 features - 3 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: We used binary encoding for each feature (o, b, x), so the number of features is 42*3 = 126.
0 runs - 0 likes - 3 downloads - 3 reach - 10 impact
67557 instances - 127 features - 0 classes - 0 missing values
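The binary encoding in the preprocessing note above (three indicator columns per board cell, 42 cells, 126 columns) can be sketched as below. This is an illustration under stated assumptions rather than the LIBSVM preprocessing itself: the helper name `one_hot_board`, the column order (o, b, x), and the toy board are hypothetical.

```python
# Possible values of a board cell, taken from the description: o, b, x.
# The order of the three indicator columns is an assumption.
SYMBOLS = ("o", "b", "x")


def one_hot_board(cells):
    """Expand each cell into three 0/1 indicators, one per symbol."""
    encoded = []
    for cell in cells:
        encoded.extend(1 if cell == symbol else 0 for symbol in SYMBOLS)
    return encoded


# Toy position: 42 cells, all blank except the first, which holds an x.
board = ["b"] * 42
board[0] = "x"
features = one_hot_board(board)
print(len(features))  # 42 * 3 = 126
```

Each cell contributes exactly one 1 among its three columns, so every encoded row sums to 42.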
This is a test dataset
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
Touch samples 2
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
265 instances - 11 features - 8 classes - 0 missing values
Uniform issuance values with temperatures, hirings, and terminations
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
6277 instances - 7 features - 0 classes - 0 missing values
rrvrf 4rr
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
4 instances - 49 features - classes - 0 missing values