Data
This is the full version of the KDD Cup 2009 dataset. Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the opportunity to work on large…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
50000 instances - 15001 features - 2 classes - 25108569 missing values
This is the full version of the KDD Cup 2009 dataset. Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the opportunity to work on large…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
50000 instances - 15001 features - 2 classes - 25108569 missing values
This is the full version of the KDD Cup 2009 dataset. Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the opportunity to work on large…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
50000 instances - 15001 features - 2 classes - 25108569 missing values
Public procurement data for the European Economic Area, Switzerland, and Macedonia, 2015.
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
565163 instances - 75 features - 0 classes - 15247061 missing values
This is the full version of the KDD Cup 2009 dataset. Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the opportunity to work on large…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
50000 instances - 15001 features - 2 classes - 14616450 missing values
This is the full version of the KDD Cup 2009 dataset. Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the opportunity to work on large…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
50000 instances - 15001 features - 2 classes - 14616450 missing values
This is the full version of the KDD Cup 2009 dataset. Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the opportunity to work on large…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
50000 instances - 15001 features - 2 classes - 14616450 missing values
Domain dataset
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1637 instances - 9839 features - 3 classes - 13231887 missing values
General Description 2015-current: greater than $200.00. The Commission categorizes contributions from individuals using the calendar year-to-date amount for political action committee (PAC) and party…
0 runs - 0 likes - 2 downloads - 2 reach - 8 impact
3348209 instances - 21 features - 0 classes - 10786577 missing values
The KDD Cup 2009 offers the opportunity to work on large marketing databases from the French Telecom company Orange to predict the propensity of customers to switch provider (churn). Churn (wikipedia…
10984 runs - 0 likes - 16 downloads - 16 reach - 25 impact
50000 instances - 231 features - 2 classes - 8024152 missing values
Datasets from ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php). KDD Cup 2009: http://www.kddcup-orange.com. Converted to ARFF format by TunedIT. Customer Relationship Management (CRM) is a key element…
11303 runs - 0 likes - 12 downloads - 12 reach - 25 impact
50000 instances - 231 features - 2 classes - 8024152 missing values
Datasets from ACM KDD Cup (http://www.sigkdd.org/kddcup/index.php). KDD Cup 2009: http://www.kddcup-orange.com. Converted to ARFF format by TunedIT. Customer Relationship Management (CRM) is a key element…
223 runs - 0 likes - 18 downloads - 18 reach - 18 impact
50000 instances - 231 features - 2 classes - 8024152 missing values
This dataset contains traffic violation information from all electronic traffic violations issued in the County. Any information that can be used to uniquely identify the vehicle, the vehicle owner or…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1578154 instances - 43 features - 4 classes - 8006541 missing values
This dataset reflects incidents of crime in the City of Los Angeles dating back to 2010. This data is transcribed from original crime reports that are typed on paper and therefore there may be some…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1468825 instances - 26 features - 0 classes - 7881776 missing values
Experiment data obtained by running random configurations of xgboost through mlr on 118 different classification tasks from openml. Parameter descriptions:…
0 runs - 0 likes - 0 downloads - 0 reach - 7 impact
2955210 instances - 21 features - classes - 7051006 missing values
Dataset KDD98 challenge: https://kdd.ics.uci.edu/databases/kddcup98/kddcup98.html . The goal is to estimate the return from a direct mailing in order to maximize donation profits. This dataset…
0 runs - 0 likes - 5 downloads - 5 reach - 12 impact
191260 instances - 479 features - 0 classes - 5587563 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
0 runs - 0 likes - 1 downloads - 1 reach - 16 impact
425240 instances - 79 features - 2 classes - 2734000 missing values
Dataset KDD98 challenge: https://kdd.ics.uci.edu/databases/kddcup98/kddcup98.html . The goal is to estimate the return from a direct mailing in order to maximize donation profits. This dataset…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
82318 instances - 478 features - 2 classes - 2399311 missing values
Data reported to the police about the circumstances of personal injury road accidents in Great Britain from 1979, and the maker and model information of vehicles involved in the respective accident.…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
363266 instances - 67 features - 4 classes - 2182126 missing values
UserID
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1974675 instances - 10 features - classes - 1974675 missing values
web services evaluations in this table
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1974675 instances - 10 features - classes - 1974675 missing values
Achieved Frames per Second (FPS) in video games. This dataset contains FPS measurements of video games executed on computers. Each row of the dataset describes the outcome of FPS measurement (outcome…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
425833 instances - 45 features - 0 classes - 1299988 missing values
This is the dataset used for the 2016 IDA Industrial Challenge, courtesy of Scania. For a full description, see http://archive.ics.uci.edu/ml/datasets/IDA2016Challenge . This dataset contains both the…
7 runs - 0 likes - 1 downloads - 1 reach - 17 impact
76000 instances - 171 features - 2 classes - 1078695 missing values
Training dataset of the 'Porto Seguro's Safe Driver Prediction' Kaggle challenge [https://www.kaggle.com/c/porto-seguro-safe-driver-prediction]. The goal was to predict whether a driver will file an…
2 runs - 0 likes - 0 downloads - 0 reach - 12 impact
595212 instances - 38 features - 2 classes - 846458 missing values
Training dataset of the 'Porto Seguro's Safe Driver Prediction' Kaggle challenge [https://www.kaggle.com/c/porto-seguro-safe-driver-prediction]. The goal was to predict whether a driver will file an…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
595212 instances - 58 features - 2 classes - 846458 missing values
Experiment data obtained by running random configurations of an SVM through mlr on 106 different classification tasks from openml.
0 runs - 0 likes - 0 downloads - 0 reach - 7 impact
540576 instances - 15 features - classes - 658962 missing values
Product listing data submitted to the U.S. FDA for all unfinished, unapproved drugs.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
120215 instances - 20 features - 7 classes - 443305 missing values
This version has feature names based on https://www2.1010data.com/documentationcenter/beta/Tutorials/MachineLearningExamples/CensusIncomeDataSet.html Missing data is also properly encoded in this…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
199523 instances - 42 features - 2 classes - 415717 missing values
This data is derived from the 2012 KDD Cup. The data is subsampled to 1% of the original number of instances, downsampling the majority class (click=0) so that the target feature is reasonably…
0 runs - 1 likes - 2 downloads - 3 reach - 10 impact
798964 instances - 10 features - 3 classes - 399482 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The river flow datasets concern the prediction of river network flows for 48 h in the future at…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
9125 instances - 584 features - classes - 356160 missing values
Anonymized data of dating profiles from OkCupid
0 runs - 0 likes - 3 downloads - 3 reach - 8 impact
59946 instances - 31 features - 0 classes - 273249 missing values
Author: Marius Lindauer. Date: 27.02.2014. This data set was generated for a publication about claspfolio 2.0, i.e., an algorithm selector for ASP. The algorithm portfolio of clasp (2.1.4)…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
14234 instances - 143 features - 0 classes - 200838 missing values
uci
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
101766 instances - 52 features - classes - 192849 missing values
Wine data gathered by https://www.kaggle.com/zynicide . The data was scraped from WineEnthusiast during the week of June 15th, 2017. The code for the scraper can be found at…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
150930 instances - 10 features - classes - 174477 missing values
User profile data for San Francisco OkCupid users published in [Kim, A. Y., & Escobedo-Land, A. (2015). OKCupid data for introductory statistics and data science courses. Journal of Statistics…
0 runs - 0 likes - 0 downloads - 0 reach - 10 impact
50789 instances - 20 features - 3 classes - 154107 missing values
User profile data for San Francisco OkCupid users published in [Kim, A. Y., & Escobedo-Land, A. (2015). OKCupid data for introductory statistics and data science courses. Journal of Statistics…
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
50789 instances - 20 features - 3 classes - 154107 missing values
One of the biggest challenges of an auto dealership purchasing a used car at an auto auction is the risk that the vehicle might have serious issues that prevent it from being sold to customers. The…
3 runs - 0 likes - 3 downloads - 3 reach - 13 impact
72983 instances - 33 features - 2 classes - 149271 missing values
Experiment data obtained by running random configurations of ranger through mlr on 119 different classification tasks from openml.
0 runs - 0 likes - 0 downloads - 0 reach - 7 impact
278863 instances - 16 features - classes - 138965 missing values
Zurich public transport delay data 2016-10-30 03:30:00 CET - 2016-11-27 01:20:00 CET cleaned and prepared at Open Data Day 2017.
0 runs - 0 likes - 2 downloads - 2 reach - 12 impact
5465575 instances - 15 features - 0 classes - 132617 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
60197 instances - 6 features - classes - 128136 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
60197 instances - 6 features - classes - 128136 missing values
Gathers information on about 7800 different US colleges, including geographical information, statistics about the attending population, and post-graduation career earnings.
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
7063 instances - 50 features - 0 classes - 125494 missing values
Version with corrected feature types. 'PrivacySuppressed' values are converted to None. Gathers information on about 7800 different US colleges, including geographical information, stats about the…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
7063 instances - 47 features - 0 classes - 104305 missing values
Modified version for the automl benchmark. Gathers information on about 7800 different US colleges, including geographical information, stats about the population attending and post graduation…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
7063 instances - 45 features - 0 classes - 104249 missing values
Payments given by healthcare manufacturing companies to medical doctors or hospitals
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
73558 instances - 6 features - 2 classes - 83182 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs - 0 likes - 2 downloads - 2 reach - 8 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
Data on tree growth used in the Case Study published in the September, 1995 issue of the Canadian Journal of Statistics. This data set was provided by Dr. Fernando Camacho, Ontario Hydro…
18457 runs - 1 likes - 15 downloads - 16 reach - 39 impact
2796 instances - 35 features - 6 classes - 68100 missing values
Short description: Data on tree growth used in…
0 runs - 0 likes - 2 downloads - 2 reach - 11 impact
2796 instances - 35 features - 6 classes - 68100 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
0 runs - 0 likes - 0 downloads - 0 reach - 11 impact
2796 instances - 35 features - 2 classes - 68100 missing values
This file holds global land temperatures by country
0 runs - 0 likes - 1 downloads - 1 reach - 10 impact
577462 instances - 4 features - classes - 64563 missing values
holds information on average temperature per country
0 runs - 0 likes - 0 downloads - 0 reach - 10 impact
577462 instances - 4 features - classes - 64563 missing values
Two colour spotted cDNA array data set of a series of experiments to identify which genes in Yeast are cell cycle regulated.
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
6178 instances - 82 features - classes - 59017 missing values
Abstract: This dataset contains timeseries of mel-frequency cepstrum coefficients (MFCCs) corresponding to spoken Arabic digits. Includes data from 44 male and 44 female native Arabic speakers.…
0 runs - 0 likes - 3 downloads - 3 reach - 11 impact
178526 instances - 13 features - classes - 57200 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
354 runs - 0 likes - 7 downloads - 7 reach - 14 impact
7485 instances - 61 features - 7 classes - 52048 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
366 runs - 0 likes - 10 downloads - 10 reach - 14 impact
8844 instances - 61 features - 7 classes - 51515 missing values
This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990. The coding schemes have been standardized (by the IPUMS project) to be…
434 runs - 0 likes - 10 downloads - 10 reach - 14 impact
7019 instances - 61 features - 8 classes - 48089 missing values
The dataset contains all the statistics for each player from 2008 to 2016.
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
183978 instances - 42 features - classes - 47301 missing values
Sensor data measurements of one Boiler, containing WaterInput/SteamOutput (flow, temperature, pressure) for one month, which is measured every minute.
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
44643 instances - 8 features - classes - 44643 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
744 runs - 0 likes - 8 downloads - 8 reach - 16 impact
7019 instances - 61 features - 2 classes - 43814 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
60197 instances - 6 features - classes - 42138 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
60197 instances - 6 features - classes - 42138 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
60197 instances - 6 features - classes - 42138 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
60197 instances - 6 features - classes - 42138 missing values
Title: Communities and Crime. Abstract: Communities within the United States. The data combines socio-economic data from the 1990 US Census, law enforcement data from the 1990 US LEMAS survey, and…
0 runs - 1 likes - 3 downloads - 4 reach - 13 impact
1994 instances - 128 features - 0 classes - 39202 missing values
test
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1994 instances - 127 features - 0 classes - 39202 missing values
Ignores community name. Title: Communities and Crime. Abstract: Communities within the United States. The data combines socio-economic data from the 1990 US Census, law enforcement data from…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1994 instances - 127 features - 0 classes - 39202 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
754 runs - 0 likes - 10 downloads - 10 reach - 16 impact
8844 instances - 57 features - 2 classes - 34843 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
733 runs - 0 likes - 9 downloads - 9 reach - 16 impact
7485 instances - 56 features - 2 classes - 32427 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
1 runs - 0 likes - 0 downloads - 0 reach - 14 impact
31406 instances - 23 features - 2 classes - 29756 missing values
Source: http://www.cs.ubc.ca/labs/beta/Projects/SATzilla/ . Authors: L. Xu, F. Hutter, H. Hoos, K. Leyton-Brown. Translator in coseal format: M. Lindauer with the help of Alexandre Frechette. The data do…
0 runs - 0 likes - 1 downloads - 1 reach - 9 impact
4440 instances - 117 features - 0 classes - 27150 missing values
The original Annealing dataset from UCI. The exact meaning of the features and classes is largely unknown. Annealing, in metallurgy and materials science, is a heat treatment that alters the physical…
13774 runs - 0 likes - 16 downloads - 16 reach - 12 impact
898 instances - 39 features - 5 classes - 22175 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
712 runs - 0 likes - 8 downloads - 8 reach - 15 impact
898 instances - 39 features - 2 classes - 22175 missing values
No data.
283 runs - 0 likes - 5 downloads - 5 reach - 22 impact
96 instances - 4027 features - 11 classes - 19667 missing values
No data.
296 runs - 0 likes - 5 downloads - 5 reach - 22 impact
96 instances - 4027 features - 9 classes - 19667 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 3 impact
8553 instances - 10 features - classes - 18454 missing values
This data was gathered from participants in experimental speed dating events from 2002-2004. During the events, the attendees would have a four-minute "first date" with every other participant of the…
28060 runs - 19 likes - 161 downloads - 180 reach - 33 impact
8378 instances - 123 features - 2 classes - 18372 missing values
Author: Marius Lindauer. Date: 27.02.2014. This data set was generated for a publication about claspfolio 2.0, i.e., an algorithm selector for ASP. The algorithm portfolio of clasp (2.1.4)…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
1294 instances - 143 features - 11 classes - 18258 missing values
Annual salary information including gross pay and overtime pay for all active, permanent employees of Montgomery County, MD paid in calendar year 2016. This information will be published annually each…
0 runs - 0 likes - 3 downloads - 3 reach - 8 impact
9228 instances - 13 features - 0 classes - 11169 missing values
Description: This dataset is part of a collection of datasets based on the game "Jungle Chess" (a.k.a. Dou Shou Qi). For a description of the rules, please refer to the paper (link attached). The…
11 runs - 0 likes - 1 downloads - 1 reach - 12 impact
44819 instances - 47 features - 3 classes - 10584 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : This is a pre-processed version of the dataset used in Kaggle's Online Product Sales competition…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
639 instances - 413 features - classes - 10012 missing values
No data.
68 runs - 0 likes - 4 downloads - 4 reach - 9 impact
20000 instances - 17 features - 3 classes - 10000 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : This is a pre-processed version of the dataset used in Kaggle's See Click Predict Fix competition…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1137 instances - 26 features - classes - 9255 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : This is a pre-processed version of the dataset used in Kaggle's See Click Predict Fix competition…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1137 instances - 26 features - classes - 9255 missing values
Testing this platform
0 runs - 0 likes - 0 downloads - 0 reach - 11 impact
36203 instances - 18 features - 0 classes - 8971 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
104 runs - 0 likes - 7 downloads - 7 reach - 15 impact
1302 instances - 34 features - 2 classes - 7830 missing values
No data.
253 runs - 0 likes - 9 downloads - 9 reach - 9 impact
1076790 instances - 30 features - 2 classes - 7275 missing values
Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's…
0 runs - 0 likes - 3 downloads - 3 reach - 8 impact
1460 instances - 81 features - 0 classes - 6965 missing values
Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's…
0 runs - 0 likes - 1 downloads - 1 reach - 8 impact
1460 instances - 80 features - 0 classes - 6965 missing values
This data represents crime reported to the Seattle Police Department (SPD). Each row contains the record of a unique event where at least one criminal offense was reported by a member of the community…
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
523590 instances - 8 features - 144 classes - 6916 missing values
uci adult partitioned
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
48844 instances - 17 features - classes - 6495 missing values
Prediction task is to determine whether a person makes over 50K a year. Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the…
14257 runs - 1 likes - 25 downloads - 26 reach - 37 impact
48842 instances - 15 features - 2 classes - 6465 missing values
Prediction task is to determine whether a person makes over 50K a year. Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the…
2671 runs - 1 likes - 32 downloads - 33 reach - 11 impact
48842 instances - 15 features - 2 classes - 6465 missing values
Attribute information: ``` sick, negative. | classes age: continuous. sex: M, F. on thyroxine: f, t. query on thyroxine: f, t. on antithyroid medication: f, t. sick: f, t. pregnant: f, t. thyroid…
19940 runs - 0 likes - 31 downloads - 31 reach - 9 impact
3772 instances - 30 features - 2 classes - 6064 missing values
Thyroid disease records supplied by the Garavan Institute and J. Ross Quinlan, New South Wales Institute, Sydney, Australia. 1987. hypothyroid, primary hypothyroid, compensated…
883 runs - 0 likes - 11 downloads - 11 reach - 9 impact
3772 instances - 30 features - 4 classes - 6064 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
737 runs - 0 likes - 9 downloads - 9 reach - 15 impact
3772 instances - 30 features - 2 classes - 6064 missing values
No data.
496 runs - 0 likes - 6 downloads - 6 reach - 22 impact
45 instances - 4027 features - 2 classes - 5948 missing values
hmeq_p,BAD,binary
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
5960 instances - 15 features - classes - 5271 missing values