Data
1. TITLE: Letter Image Recognition Data The objective is to identify each of a large number of black-and-white rectangular pixel displays as one of the 26 capital letters in the English alphabet. The…
69266 runs - 1 likes - 73 downloads - 74 reach - 12 impact
20000 instances - 17 features - 26 classes - 0 missing values
A simple database containing 17 Boolean-valued attributes describing animals. The "type" attribute appears to be the class attribute. Notes: * I find it unusual that there are 2 instances of "frog"…
175 runs - 3 likes - 20 downloads - 23 reach - 9 impact
101 instances - 17 features - 7 classes - 0 missing values
1. Title: 1984 United States Congressional Voting Records Database 2. Source Information: (a) Source: Congressional Quarterly Almanac, 98th Congress, 2nd session 1984, Volume XL: Congressional…
2262 runs - 0 likes - 17 downloads - 17 reach - 9 impact
435 instances - 17 features - 2 classes - 392 missing values
We create a digit database by collecting 250 samples from 44 writers. The samples written by 30 writers are used for training, cross-validation and writer dependent testing, and the digits written by…
37245 runs - 0 likes - 21 downloads - 21 reach - 12 impact
10992 instances - 17 features - 10 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
41 runs - 0 likes - 2 downloads - 2 reach - 15 impact
1340 instances - 17 features - 3 classes - 20 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
604 runs - 0 likes - 14 downloads - 14 reach - 15 impact
22784 instances - 17 features - 2 classes - 0 missing values
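The mean-threshold binarization described for this entry can be sketched as follows. This is a minimal sketch, not the conversion code actually used for the dataset: the function name `binarize_by_mean` is hypothetical, and because the description above is truncated, the choice of labels and of which side of the mean receives which label is an assumption (here, values strictly below the mean become 'P' and all others 'N').

```python
from statistics import mean


def binarize_by_mean(targets, below="P", otherwise="N"):
    """Turn a numeric target into a two-class nominal target.

    Assumption: values strictly below the mean get `below`, everything
    else gets `otherwise`. The listing truncates this detail, so the
    direction and the label names are illustrative only.
    """
    threshold = mean(targets)
    return [below if t < threshold else otherwise for t in targets]


if __name__ == "__main__":
    # The mean of these four values is 4, so only 10 is at or above it.
    print(binarize_by_mean([1, 2, 3, 10]))
```

Swapping the `below` and `otherwise` arguments reverses the labeling if the original convention turns out to be the opposite.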
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
676 runs - 0 likes - 14 downloads - 14 reach - 15 impact
10992 instances - 17 features - 2 classes - 0 missing values
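The majority-class binarization described for this entry can be sketched as follows. This is a minimal sketch under stated assumptions, not the conversion code actually used: the function name `binarize_majority` is hypothetical, the negative label 'N' is assumed because the description above is truncated after the positive label, and ties between equally frequent classes are broken in favor of the class seen first.

```python
from collections import Counter


def binarize_majority(labels, positive="P", negative="N"):
    """Relabel the most frequent class as `positive`, all others as `negative`.

    Assumptions: the negative label 'N' and the tie-breaking rule
    (first class encountered wins) are not stated in the listing.
    """
    counts = Counter(labels)
    majority, _ = counts.most_common(1)[0]
    return [positive if y == majority else negative for y in labels]


if __name__ == "__main__":
    # "a" occurs twice, so it is the majority class here.
    print(binarize_majority(["a", "b", "a", "c"]))
```

The result always has at most two classes, which matches the "2 classes" figure on the binarized entries in this listing.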
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
176 runs - 0 likes - 7 downloads - 7 reach - 14 impact
101 instances - 17 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
131 runs - 0 likes - 6 downloads - 6 reach - 15 impact
1340 instances - 17 features - 2 classes - 20 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
639 runs - 0 likes - 13 downloads - 13 reach - 15 impact
20000 instances - 17 features - 2 classes - 0 missing values
The data is related to direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls. Often, more than one contact to the same client was…
65709 runs - 3 likes - 41 downloads - 44 reach - 31 impact
45211 instances - 17 features - 2 classes - 0 missing values
No data.
50 runs - 0 likes - 2 downloads - 2 reach - 13 impact
1000000 instances - 18 features - 22 classes - 0 missing values
No data.
65 runs - 1 likes - 2 downloads - 3 reach - 9 impact
1000000 instances - 18 features - 7 classes - 0 missing values
The database was created with records of behavior of the urban traffic of the city of Sao Paulo in Brazil from December 14, 2009 to December 18, 2009 (From Monday to Friday). Registered from 7:00 to…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
135 instances - 18 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
1484 instances - 18 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
16599 instances - 18 features - classes - 0 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
12330 instances - 18 features - classes - 0 missing values
This data set includes hourly air pollutant data from 12 nationally-controlled air-quality monitoring sites.
0 runs - 0 likes - 1 downloads - 1 reach - 0 impact
420768 instances - 18 features - classes - 74027 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
101 instances - 18 features - classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Electrical Discharge Machining dataset (Karalic and Bratko 1997) represents a two-target…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
154 instances - 18 features - classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Jura (Goovaerts 1997) dataset consists of measurements of concentrations of seven heavy…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
359 instances - 18 features - classes - 0 missing values
Zurich public transport delay data 2016-10-30 03:30:00 CET - 2016-11-27 01:20:00 CET cleaned and prepared at Open Data Day 2017. For this version, the task was downsampled to 0.5 percent. Some…
0 runs - 0 likes - 0 downloads - 0 reach - 7 impact
27327 instances - 18 features - 0 classes - 657 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Electrical Discharge Machining dataset (Karalic and Bratko 1997) represents a two-target…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
154 instances - 18 features - classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Jura (Goovaerts 1997) dataset consists of measurements of concentrations of seven heavy…
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
359 instances - 18 features - classes - 0 missing values
Testing this platform
0 runs - 0 likes - 0 downloads - 0 reach - 12 impact
36203 instances - 18 features - 0 classes - 8971 missing values
This data set measures the running time of a matrix-matrix product A x B = C, where all matrices have size 2048 x 2048, using a parameterizable SGEMM GPU kernel with 241600 possible parameter…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
241600 instances - 18 features - classes - 0 missing values
Citation Request: This primary tumor domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
1261 runs - 0 likes - 16 downloads - 16 reach - 12 impact
339 instances - 18 features - 21 classes - 225 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
752 runs - 0 likes - 7 downloads - 7 reach - 15 impact
339 instances - 18 features - 2 classes - 225 missing values
No data.
30 runs - 0 likes - 1 downloads - 1 reach - 13 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 13 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
27 runs - 0 likes - 1 downloads - 1 reach - 13 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 13 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 13 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
27 runs - 0 likes - 1 downloads - 1 reach - 13 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
27 runs - 0 likes - 1 downloads - 1 reach - 13 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
28 runs - 0 likes - 1 downloads - 1 reach - 13 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 12 impact
1000000 instances - 19 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 9 impact
1000000 instances - 19 features - 0 classes - 0 missing values
This data set is also obtained from the task of controlling a F16 aircraft, although the target variable and attributes are different from the ailerons domain. In this case the goal variable is…
2 runs - 0 likes - 7 downloads - 7 reach - 11 impact
16599 instances - 19 features - 0 classes - 0 missing values
No data.
68 runs - 0 likes - 2 downloads - 2 reach - 9 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
304 runs - 0 likes - 3 downloads - 3 reach - 9 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
63 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 19 features - 4 classes - 0 missing values
Primary Biliary Cirrhosis This data set is a follow-up to the original PBC data set, as discussed in appendix D of Fleming and Harrington, Counting Processes and Survival Analysis, Wiley, 1991. An…
0 runs - 0 likes - 5 downloads - 5 reach - 13 impact
1945 instances - 19 features - 0 classes - 1133 missing values
No data.
310 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 19 features - 4 classes - 0 missing values
String datetime information extracted to numeric columns. Trip Record Data provided by the New York City Taxi and Limousine Commission (TLC)…
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
581835 instances - 19 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
1000000 instances - 19 features - 4 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 2 impact
37 instances - 19 features - classes - 0 missing values
The energy dispersive X-ray fluorescence (EDXRF) was used to determine the chemical composition of celadon body and glaze in Longquan kiln (at Dayao County) and Jingdezhen kiln. Forty typical shards…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
88 instances - 19 features - classes - 0 missing values
Case number deleted. X treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric…
10 runs - 0 likes - 1 downloads - 1 reach - 12 impact
418 instances - 19 features - 0 classes - 1239 missing values
No data.
29 runs - 0 likes - 2 downloads - 2 reach - 13 impact
1000000 instances - 19 features - 4 classes - 0 missing values
Citation Request: This lymphography domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data.…
1972 runs - 0 likes - 31 downloads - 31 reach - 12 impact
148 instances - 19 features - 4 classes - 0 missing values
NAME vehicle silhouettes PURPOSE to classify a given silhouette as one of four types of vehicle, using a set of features extracted from the silhouette. The vehicle may be viewed from one of many…
31508 runs - 2 likes - 35 downloads - 37 reach - 11 impact
846 instances - 19 features - 4 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
680 runs - 0 likes - 5 downloads - 5 reach - 15 impact
1945 instances - 19 features - 2 classes - 1133 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1176 runs - 0 likes - 12 downloads - 12 reach - 15 impact
16599 instances - 19 features - 2 classes - 0 missing values
Dataset from 'Pattern Recognition and Neural Networks' by B.D. Ripley, Cambridge University Press (1996), ISBN 0-521-46086-7. The background to the datasets is described in section 1.4; this file…
587 runs - 0 likes - 5 downloads - 5 reach - 14 impact
61 instances - 19 features - 4 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
723 runs - 0 likes - 5 downloads - 5 reach - 15 impact
418 instances - 19 features - 2 classes - 1239 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
748 runs - 0 likes - 8 downloads - 8 reach - 14 impact
148 instances - 19 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
810 runs - 0 likes - 8 downloads - 8 reach - 15 impact
846 instances - 19 features - 2 classes - 0 missing values
Datasets from the Data and Story Library, a project illustrating the use of basic statistical methods, converted to ARFF format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs - 0 likes - 1 downloads - 1 reach - 13 impact
200 instances - 20 features - 0 classes - 0 missing values
No data.
211 runs - 0 likes - 3 downloads - 3 reach - 12 impact
1000000 instances - 20 features - 7 classes - 0 missing values
No data.
69 runs - 0 likes - 4 downloads - 4 reach - 9 impact
1000000 instances - 20 features - 2 classes - 0 missing values
No data.
331 runs - 0 likes - 7 downloads - 7 reach - 9 impact
1000000 instances - 20 features - 2 classes - 0 missing values
Primary Biliary Cirrhosis The data set found in appendix D of Fleming and Harrington, Counting Processes and Survival Analysis,…
18 runs - 1 likes - 3 downloads - 4 reach - 14 impact
418 instances - 20 features - 0 classes - 1033 missing values
Data for a stock long position
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
4477 instances - 20 features - 0 classes - 0 missing values
Automated file upload of BNG(segment)
99 runs - 0 likes - 1 downloads - 1 reach - 12 impact
1000000 instances - 20 features - 7 classes - 0 missing values
#modelage
31 runs - 0 likes - 0 downloads - 0 reach - 8 impact
202 instances - 20 features - 2 classes - 17 missing values
Context "Predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs." [IBM Sample Data Sets] Content Each row represents a…
0 runs - 1 likes - 2 downloads - 3 reach - 8 impact
7043 instances - 20 features - 2 classes - 0 missing values
#modelage
87 runs - 0 likes - 0 downloads - 0 reach - 8 impact
224 instances - 20 features - 6 classes - 205 missing values
User profile data for San Francisco OkCupid users published in [Kim, A. Y., & Escobedo-Land, A. (2015). OKCupid data for introductory statistics and data science courses. Journal of Statistics…
0 runs - 0 likes - 0 downloads - 0 reach - 10 impact
50789 instances - 20 features - 3 classes - 154107 missing values
https://www.kaggle.com/harlfoxem/ This dataset contains house sale prices for King County, which includes Seattle. It includes homes sold between May 2014 and May 2015. It contains 19 house features…
0 runs - 0 likes - 4 downloads - 4 reach - 8 impact
21613 instances - 20 features - classes - 0 missing values
Premier league matches from 2008 to 2014 with TDA features extracted.
0 runs - 0 likes - 0 downloads - 0 reach - 8 impact
2565 instances - 20 features - classes - 0 missing values
User profile data for San Francisco OkCupid users published in [Kim, A. Y., & Escobedo-Land, A. (2015). OKCupid data for introductory statistics and data science courses. Journal of Statistics…
0 runs - 0 likes - 0 downloads - 0 reach - 1 impact
50789 instances - 20 features - 3 classes - 154107 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs - 0 likes - 0 downloads - 0 reach - 13 impact
120 instances - 20 features - 0 classes - 0 missing values
We aggregated screen movements into screen-fixations using a Salvucci & Goldberg (2000) dispersion-threshold algorithm, and defined Perception Action Cycles (PACs) as fixations with at least one…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
3395 instances - 20 features - classes - 168 missing values
This dataset contains house sale prices for King County, which includes Seattle. It includes homes sold between May 2014 and May 2015. It contains 19 house features plus the price and the id columns,…
0 runs - 0 likes - 4 downloads - 4 reach - 9 impact
21613 instances - 20 features - 0 classes - 0 missing values
Product listing data submitted to the U.S. FDA for all unfinished, unapproved drugs.
0 runs - 0 likes - 1 downloads - 1 reach - 0 impact
120215 instances - 20 features - 7 classes - 443305 missing values
No data.
117 runs - 0 likes - 4 downloads - 4 reach - 9 impact
1000000 instances - 20 features - 5 classes - 0 missing values
The instances were drawn randomly from a database of 7 outdoor images. The images were hand-segmented to create a classification for every pixel. Each instance is a 3x3 region. __Major changes w.r.t.…
9969 runs - 0 likes - 7 downloads - 7 reach - 25 impact
2310 instances - 20 features - 7 classes - 0 missing values
The instances were drawn randomly from a database of 7 outdoor images. The images were hand-segmented to create a classification for every pixel. Each instance is a 3x3 region. ### Attribute…
23124 runs - 0 likes - 24 downloads - 24 reach - 12 impact
2310 instances - 20 features - 7 classes - 0 missing values
The objective was to determine which seedlots in a species are best for soil conservation in seasonally dry hill country. Determination is found by measurement of height, diameter by height, survival,…
27236 runs - 0 likes - 12 downloads - 12 reach - 10 impact
736 instances - 20 features - 5 classes - 448 missing values
1. Title: Hepatitis Domain 2. Sources: (a) unknown (b) Donor: G.Gong (Carnegie-Mellon University) via Bojan Cestnik Jozef Stefan Institute Jamova 39 61000 Ljubljana Yugoslavia (tel.: (38)(+61) 214-399…
2134 runs - 1 likes - 12 downloads - 13 reach - 9 impact
155 instances - 20 features - 2 classes - 167 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
701 runs - 0 likes - 3 downloads - 3 reach - 15 impact
736 instances - 20 features - 2 classes - 448 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
772 runs - 0 likes - 15 downloads - 15 reach - 15 impact
2310 instances - 20 features - 2 classes - 0 missing values
No data.
68 runs - 0 likes - 4 downloads - 4 reach - 12 impact
1000000 instances - 21 features - 2 classes - 0 missing values
No data.
225 runs - 0 likes - 7 downloads - 7 reach - 12 impact
1000000 instances - 21 features - 2 classes - 0 missing values
No data.
2 runs - 0 likes - 0 downloads - 0 reach - 14 impact
506 instances - 21 features - 0 classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
528 instances - 21 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
528 instances - 21 features - classes - 0 missing values
No data.
0 runs - 0 likes - 0 downloads - 0 reach - 5 impact
41188 instances - 21 features - classes - 0 missing values
The database was created with records of absenteeism at work from July 2007 to July 2010 at a courier company in Brazil. The data set allows for several new combinations of attributes and attribute…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
740 instances - 21 features - classes - 0 missing values
https://www.kaggle.com/dansbecker/nba-shot-logs
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
128069 instances - 21 features - classes - 5567 missing values
A spatio-temporal dataset of weekly chickenpox cases from Hungary.
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
522 instances - 21 features - classes - 0 missing values
It covers features from various categories of technical indicators, futures contracts, price of commodities, important indices of markets around the world, price of major companies in the U.S. market,…
0 runs - 0 likes - 0 downloads - 0 reach - 0 impact
522 instances - 21 features - classes - 0 missing values
GAMETES_Epistasis_2-Way_20atts_0.1H_EDM-1_1-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Epistasis_2-Way_20atts_0.4H_EDM-1_1-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Epistasis_3-Way_20atts_0.2H_EDM-1_1-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_50_EDM-2_001-pmlb
0 runs - 0 likes - 1 downloads - 1 reach - 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_75_EDM-2_001-pmlb
31 runs - 0 likes - 1 downloads - 1 reach - 22 impact
1600 instances - 21 features - 2 classes - 0 missing values
Experiment data obtained by running random configurations of xgboost through mlr on 118 different classification tasks from openml. Parameter descriptions:…
0 runs - 0 likes - 0 downloads - 0 reach - 7 impact
2955210 instances - 21 features - classes - 7051006 missing values
test
0 runs - 0 likes - 0 downloads - 0 reach - 6 impact
1000 instances - 21 features - classes - 0 missing values