Data
Filter results by:
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1190 runs0 likes9 downloads9 reach14 impact
111 instances - 4 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
739 runs0 likes6 downloads6 reach15 impact
662 instances - 4 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
802 runs0 likes8 downloads8 reach15 impact
662 instances - 4 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1268 runs0 likes11 downloads11 reach14 impact
131 instances - 4 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
722 runs0 likes5 downloads5 reach15 impact
285 instances - 8 features - 2 classes - 27 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
1032 runs0 likes7 downloads7 reach15 impact
151 instances - 6 features - 2 classes - 0 missing values
This is a dataset about the academic performance of students in different subjects
0 runs0 likes1 downloads1 reach0 impact
1000 instances - 8 features - 2 classes - 0 missing values
No data.
44 runs0 likes3 downloads3 reach12 impact
1000000 instances - 15 features - 2 classes - 0 missing values
%-*- text -*- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage…
2 runs0 likes2 downloads2 reach13 impact
60 instances - 17 features - 0 classes - 0 missing values
No data.
0 runs0 likes3 downloads3 reach12 impact
116640 instances - 10 features - 0 classes - 0 missing values
Juan J. Rodriguez, Ludmila I. Kuncheva, Carlos J. Alonso (2006). Rotation Forest: A new classifier ensemble method. IEEE Transactions on Pattern Analysis and Machine Intelligence. 28(10):1619-1630.…
0 runs0 likes0 downloads0 reach12 impact
1000000 instances - 12 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag. Electicity usage is being treated as the…
4 runs0 likes0 downloads0 reach9 impact
55 instances - 3 features - 0 classes - 0 missing values
Human Development Index [DATA] United Nations Development Program compiled an Index of Human Development. Column 1: Country(character) 2: Index 3: GNP GNP PER CAPITA RANK RANK - RANK HDI 1987 GNP RANK…
2 runs0 likes0 downloads0 reach13 impact
130 instances - 2 features - 0 classes - 0 missing values
17x17x2x2 tables of counts in GLIM-ready format used for the analyses in Biblarz, Timothy J., and Adrian E. Raftery. 1993. "The Effects of Family Disruption on Social Mobility." American Sociological…
3 runs0 likes1 downloads1 reach15 impact
1156 instances - 6 features - 0 classes - 0 missing values
The data consist of 2001 observations taken from a balloon about 30 kilometres above the surface of the earth. In the section of the flight shown here the balloon increases in height. As radiation…
0 runs1 likes2 downloads3 reach14 impact
2001 instances - 2 features - 0 classes - 0 missing values
Dataset from Smoothing Methods in Statistics (ftp stat.cmu.edu/datasets) Simonoff, J.S. (1996). Smoothing Methods in Statistics. New York: Springer-Verlag.
7 runs0 likes2 downloads2 reach9 impact
61 instances - 3 features - 0 classes - 0 missing values
Data originating from the book "Analyzing Categorical Data" by Jeffrey S. Simonoff.
1087 runs0 likes9 downloads9 reach15 impact
50 instances - 5 features - 2 classes - 0 missing values
Email dataset 1a
0 runs0 likes0 downloads0 reach4 impact
4585 instances - 4 features - 0 classes - 0 missing values
Email dataset 2
0 runs0 likes0 downloads0 reach2 impact
11507 instances - 4 features - 0 classes - 0 missing values
Autistic Spectrum Disorder (ASD) is a neurodevelopment condition associated with significant healthcare costs, and early diagnosis can significantly reduce these. Unfortunately, waiting times for an…
0 runs0 likes0 downloads0 reach0 impact
704 instances - 21 features - classes - 192 missing values
1987 National Indonesia Contraceptive Prevalence Survey
0 runs0 likes0 downloads0 reach0 impact
1473 instances - 10 features - classes - 0 missing values
dgf_test
0 runs0 likes0 downloads0 reach0 impact
3415 instances - 5 features - 2 classes - 1 missing values
dgf_test
0 runs0 likes0 downloads0 reach0 impact
3415 instances - 5 features - 2 classes - 1 missing values
Outliers data set extracted from the Illustration (Fig. 3) in "Novelty detection with application to data streams"
0 runs0 likes0 downloads0 reach0 impact
75 instances - 3 features - 4 classes - 0 missing values
This is weather data in arff format
0 runs0 likes0 downloads0 reach2 impact
14 instances - 5 features - classes - 0 missing values
sample
0 runs0 likes0 downloads0 reach2 impact
14 instances - 5 features - classes - 0 missing values
Data Sets for 'Regression Models for Time Series Analysis' by B. Kedem and K. Fokianos, Wiley 2002. Submitted by Kostas Fokianos (fokianos@ucy.ac.cy) [8/Nov/02] (176k) Note: - attribute names were…
2 runs0 likes0 downloads0 reach13 impact
264 instances - 3 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach11 impact
50 instances - 3 features - classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach11 impact
185 instances - 2 features - classes - 0 missing values
__Major changes w.r.t. version 2: ignored variable 3 in this upload as this seems to be ea perfect predictor.__ Tamilnadu Electricity Board Hourly Readings dataset. Real-time readings were collected…
0 runs0 likes2 downloads2 reach19 impact
45781 instances - 4 features - 20 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs0 likes0 downloads0 reach8 impact
14 instances - 5 features - 2 classes - 0 missing values
This data represents crime reported to the Seattle Police Department (SPD). Each row contains the record of a unique event where at least one criminal offense was reported by a member of the community…
0 runs0 likes0 downloads0 reach8 impact
523590 instances - 8 features - 144 classes - 6916 missing values
Airlines Dataset Inspired in the regression dataset from Elena Ikonomovska. The task is to predict whether a given flight will be delayed, given the information of the scheduled departure. For this…
0 runs0 likes2 downloads2 reach6 impact
26969 instances - 8 features - 2 classes - 0 missing values
frf r
0 runs0 likes0 downloads0 reach6 impact
2 instances - 3 features - classes - 0 missing values
This file holds global land temperatures by country
0 runs0 likes1 downloads1 reach10 impact
577462 instances - 4 features - classes - 64563 missing values
holds information on average temperature per country
0 runs0 likes0 downloads0 reach10 impact
577462 instances - 4 features - classes - 64563 missing values
The dataset freMTPL2sev contains claim amounts for 26,639 motor third-part liability policies.
0 runs0 likes0 downloads0 reach9 impact
26639 instances - 2 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach3 impact
4324 instances - 9 features - classes - 3360 missing values
User profile data for San Francisco OkCupid users published in [Kim, A. Y., & Escobedo-Land, A. (2015). OKCupid data for introductory statistics and data science courses. Journal of Statistics…
0 runs0 likes0 downloads0 reach10 impact
50789 instances - 20 features - 3 classes - 154107 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs0 likes2 downloads2 reach9 impact
14 instances - 5 features - 2 classes - 0 missing values
Historical Rainfall data of Bangladesh
0 runs0 likes0 downloads0 reach12 impact
16755 instances - 4 features - 0 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs0 likes1 downloads1 reach9 impact
14 instances - 5 features - 2 classes - 0 missing values
Wine data gathered by https://www.kaggle.com/zynicideThe data was scraped from WineEnthusiast during the week of June 15th, 2017. The code for the scraper can be found at…
0 runs0 likes0 downloads0 reach8 impact
150930 instances - 10 features - classes - 174477 missing values
This dataset contains all the player names and player ids, taken from Sofifa
0 runs0 likes0 downloads0 reach8 impact
11009 instances - 3 features - classes - 0 missing values
good
0 runs0 likes0 downloads0 reach8 impact
10 instances - 4 features - classes - 2 missing values
testing
0 runs0 likes0 downloads0 reach7 impact
366 instances - 3 features - classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs0 likes1 downloads1 reach8 impact
14 instances - 5 features - 2 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs0 likes1 downloads1 reach8 impact
14 instances - 5 features - 2 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs0 likes0 downloads0 reach8 impact
14 instances - 5 features - 2 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs0 likes0 downloads0 reach8 impact
14 instances - 5 features - 2 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs0 likes0 downloads0 reach8 impact
14 instances - 5 features - 2 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs0 likes0 downloads0 reach8 impact
14 instances - 5 features - 2 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs0 likes0 downloads0 reach8 impact
14 instances - 5 features - 2 classes - 0 missing values
User profile data for San Francisco OkCupid users published in [Kim, A. Y., & Escobedo-Land, A. (2015). OKCupid data for introductory statistics and data science courses. Journal of Statistics…
0 runs0 likes0 downloads0 reach1 impact
50789 instances - 20 features - 3 classes - 154107 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs0 likes2 downloads2 reach11 impact
366 instances - 5 features - classes - 2 missing values
One of the data sets used in the book "Analyzing Categorical Data" by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on statistical…
2 runs0 likes0 downloads0 reach13 impact
108 instances - 4 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs0 likes0 downloads0 reach13 impact
450 instances - 4 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs0 likes0 downloads0 reach13 impact
475 instances - 4 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs0 likes0 downloads0 reach13 impact
475 instances - 4 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs0 likes0 downloads0 reach13 impact
100 instances - 3 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs0 likes0 downloads0 reach14 impact
649 instances - 3 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs0 likes2 downloads2 reach11 impact
100 instances - 10 features - classes - 0 missing values
* Dataset Title: Wall-Following Robot Navigation Data Data Set (version with 2 Attributes) * Abstract: The data were collected as the SCITOS G5 robot navigates through the room following the wall in a…
109 runs0 likes3 downloads3 reach14 impact
5456 instances - 3 features - 4 classes - 0 missing values
An artificial data set where instances belongs to several clusters with a banana shape. There are two attributes At1 and At2 corresponding to the x and y axis, respectively. The class label (-1 and 1)…
163 runs2 likes17 downloads19 reach14 impact
5300 instances - 3 features - 2 classes - 0 missing values
The data was collected from car parks in Birmingham that are operated by NCP from Birmingham City Council. It contains the occupancy rates (8:00 to 16:30) from 2016/10/04 to 2016/12/19. ### Attribute…
0 runs0 likes0 downloads0 reach0 impact
35717 instances - 4 features - classes - 0 missing values
This dataset attributes first names to genders, giving counts and probabilities. It combines open-source government data from the US, UK, Canada, and Australia. This dataset combines raw counts for…
0 runs0 likes0 downloads0 reach0 impact
147269 instances - 4 features - classes - 0 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 3 features - 0 classes - 0 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 3 features - classes - 0 missing values
Online advertisement clicking rates, where the metrics are cost-per-click (CPC) and cost per thousand impressions (CPM).
0 runs0 likes0 downloads0 reach0 impact
1643 instances - 3 features - classes - 0 missing values
Online advertisement clicking rates, where the metrics are cost-per-click (CPC) and cost per thousand impressions (CPM).
0 runs0 likes0 downloads0 reach0 impact
1624 instances - 3 features - classes - 0 missing values
Online advertisement clicking rates, where the metrics are cost-per-click (CPC) and cost per thousand impressions (CPM).
0 runs0 likes0 downloads0 reach0 impact
1538 instances - 3 features - classes - 0 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 3 features - classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Tumor-size treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
0 runs0 likes3 downloads3 reach12 impact
286 instances - 10 features - 0 classes - 9 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 3 features - classes - 0 missing values
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Case number deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning…
10 runs2 likes4 downloads6 reach12 impact
195 instances - 11 features - 0 classes - 2 missing values
130k wine reviews with variety, location, winery, price, and description. Downloaded from Kaggle [https://www.kaggle.com/zynicide/wine-reviews/home] on 29.10.2018. The original data was scraped from…
0 runs0 likes0 downloads0 reach9 impact
129971 instances - 13 features - 0 classes - 204752 missing values
Gold medal winning pace in minutes per kilometer for the men's marathon from the first 1896 until 2016.
0 runs0 likes0 downloads0 reach10 impact
28 instances - 2 features - classes - 0 missing values
Abstract: The data set is composed of 60 chorales (5665 events) by J.S. Bach (1675-1750). Each event of each chorale is labelled using 1 among 101 chord labels and described through 14 features.…
31 runs0 likes2 downloads2 reach13 impact
5665 instances - 17 features - 102 classes - 0 missing values
1. Title: Haberman's Survival Data 2. Sources: (a) Donor: Tjen-Sien Lim (limt@stat.wisc.edu) (b) Date: March 4, 1999 3. Past Usage: 1. Haberman, S. J. (1976). Generalized Residuals for Log-Linear…
3241 runs1 likes19 downloads20 reach9 impact
306 instances - 4 features - 2 classes - 0 missing values
Data on educational transitions for a sample of 500 Irish schoolchildren aged 11 in 1967. The data were collected by Greaney and Kelleghan (1984), and reanalyzed by Raftery and Hout (1985, 1993). ###…
16022 runs0 likes16 downloads16 reach25 impact
500 instances - 6 features - 2 classes - 32 missing values
Prediction task is to determine whether a person makes over 50K a year. Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the…
2671 runs1 likes33 downloads34 reach12 impact
48842 instances - 15 features - 2 classes - 6465 missing values
Datasets for `Pattern Recognition and Neural Networks' by B.D. Ripley ===================================================================== Cambridge University Press (1996) ISBN 0-521-46086-7 The…
41 runs0 likes2 downloads2 reach14 impact
27 instances - 3 features - 4 classes - 0 missing values
ARFF version of UCI dataset 'flags'. Creators: Collected primarily from the "Collins Gem Guide to Flags": Collins Publishers (1986). Donor: Richard S. Forsyth. Date 5/15/1990 This data file contains…
108 runs0 likes10 downloads10 reach18 impact
194 instances - 29 features - 8 classes - 0 missing values
No data.
965 runs0 likes9 downloads9 reach9 impact
55296 instances - 10 features - 3 classes - 0 missing values
1. Title: Contraceptive Method Choice 2. Sources: (a) Origin: This dataset is a subset of the 1987 National Indonesia Contraceptive Prevalence Survey (b) Creator: Tjen-Sien Lim (limt@stat.wisc.edu)…
23919 runs0 likes20 downloads20 reach10 impact
1473 instances - 10 features - 3 classes - 0 missing values
Grass Grubs and Damage Ranking Data source: R. J. Townsend AgResearch, Lincoln, New Zealand Grass grubs are one of the major insect pests of pasture in Canterbury and can cause severe pasture damage…
988 runs0 likes8 downloads8 reach15 impact
155 instances - 9 features - 4 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1043 runs0 likes10 downloads10 reach14 impact
125 instances - 5 features - 2 classes - 0 missing values
Data Sets for 'Regression Models for Time Series Analysis' by B. Kedem and K. Fokianos, Wiley 2002. Submitted by Kostas Fokianos (fokianos@ucy.ac.cy) [8/Nov/02] (176k) Note: - attribute names were…
599 runs1 likes12 downloads13 reach15 impact
1024 instances - 3 features - 4 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
111 runs0 likes5 downloads5 reach14 impact
52 instances - 3 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
119 runs0 likes7 downloads7 reach14 impact
50 instances - 4 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
767 runs0 likes9 downloads9 reach15 impact
189 instances - 10 features - 2 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs0 likes0 downloads0 reach0 impact
14 instances - 5 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
511 runs0 likes6 downloads6 reach15 impact
185 instances - 3 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
762 runs0 likes6 downloads6 reach14 impact
88 instances - 3 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
842 runs0 likes7 downloads7 reach15 impact
155 instances - 9 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
771 runs0 likes9 downloads9 reach15 impact
468 instances - 4 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
903 runs0 likes8 downloads8 reach15 impact
468 instances - 3 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1164 runs0 likes7 downloads7 reach15 impact
222 instances - 3 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1073 runs0 likes10 downloads10 reach14 impact
140 instances - 4 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
1823 runs0 likes9 downloads9 reach14 impact
120 instances - 3 features - 2 classes - 0 missing values