Data
Filter results by:
Publication Request: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This file describes the contents of the heart-disease directory. This directory contains 4 databases…
10 runs0 likes0 downloads0 reach9 impact
294 instances - 14 features - 0 classes - 782 missing values
One of the data sets used in the book "Analyzing Categorical Data" by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on statistical…
0 runs0 likes0 downloads0 reach13 impact
30 instances - 7 features - 0 classes - 6 missing values
This file contains data from Regression Analysis By Example, 2nd Edition, by Samprit Chatterjee and Bertram Price, John Wiley, 1991. Data sets have names of the form 'rabe.xxx' where xxx is the page…
0 runs0 likes0 downloads0 reach13 impact
51 instances - 7 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach13 impact
400 instances - 7 features - 0 classes - 0 missing values
Attribute information: ``` sick, negative. | classes age: continuous. sex: M, F. on thyroxine: f, t. query on thyroxine: f, t. on antithyroid medication: f, t. sick: f, t. pregnant: f, t. thyroid…
19940 runs0 likes31 downloads31 reach9 impact
3772 instances - 30 features - 2 classes - 6064 missing values
Donor: Will Taylor (taylor@pluto.arc.nasa.gov) In this version (version 2), some features were removed. It is unclear why of how this was done.
1883 runs1 likes10 downloads11 reach9 impact
368 instances - 23 features - 2 classes - 1927 missing values
1. Title of Database: Abalone data 2. Sources: (a) Original owners of database: Marine Resources Division Marine Research Laboratories - Taroona Department of Primary Industry and Fisheries, Tasmania…
34899 runs0 likes18 downloads18 reach9 impact
4177 instances - 9 features - 28 classes - 0 missing values
1. Title: Protein Localization Sites 2. Creator and Maintainer: Kenta Nakai Institue of Molecular and Cellular Biology Osaka, University 1-3 Yamada-oka, Suita 565 Japan nakai@imcb.osaka-u.ac.jp…
1806 runs0 likes14 downloads14 reach12 impact
336 instances - 8 features - 8 classes - 0 missing values
Donor: Will Taylor (taylor@pluto.arc.nasa.gov) Database of surgeries on horses. Possible class attributes: 24 (whether lesion is surgical), others include: 23, 25, 26, and 27 Notes: * Hospital_Number…
236 runs0 likes9 downloads9 reach9 impact
368 instances - 27 features - 2 classes - 1927 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
739 runs0 likes11 downloads11 reach15 impact
4052 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
670 runs0 likes4 downloads4 reach14 impact
62 instances - 8 features - 2 classes - 8 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
747 runs0 likes14 downloads14 reach15 impact
4177 instances - 9 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
737 runs0 likes5 downloads5 reach14 impact
47 instances - 8 features - 2 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
698 runs0 likes6 downloads6 reach14 impact
97 instances - 11 features - 2 classes - 0 missing values
February 23, 1982 The 1982 annual meetings of the American Statistical Association (ASA) will be held August 16-19, 1982 in Cincinnati. At that meeting, the ASA Committee on Statistical Graphics plans…
759 runs0 likes9 downloads9 reach24 impact
209 instances - 9 features - 2 classes - 15 missing values
Contains 110 data sets from the book 'The Statistical Sleuth' by Fred Ramsey and Dan Schafer; Duxbury Press, 1997. (schafer@stat.orst.edu) [14/Oct/97] (172k) Note: description taken from this web…
722 runs0 likes6 downloads6 reach14 impact
60 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
791 runs0 likes7 downloads7 reach15 impact
400 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
764 runs0 likes6 downloads6 reach15 impact
400 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
788 runs0 likes7 downloads7 reach15 impact
400 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
779 runs0 likes7 downloads7 reach15 impact
400 instances - 8 features - 2 classes - 0 missing values
; ; Thyroid disease records supplied by the Garavan Institute and J. Ross ; Quinlan, New South Wales Institute, Syndney, Australia. ; ; 1987. ; hypothyroid, primary hypothyroid, compensated…
883 runs0 likes12 downloads12 reach9 impact
3772 instances - 30 features - 4 classes - 6064 missing values
This dataset classifies people described by a set of attributes as good or bad credit risks. This dataset comes with a cost matrix: ``` Good Bad (predicted) Good 0 1 (actual) Bad 5 0 ``` It is worse…
505958 runs23 likes266 downloads289 reach30 impact
1000 instances - 21 features - 2 classes - 0 missing values
The dataset (originally named ELEC2) contains 45,312 instances dated from 7 May 1996 to 5 December 1998. Each example of the dataset refers to a period of 30 minutes, i.e. there are 48 instances for…
106854 runs3 likes40 downloads43 reach12 impact
45312 instances - 9 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
758 runs0 likes8 downloads8 reach15 impact
500 instances - 8 features - 2 classes - 0 missing values
No data.
697 runs0 likes5 downloads5 reach14 impact
89 instances - 9 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
554 runs0 likes10 downloads10 reach15 impact
40768 instances - 11 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
819 runs0 likes10 downloads10 reach15 impact
500 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
756 runs0 likes6 downloads6 reach15 impact
310 instances - 9 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
755 runs0 likes4 downloads4 reach14 impact
54 instances - 8 features - 2 classes - 120 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
737 runs0 likes9 downloads9 reach15 impact
3772 instances - 30 features - 2 classes - 6064 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
815 runs0 likes9 downloads9 reach15 impact
336 instances - 8 features - 2 classes - 0 missing values
SUMMARY: Data from an experiment on the affects of machine adjustments on the time to count bolts. Data appear as the STATS (Issue 10) Challenge. DATA: Submitted by W. Robert Stephenson, Iowa State…
754 runs0 likes9 downloads9 reach14 impact
40 instances - 8 features - 2 classes - 0 missing values
Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/
180 runs0 likes5 downloads5 reach24 impact
294 instances - 11 features - 2 classes - 0 missing values
Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/
68 runs0 likes8 downloads8 reach25 impact
32561 instances - 16 features - 2 classes - 4262 missing values
Source: The dataset was created by Angeliki Xifara (angxifara @ gmail.com, Civil/Structural Engineer) and was processed by Athanasios Tsanas (tsanasthanasis @ gmail.com, Oxford Centre for Industrial…
103 runs1 likes5 downloads6 reach13 impact
768 instances - 10 features - 37 classes - 0 missing values
* Title: South Africa Heart Disease Dataset * Description A retrospective sample of males in a heart-disease high-risk region of the Western Cape, South Africa. There are roughly two controls per case…
155 runs0 likes14 downloads14 reach14 impact
462 instances - 10 features - 2 classes - 0 missing values
"The debutanizer column is part of a desulfuring and naphtha splitter plant." u1 Top temperature u2 Top pressure u3 Reflux flow u4 Flow to next process u5 6th tray temperature u6 Bottom…
0 runs0 likes1 downloads1 reach12 impact
2394 instances - 8 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach9 impact
78732 instances - 11 features - 0 classes - 0 missing values
Normalized form of codrna (351) Andrew V Uzilov, Joshua M Keegan, and David H Mathews. Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC…
309 runs0 likes5 downloads5 reach9 impact
488565 instances - 9 features - 2 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au7-300-drift-au7-cpd1-800 * Abstract: AutoUniv is an advanced data generator for classifications tasks. The aim is to reflect the nuances and…
7130 runs0 likes11 downloads11 reach35 impact
1100 instances - 13 features - 5 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au7-700 * Abstract: AutoUniv is an advanced data generator for classifications tasks. The aim is to reflect the nuances and heterogeneity of…
4537 runs0 likes7 downloads7 reach27 impact
700 instances - 13 features - 3 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au7-cpd1-500 * Abstract: AutoUniv is an advanced data generator for classifications tasks. The aim is to reflect the nuances and heterogeneity…
7145 runs0 likes7 downloads7 reach35 impact
500 instances - 13 features - 5 classes - 0 missing values
Data from StatLib (ftp stat.cmu.edu/datasets) Data from which conclusions were drawn in the article "Sleep in Mammals: Ecological and Constitutional Correlates" by Allison, T. and Cicchetti, D.…
0 runs0 likes1 downloads1 reach9 impact
62 instances - 8 features - 0 classes - 12 missing values
This is the hip measurement data from Table B.13 in Chatfield's Problem Solving (1995, 2nd edn, Chapman and Hall). It is given in 8 columns. First 4 columns are for Control Group. Last 4 columns are…
0 runs0 likes0 downloads0 reach11 impact
54 instances - 8 features - classes - 120 missing values
The data are a subsample of 500 observations from a data set that originate in a study where air pollution at a road is related to traffic volume and meteorological variables, collected by the…
2 runs0 likes1 downloads1 reach14 impact
500 instances - 8 features - 0 classes - 0 missing values
The data are a subsample of 500 observations from a data set that originate in a study where air pollution at a road is related to traffic volume and meteorological variables, collected by the…
2 runs0 likes1 downloads1 reach14 impact
500 instances - 8 features - 0 classes - 0 missing values
No data.
332 runs0 likes4 downloads4 reach12 impact
1000000 instances - 17 features - 2 classes - 0 missing values
Andrew V Uzilov, Joshua M Keegan, and David H Mathews. Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC Bioinformatics, 7(173), 2006. This…
31 runs0 likes10 downloads10 reach15 impact
488565 instances - 9 features - 2 classes - 0 missing values
This is an artificial data set with dependencies between the attribute values. The cases are generated using the following method: X1 : uniformly distributed over [-5,5] X2 : uniformly distributed…
3 runs1 likes5 downloads6 reach14 impact
40768 instances - 11 features - 0 classes - 0 missing values
This dataset describes 100,000 realistic, synthetically generated worker compensation insurance claims. Along the ultimate financial losses, each claim is described by the initial case estimate, date…
0 runs0 likes0 downloads0 reach0 impact
100000 instances - 14 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
1484 instances - 18 features - classes - 0 missing values
Make target (age) numeric**Author**: 1. Title of Database: Abalone data 2. Sources: (a) Original owners of database: Marine Resources Division Marine Research Laboratories - Taroona Department of…
0 runs0 likes0 downloads0 reach1 impact
4177 instances - 9 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach13 impact
400 instances - 8 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
22 runs0 likes2 downloads2 reach14 impact
400 instances - 8 features - 0 classes - 0 missing values
BitcoinHeist Ransomware Dataset Akcora, C.G., Li, Y., Gel, Y.R. and Kantarcioglu, M., 2019. BitcoinHeist. Topological Data Analysis for Ransomware Detection on the Bitcoin Blockchain. IJCAI-PRICAI…
0 runs1 likes0 downloads1 reach6 impact
2916697 instances - 10 features - 29 classes - 0 missing values
The dataset freMTPL2freq contains risk features for 677,991 motor third-part liability policies (observed mostly on one year).
0 runs1 likes2 downloads3 reach9 impact
678013 instances - 12 features - classes - 0 missing values
diabetes
0 runs0 likes2 downloads2 reach6 impact
768 instances - 9 features - classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
4 runs0 likes2 downloads2 reach18 impact
2984 instances - 145 features - 2 classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
1 runs0 likes0 downloads0 reach15 impact
31406 instances - 23 features - 2 classes - 29756 missing values
uci adult partitioned
0 runs0 likes0 downloads0 reach8 impact
48844 instances - 17 features - classes - 6495 missing values
#modelage
28 runs0 likes0 downloads0 reach8 impact
202 instances - 13 features - 3 classes - 202 missing values
nfl_games
0 runs0 likes0 downloads0 reach8 impact
16274 instances - 12 features - classes - 0 missing values
The goal is to predict the Fare. Variable description: pclass: A proxy for socio-economic status (SES) 1st = Upper 2nd = Middle 3rd = Lower age: Age is fractional if less than 1. If the age is…
0 runs0 likes4 downloads4 reach11 impact
1307 instances - 8 features - 0 classes - 0 missing values
test001
0 runs1 likes0 downloads1 reach11 impact
768 instances - 9 features - classes - 0 missing values
``**Author**: Cigdem Inan Aci","Mehmet Fatih Akay ### Data Set Information All simulations have done under the software named OPNET Modeler. Message passing is used as the communication mechanism in…
0 runs0 likes0 downloads0 reach8 impact
640 instances - 10 features - classes - 0 missing values
this is titanic survival prediction
0 runs0 likes3 downloads3 reach7 impact
891 instances - 8 features - 0 classes - 0 missing values
titanic surviual prediction
0 runs0 likes3 downloads3 reach7 impact
891 instances - 8 features - 0 classes - 0 missing values
this is titanic survival prediction
0 runs0 likes4 downloads4 reach7 impact
891 instances - 8 features - 0 classes - 0 missing values
titanic surviual prediction
0 runs0 likes1 downloads1 reach8 impact
891 instances - 8 features - 0 classes - 0 missing values
titanic surviual prediction
0 runs0 likes0 downloads0 reach7 impact
891 instances - 8 features - classes - 0 missing values
titanic surviual prediction
0 runs0 likes0 downloads0 reach7 impact
891 instances - 8 features - classes - 0 missing values
titanic surviual prediction
0 runs0 likes0 downloads0 reach7 impact
891 instances - 8 features - classes - 0 missing values
titanic surviual prediction
6 runs0 likes0 downloads0 reach8 impact
891 instances - 8 features - classes - 0 missing values
titanic surviual prediction
0 runs0 likes2 downloads2 reach7 impact
891 instances - 8 features - 0 classes - 0 missing values
titanic surviual prediction
0 runs0 likes1 downloads1 reach7 impact
891 instances - 8 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs0 likes0 downloads0 reach14 impact
4052 instances - 8 features - 0 classes - 0 missing values
Contains 110 data sets from the book 'The Statistical Sleuth' by Fred Ramsey and Dan Schafer; Duxbury Press, 1997. (schafer@stat.orst.edu) [14/Oct/97] (172k) Note: description taken from this web…
0 runs0 likes0 downloads0 reach13 impact
47 instances - 8 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach13 impact
400 instances - 8 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach13 impact
400 instances - 8 features - 0 classes - 0 missing values
This dataset include data for the estimation of obesity levels in individuals from the countries of Mexico, Peru and Colombia, based on their eating habits and physical condition. The data contains 17…
0 runs0 likes0 downloads0 reach0 impact
2111 instances - 17 features - classes - 0 missing values
In our research each record (row) is data for a week. Each record also has the percentage of return that stock has in the following week (percent_change_next_weeks_price). Ideally, you want to…
0 runs0 likes0 downloads0 reach0 impact
750 instances - 16 features - classes - 60 missing values
1. Title: Pima Indians Diabetes Database 2. Sources: (a) Original owners: National Institute of Diabetes and Digestive and Kidney Diseases (b) Donor of database: Vincent Sigillito…
202121 runs6 likes91 downloads97 reach16 impact
768 instances - 9 features - 2 classes - 0 missing values
No data.
2198 runs1 likes17 downloads18 reach9 impact
1484 instances - 9 features - 10 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
760 runs0 likes14 downloads14 reach15 impact
8192 instances - 9 features - 2 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
103 runs0 likes4 downloads4 reach14 impact
52 instances - 9 features - 2 classes - 0 missing values
DATA-SETS FROM DIGGLE, P.J. (1990). TIME SERIES : A BIOSTATISTICAL INTRODUCTION. Oxford University Press. Table: Table A2 Wool prices Information about the dataset CLASSTYPE: numeric CLASSINDEX: none…
626 runs0 likes6 downloads6 reach14 impact
310 instances - 9 features - 9 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
581 runs0 likes11 downloads11 reach15 impact
20640 instances - 9 features - 2 classes - 0 missing values
Date: Tue, 15 Nov 88 15:44:08 EST From: stan To: aha@ICS.UCI.EDU 1. Title: Final settlements in labor negotitions in Canadian industry 2. Source Information -- Creators:…
7681 runs0 likes17 downloads17 reach12 impact
57 instances - 17 features - 2 classes - 326 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
107 runs0 likes3 downloads3 reach14 impact
74 instances - 9 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
733 runs0 likes4 downloads4 reach14 impact
87 instances - 11 features - 2 classes - 0 missing values
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
485 runs0 likes5 downloads5 reach13 impact
76 instances - 15 features - 7 classes - 37 missing values
No data.
747 runs0 likes7 downloads7 reach15 impact
369 instances - 9 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
112 runs0 likes5 downloads5 reach14 impact
42 instances - 10 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
118 runs0 likes3 downloads3 reach15 impact
228 instances - 9 features - 2 classes - 20 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
761 runs0 likes14 downloads14 reach15 impact
8192 instances - 9 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
796 runs0 likes9 downloads9 reach15 impact
8192 instances - 9 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
589 runs0 likes12 downloads12 reach15 impact
22784 instances - 9 features - 2 classes - 0 missing values
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable,…
908 runs0 likes9 downloads9 reach14 impact
130 instances - 9 features - 2 classes - 0 missing values
No data.
748 runs0 likes7 downloads7 reach15 impact
274 instances - 9 features - 2 classes - 0 missing values
Source: David Gil, dgil '@' dtic.ua.es, Lucentia Research Group, Department of Computer Technology, University of Alicante Jose Luis Girela, girela '@' ua.es, Department of Biotechnology, University…
451 runs0 likes7 downloads7 reach13 impact
100 instances - 10 features - 2 classes - 0 missing values