Data
test
0 runs0 likes0 downloads0 reach6 impact
891 instances - 12 features - classes - 866 missing values
test
0 runs0 likes0 downloads0 reach6 impact
891 instances - 12 features - classes - 866 missing values
nominal features and target for COMPAS
0 runs0 likes1 downloads1 reach9 impact
5278 instances - 14 features - 2 classes - 0 missing values
Original data from https://github.com/propublica/compas-analysis/ by ProPublica. The data was subsequently preprocessed and reduced to relevant features for classification. The target variable is…
0 runs0 likes1 downloads1 reach10 impact
5278 instances - 14 features - 2 classes - 0 missing values
Test
0 runs0 likes0 downloads0 reach7 impact
6330 instances - 8 features - classes - 0 missing values
This classic dataset contains the prices and other attributes of almost 54,000 diamonds. It's a great dataset for beginners learning to work with data analysis and visualization. Content price price…
0 runs0 likes1 downloads1 reach9 impact
53940 instances - 10 features - 0 classes - 0 missing values
Prediction of residuary resistance of sailing yachts at the initial design stage is of great value for evaluating the ship’s performance and for estimating the required propulsive…
0 runs0 likes0 downloads0 reach8 impact
308 instances - 7 features - 0 classes - 0 missing values
testing temperature and ph
0 runs0 likes0 downloads0 reach5 impact
26 instances - 8 features - classes - 0 missing values
This data is used to test water contamination
0 runs0 likes0 downloads0 reach9 impact
26 instances - 8 features - classes - 0 missing values
One of the data sets used in the book "Analyzing Categorical Data" by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on statistical…
0 runs0 likes0 downloads0 reach13 impact
30 instances - 7 features - 0 classes - 6 missing values
This file contains data from Regression Analysis By Example, 2nd Edition, by Samprit Chatterjee and Bertram Price, John Wiley, 1991. Data sets have names of the form 'rabe.xxx' where xxx is the page…
0 runs0 likes0 downloads0 reach13 impact
51 instances - 7 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach13 impact
400 instances - 7 features - 0 classes - 0 missing values
No data.
697 runs0 likes7 downloads7 reach15 impact
320 instances - 9 features - 2 classes - 0 missing values
* Title: Wholesale customers Data Set * Abstract: The data set refers to clients of a wholesale distributor. It includes the annual spending in monetary units (m.u.) on diverse product categories *…
161 runs0 likes11 downloads11 reach14 impact
440 instances - 9 features - 2 classes - 0 missing values
* Title: seismic-bumps Data Set * Abstract: The data describe the problem of high energy (higher than 10^4 J) seismic bumps forecasting in a coal mine. Data come from two longwalls located in a…
152 runs0 likes37 downloads37 reach13 impact
210 instances - 8 features - 3 classes - 0 missing values
* Title: seeds Data Set * Abstract: Measurements of geometrical properties of kernels belonging to three different varieties of wheat. A soft X-ray technique and GRAINS package were used to construct…
190 runs0 likes5 downloads5 reach13 impact
210 instances - 8 features - 3 classes - 0 missing values
* Dataset: Reduced version (10 % of the examples) of bank-marketing dataset.
1254 runs1 likes17 downloads18 reach15 impact
4521 instances - 17 features - 2 classes - 0 missing values
* Abstract: A 3-class version of abalone dataset. * Sources: (a) Original owners of database: Marine Resources Division Marine Research Laboratories - Taroona Department of Primary Industry and…
176 runs0 likes4 downloads4 reach14 impact
4177 instances - 9 features - 3 classes - 0 missing values
Survival treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
12 runs0 likes2 downloads2 reach12 impact
130 instances - 10 features - 0 classes - 97 missing values
Cholesterol treated as the class attribute. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using…
160 runs0 likes4 downloads4 reach9 impact
303 instances - 14 features - 0 classes - 6 missing values
Publication Request: This file describes the contents of the heart-disease directory. This directory contains 4 databases…
10 runs0 likes0 downloads0 reach9 impact
294 instances - 14 features - 0 classes - 782 missing values
Donor: David W. Aha (aha@ics.uci.edu) This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one…
37 runs0 likes5 downloads5 reach9 impact
303 instances - 14 features - 0 classes - 6 missing values
This data set is also obtained from the task of controlling the ailerons of an F16 aircraft, although the target variable and attributes are different from the ailerons domain. The target variable here…
7 runs0 likes3 downloads3 reach10 impact
9517 instances - 7 features - 0 classes - 0 missing values
The problem concerns Relative CPU Performance Data. More information can be obtained in the UCI Machine Learning repository (http://www.ics.uci.edu/~mlearn/MLSummary.html). The attributes used are:…
2 runs0 likes2 downloads2 reach12 impact
209 instances - 7 features - 0 classes - 0 missing values
Automated file upload of BNG(credit-g)
99 runs0 likes4 downloads4 reach12 impact
1000000 instances - 21 features - 2 classes - 0 missing values
The data is related to direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls. Often, more than one contact to the same client was…
65399 runs3 likes38 downloads41 reach31 impact
45211 instances - 17 features - 2 classes - 0 missing values
Airlines Departure Delay Prediction (Regression). Original data can be found at: http://www.transtats.bts.gov This is a processed version of the original data, designed to predict departure delay (in…
0 runs2 likes1 downloads3 reach2 impact
10000000 instances - 10 features - 0 classes - 0 missing values
Airlines Departure Delay Prediction (Regression). Original data can be found at: http://www.transtats.bts.gov This is a processed version of the original data, designed to predict departure delay (in…
0 runs0 likes2 downloads2 reach2 impact
1000000 instances - 10 features - 0 classes - 0 missing values
Donor: Will Taylor (taylor@pluto.arc.nasa.gov) Database of surgeries on horses. Possible class attributes: 24 (whether lesion is surgical), others include: 23, 25, 26, and 27 Notes: * Hospital_Number…
236 runs0 likes9 downloads9 reach9 impact
368 instances - 27 features - 2 classes - 1927 missing values
This dataset classifies people described by a set of attributes as good or bad credit risks. This dataset comes with a cost matrix: ``` Good Bad (predicted) Good 0 1 (actual) Bad 5 0 ``` It is worse…
505962 runs28 likes286 downloads314 reach32 impact
1000 instances - 21 features - 2 classes - 0 missing values
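The cost matrix in the description above can be applied to a set of predictions to score them. The following is a minimal sketch, assuming rows are actual classes and columns are predicted classes as the description labels them; the function name `total_cost` and the lowercase label strings are hypothetical and not part of the dataset.

```python
# Cost matrix from the dataset description, keyed as (actual, predicted).
# Assumption: labels are written here as lowercase "good" / "bad".
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,
    ("bad", "good"): 5,
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted):
    """Sum the misclassification cost over paired actual/predicted labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(COST[(a, p)] for a, p in zip(actual, predicted))


if __name__ == "__main__":
    actual = ["good", "bad", "bad", "good"]
    predicted = ["good", "good", "bad", "bad"]
    # One bad customer predicted good (5) plus one good customer predicted bad (1).
    print(total_cost(actual, predicted))  # 6
```

Under this matrix, predicting "good" for a customer who is actually a bad risk costs five times as much as the opposite mistake, so a classifier with lower accuracy can still have a lower total cost.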
Donor: Will Taylor (taylor@pluto.arc.nasa.gov) In this version (version 2), some features were removed. It is unclear why or how this was done.
1883 runs1 likes10 downloads11 reach9 impact
368 instances - 23 features - 2 classes - 1927 missing values
1. Title of Database: Abalone data 2. Sources: (a) Original owners of database: Marine Resources Division Marine Research Laboratories - Taroona Department of Primary Industry and Fisheries, Tasmania…
34899 runs0 likes18 downloads18 reach9 impact
4177 instances - 9 features - 28 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
739 runs0 likes11 downloads11 reach15 impact
4052 instances - 8 features - 2 classes - 0 missing values
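The mean-based binarization described above, which recurs in many entries on this page, can be sketched as follows. The description is truncated, so which side of the mean becomes which class is not stated here; this sketch assumes values strictly above the mean get a positive label. The function name `binarize_by_mean` and the labels "P" and "N" are illustrative assumptions.

```python
from statistics import mean


def binarize_by_mean(values, above_label="P", below_label="N"):
    """Map a numeric target to two nominal classes using its mean as threshold.

    Assumption: values strictly greater than the mean get `above_label`;
    all other values get `below_label`.
    """
    threshold = mean(values)
    return [above_label if v > threshold else below_label for v in values]


if __name__ == "__main__":
    target = [1, 2, 3, 10]  # the mean is 4
    print(binarize_by_mean(target))  # ['N', 'N', 'N', 'P']
```

Because the mean is sensitive to outliers, a skewed target like the one in the example produces unbalanced classes.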
Thyroid disease records supplied by the Garavan Institute and J. Ross Quinlan, New South Wales Institute, Sydney, Australia. 1987. hypothyroid, primary hypothyroid, compensated…
883 runs0 likes13 downloads13 reach9 impact
3772 instances - 30 features - 4 classes - 6064 missing values
February 23, 1982 The 1982 annual meetings of the American Statistical Association (ASA) will be held August 16-19, 1982 in Cincinnati. At that meeting, the ASA Committee on Statistical Graphics plans…
759 runs1 likes9 downloads10 reach24 impact
209 instances - 9 features - 2 classes - 15 missing values
Contains 110 data sets from the book 'The Statistical Sleuth' by Fred Ramsey and Dan Schafer; Duxbury Press, 1997. (schafer@stat.orst.edu) [14/Oct/97] (172k) Note: description taken from this web…
722 runs0 likes6 downloads6 reach14 impact
60 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
747 runs0 likes14 downloads14 reach15 impact
4177 instances - 9 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
670 runs0 likes4 downloads4 reach14 impact
62 instances - 8 features - 2 classes - 8 missing values
Attribute information: ``` sick, negative. | classes age: continuous. sex: M, F. on thyroxine: f, t. query on thyroxine: f, t. on antithyroid medication: f, t. sick: f, t. pregnant: f, t. thyroid…
19941 runs0 likes31 downloads31 reach9 impact
3772 instances - 30 features - 2 classes - 6064 missing values
1. Title: Protein Localization Sites 2. Creator and Maintainer: Kenta Nakai Institute of Molecular and Cellular Biology Osaka University 1-3 Yamada-oka, Suita 565 Japan nakai@imcb.osaka-u.ac.jp…
1806 runs0 likes14 downloads14 reach12 impact
336 instances - 8 features - 8 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
698 runs0 likes6 downloads6 reach14 impact
97 instances - 11 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
819 runs0 likes10 downloads10 reach15 impact
500 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
554 runs0 likes10 downloads10 reach15 impact
40768 instances - 11 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
791 runs0 likes7 downloads7 reach15 impact
400 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
779 runs0 likes7 downloads7 reach15 impact
400 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
737 runs0 likes10 downloads10 reach15 impact
3772 instances - 30 features - 2 classes - 6064 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
756 runs0 likes6 downloads6 reach15 impact
310 instances - 9 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
758 runs0 likes8 downloads8 reach15 impact
500 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
737 runs0 likes5 downloads5 reach14 impact
47 instances - 8 features - 2 classes - 0 missing values
No data.
697 runs0 likes5 downloads5 reach14 impact
89 instances - 9 features - 2 classes - 0 missing values
SUMMARY: Data from an experiment on the effects of machine adjustments on the time to count bolts. Data appear as the STATS (Issue 10) Challenge. DATA: Submitted by W. Robert Stephenson, Iowa State…
754 runs0 likes9 downloads9 reach14 impact
40 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). The multi-class target feature is converted to a two-class nominal target feature by re-labeling the majority class as positive ('P') and…
815 runs0 likes9 downloads9 reach15 impact
336 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
764 runs0 likes6 downloads6 reach15 impact
400 instances - 8 features - 2 classes - 0 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
755 runs0 likes4 downloads4 reach14 impact
54 instances - 8 features - 2 classes - 120 missing values
Binarized version of the original data set (see version 1). It converts the numeric target feature to a two-class nominal target feature by computing the mean and classifying all instances with a…
788 runs0 likes7 downloads7 reach15 impact
400 instances - 8 features - 2 classes - 0 missing values
Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/
68 runs0 likes12 downloads12 reach25 impact
32561 instances - 16 features - 2 classes - 4262 missing values
The dataset (originally named ELEC2) contains 45,312 instances dated from 7 May 1996 to 5 December 1998. Each example of the dataset refers to a period of 30 minutes, i.e. there are 48 instances for…
106857 runs3 likes43 downloads46 reach12 impact
45312 instances - 9 features - 2 classes - 0 missing values
Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/
180 runs0 likes5 downloads5 reach24 impact
294 instances - 11 features - 2 classes - 0 missing values
"The debutanizer column is part of a desulfuring and naphtha splitter plant." u1 Top temperature u2 Top pressure u3 Reflux flow u4 Flow to next process u5 6th tray temperature u6 Bottom…
0 runs0 likes1 downloads1 reach12 impact
2394 instances - 8 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach9 impact
78732 instances - 11 features - 0 classes - 0 missing values
Normalized form of codrna (351) Andrew V Uzilov, Joshua M Keegan, and David H Mathews. Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC…
309 runs0 likes5 downloads5 reach9 impact
488565 instances - 9 features - 2 classes - 0 missing values
Data from StatLib (ftp stat.cmu.edu/datasets) Data from which conclusions were drawn in the article "Sleep in Mammals: Ecological and Constitutional Correlates" by Allison, T. and Cicchetti, D.…
0 runs0 likes1 downloads1 reach9 impact
62 instances - 8 features - 0 classes - 12 missing values
This is the hip measurement data from Table B.13 in Chatfield's Problem Solving (1995, 2nd edn, Chapman and Hall). It is given in 8 columns. First 4 columns are for Control Group. Last 4 columns are…
0 runs0 likes0 downloads0 reach11 impact
54 instances - 8 features - classes - 120 missing values
The data are a subsample of 500 observations from a data set that originates in a study where air pollution at a road is related to traffic volume and meteorological variables, collected by the…
2 runs0 likes1 downloads1 reach14 impact
500 instances - 8 features - 0 classes - 0 missing values
The data are a subsample of 500 observations from a data set that originates in a study where air pollution at a road is related to traffic volume and meteorological variables, collected by the…
2 runs0 likes1 downloads1 reach14 impact
500 instances - 8 features - 0 classes - 0 missing values
No data.
332 runs0 likes4 downloads4 reach12 impact
1000000 instances - 17 features - 2 classes - 0 missing values
This is an artificial data set with dependencies between the attribute values. The cases are generated using the following method: X1 : uniformly distributed over [-5,5] X2 : uniformly distributed…
3 runs1 likes5 downloads6 reach14 impact
40768 instances - 11 features - 0 classes - 0 missing values
This dataset describes 100,000 realistic, synthetically generated worker compensation insurance claims. Along with the ultimate financial losses, each claim is described by the initial case estimate, date…
0 runs0 likes0 downloads0 reach0 impact
100000 instances - 14 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
1484 instances - 18 features - classes - 0 missing values
Make target (age) numeric. 1. Title of Database: Abalone data 2. Sources: (a) Original owners of database: Marine Resources Division Marine Research Laboratories - Taroona Department of…
0 runs0 likes0 downloads0 reach1 impact
4177 instances - 9 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach13 impact
400 instances - 8 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
22 runs0 likes2 downloads2 reach14 impact
400 instances - 8 features - 0 classes - 0 missing values
BitcoinHeist Ransomware Dataset Akcora, C.G., Li, Y., Gel, Y.R. and Kantarcioglu, M., 2019. BitcoinHeist. Topological Data Analysis for Ransomware Detection on the Bitcoin Blockchain. IJCAI-PRICAI…
0 runs1 likes0 downloads1 reach6 impact
2916697 instances - 10 features - 29 classes - 0 missing values
The dataset freMTPL2freq contains risk features for 677,991 motor third-party liability policies (observed mostly on one year).
0 runs1 likes2 downloads3 reach9 impact
678013 instances - 12 features - classes - 0 missing values
The goal of this challenge is to expose the research community to real world datasets of interest to 4Paradigm. All datasets are formatted in a uniform way, though the type of data might differ. The…
1 runs0 likes0 downloads0 reach15 impact
31406 instances - 23 features - 2 classes - 29756 missing values
uci adult partitioned
0 runs0 likes0 downloads0 reach8 impact
48844 instances - 17 features - classes - 6495 missing values
#modelage
28 runs0 likes0 downloads0 reach8 impact
202 instances - 13 features - 3 classes - 202 missing values
nfl_games
0 runs0 likes0 downloads0 reach8 impact
16274 instances - 12 features - classes - 0 missing values
The goal is to predict the Fare. Variable description: pclass: A proxy for socio-economic status (SES) 1st = Upper 2nd = Middle 3rd = Lower age: Age is fractional if less than 1. If the age is…
0 runs0 likes4 downloads4 reach11 impact
1307 instances - 8 features - 0 classes - 0 missing values
test001
0 runs1 likes0 downloads1 reach11 impact
768 instances - 9 features - classes - 0 missing values
Author: Cigdem Inan Aci, Mehmet Fatih Akay. Data Set Information: All simulations were run in the OPNET Modeler software. Message passing is used as the communication mechanism in…
0 runs0 likes0 downloads0 reach8 impact
640 instances - 10 features - classes - 0 missing values
this is titanic survival prediction
0 runs0 likes3 downloads3 reach7 impact
891 instances - 8 features - 0 classes - 0 missing values
titanic survival prediction
0 runs0 likes3 downloads3 reach7 impact
891 instances - 8 features - 0 classes - 0 missing values
titanic survival prediction
0 runs0 likes1 downloads1 reach8 impact
891 instances - 8 features - 0 classes - 0 missing values
titanic survival prediction
0 runs0 likes0 downloads0 reach7 impact
891 instances - 8 features - classes - 0 missing values
titanic survival prediction
0 runs0 likes0 downloads0 reach7 impact
891 instances - 8 features - classes - 0 missing values
titanic survival prediction
0 runs0 likes0 downloads0 reach7 impact
891 instances - 8 features - classes - 0 missing values
titanic survival prediction
6 runs0 likes0 downloads0 reach8 impact
891 instances - 8 features - classes - 0 missing values
titanic survival prediction
0 runs0 likes2 downloads2 reach7 impact
891 instances - 8 features - 0 classes - 0 missing values
titanic survival prediction
0 runs0 likes1 downloads1 reach7 impact
891 instances - 8 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
0 runs0 likes0 downloads0 reach14 impact
4052 instances - 8 features - 0 classes - 0 missing values
Contains 110 data sets from the book 'The Statistical Sleuth' by Fred Ramsey and Dan Schafer; Duxbury Press, 1997. (schafer@stat.orst.edu) [14/Oct/97] (172k) Note: description taken from this web…
0 runs0 likes0 downloads0 reach13 impact
47 instances - 8 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach13 impact
400 instances - 8 features - 0 classes - 0 missing values
File README ----------- chscase A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach13 impact
400 instances - 8 features - 0 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au7-cpd1-500 * Abstract: AutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity…
7145 runs0 likes7 downloads7 reach35 impact
500 instances - 13 features - 5 classes - 0 missing values
* Dataset Title: AutoUniv Dataset data problem: autoUniv-au7-700 * Abstract: AutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity of…
4537 runs0 likes7 downloads7 reach27 impact
700 instances - 13 features - 3 classes - 0 missing values
* Title: South Africa Heart Disease Dataset * Description A retrospective sample of males in a heart-disease high-risk region of the Western Cape, South Africa. There are roughly two controls per case…
155 runs0 likes14 downloads14 reach14 impact
462 instances - 10 features - 2 classes - 0 missing values
Source: The dataset was created by Angeliki Xifara (angxifara @ gmail.com, Civil/Structural Engineer) and was processed by Athanasios Tsanas (tsanasthanasis @ gmail.com, Oxford Centre for Industrial…
103 runs1 likes5 downloads6 reach13 impact
768 instances - 10 features - 37 classes - 0 missing values
This dataset includes data for the estimation of obesity levels in individuals from the countries of Mexico, Peru and Colombia, based on their eating habits and physical condition. The data contains 17…
0 runs0 likes0 downloads0 reach0 impact
2111 instances - 17 features - classes - 0 missing values
In our research each record (row) is data for a week. Each record also has the percentage of return that stock has in the following week (percent_change_next_weeks_price). Ideally, you want to…
0 runs0 likes0 downloads0 reach0 impact
750 instances - 16 features - classes - 60 missing values