Data
palmerpenguins. Description: The goal of palmerpenguins is to provide a great dataset for data exploration &…
0 runs0 likes0 downloads0 reach6 impact
344 instances - 7 features - 3 classes - 18 missing values
The midwest survey dataset contains individual responses from surveys about regional identification conducted for FiveThirtyEight by SurveyMonkey.
0 runs0 likes0 downloads0 reach6 impact
2778 instances - 28 features - 10 classes - 1744 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes0 downloads0 reach13 impact
13 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes0 downloads0 reach13 impact
13 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes1 downloads1 reach13 impact
5 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes0 downloads0 reach13 impact
10 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes0 downloads0 reach13 impact
6 instances - 1143 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes0 downloads0 reach13 impact
31 instances - 54 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes2 downloads2 reach13 impact
195 instances - 33 features - 0 classes - 0 missing values
This is the hip measurement data from Table B.13 in Chatfield's Problem Solving (1995, 2nd edn, Chapman and Hall). It is given in 8 columns. First 4 columns are for Control Group. Last 4 columns are…
0 runs0 likes0 downloads0 reach11 impact
54 instances - 8 features - classes - 120 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes0 downloads0 reach13 impact
14 instances - 51 features - 0 classes - 0 missing values
Contains 110 data sets from the book 'The Statistical Sleuth' by Fred Ramsey and Dan Schafer; Duxbury Press, 1997. (schafer@stat.orst.edu) [14/Oct/97] (172k) Note: description taken from this web…
0 runs0 likes0 downloads0 reach13 impact
42 instances - 10 features - 0 classes - 0 missing values
The data consist of 2001 observations taken from a balloon about 30 kilometres above the surface of the earth. In the section of the flight shown here the balloon increases in height. As radiation…
0 runs1 likes2 downloads3 reach13 impact
2001 instances - 2 features - 0 classes - 0 missing values
This file contains data from Regression Analysis By Example, 2nd Edition, by Samprit Chatterjee and Bertram Price, John Wiley, 1991. Data sets have names of the form 'rabe.xxx' where xxx is the page…
0 runs0 likes1 downloads1 reach13 impact
120 instances - 3 features - 0 classes - 0 missing values
COIL 1999 Competition Data. Data type: multivariate. Abstract: This data set is from the 1999 Computational Intelligence and Learning (COIL)…
0 runs0 likes0 downloads0 reach13 impact
316 instances - 12 features - 0 classes - 56 missing values
This file contains data from Regression Analysis By Example, 2nd Edition, by Samprit Chatterjee and Bertram Price, John Wiley, 1991. Data sets have names of the form 'rabe.xxx' where xxx is the page…
0 runs0 likes0 downloads0 reach13 impact
50 instances - 6 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes0 downloads0 reach13 impact
16 instances - 24 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs0 likes0 downloads0 reach13 impact
15 instances - 10 features - 0 classes - 0 missing values
Data on the homicide rate in Detroit for the years 1961-1973. This is the data set called DETROIT in the book 'Subset selection in regression' by Alan J. Miller published in the Chapman & Hall series…
0 runs0 likes0 downloads0 reach13 impact
13 instances - 14 features - 0 classes - 0 missing values
This dataset contains 3 more features compared to version 1 of the same dataset. Data from which conclusions were drawn in the article "Sleep in Mammals: Ecological and Constitutional Correlates" by…
0 runs0 likes0 downloads0 reach13 impact
62 instances - 10 features - 0 classes - 38 missing values
This is one of a family of datasets synthetically generated from a realistic simulation of the dynamics of a Unimation Puma 560 robot arm. There are eight datasets in this family. In this repository…
0 runs0 likes6 downloads6 reach13 impact
8192 instances - 33 features - 0 classes - 0 missing values
The data consist of annual observations on the level of strike volume (days lost due to industrial disputes per 1000 wage and salary earners), and their covariates, in 18 OECD countries from 1951-1985. The…
0 runs0 likes2 downloads2 reach13 impact
625 instances - 7 features - 0 classes - 0 missing values
COIL 1999 Competition Data. Data type: multivariate. Abstract: This data set is from the 1999 Computational Intelligence and Learning (COIL)…
0 runs0 likes0 downloads0 reach14 impact
316 instances - 12 features - 0 classes - 56 missing values
chscase: A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S.…
0 runs0 likes0 downloads0 reach13 impact
222 instances - 3 features - 0 classes - 0 missing values
Datasets from the Data and Story Library (DASL), a project illustrating the use of basic statistical methods, converted to ARFF format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs0 likes1 downloads1 reach13 impact
39 instances - 4 features - 0 classes - 0 missing values
This is one of 41 drug design datasets. The datasets with 1143 features are formed using Adriana.Code software (www.molecular-networks.com/software/adrianacode). The molecules and outputs are taken…
0 runs1 likes0 downloads1 reach14 impact
8885 instances - 267 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach9 impact
1000000 instances - 41 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach12 impact
1000000 instances - 16 features - 0 classes - 0 missing values
This collection includes 21 data sets of one-dimensional ultrasound raw RF data (A-Scans) acquired from the calf muscles of 8 healthy volunteers. The subjects were asked to manually annotate the data…
0 runs0 likes1 downloads1 reach8 impact
212872 instances - 4 features - classes - 0 missing values
The data contain information on 9144 samples from 220 spectral bands. The classes represent land-use types: alfalfa, corn, grass, hay, oats, soybeans, trees, and wheat.
0 runs0 likes2 downloads2 reach10 impact
9144 instances - 221 features - 8 classes - 0 missing values
Binarized version of the semeion dataset (see version 1). Only instances with class labels 1 and 2 from the original dataset are considered.
0 runs0 likes0 downloads0 reach9 impact
319 instances - 257 features - 2 classes - 0 missing values
This is a meta-dataset which describes the SVM hyperparameter tuning problem. The target attribute indicates whether tuning is required or default hyperparameter values are enough for each dataset…
0 runs0 likes0 downloads0 reach8 impact
156 instances - 81 features - 2 classes - 0 missing values
This is a meta-dataset which describes the SVM hyperparameter tuning problem. The target attribute indicates whether tuning is required or default hyperparameter values are enough for each dataset…
0 runs0 likes0 downloads0 reach8 impact
156 instances - 91 features - 2 classes - 0 missing values
This is a meta-dataset which describes the SVM hyperparameter tuning problem. The target attribute indicates whether tuning is required or default hyperparameter values are enough for each dataset…
0 runs0 likes0 downloads0 reach8 impact
156 instances - 81 features - 2 classes - 0 missing values
Source: http://www.cs.ubc.ca/labs/beta/Projects/SATzilla/ ; authors: L. Xu, F. Hutter, H. Hoos, K. Leyton-Brown; translator in coseal format: M. Lindauer, with the help of Alexandre Frechette. The data do…
0 runs0 likes1 downloads1 reach9 impact
4440 instances - 117 features - 0 classes - 27150 missing values
uci adult partitioned
0 runs0 likes0 downloads0 reach8 impact
48844 instances - 17 features - classes - 6495 missing values
uci
0 runs0 likes0 downloads0 reach8 impact
30000 instances - 27 features - classes - 0 missing values
uci
0 runs0 likes0 downloads0 reach8 impact
101766 instances - 52 features - classes - 192849 missing values
hmeq_p,BAD,binary
0 runs0 likes0 downloads0 reach8 impact
5960 instances - 15 features - classes - 5271 missing values
kaggle_santander_p
0 runs0 likes0 downloads0 reach8 impact
200000 instances - 203 features - classes - 0 missing values
Synthetic 2-d data with N=5000 vectors and k=15 Gaussian clusters with different degrees of cluster overlap. P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering…
0 runs0 likes0 downloads0 reach8 impact
5000 instances - 3 features - 0 classes - 0 missing values
classification
0 runs0 likes1 downloads1 reach8 impact
150 instances - 5 features - classes - 0 missing values
This dataset contains house sale prices for King County, which includes Seattle. It includes homes sold between May 2014 and May 2015. It contains 19 house features plus the price and the id columns,…
0 runs0 likes2 downloads2 reach9 impact
21613 instances - 20 features - 0 classes - 0 missing values
Public procurement data for the European Economic Area, Switzerland, and Macedonia, 2015.
0 runs0 likes1 downloads1 reach8 impact
iris with ignored features Sepal.Width and Petal.Length
0 runs1 likes1 downloads2 reach8 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach8 impact
150 instances - 5 features - 3 classes - 0 missing values
Israeli lottery
0 runs0 likes0 downloads0 reach8 impact
1153 instances - 11 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach8 impact
150 instances - 5 features - 3 classes - 0 missing values
Wine data gathered by https://www.kaggle.com/zynicide. The data was scraped from WineEnthusiast during the week of June 15th, 2017. The code for the scraper can be found at…
0 runs0 likes0 downloads0 reach8 impact
150930 instances - 10 features - classes - 174477 missing values
Data are collected from the Kickstarter platform. You'll find the most useful data for project analysis. Columns are self-explanatory except: usd_pledged: conversion in US dollars of the pledged column…
0 runs0 likes0 downloads0 reach8 impact
331675 instances - 14 features - classes - 210 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs0 likes0 downloads0 reach8 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
https://www.kaggle.com/harlfoxem/ This dataset contains house sale prices for King County, which includes Seattle. It includes homes sold between May 2014 and May 2015. It contains 19 house features…
0 runs0 likes2 downloads2 reach8 impact
21613 instances - 20 features - classes - 0 missing values
General Description 2015-current: greater than $200.00. The Commission categorizes contributions from individuals using the calendar year-to-date amount for political action committee (PAC) and party…
0 runs0 likes2 downloads2 reach8 impact
3348209 instances - 21 features - 0 classes - 10786577 missing values
Groups information on about 7800 different US colleges, including geographical information, statistics about the attending population, and post-graduation career earnings.
0 runs0 likes0 downloads0 reach8 impact
Estimated article influence scores in 2015
0 runs0 likes0 downloads0 reach8 impact
3615 instances - 7 features - 3169 classes - 48 missing values
Annual salary information including gross pay and overtime pay for all active, permanent employees of Montgomery County, MD paid in calendar year 2016. This information will be published annually each…
0 runs0 likes3 downloads3 reach8 impact
9228 instances - 13 features - 0 classes - 11169 missing values
The Inpatient Utilization and Payment Public Use File (Inpatient PUF) provides information on inpatient discharges for Medicare fee-for-service beneficiaries. The Inpatient PUF includes information on…
0 runs1 likes1 downloads2 reach8 impact
163065 instances - 12 features - 0 classes - 0 missing values
Synthetic 2-d data with N=5000 vectors and k=15 Gaussian clusters with different degrees of cluster overlap. P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering…
0 runs0 likes0 downloads0 reach8 impact
5000 instances - 3 features - 0 classes - 0 missing values
Synthetic 2-d data with N=5000 vectors and k=15 Gaussian clusters with different degrees of cluster overlap. P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering…
0 runs0 likes0 downloads0 reach8 impact
5000 instances - 3 features - 0 classes - 0 missing values
Synthetic 2-d data with N=5000 vectors and k=15 Gaussian clusters with different degrees of cluster overlap. P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering…
0 runs0 likes0 downloads0 reach8 impact
5000 instances - 3 features - 0 classes - 0 missing values
Public procurement data for the European Economic Area, Switzerland, and Macedonia, 2015.
0 runs0 likes0 downloads0 reach8 impact
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs0 likes0 downloads0 reach8 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs0 likes0 downloads0 reach8 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
This dataset consists of beer reviews from Beeradvocate. The data span a period of more than 10 years, including all ~1.5 million reviews up to November 2011. Each review includes ratings in terms of…
0 runs0 likes2 downloads2 reach8 impact
1586614 instances - 13 features - 104 classes - 68148 missing values
Employee remuneration and expenses (earning over 75,000 CAD per year). This data set includes remuneration and expenses for employees earning over 75,000 CAD per year. Attributes: NAME: Name of…
0 runs0 likes0 downloads0 reach8 impact
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach8 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach8 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach8 impact
150 instances - 5 features - 3 classes - 0 missing values
Daily air quality measurements in New York, May to September 1973. This data is taken from R.
0 runs0 likes1 downloads1 reach8 impact
Daily air quality measurements in New York, May to September 1973. This data is taken from R.
0 runs0 likes1 downloads1 reach8 impact
Daily air quality measurements in New York, May to September 1973. This data is taken from R.
0 runs0 likes1 downloads1 reach8 impact
Daily air quality measurements in New York, May to September 1973. This data is taken from R.
0 runs0 likes0 downloads0 reach8 impact
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach8 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach8 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach8 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach8 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach8 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach8 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs0 likes0 downloads0 reach8 impact
150 instances - 5 features - classes - 0 missing values
Phishing website 1
0 runs0 likes0 downloads0 reach2 impact
11055 instances - 31 features - 0 classes - 0 missing values
Email dataset 1a
0 runs0 likes0 downloads0 reach2 impact
4585 instances - 4 features - 0 classes - 0 missing values
Email dataset 1b
0 runs0 likes0 downloads0 reach2 impact
4585 instances - 24 features - 0 classes - 161 missing values
Email dataset 1c
0 runs0 likes0 downloads0 reach2 impact
4585 instances - 792 features - 0 classes - 0 missing values
Email dataset 1d
0 runs0 likes0 downloads0 reach2 impact
4585 instances - 11 features - 0 classes - 0 missing values
Email dataset 1e
0 runs0 likes0 downloads0 reach2 impact
4585 instances - 580 features - 0 classes - 0 missing values
Email dataset 2
0 runs0 likes0 downloads0 reach2 impact
11507 instances - 4 features - 0 classes - 0 missing values
Testing dataset
0 runs0 likes1 downloads1 reach3 impact
134731 instances - 31 features - 2 classes - 0 missing values
test
0 runs0 likes0 downloads0 reach2 impact
150 instances - 5 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach2 impact
336 instances - 8 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach2 impact
2178 instances - 4 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach2 impact
8124 instances - 23 features - classes - 2480 missing values
DESCRIPTIVE ABSTRACT: The data set contains the oral, written and combined test scores for 2003 New Haven Fire Department promotion exams. The Race and Position for each test taker are also given.…
0 runs0 likes0 downloads0 reach2 impact
118 instances - 6 features - 2 classes - 0 missing values
test
0 runs0 likes0 downloads0 reach2 impact
101 instances - 18 features - classes - 0 missing values
URL dataset
0 runs0 likes0 downloads0 reach2 impact
121001 instances - 501 features - 0 classes - 0 missing values
URL dataset 2
0 runs0 likes0 downloads0 reach2 impact
95911 instances - 13 features - 0 classes - 0 missing values
URL dataset 3
0 runs0 likes0 downloads0 reach2 impact
18982 instances - 80 features - 5 classes - 0 missing values
test
0 runs0 likes0 downloads0 reach6 impact
1000 instances - 21 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach6 impact
1000 instances - 21 features - classes - 0 missing values
test
0 runs0 likes0 downloads0 reach6 impact
1000 instances - 21 features - classes - 0 missing values
https://www.kaggle.com/harlfoxem/ This dataset contains house sale prices for King County, which includes Seattle. It includes homes sold between May 2014 and May 2015. It contains 19 house features…
0 runs0 likes0 downloads0 reach6 impact
21613 instances - 21 features - 0 classes - 0 missing values