OpenML
Filter results by:
The Inpatient Utilization and Payment Public Use File (Inpatient PUF) provides information on inpatient discharges for Medicare fee-for-service beneficiaries. The Inpatient PUF includes information on…
0 runs0 likes0 downloads0 reach0 impact
163065 instances - 12 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
8378 instances - 123 features - classes - 18372 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
841 instances - 74 features - classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
841 instances - 74 features - classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
5472 instances - 15 features - classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
252 instances - 14 features - classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
106 instances - 15 features - classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
20640 instances - 8 features - classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
2465 instances - 28 features - classes - 0 missing values
50 Danish words with their pronunciation from Dansk Ordbog
0 runs0 likes0 downloads0 reach0 impact
51 instances - 2 features - classes - 2 missing values
This is a data set of Physicochemical Properties of Protein Tertiary Structure. The data set is taken from CASP 5-9. There are 45730 decoys and size varying from 0 to 21 armstrong. ### Attribute…
0 runs0 likes0 downloads0 reach0 impact
45730 instances - 10 features - classes - 0 missing values
Incident reports from the San Franciso Police Department between January 2003 and May 2018, provided by the City and County of San Francisco. The dataset was downloaded on 05.11.2018. from…
0 runs1 likes0 downloads1 reach0 impact
2215023 instances - 9 features - 2 classes - 0 missing values
Payments given by healthcare manufacturing companies to medical doctors or hospitals
0 runs0 likes0 downloads0 reach0 impact
73558 instances - 6 features - 2 classes - 83182 missing values
Data reported to the police about the circumstances of personal injury road accidents in Great Britain from 1979, and the maker and model information of vehicles involved in the respective accident
0 runs0 likes2 downloads2 reach0 impact
Arbres urbains
0 runs0 likes0 downloads0 reach0 impact
421 instances - 3 features - 1 classes - 0 missing values
Arbres urbains
0 runs0 likes0 downloads0 reach0 impact
2 instances - 57 features - 1 classes - 22 missing values
Arbres urbains
0 runs0 likes0 downloads0 reach0 impact
1 instances - 57 features - 1 classes - 11 missing values
Arbres urbains
0 runs0 likes0 downloads0 reach0 impact
709 instances - 57 features - 6 classes - 8199 missing values
Arbres urbains
0 runs0 likes0 downloads0 reach0 impact
699 instances - 57 features - 5 classes - 7889 missing values
We aggregated screen movements into screen-fixations using a Salvucci & Goldberg (2000) dispersion-threshold algorithm, and defined Perception Action Cycles (PACs) as fixations with at least one…
0 runs0 likes0 downloads0 reach0 impact
3395 instances - 20 features - classes - 168 missing values
arbres-urbains
0 runs0 likes0 downloads0 reach0 impact
699 instances - 57 features - 5 classes - 7889 missing values
arbres-urbains
0 runs0 likes0 downloads0 reach0 impact
699 instances - 57 features - 5 classes - 7889 missing values
The goal of the research is to help the auditors by building a classification model that can predict the fraudulent firm on the basis the present and historical risk factors. The information about the…
0 runs0 likes0 downloads0 reach0 impact
1552 instances - 37 features - 0 classes - 19402 missing values
In the dataset there are 5 types of dataset.QCM3, QCM6, QCM7, QCM10, QCM12In each of dataset, there is alcohol classification of five types,1-octanol, 1-propanol, 2-butanol, 2-propanol, 1-isobutanolIn…
0 runs0 likes1 downloads1 reach0 impact
125 instances - 15 features - classes - 0 missing values
Sanitized and anonymized Cargo 2000 (C2K) airfreight tracking and tracing events, covering five months of business execution (3,942 process instances, 7,932 transport legs, 56,082 activities). ###…
0 runs0 likes0 downloads0 reach0 impact
3943 instances - 98 features - classes - 210284 missing values
A Vicon motion capture camera system was used to record 12 users performing 5 hand postures with markers attached to a left-handed glove. A rigid pattern of markers on the back of the glove was used…
0 runs0 likes0 downloads0 reach0 impact
78096 instances - 38 features - classes - 974700 missing values
The dataset was collected at 'Hospital Universitario de Caracas' in Caracas, Venezuela. The dataset comprises demographic information, habits, and historic medical records of 858 patients. Several…
0 runs0 likes0 downloads0 reach0 impact
858 instances - 36 features - classes - 3622 missing values
The dataset contains 19 attributes regarding ca cervix behavior risk with class label is ca_cervix with 1 and 0 as values which means the respondent with and without ca cervix, respectively. ###…
0 runs0 likes0 downloads0 reach0 impact
858 instances - 36 features - classes - 3622 missing values
This data set measures the running time of a matrix-matrix product A x B = C, where all matrices have size 2048 x 2048, using a parameterizable SGEMM GPU kernel with 241600 possible parameter…
0 runs0 likes0 downloads0 reach0 impact
241600 instances - 18 features - classes - 0 missing values
This dataset can be used to predict the chronic kidney disease and it can be collected from the hospital nearly 2 months of period. ### Attribute information We use 24 + class = 25 ( 11 numeric ,14…
0 runs0 likes0 downloads0 reach0 impact
400 instances - 26 features - classes - 1009 missing values
The energy dispersive X-ray fluorescence (EDXRF) was used to determine the chemical composition of celadon body and glaze in Longquan kiln (at Dayao County) and Jingdezhen kiln. Forty typical shards…
0 runs0 likes0 downloads0 reach0 impact
88 instances - 19 features - classes - 0 missing values
This dataset include data for the estimation of obesity levels in individuals from the countries of Mexico, Peru and Colombia, based on their eating habits and physical condition. The data contains 17…
0 runs0 likes0 downloads0 reach0 impact
2111 instances - 17 features - classes - 0 missing values
The data was collected from car parks in Birmingham that are operated by NCP from Birmingham City Council. It contains the occupancy rates (8:00 to 16:30) from 2016/10/04 to 2016/12/19. ### Attribute…
0 runs0 likes0 downloads0 reach0 impact
35717 instances - 4 features - classes - 0 missing values
In our research each record (row) is data for a week. Each record also has the percentage of return that stock has in the following week (percent_change_next_weeks_price). Ideally, you want to…
0 runs0 likes0 downloads0 reach0 impact
750 instances - 16 features - classes - 60 missing values
This data set was collected from the internet traffic records on a university's firewall. There are 12 features in total. Action feature is used as a class. There are 4 classes in total. These are…
0 runs0 likes0 downloads0 reach0 impact
65532 instances - 12 features - classes - 0 missing values
Author: Francesca Grisoni, Claudia S. Neuhaus, Miyabi Hishinuma, Gisela Gabernet, Jan A. Hiss, - Masaaki Kotera, Gisbert Schneider Source:…
0 runs0 likes1 downloads1 reach0 impact
949 instances - 3 features - classes - 0 missing values
This is a part of collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a minimum of 76 rows.…
0 runs0 likes0 downloads0 reach0 impact
125 instances - 42 features - classes - 362 missing values
This is a part of collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a minimum of 76 rows.…
0 runs0 likes0 downloads0 reach0 impact
114 instances - 42 features - classes - 562 missing values
This is a part of collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a minimum of 76 rows.…
0 runs0 likes0 downloads0 reach0 impact
122 instances - 42 features - classes - 906 missing values
This is a part of collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a minimum of 76 rows.…
0 runs0 likes0 downloads0 reach0 impact
76 instances - 42 features - classes - 574 missing values
This is a part of collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a minimum of 76 rows.…
0 runs0 likes0 downloads0 reach0 impact
127 instances - 42 features - classes - 722 missing values
This is a part of collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a minimum of 76 rows.…
0 runs0 likes0 downloads0 reach0 impact
126 instances - 42 features - classes - 978 missing values
The Garment Industry is one of the key examples of the industrial globalization of this modern era. It is a highly labour-intensive industry with lots of manual processes. Satisfying the huge global…
0 runs0 likes0 downloads0 reach0 impact
1197 instances - 15 features - classes - 506 missing values
The classification task of this database is to determine where patients in a postoperative recovery area should be sent to next. Because hypothermia is a significant concern after surgery (Woolery, L.…
0 runs0 likes0 downloads0 reach0 impact
65532 instances - 12 features - classes - 0 missing values
This dataset attributes first names to genders, giving counts and probabilities. It combines open-source government data from the US, UK, Canada, and Australia. This dataset combines raw counts for…
0 runs0 likes0 downloads0 reach0 impact
147269 instances - 4 features - classes - 0 missing values
This is a part of collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a minimum of 76 rows.…
0 runs0 likes0 downloads0 reach0 impact
126 instances - 42 features - classes - 446 missing values
This is a part of collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a minimum of 76 rows.…
0 runs0 likes0 downloads0 reach0 impact
127 instances - 42 features - classes - 788 missing values
Product listing data submitted to the U.S. FDA for all unfinished, unapproved drugs.
0 runs0 likes1 downloads1 reach0 impact
120215 instances - 20 features - 7 classes - 443305 missing values
Data reported to the police about the circumstances of personal injury road accidents in Great Britain from 1979, and the maker and model information of vehicles involved in the respective accident.…
0 runs0 likes1 downloads1 reach0 impact
363243 instances - 67 features - 3 classes - 2181757 missing values
artificial no anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 2 features - classes - 0 missing values
leak detection file
0 runs0 likes0 downloads0 reach0 impact
23 instances - 4 features - classes - 0 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 3 features - 0 classes - 0 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 3 features - classes - 0 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 2 features - classes - 0 missing values
artificial no anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 2 features - 0 classes - 0 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 3 features - 2 classes - 0 missing values
This version has feature names based on https://www2.1010data.com/documentationcenter/beta/Tutorials/MachineLearningExamples/CensusIncomeDataSet.html Missing data is also properly encoded in this…
0 runs0 likes1 downloads1 reach0 impact
199523 instances - 42 features - 2 classes - 415717 missing values
Online advertisement clicking rates, where the metrics are cost-per-click (CPC) and cost per thousand impressions (CPM).
0 runs0 likes0 downloads0 reach0 impact
1643 instances - 3 features - classes - 0 missing values
Online advertisement clicking rates, where the metrics are cost-per-click (CPC) and cost per thousand impressions (CPM).
0 runs0 likes0 downloads0 reach0 impact
1624 instances - 3 features - classes - 0 missing values
Online advertisement clicking rates, where the metrics are cost-per-click (CPC) and cost per thousand impressions (CPM).
0 runs0 likes0 downloads0 reach0 impact
1538 instances - 3 features - classes - 0 missing values
Online advertisement clicking rates, where the metrics are cost-per-click (CPC) and cost per thousand impressions (CPM).
0 runs0 likes0 downloads0 reach0 impact
1643 instances - 2 features - classes - 0 missing values
This is an experimental data set for trying to classify numbers in a lottery as "Highly likely to be picked" or "Not very likely to be picked". It is based on a little more than a…
0 runs0 likes0 downloads0 reach0 impact
12528 instances - 36 features - classes - 0 missing values
ARFF Training Data
0 runs0 likes0 downloads0 reach0 impact
177640 instances - 40 features - classes - 0 missing values
This is the full version of the KDD Cup 2009 dataset Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the opportunity to work on large…
0 runs0 likes0 downloads0 reach0 impact
50000 instances - 14892 features - 2 classes - 19658569 missing values
source: http://plato.asu.edu/ftp/solvable.html authors: Rolf-David Bergdoll PAR10 performances of modern solvers on the solvable instances of MIPLIB2010. http://miplib.zib.de/ The algorithm runtime…
0 runs0 likes0 downloads0 reach0 impact
1090 instances - 145 features - 0 classes - 0 missing values
No data.
0 runs0 likes0 downloads0 reach0 impact
1090 instances - 147 features - 0 classes - 0 missing values
This collection includes data sets of one-dimensional ultrasound raw RF data (A-Scans) acquired from the biceps brachii muscles of a single healthy volunteer. The annotation was performed by labeling…
0 runs0 likes0 downloads0 reach0 impact
318 instances - 8 features - classes - 0 missing values
This collection includes data sets of one-dimensional ultrasound raw RF data (A-Scans) acquired from the biceps brachii muscles of 21 healthy volunteers. The annotation was performed by labeling the…
0 runs0 likes0 downloads0 reach0 impact
347 instances - 8 features - classes - 0 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 3 features - classes - 0 missing values
This dataset combines records from the MLCQ dataset with metrics extracted using the PMD Tool and the Understand tool, to determine whether a file contains code smells. Please note that the records…
0 runs0 likes0 downloads0 reach0 impact
86467 instances - 67 features - 0 classes - 2852906 missing values
This dataset combines records from the MLCQ dataset with metrics extracted using the PMD Tool and the Understand tool, to determine whether a file contains code smells. Please note that the records…
0 runs0 likes0 downloads0 reach0 impact
83943 instances - 67 features - 0 classes - 2801627 missing values
The dataset consists of 384 features extracted from CT images. The class variable is numeric and denotes the relative location of the CT slice on the axial axis of the human body. The data was…
0 runs0 likes1 downloads1 reach0 impact
53500 instances - 386 features - classes - 0 missing values
Modified version for the automl benchmark. Regroups information for about 7800 different US colleges. Including geographical information, stats about the population attending and post graduation…
0 runs0 likes0 downloads0 reach0 impact
7063 instances - 45 features - 0 classes - 104249 missing values
kaggle 30day ml
0 runs0 likes0 downloads0 reach0 impact
300000 instances - 25 features - 0 classes - 0 missing values
asdfasd
0 runs0 likes0 downloads0 reach0 impact
761 instances - 42 features - classes - 630 missing values
artificial with anomaly
0 runs0 likes0 downloads0 reach0 impact
4032 instances - 3 features - classes - 0 missing values
This data is derived from the 2012 KDD Cup. The data is subsampled to 0.1% of the original number of instances, downsampling the majority class (click=0) so that the target feature is reasonably…
0 runs0 likes1 downloads1 reach0 impact
39948 instances - 10 features - 2 classes - 0 missing values
It is a public set of comments collected for spam research. It has five datasets composed by 1,956 real messages extracted from five videos that were among the 10 most viewed on the collection period.…
0 runs0 likes0 downloads0 reach0 impact
350 instances - 5 features - classes - 0 missing values
The researchers of OCLAR Marwan et al. (2019), they gathered Arabic costumer reviews from and Zomato website on wide scope of domain, including restaurants, hotels, hospitals, local shops, etc. The…
0 runs0 likes0 downloads0 reach0 impact
3916 instances - 3 features - classes - 0 missing values
It is a public set of comments collected for spam research. It has five datasets composed by 1,956 real messages extracted from five videos that were among the 10 most viewed on the collection period.…
0 runs0 likes0 downloads0 reach0 impact
350 instances - 5 features - classes - 0 missing values
It is a public set of comments collected for spam research. It has five datasets composed by 1,956 real messages extracted from five videos that were among the 10 most viewed on the collection period.…
0 runs0 likes0 downloads0 reach0 impact
370 instances - 5 features - classes - 0 missing values
It is a public set of comments collected for spam research. It has five datasets composed by 1,956 real messages extracted from five videos that were among the 10 most viewed on the collection period.…
0 runs0 likes0 downloads0 reach0 impact
438 instances - 5 features - classes - 0 missing values
Laboratory dataset
0 runs0 likes0 downloads0 reach0 impact
1750 instances - 7 features - classes - 0 missing values
Laboratorio_dataset_car
0 runs0 likes0 downloads0 reach0 impact
1750 instances - 7 features - classes - 0 missing values
It is a public set of comments collected for spam research. It has five datasets composed by 1,956 real messages extracted from five videos that were among the 10 most viewed on the collection period.…
0 runs0 likes0 downloads0 reach0 impact
448 instances - 5 features - classes - 245 missing values
The dataset contains information on 13,932 single-family homes sold in Miami in 2016. Besides publicly available information, the dataset creator Steven C. Bourassa has added distance variables,…
0 runs0 likes0 downloads0 reach0 impact
13932 instances - 17 features - 0 classes - 0 missing values
This is a dataset about the actors and actresses that have voice characters in Disney movies
0 runs0 likes0 downloads0 reach0 impact
1000 instances - 8 features - 2 classes - 0 missing values
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs0 likes0 downloads0 reach0 impact
14 instances - 5 features - 2 classes - 0 missing values
This is a dataset about the academic performance of students in different subjects
0 runs0 likes1 downloads1 reach0 impact
1000 instances - 8 features - 2 classes - 0 missing values
bases-de-donnees-annuelles-des-accidents-corporels-de-la-circulation-routiere-annees-de-2005-a-2019
0 runs0 likes0 downloads0 reach0 impact
132977 instances - 55 features - 0 classes - 550521 missing values
I am testing to upload data
0 runs0 likes0 downloads0 reach0 impact
601 instances - 7 features - classes - 0 missing values
There are 10 predictors, all quantitative, and a binary dependent variable, indicating the presence or absence of breast cancer. The predictors are anthropometric data and parameters which can be…
0 runs0 likes1 downloads1 reach0 impact
116 instances - 10 features - 0 classes - 0 missing values
We use the following representation to collect the dataset age - age bp - blood pressure sg - specific gravity al - albumin su - sugar rbc - red blood cells pc - pus cell pcc - pus cell clumps ba -…
0 runs0 likes1 downloads1 reach0 impact
250 instances - 28 features - classes - 0 missing values
Autistic Spectrum Disorder (ASD) is a neurodevelopment condition associated with significant healthcare costs, and early diagnosis can significantly reduce these. Unfortunately, waiting times for an…
0 runs1 likes0 downloads1 reach0 impact
704 instances - 21 features - classes - 192 missing values
This dataset contains the medical records of 299 heart failure patients collected at the Faisalabad Institute of Cardiology and at the Allied Hospital in Faisalabad (Punjab, Pakistan), between…
0 runs0 likes1 downloads1 reach0 impact
299 instances - 13 features - classes - 0 missing values
whitewine
0 runs0 likes0 downloads0 reach0 impact
78 instances - 4 features - classes - 0 missing values
red wine dataset
0 runs0 likes0 downloads0 reach0 impact
0 instances - 66 features - classes - 0 missing values
Detailed movie descriptions - ideal for Recommendation Engines
0 runs0 likes0 downloads0 reach0 impact
4803 instances - 11 features - classes - 514 missing values
redwine dataset
0 runs0 likes0 downloads0 reach0 impact
571 instances - 4 features - classes - 0 missing values
redwine data
0 runs0 likes0 downloads0 reach0 impact
571 instances - 4 features - classes - 0 missing values