Data
The weather problem is a tiny dataset that we will use repeatedly to illustrate machine learning methods. Entirely fictitious, it supposedly concerns the conditions that are suitable for playing some…
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
14 instances - 5 features - 2 classes - 0 missing values
Test dataset
0 runs, 0 likes, 0 downloads, 0 reach, 3 impact
15547 instances - 61 features - 0 classes - 280 missing values
Test dataset
3 runs, 0 likes, 0 downloads, 0 reach, 7 impact
15547 instances - 61 features - 2 classes - 280 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
150 instances - 5 features - 3 classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
150 instances - 5 features - classes - 0 missing values
iris with ignored features Sepal.Width and Petal.Length
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
150 instances - 5 features - 3 classes - 0 missing values
dataset for feature extraction
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
69 instances - 37 features - classes - 0 missing values
analysis of stocks
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
245 instances - 15 features - classes - 0 missing values
This dataset is an artificial simulation of the Duffing system with random changes from the chaotic to the non-chaotic regime at different noise levels.
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
2493200 instances - 26 features - classes - 0 missing values
This dataset is an artificial simulation of the Duffing system with one phase transition to the chaotic regime.
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
9983 instances - 4 features - classes - 0 missing values
Hourly particulate matter air pollution data of Great Britain for the year 2017, provided by Ricardo Energy and Environment on behalf of the UK Department for Environment, Food and Rural Affairs…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
394299 instances - 10 features - 0 classes - 0 missing values
Trip Record Data provided by the New York City Taxi and Limousine Commission (TLC) [http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml]. The dataset includes TLC trips of the green line in…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
581835 instances - 15 features - 0 classes - 0 missing values
Embedding of atoms for HIV inhibitors dataset
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1069964 instances - 30 features - classes - 0 missing values
#modelage
31 runs, 0 likes, 0 downloads, 0 reach, 1 impact
202 instances - 20 features - 2 classes - 17 missing values
Source: C. Okan Sakar a, Gorkem Serbes b, Aysegul Gunduz c, Hunkar C. Tunc a, Hatice Nizam d, Betul Erdogdu Sakar e, Melih Tutuncu c, Tarkan Aydin a, M. Erdem Isenkul d, Hulya Apaydin c a Department…
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
756 instances - 754 features - 0 classes - 0 missing values
1. Title: Echocardiogram Data 2. Source Information: -- Donor: Steven Salzberg (salzberg@cs.jhu.edu) -- Collector: -- Dr. Evlin Kinney -- The Reed Institute -- P.O. Box 402603 -- Miami, FL 33140-0603…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
132 instances - 8 features - 4 classes - 103 missing values
This dataset reflects incidents of crime in the City of Los Angeles dating back to 2010. This data is transcribed from original crime reports that are typed on paper and therefore there may be some…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
1468825 instances - 26 features - 0 classes - 7881776 missing values
#modelage
28 runs, 0 likes, 0 downloads, 0 reach, 1 impact
202 instances - 13 features - 3 classes - 202 missing values
#modelage
87 runs, 0 likes, 0 downloads, 0 reach, 1 impact
224 instances - 20 features - 6 classes - 205 missing values
nominal features and target for COMPAS
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
5278 instances - 14 features - 2 classes - 0 missing values
The dataset contains the Premier League matches for the 2014-2015 season.
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
380 instances - 38 features - classes - 9 missing values
The dataset contains the Serie A matches for the 2015-2016 season.
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
379 instances - 38 features - classes - 44 missing values
This dataset contains all Premier League matches from 2008 to 2016, with player statistics taken from Sofifa.
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
2961 instances - 17 features - classes - 0 missing values
This dataset contains, for each Premier League match of the 2014-2015 season, the probabilities generated with the L2F models, as well as the match odds.
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
323 instances - 11 features - classes - 0 missing values
This dataset contains all the player names and player ids, taken from Sofifa
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
11009 instances - 3 features - classes - 0 missing values
This dataset contains a simulation of the Lorenz attractor with the parameter $\rho$ varying in time. The stable and chaotic regimes alternate.
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
4942 instances - 4 features - classes - 0 missing values
Dataset sales
0 runs, 0 likes, 0 downloads, 0 reach, 3 impact
10738 instances - 15 features - 0 classes - 0 missing values
Test file for ML training
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
1599 instances - 12 features - classes - 0 missing values
Premier league matches from 2008 to 2014 with TDA features extracted.
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
2565 instances - 20 features - classes - 0 missing values
Embedding of molecules bonds in HIV inhibitors dataset
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
1151940 instances - 30 features - classes - 0 missing values
Fixed dataset for autoHorse.csv I suggest...
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
201 instances - 69 features - 186 classes - 0 missing values
price col is int now. autoHorse dataset
11 runs, 0 likes, 0 downloads, 0 reach, 1 impact
201 instances - 69 features - 0 classes - 0 missing values
Groups information on about 7800 different US colleges, including geographical information, statistics about the attending population, and post-graduation career earnings.
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
7063 instances - 50 features - 0 classes - 125494 missing values
testing
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
366 instances - 3 features - classes - 0 missing values
test001
0 runs, 1 like, 0 downloads, 1 reach, 1 impact
768 instances - 9 features - classes - 0 missing values
This data represents crime reported to the Seattle Police Department (SPD). Each row contains the record of a unique event where at least one criminal offense was reported by a member of the community…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
523590 instances - 8 features - 144 classes - 6916 missing values
test
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
150 instances - 5 features - classes - 0 missing values
test
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
150 instances - 5 features - classes - 0 missing values
Binarized version of the USPS dataset (see version 2). Only instances with class labels 6 and 9 from the original dataset are considered and encoded as 0 (original class 6) and 1 (original class 9).
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
1424 instances - 257 features - 2 classes - 0 missing values
Binarized version of the isolet dataset (see version 1). Only instances with class labels 1 and 2 from the original dataset are considered.
0 runs, 0 likes, 0 downloads, 0 reach, 3 impact
600 instances - 618 features - 2 classes - 0 missing values
Binarized version of the cnae-9 dataset (see version 1). Only instances with class labels 1 and 2 from the original dataset are considered.
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
240 instances - 857 features - 2 classes - 0 missing values
testtest
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
1994 instances - 127 features - 0 classes - 0 missing values
Binarized version of the semeion dataset (see version 1). Only instances with class labels 1 and 2 from the original dataset are considered.
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
319 instances - 257 features - 2 classes - 0 missing values
This is a meta-dataset which describes the SVM hyperparameter tuning problem. The target attribute indicates whether tuning is required or default hyperparameter values are enough for each dataset…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
156 instances - 81 features - 2 classes - 0 missing values
This is a meta-dataset which describes the SVM hyperparameter tuning problem. The target attribute indicates whether tuning is required or default hyperparameter values are enough for each dataset…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
156 instances - 91 features - 2 classes - 0 missing values
This is a meta-dataset which describes the SVM hyperparameter tuning problem. The target attribute indicates whether tuning is required or default hyperparameter values are enough for each dataset…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
156 instances - 81 features - 2 classes - 0 missing values
source: An Algorithm Selection Benchmark for the Container Pre-Marshalling Problem (CPMP) authors: K. Tierney and Y. Malitsky (features) / K. Tierney and D. Pacino and S. Voss (algorithms) translator…
14 runs, 0 likes, 0 downloads, 0 reach, 1 impact
527 instances - 23 features - 4 classes - 0 missing values
exercises
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
15000 instances - 8 features - classes - 0 missing values
source: http://plato.asu.edu/ftp/solvable.html authors: Rolf-David Bergdoll PAR10 performances of modern solvers on the solvable instances of MIPLIB2010. http://miplib.zib.de/ The algorithm runtime…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
218 instances - 144 features - 5 classes - 0 missing values
Data Description: This is the historical price data of the FOREX USD/DKK from Dukascopy. One instance (row) is one candlestick of one minute. The whole dataset has the data range from 1-1-2018 to…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
375840 instances - 12 features - 2 classes - 0 missing values
Data Description: This is the historical price data of the FOREX EUR/CAD from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
43825 instances - 12 features - 2 classes - 0 missing values
Data Description: This is the historical price data of the FOREX EUR/SGD from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
43825 instances - 12 features - 2 classes - 0 missing values
Data Description: This is the historical price data of the FOREX EUR/SEK from Dukascopy. One instance (row) is one candlestick of one hour. The whole dataset has the data range from 1-1-2018 to…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
43825 instances - 12 features - 2 classes - 0 missing values
Data Description: This is the historical price data of the FOREX USD/DKK from Dukascopy. One instance (row) is one candlestick of one day. The whole dataset has the data range from 1-1-2018 to…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
1832 instances - 12 features - 2 classes - 0 missing values
Data set of around 45 languages and 25 categories. Consists of articles.
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
65428 instances - 3 features - classes - 0 missing values
exercises
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
15000 instances - 8 features - classes - 0 missing values
The ILPD dataset from the OpenCC18 with all categorical variables label encoded
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
583 instances - 11 features - 0 classes - 0 missing values
The sick dataset from the OpenCC18 with all categorical data label encoded so all data is numeric
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
3772 instances - 30 features - classes - 0 missing values
The ILPD liver dataset from the OpenCC18 with the gender binary encoded so all features are numeric
1 run, 0 likes, 0 downloads, 0 reach, 2 impact
583 instances - 11 features - 2 classes - 0 missing values
Sick dataset from the opencc18 with all textual binary variables label encoded.
1 run, 0 likes, 0 downloads, 0 reach, 2 impact
3772 instances - 30 features - 2 classes - 0 missing values
test openml upload
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
150 instances - 5 features - 3 classes - 0 missing values
test
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
150 instances - 5 features - classes - 0 missing values
test
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
150 instances - 5 features - classes - 0 missing values
2
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
375840 instances - 12 features - classes - 0 missing values
Branin test
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
225 instances - 3 features - classes - 0 missing values
Juan J. Rodriguez, Ludmila I. Kuncheva, Carlos J. Alonso (2006). Rotation Forest: A new classifier ensemble method. IEEE Transactions on Pattern Analysis and Machine Intelligence. 28(10):1619-1630.…
0 runs, 0 likes, 0 downloads, 0 reach, 2 impact
1000000 instances - 12 features - 0 classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository. Preprocessing: scaled to [-1,1]
0 runs, 0 likes, 0 downloads, 0 reach, 6 impact
3175 instances - 61 features - 0 classes - 0 missing values
iris-example
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
150 instances - 5 features - 3 classes - 0 missing values
Test
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
6330 instances - 8 features - classes - 0 missing values
good
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
10 instances - 4 features - classes - 2 missing values
No data.
0 runs, 0 likes, 0 downloads, 0 reach, 4 impact
24 instances - 5 features - classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository.
0 runs, 0 likes, 0 downloads, 0 reach, 6 impact
49749 instances - 301 features - 0 classes - 0 missing values
Data has been taken from various sources, such as data gov and various other websites, and has been preprocessed for analysis purposes.
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
204 instances - 5 features - classes - 0 missing values
libSVM, AAD group. Dataset from the LIBSVM data repository.
0 runs, 0 likes, 0 downloads, 0 reach, 6 impact
49749 instances - 301 features - 0 classes - 0 missing values
Hand-drawn digits with labels that are 1 or 0
0 runs, 1 like, 0 downloads, 1 reach, 0 impact
libSVM, AAD group. A practical guide to support vector classification. Technical report, Department of Computer Science, National Taiwan University, 2003. Dataset from the LIBSVM data repository…
0 runs, 0 likes, 0 downloads, 0 reach, 6 impact
7089 instances - 5 features - 0 classes - 0 missing values
#sbox
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
10000 instances - 32 features - classes - 0 missing values
TEST STUDENTS DATA
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
100 instances - 11 features - classes - 0 missing values
Incident reports from the San Francisco Police Department between January 2003 and May 2018, provided by the City and County of San Francisco. The dataset was downloaded on 05.11.2018. from…
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
538638 instances - 7 features - 2 classes - 0 missing values
Dataset KDD98 challenge: https://kdd.ics.uci.edu/databases/kddcup98/kddcup98.html The goal is to estimate the return from a direct mailing in order to maximize donation profits. This dataset…
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
82318 instances - 478 features - 2 classes - 2399311 missing values
tmm hjghjg vjgkjhbb nvhjgb
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
748 instances - 5 features - classes - 0 missing values
This data addresses student achievement in secondary education at two Portuguese schools. The data attributes include student grades and demographic, social and school-related features, and it was…
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
649 instances - 33 features - 0 classes - 0 missing values
Conventional and Social Media Movies (CSM) - Dataset 2014 and 2015 Data Set 12 features categorized as conventional and social media features. Both conventional features, collected from movies…
0 runs, 0 likes, 0 downloads, 0 reach, 0 impact
232 instances - 14 features - classes - 60 missing values
Data-Description: COIL 1999 Competition Data. Data Type: multivariate. Abstract: This data set is from the 1999 Computational Intelligence and Learning (COIL)…
0 runs, 0 likes, 0 downloads, 0 reach, 7 impact
316 instances - 12 features - 0 classes - 56 missing values
No data.
2 runs, 0 likes, 0 downloads, 0 reach, 6 impact
506 instances - 21 features - 0 classes - 0 missing values
DATA FILE: Data on patient deaths within 30 days of surgery in 131 U.S. hospitals. See Christiansen and Morris, Bayesian Biostatistics, D. Berry and D. Stangl, editors, 1996, Marcel Dekker, Inc. Data…
0 runs, 0 likes, 0 downloads, 0 reach, 6 impact
131 instances - 4 features - 0 classes - 0 missing values
One of the data sets used in the book "Analyzing Categorical Data" by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. Further details concerning the book, including information on statistical…
2 runs, 0 likes, 0 downloads, 0 reach, 6 impact
108 instances - 5 features - 0 classes - 0 missing values
Data on the homicide rate in Detroit for the years 1961-1973. This is the data set called DETROIT in the book 'Subset selection in regression' by Alan J. Miller published in the Chapman & Hall series…
0 runs, 0 likes, 0 downloads, 0 reach, 6 impact
13 instances - 14 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs, 0 likes, 0 downloads, 0 reach, 6 impact
475 instances - 4 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs, 0 likes, 0 downloads, 0 reach, 6 impact
475 instances - 4 features - 0 classes - 0 missing values
analcatdata A collection of data sets used in the book "Analyzing Categorical Data," by Jeffrey S. Simonoff, Springer-Verlag, New York, 2003. The submission consists of a zip file containing two…
2 runs, 0 likes, 0 downloads, 0 reach, 6 impact
450 instances - 4 features - 0 classes - 0 missing values
No data.
0 runs, 0 likes, 0 downloads, 0 reach, 9 impact
209 instances - 8 features - classes - 0 missing values
Datasets of the Data And Story Library, a project illustrating the use of basic statistical methods, converted to arff format by Hakan Kjellerstrand. Source: TunedIT: http://tunedit.org/repo/DASL DASL file…
0 runs, 0 likes, 0 downloads, 0 reach, 6 impact
40 instances - 7 features - 0 classes - 3 missing values
wine-quality-red-pmlb
31 runs, 1 like, 0 downloads, 1 reach, 15 impact
1599 instances - 12 features - 6 classes - 0 missing values
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Concrete Slump dataset (Yeh 2007) concerns the prediction of three properties of concrete…
0 runs, 1 like, 0 downloads, 1 reach, 3 impact
103 instances - 10 features - classes - 0 missing values
No data.
0 runs, 0 likes, 0 downloads, 0 reach, 9 impact
1000 instances - 25 features - 0 classes - 0 missing values
Daily electric energy dataset. The dee problem involves predicting the daily average price of TkWhe electricity energy in Spain. The data set contains real values from 2003 about the daily consumption…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
365 instances - 7 features - 0 classes - 0 missing values
Electrical Length data set. This problem with only two input variables involves a small search space (small complexity). However, it is still an interesting problem since the system is strongly…
0 runs, 0 likes, 0 downloads, 0 reach, 1 impact
495 instances - 3 features - 0 classes - 0 missing values
Data-Description: COIL 1999 Competition Data. Data Type: multivariate. Abstract: This data set is from the 1999 Computational Intelligence and Learning (COIL)…
12 runs, 0 likes, 0 downloads, 0 reach, 6 impact
316 instances - 12 features - 0 classes - 56 missing values