SpeedDating

Status: active | Format: ARFF | Visibility: public | Uploaded 16-11-2016 by Joaquin Vanschoren
Tags: OpenML100, study_123, study_135, study_14, study_144


Author: Ray Fisman and Sheena Iyengar

Source: [Columbia Business School](http://www.stat.columbia.edu/~gelman/arm/examples/speed.dating/) - 2004

Please cite: None

This data was gathered from participants in experimental speed dating events from 2002 to 2004. During the events, each attendee had a four-minute "first date" with every other participant of the opposite sex. At the end of the four minutes, participants were asked whether they would like to see their date again. They were also asked to rate their date on six attributes: Attractiveness, Sincerity, Intelligence, Fun, Ambition, and Shared Interests.

The dataset also includes questionnaire data gathered from participants at different points in the process. These fields cover demographics, dating habits, self-perception across key attributes, beliefs about what others find valuable in a mate, and lifestyle information.

### Attribute Information

```
* gender: Gender of self
* age: Age of self
* age_o: Age of partner
* d_age: Difference in age
* race: Race of self
* race_o: Race of partner
* samerace: Whether the two persons have the same race or not.
* importance_same_race: How important is it that partner is of same race?
* importance_same_religion: How important is it that partner has same religion?
* field: Field of study
* pref_o_attractive: How important does partner rate attractiveness
* pref_o_sincere: How important does partner rate sincerity
* pref_o_intelligence: How important does partner rate intelligence
* pref_o_funny: How important does partner rate being funny
* pref_o_ambitious: How important does partner rate ambition
* pref_o_shared_interests: How important does partner rate having shared interests
* attractive_o: Rating by partner (about me) at night of event on attractiveness
* sinsere_o: Rating by partner (about me) at night of event on sincerity
* intelligence_o: Rating by partner (about me) at night of event on intelligence
* funny_o: Rating by partner (about me) at night of event on being funny
* ambitous_o: Rating by partner (about me) at night of event on being ambitious
* shared_interests_o: Rating by partner (about me) at night of event on shared interest
* attractive_important: What do you look for in a partner - attractiveness
* sincere_important: What do you look for in a partner - sincerity
* intellicence_important: What do you look for in a partner - intelligence
* funny_important: What do you look for in a partner - being funny
* ambtition_important: What do you look for in a partner - ambition
* shared_interests_important: What do you look for in a partner - shared interests
* attractive: Rate yourself - attractiveness
* sincere: Rate yourself - sincerity
* intelligence: Rate yourself - intelligence
* funny: Rate yourself - being funny
* ambition: Rate yourself - ambition
* attractive_partner: Rate your partner - attractiveness
* sincere_partner: Rate your partner - sincerity
* intelligence_partner: Rate your partner - intelligence
* funny_partner: Rate your partner - being funny
* ambition_partner: Rate your partner - ambition
* shared_interests_partner: Rate your partner - shared interests
* sports: Your own interests [1-10]
* tvsports
* exercise
* dining
* museums
* art
* hiking
* gaming
* clubbing
* reading
* tv
* theater
* movies
* concerts
* music
* shopping
* yoga
* interests_correlate: Correlation between participant’s and partner’s ratings of interests.
* expected_happy_with_sd_people: How happy do you expect to be with the people you meet during the speed-dating event?
* expected_num_interested_in_me: Out of the 20 people you will meet, how many do you expect will be interested in dating you?
* expected_num_matches: How many matches do you expect to get?
* like: Did you like your partner?
* guess_prob_liked: How likely do you think it is that your partner likes you?
* met: Have you met your partner before?
* decision: Decision at night of event.
* decision_o: Decision of partner at night of event.
* match: Match (yes/no)
```

### Relevant paper

Raymond Fisman; Sheena S. Iyengar; Emir Kamenica; Itamar Simonson. Gender Differences in Mate Selection: Evidence From a Speed Dating Experiment. The Quarterly Journal of Economics, Volume 121, Issue 2, 1 May 2006, Pages 673–697, [https://doi.org/10.1162/qjec.2006.121.2.673](https://doi.org/10.1162/qjec.2006.121.2.673)
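
To make the pairwise attributes more concrete, here is a minimal Python sketch of how `samerace` and `match` could relate to the other fields of one record. The `DateRecord` class and both helper functions are hypothetical names introduced for illustration. The rule that a match requires a "yes" from both sides is an assumption based on how speed dating usually works; the description above does not state it.

```python
from dataclasses import dataclass


@dataclass
class DateRecord:
    """One row: a participant ("self") meeting one partner (hypothetical type)."""
    race: str
    race_o: str
    decision: int    # 1 if the participant wants to see the partner again
    decision_o: int  # 1 if the partner wants to see the participant again


def same_race(rec: DateRecord) -> int:
    """Return 1 when self and partner report the same race, else 0."""
    return int(rec.race == rec.race_o)


def is_match(rec: DateRecord) -> int:
    # Assumption: a match means both people said yes.
    return int(rec.decision == 1 and rec.decision_o == 1)


example = DateRecord(race="A", race_o="B", decision=1, decision_o=0)
print(same_race(example), is_match(example))
```

If that assumption holds, `decision` and `decision_o` together determine `match`, so they should not be used as inputs when predicting it.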

123 features

| Feature | Type | Unique values | Missing |
|---|---|---|---|
| match (target) | nominal | 2 | 0 |
| has_null | nominal | 2 | 0 |
| wave | numeric | 21 | 0 |
| gender | nominal | 2 | 0 |
| age | numeric | 24 | 95 |
| age_o | numeric | 24 | 104 |
| d_age | numeric | 35 | 0 |
| d_d_age | nominal | 4 | 0 |
| race | nominal | 5 | 63 |
| race_o | nominal | 5 | 73 |
| samerace | nominal | 2 | 0 |
| importance_same_race | numeric | 11 | 79 |
| importance_same_religion | numeric | 10 | 79 |
| d_importance_same_race | nominal | 3 | 0 |
| d_importance_same_religion | nominal | 3 | 0 |
| field | nominal | 259 | 63 |
| pref_o_attractive | numeric | 94 | 89 |
| pref_o_sincere | numeric | 78 | 89 |
| pref_o_intelligence | numeric | 65 | 89 |
| pref_o_funny | numeric | 71 | 98 |
| pref_o_ambitious | numeric | 82 | 107 |
| pref_o_shared_interests | numeric | 85 | 129 |
| d_pref_o_attractive | nominal | 3 | 0 |
| d_pref_o_sincere | nominal | 3 | 0 |
| d_pref_o_intelligence | nominal | 3 | 0 |
| d_pref_o_funny | nominal | 3 | 0 |
| d_pref_o_ambitious | nominal | 3 | 0 |
| d_pref_o_shared_interests | nominal | 3 | 0 |
| attractive_o | numeric | 18 | 212 |
| sinsere_o | numeric | 14 | 287 |
| intelligence_o | numeric | 17 | 306 |
| funny_o | numeric | 17 | 360 |
| ambitous_o | numeric | 15 | 722 |
| shared_interests_o | numeric | 15 | 1076 |
| d_attractive_o | nominal | 3 | 0 |
| d_sinsere_o | nominal | 3 | 0 |
| d_intelligence_o | nominal | 3 | 0 |
| d_funny_o | nominal | 3 | 0 |
| d_ambitous_o | nominal | 3 | 0 |
| d_shared_interests_o | nominal | 3 | 0 |
| attractive_important | numeric | 94 | 79 |
| sincere_important | numeric | 78 | 79 |
| intellicence_important | numeric | 65 | 79 |
| funny_important | numeric | 71 | 89 |
| ambtition_important | numeric | 82 | 99 |
| shared_interests_important | numeric | 85 | 121 |
| d_attractive_important | nominal | 3 | 0 |
| d_sincere_important | nominal | 3 | 0 |
| d_intellicence_important | nominal | 3 | 0 |
| d_funny_important | nominal | 3 | 0 |
| d_ambtition_important | nominal | 3 | 0 |
| d_shared_interests_important | nominal | 3 | 0 |
| attractive | numeric | 9 | 105 |
| sincere | numeric | 9 | 105 |
| intelligence | numeric | 9 | 105 |
| funny | numeric | 8 | 105 |
| ambition | numeric | 9 | 105 |
| d_attractive | nominal | 3 | 0 |
| d_sincere | nominal | 3 | 0 |
| d_intelligence | nominal | 3 | 0 |
| d_funny | nominal | 3 | 0 |
| d_ambition | nominal | 3 | 0 |
| attractive_partner | numeric | 17 | 202 |
| sincere_partner | numeric | 14 | 277 |
| intelligence_partner | numeric | 17 | 296 |
| funny_partner | numeric | 16 | 350 |
| ambition_partner | numeric | 15 | 712 |
| shared_interests_partner | numeric | 15 | 1067 |
| d_attractive_partner | nominal | 3 | 0 |
| d_sincere_partner | nominal | 3 | 0 |
| d_intelligence_partner | nominal | 3 | 0 |
| d_funny_partner | nominal | 3 | 0 |
| d_ambition_partner | nominal | 3 | 0 |
| d_shared_interests_partner | nominal | 3 | 0 |
| sports | numeric | 10 | 79 |
| tvsports | numeric | 10 | 79 |
| exercise | numeric | 10 | 79 |
| dining | numeric | 10 | 79 |
| museums | numeric | 11 | 79 |
| art | numeric | 11 | 79 |
| hiking | numeric | 11 | 79 |
| gaming | numeric | 12 | 79 |
| clubbing | numeric | 11 | 79 |
| reading | numeric | 11 | 79 |
| tv | numeric | 10 | 79 |
| theater | numeric | 11 | 79 |
| movies | numeric | 10 | 79 |
| concerts | numeric | 11 | 79 |
| music | numeric | 10 | 79 |
| shopping | numeric | 10 | 79 |
| yoga | numeric | 11 | 79 |
| d_sports | nominal | 3 | 0 |
| d_tvsports | nominal | 3 | 0 |
| d_exercise | nominal | 3 | 0 |
| d_dining | nominal | 3 | 0 |
| d_museums | nominal | 3 | 0 |
| d_art | nominal | 3 | 0 |
| d_hiking | nominal | 3 | 0 |
| d_gaming | nominal | 3 | 0 |
| d_clubbing | nominal | 3 | 0 |
| d_reading | nominal | 3 | 0 |
| d_tv | nominal | 3 | 0 |
| d_theater | nominal | 3 | 0 |
| d_movies | nominal | 3 | 0 |
| d_concerts | nominal | 3 | 0 |
| d_music | nominal | 3 | 0 |
| d_shopping | nominal | 3 | 0 |
| d_yoga | nominal | 3 | 0 |
| interests_correlate | numeric | 155 | 158 |
| d_interests_correlate | nominal | 3 | 0 |
| expected_happy_with_sd_people | numeric | 10 | 101 |
| expected_num_interested_in_me | numeric | 18 | 6578 |
| expected_num_matches | numeric | 17 | 1173 |
| d_expected_happy_with_sd_people | nominal | 3 | 0 |
| d_expected_num_interested_in_me | nominal | 3 | 0 |
| d_expected_num_matches | nominal | 3 | 0 |
| like | numeric | 18 | 240 |
| guess_prob_liked | numeric | 19 | 309 |
| d_like | nominal | 3 | 0 |
| d_guess_prob_liked | nominal | 3 | 0 |
| met | numeric | 7 | 375 |
| decision (ignore) | nominal | 2 | 0 |
| decision_o (ignore) | nominal | 2 | 0 |
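
The missing-value counts in this listing are easier to compare as a share of the 8378 rows. The sketch below takes a handful of counts copied from the listing and ranks the attributes by missing rate. The helper names and the 10% cut-off are illustrative assumptions, not an OpenML convention.

```python
N_INSTANCES = 8378  # number of rows, as reported for this dataset

# A few missing-value counts copied from the feature listing.
missing_counts = {
    "expected_num_interested_in_me": 6578,
    "expected_num_matches": 1173,
    "shared_interests_o": 1076,
    "shared_interests_partner": 1067,
    "ambitous_o": 722,
    "age": 95,
    "match": 0,
}


def missing_rates(counts, n_rows):
    """Map each attribute name to its fraction of missing rows."""
    return {name: count / n_rows for name, count in counts.items()}


def columns_above(counts, n_rows, threshold):
    """Names whose missing rate exceeds the threshold, worst first."""
    rates = missing_rates(counts, n_rows)
    flagged = [name for name, rate in rates.items() if rate > threshold]
    return sorted(flagged, key=lambda name: rates[name], reverse=True)


rates = missing_rates(missing_counts, N_INSTANCES)
for name in columns_above(missing_counts, N_INSTANCES, threshold=0.10):
    print(f"{name}: {rates[name]:.1%} missing")
```

In this subset, `expected_num_interested_in_me` stands out: roughly 78.5% of its values are missing, so it would likely need to be dropped or imputed with care.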

62 properties

| Property | Value |
|---|---|
| Number of instances (rows) of the dataset. | 8378 |
| Number of attributes (columns) of the dataset. | 123 |
| Number of distinct values of the target attribute (if it is nominal). | 2 |
| Number of missing values in the dataset. | 18372 |
| Number of instances with at least one value missing. | 7330 |
| Number of numeric attributes. | 59 |
| Number of nominal attributes. | 64 |
| Number of attributes divided by the number of instances. | 0.01 |
| Average number of distinct values among the attributes of the nominal type. | 7.15 |
| Second quartile (Median) of kurtosis among attributes of the numeric type. | 0.37 |
| Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation. | 64.82 |
| Mean skewness among attributes of the numeric type. | 0.24 |
| Second quartile (Median) of means among attributes of the numeric type. | 6.83 |
| Percentage of instances belonging to the most frequent class. | 83.53 |
| Mean standard deviation of attributes of the numeric type. | 3.32 |
| Second quartile (Median) of mutual information between the nominal attributes and the target attribute. | 0 |
| Number of instances belonging to the most frequent class. | 6998 |
| Minimal entropy among attributes. | 0.54 |
| Second quartile (Median) of skewness among attributes of the numeric type. | -0.19 |
| Maximum entropy among attributes. | 7.05 |
| Minimum kurtosis among attributes of the numeric type. | -1.16 |
| Percentage of binary attributes. | 4.88 |
| Second quartile (Median) of standard deviation of attributes of the numeric type. | 2.26 |
| Maximum kurtosis among attributes of the numeric type. | 269.47 |
| Minimum of means among attributes of the numeric type. | 0.05 |
| Percentage of instances having missing values. | 87.49 |
| Third quartile of entropy among attributes. | 1.5 |
| Maximum of means among attributes of the numeric type. | 26.36 |
| Minimal mutual information between the nominal attributes and the target attribute. | 0 |
| Percentage of missing values. | 1.78 |
| Third quartile of kurtosis among attributes of the numeric type. | 1.86 |
| Maximum mutual information between the nominal attributes and the target attribute. | 0.07 |
| The minimal number of distinct values among attributes of the nominal type. | 2 |
| Percentage of numeric attributes. | 47.97 |
| Third quartile of means among attributes of the numeric type. | 10.68 |
| The maximum number of distinct values among attributes of the nominal type. | 259 |
| Minimum skewness among attributes of the numeric type. | -1.08 |
| Percentage of nominal attributes. | 52.03 |
| Third quartile of mutual information between the nominal attributes and the target attribute. | 0.01 |
| Maximum skewness among attributes of the numeric type. | 12.73 |
| Minimum standard deviation of attributes of the numeric type. | 0.28 |
| First quartile of entropy among attributes. | 1.25 |
| Third quartile of skewness among attributes of the numeric type. | 0.57 |
| Maximum standard deviation of attributes of the numeric type. | 12.59 |
| Percentage of instances belonging to the least frequent class. | 16.47 |
| First quartile of kurtosis among attributes of the numeric type. | -0.37 |
| Third quartile of standard deviation of attributes of the numeric type. | 4.6 |
| Average entropy of the attributes. | 1.44 |
| Number of instances belonging to the least frequent class. | 1380 |
| First quartile of means among attributes of the numeric type. | 5.57 |
| Standard deviation of the number of distinct values among attributes of the nominal type. | 32.51 |
| Mean kurtosis among attributes of the numeric type. | 5.59 |
| Number of binary attributes. | 6 |
| First quartile of mutual information between the nominal attributes and the target attribute. | 0 |
| Mean of means among attributes of the numeric type. | 8.92 |
| First quartile of skewness among attributes of the numeric type. | -0.55 |
| Average class difference between consecutive instances. | 0.75 |
| Average mutual information between the nominal attributes and the target attribute. | 0.01 |
| First quartile of standard deviation of attributes of the numeric type. | 1.79 |
| Entropy of the target attribute values. | 0.65 |
| An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation. | 143.48 |
| Second quartile (Median) of entropy among attributes. | 1.41 |
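
Several of the properties above follow directly from the raw counts, which makes for a quick consistency check. The sketch below recomputes them in plain Python. The helper names are illustrative, and the entropy is assumed to be measured in bits (base-2 logarithm), which is what reproduces the listed 0.65. Because the listed values are rounded to two decimals, the last two checks can only be approximate.

```python
import math

# Raw counts copied from the property listing.
n_instances, n_attributes = 8378, 123
n_missing_values, n_rows_with_missing = 18372, 7330
n_numeric, n_nominal, n_binary = 59, 64, 6
majority, minority = 6998, 1380


def pct(part, whole):
    """Percentage rounded to two decimals, as shown in the listing."""
    return round(100 * part / whole, 2)


def entropy_bits(counts):
    """Shannon entropy (base 2, an assumption) of a class distribution."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


derived = {
    "majority_class_pct": pct(majority, n_instances),
    "minority_class_pct": pct(minority, n_instances),
    "rows_with_missing_pct": pct(n_rows_with_missing, n_instances),
    "missing_values_pct": pct(n_missing_values, n_instances * n_attributes),
    "numeric_pct": pct(n_numeric, n_attributes),
    "nominal_pct": pct(n_nominal, n_attributes),
    "binary_pct": pct(n_binary, n_attributes),
    "dimensionality": round(n_attributes / n_instances, 2),
    "class_entropy": round(entropy_bits([majority, minority]), 2),
}

# ClassEntropy / MeanMutualInformation is listed as 64.82, so the unrounded
# mean mutual information can be backed out from the class entropy.
implied_mean_mi = entropy_bits([majority, minority]) / 64.82

# (MeanAttributeEntropy - MeanMutualInformation) / MeanMutualInformation,
# using the rounded mean attribute entropy of 1.44 from the listing.
noise_to_signal = (1.44 - implied_mean_mi) / implied_mean_mi

for name, value in derived.items():
    print(f"{name}: {value}")
print(f"implied mean mutual information: {implied_mean_mi:.4f}")
print(f"approximate noise-to-signal ratio: {noise_to_signal:.1f}")
```

The implied mean mutual information rounds to the listed 0.01, and the recomputed noise-to-signal ratio lands close to the listed 143.48; the remaining gap is what one would expect from using the rounded 1.44.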

10 tasks

20074 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: match
2538 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: area_under_roc_curve - target_feature: match
1170 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: match - cost matrix: [[0,1],[50,0]]
1071 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: match - cost matrix: [[0,1],[2,0]]
20 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: f_measure - target_feature: match
0 runs - estimation_procedure: 33% Holdout set - evaluation_measure: predictive_accuracy - target_feature: match
0 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: average_cost - target_feature: match
0 runs - target_feature: match
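
Two of the tasks above attach a cost matrix to the classification problem. As a rough illustration of how such a matrix could feed an average-cost measure, the sketch below scores a confusion matrix against it. The orientation (rows are the actual class, columns the predicted class), the class order (index 0 is "no match", index 1 is "match"), and the baseline that always predicts the majority class are all assumptions made for this example; the task listing does not spell them out.

```python
def average_cost(confusion, cost_matrix):
    """Mean misclassification cost per instance.

    confusion[i][j]  : number of instances of actual class i predicted as j
    cost_matrix[i][j]: cost charged for each such instance
    (row/column orientation is an assumption of this sketch)
    """
    total = sum(sum(row) for row in confusion)
    cost = sum(
        confusion[i][j] * cost_matrix[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )
    return cost / total


# Hypothetical baseline: every pair is predicted as "no match" (class 0).
# Class sizes come from the dataset properties: 6998 and 1380 instances.
always_no_match = [
    [6998, 0],  # actual no match, all predicted correctly
    [1380, 0],  # actual match, all missed
]

steep = [[0, 1], [50, 0]]  # cost matrix from the task with 1170 runs
mild = [[0, 1], [2, 0]]    # cost matrix from the task with 1071 runs

print(round(average_cost(always_no_match, steep), 2))
print(round(average_cost(always_no_match, mild), 2))
```

Under these assumptions the same baseline is far more expensive with the `[[0,1],[50,0]]` matrix, since every missed match is charged 50 instead of 2.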