socmob

Status: active
Format: ARFF
Visibility: public. Uploaded 29-09-2014 by Joaquin Vanschoren.


Author:
Source: Unknown - Date unknown
Please cite:
This dataset contains 17x17x2x2 tables of counts in GLIM-ready format, used
for the analyses in Biblarz, Timothy J., and Adrian E. Raftery. 1993. "The
Effects of Family Disruption on Social Mobility." American Sociological Review
(In press). See this reference for further details of the data.
Column 1 is father's occupation, coded as follows:
17. Professional, Self-Employed
16. Professional-Salaried
15. Manager
14. Salesman-Nonretail
13. Proprietor
12. Clerk
11. Salesman-Retail
10. Craftsman-Manufacturing
9. Craftsmen-Other
8. Craftsman-Construction
7. Service Worker
6. Operative-Nonmanufacturing
5. Operative-Manufacturing
4. Laborer-Manufacturing
3. Laborer-Nonmanufacturing
2. Farmer/Farm Manager
1. Farm Laborer
Column 2 is son's occupation, coded in the same way as father's.
Column 3 is family structure, coded 1=intact family background and
2=nonintact family background.
Column 4 is race, coded 1=white and 2=black.
Column 5 is counts for son's first occupation.
Column 6 is counts for son's current occupation.
The counts have been weighted to take account of the survey
design, which is why they are not integers.
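The column coding above can be sketched in Python as a set of lookup tables plus a small decoder. This is only an illustration, not part of the original file: the function name `decode_row` and the example record are hypothetical, and the counts in the example are made up. The output keys follow the feature names used on this page.

```python
# Occupation codes exactly as listed in the description (columns 1 and 2).
OCCUPATIONS = {
    17: "Professional, Self-Employed",
    16: "Professional-Salaried",
    15: "Manager",
    14: "Salesman-Nonretail",
    13: "Proprietor",
    12: "Clerk",
    11: "Salesman-Retail",
    10: "Craftsman-Manufacturing",
    9: "Craftsmen-Other",
    8: "Craftsman-Construction",
    7: "Service Worker",
    6: "Operative-Nonmanufacturing",
    5: "Operative-Manufacturing",
    4: "Laborer-Manufacturing",
    3: "Laborer-Nonmanufacturing",
    2: "Farmer/Farm Manager",
    1: "Farm Laborer",
}
FAMILY_STRUCTURE = {1: "intact", 2: "nonintact"}  # column 3
RACE = {1: "white", 2: "black"}                   # column 4

# Number of cells in the 17x17x2x2 table of counts.
N_CELLS = len(OCCUPATIONS) ** 2 * len(FAMILY_STRUCTURE) * len(RACE)


def decode_row(row):
    """Turn one six-column record into a dict with readable labels.

    Hypothetical helper: assumes the record holds the six columns in the
    order given in the description, with integer category codes.
    """
    father, son, family, race, first_count, current_count = row
    return {
        "fathers_occupation": OCCUPATIONS[int(father)],
        "sons_occupation": OCCUPATIONS[int(son)],
        "family_structure": FAMILY_STRUCTURE[int(family)],
        "race": RACE[int(race)],
        # Weighted counts, so they are floats rather than integers.
        "counts_for_sons_first_occupation": float(first_count),
        "counts_for_sons_current_occupation": float(current_count),
    }


# Made-up record for illustration only; these are not values from the data.
example = decode_row((17, 15, 1, 2, 3.2, 4.7))
print(N_CELLS)
print(example["fathers_occupation"], "->", example["sons_occupation"])
```

Keeping the counts as floats reflects the note above that the survey weighting makes them non-integer.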
*
This file was constructed from publicly available data collected
by David Featherman and Robert Hauser in 1973: the "Occupational
Change in a Generation II" (OCG II) Survey. Permission is hereby given to
use the above data for non-commercial scholarly and teaching purposes.
If these data are used in a published article or book,
the authors, the original data (in the form given in
Biblarz and Raftery (1993), cited above), and StatLib should
all be acknowledged.
Information about the dataset
CLASSTYPE: numeric
CLASSINDEX: none specific

Feature | Type | Unique values | Missing values |
---|---|---|---|
counts_for_sons_current_occupation (target) | numeric | 361 | 0 |
fathers_occupation | nominal | 17 | 0 |
sons_occupation | nominal | 17 | 0 |
family_structure | nominal | 2 | 0 |
race | nominal | 2 | 0 |
counts_for_sons_first_occupation | numeric | 358 | 0 |
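As a rough sketch, the feature listing above can be written down as a schema and used to sanity-check records before modelling. The names `SCHEMA`, `TARGET`, and `validate_record` are hypothetical, and the sketch assumes nominal values are stored as the integer codes 1..n from the description; the ARFF file may store them differently (for example as strings).

```python
import math

# Feature name -> (type, number of nominal levels or None), from the listing.
SCHEMA = {
    "fathers_occupation": ("nominal", 17),
    "sons_occupation": ("nominal", 17),
    "family_structure": ("nominal", 2),
    "race": ("nominal", 2),
    "counts_for_sons_first_occupation": ("numeric", None),
    "counts_for_sons_current_occupation": ("numeric", None),
}
TARGET = "counts_for_sons_current_occupation"


def validate_record(record):
    """Return a list of problems found in one record (empty list means OK)."""
    problems = []
    for name, (kind, n_levels) in SCHEMA.items():
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        if kind == "nominal":
            # Assumption: nominal levels are the integer codes 1..n_levels.
            if value not in range(1, n_levels + 1):
                problems.append(f"{name}: {value!r} is not a code in 1..{n_levels}")
        else:
            # Counts are weighted, so any finite non-negative number is accepted.
            if (not isinstance(value, (int, float))
                    or math.isnan(value) or value < 0):
                problems.append(f"{name}: {value!r} is not a non-negative number")
    return problems


# Made-up records for illustration only.
good = {
    "fathers_occupation": 12, "sons_occupation": 15,
    "family_structure": 1, "race": 1,
    "counts_for_sons_first_occupation": 2.5,
    "counts_for_sons_current_occupation": 6.1,
}
bad = dict(good, race=3, counts_for_sons_first_occupation=-1.0)

print(validate_record(good))
print(validate_record(bad))
```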

Meta-feature | Value |
---|---|
Minimal mutual information between the nominal attributes and the target attribute. | |
Second quartile (Median) of skewness among attributes of the numeric type. | 5.81 |
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump | |
Maximum mutual information between the nominal attributes and the target attribute. | |
The minimal number of distinct values among attributes of the nominal type. | 2 |
Second quartile (Median) of standard deviation of attributes of the numeric type. | 42.68 |
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1 | |
The maximum number of distinct values among attributes of the nominal type. | 17 |
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1 | |
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation. | |
Third quartile of kurtosis among attributes of the numeric type. | 81.88 |
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1 | |
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001 | |
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W | |
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2 | |
Third quartile of mutual information between the nominal attributes and the target attribute. | |
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W | |
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2 | |
Third quartile of skewness among attributes of the numeric type. | 7.19 |
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W | |
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2 | |
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001 | |
First quartile of kurtosis among attributes of the numeric type. | 25.1 |
Third quartile of standard deviation of attributes of the numeric type. | 44.36 |
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W | |
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3 | |
Average mutual information between the nominal attributes and the target attribute. | |
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1 | |
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W | |
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3 | |
An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation. | |
First quartile of mutual information between the nominal attributes and the target attribute. | |
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W | |
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3 | |
Average number of distinct values among the attributes of the nominal type. | 9.5 |
First quartile of skewness among attributes of the numeric type. | 4.43 |
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W | |
Standard deviation of the number of distinct values among attributes of the nominal type. | 8.66 |
First quartile of standard deviation of attributes of the numeric type. | 41 |
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2 | |
Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W | |
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W | |
Second quartile (Median) of kurtosis among attributes of the numeric type. | 53.49 |
Second quartile (Median) of means among attributes of the numeric type. | 17.58 |
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3 | |
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump | |
Second quartile (Median) of mutual information between the nominal attributes and the target attribute. | |
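The nominal-attribute summaries listed above (2, 17, 9.5, and 8.66) can be reproduced from the four nominal cardinalities in the feature listing (17, 17, 2, 2). The sketch below does this with Python's standard library. It also writes out the two ratio definitions quoted in the meta-feature descriptions as plain functions; the function names are hypothetical, and since no entropy or mutual-information values are reported on this page, the inputs in the last lines are made-up placeholders.

```python
import statistics

# Distinct-value counts of the four nominal features:
# fathers_occupation, sons_occupation, family_structure, race.
nominal_cardinalities = [17, 17, 2, 2]

summary = {
    "min": min(nominal_cardinalities),
    "max": max(nominal_cardinalities),
    "mean": statistics.mean(nominal_cardinalities),
    # Sample standard deviation (n - 1 denominator); this is the variant
    # that matches the listed 8.66. The population version would give 7.5.
    "stdev": round(statistics.stdev(nominal_cardinalities), 2),
}
print(summary)


def equivalent_number_of_attributes(class_entropy, mean_mutual_information):
    """ClassEntropy divided by MeanMutualInformation, as described above."""
    return class_entropy / mean_mutual_information


def noise_to_signal_ratio(mean_attribute_entropy, mean_mutual_information):
    """(MeanAttributeEntropy - MeanMutualInformation) / MeanMutualInformation."""
    return (mean_attribute_entropy - mean_mutual_information) / mean_mutual_information


# Placeholder inputs for illustration only; not values from this dataset.
print(equivalent_number_of_attributes(2.0, 0.5))
print(noise_to_signal_ratio(1.5, 0.5))
```

Both ratios are undefined when the mean mutual information is zero, so a caller would need to guard against that case.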