Data
bodyfat

# bodyfat

active ARFF Publicly available Visibility: public Uploaded 03-10-2014 by Joaquin Vanschoren
0 likes downloaded by 6 people , 9 total downloads 0 issues 0 downvotes
Issue #Downvotes for this reason By

Author: Roger W. Johnson Source: [UCI (not available anymore)](https://archive.ics.uci.edu/ml/index.php), [TunedIT](http://tunedit.org/repo/UCI/numeric/bodyfat.arff) Please cite: None. Short Summary: Lists estimates of the percentage of body fat determined by underwater weighing and various body circumference measurements for 252 men. Classroom use of this data set: This data set can be used to illustrate multiple regression techniques. Accurate measurement of body fat is inconvenient/costly and it is desirable to have easy methods of estimating body fat that are not inconvenient/costly. More Details: A variety of popular health books suggest that the readers assess their health, at least in part, by estimating their percentage of body fat. In Bailey (1994), for instance, the reader can estimate body fat from tables using their age and various skin-fold measurements obtained by using a caliper. Other texts give predictive equations for body fat using body circumference measurements (e.g. abdominal circumference) and/or skin-fold measurements. See, for instance, Behnke and Wilmore (1974), pp. 66-67; Wilmore (1976), p. 247; or Katch and McArdle (1977), pp. 120-132). Percentage of body fat for an individual can be estimated once body density has been determined. Folks (e.g. Siri (1956)) assume that the body consists of two components - lean body tissue and fat tissue. Letting D = Body Density (gm/cm^3) A = proportion of lean body tissue B = proportion of fat tissue (A+B=1) a = density of lean body tissue (gm/cm^3) b = density of fat tissue (gm/cm^3) we have D = 1/[(A/a) + (B/b)] solving for B we find B = (1/D)*[ab/(a-b)] - [b/(a-b)]. Using the estimates a=1.10 gm/cm^3 and b=0.90 gm/cm^3 (see Katch and McArdle (1977), p. 111 or Wilmore (1976), p. 123) we come up with "Siri's equation": Percentage of Body Fat (i.e. 100*B) = 495/D - 450. Volume, and hence body density, can be accurately measured a variety of ways. The technique of underwater weighing "computes body volume as the difference between body weight measured in air and weight measured during water submersion. In other words, body volume is equal to the loss of weight in water with the appropriate temperature correction for the water's density" (Katch and McArdle (1977), p. 113). Using this technique, Body Density = WA/[(WA-WW)/c.f. - LV] where WA = Weight in air (kg) WW = Weight in water (kg) c.f. = Water correction factor (=1 at 39.2 deg F as one-gram of water occupies exactly one cm^3 at this temperature, =.997 at 76-78 deg F) LV = Residual Lung Volume (liters) (Katch and McArdle (1977), p. 115). Other methods of determining body volume are given in Behnke and Wilmore (1974), p. 22 ff. The variables listed below, from left to right, are: Density determined from underwater weighing Percent body fat from Siri's (1956) equation Age (years) Weight (lbs) Height (inches) Neck circumference (cm) Chest circumference (cm) Abdomen 2 circumference (cm) Hip circumference (cm) Thigh circumference (cm) Knee circumference (cm) Ankle circumference (cm) Biceps (extended) circumference (cm) Forearm circumference (cm) Wrist circumference (cm) (Measurement standards are apparently those listed in Benhke and Wilmore (1974), pp. 45-48 where, for instance, the abdomen 2 circumference is measured "laterally, at the level of the iliac crests, and anteriorly, at the umbilicus".) These data are used to produce the predictive equations for lean body weight given in the abstract "Generalized body composition prediction equation for men using simple measurement techniques", K.W. Penrose, A.G. Nelson, A.G. Fisher, FACSM, Human Performance Research Center, Brigham Young University, Provo, Utah 84602 as listed in _Medicine and Science in Sports and Exercise_, vol. 17, no. 2, April 1985, p. 189. (The predictive equations were obtained from the first 143 of the 252 cases that are listed below). The data were generously supplied by Dr. A. Garth Fisher who gave permission to freely distribute the data and use for non-commercial purposes. References: Bailey, Covert (1994). _Smart Exercise: Burning Fat, Getting Fit_, Houghton-Mifflin Co., Boston, pp. 179-186. Behnke, A.R. and Wilmore, J.H. (1974). _Evaluation and Regulation of Body Build and Composition_, Prentice-Hall, Englewood Cliffs, N.J. Siri, W.E. (1956), "Gross composition of the body", in _Advances in Biological and Medical Physics_, vol. IV, edited by J.H. Lawrence and C.A. Tobias, Academic Press, Inc., New York. Katch, Frank and McArdle, William (1977). _Nutrition, Weight Control, and Exercise_, Houghton Mifflin Co., Boston. Wilmore, Jack (1976). _Athletic Training and Physical Fitness: Physiological Principles of the Conditioning Process_, Allyn and Bacon, Inc., Boston.

### 15 features

 class (target) numeric 176 unique values 0 missing Density numeric 218 unique values 0 missing Age numeric 51 unique values 0 missing Weight numeric 197 unique values 0 missing Height numeric 48 unique values 0 missing Neck numeric 90 unique values 0 missing Chest numeric 174 unique values 0 missing Abdomen numeric 185 unique values 0 missing Hip numeric 152 unique values 0 missing Thigh numeric 139 unique values 0 missing Knee numeric 90 unique values 0 missing Ankle numeric 61 unique values 0 missing Biceps numeric 104 unique values 0 missing Forearm numeric 77 unique values 0 missing Wrist numeric 44 unique values 0 missing

### 107 properties

252
Number of instances (rows) of the dataset.
15
Number of attributes (columns) of the dataset.
0
Number of distinct values of the target attribute (if it is nominal).
0
Number of missing values in the dataset.
0
Number of instances with at least one value missing.
15
Number of numeric attributes.
0
Number of nominal attributes.
Third quartile of entropy among attributes.
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.
The maximum number of distinct values among attributes of the nominal type.
-5.38
Minimum skewness among attributes of the numeric type.
0
Percentage of instances having missing values.
5.27
Third quartile of kurtosis among attributes of the numeric type.
-6.72
Average class difference between consecutive instances.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001
2.26
Maximum skewness among attributes of the numeric type.
0.02
Minimum standard deviation of attributes of the numeric type.
0
Percentage of missing values.
92.56
Third quartile of means among attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001
29.39
Maximum standard deviation of attributes of the numeric type.
Percentage of instances belonging to the least frequent class.
100
Percentage of numeric attributes.
Third quartile of mutual information between the nominal attributes and the target attribute.
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001
Average entropy of the attributes.
Number of instances belonging to the least frequent class.
0
Percentage of nominal attributes.
0.84
Third quartile of skewness among attributes of the numeric type.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001
6.31
Mean kurtosis among attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes
First quartile of entropy among attributes.
8.43
Third quartile of standard deviation of attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001
56.38
Mean of means among attributes of the numeric type.
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes
0.4
First quartile of kurtosis among attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001
Average mutual information between the nominal attributes and the target attribute.
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes
23.1
First quartile of means among attributes of the numeric type.
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001
An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.
0
Number of binary attributes.
First quartile of mutual information between the nominal attributes and the target attribute.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Standard deviation of the number of distinct values among attributes of the nominal type.
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001
Average number of distinct values among the attributes of the nominal type.
0.15
First quartile of skewness among attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2
Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001
0.25
Mean skewness among attributes of the numeric type.
2.02
First quartile of standard deviation of attributes of the numeric type.
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Error rate achieved by the landmarker weka.classifiers.lazy.IBk
Percentage of instances belonging to the most frequent class.
6.55
Mean standard deviation of attributes of the numeric type.
Second quartile (Median) of entropy among attributes.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2
Entropy of the target attribute values.
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk
Number of instances belonging to the most frequent class.
Minimal entropy among attributes.
1.06
Second quartile (Median) of kurtosis among attributes of the numeric type.
38.59
Second quartile (Median) of means among attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump
Maximum entropy among attributes.
-0.42
Minimum kurtosis among attributes of the numeric type.
Second quartile (Median) of mutual information between the nominal attributes and the target attribute.
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump
59.54
Maximum kurtosis among attributes of the numeric type.
1.06
Minimum of means among attributes of the numeric type.
0.52
Second quartile (Median) of skewness among attributes of the numeric type.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump
178.92
Maximum of means among attributes of the numeric type.
Minimal mutual information between the nominal attributes and the target attribute.
3.66
Second quartile (Median) of standard deviation of attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
0.06
Number of attributes divided by the number of instances.
Maximum mutual information between the nominal attributes and the target attribute.
The minimal number of distinct values among attributes of the nominal type.
0
Percentage of binary attributes.

### 19 tasks

1 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: class
0 runs - estimation_procedure: 10 times 10-fold Crossvalidation - evaluation_measure: mean_absolute_error - target_feature: class
0 runs - estimation_procedure: 10 times 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: class
0 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: mean_absolute_error - target_feature: class
0 runs - estimation_procedure: 5 times 2-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: class
0 runs - estimation_procedure: Test on Training Data - evaluation_measure: predictive_accuracy - target_feature: class
0 runs - estimation_procedure: 33% Holdout set - evaluation_measure: root_mean_squared_error - target_feature: class
0 runs - estimation_procedure: Custom 10-fold Crossvalidation - evaluation_measure: predictive_accuracy - target_feature: class
0 runs - estimation_procedure: 50 times Clustering
0 runs
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering