Data

bodyfat

active
ARFF
Publicly available Visibility: public Uploaded 03-10-2014 by Joaquin Vanschoren

0 likes, downloaded by 4 people, 5 total downloads, 0 issues, 0 downvotes


Author: Roger W. Johnson
Source: [UCI (not available anymore)](https://archive.ics.uci.edu/ml/index.php), [TunedIT](http://tunedit.org/repo/UCI/numeric/bodyfat.arff)
Please cite: None.
Short Summary:
Lists estimates of the percentage of body fat determined by underwater
weighing and various body circumference measurements for 252 men.
Classroom use of this data set:
This data set can be used to illustrate multiple regression techniques.
Accurate measurement of body fat is inconvenient and costly, so it is
desirable to have easy methods of estimating body fat that avoid this
inconvenience and cost.
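As an illustration of the classroom use described above, here is a minimal sketch of a multiple regression fit by ordinary least squares using NumPy. The function names `fit_linear_regression` and `predict` are hypothetical helpers, and the demo arrays are synthetic placeholders, not values from the bodyfat file; with the real data, `X` would hold predictor columns such as the circumference measurements and `y` the percent body fat.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Ordinary least squares fit of y on the columns of X plus an intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] are the slopes

def predict(coef, X):
    """Apply fitted coefficients to new predictor rows."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]

# Synthetic placeholder data (NOT taken from the bodyfat file):
# a noise-free linear relationship with known coefficients.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
y_demo = 3.0 + 2.0 * X_demo[:, 0] - 1.0 * X_demo[:, 1]
coef_demo = fit_linear_regression(X_demo, y_demo)
```

Because the synthetic response is an exact linear function of the predictors, the fit should recover the intercept and slopes used to generate it, which makes the sketch easy to sanity-check before applying it to real measurements.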
More Details:
A variety of popular health books suggest that the readers assess their
health, at least in part, by estimating their percentage of body fat. In
Bailey (1994), for instance, the reader can estimate body fat from tables
using their age and various skin-fold measurements obtained by using a
caliper. Other texts give predictive equations for body fat using body
circumference measurements (e.g. abdominal circumference) and/or skin-fold
measurements. See, for instance, Behnke and Wilmore (1974), pp. 66-67;
Wilmore (1976), p. 247; or Katch and McArdle (1977), pp. 120-132.
Percentage of body fat for an individual can be estimated once body density
has been determined. Researchers (e.g. Siri (1956)) assume that the body consists
of two components - lean body tissue and fat tissue. Letting
D = Body Density (gm/cm^3)
A = proportion of lean body tissue
B = proportion of fat tissue (A+B=1)
a = density of lean body tissue (gm/cm^3)
b = density of fat tissue (gm/cm^3)
we have
D = 1/[(A/a) + (B/b)]
solving for B we find
B = (1/D)*[ab/(a-b)] - [b/(a-b)].
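The intermediate algebra can be written out as follows, using only the symbols defined above and the constraint A = 1 - B:

```latex
\begin{align*}
\frac{1}{D} &= \frac{A}{a} + \frac{B}{b}
             = \frac{1-B}{a} + \frac{B}{b}
             = \frac{1}{a} + B\,\frac{a-b}{ab}, \\
B &= \left(\frac{1}{D} - \frac{1}{a}\right)\frac{ab}{a-b}
   = \frac{1}{D}\cdot\frac{ab}{a-b} - \frac{b}{a-b}.
\end{align*}
```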
Using the estimates a=1.10 gm/cm^3 and b=0.90 gm/cm^3 (see Katch and McArdle
(1977), p. 111 or Wilmore (1976), p. 123) we come up with "Siri's equation":
Percentage of Body Fat (i.e. 100*B) = 495/D - 450.
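A short numerical check of this step, as a sketch: with a = 1.10 and b = 0.90 the general two-component formula reduces to the coefficients 495 and 450. The function names and the example density are hypothetical and chosen only for illustration.

```python
def fat_proportion(density, lean_density=1.10, fat_density=0.90):
    """Proportion of fat tissue B from the two-component model.

    Implements B = (1/D) * [ab/(a-b)] - [b/(a-b)] with densities in gm/cm^3.
    """
    a, b = lean_density, fat_density
    return (1.0 / density) * (a * b / (a - b)) - b / (a - b)

def siri_percent_body_fat(density):
    """Siri's equation: percent body fat = 495/D - 450."""
    return 495.0 / density - 450.0

# Hypothetical body density, for illustration only.
example_density = 1.05
general = 100.0 * fat_proportion(example_density)
siri = siri_percent_body_fat(example_density)
```

The two results agree up to floating-point rounding, since ab/(a-b) = 0.99/0.20 = 4.95 and b/(a-b) = 0.90/0.20 = 4.50.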
Volume, and hence body density, can be accurately measured in a variety of ways.
The technique of underwater weighing "computes body volume as the difference
between body weight measured in air and weight measured during water
submersion. In other words, body volume is equal to the loss of weight in
water with the appropriate temperature correction for the water's density"
(Katch and McArdle (1977), p. 113). Using this technique,
Body Density = WA/[(WA-WW)/c.f. - LV]
where
WA = Weight in air (kg)
WW = Weight in water (kg)
c.f. = Water correction factor (=1 at 39.2 deg F, as one gram of water
occupies exactly one cm^3 at this temperature; =.997 at 76-78 deg F)
LV = Residual Lung Volume (liters)
(Katch and McArdle (1977), p. 115). Other methods of determining body volume
are given in Behnke and Wilmore (1974), p. 22 ff.
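The underwater weighing formula above can be sketched as a small function. The function name, argument names, and default correction factor are assumptions made for this illustration; weights are in kg and residual lung volume in liters, so the result is in kg/liter, which is numerically equal to gm/cm^3.

```python
def body_density(weight_air_kg, weight_water_kg, residual_lung_volume_l,
                 water_correction=0.997):
    """Body Density = WA / [(WA - WW)/c.f. - LV].

    water_correction defaults to .997, the value quoted for 76-78 deg F.
    """
    # Body volume: weight lost in water, corrected for water density,
    # minus the air remaining in the lungs.
    volume_l = ((weight_air_kg - weight_water_kg) / water_correction
                - residual_lung_volume_l)
    if volume_l <= 0:
        raise ValueError("computed body volume must be positive")
    return weight_air_kg / volume_l

# Hypothetical measurements, for illustration only (not from the data set).
example = body_density(weight_air_kg=80.0, weight_water_kg=4.0,
                       residual_lung_volume_l=1.2)
```

The resulting density can then be passed to Siri's equation to obtain an estimated percentage of body fat.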
The variables listed below, from left to right, are:
Density determined from underwater weighing
Percent body fat from Siri's (1956) equation
Age (years)
Weight (lbs)
Height (inches)
Neck circumference (cm)
Chest circumference (cm)
Abdomen 2 circumference (cm)
Hip circumference (cm)
Thigh circumference (cm)
Knee circumference (cm)
Ankle circumference (cm)
Biceps (extended) circumference (cm)
Forearm circumference (cm)
Wrist circumference (cm)
(Measurement standards are apparently those listed in Behnke and Wilmore
(1974), pp. 45-48 where, for instance, the abdomen 2 circumference is
measured "laterally, at the level of the iliac crests, and anteriorly, at
the umbilicus".)
These data are used to produce the predictive equations for lean
body weight given in the abstract "Generalized body composition prediction
equation for men using simple measurement techniques", K.W. Penrose, A.G.
Nelson, A.G. Fisher, FACSM, Human Performance Research Center, Brigham Young
University, Provo, Utah 84602 as listed in _Medicine and Science in Sports
and Exercise_, vol. 17, no. 2, April 1985, p. 189. (The predictive equations
were obtained from the first 143 of the 252 cases that are listed below).
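A split like the one described, with the first 143 cases used for fitting and the rest held out, might look like the sketch below. The helper name `split_first_n` is hypothetical, and the placeholder array only stands in for the 252 rows of the real file.

```python
import numpy as np

def split_first_n(rows, n_fit=143):
    """Return (first n_fit rows, remaining rows) without shuffling."""
    rows = np.asarray(rows)
    return rows[:n_fit], rows[n_fit:]

# Placeholder array with 252 rows and 15 columns (NOT the real data).
placeholder = np.zeros((252, 15))
fit_rows, holdout_rows = split_first_n(placeholder)
```

Keeping the original row order matters here, because the split is defined by position in the file rather than by random sampling.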
The data were generously supplied by Dr. A. Garth Fisher, who gave permission to
freely distribute the data and use them for non-commercial purposes.
References:
Bailey, Covert (1994). _Smart Exercise: Burning Fat, Getting Fit_,
Houghton-Mifflin Co., Boston, pp. 179-186.
Behnke, A.R. and Wilmore, J.H. (1974). _Evaluation and Regulation of Body
Build and Composition_, Prentice-Hall, Englewood Cliffs, N.J.
Siri, W.E. (1956). "Gross composition of the body", in _Advances in
Biological and Medical Physics_, vol. IV, edited by J.H. Lawrence and C.A.
Tobias, Academic Press, Inc., New York.
Katch, Frank and McArdle, William (1977). _Nutrition, Weight Control, and
Exercise_, Houghton Mifflin Co., Boston.
Wilmore, Jack (1976). _Athletic Training and Physical Fitness: Physiological
Principles of the Conditioning Process_, Allyn and Bacon, Inc., Boston.

Feature | Type | Unique values | Missing
---|---|---|---
class (target) | numeric | 176 | 0
Density | numeric | 218 | 0
Age | numeric | 51 | 0
Weight | numeric | 197 | 0
Height | numeric | 48 | 0
Neck | numeric | 90 | 0
Chest | numeric | 174 | 0
Abdomen | numeric | 185 | 0
Hip | numeric | 152 | 0
Thigh | numeric | 139 | 0
Knee | numeric | 90 | 0
Ankle | numeric | 61 | 0
Biceps | numeric | 104 | 0
Forearm | numeric | 77 | 0
Wrist | numeric | 44 | 0

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

Second quartile (Median) of kurtosis among attributes of the numeric type: 1.06

Second quartile (Median) of means among attributes of the numeric type: 38.59

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump

Second quartile (Median) of mutual information between the nominal attributes and the target attribute.

Second quartile (Median) of skewness among attributes of the numeric type: 0.52

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump

Minimal mutual information between the nominal attributes and the target attribute.

Second quartile (Median) of standard deviation of attributes of the numeric type: 3.66

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

Maximum mutual information between the nominal attributes and the target attribute.

The minimal number of distinct values among attributes of the nominal type.

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.

The maximum number of distinct values among attributes of the nominal type.

Third quartile of kurtosis among attributes of the numeric type: 5.27

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

Third quartile of mutual information between the nominal attributes and the target attribute.

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

Third quartile of skewness among attributes of the numeric type: 0.84

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001

Third quartile of standard deviation of attributes of the numeric type: 8.43

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

Average mutual information between the nominal attributes and the target attribute.

First quartile of mutual information between the nominal attributes and the target attribute.

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.

First quartile of skewness among attributes of the numeric type: 0.15

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

Standard deviation of the number of distinct values among attributes of the nominal type.

Average number of distinct values among the attributes of the nominal type.

First quartile of standard deviation of attributes of the numeric type: 2.02

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2

Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W