<oml:data_set_description xmlns:oml="http://openml.org/openml">
  <oml:id>1510</oml:id>
  <oml:name>wdbc</oml:name>
  <oml:version>1</oml:version>
  <oml:description>**Author**: William H. Wolberg, W. Nick Street, Olvi L. Mangasarian    
**Source**: [UCI](https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(original)), [University of Wisconsin](http://pages.cs.wisc.edu/~olvi/uwmp/cancer.html) - 1995  
**Please cite**: [UCI](https://archive.ics.uci.edu/ml/citation_policy.html)     

**Breast Cancer Wisconsin (Diagnostic) Data Set (WDBC).** Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. The target feature records the diagnosis: benign (1) or malignant (2). [Original data available here](ftp://ftp.cs.wisc.edu/math-prog/cpo-dataset/machine-learn/cancer/) 

The current dataset was adapted to ARFF format from the UCI version. Sample code IDs were removed.  

! Note that there is also a related Breast Cancer Wisconsin (Original) Data Set with a different set of features, better known as [breast-w](https://www.openml.org/d/15).


### Feature description  

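Ten real-valued features are computed for each cell nucleus in the image. For each image, three statistics of every feature are recorded: the mean, the standard error, and the &quot;worst&quot; value (the mean of the three largest values), yielding a total of 30 descriptive features. See the papers below for more details on how they were computed. The 10 features (in order) are:  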

a) radius (mean of distances from center to points on the perimeter)  
b) texture (standard deviation of gray-scale values)  
c) perimeter  
d) area  
e) smoothness (local variation in radius lengths)  
f) compactness (perimeter^2 / area - 1.0)  
g) concavity (severity of concave portions of the contour)  
h) concave points (number of concave portions of the contour)  
i) symmetry  
j) fractal dimension (&quot;coastline approximation&quot; - 1)  
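
As a minimal sketch of how a derived feature such as compactness (item f) follows from the perimeter and area, the Python snippet below applies the formula exactly as written in the list. The function name `compactness` and the circle used as input are illustrative assumptions, not part of the dataset or of the authors' code.

```python
import math


def compactness(perimeter, area):
    """Compactness as written in item (f): perimeter^2 / area - 1.0."""
    if area == 0:
        raise ValueError("area must be non-zero")
    return perimeter ** 2 / area - 1.0


# Hypothetical input: an ideal circle of radius r (not a value from the dataset).
r = 3.0
circle_value = compactness(2 * math.pi * r, math.pi * r ** 2)
print(circle_value)
```

For a circle the ratio perimeter^2 / area equals 4 * pi regardless of the radius, so the sketch prints roughly 11.57. Under this formula a circle is the most compact shape possible, so any less regular outline with the same area gives a larger value.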

### Relevant Papers   

W.N. Street, W.H. Wolberg and O.L. Mangasarian. Nuclear feature extraction for breast tumor diagnosis. IS&amp;T/SPIE 1993 International Symposium on Electronic Imaging: Science and Technology, volume 1905, pages 861-870, San Jose, CA, 1993. 

O.L. Mangasarian, W.N. Street and W.H. Wolberg. Breast cancer diagnosis and prognosis via linear programming. Operations Research, 43(4), pages 570-577, July-August 1995.</oml:description>
  <oml:description_version>3</oml:description_version>
  <oml:format>ARFF</oml:format>
  <oml:upload_date>2015-05-26T16:24:07</oml:upload_date>
  <oml:licence>Public</oml:licence>
  <oml:url>https://openml.org/data/v1/download/1592318/wdbc.arff</oml:url>
  <oml:parquet_url>https://data.openml.org/datasets/0000/1510/dataset_1510.pq</oml:parquet_url>
  <oml:file_id>1592318</oml:file_id>
  <oml:default_target_attribute>Class</oml:default_target_attribute>
  <oml:tag>Biology</oml:tag>
  <oml:tag>cancer</oml:tag>
  <oml:tag>Health</oml:tag>
  <oml:tag>medical</oml:tag>
  <oml:tag>Medicine</oml:tag>
  <oml:tag>OpenML-CC18</oml:tag>
  <oml:tag>OpenML100</oml:tag>
  <oml:tag>Research</oml:tag>
  <oml:tag>study_123</oml:tag>
  <oml:tag>study_135</oml:tag>
  <oml:tag>study_14</oml:tag>
  <oml:tag>study_52</oml:tag>
  <oml:tag>study_7</oml:tag>
  <oml:tag>study_98</oml:tag>
  <oml:tag>study_99</oml:tag>
  <oml:tag>uci</oml:tag>
  <oml:visibility>public</oml:visibility>
  <oml:status>active</oml:status>
  <oml:processing_date>2018-10-03 21:41:34</oml:processing_date>
  <oml:md5_checksum>7aa183d3657e364911ced0cbd6b272bd</oml:md5_checksum>
</oml:data_set_description>
