active 2016-01-29T20:03:14Z

**Author**: Laboratory of Image Processing and Pattern Recognition (INPG-LTIRF), Grenoble - France.
**Source**: [ELENA project](https://www.elen.ucl.ac.be/neural-nets/Research/Projects/ELENA/databases/REAL/texture/)
**Please cite**: None

#### 1. Summary

This database was generated by the Laboratory of Image Processing and Pattern Recognition (INPG-LTIRF) in the development of the Esprit project ELENA No. 6891 and the Esprit working group ATHOS No. 6620.

```
(a) Original source:

    P. Brodatz "Textures: A Photographic Album for Artists and Designers",
    Dover Publications, Inc., New York, 1966.

(b) Creation: Laboratory of Image Processing and Pattern Recognition

    Institut National Polytechnique de Grenoble INPG
    Laboratoire de Traitement d'Image et de Reconnaissance de Formes LTIRF
    Av. Felix Viallet, 46
    F-38031 Grenoble Cedex
    France

(c) Contact: Dr. A. Guerin-Dugue, INPG-LTIRF, guerin@tirf.inpg.fr
```

#### 2. Past Usage:

This database has been used privately at the TIRF laboratory. It was created to study texture discrimination with high order statistics.

```
A. Guerin-Dugue, C. Aviles-Cruz, "High Order Statistics from Natural
Textured Images", In ATHOS workshop on System Identification and High
Order Statistics, Sophia-Antipolis, France, September 1993.

Guerin-Dugue, A. and others, Deliverable R3-B4-P - Task B4: Benchmarks,
Technical report, Elena-NervesII "Enhanced Learning for Evolutive Neural
Architecture", ESPRIT-Basic Research Project Number 6891, June 1995.
```

#### 3. Relevant Information:

The aim is to distinguish between 11 different textures (Grass lawn, Pressed calf leather, Handmade paper, Raffia looped to a high pile, Cotton canvas, ...), each pattern (pixel) being characterised by 40 attributes built by the estimation of fourth order modified moments in four orientations: 0, 45, 90 and 135 degrees.

A statistical method based on the extraction of fourth order moments, called "fourth order modified moments" (mm4) [Guerin93], was developed for the characterization of natural micro-textures. It measures, for each texture, the deviation from a first-order Gauss-Markov process. The features were estimated in four directions to take into account the possible orientations of the textures (0, 45, 90 and 135 degrees). Only correlations between the current pixel, the first neighbourhood and the second neighbourhood are taken into account. This small neighbourhood is adapted to the fine grain property of the textures.

The data set contains 11 classes of 500 instances, and each class refers to a type of texture in the Brodatz album. The database dimension is 40 plus one for the class label. The 40 attributes were built by estimating the following fourth order modified moments in four orientations (0, 45, 90 and 135 degrees): mm4(000), mm4(001), mm4(002), mm4(011), mm4(012), mm4(022), mm4(111), mm4(112), mm4(122) and mm4(222).

!! Patterns are always sorted by class and are presented in increasing order of their class label in each dataset relative to the texture database (texture.dat, texture_CR.dat, texture_PCA.dat, texture_DFA.dat)

#### 4. Class:

The class label is a code for the following classes:

```
Class label   Texture
     2        Grass lawn (D09)
     3        Pressed calf leather (D24)
     4        Handmade paper (D57)
     6        Raffia looped to a high pile (D84)
     7        Cotton canvas (D77)
     8        Pigskin (D92)
     9        Beach sand (D28)
    10        Beach sand (D29)
    12        Oriental straw cloth (D53)
    13        Oriental straw cloth (D78)
    14        Oriental grass fiber cloth (D79)
```

#### 5. Summary Statistics:

The table below gives, for each attribute of the database, the dynamic range (Min and Max values), the mean value and the standard deviation.

```
Attribute      Min       Max      Mean   Standard deviation
     1      -1.4495    0.7741   -1.0983   0.2034
     2      -1.2004    0.3297   -0.5867   0.2055
     3      -1.3099    0.3441   -0.5838   0.3135
     4      -1.1104    0.5878   -0.4046   0.2302
     5      -1.0534    0.4387   -0.3307   0.2360
     6      -1.0029    0.4515   -0.2422   0.2225
     7      -1.2076    0.5246   -0.6026   0.2003
     8      -1.0799    0.3980   -0.4322   0.2210
     9      -1.0570    0.4369   -0.3317   0.2361
    10      -1.2580    0.3546   -0.5978   0.3268
    11      -1.4495    0.7741   -1.0983   0.2034
    12      -1.0831    0.3715   -0.5929   0.2056
    13      -1.1194    0.6347   -0.4019   0.3368
    14      -1.0182    0.1573   -0.6270   0.1390
    15      -0.9435    0.1642   -0.4482   0.1952
    16      -0.9944    0.0357   -0.5763   0.1587
    17      -1.1722    0.0201   -0.7331   0.1955
    18      -1.0174    0.1155   -0.4919   0.2335
    19      -1.0044    0.0833   -0.4727   0.2257
    20      -1.1800    0.4392   -0.4831   0.3484
    21      -1.4495    0.7741   -1.0983   0.2034
    22      -1.2275    0.5963   -0.7363   0.2220
    23      -1.3412    0.4464   -0.7771   0.3290
    24      -1.1774    0.6882   -0.5770   0.2646
    25      -1.1369    0.4098   -0.5085   0.2538
    26      -1.1099    0.3725   -0.4038   0.2515
    27      -1.2393    0.6120   -0.7279   0.2278
    28      -1.1540    0.4221   -0.5863   0.2446
    29      -1.1323    0.3916   -0.5090   0.2526
    30      -1.4224    0.4718   -0.7708   0.3264
    31      -1.4495    0.7741   -1.0983   0.2034
    32      -1.1789    0.5647   -0.6463   0.1890
    33      -1.1473    0.6755   -0.4919   0.3304
    34      -1.1228    0.3132   -0.6435   0.1441
    35      -1.0145    0.3396   -0.4918   0.1922
    36      -1.0298    0.1560   -0.5934   0.1704
    37      -1.2534    0.0899   -0.7795   0.1641
    38      -1.0966    0.1944   -0.5541   0.2111
    39      -1.0765    0.2019   -0.5230   0.2015
    40      -1.2155    0.4647   -0.5677   0.3091
```

The attribute values lie in the range [-1.45, 0.775].

The database resulting from the centering and reduction by attribute of the Texture database is on the ftp server in the `REAL/texture/texture_CR.dat.Z` file.

#### 6. Confusion matrix.

The following confusion matrix of the k-NN classifier was obtained with a leave-one-out error counting method on the texture_CR.dat database. k was set to 1, the value that gives the minimum mean error rate: 1.0 +/- 0.8%.

```
Class      2      3      4      6      7      8      9     10     12     13     14
    2   97.0    1.0    0.4    0.0    0.0    0.0    1.6    0.0    0.0    0.0    0.0
    3    0.2   99.0    0.0    0.0    0.0    0.0    0.4    0.0    0.0    0.0    0.4
    4    1.0    0.0   98.8    0.0    0.0    0.0    0.2    0.0    0.0    0.0    0.0
    6    0.0    0.0    0.0   99.4    0.0    0.0    0.0    0.6    0.0    0.0    0.0
    7    0.0    0.0    0.0    0.0  100.0    0.0    0.0    0.0    0.0    0.0    0.0
    8    0.0    0.0    0.0    0.0    0.0   98.6    0.0    1.4    0.0    0.0    0.0
    9    0.4    0.0    0.2    0.0    0.0    0.2   98.8    0.4    0.0    0.0    0.0
   10    0.0    0.0    0.0    0.0    0.0    1.4    0.0   98.6    0.0    0.0    0.0
   12    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0  100.0    0.0    0.0
   13    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0   99.8    0.2
   14    0.0    0.4    0.0    0.0    0.0    0.4    0.0    0.0    0.2    0.0   99.0
```

#### 7. Result of the Principal Component Analysis:

Principal Component Analysis (PCA) is a classical method in pattern recognition [Duda73]. PCA reduces the sample dimension linearly, giving the best representation in lower dimensions while keeping the maximum of inertia. The best axis for representation is, however, not necessarily the best axis for discrimination. After PCA, features are selected according to the percentage of initial inertia covered by the different axes, and the number of features is determined by the percentage of initial inertia to keep for the classification process.

This selection method has been applied to the texture_CR database. When quasi-linear correlations exist between some initial features, these redundant dimensions are removed by PCA, and this preprocessing is then recommended. In this case, before PCA, the determinant of the data covariance matrix is near zero; the database is thus badly conditioned for all processes that use this information (the quadratic classifier, for example).
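
The relation between eigenvalues and inertia percentages can be checked by hand. The following is a minimal sketch, not the procedure used by the authors: it takes the first six eigenvalues reported for the texture_CR database in this section and assumes that the total inertia equals 40 (one unit of variance per attribute after centering and reduction). The helper name `components_needed` is hypothetical.

```python
from itertools import accumulate

# First six eigenvalues as reported in the eigenvalue table of this section.
eigenvalues = [30.2675, 3.65125, 2.29370, 1.70397, 0.671654, 0.501529]

# Assumption: after centering and reduction each of the 40 attributes has
# unit variance, so the sum of all 40 eigenvalues (total inertia) is 40.
total_inertia = 40.0

# Inertia percentage of each axis and its running (cumulated) sum.
inertia_pct = [100.0 * ev / total_inertia for ev in eigenvalues]
cumulated_pct = list(accumulate(inertia_pct))


def components_needed(cumulated, target_pct):
    """Smallest number of leading components whose cumulated inertia
    reaches target_pct, or None if the listed components do not reach it."""
    for k, value in enumerate(cumulated, start=1):
        if value >= target_pct:
            return k
    return None


for k, (pct, cum) in enumerate(zip(inertia_pct, cumulated_pct), start=1):
    print(f"{k:2d}  {pct:8.4f}  {cum:8.4f}")

print(components_needed(cumulated_pct, 90.0))
```

Under that assumption the computed percentages agree with the reported ones to the printed precision, and keeping a chosen share of the inertia reduces to reading off the first row whose cumulated value reaches the target.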

The following file is available for the texture database: "texture_PCA.dat.Z". It is the projection of the "texture_CR" database on its principal components, sorted in decreasing order of the related inertia percentage. To work on the database projected on its first x principal components, keep only the first x attributes of the texture_PCA.dat database and the class labels (last attribute).

The table below gives the inertia percentages associated with the eigenvalues of the principal component axes, sorted in decreasing order of the associated inertia percentage. 99.85 percent of the total database inertia remains if the first 20 principal components are kept.

```
Eigenvalue   Value          Inertia         Cumulated
                            percentage      inertia
     1       30.267500000   75.6687000000   75.6687000000
     2       3.6512500000   9.1281300000    84.7969000000
     3       2.2937000000   5.7342400000    90.5311000000
     4       1.7039700000   4.2599300000    94.7910000000
     5       0.6716540000   1.6791300000    96.4702000000
     6       0.5015290000   1.2538200000    97.7240000000
     7       0.1922830000   0.4807070000    98.2047000000
     8       0.1561070000   0.3902670000    98.5950000000
     9       0.1099570000   0.2748920000    98.8699000000
    10       0.0890891000   0.2227230000    99.0926000000
    11       0.0656016000   0.1640040000    99.2566000000
    12       0.0489988000   0.1224970000    99.3791000000
    13       0.0433819000   0.1084550000    99.4875000000
    14       0.0345022000   0.0862554000    99.5738000000
    15       0.0299203000   0.0748007000    99.6486000000
    16       0.0248857000   0.0622141000    99.7108000000
    17       0.0167901000   0.0419752000    99.7528000000
    18       0.0161633000   0.0404083000    99.7932000000
    19       0.0128898000   0.0322246000    99.8254000000
    20       0.0113884000   0.0284710000    99.8539000000
    21       0.0078481400   0.0196204000    99.8735000000
    22       0.0071527800   0.0178820000    99.8914000000
    23       0.0067661400   0.0169153000    99.9083000000
    24       0.0053149500   0.0132874000    99.9216000000
    25       0.0051102600   0.0127757000    99.9344000000
    26       0.0047116600   0.0117792000    99.9461000000
    27       0.0036193700   0.0090484300    99.9552000000
    28       0.0033222000   0.0083054900    99.9635000000
    29       0.0030722400   0.0076806100    99.9712000000
    30       0.0026373300   0.0065933300    99.9778000000
    31       0.0020996800   0.0052492000    99.9830000000
    32       0.0019376500   0.0048441200    99.9879000000
    33       0.0015642300   0.0039105700    99.9918000000
    34       0.0009679080   0.0024197700    99.9942000000
    35       0.0009578000   0.0023945000    99.9966000000
    36       0.0007379780   0.0018449400    99.9984000000
    37       0.0006280250   0.0015700600    100.000000000
    38       0.0000000040   0.0000000099    100.000000000
    39       0.0000000001   0.0000000003    100.000000000
    40       0.0000000008   0.0000000019    100.000000000
```

This matrix can be found in the texture_EV.dat file.

Discriminant Factorial Analysis (DFA) can be applied to a learning database where each learning sample belongs to a particular class [Duda73]. The number of discriminant features selected by DFA is determined by the number of classes (c) and the number of input dimensions (d): it is equal to the minimum of d and c-1. In the usual case where d is greater than c, the output dimension is equal to the number of classes minus one, and the discriminant axes are selected to maximize the between-variance and to minimize the within-variance of the classes.

The discrimination power (ratio of the projected between-variance over the projected within-variance) is not the same for each discriminant axis: this ratio decreases from one axis to the next. So for a problem with many classes, this preprocessing will not always be efficient, as the last output features will not be so discriminant. This analysis uses the inverse of the global covariance matrix, so the covariance matrix must be well conditioned (for example, a preliminary PCA must be applied to remove the linearly correlated dimensions).
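
The two quantities discussed above, the output dimension min(d, c-1) and the discrimination power of an axis, can be illustrated with a minimal pure-Python sketch. It does not compute the discriminant axes themselves: it only evaluates the ratio for a feature that is already one-dimensional, using unnormalised between-class and within-class sums of squares. The function names `dfa_output_dimension` and `discrimination_power` are hypothetical, and the toy values are made up for illustration; they are not taken from the texture database.

```python
from collections import defaultdict


def dfa_output_dimension(d, c):
    """Number of discriminant features: the minimum of d and c - 1."""
    return min(d, c - 1)


def discrimination_power(values, labels):
    """Between-class over within-class sum of squares of a 1-D feature.

    Assumes at least one class has non-zero spread (within > 0).
    """
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for members in groups.values():
        class_mean = sum(members) / len(members)
        # Spread of the class means around the overall mean.
        between += len(members) * (class_mean - overall_mean) ** 2
        # Spread of the samples around their own class mean.
        within += sum((v - class_mean) ** 2 for v in members)
    return between / within


# Texture database as described in section 3: 40 attributes, 11 classes.
print(dfa_output_dimension(40, 11))

# Made-up toy data: two classes, four samples.
toy_labels = [2, 2, 3, 3]
well_separated = [0.0, 1.0, 4.0, 5.0]
poorly_separated = [0.0, 4.0, 1.0, 5.0]
print(discrimination_power(well_separated, toy_labels))
print(discrimination_power(poorly_separated, toy_labels))
```

A feature whose class means are far apart relative to the spread inside each class gets a high ratio; the later discriminant axes of a many-class problem behave more like the second toy feature.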

Discriminant Factorial Analysis (DFA) has been applied to the first 18 principal components of the texture_PCA database (that is, keeping only the first 18 attributes of this database before applying the DFA preprocessing) in order to build the texture_DFA.dat.Z database file, which has 10 dimensions (the texture database having 11 classes). In the case of the texture database, experiments showed that DFA preprocessing is very useful and most of the time improved classifier performance.

[Duda73] Duda, R.O. and Hart, P.E., Pattern Classification and Scene Analysis, John Wiley & Sons, 1973.

Class 18836 https://www.openml.org/data/download/4535764/phpBDgUyY ARFF 0 11 1 texture Public public texture 2016-01-29T20:03:14Z 1