% 1. Title: Hayes-Roth & Hayes-Roth (1977) Database
%
% 2. Source Information:
%    (a) Creators: Barbara and Frederick Hayes-Roth
%    (b) Donor: David W. Aha (aha@ics.uci.edu) (714) 856-8779
%    (c) Date: March, 1989
%
% 3. Past Usage:
%    1. Hayes-Roth, B., & Hayes-Roth, F. (1977). Concept learning and the
%       recognition and classification of exemplars. Journal of Verbal Learning
%       and Verbal Behavior, 16, 321-338.
%       -- Results:
%          -- Human subjects' classification and recognition performance:
%             1. decreases with distance from the prototype,
%             2. is better on unseen prototypes than old instances, and
%             3. improves with presentation frequency during learning.
%    2. Anderson, J.R., & Kline, P.J. (1979). A learning system and its
%       psychological implications. In Proceedings of the Sixth International
%       Joint Conference on Artificial Intelligence (pp. 16-21). Tokyo, Japan:
%       Morgan Kaufmann.
%       -- Partitioned the results into 4 classes:
%          1. prototypes
%          2. near-prototypes with high presentation frequency during learning
%          3. near-prototypes with low presentation frequency during learning
%          4. instances that are far from prototypes
%       -- Described evidence that ACT's classification confidence and
%          recognition behaviors closely simulated human subjects' behaviors.
%    3. Aha, D.W. (1989). Incremental learning of independent, overlapping, and
%       graded concept descriptions with an instance-based process framework.
%       Manuscript submitted for publication.
%       -- Used same partition as Anderson & Kline
%       -- Described evidence that Bloom's classification confidence behavior
%          is similar to the human subjects' behavior. Bloom fitted the data
%          more closely than did ACT.
%
% 4. Relevant Information:
%    This database contains 5 numeric-valued attributes. Only a subset of
%    3 are used during testing (the latter 3). Furthermore, only 2 of the
%    3 concepts are "used" during testing (i.e., those with the prototypes
%    000 and 111). I've mapped all values to their zero-indexing equivalents.
%
%    Some instances could be placed in either category 0 or 1. I've followed
%    the authors' suggestion, placing them in each category with equal
%    probability.
%
%    I've replaced the actual values of the attributes (e.g., hobby has values
%    chess, sports and stamps) with numeric values. I think this is how
%    the authors did this when testing the categorization models described
%    in the paper. I find this unfair. While the subjects were able to bring
%    background knowledge to bear on the attribute values and their
%    relationships, the algorithms were provided with no such knowledge. I'm
%    uncertain whether the 2 distractor attributes (name and hobby) are
%    presented to the authors' algorithms during testing. However, it is clear
%    that only the age, educational status, and marital status attributes are
%    given during the human subjects' transfer tests.
%
% 5. Number of Instances: 132 training instances, 28 test instances
%
% 6. Number of Attributes: 5 plus the class membership attribute. 3 concepts.
%
% 7. Attribute Information:
%    -- 1. name: distinct for each instance and represented numerically
%    -- 2. hobby: nominal values ranging between 1 and 3
%    -- 3. age: nominal values ranging between 1 and 4
%    -- 4. educational level: nominal values ranging between 1 and 4
%    -- 5. marital status: nominal values ranging between 1 and 4
%    -- 6. class: nominal value between 1 and 3
%
% 8. Missing Attribute Values: none
%
% 9. Class Distribution: see below
%
% 10. Detailed description of the experiment:
%    1. 3 categories (1, 2, and neither -- which I call 3)
%       -- some of the instances could be classified in either class 1 or 2, and
%          they have been evenly distributed between the two classes
%    2. 5 Attributes
%       -- A. name (a randomly-generated number between 1 and 132)
%       -- B. hobby (a randomly-generated number between 1 and 3)
%       -- C. age (a number between 1 and 4)
%       -- D. education level (a number between 1 and 4)
%       -- E. marital status (a number between 1 and 4)
%    3. Classification:
%       -- only attributes C-E are diagnostic; values for A and B are ignored
%       -- Class Neither: if a 4 occurs for any attribute C-E
%       -- Class 1: Otherwise, if (# of 1's)>(# of 2's) for attributes C-E
%       -- Class 2: Otherwise, if (# of 2's)>(# of 1's) for attributes C-E
%       -- Either 1 or 2: Otherwise, if (# of 2's)=(# of 1's) for attributes C-E
%    4. Prototypes:
%       -- Class 1: 111
%       -- Class 2: 222
%       -- Class Either: 333
%       -- Class Neither: 444
%    5. Number of training instances: 132
%       -- Each instance presented 0, 1, or 10 times
%       -- None of the prototypes are seen during training
%       -- 3 instances from each of categories 1, 2, and either are repeated
%          10 times each
%       -- 3 additional instances from the Either category are shown during
%          learning
%    6. Number of test instances: 28
%       -- All 9 class 1
%       -- All 9 class 2
%       -- All 6 class Either
%       -- All 4 prototypes
%       --------------------
%       -- 28 total
%
% Observations of interest:
%    1. Relative classification confidence of
%       -- prototypes for classes 1 and 2 (2 instances)
%          (Anderson calls these Class 1 instances)
%       -- instances of class 1 with frequency 10 during training and
%          instances of class 2 with frequency 10 during training that
%          are 1 value away from their respective prototypes (6 instances)
%          (Anderson calls these Class 2 instances)
%       -- instances of class 1 with frequency 1 during training and
%          instances of class 2 with frequency 1 during training that
%          are 1 value away from their respective prototypes (6 instances)
%          (Anderson calls these Class 3 instances)
%       -- instances of class 1 with frequency 1 during training and
%          instances of class 2 with frequency 1 during training that
%          are 2 values away from their respective prototypes (6 instances)
%          (Anderson calls these Class 4 instances)
%    2. Relative recognition performance on the same instances
%
% Some Expected results:
%    Both frequency and distance from prototype will affect the classification
%    accuracy of instances.
%    The greater the frequency, the higher the classification
%    confidence. The closer to the prototype, the higher the classification
%    confidence.
%
% Information about the dataset
% CLASSTYPE: nominal
% CLASSINDEX: last
%

@relation hayes-roth

@attribute 'hobby' INTEGER
@attribute 'age' INTEGER
@attribute 'educational_level' INTEGER
@attribute 'marital_status' INTEGER
@attribute 'class' {1,2,3,4}

@data
1,1,1,2,1
1,1,2,1,1
1,2,1,1,1
1,1,1,3,1
1,1,3,1,1
1,3,1,1,1
1,1,3,3,1
1,3,1,3,1
1,3,3,1,1
1,2,2,1,2
1,2,1,2,2
1,1,2,2,2
1,2,2,3,2
1,2,3,2,2
1,3,2,2,2
1,2,3,3,2
1,3,2,3,2
1,3,3,2,2
1,1,3,2,1
1,3,2,1,2
1,2,1,3,1
1,2,3,1,2
1,1,2,3,1
1,3,1,2,2
1,1,1,1,1
1,2,2,2,2
1,3,3,3,1
1,4,4,4,3