%
% 1. Title: INDUCE Trains Data set
%
% 2. Sources:
%    - Donor: GMU, Center for AI, Software Librarian,
%             Eric E. Bloedorn (bloedorn@aic.gmu.edu)
%    - Original owners: Ryszard S. Michalski (michalski@aic.gmu.edu)
%      and Robert Stepp
%    - Date received: 1 June 1994
%    - Date updated: 24 June 1994 (Thanks to Larry Holder (UT Arlington)
%                    for noticing a translation error)
%
% 3. Past usage:
%    - This set most closely resembles the data sets described in the following
%      two publications:
%      1. R.S. Michalski and J.B. Larson "Inductive Inference of VL
%         Decision Rules" In Proceedings of the Workshop in Pattern-Directed
%         Inference Systems, Hawaii, May 1977. Also published in SIGART
%         Newsletter, ACM No. 63, pp. 38-44, June 1977.
%      2. Stepp, R.E. and Michalski, R.S. "Conceptual Clustering: Inventing
%         Goal-Oriented Classifications of Structured Objects" In
%         R.S. Michalski, J.G. Carbonell, and T.M. Mitchell (Eds.) "Machine
%         Learning: An Artificial Intelligence Approach, Volume II". Los
%         Altos, Ca: Morgan Kaufmann.
%
%    Both of these papers describe a set of 10 trains, 5 east-bound and
%    5 west-bound. Both refer to the same 10 trains, as seen by the figures
%    in these publications. The differences are:
%      1) This dataset has 10 attributes, with no wheel color or load color
%         attributes.
%      2) Reference 2 (Stepp, Michalski): does not completely list the
%         attributes used, but does mention wheel color - an attribute not
%         present in this dataset.
%      3) Reference 1 (Michalski, Larson): 12 attributes mentioned, but only 6
%         are explicitly described. These 6 are included in the dataset below
%         and the Stepp and Michalski set.
%
%    Results:
%    [1] Michalski and Larson found the following decision rules:
%        (1) There exists car1, car2, lod1 and lod2 such that
%            [infront(car1, car2)][lcont(car1, lod1)][lcont(car2, lod2)]
%            [load-shape(lod1)=triangle][load-shape(lod2)=polygon]=>[dir=east]
%        (2) There exists a car1 such that
%            [ln(car1)=short][car-shape(car1)=closed-top]=>[dir=east]
%        (3) [ncar=3] v There exists car1 such that
%            [car-shape(car1)=jagged-top]=>[dir=west]
%        (4) There exists car1 such that
%            [#cars(ln=long)=2][cshape(car1)=open,trapezoid,u-shaped] v
%            [location(car1)=2][cshape(car1)=closed, rectangle]=>[dir=west]
%            (The first selector in rule 4 uses a meta descriptor generated by
%            the program that counts the number of long cars in a train)
%    [2] The goal of the cluster research is to develop a general method
%        for clustering structured objects that can generate conjunctive
%        descriptions that occur in human classifications or invent new
%        concepts that have similar appeal. CLUSTER/S was able to find the
%        following cognitively appealing clusters:
%        1) a) "There are two different car shapes in the train"
%           b) "There are three or more different car shapes in the train"
%        2) a) "Wheels on all cars have the same color"
%           b) "Wheels on all cars do not have the same color"
%
% 4. Relevant information:
%    - Additional "background" knowledge is supplied that provides a partial
%      ordering on some of the attribute values.
%    - We are providing this dataset both in its original form and in a form
%      similar to the more typical propositional datasets in our repository.
%      Since the trains dataset records relations between attributes, this
%      transformation was somewhat challenging. However, it may offer some
%      insight into this problem for people who are more familiar with the
%      simple one-instance-per-line dataset format.
%    - Hierarchy of values:
%        if (cshape is one of {openrect,opentrap,ushaped,dblopnrect})
%           then cshape is opentop
%        if (cshape is one of {hexagon,ellipse,closedrect,jaggedtop,slopetop,
%           engine})
%           then cshape is closedtop
%    - Prediction task: Determine concise decision rules distinguishing
%      trains traveling east from those traveling west.
%
% 5. Number of instances: 10
%
% 6. Number of attributes:
%    - 10, not including the class attribute
%       1. ccont(train idx1, car idx2): car idx2 is contained in train idx1
%       2. ncar(train idx): # of cars in train idx (int)
%       3. infront(car idx1, car idx2): relative positions of cars in train
%       4. loc(car idx): absolute position of car in train (int)
%       5. nwhl(car idx): # of wheels of car idx (int)
%       6. ln(car idx): length of car idx (long, short)
%       7. cshape(car idx): shape of car (engine, dblopenrect,
%                           closedrect, openrect, opentrap, ushaped,
%                           hexagon, ellipse, jaggedtop, slopetop,
%                           opentop, closedtop)
%       8. npl(car idx): number of loads in car idx
%       9. lcont(car idx, load idx): description of which cars hold which loads
%      10. lhshape(load idx): description of load shape (trianglod,
%                             rectanglod, circlelod, hexagonlod)
%      Class: direction (east, west)
%
%    The following format was used for the "transformed" dataset representation
%    as found in trains.transformed.data (one instance per line):
%
%    Attributes: 33
%      1. Number_of_cars (integer in [3-5])
%      2. Number_of_different_loads (integer in [1-4])
%      3-22: 5 attributes for each of cars 2 through 5: (20 attributes total)
%        - num_wheels (integer in [2-3])
%        - length (short or long)
%        - shape (closedrect, dblopnrect, ellipse, engine, hexagon,
%                 jaggedtop, openrect, opentrap, slopetop, ushaped)
%        - num_loads (integer in [0-3])
%        - load_shape (circlelod, hexagonlod, rectanglod, trianglod)
%      23-32: 10 Boolean attributes describing whether 2 types of loads are on
%             adjacent cars of the train
%        - Rectangle_next_to_rectangle (0 if false, 1 if true)
%        - Rectangle_next_to_triangle (0 if false, 1 if true)
%        - Rectangle_next_to_hexagon (0 if false, 1 if true)
%        - Rectangle_next_to_circle (0 if false, 1 if true)
%        - Triangle_next_to_triangle (0 if false, 1 if true)
%        - Triangle_next_to_hexagon (0 if false, 1 if true)
%        - Triangle_next_to_circle (0 if false, 1 if true)
%        - Hexagon_next_to_hexagon (0 if false, 1 if true)
%        - Hexagon_next_to_circle (0 if false, 1 if true)
%        - Circle_next_to_circle (0 if false, 1 if true)
%      33. Class attribute (east or west)
%
%    The number of cars varies between 3 and 5. Therefore, attributes referring
%    to properties of cars that do not exist (such as the 5 attributes for
%    the "5th" car when the train has fewer than 5 cars) are assigned a value
%    of "-" in the original transformed file. In the data section below they
%    appear as "?", the ARFF missing-value marker.
%
% 7. Distribution of classes:
%    - There are 5 east-bound trains and 5 west-bound trains
%      (i.e., 50% east, 50% west)
%
%
% Information about the dataset
% CLASSTYPE: nominal
% CLASSINDEX: last
%
@relation trains
@attribute 'Number_of_cars' {3,4,5}
@attribute 'Number_of_different_loads' {1,2,3,4}
@attribute 'num_wheels_2' {2,3}
@attribute 'length_2' {long,short}
@attribute 'shape_2' {closedrect,dblopnrect,openrect,opentrap,ushaped}
@attribute 'num_loads_2' {1,3}
@attribute 'load_shape_2' {circlelod,rectanglod,trianglod}
@attribute 'num_wheels_3' {2,3}
@attribute 'length_3' {long,short}
@attribute 'shape_3' {closedrect,dblopnrect,hexagon,jaggedtop,openrect,opentrap,slopetop,ushaped}
@attribute 'num_loads_3' {1,2}
@attribute 'load_shape_3' {circlelod,rectanglod,trianglod}
@attribute 'num_wheels_4' {2,3}
@attribute 'length_4' {long,short}
@attribute 'shape_4' {closedrect,ellipse,jaggedtop,openrect}
@attribute 'num_loads_4' {0,1,2}
@attribute 'load_shape_4' {circlelod,hexagonlod,rectanglod,trianglod}
@attribute 'num_wheels_5' {2}
@attribute 'length_5' {short}
@attribute 'shape_5' {openrect,opentrap}
@attribute 'num_loads_5' {1}
@attribute 'load_shape_5' {circlelod,rectanglod}
@attribute 'Rectangle_next_to_rectangle' {0,1}
@attribute 'Rectangle_next_to_triangle' {0,1}
@attribute 'Rectangle_next_to_hexagon' {0}
@attribute 'Rectangle_next_to_circle' {0,1}
@attribute 'Triangle_next_to_triangle' {0,1}
@attribute 'Triangle_next_to_hexagon' {0,1}
@attribute 'Triangle_next_to_circle' {0,1}
@attribute 'Hexagon_next_to_hexagon' {0}
@attribute 'Hexagon_next_to_circle' {0,1}
@attribute 'Circle_next_to_circle' {0}
@attribute 'class' {east,west}
@data
5,4,2,long,openrect,3,rectanglod,2,short,slopetop,1,trianglod,3,long,openrect,1,hexagonlod,2,short,openrect,1,circlelod,0,1,0,0,0,1,0,0,1,0,east
4,3,2,short,ushaped,1,trianglod,2,short,opentrap,1,rectanglod,2,short,closedrect,2,circlelod,?,?,?,?,?,0,1,0,1,0,0,0,0,0,0,east
4,2,2,short,openrect,1,circlelod,2,short,hexagon,1,trianglod,3,long,closedrect,1,trianglod,?,?,?,?,?,0,0,0,0,1,0,1,0,0,0,east
5,2,2,short,opentrap,1,trianglod,2,short,dblopnrect,1,trianglod,2,short,ellipse,1,rectanglod,2,short,openrect,1,rectanglod,1,1,0,0,1,0,0,0,0,0,east
4,3,2,short,dblopnrect,1,trianglod,3,long,closedrect,1,rectanglod,2,short,closedrect,1,circlelod,?,?,?,?,?,0,1,0,1,0,0,0,0,0,0,east
3,2,2,long,closedrect,3,circlelod,2,short,openrect,1,trianglod,?,?,?,?,?,?,?,?,?,?,0,0,0,0,0,0,1,0,0,0,west
4,2,2,short,dblopnrect,1,circlelod,2,short,ushaped,1,trianglod,2,long,jaggedtop,0,?,?,?,?,?,?,0,0,0,0,0,0,1,0,0,0,west
3,2,3,long,closedrect,1,rectanglod,2,short,ushaped,1,circlelod,?,?,?,?,?,?,?,?,?,?,0,0,0,1,0,0,0,0,0,0,west
5,2,2,short,opentrap,1,circlelod,2,long,jaggedtop,1,rectanglod,2,short,openrect,1,rectanglod,2,short,opentrap,1,circlelod,1,0,0,1,0,0,0,0,0,0,west
3,1,2,short,ushaped,1,rectanglod,2,long,openrect,2,rectanglod,?,?,?,?,?,?,?,?,?,?,1,0,0,0,0,0,0,0,0,0,west
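%
% A worked reading of the first data row, assuming the fields follow the
% 33-attribute layout described in the header (2 train-level attributes,
% 5 attributes for each of cars 2 through 5, 10 adjacency flags, class):
%
%   Number_of_cars = 5, Number_of_different_loads = 4
%   car 2: 2 wheels, long,  openrect, 3 loads, rectanglod
%   car 3: 2 wheels, short, slopetop, 1 load,  trianglod
%   car 4: 3 wheels, long,  openrect, 1 load,  hexagonlod
%   car 5: 2 wheels, short, openrect, 1 load,  circlelod
%   Rectangle_next_to_triangle = 1, Triangle_next_to_hexagon = 1,
%   Hexagon_next_to_circle = 1; the other seven adjacency flags are 0
%   class = east
%
% The three flags set to 1 match the load order rectangle, triangle,
% hexagon, circle on adjacent cars. In rows with Number_of_cars below 5,
% the fields of the missing cars hold "?".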