Data
AI4I2020

AI4I2020

active ARFF CC-BY Visibility: public Uploaded 19-05-2021 by Meilina Reksoprodjo
0 likes downloaded by 0 people , 0 total downloads 0 issues 0 downvotes
Issue #Downvotes for this reason By


Loading wiki
Help us complete this description Edit
The AI4I 2020 Predictive Maintenance Dataset is a synthetic dataset that reflects real predictive maintenance data encountered in industry. Since real predictive maintenance datasets are generally difficult to obtain and in particular difficult to publish, we present and provide a synthetic dataset that reflects real predictive maintenance encountered in industry to the best of our knowledge. ### Attribute Information: The dataset consists of 10 000 data points stored as rows with 14 features in columns - UID: unique identifier ranging from 1 to 10000 - product ID: consisting of a letter L, M, or H for low (50% of all products), medium (30%) and high (20%) as product quality variants and a variant-specific serial number - air temperature [K]: generated using a random walk process later normalized to a standard deviation of 2 K around 300 K - process temperature [K]: generated using a random walk process normalized to a standard deviation of 1 K, added to the air temperature plus 10 K. - rotational speed [rpm]: calculated from a power of 2860 W, overlaid with a normally distributed noise - torque [Nm]: torque values are normally distributed around 40 Nm with a f = 10 Nm and no negative values. - tool wear [min]: The quality variants H/M/L add 5/3/2 minutes of tool wear to the used tool in the process. - 'machine failure' label that indicates, whether the machine has failed in this particular datapoint for any of the following failure modes are true. The machine failure consists of five independent failure modes tool wear failure (TWF): the tool will be replaced of fail at a randomly selected tool wear time between 200-240 mins (120 times in our dataset). At this point in time, the tool is replaced 69 times, and fails 51 times (randomly assigned). - heat dissipation failure (HDF): heat dissipation causes a process failure, if the difference between air- and process temperature is below 8.6 K and the tools rotational speed is below 1380 rpm. This is the case for 115 data points. - power failure (PWF): the product of torque and rotational speed (in rad/s) equals the power required for the process. If this power is below 3500 W or above 9000 W, the process fails, which is the case 95 times in our dataset. - overstrain failure (OSF): if the product of tool wear and torque exceeds 11,000 minNm for the L product variant (12,000 M, 13,000 H), the process fails due to overstrain. This is true for 98 datapoints. - random failures (RNF): each process has a chance of 0,1 % to fail regardless of its process parameters. This is the case for only 5 datapoints, less than could be expected for 10,000 datapoints in our dataset. If at least one of the above failure modes is true, the process fails and the 'machine failure' label is set to 1. It is therefore not transparent to the machine learning method, which of the failure modes has caused the process to fail

14 features

UDInumeric10000 unique values
0 missing
Product IDstring10000 unique values
0 missing
Typestring3 unique values
0 missing
Air temperature [K]numeric93 unique values
0 missing
Process temperature [K]numeric82 unique values
0 missing
Rotational speed [rpm]numeric941 unique values
0 missing
Torque [Nm]numeric577 unique values
0 missing
Tool wear [min]numeric246 unique values
0 missing
Machine failurenumeric2 unique values
0 missing
TWFnumeric2 unique values
0 missing
HDFnumeric2 unique values
0 missing
PWFnumeric2 unique values
0 missing
OSFnumeric2 unique values
0 missing
RNFnumeric2 unique values
0 missing

19 properties

10000
Number of instances (rows) of the dataset.
14
Number of attributes (columns) of the dataset.
Number of distinct values of the target attribute (if it is nominal).
0
Number of missing values in the dataset.
0
Number of instances with at least one value missing.
12
Number of numeric attributes.
0
Number of nominal attributes.
0
Percentage of instances having missing values.
Average class difference between consecutive instances.
0
Percentage of missing values.
0
Number of attributes divided by the number of instances.
85.71
Percentage of numeric attributes.
Percentage of instances belonging to the most frequent class.
0
Percentage of nominal attributes.
Number of instances belonging to the most frequent class.
Percentage of instances belonging to the least frequent class.
Number of instances belonging to the least frequent class.
0
Number of binary attributes.
0
Percentage of binary attributes.

0 tasks

Define a new task