absenteeism-at-work

Status: active. Format: ARFF. License: Public Domain (CC0). Visibility: public. Uploaded 19-05-2021 by Meilina Reksoprodjo.

The database was created with records of absenteeism at work from July 2007 to July 2010 at a courier company in Brazil. The data set allows for several new combinations of attributes and attribute exclusions, or the modification of the attribute type (categorical, integer, or real), depending on the purpose of the research. The data set (Absenteeism at work - Part I) was used in academic research at the Universidade Nove de Julho - Postgraduate Program in Informatics and Knowledge Management.

### Attribute Information:

1. Individual identification (ID)
2. Reason for absence (ICD). Absences attested by the International Classification of Diseases (ICD) are stratified into 21 categories (I to XXI):
   - I: Certain infectious and parasitic diseases
   - II: Neoplasms
   - III: Diseases of the blood and blood-forming organs and certain disorders involving the immune mechanism
   - IV: Endocrine, nutritional and metabolic diseases
   - V: Mental and behavioural disorders
   - VI: Diseases of the nervous system
   - VII: Diseases of the eye and adnexa
   - VIII: Diseases of the ear and mastoid process
   - IX: Diseases of the circulatory system
   - X: Diseases of the respiratory system
   - XI: Diseases of the digestive system
   - XII: Diseases of the skin and subcutaneous tissue
   - XIII: Diseases of the musculoskeletal system and connective tissue
   - XIV: Diseases of the genitourinary system
   - XV: Pregnancy, childbirth and the puerperium
   - XVI: Certain conditions originating in the perinatal period
   - XVII: Congenital malformations, deformations and chromosomal abnormalities
   - XVIII: Symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified
   - XIX: Injury, poisoning and certain other consequences of external causes
   - XX: External causes of morbidity and mortality
   - XXI: Factors influencing health status and contact with health services

   Seven further categories have no ICD code: patient follow-up (22), medical consultation (23), blood donation (24), laboratory examination (25), unjustified absence (26), physiotherapy (27), dental consultation (28).
3. Month of absence
4. Day of the week (Monday (2), Tuesday (3), Wednesday (4), Thursday (5), Friday (6))
5. Seasons (summer (1), autumn (2), winter (3), spring (4))
6. Transportation expense
7. Distance from Residence to Work (kilometers)
8. Service time
9. Age
10. Work load Average/day
11. Hit target
12. Disciplinary failure (yes=1, no=0)
13. Education (high school (1), graduate (2), postgraduate (3), master and doctor (4))
14. Son (number of children)
15. Social drinker (yes=1, no=0)
16. Social smoker (yes=1, no=0)
17. Pet (number of pets)
18. Weight
19. Height
20. Body mass index
21. Absenteeism time in hours (target)
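Because several attributes are stored as integer codes, it can help to translate them back into labels before analysis. The following is a minimal sketch of such a decoder, with the code tables transcribed from the attribute description. It assumes that reason codes 1 to 21 correspond to ICD categories I to XXI in order, which the description implies but does not state outright. The function names and the sample record are hypothetical and are not part of the dataset.

```python
# Code tables transcribed from the attribute description.
DAY_OF_WEEK = {2: "Monday", 3: "Tuesday", 4: "Wednesday", 5: "Thursday", 6: "Friday"}
SEASONS = {1: "summer", 2: "autumn", 3: "winter", 4: "spring"}
EDUCATION = {1: "high school", 2: "graduate", 3: "postgraduate", 4: "master and doctor"}
NON_ICD_REASONS = {
    22: "patient follow-up",
    23: "medical consultation",
    24: "blood donation",
    25: "laboratory examination",
    26: "unjustified absence",
    27: "physiotherapy",
    28: "dental consultation",
}
ROMAN = [
    "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X", "XI",
    "XII", "XIII", "XIV", "XV", "XVI", "XVII", "XVIII", "XIX", "XX", "XXI",
]


def describe_reason(code: int) -> str:
    """Map a Reason_for_absence code to a readable label.

    Assumption: codes 1..21 map to ICD categories I..XXI in order.
    """
    if 1 <= code <= 21:
        return f"ICD category {ROMAN[code - 1]}"
    if code in NON_ICD_REASONS:
        return NON_ICD_REASONS[code]
    # Any code outside the documented range is reported rather than guessed.
    return f"undocumented reason code {code}"


def decode_record(record: dict) -> dict:
    """Return a copy of the record with coded attributes replaced by labels."""
    decoded = dict(record)
    decoded["Reason_for_absence"] = describe_reason(record["Reason_for_absence"])
    decoded["Day_of_the_week"] = DAY_OF_WEEK[record["Day_of_the_week"]]
    decoded["Seasons"] = SEASONS[record["Seasons"]]
    decoded["Education"] = EDUCATION[record["Education"]]
    return decoded


# Hypothetical record for illustration only; not taken from the dataset.
sample = {"Reason_for_absence": 23, "Day_of_the_week": 3, "Seasons": 1, "Education": 1}
print(decode_record(sample))
```

Keeping the tables as plain dictionaries makes it easy to swap in a different labelling if the research purpose calls for regrouping the categories.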

21 features

ID (nominal): 36 unique values, 0 missing
Reason_for_absence (nominal): 28 unique values, 0 missing
Month_of_absence (numeric): 13 unique values, 0 missing
Day_of_the_week (nominal): 5 unique values, 0 missing
Seasons (nominal): 4 unique values, 0 missing
Transportation_expense (numeric): 24 unique values, 0 missing
Distance_from_Residence_to_Work (numeric): 25 unique values, 0 missing
Service_time (numeric): 18 unique values, 0 missing
Age (numeric): 22 unique values, 0 missing
Work_load_Average/day_ (numeric): 38 unique values, 0 missing
Hit_target (numeric): 13 unique values, 0 missing
Disciplinary_failure (nominal): 2 unique values, 0 missing
Education (numeric): 4 unique values, 0 missing
Son (numeric): 5 unique values, 0 missing
Social_drinker (nominal): 2 unique values, 0 missing
Social_smoker (nominal): 2 unique values, 0 missing
Pet (numeric): 6 unique values, 0 missing
Weight (numeric): 26 unique values, 0 missing
Height (numeric): 14 unique values, 0 missing
Body_mass_index (numeric): 17 unique values, 0 missing
Absenteeism_time_in_hours (numeric): 19 unique values, 0 missing
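As a quick consistency check, the feature listing above can be tallied by declared type. The sketch below transcribes the names, types, and unique-value counts from the listing and counts how many features are numeric, nominal, or two-valued. The variable names are illustrative, and treating "binary" as "exactly two unique values" is an assumption about how the page defines it.

```python
from collections import Counter

# (name, declared type, number of unique values), transcribed from the listing.
FEATURES = [
    ("ID", "nominal", 36),
    ("Reason_for_absence", "nominal", 28),
    ("Month_of_absence", "numeric", 13),
    ("Day_of_the_week", "nominal", 5),
    ("Seasons", "nominal", 4),
    ("Transportation_expense", "numeric", 24),
    ("Distance_from_Residence_to_Work", "numeric", 25),
    ("Service_time", "numeric", 18),
    ("Age", "numeric", 22),
    ("Work_load_Average/day_", "numeric", 38),
    ("Hit_target", "numeric", 13),
    ("Disciplinary_failure", "nominal", 2),
    ("Education", "numeric", 4),
    ("Son", "numeric", 5),
    ("Social_drinker", "nominal", 2),
    ("Social_smoker", "nominal", 2),
    ("Pet", "numeric", 6),
    ("Weight", "numeric", 26),
    ("Height", "numeric", 14),
    ("Body_mass_index", "numeric", 17),
    ("Absenteeism_time_in_hours", "numeric", 19),
]

# Count features per declared type.
type_counts = Counter(kind for _, kind, _ in FEATURES)

# Assumption: a feature is "binary" when it has exactly two unique values.
binary_features = [name for name, _, n_unique in FEATURES if n_unique == 2]

print(len(FEATURES), dict(type_counts), binary_features)
```

Note that Education is declared numeric here even though the attribute description gives it four labelled levels, so a cast to a categorical type may be appropriate depending on the analysis.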

19 properties

Number of instances (rows) of the dataset: 740
Number of attributes (columns) of the dataset: 21
Number of distinct values of the target attribute (if it is nominal): not reported
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 14
Number of nominal attributes: 7
Percentage of binary attributes: 14.29
Percentage of instances having missing values: 0
Average class difference between consecutive instances: not reported
Percentage of missing values: 0
Number of attributes divided by the number of instances: 0.03
Percentage of numeric attributes: 66.67
Percentage of instances belonging to the most frequent class: not reported
Percentage of nominal attributes: 33.33
Number of instances belonging to the most frequent class: not reported
Percentage of instances belonging to the least frequent class: not reported
Number of instances belonging to the least frequent class: not reported
Number of binary attributes: 3
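The percentage and ratio properties follow directly from the reported counts. The short sketch below recomputes them from the instance and attribute counts given on this page; the helper name `pct` and the rounding to two decimals are assumptions chosen to match the displayed values, not a description of how the site computes them.

```python
# Counts as reported in the dataset properties.
n_instances = 740
n_attributes = 21
n_numeric = 14
n_nominal = 7
n_binary = 3


def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals (assumed)."""
    return round(100 * part / whole, 2)


pct_numeric = pct(n_numeric, n_attributes)
pct_nominal = pct(n_nominal, n_attributes)
pct_binary = pct(n_binary, n_attributes)

# Attributes divided by instances, rounded the same way.
dimensionality = round(n_attributes / n_instances, 2)

print(pct_numeric, pct_nominal, pct_binary, dimensionality)
```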

0 tasks
