house_prices_nominal


Status: active. Format: ARFF. Visibility: public. Uploaded 26-06-2020 by Marcos de Paula Bueno.
Author: Kaggle
Source: [original](https://www.kaggle.com/c/house-prices-advanced-regression-techniques) - 2011
Please cite: Dean De Cock (2011) Ames, Iowa: Alternative to the Boston Housing Data as an End of Semester Regression Project, Journal of Statistics Education, 19:3, DOI: 10.1080/10691898.2011.11889627

Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's dataset proves that much more influences price negotiations than the number of bedrooms or a white-picket fence. With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, this competition challenges you to predict the final price of each home.

SalePrice: The property's sale price in dollars. This is the target variable that you're trying to predict.
MSSubClass: The building class
MSZoning: The general zoning classification
LotFrontage: Linear feet of street connected to property
LotArea: Lot size in square feet
Street: Type of road access
Alley: Type of alley access
LotShape: General shape of property
LandContour: Flatness of the property
Utilities: Type of utilities available
LotConfig: Lot configuration
LandSlope: Slope of property
Neighborhood: Physical locations within Ames city limits
Condition1: Proximity to main road or railroad
Condition2: Proximity to main road or railroad (if a second is present)
BldgType: Type of dwelling
HouseStyle: Style of dwelling
OverallQual: Overall material and finish quality
OverallCond: Overall condition rating
YearBuilt: Original construction date
YearRemodAdd: Remodel date
RoofStyle: Type of roof
RoofMatl: Roof material
Exterior1st: Exterior covering on house
Exterior2nd: Exterior covering on house (if more than one material)
MasVnrType: Masonry veneer type
MasVnrArea: Masonry veneer area in square feet
ExterQual: Exterior material quality
ExterCond: Present condition of the material on the exterior
Foundation: Type of foundation
BsmtQual: Height of the basement
BsmtCond: General condition of the basement
BsmtExposure: Walkout or garden level basement walls
BsmtFinType1: Quality of basement finished area
BsmtFinSF1: Type 1 finished square feet
BsmtFinType2: Quality of second finished area (if present)
BsmtFinSF2: Type 2 finished square feet
BsmtUnfSF: Unfinished square feet of basement area
TotalBsmtSF: Total square feet of basement area
Heating: Type of heating
HeatingQC: Heating quality and condition
CentralAir: Central air conditioning
Electrical: Electrical system
1stFlrSF: First floor square feet
2ndFlrSF: Second floor square feet
LowQualFinSF: Low quality finished square feet (all floors)
GrLivArea: Above grade (ground) living area square feet
BsmtFullBath: Basement full bathrooms
BsmtHalfBath: Basement half bathrooms
FullBath: Full bathrooms above grade
HalfBath: Half baths above grade
BedroomAbvGr: Number of bedrooms above basement level
KitchenAbvGr: Number of kitchens
KitchenQual: Kitchen quality
TotRmsAbvGrd: Total rooms above grade (does not include bathrooms)
Functional: Home functionality rating
Fireplaces: Number of fireplaces
FireplaceQu: Fireplace quality
GarageType: Garage location
GarageYrBlt: Year garage was built
GarageFinish: Interior finish of the garage
GarageCars: Size of garage in car capacity
GarageArea: Size of garage in square feet
GarageQual: Garage quality
GarageCond: Garage condition
PavedDrive: Paved driveway
WoodDeckSF: Wood deck area in square feet
OpenPorchSF: Open porch area in square feet
EnclosedPorch: Enclosed porch area in square feet
3SsnPorch: Three season porch area in square feet
ScreenPorch: Screen porch area in square feet
PoolArea: Pool area in square feet
PoolQC: Pool quality
Fence: Fence quality
MiscFeature: Miscellaneous feature not covered in other categories
MiscVal: Dollar value of miscellaneous feature
MoSold: Month sold
YrSold: Year sold
SaleType: Type of sale
SaleCondition: Condition of sale

80 features

SalePrice (target): numeric, 663 unique values, 0 missing
Id (row identifier): numeric, 1460 unique values, 0 missing
MSSubClass: numeric, 15 unique values, 0 missing
MSZoning: nominal, 5 unique values, 0 missing
LotFrontage: numeric, 110 unique values, 259 missing
LotArea: numeric, 1073 unique values, 0 missing
Street: nominal, 2 unique values, 0 missing
Alley: nominal, 2 unique values, 1369 missing
LotShape: nominal, 4 unique values, 0 missing
LandContour: nominal, 4 unique values, 0 missing
Utilities: nominal, 2 unique values, 0 missing
LotConfig: nominal, 5 unique values, 0 missing
LandSlope: nominal, 3 unique values, 0 missing
Neighborhood: nominal, 25 unique values, 0 missing
Condition1: nominal, 9 unique values, 0 missing
Condition2: nominal, 8 unique values, 0 missing
BldgType: nominal, 5 unique values, 0 missing
HouseStyle: nominal, 8 unique values, 0 missing
OverallQual: numeric, 10 unique values, 0 missing
OverallCond: numeric, 9 unique values, 0 missing
YearBuilt: numeric, 112 unique values, 0 missing
YearRemodAdd: numeric, 61 unique values, 0 missing
RoofStyle: nominal, 6 unique values, 0 missing
RoofMatl: nominal, 8 unique values, 0 missing
Exterior1st: nominal, 15 unique values, 0 missing
Exterior2nd: nominal, 16 unique values, 0 missing
MasVnrType: nominal, 4 unique values, 8 missing
MasVnrArea: numeric, 327 unique values, 8 missing
ExterQual: nominal, 4 unique values, 0 missing
ExterCond: nominal, 5 unique values, 0 missing
Foundation: nominal, 6 unique values, 0 missing
BsmtQual: nominal, 4 unique values, 37 missing
BsmtCond: nominal, 4 unique values, 37 missing
BsmtExposure: nominal, 4 unique values, 38 missing
BsmtFinType1: nominal, 6 unique values, 37 missing
BsmtFinSF1: numeric, 637 unique values, 0 missing
BsmtFinType2: nominal, 6 unique values, 38 missing
BsmtFinSF2: numeric, 144 unique values, 0 missing
BsmtUnfSF: numeric, 780 unique values, 0 missing
TotalBsmtSF: numeric, 721 unique values, 0 missing
Heating: nominal, 6 unique values, 0 missing
HeatingQC: nominal, 5 unique values, 0 missing
CentralAir: nominal, 2 unique values, 0 missing
Electrical: nominal, 5 unique values, 1 missing
1stFlrSF: numeric, 753 unique values, 0 missing
2ndFlrSF: numeric, 417 unique values, 0 missing
LowQualFinSF: numeric, 24 unique values, 0 missing
GrLivArea: numeric, 861 unique values, 0 missing
BsmtFullBath: numeric, 4 unique values, 0 missing
BsmtHalfBath: numeric, 3 unique values, 0 missing
FullBath: numeric, 4 unique values, 0 missing
HalfBath: numeric, 3 unique values, 0 missing
BedroomAbvGr: numeric, 8 unique values, 0 missing
KitchenAbvGr: numeric, 4 unique values, 0 missing
KitchenQual: nominal, 4 unique values, 0 missing
TotRmsAbvGrd: numeric, 12 unique values, 0 missing
Functional: nominal, 7 unique values, 0 missing
Fireplaces: numeric, 4 unique values, 0 missing
FireplaceQu: nominal, 5 unique values, 690 missing
GarageType: nominal, 6 unique values, 81 missing
GarageYrBlt: numeric, 97 unique values, 81 missing
GarageFinish: nominal, 3 unique values, 81 missing
GarageCars: numeric, 5 unique values, 0 missing
GarageArea: numeric, 441 unique values, 0 missing
GarageQual: nominal, 5 unique values, 81 missing
GarageCond: nominal, 5 unique values, 81 missing
PavedDrive: nominal, 3 unique values, 0 missing
WoodDeckSF: numeric, 274 unique values, 0 missing
OpenPorchSF: numeric, 202 unique values, 0 missing
EnclosedPorch: numeric, 120 unique values, 0 missing
3SsnPorch: numeric, 20 unique values, 0 missing
ScreenPorch: numeric, 76 unique values, 0 missing
PoolArea: numeric, 8 unique values, 0 missing
PoolQC: nominal, 3 unique values, 1453 missing
Fence: nominal, 4 unique values, 1179 missing
MiscFeature: nominal, 4 unique values, 1406 missing
MiscVal: numeric, 21 unique values, 0 missing
MoSold: numeric, 12 unique values, 0 missing
YrSold: numeric, 5 unique values, 0 missing
SaleType: nominal, 9 unique values, 0 missing
SaleCondition: nominal, 6 unique values, 0 missing
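
The missing-value counts in the feature list can be turned into per-feature missing rates, which is often a first step when deciding how to treat incomplete columns. The following is a minimal sketch, not part of the original page: the counts are copied from the list above, while the function name `missing_rates` and the 50% cutoff are illustrative assumptions.

```python
# Missing-value counts copied from the feature list (features with 0 missing omitted).
N_ROWS = 1460  # number of instances reported for the dataset

missing_counts = {
    "LotFrontage": 259, "Alley": 1369, "MasVnrType": 8, "MasVnrArea": 8,
    "BsmtQual": 37, "BsmtCond": 37, "BsmtExposure": 38, "BsmtFinType1": 37,
    "BsmtFinType2": 38, "Electrical": 1, "FireplaceQu": 690,
    "GarageType": 81, "GarageYrBlt": 81, "GarageFinish": 81,
    "GarageQual": 81, "GarageCond": 81,
    "PoolQC": 1453, "Fence": 1179, "MiscFeature": 1406,
}


def missing_rates(counts, n_rows):
    """Return (feature, percent missing) pairs, most incomplete first."""
    rates = [(name, 100.0 * count / n_rows) for name, count in counts.items()]
    return sorted(rates, key=lambda pair: pair[1], reverse=True)


rates = missing_rates(missing_counts, N_ROWS)
total_missing = sum(missing_counts.values())

# Hypothetical cutoff: flag features that are missing in more than half the rows.
mostly_missing = [name for name, pct in rates if pct > 50.0]

for name, pct in rates:
    print(f"{name:<14}{pct:6.2f}% missing")
print("Total missing cells:", total_missing)
print("Missing in more than half the rows:", mostly_missing)
```

The per-feature counts sum to the dataset-level total of missing values reported in the properties, which is a quick consistency check on the list.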

19 properties

Number of instances (rows) of the dataset: 1460
Number of attributes (columns) of the dataset: 80
Number of distinct values of the target attribute (if it is nominal): 0
Number of missing values in the dataset: 6965
Number of instances with at least one value missing: 1460
Number of numeric attributes: 37
Number of nominal attributes: 43
Number of instances belonging to the least frequent class: (no value listed)
Number of binary attributes: 4
Percentage of binary attributes: 5
Percentage of instances having missing values: 100
Average class difference between consecutive instances: -80645.68
Percentage of missing values: 5.96
Number of attributes divided by the number of instances: 0.05
Percentage of numeric attributes: 46.25
Percentage of instances belonging to the most frequent class: (no value listed)
Percentage of nominal attributes: 53.75
Number of instances belonging to the most frequent class: (no value listed)
Percentage of instances belonging to the least frequent class: (no value listed)
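
Several of the properties above are ratios of the raw counts, and they can be recomputed as a sanity check. The sketch below assumes that the percentage of missing values is taken over all cells (rows times attributes) and that values are rounded to two decimals; both assumptions are inferred from the reported numbers rather than stated on the page, and the dictionary keys are illustrative names.

```python
# Raw counts as reported in the dataset properties.
n_instances = 1460
n_attributes = 80
n_missing = 6965
n_numeric = 37
n_nominal = 43
n_binary = 4

# Derived properties, assuming percentages are relative to the attribute count
# (or to the total cell count for missing values) and rounded to 2 decimals.
derived = {
    "pct_missing_values": round(100 * n_missing / (n_instances * n_attributes), 2),
    "pct_numeric_attributes": round(100 * n_numeric / n_attributes, 2),
    "pct_nominal_attributes": round(100 * n_nominal / n_attributes, 2),
    "pct_binary_attributes": round(100 * n_binary / n_attributes, 2),
    "attributes_per_instance": round(n_attributes / n_instances, 2),
}

for name, value in derived.items():
    print(f"{name}: {value}")
```

Under these assumptions the recomputed values agree with the reported 5.96, 46.25, 53.75, 5 and 0.05.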

8 tasks

0 runs - estimation_procedure: 33% Holdout set - target_feature: SalePrice
0 runs - estimation_procedure: 10-fold Crossvalidation - evaluation_measure: root_mean_squared_error - target_feature: SalePrice
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
0 runs - estimation_procedure: 50 times Clustering
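
One of the tasks above evaluates predictions of SalePrice with 10-fold cross-validation and root mean squared error. The following is a generic, standard-library sketch of that procedure, not the splits OpenML itself defines for the task. The prices are randomly generated placeholders rather than values from the dataset, the model is a simple predict-the-training-mean baseline, and all function names are illustrative.

```python
import math
import random


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and split them into k nearly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def cross_validated_rmse(y, k=10, seed=0):
    """Average RMSE of a mean-of-training-set baseline over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train = [y[i] for i in range(len(y)) if i not in held_out]
        baseline = sum(train) / len(train)  # "model": predict the training mean
        test = [y[i] for i in test_idx]
        scores.append(rmse(test, [baseline] * len(test)))
    return sum(scores) / len(scores)


# Placeholder prices (hypothetical, NOT taken from the dataset).
rng = random.Random(42)
prices = [rng.uniform(50_000, 400_000) for _ in range(1460)]

score = cross_validated_rmse(prices, k=10)
print(f"10-fold CV RMSE of the mean baseline: {score:.2f}")
```

Any real model should be compared against a baseline of this kind; a regressor that does not beat the training-mean RMSE has learned nothing useful from the explanatory variables.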