I don’t know how to handle this Statistics question and need guidance.
Your group has been selected as a consultant team by a real estate company to perform a statistical analysis of residential data. The data were collected from publicly available information on recent home sales in your hometown. They are not a random sample, but they may be representative of home sales in your hometown over a short period of time. The variables include:
- Price in $1,000.
- Living space in ft².
- Number of bedrooms
- Year property built
- Whether there is a garage or not.
- Location type – rural, urban or suburban
- The real estate management is interested in the relationship between location and whether there is a garage or not. Use a contingency table (two categorical variables on StatKey) to answer the following questions:
- What percent of homes are in the suburbs?
- What percent of homes have a garage and are in the suburbs?
- Of the suburban homes, what percent have garages?
- Of the homes with garages, what percent are in the suburbs?
- They also believe that there are now more homes in the suburbs in this community. An earlier census showed that 55% of all homes in this hometown were suburban. Is there evidence that the percentage of suburban homes is now higher than 55%? Conduct the appropriate hypothesis test and state your conclusion at the 5% significance level. Also find the 95% confidence interval for the true percent of suburban homes in this hometown.
- A competitor company claims that the mean living space of the properties in this area is less than 3000 ft². Plot a histogram and boxplot of the living space and describe the distribution, being sure to discuss the shape, outliers, center and spread (SOCS). Conduct an appropriate hypothesis test to see if their suspicion is founded, and state your conclusion at the 5% significance level.
- The management would like a predictive model to help predict the price of a home. One manager suggests using the living area, while another claims that age is more informative in predicting the price of the house. Which manager is correct? Fit one model that predicts price from the living area and another that predicts price from the year built. Give an interpretation of the slope in each model. Which of the two is the better predictive model? Explain using statistics such as the correlation and R². They also need your team to predict the price of two homes: (a) a 3000 ft² home built in 1995 and (b) a 4500 ft² home built in 2013.
- Statistical Report: this report should show all your working for each of the questions asked. Include all the appropriate tables, graphs, and calculations to illustrate how you answered each question.
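For the garage-by-location questions, the four percentages are one marginal, one joint, and two conditional proportions read off the same two-way table. Here is a minimal Python sketch for cross-checking what StatKey reports. It assumes each home is stored as a hypothetical `(location, has_garage)` pair, and the rows shown are placeholders, not the real sales data.

```python
from collections import Counter


def garage_location_percentages(homes, location="suburban"):
    """Return the four percentages for a list of (location, has_garage) pairs."""
    n = len(homes)
    counts = Counter(homes)  # joint counts, i.e. the cells of the two-way table
    in_location = sum(c for (loc, _), c in counts.items() if loc == location)
    with_garage = sum(c for (_, garage), c in counts.items() if garage)
    both = counts[(location, True)]
    return {
        # marginal: share of all homes in the chosen location
        "pct_location": 100 * both / n if False else 100 * in_location / n,
        # joint: share of all homes that have a garage AND are in the location
        "pct_garage_and_location": 100 * both / n,
        # conditional: among homes in the location, share with a garage
        "pct_garage_given_location": 100 * both / in_location,
        # conditional: among homes with a garage, share in the location
        "pct_location_given_garage": 100 * both / with_garage,
    }


# Placeholder rows for illustration only (NOT the real data set).
toy_homes = (
    [("suburban", True)] * 4
    + [("suburban", False)] * 1
    + [("urban", True)] * 2
    + [("urban", False)] * 1
    + [("rural", False)] * 2
)
toy_result = garage_location_percentages(toy_homes)
print(toy_result)
```

The point to notice is that the last three percentages share the same numerator (suburban homes with a garage) and differ only in the denominator: all homes, suburban homes, or homes with garages.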
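For the suburban-homes question, the hypotheses are H0: p = 0.55 against Ha: p > 0.55, where p is the current proportion of suburban homes. StatKey would typically do this with a randomization test and a bootstrap interval. The sketch below uses the normal approximation instead, which is an assumption on my part and is only reasonable when n·p0 and n·(1 − p0) are both at least about 10. The counts passed in are placeholders, not the real data.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_test(successes, n, p0=0.55, conf=0.95):
    """Right-tailed z test for a proportion plus a normal-approximation CI."""
    p_hat = successes / n
    se_null = sqrt(p0 * (1 - p0) / n)      # standard error assuming H0 is true
    z = (p_hat - p0) / se_null
    p_value = 1 - NormalDist().cdf(z)      # area to the right, since Ha is p > p0
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se_hat = sqrt(p_hat * (1 - p_hat) / n)  # standard error from the sample itself
    ci = (p_hat - z_star * se_hat, p_hat + z_star * se_hat)
    return z, p_value, ci


# Placeholder counts for illustration only (NOT the real data set).
toy_z, toy_p, toy_ci = one_proportion_test(successes=55, n=100)
print(toy_z, toy_p, toy_ci)
```

Reject H0 at the 5% level only if the p-value is below 0.05. Note that the test uses p0 in the standard error while the confidence interval uses the sample proportion.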
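For the living-space question, the hypotheses are H0: μ = 3000 against Ha: μ < 3000, tested with a one-sample t statistic on n − 1 degrees of freedom. A rough Python sketch follows, assuming the living-space values are in a plain list. The numbers are placeholders, not the real data, and the quartiles use Python's default convention, which may differ slightly from StatKey's.

```python
from math import sqrt
from statistics import mean, median, quantiles, stdev


def describe_and_t(values, mu0=3000):
    """Summaries for center/spread/outliers plus the one-sample t statistic."""
    n = len(values)
    xbar, s = mean(values), stdev(values)   # stdev is the sample SD (n - 1)
    q1, _, q3 = quantiles(values, n=4)      # quartiles, default 'exclusive' method
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    t = (xbar - mu0) / (s / sqrt(n))        # negative t supports Ha: mu < mu0
    return {
        "mean": xbar, "median": median(values), "sd": s,
        "iqr": iqr, "outliers": outliers, "t": t, "df": n - 1,
    }


# Placeholder values for illustration only (NOT the real data set).
toy_space = [2800, 2900, 3000, 3100, 2700]
toy_summary = describe_and_t(toy_space)
print(toy_summary)
```

The p-value is the area to the left of `t` under a t distribution with `df` degrees of freedom, which can be read from StatKey's theoretical t distribution or a t table. The sketch does not compute it. Shape still has to be judged from the histogram and boxplot.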
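For the two-managers question, fit one least-squares line per predictor and compare R² (the squared correlation, for a single predictor). The model with the larger R² explains more of the variation in price. Here is a minimal sketch with a hypothetical helper `fit_line`. All numbers are placeholders, not the real data.

```python
from math import sqrt
from statistics import mean


def fit_line(x, y):
    """Least-squares slope, intercept and correlation r for y = a + b*x."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Placeholder values for illustration only (NOT the real data set).
toy_price = [150, 250, 350, 450]       # price in $1,000
toy_area = [1000, 2000, 3000, 4000]    # living space in square feet
toy_year = [2000, 1990, 2020, 2010]    # year built

b_area, a_area, r_area = fit_line(toy_area, toy_price)
b_year, a_year, r_year = fit_line(toy_year, toy_price)

# Slope reading: b_area is the predicted change in price ($1,000) per extra ft²;
# b_year is the predicted change in price ($1,000) per one-year-newer home.
print("R² area:", r_area ** 2, "R² year:", r_year ** 2)

# Each model uses only its own predictor, so home (a) gets two predictions.
pred_a_from_area = a_area + b_area * 3000
pred_a_from_year = a_year + b_year * 1995
print(pred_a_from_area, pred_a_from_year)
```

When predicting, check that 3000 ft², 4500 ft², 1995 and 2013 fall inside the range of the observed data. A prediction outside that range is extrapolation and should be reported with caution.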