Create a clean (no unnecessary output) and well-formatted RMarkdown file that answers the two problems below (note that both problems contain multiple parts). Use the #, ## and ### heading features to separate parts of the document for ease of reading.
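One possible way to keep the knitted output clean is to silence package messages and warnings once, near the top of the file. The lines below are only a sketch and assume the knitr package is installed; in an RMarkdown file they would normally sit in a setup chunk that uses the chunk option include=FALSE.

```r
library(knitr)

# Suppress package start-up messages and warnings in every later chunk,
# so that only the code and its intended results appear in the knitted file
opts_chunk$set(message = FALSE, warning = FALSE)
```

Individual chunks can still override these defaults when a message is worth showing.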
Problem 1 (20pts)
Data source: https://dasl.datadescription.com/datafile/nail-polish/
Description: A student, preparing for a triathlon, suspected that the 45 minutes each day she spent training in a chlorinated pool was damaging her nail polish. She wished to investigate whether the color of the nail polish might make a difference. She mounted acrylic nails on sticks and polished them with two different color nail polishes. She soaked them together in a chlorine solution equivalent to a swimming pool’s chlorination and then tapped them 100 times on a computer keyboard to simulate daily stress. She then recorded the % of nail polish chipped off, as measured by scanning images of the nails and using an image processing program. She wishes to find out whether the % of nail polish chipped off varies between the two colors she uses.
The data set “nail-polish.csv” can be accessed using the following line of code:
df1 <- read.csv("https://raw.githubusercontent.com/hellomissingdata/STA363/main/nail-polish.csv")
1. Comment on the design of the experiment. Specifically state all the design elements in the context of the problem: the experimental units, the response variable(s), the factor, factor levels, and the treatments. If you were her, what would you do in the experiment to control for nuisance variation or confounding variables?
2. Build an appropriate plot for the design of the data. Comment on what you see, including comments on the average and variation in the response. Make sure your plot is properly labeled and would be understandable to an outside viewer (that is, the labels and titles explain the context).
3. Perform the appropriate statistical analysis for this design. State the conclusion of the analysis results of the experiment outcomes, in context.
4. What are the assumptions of the statistical method chosen in part 3? Please perform a residual analysis to check for these assumptions. In your analysis, please provide your graphs and your comments about the model assumptions based on your findings from these graphs.
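The mechanics of a labeled plot and a residual check can be sketched on toy data. Everything in the data frame below is a placeholder: the column names Color and Pct.Chipped, the level labels, and the simulated values are assumptions, not the contents of nail-polish.csv (check the real names with names(df1)). The sketch also does not decide which analysis fits the design; it only shows how residuals from a one-factor model can be plotted.

```r
# Toy data only: column names, level labels, and values are placeholders
set.seed(363)
toy <- data.frame(
  Color = rep(c("A", "B"), each = 10),
  Pct.Chipped = c(rnorm(10, mean = 20, sd = 5), rnorm(10, mean = 25, sd = 5))
)

# Side-by-side boxplot with a title and axis labels that explain the context
boxplot(Pct.Chipped ~ Color, data = toy,
        main = "Nail polish chipped after chlorine soak, by color (toy data)",
        xlab = "Polish color", ylab = "Nail polish chipped (%)")

# One-factor linear model; its residuals are deviations from each group mean
fit <- lm(Pct.Chipped ~ Color, data = toy)
res <- residuals(fit)

# Residuals vs fitted values: compare the vertical spread of the two groups
plot(fitted(fit), res,
     xlab = "Fitted value (group mean)", ylab = "Residual",
     main = "Residuals vs fitted (toy data)")
abline(h = 0, lty = 2)

# Normal Q-Q plot: points close to the line are consistent with normal errors
qqnorm(res)
qqline(res)
```

With the real data, the formula and labels would use whatever column names and units the file actually contains.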
Problem 2 (20pts)
Data source: Janssen RG, Schwartz DA, Velleman PF. A randomized controlled study of contrast baths on patients with carpal tunnel syndrome. Journal of Hand Therapy : Official Journal of the American Society of Hand Therapists. 2009, 22(3):200-7. DOI: 10.1016/j.jht.2009.02.001. PMID: 19375278.
Description: Contrast baths are a treatment modality commonly used in hand clinics. Yet the benefits of contrast baths have been poorly substantiated. Contrast baths have been suggested for the purposes of reducing hand volume, alleviating pain, and decreasing stiffness in affected extremities. To determine the effects of specific contrast bath protocols on hand volume in patients diagnosed with Carpal Tunnel Syndrome, study participants were randomly assigned to one of three treatment group protocols: contrast baths with exercise, contrast baths without exercise, and an exercise-only control treatment group. Study participants were then evaluated with hand volumetry, before and after treatment, at two different data collection periods: pre- and postoperatively. The change in hand volume (the after treatment volume minus the before treatment volume) is the outcome of interest to us.
The data set “contrast-baths.csv” can be accessed with this code:
df2 <- read.csv("https://raw.githubusercontent.com/hellomissingdata/STA363/main/contrast-baths.csv")
There are two columns in the data: “Treatment” is the treatment group of the experiment, and “Hand.Vol.Chg” is the change in hand volume. Each row represents an individual participant.
Note that there are missing data in these observations. “drop_na()” from the “tidyr” package (loaded through “tidyverse” here) helps us easily remove rows with missing values:
library(tidyverse)
df2 <- df2 %>% drop_na()
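Before dropping rows, it can help to know how many will be lost. The sketch below shows this on a toy data frame that reuses the two column names from the data description; the treatment labels and values are placeholders, not the real data.

```r
library(tidyr)

# Toy data: the treatment labels and values are placeholders
toy2 <- data.frame(
  Treatment = c("T1", "T1", "T2", "T2", "T3", "T3"),
  Hand.Vol.Chg = c(4, NA, -2, 7, NA, 1)
)

# Count the rows that contain at least one missing value
n_missing <- sum(!complete.cases(toy2))

# drop_na() keeps only the rows with no missing values
toy2_clean <- drop_na(toy2)
```

The same two steps applied to df2 would report how many participants are excluded from the analysis.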
1. Comment on the design of the experiment. Specifically state all the design elements in the context of the problem: the experimental units, the response variable(s), the factor, factor levels, the treatments, and the steps the experimenter took in an attempt to control for nuisance variation or confounding variables.
2. Perform a meaningful and helpful EDA for this data. Comment on what you see, including comments on the average and variation in hand volume change.
3. Perform the appropriate statistical analysis for this design. Report the test statistic value, degrees of freedom, p-value, and the conclusion in the context of the problem.
4. Generate the residual plots for the statistical method chosen in part 3 to check the model assumptions. Provide the graphs and comments about the model assumptions based on your findings from these graphs.
5. Perform appropriate multiple comparisons if necessary and report the results in context. You must defend your choice to use Tukey or Dunnett.
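As a rough illustration of where the requested numbers come from, the sketch below fits a one-factor model to toy data and extracts the test statistic, degrees of freedom, and p-value, then runs Tukey pairwise comparisons with base R. The column names match the data description, but the treatment labels and simulated values are placeholders. The sketch assumes a one-way ANOVA and shows Tukey only because it needs no extra package; it is not a statement about which method is appropriate. Dunnett comparisons against a control need an add-on package (for example, glht() from multcomp with mcp(Treatment = "Dunnett")).

```r
# Toy data: treatment labels and values are placeholders, not the real data
set.seed(363)
toy2 <- data.frame(
  Treatment = factor(rep(c("T1", "T2", "T3"), each = 12)),
  Hand.Vol.Chg = c(rnorm(12, 5, 8), rnorm(12, 8, 8), rnorm(12, -2, 8))
)

# One-way ANOVA of the response on the treatment factor
fit2 <- aov(Hand.Vol.Chg ~ Treatment, data = toy2)
tab <- summary(fit2)[[1]]

# Pull out the quantities to report: F statistic, both df, and the p-value
F_stat <- tab[["F value"]][1]
df_trt <- tab[["Df"]][1]
df_err <- tab[["Df"]][2]
p_val  <- tab[["Pr(>F)"]][1]

# Tukey HSD: all pairwise differences with a 95% family-wise confidence level
tukey <- TukeyHSD(fit2, "Treatment", conf.level = 0.95)
print(tukey)

# Residual plots for the fitted model (residuals vs fitted, normal Q-Q)
plot(fit2, which = 1:2)
```

Each row of the Tukey table gives one pairwise difference in mean hand volume change, its interval, and an adjusted p-value.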