Module 10. Assignment:
Question A:
A1: In the report, please state the result of coefficients and significance to any variables you like both under ANOVA and multivariate analysis. Please provide a specific interpretation of R results.
In this analysis, I used pemax as the dependent variable in a multiple regression model, with age, weight, bmp, and fev1 as predictors.
Results:
Weight: p = 0.03287
Bmp: p = 0.02036
Fev1: p = 0.04695
Age: p = 0.31389
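Age is the only variable that is not significant at the 0.05 level. The p-values for weight, bmp, and fev1 are all below 0.05.
Weight: A one-unit increase in weight increases the predicted pemax by 2.6882 units, holding the other predictors constant.
Bmp: A one-unit increase in bmp decreases the predicted pemax by 2.0657 units, holding the other predictors constant.
Fev1: A one-unit increase in fev1 increases the predicted pemax by 1.0882 units, holding the other predictors constant.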
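To make the coefficient interpretation concrete, here is a minimal Python sketch (not the R code used for the analysis) that turns the three significant slopes into a change in predicted pemax. The function name predicted_change is hypothetical, and the intercept and the age slope are left out because only differences between predictions are computed.

```python
# Slopes reported above for the pemax regression (age is left out of this sketch).
COEFS = {"weight": 2.6882, "bmp": -2.0657, "fev1": 1.0882}

def predicted_change(deltas):
    """Change in predicted pemax for the given changes in the predictors,
    with every predictor that is not listed held fixed."""
    return sum(COEFS[name] * delta for name, delta in deltas.items())

# One extra unit of weight alone.
print(predicted_change({"weight": 1}))
# One extra unit of weight and of bmp together: the effects partly cancel.
print(predicted_change({"weight": 1, "bmp": 1}))
```

The second call shows why "holding the other predictors constant" matters: when weight and bmp move together, the net change is much smaller than the weight slope alone.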
ANOVA test results:
Age: p = 0.00035
Weight: p = 0.2038
Bmp: p = 0.050
Fev1: p = 0.0469
Weight is significant in the regression model (p = 0.03287) but not in the ANOVA table (p = 0.2038).
Age is significant in the ANOVA table (p = 0.00035) but not in the regression model (p = 0.31389).
Conclusion:
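This model is statistically significant. The variables weight, bmp, and fev1 are important predictors of pemax. In addition, the differences between the ANOVA and regression results suggest that some predictors share overlapping information, in particular the variables related to physical development. The regression p-values test each predictor after adjusting for all the others, so a variable can look important on its own and still add little once related variables are in the model.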
Question B:
B1: How much is gained by using both diameters in a prediction equation? The sum of the two regression coefficients is almost identical and equal to 3.
Using the data from the model results:
Ad only: R^2 = 0.7959
Bpd only: R^2 = 0.7221
Combined: R^2 = 0.8583
The combined model has the highest R^2.
Using both diameters raises R^2 by 0.0624 over ad alone (0.8583 vs. 0.7959) and by 0.1362 over bpd alone (0.8583 vs. 0.7221). The combined model therefore explains more of the variation in (log) birth weight and leaves less unexplained error.
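The size of the gain can be checked with a few lines of arithmetic. The Python sketch below (an illustration, not the R code used for the analysis) compares the combined model with the better single-diameter model. The variable names are hypothetical, and only the R^2 values reported above are used.

```python
# R^2 values reported above.
r2 = {"ad": 0.7959, "bpd": 0.7221, "combined": 0.8583}

best_single = max(r2["ad"], r2["bpd"])

# Extra share of variation explained by adding the second diameter.
gain = r2["combined"] - best_single

# Unexplained share in the combined model relative to the best single model.
unexplained_ratio = (1 - r2["combined"]) / (1 - best_single)

print(round(gain, 4))
print(round(unexplained_ratio, 3))
```

A ratio below 1 means the combined model leaves less variation unexplained than the best single-diameter model.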
B2: Can this be given a nice interpretation to our analysis? Please provide step by step on your analysis and code you use to find out the result.
From the combined model:
log(ad) coefficient = 1.4667
log(bpd) coefficient = 1.5519
Sum:
1.4667 + 1.5519 = 3.0186 ≈ 3
Interpretation:
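log(bwt) = -5.8615 + 1.4667 log(ad) + 1.5519 log(bpd)
Exponentiating both sides gives bwt = exp(-5.8615) × ad^1.4667 × bpd^1.5519, so the predicted birth weight is proportional to a product of the two diameters raised to powers that sum to about 3.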
Conclusion:
The model has a meaningful real-world interpretation. Because the exponents sum to about 3, birth weight scales like a three-dimensional measurement (a length cubed), which is consistent with weight being proportional to volume.
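The "length cubed" reading can be checked numerically. The Python sketch below (an illustration, not the R code used for the analysis) back-transforms the fitted equation and compares the effect of enlarging both diameters by 10% with a pure cube law. It assumes natural logarithms, and the diameters 100 and 90 are hypothetical values chosen only for the demonstration.

```python
import math

# Coefficients of the combined log-log model reported above.
INTERCEPT = -5.8615
B_AD = 1.4667
B_BPD = 1.5519

def predict_bwt(ad, bpd):
    """Back-transform log(bwt) = a + b1*log(ad) + b2*log(bpd) (natural logs assumed)."""
    return math.exp(INTERCEPT + B_AD * math.log(ad) + B_BPD * math.log(bpd))

def scale_factor(k):
    """Factor by which predicted weight changes when both diameters are multiplied by k."""
    return k ** (B_AD + B_BPD)

# Hypothetical diameters, used only for illustration.
ad, bpd = 100.0, 90.0
ratio = predict_bwt(1.1 * ad, 1.1 * bpd) / predict_bwt(ad, bpd)

# Model ratio, closed-form ratio, and the ratio an exact cube law would give.
print(ratio, scale_factor(1.1), 1.1 ** 3)
```

The ratio does not depend on the starting diameters, because the intercept and the original values cancel when the two predictions are divided.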
B3: Just an additional question (This will not be graded). When should we consider "log-transforming" a dataset? This is a very common practice in data science.
I believe that log transformation is useful when:
The data are right-skewed,
The variance increases with the size of the values,
We want to interpret effects as percentage changes,
We want to make a multiplicative or curved relationship look linear.
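A small Python sketch of the points in the list above, using hypothetical values that triple at each step (made-up numbers, not data from the assignment). On the raw scale the mean is pulled well above the median and the steps keep growing. On the log scale the values are evenly spaced and symmetric.

```python
import math
import statistics

# Hypothetical right-skewed values: each is three times the previous one.
values = [1, 3, 9, 27, 81, 243]
logs = [math.log(v) for v in values]

# Differences between neighbouring values on each scale.
raw_steps = [b - a for a, b in zip(values, values[1:])]
log_steps = [b - a for a, b in zip(logs, logs[1:])]

# Raw scale: mean far above median, steps growing with the size of the values.
print(statistics.mean(values), statistics.median(values))
print(raw_steps)

# Log scale: mean equals median, and every step is log(3).
print(statistics.mean(logs), statistics.median(logs))
print(log_steps)
```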