Exploring Linear Regression with the Possum Dataset

Answered using GPT 40 by GPT40

Question:

---
title: "Introduction to Linear Regression - in class"
author: "Kody Gray"
date: "`r Sys.Date()`"
output:
  pdf_document: default
  html_document: default
---

knitr::opts_chunk$set(echo = TRUE)

Load libraries

library("tidyverse")
library("openintro")

Load possum dataset

This dataset is included in the package for our textbook.

data(possum)

We can preview the dataset with head(). The black arrow on the right of the row names allows you to scroll through the other variables.

head(possum)

Problem 1

Make a scatterplot for the variables head length (head_l) and skull width (skull_w).

Describe the association between head length and skull width.

Answer:

Problem 2

Check the conditions for regression for head length and skull width.

Indicate whether each condition is satisfied or not and why.

Answer:

Problem 3

Since there are conce

Answer:
---
title: "Introduction to Linear Regression - in class"
author: "Kody Gray"
date: "`r Sys.Date()`"
output:
  pdf_document: default
  html_document: default
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

Load libraries

```{r}
library("tidyverse")
library("openintro")
```

Load possum dataset

This dataset is included in the package for our textbook.

```{r}
data(possum)
```

We can preview the dataset with head(). The black arrow on the right of the row names allows you to scroll through the other variables.

```{r}
head(possum)
```
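The scroll arrow mentioned above belongs to the interactive table view, so it may not be available in every output format (a knitted PDF, for example, shows a static printout). As a hedged alternative, `glimpse()` from dplyr lists every column with its type and first few values. This is an optional sketch and not part of the original worksheet:

```r
library(dplyr)      # provides glimpse()
library(openintro)  # provides the possum dataset

data(possum)

# Print one line per variable: name, type, and the first few values
glimpse(possum)
```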

Problem 1

Make a scatterplot for the variables head length (head_l) and skull width (skull_w).

```{r}
# Create a scatterplot for head length and skull width
ggplot(possum, aes(x = head_l, y = skull_w)) +
  geom_point() +
  labs(title = "Scatterplot of Head Length vs Skull Width",
       x = "Head Length (mm)",
       y = "Skull Width (mm)") +
  theme_minimal()
```

Describe the association between head length and skull width.

Answer: The scatterplot shows a positive, roughly linear association between head length and skull width: possums with longer heads tend to have wider skulls.
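A description of an association usually covers strength as well as direction and form. One way to put a number on the strength is the Pearson correlation coefficient. The sketch below is an optional addition, not part of the original worksheet; the variable name `r` is only illustrative, and `use = "complete.obs"` is a precaution in case either column contains missing values:

```r
library(openintro)  # provides the possum dataset

data(possum)

# Pearson correlation between head length and skull width (both in mm).
# Values near +1 indicate a strong positive linear association.
r <- cor(possum$head_l, possum$skull_w, use = "complete.obs")
r
```

The closer the value is to 1, the more tightly the points cluster around a straight line with positive slope.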

Problem 2

Check the conditions for regression for head length and skull width.

```{r}
# Check for linearity
ggplot(possum, aes(x = head_l, y = skull_w)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +
  labs(title = "Linearity Check: Head Length vs Skull Width",
       x = "Head Length (mm)",
       y = "Skull Width (mm)") +
  theme_minimal()
```

Indicate whether each condition is satisfied or not and why.

Answer:

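  1. Linearity: The points follow the fitted line without an obvious curve, so the linearity condition appears satisfied. The Residuals vs Fitted plot gives a second check: the residuals should scatter around zero with no systematic pattern.
  2. Independence: Each row is a different possum, so this condition is likely satisfied, assuming the animals were sampled independently of one another.
  3. Constant variability (homoscedasticity): The spread of the residuals should be roughly constant across all fitted values. This is judged from the Residuals vs Fitted plot after fitting a linear model; a fan or funnel shape would indicate a violation.
  4. Normality of residuals: This is judged from the Q-Q plot of the residuals after fitting the model. Points that stay close to the reference line support the condition, while strong departures in the tails count against it.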

```{r}
# Fit a linear model
model <- lm(skull_w ~ head_l, data = possum)

# Plot residuals
par(mfrow = c(1, 2))
plot(model, which = 1)  # Residuals vs Fitted
plot(model, which = 2)  # Q-Q plot
```
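The same two checks can also be drawn with ggplot2, which matches the style of the earlier plots. The following is a minimal sketch, not part of the original worksheet: `resid_df` is a hypothetical helper data frame, the histogram is swapped in as an alternative to the Q-Q plot, and `bins = 20` is an arbitrary choice.

```r
library(ggplot2)
library(openintro)  # provides the possum dataset

data(possum)

# Refit the same model so this chunk runs on its own
model <- lm(skull_w ~ head_l, data = possum)

# Hypothetical helper data frame holding fitted values and residuals
resid_df <- data.frame(
  fitted = fitted(model),
  resid  = resid(model)
)

# Constant variability: look for an even vertical spread around zero
ggplot(resid_df, aes(x = fitted, y = resid)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(title = "Residuals vs Fitted Values",
       x = "Fitted Skull Width (mm)",
       y = "Residual (mm)") +
  theme_minimal()

# Nearly normal residuals: look for a roughly symmetric, single-peaked shape
ggplot(resid_df, aes(x = resid)) +
  geom_histogram(bins = 20) +
  labs(title = "Histogram of Residuals",
       x = "Residual (mm)",
       y = "Count") +
  theme_minimal()
```

A few points far from zero in either plot would be worth noting as possible outliers when writing up the conditions.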

Problem 3

Since there are conce...


This R Markdown document provides a structured approach to exploring linear regression using the possum dataset. It includes code for creating a scatterplot, checking regression conditions, and interpreting the results. You can continue to expand on Problem 3 and any additional analyses or conclusions you wish to draw from the data.