Lab 01 Preparations

Run your “first” R code

Open RStudio and you should see a window similar to the one below:

The big window on the left with “Console” is the R console that functions the same as the consoles you get from opening R (not RStudio). You can run R command in the console directly. Try with

2 + 3

## [1] 5

Now you see that you can use R as a calculator!

R stores things in memory.

a <- 2 + 3

If you just run the above line the console, you will not see any pop-up. However, your calculation is stored in the element ‘a’. You can find it in the upper-right window of your RStudio under the Environment tab. Take a look:

And, of course you can output the value of a by

## [1] 5

Let’s plot some figures (the saying “a figure is worth a thousand words” should echo now). We will draw \(1000\) samples from a normal distribution with mean \(10\) and variance \(100\) (i.e., N(10, 100)).

set.seed(7260)
#Let's take a look at the PDF (probability density function) first
x <- 0:2000 * 0.05 - 50
y <- dnorm(x, mean = 10, sd = sqrt(100))
# X spans over [-50, 50] and y store the pdf value for each element in x.
# The following command generates a line plot
plot(x, y, type = "l")

# search the R Documentation for this "plot" function in the lower right window.

Now, let’s search the help documentation of the default “plot” function in R. Locate the “Help” tab in the lower-right window of your RStudio app.

# Let's draw some random variables from the distribution
normal.samples <- rnorm(length(x), 10, 10)  # avoid using sample as the variable name because it's already used for a built-in function.  The bad behavior of R (python shares this overriding problem too)
# Produce a histogram of the random samples
hist(normal.samples)

The above histogram has the default “Frecuency” as the Y axis. Let’s change it to “Density”

hist(normal.samples, probability = TRUE)

Now let’s place the line plot over the histogram.

hist(normal.samples, probability = TRUE)
lines(x, y, col="red", lwd=4)

#you may want to try the function "curve" too.  Go to its help page to find the usage.

Let’s re-plot the above figures using the ggplot2 package

library(ggplot2)
library(tibble)
# first group data into a data frame (an upgraded one, i.e., tibble)
sampled.data <- tibble(x = x, y = y, normal.samples = normal.samples)

ggplot(data = sampled.data) +
  geom_line(mapping = aes(x = x, y = y)) +
  theme_bw()

ggplot(data = sampled.data) +
  geom_histogram(mapping = aes(x = normal.samples)) +
  theme_bw()

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

ggplot(data = sampled.data) +
  geom_histogram(mapping = aes(x = normal.samples, y = ..density..)) +
  theme_bw()

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

ggplot(data = sampled.data) +
  geom_histogram(mapping = aes(x = normal.samples, y = ..density..)) +
  geom_line(mapping = aes(x = x, y = y), col = "red", size = 2) +
  theme_bw()

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

This concludes our lab 1. Please let me know if you run into any trouble.

Lab 01 Preparations

MATH-6040/7260 Linear Models

Dr. Xiang Ji @ Tulane University

Feb 5, 2021

Run your “first” R code