Introduction to R: Vectors and Matrices
Previously, we explored the different data types that R understands and considered how to assign values to variables, check their classes, and combine them in useful ways. There, each variable held a single value. Here we will explore how to create and use variables that hold more than one value, that is, vectors and matrices.
The simplest way to create a vector of values is to use the ‘c’ function, which simply concatenates the values listed into a vector.
x = c(1, 5, 3, 9, 124, 7)
x
## [1] 1 5 3 9 124 7
class(x) # x is numeric
## [1] "numeric"
length(x) # length() is a function that computes the length of the vector
## [1] 6
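Because ‘c’ concatenates whatever it is given, it can also join existing vectors end to end. A minimal sketch; the names `first`, `second`, and `both` are made up for this illustration and are not used elsewhere in the tutorial:

```r
first = c(1, 5, 3)
second = c(9, 124, 7)
both = c(first, second)   # join the two vectors into one
both                      # 1 5 3 9 124 7
length(both)              # 3 + 3 = 6 items
```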
You can also create vectors using the built-in function ‘seq’, which is very helpful if you want a sequence of values, say from 1 to 10.
y1 = seq(1, 10, by = 1) # the 'by' argument tells R the distance between successive values in the sequence
y1
## [1] 1 2 3 4 5 6 7 8 9 10
y2 = seq(1, 10, by = 0.25)
y2
## [1] 1.00 1.25 1.50 1.75 2.00 2.25 2.50 2.75 3.00 3.25 3.50
## [12] 3.75 4.00 4.25 4.50 4.75 5.00 5.25 5.50 5.75 6.00 6.25
## [23] 6.50 6.75 7.00 7.25 7.50 7.75 8.00 8.25 8.50 8.75 9.00
## [34] 9.25 9.50 9.75 10.00
# if by=1, you can specify the sequence in a shorter way using ':'
y3 = 1:10
y3
## [1] 1 2 3 4 5 6 7 8 9 10
# confirm that y1 and y3 contain the same values. Notice that == compares
# the first item in y1 to the first item in y3, the second item in y1 to the
# second item in y3, and so on, returning one TRUE or FALSE per pair.
y1 == y3
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
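Two small follow-ups to the sequence examples, offered as a sketch rather than the only way to do this. When the number of values matters more than the step size, ‘seq’ accepts a `length.out` argument in place of `by`, and ‘all’ collapses an item-by-item comparison into a single TRUE or FALSE. The name `steps` is illustrative only:

```r
steps = seq(0, 1, length.out = 5)   # 5 evenly spaced values: 0.00 0.25 0.50 0.75 1.00
steps
all(seq(1, 10, by = 1) == 1:10)     # TRUE, because every pairwise comparison is TRUE
```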
You can also create empty (placeholder) vectors of a given length using the built-in function ‘vector’:
# the 'mode' argument asks what kind of data will be stored in the vector
# the 'length' argument specifies the length of the vector
z = vector(mode = "numeric", length = 10) # create a numeric vector of length 10
z # by default, every item in a numeric vector starts out as 0
## [1] 0 0 0 0 0 0 0 0 0 0
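The starting value depends on the `mode` requested. As a brief sketch (the names `blank_text` and `blank_flags` are hypothetical), character vectors start out as empty strings and logical vectors as FALSE:

```r
blank_text = vector(mode = "character", length = 3)
blank_text    # three empty strings: "" "" ""
blank_flags = vector(mode = "logical", length = 3)
blank_flags   # FALSE FALSE FALSE
```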
You can also index particular items in the vector using square brackets [].
x = seq(0, 1, 0.1)
x
## [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
x[1] # the first item in the vector
## [1] 0
x[5] # the fifth item in the vector
## [1] 0.4
x[length(x)] # the last item in the vector
## [1] 1
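Square brackets also accept a vector of positions, so several items can be pulled out at once. The sketch below uses an illustrative name, `grid`, holding the same values as x above:

```r
grid = seq(0, 1, 0.1)
grid[c(1, 5)]   # the first and fifth items: 0.0 0.4
grid[2:4]       # the second through fourth items: 0.1 0.2 0.3
grid[-1]        # a negative index drops that position: everything except the first item
```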
You can use indexes to assign values to different positions in the vector.
z = vector(mode = "numeric", length = 10)
z
## [1] 0 0 0 0 0 0 0 0 0 0
z[3] = 5
z[6] = 8
z
## [1] 0 0 5 0 0 8 0 0 0 0
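Several positions can also be assigned in one step by indexing with a vector. A small sketch, using the made-up name `w` so that z above is left untouched:

```r
w = vector(mode = "numeric", length = 10)
w[c(3, 6)] = c(5, 8)   # same effect as assigning positions 3 and 6 separately
w[8:10] = 1            # a single value is reused for positions 8, 9 and 10
w                      # 0 0 5 0 0 8 0 1 1 1
```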
You can also create vectors using the function ‘rep’, which simply repeats the argument as many times as specified.
a = 10
# repeat the value of a 10 times
b = rep(a, 10)
b
## [1] 10 10 10 10 10 10 10 10 10 10
# create a vector of length 2
a = c(5, 6)
# repeat this vector 10 times
b = rep(a, 10)
b
## [1] 5 6 5 6 5 6 5 6 5 6 5 6 5 6 5 6 5 6 5 6
# notice that this repeats both items of the vector 10 times, so the
# resulting vector has length 20
length(b)
## [1] 20
# you can also repeat it so that the first item in the vector is repeated 10
# times, and then the second item is repeated 10 times
b = rep(a, each = 10) # use the argument 'each' to specify how many times each item is repeated
b
## [1] 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6
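The `times` argument of ‘rep’ may itself be a vector, giving a separate repeat count for each item, and it can be combined with `each`. A short sketch with an illustrative name, `pair`:

```r
pair = c(5, 6)
rep(pair, times = c(2, 3))       # 5 twice, then 6 three times: 5 5 6 6 6
rep(pair, each = 2, times = 2)   # 'each' is applied first, then the result is repeated: 5 5 6 6 5 5 6 6
```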
It is also possible to perform mathematical operations on vectors.
# multiply every entry in a vector by the same quantity
a = 5
b = c(0, 3, 8, 14, -5, -4)
a * b
## [1] 0 15 40 70 -25 -20
# add two vectors
d = c(5, 14, -3, 1, -9, 0)
b + d # b[1]+d[1], b[2]+d[2], b[3]+d[3], ...
## [1] 5 17 5 15 -14 -4
# multiply two vectors
b * d # b[1]*d[1], b[2]*d[2], b[3]*d[3], ...
## [1] 0 42 -24 14 45 0
# what happens if the two vectors are not the same length and you try to add
# or multiply them?
d = c(5, 14)
b + d # the shorter vector is 'recycled', so b[1]+d[1], b[2]+d[2], b[3]+d[1], b[4]+d[2], ...
## [1] 5 17 13 28 0 10
b * d # the shorter vector is 'recycled', so b[1]*d[1], b[2]*d[2], b[3]*d[1], b[4]*d[2], ...
## [1] 0 42 40 196 -25 -56
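Recycling worked silently above because the longer length (6) is a multiple of the shorter length (2). When it is not, R still recycles but issues a warning. A minimal sketch with made-up names `long`, `short`, and `total`:

```r
long = c(1, 2, 3, 4, 5)
short = c(10, 20)
total = long + short   # R warns: longer object length is not a multiple of shorter object length
total                  # 11 22 13 24 15
```

Such a warning usually means the two vectors were not meant to be combined this way, so it is worth checking their lengths with ‘length’ first.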
You can also combine vectors to make matrices.
a = c(1, 5, 8, 10)
b = c(2, 6, 9, 11)
# the function 'cbind' assigns each vector to a column of the new matrix
m1 = cbind(a, b)
m1
## a b
## [1,] 1 2
## [2,] 5 6
## [3,] 8 9
## [4,] 10 11
class(m1) # m1 is a matrix (R 4.0.0 and later print "matrix" "array" here)
## [1] "matrix"
# the function 'rbind' assigns each vector to a row of the new matrix
m2 = rbind(a, b)
m2
## [,1] [,2] [,3] [,4]
## a 1 5 8 10
## b 2 6 9 11
class(m2) # m2 is also a matrix
## [1] "matrix"
# you can use the function 'dim' to check the dimensions of a matrix - gives
# the number of rows first, then the number of columns
dim(m1)
## [1] 4 2
dim(m2)
## [1] 2 4
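If only one of the two dimensions is needed, the functions ‘nrow’ and ‘ncol’ return it directly. A short sketch; `tall` is an illustrative name for a matrix with the same values as m1:

```r
tall = cbind(c(1, 5, 8, 10), c(2, 6, 9, 11))
nrow(tall)     # number of rows: 4
ncol(tall)     # number of columns: 2
dim(tall)[1]   # the first item of dim() is the same as nrow(tall)
```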
You can also pull specific items out of a matrix with square brackets, but now you must specify both the row index and the column index you are interested in.
m1[1, 1] # get the entry in the first row, first column of m1
## a
## 1
m1[2, 1] # get the entry in the second row, first column
## a
## 5
m1[1, 2] # get the entry in the first row, second column
## b
## 2
m1[1, ] # by leaving the column index blank, this returns the values for every entry in the first row
## a b
## 1 2
m1[2, ] # or the second row
## a b
## 5 6
m1[, 1] # this returns the values for every entry in the first column
## [1] 1 5 8 10
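Matrix indexes can also be column names or ranges of positions. The sketch below assumes a matrix built like m1 but stored under the hypothetical name `tbl`:

```r
tbl = cbind(a = c(1, 5, 8, 10), b = c(2, 6, 9, 11))
tbl[, "b"]               # the column named b: 2 6 9 11
tbl[2:3, ]               # rows 2 and 3 with every column; the result is still a matrix
tbl[1, , drop = FALSE]   # the first row kept as a 1 x 2 matrix rather than a vector
```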
You can also create matrices using the ‘matrix’ function.
# create a matrix of zeros with 5 rows and 3 columns
m = matrix(0, nrow = 5, ncol = 3)
m
## [,1] [,2] [,3]
## [1,] 0 0 0
## [2,] 0 0 0
## [3,] 0 0 0
## [4,] 0 0 0
## [5,] 0 0 0
# perhaps you want to do something more complicated: you have a vector of
# length 12 and want to fill a matrix with 4 rows and 3 columns one row at a
# time. The first three items go across the first row, the fourth item goes
# in the second row, first column, and so on.
v = 1:12 # our vector of length 12
m = matrix(v, nrow = 4, ncol = 3, byrow = TRUE) # 'byrow' says to fill the rows first rather than columns
m
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 4 5 6
## [3,] 7 8 9
## [4,] 10 11 12
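For contrast, leaving out `byrow` (it defaults to FALSE) fills the matrix one column at a time. A brief sketch, with `m_col` as a made-up name:

```r
v = 1:12
m_col = matrix(v, nrow = 4, ncol = 3)   # filled down the columns
m_col[1, ]   # first row: 1 5 9
m_col[, 1]   # first column: 1 2 3 4
```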
As with vectors, you can perform mathematical operations on matrices.
m1 = matrix(seq(1, 15), nrow = 3, ncol = 5, byrow = TRUE)
m1
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 2 3 4 5
## [2,] 6 7 8 9 10
## [3,] 11 12 13 14 15
m1 + 1 # add one to every entry
## [,1] [,2] [,3] [,4] [,5]
## [1,] 2 3 4 5 6
## [2,] 7 8 9 10 11
## [3,] 12 13 14 15 16
m1 * 2 # multiply every entry by 2
## [,1] [,2] [,3] [,4] [,5]
## [1,] 2 4 6 8 10
## [2,] 12 14 16 18 20
## [3,] 22 24 26 28 30
# You can combine matrices as well
m2 = matrix(seq(2, 30, by = 2), nrow = 3, ncol = 5, byrow = TRUE)
m2
## [,1] [,2] [,3] [,4] [,5]
## [1,] 2 4 6 8 10
## [2,] 12 14 16 18 20
## [3,] 22 24 26 28 30
m1 + m2 # m1[1,1] + m2[1,1], m1[1,2]+m2[1,2], etc.
## [,1] [,2] [,3] [,4] [,5]
## [1,] 3 6 9 12 15
## [2,] 18 21 24 27 30
## [3,] 33 36 39 42 45
m1 * m2 # element-wise product: m1[1,1]*m2[1,1], m1[1,2]*m2[1,2], etc.
## [,1] [,2] [,3] [,4] [,5]
## [1,] 2 8 18 32 50
## [2,] 72 98 128 162 200
## [3,] 242 288 338 392 450
# You can do matrix multiplication using %*%
m3 = t(m2) # transpose m2: m1 has 5 columns but m2 has only 3 rows, so m1 %*% m2 is not defined; t(m2) has 5 rows
m4 = m1 %*% m3
m4 # m4[1,1] = m1[1,1]*m3[1,1] + m1[1,2]*m3[2,1] + m1[1,3]*m3[3,1] + m1[1,4]*m3[4,1] + m1[1,5]*m3[5,1]
## [,1] [,2] [,3]
## [1,] 110 260 410
## [2,] 260 660 1060
## [3,] 410 1060 1710
# m4[1,2] = m1[1,1]*m3[1,2] + m1[1,2]*m3[2,2] + m1[1,3]*m3[3,2] +
#           m1[1,4]*m3[4,2] + m1[1,5]*m3[5,2]
# m4[2,1] = m1[2,1]*m3[1,1] + m1[2,2]*m3[2,1] + m1[2,3]*m3[3,1] +
#           m1[2,4]*m3[4,1] + m1[2,5]*m3[5,1]
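The sums written out in the comments can be checked directly: each entry of the product is the sum of a row of the first matrix multiplied item by item with a column of the second. A sketch under illustrative names (`p` has the same values as m1, `q` the same values as m3):

```r
p = matrix(1:15, nrow = 3, ncol = 5, byrow = TRUE)
q = t(matrix(seq(2, 30, by = 2), nrow = 3, ncol = 5, byrow = TRUE))
prod_pq = p %*% q
sum(p[1, ] * q[, 1])   # 110, the same as prod_pq[1, 1]
sum(p[1, ] * q[, 2])   # 260, the same as prod_pq[1, 2]
dim(prod_pq)           # a 3 x 5 matrix times a 5 x 3 matrix gives a 3 x 3 matrix
```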