Introduction to R: Data Types
I want to introduce some basic ideas of R coding that are pertinent to this course. R is primarily used for statistical analysis, but because it is open-source, users from all over the world have contributed packages that extend it, making R a powerful and flexible tool for many kinds of numerical analysis.
Here I will focus on the data types that R uses to store values.
As you have already discovered, R can do simple numerical calculations.
1 + 2
## [1] 3
3 * 5
## [1] 15
(15934 / 3)^2
## [1] 28210262
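The last result above is displayed rounded: by default R prints about 7 significant digits, even though it stores more. As a small sketch, you can ask `print` for more digits, and R also has operators for remainders (`%%`) and integer division (`%/%`):

```r
# print the same result with 12 significant digits instead of the default 7
print((15934 / 3)^2, digits = 12)
## [1] 28210261.7778
# remainder after dividing 7 by 3
7 %% 3
## [1] 1
# integer division: how many whole times 3 goes into 7
7 %/% 3
## [1] 2
```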
You can also assign values to variables.
a = 5.5
a
## [1] 5.5
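Much of the R code you will encounter elsewhere uses the arrow `<-` for assignment instead of `=`. For simple assignments like the ones in this document, the two behave the same way. A brief illustration (the variable name `a2` is made up for this example):

```r
# assign with the arrow operator instead of the equals sign
a2 <- 5.5
a2
## [1] 5.5
```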
Variables can hold values of several different data types, including numeric, integer, complex, logical, and character.
Numeric variables are decimal values. These are the default computational type in R. The variable ‘a’ created above is a numeric variable, which you can check using the built-in function ‘class’.
# This is a comment. R ignores anything occurring after a # sign. This is
# useful for annotating your code to explain what the code is doing. Here we
# are simply determining the class of the variable a.
class(a)
## [1] "numeric"
Even if we create a new variable ‘b’ and assign it an integer value, R will save it as a numeric.
b = 5
class(b)
## [1] "numeric"
# You can use the built-in function is.integer to check whether a variable
# is an integer or not
is.integer(b)
## [1] FALSE
To store a value as an integer, you can call the function ‘as.integer’. In general there is no reason to go through the trouble of doing this, because whole numbers stored as numeric variables work fine in calculations.
# convert the decimal value 5.0 to the integer 5
b = as.integer(5)
b
## [1] 5
# check the class
class(b)
## [1] "integer"
is.integer(b)
## [1] TRUE
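A shorter way to create an integer, not used elsewhere in this document, is to append `L` to a number. The sketch below (the variable name `k` is illustrative) also uses the built-in function `typeof` to show how R stores each value, and shows that dividing two integers gives a numeric result:

```r
# the L suffix creates an integer directly
k = 5L
class(k)
## [1] "integer"
# a plain 5 is stored as a double-precision decimal value
typeof(5)
## [1] "double"
typeof(k)
## [1] "integer"
# dividing two integers returns a numeric, not an integer
class(k / 2L)
## [1] "numeric"
```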
R can also work with complex variables.
x = 1 + (0+2i) # i is the imaginary unit
x
## [1] 1+2i
class(x)
## [1] "complex"
It is important to know that R will not, by default, use complex values in functions. So, for example, if you use the built-in R function ‘sqrt’ to calculate the square root of -1, the following code will produce a warning and return NaN (not a number), because -1 is stored as a numeric value rather than a complex one:
sqrt(-1)
## Warning: NaNs produced
## [1] NaN
However, the following call returns a complex result, because we have passed the complex number -1+0i (which has the same value as -1 but is stored as a complex number) to the function:
sqrt(-1 + (0+0i))
## [1] 0+1i
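If a value is already stored as a numeric, one way to get a complex result is to convert it first with `as.complex`. The example below is a small sketch; the variable name `w` is illustrative, and `Re`, `Im`, and `Mod` are built-in functions that return the real part, the imaginary part, and the modulus of a complex number:

```r
# convert -1 to complex before taking the square root
sqrt(as.complex(-1))
## [1] 0+1i
w = 3 + 4i
# real part
Re(w)
## [1] 3
# imaginary part
Im(w)
## [1] 4
# modulus (distance from zero in the complex plane)
Mod(w)
## [1] 5
```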
Logical variables are either TRUE or FALSE and are often used when comparing variables. For example,
x = 1
y = 2
z = x > y # Is x larger than y?
z
## [1] FALSE
# z has class 'logical' because it stores the result of a comparison (here FALSE)
class(z)
## [1] "logical"
To make comparisons, you can use ‘>’, ‘<’, ‘>=’, ‘<=’, ‘==’, or ‘!=’. The first four of these have the expected interpretation (greater than, less than, greater than or equal to, less than or equal to). Because a single equals sign ‘=’ assigns a value to a variable, R uses a double equals sign ‘==’ to test for equality. The ‘!=’ operator is used to test for inequality.
5 > 4
## [1] TRUE
3 < 5
## [1] TRUE
4 == sqrt(16)
## [1] TRUE
7 != 3
## [1] TRUE
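One caution when testing decimal values with ‘==’: numeric values are stored with finite precision, so two results that are equal on paper can differ by a tiny amount. A hedged sketch of a common workaround uses the built-in function `all.equal`, which compares within a small tolerance, wrapped in `isTRUE`:

```r
# 0.1 + 0.2 is not stored as exactly 0.3
0.1 + 0.2 == 0.3
## [1] FALSE
# compare within a small tolerance instead
isTRUE(all.equal(0.1 + 0.2, 0.3))
## [1] TRUE
```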
You can also combine multiple comparisons using ‘&’ (and) and ‘|’ (or):
a = 3
b = 5
# are both a and b positive?
(a > 0) & (b > 0)
## [1] TRUE
# is either a or b greater than 3?
(a > 3) | (b > 3)
## [1] TRUE
# is either a or b greater than 5?
(a > 5) | (b > 5) # false because b = 5
## [1] FALSE
# is either a or b greater than or equal to 5?
(a >= 5) | (b >= 5)
## [1] TRUE
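Comparisons can also be negated with ‘!’ (not), and the built-in function `xor` tests whether exactly one of two conditions is true. A brief sketch using the same values of a and b as above:

```r
a = 3
b = 5
# is a NOT greater than 3?
!(a > 3)
## [1] TRUE
# is exactly one of the two comparisons true?
xor(a > 3, b > 3)
## [1] TRUE
```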
Character variables store text strings, which are written inside quotation marks.
fname = "Joe"
class(fname)
## [1] "character"
# you can also use the function 'paste' to combine character strings
lname = "Smith"
name = paste(fname, lname)
name
## [1] "Joe Smith"
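By default, `paste` separates the pieces with a single space. The sketch below shows the `sep` argument, which changes the separator, and the related function `paste0`, which uses no separator. The built-in function `nchar` counts the characters in a string:

```r
fname = "Joe"
lname = "Smith"
# use a comma and a space as the separator
paste(lname, fname, sep = ", ")
## [1] "Smith, Joe"
# paste0 joins strings with nothing in between
paste0(fname, lname)
## [1] "JoeSmith"
# number of characters, including the space
nchar(paste(fname, lname))
## [1] 9
```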
You can also convert between data types:
# Convert a numeric variable to character
n = 3.14
class(n)
## [1] "numeric"
m = as.character(n)
m
## [1] "3.14"
# Convert back to numeric
as.numeric(m)
## [1] 3.14
# R cannot convert character strings that do not represent a number (such as
# a name, or letters mixed with digits); it returns NA with a warning.
as.numeric(fname)
## Warning: NAs introduced by coercion
## [1] NA
q = "A1"
as.numeric(q)
## Warning: NAs introduced by coercion
## [1] NA
# You can also convert numeric variables into complex variables
a = 5
as.complex(a)
## [1] 5+0i
# but converting a complex variable with a nonzero imaginary part into
# numeric discards the imaginary part and produces a warning
b = 3 + (0+3i)
as.numeric(b)
## Warning: imaginary parts discarded in coercion
## [1] 3
# Combining numeric variables and character variables with paste creates a
# character string
age = 34
sent = paste("I am", age, "years old")
sent
## [1] "I am 34 years old"
class(sent)
## [1] "character"
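A few more conversions follow the same pattern. This is a short sketch, not an exhaustive list: `as.integer` drops the fractional part rather than rounding, logical values convert to 1 and 0, and `is.na` can be used to check whether a conversion failed.

```r
# the fractional part is dropped, not rounded
as.integer(3.9)
## [1] 3
# TRUE becomes 1 and FALSE becomes 0
as.numeric(TRUE)
## [1] 1
as.numeric(FALSE)
## [1] 0
as.character(TRUE)
## [1] "TRUE"
# check for a failed conversion; suppressWarnings hides the coercion warning
is.na(suppressWarnings(as.numeric("A1")))
## [1] TRUE
```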
Finally, you can combine variables to create new variables. That is, once you have assigned a value to a variable, you can use that variable to compute the value of new variables. This is very helpful when a computation is complicated.
# set the value of the frequency of the A allele
p = 0.5
# set the values of the genotype fitnesses
W_AA = 1 # notice that you can use '_' in the names of variables
W_Aa = 2
W_aa = 1
# compute the mean population fitness
W_bar = p^2 * W_AA + 2 * p * (1 - p) * W_Aa + (1 - p)^2 * W_aa
W_bar
## [1] 1.5
# to compute the mean population fitness for a different value of p, all I
# need to do is change p and compute W_bar again
p = 0.75
W_bar = p^2 * W_AA + 2 * p * (1 - p) * W_Aa + (1 - p)^2 * W_aa
W_bar
## [1] 1.375
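If you expect to repeat this calculation for many values of p, one option is to wrap the formula in a function. The following is a minimal sketch; the function name `mean_fitness` and its argument names are hypothetical and chosen only for this example.

```r
# hypothetical helper: mean population fitness for allele frequency p
mean_fitness = function(p, W_AA, W_Aa, W_aa) {
  p^2 * W_AA + 2 * p * (1 - p) * W_Aa + (1 - p)^2 * W_aa
}
# same fitness values as above, two different allele frequencies
mean_fitness(0.5, W_AA = 1, W_Aa = 2, W_aa = 1)
## [1] 1.5
mean_fitness(0.75, W_AA = 1, W_Aa = 2, W_aa = 1)
## [1] 1.375
```

Writing the formula once means that a correction to it only has to be made in one place.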