R has a wide variety of data types. First we will look at what the different data types are, and then we will move on to their applications.

So, let us get into the types of data.

If the data consists only of numbers, whether decimals or whole numbers, we call it NUMERIC DATA. In numeric data, the numbers can be positive or negative.

If the data consists **only** of whole numbers, it is called INTEGER. Integers too may take negative or positive values.

If data consists of strings, i.e., words or sentences, we call it CHARACTER.

A vector used to store categorical data, which can take only a predefined set of values, is known as a FACTOR. Its categories can be given as strings or as integers.

The type of data which can assume only two values, namely **TRUE** and **FALSE**, is called LOGICAL DATA.

## Data Types in R

A Vector is a one-dimensional sequence of elements of the same type, whereas a Matrix is two-dimensional. A Matrix is similar to a Vector, but additionally carries a dimension attribute. An Array holds multidimensional data, and a two-dimensional Array is called a Matrix.

A data frame has two dimensions and is a table-like representation of the data objects. We can have different data types in different columns.

Unlike vectors, a list can contain elements of various data types and is often known as an ordered collection of data objects.

**Numeric Data**

This is an example of Numeric Data. We are simply creating two objects, here represented as “x” and “y”, and assigning them some values.

The function **class()** is used to check the data type of an object. Here x and y are both of numeric data type.

` x<-4.5 `

` x `

[1] 4.5

```
y<-3567
y
```

[1] 3567

` class(x) `

[1] "numeric"

` class(y) `

[1] "numeric"

The class() function is used to check the data type of an object.

In the previous example, we took the value of x to be 4.5, giving us Numeric Data. Here we use the function **as.integer()** to convert the Numeric Data to Integer. The fractional part is dropped, so x becomes 4, and the data type of x is now displayed as integer.

```
x<-as.integer(x)
x
```

[1] 4

` class(x) `

[1] "integer"

### Integer

The function **as.integer()** is used to create the integer data type in R because, by default, R stores a whole number such as 8 with class numeric.

To create an integer variable in R, use the **as.integer()** function.

```
f<-as.integer(22.5)
f
```

[1] 22

` class(f) `

[1] "integer"

```
x=8
class(x)
```

[1] "numeric"

Note: By default, a whole number entered on its own has class numeric, not integer.
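The following is a minimal sketch of two related points: the `L` suffix, which is another way to create an integer, and the fact that `as.integer()` truncates toward zero rather than rounding. The object names `g`, `h`, `truncated` and `rounded` are chosen only for illustration.

```r
# An L suffix stores a whole number as an integer directly.
g <- 8L
class(g)        # "integer"

# Without the suffix, the same value is stored as numeric.
h <- 8
class(h)        # "numeric"

# as.integer() drops the fractional part (truncation toward zero).
truncated <- as.integer(c(22.5, -22.5, 22.9))
truncated       # 22 -22 22

# To round to the nearest whole number instead, call round() first.
rounded <- as.integer(round(22.9))
rounded         # 23
```

If rounding is what you need, combining `round()` with `as.integer()` avoids silently losing the fractional part.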

**Character**

As said before, Character is used to represent string data, i.e., words and sentences. String data can also contain numbers, as any value enclosed in quotes is stored as a Character object. We can also convert other forms of data into Character by using the function **as.character()**.

```
z<-"Welcome to R Ready Reckon-er"
z
```

[1] "Welcome to R Ready Reckon-er"

```
x<-"4.5"
x
```

[1] "4.5"

` class(z) `

[1] "character"

` class(x) `

[1] "character"
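As a small sketch of the conversion mentioned above, the example below turns a number into Character with `as.character()` and back again with `as.numeric()`. The object names (`num`, `txt`, `back`, `bad`) are illustrative only.

```r
# Convert a numeric value to character.
num <- 4.5
txt <- as.character(num)
txt             # "4.5"
class(txt)      # "character"

# Convert the character value back to numeric.
back <- as.numeric(txt)
class(back)     # "numeric"

# Text that does not look like a number becomes NA (R also issues a warning,
# which is suppressed here to keep the output clean).
bad <- suppressWarnings(as.numeric("four"))
is.na(bad)      # TRUE
```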

**Factor**

Factor objects are used to categorize data, and their categories can be given as Strings or as Integers. They are especially useful when a variable has a limited number of unique values. Here let us discuss three commands.

**c()** is used to combine values into a vector.

**is.factor()** is a command used to check whether a particular object is a Factor or not. It returns either **TRUE** or **FALSE**.

**is.character()** is a command used to check whether a particular object is a Character or not. Just like **is.factor()**, its output is either **TRUE** or **FALSE**.

Create an object x:

` x<-c("high", "medium", "low", "low", "medium", "high", "high", "high", "medium", "low","low") `

c() combines the values into a single vector.

Check whether object x is a factor or a character:

` is.factor(x) `

[1] FALSE

The is.factor() function returns TRUE or FALSE after checking whether the object is of type factor.

` is.character(x) `

[1] TRUE

The is.character() function returns TRUE or FALSE after checking whether the object is of type character.

To create a Factor Object, we use the command **factor()**. A Factor is a categorical variable and can only take one of a fixed finite set of possibilities. The possible categories are called Levels.

Levels are unique data values.

Using the command **levels()** we can check the levels of a Factor. In the output, by default, the Levels are arranged alphabetically.

Create a factor object using the **factor()** function:

```
x<-factor(x)
x
```

[1] high medium low low medium high high high

[9] medium low low

Levels: high low medium

` levels(x) `

[1] "high" "low" "medium"

Factor object x has 11 elements and 3 levels. By default the levels are sorted alphabetically.

The function **ordered()** is used to specify the order of a Factor.

Its **levels** argument lists the levels in the order we want.

```
x_ordered<-ordered(x, levels=c("low", "medium","high"))
x_ordered
```

[1] high medium low low medium high high high medium low low

Levels: low < medium < high
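To illustrate why the order of the levels matters, here is a short sketch that rebuilds the same ordered factor and then counts and compares its values. It assumes the same eleven values used above; the names `counts`, `codes` and `at_least_medium` are illustrative.

```r
# Rebuild the ordered factor from the same eleven values.
x <- c("high", "medium", "low", "low", "medium", "high",
       "high", "high", "medium", "low", "low")
x_ordered <- ordered(x, levels = c("low", "medium", "high"))

# table() counts how often each level occurs, in level order.
counts <- table(x_ordered)
counts                      # low 4, medium 3, high 4

# Internally each value is stored as the integer position of its level.
codes <- as.integer(x_ordered)
codes                       # 3 2 1 1 2 3 3 3 2 1 1

# Ordered factors support comparisons such as >=.
at_least_medium <- sum(x_ordered >= "medium")
at_least_medium             # 7
```

Comparisons like `>=` are only meaningful for ordered factors; with a plain factor R returns NA and warns.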

**Logical**

Logical type objects take the values TRUE and FALSE.

The command **is.integer()** is used to check whether a particular object is an Integer or not. It has two possible outputs, TRUE and FALSE.

R can also evaluate a logical question, i.e., one whose answer is either TRUE or FALSE, and store the answer as an object.

```
# Create an object x, assign it the value 4.5, and check whether it is an integer
x<-4.5
is.integer(x)
```

[1] FALSE

The is.integer() function checks whether the object is an integer or not.

```
# Create two numeric objects y and z
y<-4
z<-7
# Check whether y is greater than z or not
Result <- y > z
Result
```

[1] FALSE

With this kind of statement, you are asking R to evaluate the logical question “Is it true that y is greater than z?”

The object Result, which stores the answer to this question, is of type logical.

You can check the class of the object using **class()**.
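The sketch below confirms the class of `Result` and shows two common uses of logical values: counting how many comparisons are TRUE, and combining conditions with `&`. The vector `scores` and the names `above_five` and `in_range` are hypothetical examples.

```r
# Store the answer to a logical question and check its class.
y <- 4
z <- 7
Result <- y > z
class(Result)           # "logical"

# A comparison on a vector gives one TRUE/FALSE per element.
scores <- c(4, 24, 6, 4, 2, 7)
above_five <- scores > 5

# In arithmetic, TRUE counts as 1 and FALSE as 0.
sum(above_five)         # 3  (number of values above 5)
mean(above_five)        # 0.5 (share of values above 5)

# Conditions can be combined with & (and), | (or) and ! (not).
in_range <- scores > 5 & scores < 10
in_range                # FALSE FALSE TRUE FALSE FALSE TRUE
```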

### Vector

As mentioned before, Vectors are one-dimensional and contain data of a single type. Three common types of vector are shown here.

A Numeric Vector consists of Numeric Data.

A Character Vector consists of Character Data.

A Logical Vector consists of TRUE and FALSE values, typically the result of a comparison or other logical statement.

```
# Numeric vector
a <- c(1,2,5.3,6,-2,4)
a
```

[1] 1.0 2.0 5.3 6.0 -2.0 4.0

```
# Character vector
b <- c("one","two","three")
b
```

[1] "one" "two" "three"

```
# Logical vector: compare each element of d with 5
d<-c(4,24,6,4, 2,7)
d>5
```

[1] FALSE TRUE TRUE FALSE FALSE TRUE
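Because a vector holds a single type, combining values of different types with `c()` converts them to a common type. The sketch below shows this, and also how a logical vector can be used to pick out elements. The names `mixed`, `num_lgl` and `big` are illustrative.

```r
# Mixing numbers, text and logicals gives a character vector.
mixed <- c(1, "two", TRUE)
mixed               # "1" "two" "TRUE"
class(mixed)        # "character"

# Mixing numbers and logicals gives a numeric vector (TRUE -> 1, FALSE -> 0).
num_lgl <- c(1.5, TRUE, FALSE)
num_lgl             # 1.5 1.0 0.0
class(num_lgl)      # "numeric"

# A logical vector inside [] keeps only the elements where it is TRUE.
d <- c(4, 24, 6, 4, 2, 7)
big <- d[d > 5]
big                 # 24 6 7
```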

### Matrix

A Matrix is two-dimensional and carries a dimension attribute.

Many objects, such as vectors and data frames, can be converted into a Matrix by using the function **as.matrix()**.

The **matrix()** function is used to create a matrix.

The arguments **nrow** and **ncol** are used to specify the number of rows and columns of the Matrix respectively.

While composing a Matrix, the argument **byrow=TRUE** is used to fill the Matrix row-wise. By default, the matrix is filled column-wise.

Create a matrix with 3 rows and 2 columns.

```
x<-matrix(c(2, 3, 4, 5, 6, 7),nrow=3,ncol=2)
x
```

[,1] [,2]

[1,] 2 5

[2,] 3 6

[3,] 4 7


nrow= and ncol= are used to specify the dimensions of the matrix.

Note that the matrix is filled column-wise by default.

```
x<-matrix(c(2, 3, 4, 5, 6, 7),nrow=3,ncol=2,byrow=TRUE)
x
```

[,1] [,2]

[1,] 2 3

[2,] 4 5

[3,] 6 7

byrow=TRUE fills the matrix row-wise

The argument **dimnames** can be used to name the rows and columns of a Matrix. The dimension names can be changed and/or accessed by using the functions **colnames()** and **rownames()**.

```
x<-matrix(c(2, 3, 4, 5, 6, 7),nrow=3,ncol=2,byrow=TRUE,
dimnames=list(c("X","Y","Z"), c("A","B")))
x
```

A B

X 2 3

Y 4 5

Z 6 7

Dimension names can be accessed or changed with two helpful functions, colnames() and rownames():

` colnames(x) `

[1] "A" "B"

` rownames(x) `

[1] "X" "Y" "Z"

` colnames(x) <- c("a","b") `

` colnames(x) `

[1] "a" "b"

Row names can be changed in a similar manner with rownames().
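As a brief sketch of the same idea for rows, the example below rebuilds the named matrix, replaces its row names, and then uses the names to pick out a single cell. The new labels `r1`, `r2`, `r3` are hypothetical.

```r
# Rebuild the matrix with named rows and columns.
x <- matrix(c(2, 3, 4, 5, 6, 7), nrow = 3, ncol = 2, byrow = TRUE,
            dimnames = list(c("X", "Y", "Z"), c("A", "B")))

# Replace the row names.
rownames(x) <- c("r1", "r2", "r3")
rownames(x)         # "r1" "r2" "r3"

# Names can be used instead of positions when indexing.
cell <- x["r2", "B"]
cell                # 5
```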

Another useful way of composing a Matrix is with the commands **cbind()** and **rbind()**. In the example below, cbind() creates a 3×2 matrix with the elements 2, 3, 4 in the first column and 5, 6, 7 in the second column.

rbind() creates a 3×2 matrix from the same values, filling them in row-wise.

```
cbind(c(2,3,4),c(5,6,7))
rbind(c(2,3),c(4,5), c(6,7))
```
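The following sketch checks what the two calls produce by comparing them with the matrices built earlier using `matrix()`. The names `by_col`, `by_row` and `col_vec` are illustrative.

```r
# cbind() treats each vector as a column; rbind() treats each as a row.
by_col <- cbind(c(2, 3, 4), c(5, 6, 7))
by_row <- rbind(c(2, 3), c(4, 5), c(6, 7))

# Both results have 3 rows and 2 columns.
dim(by_col)         # 3 2
dim(by_row)         # 3 2

# by_col matches the column-wise matrix, by_row the row-wise one.
all(by_col == matrix(c(2, 3, 4, 5, 6, 7), nrow = 3, ncol = 2))                # TRUE
all(by_row == matrix(c(2, 3, 4, 5, 6, 7), nrow = 3, ncol = 2, byrow = TRUE))  # TRUE

# as.matrix() turns a plain vector into a one-column matrix.
col_vec <- as.matrix(c(2, 3, 4))
dim(col_vec)        # 3 1
```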

### Arrays

In an Array, each row is of the same length and each column is also of the same length. So, we say it holds multidimensional rectangular data.

Creating an Array is simple. We use the command **array(data, dim = c(r,c,t))**, where “r” represents the number of rows of the Array, “c” the number of columns, and “t” the number of tables. By default R fills the array column-wise, even though the first dimension in our command is that of rows: each column of the first table is filled from top to bottom, then the next column, and once a table is full R moves on to the next table.

```
a<-array(1:24,dim=c(3,4,2))
a
```

, , 1

[,1] [,2] [,3] [,4]

[1,] 1 4 7 10

[2,] 2 5 8 11

[3,] 3 6 9 12

, , 2

[,1] [,2] [,3] [,4]

[1,] 13 16 19 22

[2,] 14 17 20 23

[3,] 15 18 21 24

array(data, dim = c(r,c,t) )

r = no. of rows

c = no. of columns

t = no. of tables

Note: Although the rows are given as the first dimension, the tables are filled column-wise. So, for arrays, R fills the columns, then the rows, and then the rest.
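To make the fill order concrete, the sketch below rebuilds the same array and reads values back by row, column and table index. The names `second_table` and `first_four` are illustrative.

```r
# Rebuild the 3 x 4 x 2 array.
a <- array(1:24, dim = c(3, 4, 2))
dim(a)              # 3 4 2

# Index as a[row, column, table].
a[2, 3, 1]          # 8
a[2, 3, 2]          # 20

# Leaving an index empty selects everything along that dimension,
# so this returns the whole second table as a 3 x 4 matrix.
second_table <- a[, , 2]
dim(second_table)   # 3 4

# The first four stored values run down column 1 and into column 2.
first_four <- a[1:4]
first_four          # 1 2 3 4
```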

**Data Frames**

A Data Frame is a two-dimensional data structure, similar to a two-dimensional Array. It is, actually, a list of vectors of equal length. Data Frames are the primary structure for tabular data in R.

Many objects can be converted into a Data Frame by using the function **as.data.frame()**.

Suppose we have three vectors x, y, z. We can use the function **data.frame()** to combine these vectors to form a Data Frame.

As said before, Matrices are also two-dimensional and, like Vectors, hold elements of a single type.

So what’s the difference between Data Frames and Matrices?

Data Frames can contain heterogeneous data across their columns or variables, whereas Matrices contain only homogeneous data.

```
x<-c(12,23,45)
y<-c(13,21,6)
z<-c("a","b","c")
```

This creates the vectors x, y, z.

```
data<-data.frame(x,y,z)
data
```

The data.frame() function combines them in a table.

The object data is a data frame containing the three vectors x, y, z.

x y z

1 12 13 a

2 23 21 b

3 45 6 c

The function **str()** displays the structure of an object. In R versions before 4.0.0, R by default transforms Character Vectors or a Character Matrix to Factors while creating a Data Frame, as in the output shown here. To avoid errors with respect to this, we can specify **stringsAsFactors=FALSE** while creating the Data Frame. Since R 4.0.0 this is the default, and character columns stay as Character unless **stringsAsFactors=TRUE** is given.

` str(data) `

'data.frame': 3 obs. of 3 variables:

$ x: num 12 23 45

$ y: num 13 21 6

$ z: Factor w/ 3 levels "a","b","c": 1 2 3

str() shows the structure of an object.

Note: z is a character vector, but in R versions before 4.0.0 it is stored in the data frame as a factor by default.
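Since the default for `stringsAsFactors` differs between R versions (it became FALSE in R 4.0.0), one way to get predictable results is to set it explicitly. The sketch below does this both ways and also shows how to read values from a data frame. The names `data_chr` and `data_fct` are illustrative.

```r
x <- c(12, 23, 45)
y <- c(13, 21, 6)
z <- c("a", "b", "c")

# Keep z as a character column.
data_chr <- data.frame(x, y, z, stringsAsFactors = FALSE)
class(data_chr$z)       # "character"

# Convert z to a factor column.
data_fct <- data.frame(x, y, z, stringsAsFactors = TRUE)
class(data_fct$z)       # "factor"

# Columns are accessed with $, single cells with [row, column].
data_chr$x              # 12 23 45
data_chr[2, "y"]        # 21

# nrow() and ncol() report the size of the table.
nrow(data_chr)          # 3
ncol(data_chr)          # 3
```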

**Lists**

A data structure containing mixed data types is called a List. We can convert any object into a list by using the function **as.list()**. We can also create lists using the function **list()**.

```
n=c(2, 3, 5)
s=c("aa", "bb", "cc", "dd", "ee")
x=list(n, s, 3)
x
```

[[1]]

[1] 2 3 5

[[2]]

[1] "aa" "bb" "cc" "dd" "ee"

[[3]]

[1] 3

list() is used to create lists.

A part of a list can be retrieved by enclosing an index vector in the square bracket operator **[]**. The result is itself a list, and list indexing starts at 1.

` x[2]`

[[1]]

[1] "aa" "bb" "cc" "dd" "ee"

` x[c(2, 3)] `

[[1]]

[1] "aa" "bb" "cc" "dd" "ee"

[[2]]

[1] 3
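Single brackets return a smaller list, as the output above shows. To get at the element itself, R also provides double brackets `[[ ]]`. The sketch below contrasts the two and adds a named list; the names `numbers` and `strings` are hypothetical labels chosen for illustration.

```r
n <- c(2, 3, 5)
s <- c("aa", "bb", "cc", "dd", "ee")
x <- list(n, s, 3)

# Single brackets return a list containing the selected element.
class(x[2])         # "list"

# Double brackets return the element itself.
class(x[[2]])       # "character"
x[[2]][1]           # "aa"

# List elements can be named and then accessed with $.
named <- list(numbers = n, strings = s)
named$numbers       # 2 3 5

# as.list() turns each element of a vector into a list element.
length(as.list(1:3))   # 3
```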

To recap, we’ve discussed the different data types in R and how to convert from one data type to another. This tutorial is based on lessons from the Data Analytics in R unit of the Digita Schools Advanced Diploma in Data Analytics and Postgraduate Diploma in Data Science.