5  Structure and Attributes

One of the most crucial parts of learning any programming language is understanding the structure of an object and how to separate or extract its components.

If you are comfortable with an object’s structure, even at a very high level, programming and returning the results of your computation become much smoother.

Anything you create in R is an object. For example, x = c(1, 2, 3) is an object whose ‘name’ is x. You can print it, view it, manipulate it, use it in computation and so on.

Let us formally discuss how to understand the structure and attributes of an object.

You can use the str() function to view the structure of an object. Attributes can be viewed using the attributes() function.

Structure vs Attributes

To understand their differences, think of structure being the skeleton (e.g., components of a car) whereas attributes are the behaviors or abstract concepts or constructs (e.g., comfort level of the seats, how it handles curves).

5.1 Structure of an atomic vector

Consider a vector object, x = c(1, 2, 3), and another vector object, y = 1:3. Let’s print them on the console, along with a character vector z.

# an object
x = c(1, 2, 3)
print(x)
[1] 1 2 3
typeof(x)
[1] "double"
# another object
y = 1:3
print(y)
[1] 1 2 3
typeof(y)
[1] "integer"
# character vector
z = c('Male', 'Female')
print(z)
[1] "Male"   "Female"
typeof(z)
[1] "character"
str(x)
 num [1:3] 1 2 3
str(y)
 int [1:3] 1 2 3
str(z)
 chr [1:2] "Male" "Female"

A plain atomic vector such as x has no attributes, so attributes(x) returns NULL.

attributes(x) # returns NULL
NULL
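
The NULL result holds only while nothing has been attached to the vector. As a minimal sketch (the element names "a", "b", "c" are made up for illustration), naming the elements gives the same vector a names attribute without changing its type:

```r
# Sketch: a plain vector has no attributes until some are attached
x = c(1, 2, 3)
is.null(attributes(x))   # TRUE: nothing attached yet

# Naming the elements adds a "names" attribute (names are hypothetical)
names(x) = c("a", "b", "c")
attributes(x)

# The data are still an atomic vector of doubles
typeof(x)
is.atomic(x)
```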

5.2 Structure of a Matrix

A matrix object has attributes. Let us create a matrix my_mat with the sequence of numbers from 1 to 10 arranged in 2 rows and 5 columns.

my_mat = matrix(1:10, nrow = 2, ncol = 5)

The type of the my_mat object is integer because we used a sequence (1:10) as its values. We will discuss sequences in more detail later in this book. For now, it is sufficient to note that the data type is integer.

typeof(my_mat)
[1] "integer"

And the structure is

str(my_mat)
 int [1:2, 1:5] 1 2 3 4 5 6 7 8 9 10

How about the attributes of my_mat?

attributes(my_mat)
$dim
[1] 2 5

A matrix object has a dim attribute which represents its dimension. We rearranged an atomic vector 1:10 into a matrix object with 2 rows and 5 columns. ‘Dimension’ is the attribute of a matrix object, and it is represented by dim.
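
One way to see that dim is what makes the difference is to attach it by hand. The sketch below uses a scratch vector v (a name chosen here for illustration) and should give the same object as the matrix() call above:

```r
# Sketch: a matrix is an atomic vector plus a dim attribute
v = 1:10
attributes(v)        # NULL: a plain integer vector

dim(v) = c(2, 5)     # attach a dim attribute: 2 rows, 5 columns
is.matrix(v)         # TRUE
attributes(v)

# Compare with building the matrix directly
identical(v, matrix(1:10, nrow = 2, ncol = 5))
```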

5.3 Structure of a List

my_list = list(
  serial = 1:5,
  age = c(10, 11, 20, 30, 32), 
  sex = c('M', 'F', 'F', 'M', 'M')
)

str(my_list)
List of 3
 $ serial: int [1:5] 1 2 3 4 5
 $ age   : num [1:5] 10 11 20 30 32
 $ sex   : chr [1:5] "M" "F" "F" "M" ...

And the attributes

attributes(my_list)
$names
[1] "serial" "age"    "sex"   
names(my_list)
[1] "serial" "age"    "sex"   
dim(my_list)
NULL
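
The names attribute is also what lets you pull individual components out of a list. The following is a small sketch using the same my_list as above; the extraction operators $, [[ ]] and [ ] are standard R, but their use here is an illustration rather than part of the original example:

```r
my_list = list(
  serial = 1:5,
  age = c(10, 11, 20, 30, 32),
  sex = c('M', 'F', 'F', 'M', 'M')
)

# Extract a component by name
age_by_dollar = my_list$age
age_by_bracket = my_list[['age']]
identical(age_by_dollar, age_by_bracket)   # TRUE

# Extract a component by position
my_list[[1]]

# Single brackets return a shorter list, not the component itself
str(my_list['age'])
```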

5.4 Structure of a data.frame

my_list = list(
  serial = 1:5,
  age = c(10, 11, 20, 30, 32), 
  sex = c('M', 'F', 'F', 'M', 'M')
)

df = data.frame(my_list)

str(df)
'data.frame':   5 obs. of  3 variables:
 $ serial: int  1 2 3 4 5
 $ age   : num  10 11 20 30 32
 $ sex   : chr  "M" "F" "F" "M" ...

Attributes

attributes(df)
$names
[1] "serial" "age"    "sex"   

$class
[1] "data.frame"

$row.names
[1] 1 2 3 4 5

5.5 Structure of Data Frame

Data frames are created using the data.frame() function by supplying a list of columns. A data.frame, as it is typically called, is of the list data type, with one important distinction: a list can have elements of unequal length, whereas in a data.frame all the elements must have the same length so that the data.frame is a true rectangular array.
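
To illustrate the length requirement, here is a hedged sketch with a hypothetical list, uneven, whose elements have lengths 5 and 2. It is valid as a list, but data.frame() is expected to reject it; tryCatch() is used only so the error message can be captured and printed instead of stopping the script:

```r
# Hypothetical list whose elements have different lengths
uneven = list(serial = 1:5, sex = c('M', 'F'))
str(uneven)   # perfectly valid as a list

# data.frame() refuses it; tryCatch() captures the error message
result = tryCatch(
  data.frame(uneven),
  error = function(e) conditionMessage(e)
)
print(result)
```

Note that data.frame() does recycle a shorter column when its length divides the longest one evenly, so the lengths here were chosen so that recycling cannot apply.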

x = c(1, 2, 3)
my_list = list(
  serial = 1:5,
  age = c(10, 11, 20, 30, 32), 
  sex = c('M', 'F', 'F', 'M', 'M')
)
df = data.frame(my_list)

df
  serial age sex
1      1  10   M
2      2  11   F
3      3  20   F
4      4  30   M
5      5  32   M

If you look at the data type of df using typeof(df), you will see it’s a list.

typeof(df)
[1] "list"
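
Because a data frame is a list underneath, list tools work on it column by column. A minimal sketch, rebuilding the same df so the block runs on its own:

```r
df = data.frame(
  serial = 1:5,
  age = c(10, 11, 20, 30, 32),
  sex = c('M', 'F', 'F', 'M', 'M')
)

is.list(df)      # TRUE
length(df)       # 3: one list element per column

# Each column is an ordinary vector and can be extracted like a list element
df$age
df[['serial']]

# Apply a function to every column, as with any list
sapply(df, typeof)
```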

To view the structure of the df object:

str(df)
'data.frame':   5 obs. of  3 variables:
 $ serial: int  1 2 3 4 5
 $ age   : num  10 11 20 30 32
 $ sex   : chr  "M" "F" "F" "M" ...

5.6 Attributes of Data Frame

As mentioned earlier, matrices and data frames are built from vectors, but they carry additional characteristics called ‘attributes’. R’s data frame is a named list of vectors with the following attributes:

  • column names (names)
  • row names (row.names)
  • class (class)

Let’s see the attributes of the df data frame object.

attributes(df)
$names
[1] "serial" "age"    "sex"   

$class
[1] "data.frame"

$row.names
[1] 1 2 3 4 5

Because these are attributes of the object, each one has a function of the same name that extracts it. Thus, to get the column names, simply use the names() function as follows.

names(df)
[1] "serial" "age"    "sex"   

Likewise, to get the row names, use row.names(df), and to get the class of the object, use class(df).

row.names(df)
[1] "1" "2" "3" "4" "5"
class(df)
[1] "data.frame"
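
The same attributes can also be read with the general attr() function and changed through the accessor functions. The sketch below works on a copy, df2, and the new column name age_years is hypothetical, chosen only to show the mechanism:

```r
df = data.frame(
  serial = 1:5,
  age = c(10, 11, 20, 30, 32),
  sex = c('M', 'F', 'F', 'M', 'M')
)

# attr() reads one attribute by name; this matches names(df)
identical(attr(df, 'names'), names(df))   # TRUE

# Assigning through names() updates the attribute (done on a copy here)
df2 = df
names(df2)[2] = 'age_years'   # hypothetical new column name
names(df2)
names(df)                     # the original is unchanged
```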

5.7 Exercise

  1. Create a matrix object and explore its attributes. What difference do you see from the attributes of a data frame?
x = matrix(1:10, ncol=2)
x
attributes(x)

  2. Create a list object and explore its attributes.

  3. Create a data frame object and explore its attributes.
