M6 QA Univ Sol
Questions on Module 6
i. Create a subset of subjects less than 4 by using subset() function and demonstrate the
output.
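# Create a sample data frame with the given marks values
data <- data.frame(
  subject = c(1, 2, 3, 4, 5, 6),
  class = c(1, 2, 1, 2, 1, 2),
  marks = c(56, 75, 48, 69, 84, 53)
)
# Keep only the rows where subject is less than 4
subset_data <- subset(data, subject < 4)
print(subset_data)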
This code creates a data frame with the specified marks values and then creates
a subset where the "subject" is less than 4.
ii. Create a subset where the subject column is less than 3 and the class equals to 2 by using
[ ] brackets and demonstrate the output.
In this example, data$subject < 3 checks if the "subject" is less than 3, and data$class == 2 checks if
the "class" is equal to 2. The resulting subset includes rows where both conditions are true. The
subset is created using square brackets [].
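A minimal sketch of the bracket-based subset described above, reusing the sample data frame from part i. The variable name subset_brackets is only an illustrative choice.

```r
# Sample data frame from part i
data <- data.frame(
  subject = c(1, 2, 3, 4, 5, 6),
  class = c(1, 2, 1, 2, 1, 2),
  marks = c(56, 75, 48, 69, 84, 53)
)

# Rows where subject < 3 AND class == 2; the empty slot after the comma keeps all columns
subset_brackets <- data[data$subject < 3 & data$class == 2, ]
print(subset_brackets)
#   subject class marks
# 2       2     2    75
```

Only subject 2 satisfies both conditions, so the result has a single row.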
2 The data analyst of Argon technology, Mr. John, needs to enter the salaries of 10 employees in R. The
salaries of the employees are given in the following table: [Dec 2024, 10 marks]
i. Which R commands will Mr. John use to enter these values? Demonstrate the output.
# Create a data frame with the given records
employee_data <- data.frame(
  sr_number = 1:10,
  name = c("Vivek", "Karan", "James", "Soham", "Renu", "Farah", "Hetal", "Mary",
           "Ganesh", "Krish"),
  salary = c(21000, 55000, 67000, 50000, 54000, 40000, 30000, 70000, 20000, 15000)
)
print("Employee Dataset:")
print(employee_data)
ii. Now Mr. John wants to add the salaries of 5 new employees to the existing table. Which
commands will he use to join the datasets with the new values in R? Demonstrate the output.
In this demonstration, the rbind() function is used to combine the existing dataset
(employee_data) with the data frame of salaries for 5 new employees (new_employees).
The result is a combined dataset (combined_data) with the salaries of all 15 employees.
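The rbind() step could look like the sketch below. The names and salaries of the 5 new employees are not given in the question, so the values in new_employees are hypothetical placeholders; only the object names employee_data, new_employees and combined_data come from the explanation above.

```r
# Existing dataset of 10 employees
employee_data <- data.frame(
  sr_number = 1:10,
  name = c("Vivek", "Karan", "James", "Soham", "Renu", "Farah", "Hetal", "Mary",
           "Ganesh", "Krish"),
  salary = c(21000, 55000, 67000, 50000, 54000, 40000, 30000, 70000, 20000, 15000)
)

# Hypothetical records for the 5 new employees (placeholder values, not from the question)
new_employees <- data.frame(
  sr_number = 11:15,
  name = c("Emp11", "Emp12", "Emp13", "Emp14", "Emp15"),
  salary = c(25000, 32000, 41000, 28000, 36000)
)

# rbind() stacks the rows; both data frames must have the same column names
combined_data <- rbind(employee_data, new_employees)
print(combined_data)
```

Because rbind() matches data frame columns by name, new_employees is built with exactly the same three columns as employee_data.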
3 i. Write the script to sort the values contained in the following vector in ascending order
and descending order: (23, 45, 10, 34, 89, 20, 67, 99). Demonstrate the output.
ii. Name and explain the operators used to form data subsets in R. [Dec 2022, 10 marks]
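For part i, a minimal sketch using sort(); the variable names values, ascending and descending are illustrative.

```r
# Vector given in the question
values <- c(23, 45, 10, 34, 89, 20, 67, 99)

# Ascending order is the default
ascending <- sort(values)
print(ascending)
# [1] 10 20 23 34 45 67 89 99

# decreasing = TRUE reverses the order
descending <- sort(values, decreasing = TRUE)
print(descending)
# [1] 99 89 67 45 34 23 20 10
```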
ii) Name and explain the operators used to form data subsets in R
V <- c(1,2,3,4,5,6)
subset(V, V<4)
Sample Output
[1] 1 2 3
4 Consider the data frame given below: [May 2023, 10 Marks] [May 2024, 10 Marks]
i. Create a subset of course less than 3 by using [ ] brackets and demonstrate the output.
# Subset using []
subset_course_less_than_3 <- course_data[course_data$course < 3, ]
Here the square brackets [] are used to select the rows where the course column is
less than 3.
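A self-contained version of this subset is sketched below. The data frame itself is not reproduced with the question, so the values are assumed to match the course_data example (course, id, class) that appears later in these notes.

```r
# Assumed contents of course_data, taken from the example used later in these notes
course_data <- data.frame(
  course = c(1, 2, 3, 4, 5, 6),
  id = c(11, 12, 13, 14, 15, 16),
  class = c(1, 2, 1, 2, 1, 2)
)

# Subset using []: rows where course is less than 3, all columns
subset_course_less_than_3 <- course_data[course_data$course < 3, ]
print(subset_course_less_than_3)
#   course id class
# 1      1 11     1
# 2      2 12     2
```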
ii. Create a subset where the course column is less than 3 or the class equals to 2 by using
subset() function and demonstrate the output.
The subset() function is used to select the rows where either the "course" column is less than 3
or the "class" column equals 2. The two conditions are joined with the OR operator (|).
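A possible sketch of this subset() call follows. As before, the contents of course_data are an assumption based on the example that appears later in these notes, and the name subset_or is illustrative.

```r
# Assumed contents of course_data, taken from the example used later in these notes
course_data <- data.frame(
  course = c(1, 2, 3, 4, 5, 6),
  id = c(11, 12, 13, 14, 15, 16),
  class = c(1, 2, 1, 2, 1, 2)
)

# Rows where course < 3 OR class == 2
subset_or <- subset(course_data, course < 3 | class == 2)
print(subset_or)
#   course id class
# 1      1 11     1
# 2      2 12     2
# 4      4 14     2
# 6      6 16     2
```

Row 1 qualifies only through the course condition, rows 4 and 6 only through the class condition, and row 2 satisfies both.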
5 i. The following table shows the number of units of different products sold on different
days: [May 2023, 10 Marks]
Product          Monday  Tuesday  Wednesday  Thursday  Friday
Bread              12       3         5         11        9
Milk               21      27        18         20       15
Cola Cans          10       1        33          6       12
Chocolate Bars      6       7         4         13       12
Detergent           5       8        12         20       23
In the given sales data table, each row represents a different product, and each column from Monday
to Friday represents the number of units sold for that product on each respective day. The sample
numeric vectors are randomly selected columns from this table, representing the sales data for a
particular day across all products. These vectors can be used for further analysis, such as calculating
daily averages or comparing the sales performance of different products on a specific day.
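The exact wording of the task for this table is not preserved here, so the sketch below assumes it asks for one numeric vector per day. The vector names (products, monday, and so on) are illustrative; the numbers are read column by column from the table.

```r
# Product names, in table order
products <- c("Bread", "Milk", "Cola Cans", "Chocolate Bars", "Detergent")

# One numeric vector per day (each is a column of the sales table)
monday    <- c(12, 21, 10, 6, 5)
tuesday   <- c(3, 27, 1, 7, 8)
wednesday <- c(5, 18, 33, 4, 12)
thursday  <- c(11, 20, 6, 13, 20)
friday    <- c(9, 15, 12, 12, 23)

# Label the elements so each value can be looked up by product
names(monday) <- products
print(monday)

# Example of further analysis: average units sold per product on Monday
monday_mean <- mean(monday)
print(monday_mean)
# [1] 10.8
```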
ii. Name and explain the operators used to form data subsets in R
In R, the primary operators and functions for forming data subsets include:
1. Square Brackets [] Operator:
● Syntax: data[rows, columns]
● Explanation: Used to subset data frames or matrices based on specific row and column
indices.
2. Logical Operators (&, |, !):
● Syntax: data[logical_condition, ]
● Explanation: Logical operators are employed to create conditions for subsetting data based on
specific criteria.
3. subset() Function:
● Syntax: subset(data, logical_condition, select = c(columns))
● Explanation: The subset() function is designed for concise subsetting of data frames using
logical conditions.
4. %in% Operator:
● Syntax: data[data$column %in% c(values), ]
● Explanation: Checks if elements in a column are present in a specified set of values,
commonly used for categorical variables.
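The four approaches listed above can be sketched on one small data frame. The data frame reuses the marks example from Question 1; the result names (by_index, by_logic, by_subset, by_in) are illustrative.

```r
data <- data.frame(
  subject = c(1, 2, 3, 4, 5, 6),
  class = c(1, 2, 1, 2, 1, 2),
  marks = c(56, 75, 48, 69, 84, 53)
)

# 1. Square brackets: first two rows, two named columns
by_index <- data[1:2, c("subject", "marks")]

# 2. Logical operators: marks above 50 AND class not equal to 1
by_logic <- data[data$marks > 50 & !(data$class == 1), ]

# 3. subset(): rows with marks above 60, keeping only two columns
by_subset <- subset(data, marks > 60, select = c(subject, marks))

# 4. %in%: rows whose subject is one of the listed values
by_in <- data[data$subject %in% c(2, 5), ]

print(by_index)
print(by_logic)
print(by_subset)
print(by_in)
```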
6 i. Create a data frame from the following 4 vectors and demonstrate the output:
emp_id = c(1:5)
emp_name = c("Rick", "Dan", "Michelle", "Ryan", "Gary")
start_date = c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11", "2015-03-27")
salary = c(60000, 45000, 75000, 84000, 20000)
v. Extract the employee details whose salary is less than or equal to 60000.
# Step v: Extract employee details whose salary is less than or equal to 60000
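A minimal sketch of parts i and v together: build the data frame from the four vectors, then extract the rows with a salary of 60000 or less. The names emp_data and low_salary are illustrative, and the dates are kept as plain character strings, as given in the question.

```r
# Part i: the four vectors from the question
emp_id <- c(1:5)
emp_name <- c("Rick", "Dan", "Michelle", "Ryan", "Gary")
start_date <- c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11", "2015-03-27")
salary <- c(60000, 45000, 75000, 84000, 20000)

# Combine the vectors into a data frame; each vector becomes a column
emp_data <- data.frame(emp_id, emp_name, start_date, salary)
print(emp_data)

# Part v: employees whose salary is less than or equal to 60000
low_salary <- subset(emp_data, salary <= 60000)
print(low_salary)
```

Rick, Dan and Gary meet the condition, so low_salary has three rows.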
7 List and explain various functions that allow users to handle data in R workspace with appropriate
examples. [Dec 2023, 10 Marks]
Solution
R provides a range of functions that allow users to handle data effectively in the workspace. Here’s an
explanation of some of these functions with appropriate examples:
1. Vectors
A vector is the most basic data type in R and represents a sequence of data elements of the same type
(numeric, character, or logical). You can create and manipulate vectors using various functions.
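As a brief illustration, the sketch below creates one vector of each type and applies a few common functions. The values and names are illustrative.

```r
# Numeric, character and logical vectors created with c()
marks <- c(56, 75, 48, 69, 84, 53)
subjects <- c("Maths", "Physics", "Chemistry")
passed <- marks >= 50          # element-wise comparison gives a logical vector

print(length(marks))           # number of elements
print(marks[2])                # indexing starts at 1
print(class(subjects))         # "character"
print(sum(passed))             # TRUE counts as 1, so this counts the passes
```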
A data frame is a two-dimensional data structure in R where each column can contain data of different
types (e.g., numeric, character, logical). It's one of the most commonly used data structures in R for
handling datasets.
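A short sketch of creating and inspecting a data frame follows; the three-row dataset emp is illustrative.

```r
# Each column can hold a different type
emp <- data.frame(
  emp_id = 1:3,
  emp_name = c("Rick", "Dan", "Michelle"),
  salary = c(60000, 45000, 75000)
)

str(emp)            # structure: column names, types and first values
print(dim(emp))     # number of rows and columns
print(emp$salary)   # access a single column with $
```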
The subset() function in R allows you to select specific rows and columns from a data frame or matrix
based on conditions.
The sort() function is used to sort a vector in ascending or descending order (decreasing = TRUE). To sort the rows of a data frame, index it with order() instead.
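The sketch below shows both cases: sort() applied to a vector and order() used to rearrange the rows of a data frame, since sort() is meant for vectors rather than data frames. The data and names are illustrative.

```r
marks <- c(56, 75, 48, 69, 84, 53)

# sort() returns the sorted values of a vector
sorted_marks <- sort(marks)
print(sorted_marks)
# [1] 48 53 56 69 75 84

# order() returns the row positions in sorted order, which can index a data frame
df <- data.frame(subject = 1:6, marks = marks)
df_sorted <- df[order(df$marks, decreasing = TRUE), ]
print(df_sorted)
```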
The merge() function is used to join the data contained in two data frames on the basis of
one or more common columns (or the row names).
It combines the rows of the two data frames whose values match in the common column. The
merge() function takes the following arguments:
x, y - The two data frames to be merged.
by, by.x, by.y - The names of the columns used for matching.
all, all.x, all.y - Logical values specifying the type of merge. The default is all = FALSE, which keeps only the rows that match in both data frames.
Example
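A minimal sketch of a default merge on a common column. The two data frames, students and scores, are hypothetical.

```r
# Two hypothetical data frames that share the column "id"
students <- data.frame(id = c(11, 12, 13, 14), course = c(1, 2, 3, 4))
scores <- data.frame(id = c(12, 13, 14, 15), marks = c(75, 48, 69, 84))

# Default merge (all = FALSE): keeps only the ids present in both data frames
merged <- merge(students, scores, by = "id")
print(merged)
#   id course marks
# 1 12      2    75
# 2 13      3    48
# 3 14      4    69
```

The ids 11 and 15 are dropped because each appears in only one of the two data frames.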
The cbind() function is used to add columns to a dataset. The datasets being combined must have
the same number of rows, in the same order.
It binds the columns of two datasets side by side. Because you choose which columns to bind,
it also lets you restrict the columns included in the new dataset.
Example
course_data <- data.frame(
  course = c(1, 2, 3, 4, 5, 6),
  class = c(1, 2, 1, 2, 1, 2)
)
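To show cbind() on this example, the sketch below binds an extra column to course_data. The marks_data data frame is a hypothetical addition, with values borrowed from the marks example in Question 1.

```r
course_data <- data.frame(
  course = c(1, 2, 3, 4, 5, 6),
  class = c(1, 2, 1, 2, 1, 2)
)

# Hypothetical second dataset with the same number of rows, in the same order
marks_data <- data.frame(marks = c(56, 75, 48, 69, 84, 53))

# cbind() places the columns side by side
combined <- cbind(course_data, marks_data)
print(combined)
```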
The rbind() function is used to add rows to datasets having an equal number of columns (for data frames, the column names must also match).
Solution:
In R, both functions and scripts can be used to execute code. However, functions offer several
advantages over scripts, especially when it comes to reusability, readability, and maintainability.
Functions: Once a function is defined, it can be reused multiple times throughout the script or in other
scripts by simply calling it with appropriate arguments. This prevents redundancy and avoids having to
write the same code repeatedly.
Scripts: Scripts are typically a collection of commands that run sequentially, but if you need the same
block of code again, you’d have to rewrite or copy-paste it.
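The reusability point can be sketched with a small helper function. The function subset_below() and its arguments are hypothetical, written only for this illustration.

```r
# Hypothetical helper: rows of df where the named column is below a limit
subset_below <- function(df, column, limit) {
  df[df[[column]] < limit, ]
}

course_data <- data.frame(
  course = c(1, 2, 3, 4, 5, 6),
  class = c(1, 2, 1, 2, 1, 2)
)

# The same function is reused with different arguments instead of copying code
below_3 <- subset_below(course_data, "course", 3)
below_5 <- subset_below(course_data, "course", 5)
print(below_3)
print(below_5)
```

In a plain script, each of these subsets would need its own copy of the bracket expression.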
ii.
# Dataset A
A <- c(6, 7, 8, 9)
# Dataset B
B <- c(1, 2, 4, 5)
# Combine A and B
C <- c(A, B)
print(C)
Output:
[1] 6 7 8 9 1 2 4 5
To combine two datasets (vectors, matrices, or data frames) into one in R, you can use functions like
c() for vectors or rbind() and cbind() for matrices and data frames.
Applications of visualization
Consider the data frame given below: [May 2024, 10 Marks]
i. Create a subset of course less than 5 by using [ ] brackets and demonstrate the output.
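# Creating the data frame with the given information
# (values as in the course_data example used elsewhere in these notes)
course_data <- data.frame(
  course = c(1, 2, 3, 4, 5, 6),
  id = c(11, 12, 13, 14, 15, 16),
  class = c(1, 2, 1, 2, 1, 2)
)
# Subset using []
subset_course_less_than_5 <- course_data[course_data$course < 5, ]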
ii. Create a subset where the course column is less than 4 or the class equals to 1 by using
subset() function and demonstrate the output.
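One way to write this subset is sketched below. The contents of course_data are assumed to match the example that appears elsewhere in these notes, and the name subset_or is illustrative.

```r
# Assumed contents of course_data, taken from the example used elsewhere in these notes
course_data <- data.frame(
  course = c(1, 2, 3, 4, 5, 6),
  id = c(11, 12, 13, 14, 15, 16),
  class = c(1, 2, 1, 2, 1, 2)
)

# Rows where course < 4 OR class == 1
subset_or <- subset(course_data, course < 4 | class == 1)
print(subset_or)
#   course id class
# 1      1 11     1
# 2      2 12     2
# 3      3 13     1
# 5      5 15     1
```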
ii. Explain the various functions provided by R to combine different sets of data.
Solution
Script to create dataset named data1
output
Solution:
The functions R provides to combine different datasets are:
merge() function
cbind() function
rbind() function
The merge() function is used to join the data contained in two data frames on the basis of
one or more common columns (or the row names).
It combines the rows of the two data frames whose values match in the common column. Its main
arguments are the two data frames (x, y), the matching columns (by, by.x, by.y) and the type of
merge (all, all.x, all.y).
Example
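The effect of the all.x and all arguments can be sketched as follows. The data frames students and scores are hypothetical.

```r
# Two hypothetical data frames that share the column "id"
students <- data.frame(id = c(11, 12, 13, 14), course = c(1, 2, 3, 4))
scores <- data.frame(id = c(12, 13, 14, 15), marks = c(75, 48, 69, 84))

# all.x = TRUE keeps every row of the first data frame; unmatched marks become NA
left_merge <- merge(students, scores, by = "id", all.x = TRUE)
print(left_merge)

# all = TRUE keeps every row of both data frames
full_merge <- merge(students, scores, by = "id", all = TRUE)
print(full_merge)
```

Here left_merge has four rows (id 11 has NA marks) and full_merge has five rows (id 15 has NA course).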
The cbind() function is used to add columns to a dataset. The datasets being combined must have
the same number of rows, in the same order.
It binds the columns of two datasets side by side. Because you choose which columns to bind,
it also lets you restrict the columns included in the new dataset.
Example
course_data <- data.frame(
  course = c(1, 2, 3, 4, 5, 6),
  id = c(11, 12, 13, 14, 15, 16),
  class = c(1, 2, 1, 2, 1, 2)
)
The rbind() function is used to add rows to datasets having an equal number of columns (for data frames, the column names must also match).
rbind(data1, data2)
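A runnable version of this call is sketched below. The contents of data1 and data2 are not shown in these notes, so the two data frames here are hypothetical halves of the course example.

```r
# Hypothetical datasets with identical column names
data1 <- data.frame(course = c(1, 2, 3), class = c(1, 2, 1))
data2 <- data.frame(course = c(4, 5, 6), class = c(2, 1, 2))

# rbind() appends the rows of data2 below the rows of data1
all_rows <- rbind(data1, data2)
print(all_rows)
```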