---
title: "Assignment 1"
author: "Burkay Genç"
date: "March 26, 2018"
output:
  html_document:
    theme: cerulean
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, eval = TRUE)
```

## Question 1 (5 points)

Do the following in R:

- Assign 8 to `p`
- Assign 6 to `q`
- Swap the values of `p` and `q`. You are not allowed to directly assign values. You have to "swap" them!

```{r}
# Solution 1 : No extra variables used
p <- 8
q <- 6
p <- p + q   # p now holds the sum of both values
q <- p - q   # q becomes the original p
p <- p - q   # p becomes the original q
p
q
```

```{r}
# Solution 2 : Temporary variable used
p <- 8
q <- 6
temp <- p
p <- q
q <- temp
p
q
```

## Question 2 (10 points)

- Create a vector of the populations of the 10 largest cities in Turkey.
- Name your vector with the names of the cities.
- Print the names of the cities that have a population between 2 million and 3 million.

```{r}
# Write your answer here
city.pops <- c(14160467, 5045083, 4061074, 2740970, 2158265,
               2149260, 2079225, 1844438, 1801980, 1705774)
names(city.pops) <- c("İstanbul", "Ankara", "İzmir", "Bursa", "Antalya",
                      "Adana", "Konya", "Gaziantep", "Şanlıurfa", "Mersin")
names(city.pops)[city.pops > 2000000 & city.pops < 3000000]
```

## Question 3 (10 points)

- Create a matrix as follows:
    - First row consists of numbers: {1,2,3,4,5,6}
    - Second row consists of numbers: {2,4,6,8,10,12}
    - Third row consists of numbers: {1,3,5,7,9,11}
    - Fourth row consists of the sum of the second and third rows
    - Fifth row consists of the division of the fourth row by the first row
- Swap the columns of the matrix so that the first row reads: {1,3,5,2,4,6}

```{r}
# Write your answer here
first <- 1:6
second <- first * 2
third <- second - 1
fourth <- second + third
fifth <- fourth / first
m <- matrix(c(first, second, third, fourth, fifth), nrow = 5, byrow = TRUE)
m
# Reorder the columns: odd-numbered columns first, then even-numbered ones
m <- m[, c(1, 3, 5, 2, 4, 6)]
m
```

## Question 4 (10 points)

- Create a factor from the following vector: `{"red", "red", "blue", "brown", "green", "blue", "red", "green", "green", "brown", "red", "blue"}`
- Display the frequencies of each factor value (level)
- Rename `"red"` as `"purple"`
- Display the number of "purples"

```{r}
# Write your answer here
f <- factor(c("red", "red", "blue", "brown", "green", "blue",
              "red", "green", "green", "brown", "red", "blue"))
table(f)
index.of.red <- which(levels(f) == "red")
levels(f)[index.of.red] <- "purple"
table(f)["purple"]
```

## Question 5 (20 points)

- Create a data frame for the following girls. You must choose the correct column types:
    - Canan is 24 years old, blonde, 170 cm and 56 kg. She is married.
    - Deniz is 35 years old, has brown hair, 173 cm and 61 kg. She is married.
    - Eda is 21 years old, has brown hair, 156 cm and 45 kg. She is not married.
    - Fatma is 40 years old, has black hair, 164 cm and 60 kg. She is married.
    - Gonca is 33 years old, blonde, 182 cm and 65 kg. She is not married.
    - Hilal is 45 years old, has black hair, 165 cm and 58 kg. She is married.
    - Lale is 38 years old, has black hair, 175 cm and 59 kg. She is not married.
    - Mine is 28 years old, has brown hair, 190 cm and 71 kg. She is not married.
- Answer the following questions based on this data frame:
    - What is the average age of the group?
    - How many girls are above the average height?
    - What is the most frequent hair color?
    - What is the average height of girls above 60 kg?
    - Compare the height/weight ratio of married and single girls. Which is higher?

```{r}
# Write your answer here
name <- c("Canan", "Deniz", "Eda", "Fatma", "Gonca", "Hilal", "Lale", "Mine")
age <- c(24, 35, 21, 40, 33, 45, 38, 28)
hair.color <- c("blonde", "brown", "brown", "black", "blonde", "black", "black", "brown")
height <- c(170, 173, 156, 164, 182, 165, 175, 190)
weight <- c(56, 61, 45, 60, 65, 58, 59, 71)
married <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)
df <- data.frame(name, age, hair.color, height, weight, married)

# What is the average age of the group?
mean(df$age)

# How many girls are above the average height?
sum(df$height > mean(df$height))

# What is the most frequent hair color?
names(sort(table(df$hair.color), decreasing = TRUE))[1]

# What is the most frequent hair color? [Alternative answer]
# Unlike the answer above, this returns every color tied for the maximum
t <- table(df$hair.color)
names(t)[which(t == max(t))]

# What is the average height of girls above 60 kg?
mean(df$height[df$weight > 60])

# Compare the height/weight ratio of married and single girls. Which is higher?
hw.m <- mean((df$height / df$weight)[df$married])
hw.nm <- mean((df$height / df$weight)[!df$married])
if (hw.m > hw.nm) {
  print("Married is higher")
} else if (hw.nm > hw.m) {
  print("Not Married is higher")
} else {
  print("They are equal")
}
```

## Question 6 (15 points)

- Given the vector below, compute its mean without using **any** functions.

```{r}
# Do not change the two lines below
set.seed(1024)
v <- runif(100, 1, 20) + rnorm(100, 1, 3)

# Compute the mean of v below
# The accumulator is named `total` to avoid masking the built-in sum()
total <- 0
count <- 0
for (i in v) {
  total <- total + i
  count <- count + 1
}

# Computed
total / count

# Actual
mean(v)
```

## Question 7 (20 points)

- Write a function that takes two numeric vectors and returns a matrix as follows:

```
# Example:
> a <- c(1,3,5)
> b <- c(20, 40, 60)
> c <- your_function(a, b)
> c
     [,1] [,2] [,3]
[1,]   21   41   61
[2,]   23   43   63
[3,]   25   45   65
```

```{r}
# Write your answer here
f <- function(a, b) {
  m <- matrix(0, nrow = length(a), ncol = length(b))
  # Entry [r, c] is the sum of the r-th element of a and the c-th element of b
  for (r in seq_along(a)) {
    for (c in seq_along(b)) {
      m[r, c] <- a[r] + b[c]
    }
  }
  m
}

a <- c(1, 3, 5)
b <- c(20, 40, 60)
c <- f(a, b)
c
```

```{r}
# Alternative answer
f2 <- function(a, b) {
  # m1 repeats a down each column; m2 repeats b across each row
  m1 <- matrix(a, nrow = length(a), ncol = length(b))
  m2 <- matrix(b, nrow = length(a), ncol = length(b), byrow = TRUE)
  m1 + m2
}

a <- c(1, 3, 5)
b <- c(20, 40, 60)
c <- f2(a, b)
c
```

## Question 8 (20 points)

- Write a function that takes a numeric vector `vec` and a numeric variable `val`, and returns `TRUE` if `val` exists in `vec`, and otherwise returns `FALSE`. You are **not** allowed to use `%in%` or any other functions present in R.

```{r}
# Write your answer here
f <- function(vec, val) {
  for (i in vec) {
    if (i == val) return(TRUE)
  }
  return(FALSE)
}

# Testing
f(c("a", "b", "c", "d", "e"), "d")
f(c("a", "b", "c", "d", "e"), "f")
```
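
The tests above use character vectors, although the question asks for numeric input. As a quick sanity check, the sketch below runs the same loop on a numeric vector and compares it with `%in%`, which is used here only for verification and not as part of the answer. The name `contains` and the vector `nums` are hypothetical, introduced so the block runs on its own.

```r
# Hypothetical name; same logic as the Question 8 answer
contains <- function(vec, val) {
  for (i in vec) {
    if (i == val) return(TRUE)
  }
  FALSE
}

nums <- c(3, 1, 4, 1, 5, 9)

contains(nums, 4)         # TRUE: 4 is an element of nums
contains(nums, 7)         # FALSE: 7 is not an element of nums
contains(numeric(0), 7)   # FALSE: the loop body never runs on an empty vector

# Cross-check against the built-in operator (verification only)
identical(contains(nums, 4), 4 %in% nums)
identical(contains(nums, 7), 7 %in% nums)
```

Note that this sketch assumes `vec` contains no `NA` values; comparing `NA` with `==` yields `NA`, which `if` cannot evaluate.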