## && and ||

This tutorial extends the Control Flow tutorial, where you learned how to use `if`, `else`, `return()`, and `stop()`.

Here you will learn how to

- combine logical tests in an if statement
- write if statements that work with vectors, which is a prerequisite if you want to write vectorized functions.

Here’s what `clean()` looked like at the end of the Control Flow tutorial. Do you notice that all of the if statements have the same outcome?

```
clean <- function(x) {
  stopifnot(!is.null(x))
  if (x == -99) return(NA)
  if (x == ".") return(NA)
  if (x == "NaN") return(NA)
  x
}
```

Let’s use your knowledge of logical tests to trim them down to a single if statement.

- Write a logical test that returns `TRUE` when `x` is `-99` OR `x` is `"."` (let’s ignore the `"NaN"` case to keep things simple). Then click Submit Answer.

You can combine two logical tests in R with `&` (and) and `|` (or), e.g. `x < 0 & x > 1`.

`x == -99 | x == "."`

This is the correct way to combine logical tests in R, but it has some downsides when you use it in an if statement.

### & and |

`&` and `|` are R’s boolean operators for combining logical tests.

- `&` stands for “and”. It returns `TRUE` if **both** tests return `TRUE`, and `FALSE` otherwise.
- `|` stands for “or”. It returns `TRUE` if **one or both** tests return `TRUE`, and `FALSE` otherwise.

So,

```
x <- -99
x == -99 | x == "."
```

`## [1] TRUE`
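The two definitions can be checked directly at the console. The sketch below is only an illustration; the names `both` and `either` are made up for this example.

```r
# Each combination of two single logical values.
TRUE & TRUE    # TRUE
TRUE & FALSE   # FALSE
FALSE & FALSE  # FALSE
TRUE | FALSE   # TRUE
FALSE | FALSE  # FALSE

# The same rules applied to the two tests from the example above.
x <- -99
both   <- x == -99 & x == "."  # FALSE: only the first test is TRUE
either <- x == -99 | x == "."  # TRUE: one TRUE test is enough for "or"
```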

However, it is bad practice to use `&` and `|` to combine logical tests within an `if` condition. Why? Because:

- there is something better (as you’ll see in a minute)
- `&` and `|` tend to generate warning messages when used with `if`

As R operators, both `&` and `|` are vectorized, which means that you can use them with vectors. This is very useful.

```
x <- c(-99, 0 , 1)
x == -99
```

`## [1] TRUE FALSE FALSE`

`x == "."`

`## [1] FALSE FALSE FALSE`

`x == -99 | x == "."`

`## [1] TRUE FALSE FALSE`

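However, `if` conditions are not vectorized. `if` expects the logical test contained within its parentheses to return a **single** `TRUE` or `FALSE`. If the condition returns a vector of `TRUE`s or `FALSE`s, `if` will use the first value and show a warning message. (That is the behavior of R versions before 4.2.0, which produced the output below. From R 4.2.0 onward, the same code stops with an error instead.)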

```
x <- c(-99, 0 , 1)
if (x == -99 | x == ".") NA
```

```
## Warning in if (x == -99 | x == ".") NA: the condition has length > 1 and
## only the first element will be used
```

`## [1] NA`
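How R reacts to a vector condition depends on the R version, so here is a hedged sketch that records the reaction instead of assuming one. It wraps the same `if` in `tryCatch()`; the name `outcome` and the returned labels are made up for this example.

```r
x <- c(-99, 0, 1)

# The combined test is a vector with one result per element of x.
length(x == -99 | x == ".")  # 3

# Run the same `if` and record what R signals.
outcome <- tryCatch(
  {
    if (x == -99 | x == ".") NA
    "no condition signalled"
  },
  warning = function(w) "warning",  # older R: uses the first element and warns
  error   = function(e) "error"     # R 4.2.0 and later: stops with an error
)
outcome
```

Either way, the message is a sign that the condition was not a single `TRUE` or `FALSE`.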

### && and ||

You can avoid this by always using `&&` and `||` within your `if` conditions. `&&` and `||` are lazy substitutes for `&` and `|`. They are lazy in two ways.

First, `&&` and `||` always return a single `TRUE` or `FALSE`. If you give `&&` or `||` vectors, they will compare only the first elements of the vectors, and they will not return a warning message. (This describes R before 4.3.0. Later versions raise an error when either side has a length greater than one.)

```
x <- c(-99, 0 , 1)
x == -99 || x == "."
```

`## [1] TRUE`
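The same first-element comparison can also be written out explicitly, which does not depend on how a given R version treats longer vectors passed to `||`. This is only a sketch; the name `first_is_missing` is made up for the example.

```r
x <- c(-99, 0, 1)

# Select the first element yourself so that each side of || has length one.
first_is_missing <- x[1] == -99 || x[1] == "."
first_is_missing  # TRUE
```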

### Use ||

Let’s use this to our immediate advantage.

- Replace the two `if` statements below with a single statement that tests whether x is `-99` or `"."` without throwing error messages.

```
clean <- function(x) {
  stopifnot(!is.null(x))
  if (x == -99) return(NA)
  if (x == ".") return(NA)
  x
}
```

Like `|`, `||` expects a *complete* logical test on each side of `||`.

```
clean <- function(x) {
  stopifnot(!is.null(x))
  if (x == -99 || x == ".") return(NA)
  x
}
```

Now let’s see what happens if you use `clean()` with a vector of values.

```
clean <- function(x) {
  stopifnot(!is.null(x))
  if (x == -99 || x == ".") return(NA)
  x
}
```

### Computation

The most important reason to use `||` instead of `|` is that `||` saves unnecessary computation when possible. This is the second way that `&&` and `||` are lazy.

When possible, `&&` and `||` jump to the correct conclusion after evaluating the first of the two logical tests (not so with `&` and `|`).

- `&&` will return `FALSE` if the test on the left returns `FALSE` (because the combined test would return `FALSE`).
- `||` will return `TRUE` if the test on the left returns `TRUE` (because the combined test would return `TRUE`).

In either case, `&&` and `||` will not evaluate the test on the right.

```
x <- -99
if (x == -99 || stop("if you evaluate this.")) "I didn't evaluate stop()."
```

`## [1] "I didn't evaluate stop()."`
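The example above covers `||`. A short sketch of the `&&` case, and of the contrast with the non-lazy `|`, could look like this. The names `and_result` and `or_result` are made up for the illustration.

```r
# && stops at a FALSE left-hand side, so the stop() on the right never runs.
and_result <- FALSE && stop("never evaluated")
and_result  # FALSE

# | evaluates both sides, so the stop() on the right does run,
# even though the left-hand side already settles the answer.
or_result <- tryCatch(
  TRUE | stop("evaluated"),
  error = function(e) "right-hand side was evaluated"
)
or_result  # "right-hand side was evaluated"
```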

How could you use this?

Remember how this code returns an error because `if` cannot handle the result of `NULL == -99`?

```
clean <- function(x) {
  if (x == -99) return(NA)
  x
}
clean(NULL)
```

`## Error in if (x == -99) return(NA): argument is of length zero`

### Quiz

Suppose we redefine `clean()` like this:

```
clean <- function(x) {
  if (is.null(x) || x == -99) return(NA)
  x
}
```
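To see the laziness of `||` at work in this definition, here is a quick sketch that calls the function with a few inputs. When `x` is `NULL`, `is.null(x)` is `TRUE`, so `x == -99` is never evaluated and `if` never sees a zero-length condition.

```r
clean <- function(x) {
  # is.null(x) is checked first; when it is TRUE, x == -99 is skipped.
  if (is.null(x) || x == -99) return(NA)
  x
}

clean(NULL)  # NA, with no "argument is of length zero" error
clean(-99)   # NA
clean(5)     # 5
```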

## Vectorized if

Buried in the last section is an interesting question: what if you *do* want `clean()` to work with vectors? i.e.

`clean(c(-99, 0, 1))`

`## [1] NA 0 1`

That would be a handy way to clean whole columns of data. How could you do it?

Compare these two functions (one should seem familiar). What is different?

```
clean <- function(x) {
  if (x == -99) NA else x
}
clean2 <- function(x) {
  ifelse(x == -99, NA, x)
}
```

### ifelse()

`ifelse()` is a function that replicates an if else statement. It takes three arguments: a logical test followed by two pieces of code. If the test returns `TRUE`, `ifelse()` will return the results of the first piece of code. If the test returns `FALSE`, `ifelse()` will return the results of the second piece of code.

So `clean(-99)` and `clean2(-99)` both return `NA`.

`clean(-99)`

`## [1] NA`

`clean2(-99)`

`## [1] NA`

However, unlike `if` and `else`, `ifelse()` is vectorized. As a result, you can pass `ifelse()` a vector of values and it will apply the implied if else statement separately to each element of the vector.

```
x <- c(-99, 0, 1)
ifelse(x == -99, NA, x)
```

`## [1] NA 0 1`

`clean2()` inherits this vectorized property from `ifelse()`.

`clean2(c(-99, 0, 1))`

`## [1] NA 0 1`

Compare that to `clean()` (which is non-vectorized because it relies on `if` and `else`, which are non-vectorized).

`clean(c(-99, 0, 1))`

```
## Warning in if (x == -99) NA else x: the condition has length > 1 and only
## the first element will be used
```

`## [1] NA`

### if_else

The dplyr package offers a slight improvement on `ifelse()` named `if_else()`. `if_else()` is faster than `ifelse()`, but it requires you to make sure that each case in the if else statement returns the same type of object. For example, the statement needs to return a real number (or a string, or a logical, etc.) *whether or not* the condition is `TRUE`. (The strict check shown below applies to dplyr versions before 1.1.0. Later versions cast both cases to a common type and accept a plain `NA`.)

No big deal, right? Well kind of.

```
library(dplyr)  # if_else() comes from dplyr
x <- c(-99, 0, 1)
if_else(x == -99, NA, x)
```

``## Error: `false` must be a logical vector, not a double vector``

### NA

What happened? Recall that data in R comes in six atomic types. A plain `NA` has one of them, logical:

`typeof(NA)`

`## [1] "logical"`

So when you write `if_else(x == -99, NA, x)`, `if_else()` returns a logical in the first case and a double (real number) in the second (assuming `x` is a real number).

You can get around this mishap in two ways:

- Stick to `ifelse()`
- Use an NA that comes with a type

### Types of NA

You may not realize it, but R comes with five types of NA. They all appear as `NA` when printed, but they are each saved with a separate data type. These are:

`NA # logical`

`## [1] NA`

`NA_integer_ # integer`

`## [1] NA`

`NA_real_ # double`

`## [1] NA`

`NA_complex_ # complex`

`## [1] NA`

`NA_character_ # character`

`## [1] NA`
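Because all five print as `NA`, `typeof()` is the easiest way to tell them apart. The sketch below uses only base R; the name `na_types` is made up for the example.

```r
# typeof() reveals the type hidden behind each NA.
na_types <- c(
  typeof(NA),            # "logical"
  typeof(NA_integer_),   # "integer"
  typeof(NA_real_),      # "double"
  typeof(NA_complex_),   # "complex"
  typeof(NA_character_)  # "character"
)
na_types

# Most base R functions convert a logical NA silently,
# e.g. when it is combined with doubles.
typeof(c(NA, 1.5))  # "double"
```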

You can fix `if_else()` by being precise about which NA to use (most other R functions will convert the type of NA without bothering you).

```
x <- c(-99, 0, 1)
if_else(x == -99, NA_real_, x)
```

`## [1] NA 0 1`

### Use if_else

- Fix the `if_else()` statement of `clean2()` to work with real numbers. Then click Submit Answer.

```
clean2 <- function(x) {
  if_else(x == -99, NA, x)
}
```

```
clean2 <- function(x) {
  if_else(x == -99, NA_real_, x)
}
```

Notice that this version of `clean2()` will work with real numbers, but not other types of data like characters. What if you want `clean2()` to work with all types of data? That’s simple: stick with `ifelse()`.
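As a sketch of what "all types of data" means here, the base R `ifelse()` version of `clean2()` from earlier handles both numbers and strings, because `ifelse()` converts the plain `NA` to whatever type the rest of the result needs.

```r
clean2 <- function(x) {
  ifelse(x == -99, NA, x)
}

clean2(c(-99, 0, 1))        # NA  0  1   (a double vector)
clean2(c("-99", ".", "a"))  # NA "." "a" (a character vector)
```

In the second call, R compares each string to `-99` as text, so only `"-99"` matches.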

## Vectorized else if

What if you want to write a vectorized version of a multi-part if else tree? Like the tree in this function:

```
clean <- function(x) {
  if (x == -99) NA
  else if (x == ".") NA
  else if (x == "") NA
  else if (x == "NaN") NA
  else x
}
```

In this case, neither `ifelse()` nor `if_else()` will do. Why? Because each can only handle a single if condition, but our tree has four.

### case_when()

You can vectorize multi-part if else statements with dplyr’s `case_when()` function. Here is how you would use `case_when()` to rewrite our `foo()` function from the Control Flow tutorial.

Here is the masterpiece in its original form:

```
foo <- function(x) {
  if (x > 2) "a"
  else if (x < 2) "b"
  else if (x == 1) "c"
  else "d"
}
```

And here it is with `case_when()`.

```
foo2 <- function(x) {
  case_when(
    x > 2 ~ "a",
    x < 2 ~ "b",
    x == 1 ~ "c",
    TRUE ~ "d"
  )
}
```

And here are our foos in action to prove that `foo2()` is vectorized.

```
x <- c(3, 2, 1)
foo(x)
```

```
## Warning in if (x > 2) "a" else if (x < 2) "b" else if (x == 1) "c" else
## "d": the condition has length > 1 and only the first element will be used
```

`## [1] "a"`

`foo2(x)`

`## [1] "a" "d" "b"`

Notice that

- `case_when()` returns a single case for each element: the first case whose left hand side evaluates to `TRUE`. That is why the element `1` comes back as `"b"` and never reaches the `x == 1` case.
- The left hand side of the last case evaluates to `TRUE` no matter what the value of `x` is (in fact, the left hand side *is* `TRUE`). This is an easy way to add an `else` clause to the end of `case_when()`.
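The first-match rule can be explored without dplyr. As a rough base R analogue, and not a description of how `case_when()` is implemented, the sketch below chains one `ifelse()` call per condition, each handling a single test. The name `foo3()` is hypothetical. The nesting is hard to read, which is part of the appeal of `case_when()`.

```r
foo3 <- function(x) {
  # Each ifelse() handles one condition; later tests are only consulted
  # for elements that failed every earlier test.
  ifelse(x > 2, "a",
    ifelse(x < 2, "b",
      ifelse(x == 1, "c", "d")))
}

foo3(c(3, 2, 1))  # "a" "d" "b"
```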

Now let’s look at the unusual syntax of `case_when()`.

### case_when() syntax

```
foo2 <- function(x) {
  case_when(
    x > 2 ~ "a",
    x < 2 ~ "b",
    x == 1 ~ "c",
    TRUE ~ "d"
  )
}
```

Each argument of `case_when()` is a pair that consists of a logical test on the left hand side and a piece of code on the right hand side. The two are *always* separated by a `~`.

Like `if_else()`, `case_when()` expects each case to return the same type of output. So keep those NA types handy: `NA`, `NA_integer_`, `NA_real_`, `NA_complex_`, `NA_character_`.

### Final Challenge

- Rewrite the multi-part version of `clean()` to use `case_when()`, which will allow `clean()` to handle vectors. Retain each case. Assume where necessary that `clean()` will only work with real numbers. Then click Submit Answer.

```
clean <- function(x) {
  if (x == -99) NA
  else if (x == ".") NA
  else if (x == "") NA
  else if (x == "NaN") NA
  else x
}
```

Use NAs that have the right type.

```
clean <- function(x) {
  case_when(
    x == -99 ~ NA_real_,
    x == "." ~ NA_real_,
    x == "" ~ NA_real_,
    x == "NaN" ~ NA_real_,
    TRUE ~ x
  )
}
```

And if you noticed that a vector of real numbers would never contain `"."`, `""`, and `"NaN"` because they are strings, you are of course right. Thanks for playing along with the charade.
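For data that really does arrive as strings, a vectorized cleaner along the same lines might look like the sketch below. It is an assumption-laden illustration rather than part of the exercise: the name `clean_chr()` is hypothetical, and it combines the vectorized `|` with base R’s `ifelse()` instead of `case_when()` so that it runs without dplyr.

```r
clean_chr <- function(x) {
  # | is vectorized, so this builds one TRUE/FALSE per element of x.
  is_missing <- x == "-99" | x == "." | x == "" | x == "NaN"
  ifelse(is_missing, NA_character_, x)
}

clean_chr(c("-99", ".", "", "NaN", "3.5"))  # NA NA NA NA "3.5"
```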

### Congratulations!

You’ve learned how to alter the control flow of your functions with:

- `if`
- `else`
- `return()`
- `stop()`
- `stopifnot()`
- `ifelse()`

Not only that, you tackled two advanced methods: dplyr’s `if_else()` and dplyr’s `case_when()`.