Lexical Scoping in R Language

Introduction to Lexical Scoping

Scoping in the R language is the set of rules that governs how R looks up the value of a symbol. For example:

x <- 10

In this example, scoping is the set of rules that R applies to go from symbol $x$ to its value 10.

Types of Scoping

R has two types of scoping:

  1. Lexical scoping: implemented automatically at the language level
  2. Dynamic scoping: used in select functions to save typing during interactive analysis.

Lexical scoping looks up symbol values based on how functions were nested when they were created, not on how they are nested when they are called. To figure out where the value of a variable will be looked up, you only need to look at the function’s definition.
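A minimal sketch of this rule is given below. The names show_x and caller are hypothetical and used only for illustration. The free variable x is resolved where show_x() was defined (the global environment), not inside the function that calls it.

```r
x <- "global"

# x is a free variable here; R resolves it where show_x was defined
show_x <- function() {
  x
}

caller <- function() {
  x <- "local to caller"   # ignored by show_x() under lexical scoping
  show_x()
}

result <- caller()
result   # "global"
```

Under dynamic scoping, the same call would have picked up the x defined inside caller().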

Basic Principles of Lexical Scoping in R Language

There are four basic principles behind R’s implementation of lexical scoping: name masking, functions vs variables, a fresh start, and dynamic lookup.

Name Masking

The following example will illustrate the basic principle of lexical scoping

f <- function(){
      x <- 1
      y <- 2
      c(x, y)
}

f()

If a name is not defined inside a function, R will look one level up.

x <- 2
g <- function(){
       y <- 1
       c(x,y)
}
g()

The same rules apply if a function is defined inside another function: look inside the current function, then where the function was defined, and so on, all the way up to the global environment, and then on to other loaded packages.

x <- 1
h <- function() {
  y <- 2
  i <- function() {
    z <- 3
    c(x, y, z)
  }
  i()
}

h()
rm(x, h)   # clean up the objects created for this example

The same rules apply to closures, functions created by other functions. The following function, j(), returns a function.

j <- function(x) {
  y <- 2
  function() {
    c(x, y)
  }
}
k <- j(1)
k()
rm(j, k)

How does R know what the value of y is after j() has returned? It works because k preserves the environment in which it was defined, and that environment includes the value of y.

Functions vs Variables

Finding functions works the same way as finding variables:

l <- function(x) {
  x + 1
}
m <- function() {
  l <- function(x) {
    x * 2
  }
  l(10)
}
m()

If you are using a name in a context where it’s obvious that you want a function (e.g. f(3)), R will ignore objects that are not functions while it is searching. In the following example, n takes on a different value depending on whether R is looking for a function or a variable.

n <- function(x) {
  x / 2
}
o <- function() {
  n <- 10
  n(n)
}
o()

Fresh Start

Three questions arise here: (i) What happens to values between invocations of a function? (ii) What will happen the first time you run the function below? (iii) What will happen the second time? (If you have not seen exists() before: it returns TRUE if there is a variable of that name and FALSE otherwise.)

j <- function() {
  if (!exists("a")) {
    a <- 1
  } else {
    a <- a + 1
  }
  print(a)
}
j()

From the above example, you might be surprised that it returns the same value, 1, every time (provided no variable named a exists outside the function). This is because every time a function is called, a new environment is created to host its execution. A function has no way to tell what happened the last time it was run; each invocation is completely independent (but see mutable states).
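As a contrast, the sketch below shows one common way to keep state between calls: a closure combined with the super-assignment operator <<-. The names new_counter and counter are hypothetical. The state lives in the enclosing environment of the returned function, not in the per-call execution environment.

```r
new_counter <- function() {
  count <- 0               # lives in the environment created by this call
  function() {
    count <<- count + 1    # modifies count in the enclosing environment
    count
  }
}

counter <- new_counter()
first  <- counter()   # 1
second <- counter()   # 2
```

Each call to counter() still gets a fresh execution environment; only the enclosing environment that holds count persists.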

Dynamic Lookup

Lexical scoping determines where to look for values, not when to look for them. R looks for values when the function is run, not when it’s created. This means that the output of a function can be different depending on objects outside its environment:

f <- function() {
       x
}
x <- 15
f()

x <- 20
f()

You generally want to avoid this behavior because it means the function is no longer self-contained.
One way to detect this problem is the findGlobals() function from codetools. This function lists all the external dependencies of a function:

f <- function() {
  x + 1
}
codetools::findGlobals(f)

Another way to try and solve the problem would be to manually change the environment of the function to the emptyenv(), an environment that contains absolutely nothing:

environment(f) <- emptyenv()

This doesn’t work because R relies on lexical scoping to find everything, even the + operator. It’s never possible to make a function completely self-contained because you must always rely on functions defined in base R or other packages.
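The failure can be observed directly. The following sketch, which assumes nothing beyond base R, redefines a small f, moves it to the empty environment, and captures the resulting error with tryCatch() instead of letting it stop the script.

```r
f <- function() {
  x + 1
}
environment(f) <- emptyenv()

# The call fails before x is even looked up, because + cannot be found
msg <- tryCatch(f(), error = function(e) conditionMessage(e))
msg
```

The captured message reports that the function + could not be found, which confirms that even operators are located through lexical scoping.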

Since all standard operators in R are functions, you can override them with your alternatives.

`(` <- function(e1) {
  if (is.numeric(e1) && runif(1) < 0.1) {
    e1 + 1
  } else {
    e1
  }
}
replicate(50, (1 + 2))
rm("(")   # remove the override so that ( behaves normally again

A pernicious bug is introduced: 10% of the time, 1 will be added to any numeric calculation inside parentheses. This is another good reason to regularly restart with a clean R session!

Bound Symbol or Variable

If a symbol is bound to a function argument, it is called a bound symbol or variable. If a symbol is not bound to a function argument, it is called a free symbol or variable.

If a free variable is looked up in the environment from which the function is called, the scoping is said to be dynamic. If it is looked up in the environment in which the function was originally defined, the scoping is said to be static or lexical. R, like Scheme (a Lisp dialect), is lexically scoped, whereas S-PLUS resolves free variables among the global variables.

y = 20
foo = function() {
  y = 10  # y lives in foo's environment, which the returned closure keeps
  function(x) {
    x + y
  }
}
bar = foo()

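foo() returns an anonymous function, that is, a function that has no name.

Like foo, bar is a function bound in the global environment. However, the function that computes $x + y$ was created in the environment of foo, not in the global environment, so it finds y = 10 there rather than the global y = 20. foo has a function as its return value, and the assignment binds that function to bar in the global environment.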
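The lookup can be checked with a short sketch that repeats the definitions above and then calls bar(). The use of environment() to peek at the captured value is an addition for illustration.

```r
y <- 20
foo <- function() {
  y <- 10
  function(x) {
    x + y
  }
}
bar <- foo()

bar(1)                 # 11: y is taken from foo's environment, not the global one
environment(bar)$y     # 10: the value captured by the closure
```

Changing the global y afterwards does not affect bar(), because the closure never consults the global y while its own enclosing environment defines one.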

https://itfeature.com

https://gmstat.com

Functions in R Language: Quick Guide 1

Functions in R language (or any programming language) are fundamental building blocks that allow you to organize the programming code, make it reusable, and perform complex tasks efficiently.

Functions in the R language are first-class objects of class function. They can be assigned to variables, stored in lists, passed as arguments to other functions, created inside other functions, and even returned as the result of a function. There are three building blocks of functional programming: anonymous functions, closures (functions written by functions), and lists of functions.
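The three building blocks can be sketched in a few lines. The names stats, make_multiplier, and triple are hypothetical and chosen only for this illustration.

```r
# Anonymous function passed as an argument to sapply()
squares <- sapply(1:3, function(v) v^2)

# Functions stored in a list and applied to the same data
stats <- list(lo = min, hi = max)
ranges <- lapply(stats, function(f) f(c(4, 8, 15)))

# Closure: a function that returns another function
make_multiplier <- function(k) {
  function(v) v * k
}
triple <- make_multiplier(3)
triple(5)   # 15
```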

Components of a Function

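Each function in the R language consists of three components:

  1. formals( ): the list of arguments that controls how the function can be called
  2. body( ): the code inside the function
  3. environment( ): the map that determines where the function looks up the values of its variables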
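The three components can be inspected with the base R functions of the same names. In the sketch below, add_by is a hypothetical function defined only to have something to inspect.

```r
add_by <- function(x, by = 1) {
  x + by
}

arg_names <- names(formals(add_by))   # "x" "by"
fun_body  <- body(add_by)             # the expression { x + by }
fun_env   <- environment(add_by)      # the environment where add_by was defined
```

For a function defined at the top level of a session, the environment is the global environment.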

Types of Function

There are two main types of functions in R:

  1. Built-in functions in R: There is a vast library of built-in functions in R Language, like finding the mean (mean()), calculating the sum (sum()), or creating graphs (plot()).
  2. User-defined functions in R: One can create functions to tailor them to one’s needs. This is useful for repetitive tasks, improving code readability, and avoiding errors.

Each function has arguments that can be given default values, which makes interactive usage more convenient. A function is defined by an assignment of the form

name <- function(arg1, arg2, ...){
     Expression
}

where the expression uses the arguments arg1, arg2, ... to calculate a value. The value of the expression is the value returned by the function. A call to the function usually takes the form

name(expr1, expr2, ...)

and may occur anywhere a function call is legitimate.
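As a small illustration of this form and of default argument values, consider the sketch below. The function power and its arguments are hypothetical.

```r
# exponent has a default value, so it can be omitted in a call
power <- function(base, exponent = 2) {
  base^exponent
}

power(3)                        # 9: uses the default exponent
power(3, 3)                     # 27: arguments matched by position
power(exponent = 3, base = 2)   # 8: arguments matched by name
```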

Functions in R Language: Example

As an example, consider the following customized function (user-defined function) center(). This function can compute the mean, median, and trimmed mean of the input data. The center() function has two arguments, the first argument is for data and the second argument is for the selection of summary statistics.

center<-function(x, type){
    type == "mean" && return(mean(x))
    type == "median" && return(median(x))
    type == "trimmed" && return(mean(x, trim=0.1))
}

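As another example, a user-defined summary() function is created in which duplication is removed: the shared argument na.rm = TRUE is written only once. Note that the functions applied inside the user-defined function are stored in a list, and that this definition masks the built-in summary() for the rest of the session.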

summary <- function(x) {
     funs <- c(mean, median, sd, mad, IQR)
     lapply(funs, function(f) f(x, na.rm = TRUE))
}

The center() function is created to perform some summary statistics using the switch() statement.

center<-function(x, type){
    switch(type,
         mean = mean(x),
         median = median(x),
         trimmed = mean(x,trim=0.1)
    )
}

Let us generate data from the standard normal distribution and check the output of the user-defined function center().

x <- rnorm(100)

center(x, type="mean")
center(x, type="median")
center(x, type="trimmed")
center(x, type="mode")
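In the calls above, type = "mode" matches none of the switch() alternatives, so center() silently returns NULL. One possible variation, sketched here under the hypothetical name center_strict, adds an unnamed final alternative that acts as the default and raises an error for unknown types.

```r
center_strict <- function(x, type) {
  switch(type,
    mean    = mean(x),
    median  = median(x),
    trimmed = mean(x, trim = 0.1),
    stop("unknown type: ", type)   # default branch for unmatched values
  )
}

x <- c(1, 2, 3, 100)
center_strict(x, type = "mean")     # 26.5
center_strict(x, type = "median")   # 2.5
```

Whether an error or a NULL result is preferable depends on how the function is used; failing loudly tends to be safer inside larger programs.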

FAQs about Functions in R Language

  1. What is a function in R?
  2. What are the components of a function?
  3. What are some working examples of customized functions in R?
  4. What is meant by the arguments of a function?
  5. What is the difference between built-in and user-defined functions in the R language?


Debugging Tools in R Language

The R system has two main ways of reporting a problem in executing a function: a warning and an error. The purpose of a warning is to tell the user (programmer) that “something unusual happened during the execution of the function, but the function was nevertheless able to execute to completion”. An error, by contrast, is fatal and stops the execution. Writing robust code (code that checks for input errors) is important for larger programs.

log(-1)     # produces a warning and returns NaN
# note: this name masks the built-in message() for the rest of the session
message <- function(x) {
  if (x > 0)
    print("Hello")
  else
    print("Goodbye")
}

log(-1) only produces a warning. Calling message(NA), by contrast, results in a fatal error, not a warning, because the condition x > 0 evaluates to NA and if() cannot branch on a missing value. When an error occurs, the first thing one should do is print the call stack (the sequence of function calls that led to the error). The traceback() function does this: it prints the list of functions that were called before the error occurred. However, this can be uninteresting if the error occurred in a top-level function.
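The two kinds of conditions can also be told apart programmatically. The sketch below uses tryCatch() from base R; the variable names are hypothetical, and the expression if (NA > 0) stands in for a function whose condition receives a missing value.

```r
# log(-1) raises a warning; tryCatch() captures its message
warn_msg <- tryCatch(log(-1), warning = function(w) conditionMessage(w))

# Without a handler the computation completes and returns NaN
value <- suppressWarnings(log(-1))

# A missing value in an if() condition raises an error instead
err_msg <- tryCatch(
  if (NA > 0) "Hello" else "Goodbye",
  error = function(e) conditionMessage(e)
)
```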

Debugging Tools in R

The debugging tools in R are:

The traceback() Function

The traceback() function prints the sequence of function calls that led to the last uncaught error, with the most recent call listed first.
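A typical interactive use is sketched below with three hypothetical nested functions. The two commented lines are meant to be typed at the console one after the other, since traceback() reports on the last error that was not caught.

```r
outer_fn  <- function(x) middle_fn(x)
middle_fn <- function(x) inner_fn(x)
inner_fn  <- function(x) stop("something went wrong")

# In an interactive session:
# outer_fn(1)     # fails inside inner_fn()
# traceback()     # lists stop(), inner_fn(), middle_fn(), outer_fn()
```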

The debug() Function

The debug() function takes a single argument (a function) and flags that function for debugging. To unflag it, use the undebug() function. A function flagged for debugging does not execute in the usual way; instead, each statement in the function is executed one at a time, and the user controls when each statement gets executed. After a statement is executed, the function suspends and the user is free to interact with the environment. Stepping through a function line by line in this way helps to identify the specific location of a bug.
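Flagging and unflagging can be sketched as follows. The function square_it is hypothetical; isdebugged() is the base R function that reports whether a function is currently flagged.

```r
square_it <- function(x) {
  y <- x^2
  y
}

debug(square_it)
flag_on <- isdebugged(square_it)    # TRUE: the next call would be stepped through

undebug(square_it)
flag_off <- isdebugged(square_it)   # FALSE: the function runs normally again
```

Calling square_it() while the flag is set opens the step-by-step browser, so that call is best made in an interactive session.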

The browser() Function

The browser() function can be used to suspend the execution of a function so that the user can browse the local environment.
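A common pattern is to call browser() only when a suspicious condition occurs. The sketch below uses a hypothetical function safe_ratio and guards the call with interactive(), an assumption made here so that scripts run in batch mode are not paused.

```r
safe_ratio <- function(num, den) {
  if (interactive() && den == 0) {
    browser()   # pause here to inspect num and den in the local environment
  }
  num / den
}

safe_ratio(1, 2)   # 0.5
```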

The trace() Function

The trace() function is very useful for making minor modifications to functions “on the fly” without having to modify the functions and re-source them. It is especially useful if you need to track down an error that occurs in a base function.

trace("mean", quote( if( any(is.nan(x) ) ){ browser() }), print = FALSE)

The trace() function copies the original function code into a temporary location and replaces the original function with a new function containing the inserted code.
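Because the original function is replaced, it is worth restoring it with untrace() once the investigation is over. The sketch below assumes it is run at the top level of a session and uses a hypothetical function half rather than a base function.

```r
half <- function(x) {
  x / 2
}

# Insert a cat() call at the start of half() without editing its source
trace("half", quote(cat("half() called with x =", x, "\n")), print = FALSE)
traced_result <- half(10)   # prints the inserted message, then returns 5

untrace("half")             # restore the original definition
plain_result <- half(10)    # returns 5 without printing
```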

The recover() Function

The recover() function can help to “jump up” to a higher level in the function call stack.

options(error = recover)

The error option tells R what to do in a situation where a function must halt the execution.
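Since this setting is global, one cautious pattern is to save the previous value and restore it afterwards. The sketch below relies only on the fact that options() returns the old values of the options it changes; the name old_opts is hypothetical.

```r
old_opts <- options(error = recover)   # returns the previous setting

# ... run the failing code in an interactive session; when an error occurs,
# R offers a menu of the active function calls to inspect ...

options(old_opts)                      # restore the previous error handler
```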
