args(args(args)(args))

args metaprogramming

The unexpected sequel to “R is a language optimized for meme-ing”

June Choe (University of Pennsylvania Linguistics) https://live-sas-www-ling.pantheon.sas.upenn.edu/
2024-03-05

The blog posts that I have the most fun writing are those where I hyperfocus on a single function, like dplyr::slice(), purrr::reduce(), and ggplot2::stat_summary(). In writing blog posts of this kind, I naturally come across a point where I need to introduce the argument(s) that the function takes. I usually talk about them one at a time as needed, but I could instead front-load that important piece of information.

In fact, there’s a function in R that lets me do exactly that, called args().

args()

args() is, in theory, a very neat function. According to ?args:

Displays the argument names and corresponding default values of a (non-primitive or primitive) function.

So, for example, I know that sum() takes the arguments ... and na.rm (with the na.rm = FALSE default). The role of args() is to display exactly that piece of information using R code. This blog runs on rmarkdown, so surely I can use args() as a convenient and fancy way of showing information about a function’s arguments to my readers.

In this blog post, I want to talk about args(). So let’s start by looking at the argument that args() takes.

Of course, I could just print args in the console:

args
  function (name) 
  .Internal(args(name))
  <bytecode: 0x0000024f98dbd180>
  <environment: namespace:base>

But wouldn’t it be fun if I used args() itself to get this information?

args(args)

args(args)
  function (name) 
  NULL

Okay, so I get the function (name) piece, which is the information I wanted to show. We can see that args() takes one argument, called name, with no defaults.

But wait - what’s that NULL doing there in the second line?

Hmm, I wonder if they forgot to invisible()-y return the NULL. args() is a function for displaying a function’s arguments after all, so maybe the arguments are printed to the console as a side-effect and the actual output of args() is NULL.

If that is true, we should be able to suppress the printing of NULL with invisible():

invisible(args(args))

Uh oh, now everything is invisible.

Alright, enough games! What exactly are you, output of args()?!

typeof(args(args))
  [1] "closure"

What?

args(args)(args)

Turns out that args(args) is actually returning a whole function that’s a copy of args(), except with its body replaced with NULL.

So args(args) is itself a function that takes an argument called name and then returns NULL. Let’s assign it to a variable and call it like a function:

abomination <- args(args)
abomination(123)
  NULL
abomination(mtcars)
  NULL
abomination(stop())
  NULL

The body is just NULL, so the function doesn’t care what it receives1 - it just returns NULL.
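A quick way to confirm this is to inspect the pieces of the returned closure directly. The sketch below uses only base R's body() and formals(); the name `hollow` is made up for illustration.

```r
# `hollow` is a hypothetical name for the closure that args() hands back
hollow <- args(args)

body(hollow)            # the body really is NULL
names(formals(hollow))  # the formals are carried over from args()

# Because the body never touches `name`, the promise is never forced,
# so even an argument that would error is silently ignored
result <- hollow(stop("this is never evaluated"))
is.null(result)
```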

In fact, we could even pass it… args:

args(args)(args)
  NULL

args(args(args)(args))

But wait, that’s not all! args() doesn’t just accept a function as its argument. From the documentation:

Value

NULL in case of a non-function.

So yeah - if args() receives a non-function, it just returns NULL:

args(123)
  NULL
args(mtcars)
  NULL

This applies to any non-function, including… NULL:

args(NULL)
  NULL

And recall that:

is.null( args(args)(args) )
  [1] TRUE

Therefore, this is a valid expression in base R:

args(args(args)(args))
  NULL

ad infinitum

For our cursed use case of using args(f) to return a copy of f with its body replaced with NULL, only to then immediately call args(f)(f) to return NULL, it really doesn’t matter what the identity of f is, as long as it’s a function.

That function can even be … args(args)!

So let’s take our args(args(args)(args)):

args( args( args )( args ))
  NULL

And swap every args() with args(args):

args(args)( args(args)( args(args) )( args(args) ))
  NULL

Or better yet, swap every args() with args(args(args)):

args(args(args))( args(args(args))( args(args(args)) )( args(args(args)) ))
  NULL

The above unhinged examples are a product of two patterns:

  1. The fact that you always get function (name) NULL from wrapping args()s over args:

    list(
       args(          args),
       args(     args(args)),
       args(args(args(args)))
     )
      [[1]]
      function (name) 
      NULL
    
      [[2]]
      function (name) 
      NULL
    
      [[3]]
      function (name) 
      NULL
  2. The fact that you can get this whole thing to return NULL by having function (name) NULL call the function object args. You can do this anywhere in the stack and the NULL will simply propagate:

    list(
       args(args(args(args))) (args)   ,
       args(args(args(args))  (args) ) ,
       args(args(args(args)   (args) ))
     )
      [[1]]
      NULL
    
      [[2]]
      NULL
    
      [[3]]
      NULL
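Both patterns can also be checked for an arbitrary nesting depth rather than by typing the calls out. This is a minimal sketch; `nest_args()` is a hypothetical helper, not something from base R.

```r
# Hypothetical helper: wrap args() around `args` `depth` times
nest_args <- function(depth) {
  Reduce(function(f, i) args(f), seq_len(depth), args)
}

# Pattern 1: every depth gives back a closure with a `name` formal and a NULL body
fs <- lapply(1:5, nest_args)
all(vapply(fs, function(f) is.null(body(f)), logical(1)))
all(vapply(fs, function(f) identical(names(formals(f)), "name"), logical(1)))

# Pattern 2: calling any of them on the function object `args` yields NULL,
# and args(NULL) is NULL again, so the NULL survives further wrapping
nulls <- lapply(fs, function(f) args(args(f(args))))
all(vapply(nulls, is.null, logical(1)))
```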

We could keep going but it’s tiring to type out and read all these nested args()… but did you know that there’s this thing called the pipe %>% that’s the solution to all code readability issues?

Had enough args() yet?

Let’s make an args() factory ARGS()

library(magrittr)
ARGS <- function(n) {
  Reduce(
    f = \(x,y) bquote(.(x) %>% args()),
    x = seq_len(n),
    init = quote(args)
  )
}

… to produce a sequence of args()

ARGS(10)
  args %>% args() %>% args() %>% args() %>% args() %>% args() %>% 
      args() %>% args() %>% args() %>% args() %>% args()
eval(ARGS(10))
  function (name) 
  NULL

… and tidy it up!

ARGS(10) %>% 
  deparse1() %>% 
  styler::style_text()
  args %>%
    args() %>%
    args() %>%
    args() %>%
    args() %>%
    args() %>%
    args() %>%
    args() %>%
    args() %>%
    args() %>%
    args()
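To see how Reduce() grows the expression inside ARGS(), it can help to keep the intermediate results with accumulate = TRUE. This sketch only builds the calls and never evaluates them, so magrittr does not need to be attached; `steps` is just an illustrative name.

```r
# Same reducer as in ARGS(), but keeping every intermediate expression
steps <- Reduce(
  f = function(x, y) bquote(.(x) %>% args()),
  x = seq_len(3),
  init = quote(args),
  accumulate = TRUE,
  simplify = FALSE
)

# One more `%>% args()` is appended at each step
vapply(steps, deparse1, character(1))
```

The `y` argument of the reducer is ignored on purpose: the sequence `seq_len(n)` only controls how many times the wrapping happens.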

Wanna see even more unhinged?

Let’s try to produce a “matrix” of args(). You get a choice of i “rows” of piped lines, and j “columns” of args()-around-args each time - all to produce a NULL.

Ready?

ARGS2 <- function(i, j) {
  Reduce(
    f = \(x,y) bquote(.(x) %>% (.(y))),
    x = rep(list(Reduce(\(x,y) call("args", x), seq_len(j), quote(args))), i)
  )
}
ARGS2(5, 1) %>% 
  deparse1() %>%
  styler::style_text()
  args(args) %>%
    (args(args)) %>%
    (args(args)) %>%
    (args(args)) %>%
    (args(args))
ARGS2(5, 3) %>% 
  deparse1() %>%
  styler::style_text()
  args(args(args(args))) %>%
    (args(args(args(args)))) %>%
    (args(args(args(args)))) %>%
    (args(args(args(args)))) %>%
    (args(args(args(args))))
ARGS2(10, 5) %>% 
  deparse1() %>%
  styler::style_text()
  args(args(args(args(args(args))))) %>%
    (args(args(args(args(args(args)))))) %>%
    (args(args(args(args(args(args)))))) %>%
    (args(args(args(args(args(args)))))) %>%
    (args(args(args(args(args(args)))))) %>%
    (args(args(args(args(args(args)))))) %>%
    (args(args(args(args(args(args)))))) %>%
    (args(args(args(args(args(args)))))) %>%
    (args(args(args(args(args(args)))))) %>%
    (args(args(args(args(args(args))))))
list(
  eval(ARGS2(5, 1)),
  eval(ARGS2(5, 3)),
  eval(ARGS2(10, 5))
)
  [[1]]
  NULL
  
  [[2]]
  NULL
  
  [[3]]
  NULL

Yay!
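As for why these chains collapse to NULL: in magrittr, a parenthesized right-hand side is evaluated first, and the resulting function is then applied to the left-hand side. Here is a pipe-free sketch of that fold, assuming this reading of the magrittr semantics (the names `cell` and `apply_rhs` are illustrative):

```r
# One "cell" of the matrix: j = 3 layers of args() around args
cell <- Reduce(function(x, y) call("args", x), seq_len(3), quote(args))
deparse1(cell)

# Fold i = 5 evaluated cells left to right, applying each right-hand
# function to the running left-hand value, as `lhs %>% (rhs)` would
apply_rhs <- function(lhs, rhs) rhs(lhs)
result <- Reduce(apply_rhs, rep(list(eval(cell)), 5))
is.null(result)
```

The very first application already produces NULL, and every later cell just receives that NULL and returns NULL again.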

TL;DR: str()

If you want a version of args() that does what it’s supposed to, use str() instead:2

str(args)
  function (name)
str(sum)
  function (..., na.rm = FALSE)
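The caveat about "srcref" can be reproduced without relying on session options. In this sketch, the function `f` is a throwaway example that is parsed with keep.source = TRUE so that it carries a source reference, which utils::removeSource() then strips:

```r
# Throwaway example function, parsed so that it keeps its source reference
f <- eval(parse(text = "function(x, y = 2) x + y", keep.source = TRUE))
has_srcref <- !is.null(attr(f, "srcref"))
has_srcref

# str(f) would list the "srcref" attribute below the signature;
# removing it leaves just the one-line signature
g <- utils::removeSource(f)
is.null(attr(g, "srcref"))
str(g)
```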

args() is hereafter banned from my blog.

Coda (serious): redesigning args()

The context for my absurd rant above is that I was just complaining about how I think args() is a rather poorly designed function.

Let’s try to redesign args(). I’ll do three takes:

Take 1) Display is the side-effect; output is trivial

If the whole point of args() is to display a function’s arguments for inspection in interactive usage, then that can simply be done as a side-effect.

As I said above, str() surprisingly has this more sensible behavior out of the box. So let’s write our first redesign of args() which just calls str():

args1 <- function(name) {
  str(name)
}
args1(sum)
  function (..., na.rm = FALSE)

In args1()/str(), information about the function arguments is sent to the console.3 We know this because we can’t suppress it with invisible(), but we can grab it via capture.output():

invisible( args1(sum) )
  function (..., na.rm = FALSE)
capture.output( args1(sum) )
  [1] "function (..., na.rm = FALSE)  "

For functions whose purpose is to signal information to the console (and whose usage is limited to interactive contexts), we don’t particularly care about the output. In fact, because the focus isn’t on the output, the return value should be as trivial as possible.

A recommended option is to just invisibly return NULL. This is now how args1() does it (via str()):4

print( args1(sum) )
  function (..., na.rm = FALSE)  
  NULL
is.null( args1(sum) )
  function (..., na.rm = FALSE)
  [1] TRUE

Alternatively, the function could just invisibly return what it receives,5 which is another common pattern for cases like this. Again, we return invisibly to avoid distracting from the fact that the point of the function is to display as the side-effect.

args2 <- function(name) {
  str(name)
  invisible(name)
}
args2(rnorm)
  function (n, mean = 0, sd = 1)
args2(rnorm)(5)
  function (n, mean = 0, sd = 1)
  [1] -0.5494891  1.2861975 -1.2755454  1.0817387 -0.7248563
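The difference between what a function prints and what it returns can be checked directly with capture.output() and withVisible(). A minimal sketch, where `show_and_return()` is a hypothetical stand-in for any display-as-side-effect function:

```r
# Hypothetical function: prints as a side effect, returns its input invisibly
show_and_return <- function(x) {
  cat("displayed as a side effect\n")
  invisible(x)
}

# capture.output() collects what was sent to the output stream ...
printed <- capture.output(value <- show_and_return(42))
printed

# ... while the return value is still available, just flagged as invisible
value
visibility <- withVisible(show_and_return(42))$visible
visibility
```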

Take 2) Display is the side-effect; output is meaningful

One thing I neglected to mention in this blog post is that there are other ways to extract a function’s arguments. One of them is formals():6

formals(args)
  $name
formals(rnorm)
  $n
  
  
  $mean
  [1] 0
  
  $sd
  [1] 1

formals() returns the information about a function’s arguments in a list, which is pretty boring, but it’s an object we can manipulate (unlike the return value of str()). So there are some pros and cons.
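One caveat when reaching for formals(): it returns NULL for primitive functions such as sum(). The workaround shown in the examples of ?formals is, somewhat ironically, to go through args() first. A short sketch (`sum_formals` is just an illustrative name):

```r
is.primitive(sum)      # sum() is implemented as a primitive
is.null(formals(sum))  # so formals() has nothing to report

# args() builds a closure with the right formals, which formals() can read
sum_formals <- formals(args(sum))
names(sum_formals)
sum_formals$na.rm
```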

Actually, we could just combine both formals() and str():

args3 <- function(name) {
  str(name)
  invisible(formals(name))
}
arguments <- args3(rnorm)
  function (n, mean = 0, sd = 1)
arguments
  $n
  
  
  $mean
  [1] 0
  
  $sd
  [1] 1
arguments$mean
  [1] 0

You get the nice display as a side-effect (via str()) and then an informative output (via formals()). You could even turn this into a class with a print method, which is definitely the better way to go about this, but I’m running out of steam here and I don’t like OOP, so I won’t touch that here.
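For anyone curious what the class-with-a-print-method route might look like, here is one possible S3 sketch. Everything named here (`args5()`, the "fn_args" class, the `signature` and `formals` fields) is hypothetical and chosen for illustration only.

```r
# Hypothetical constructor; "fn_args" is an illustrative class name
args5 <- function(name) {
  stopifnot(is.function(name))
  shell <- args(name)  # going through args() also covers primitives like sum()
  structure(
    list(
      signature = trimws(capture.output(str(shell))[1]),
      formals = formals(shell)
    ),
    class = "fn_args"
  )
}

# Printing shows only the signature line; the formals stay accessible
print.fn_args <- function(x, ...) {
  cat(x$signature, "\n", sep = "")
  invisible(x)
}

arguments <- args5(rnorm)
arguments
arguments$formals$sd
```

With this design the display is no longer a side effect of calling the function: it happens only when the object is printed, so assigning the result to a variable stays silent.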

Take 3) Just remove the NULL

This last redesign is the simplest of the three, and narrowly deals with the problem of that pesky NULL shown alongside the function arguments:

args(sum)
  function (..., na.rm = FALSE) 
  NULL

Fine, I’ll give them that args() must, for compatibility with S or whatever reason, return a whole new function object, which in turn requires a function body. But if that function is just a placeholder and not meant to be called, can’t you just make the function body, like, empty?

args4 <- function(name) {
  f <- args(name)
  body(f) <- quote(expr=)
  f
}
args4(sum)
  function (..., na.rm = FALSE)
args4(rnorm)
  function (n, mean = 0, sd = 1)
typeof( args4(rnorm) )
  [1] "closure"

Like, come on!

sessionInfo()

  R version 4.3.3 (2024-02-29 ucrt)
  Platform: x86_64-w64-mingw32/x64 (64-bit)
  Running under: Windows 11 x64 (build 22631)
  
  Matrix products: default
  
  
  locale:
  [1] LC_COLLATE=English_United States.utf8 
  [2] LC_CTYPE=English_United States.utf8   
  [3] LC_MONETARY=English_United States.utf8
  [4] LC_NUMERIC=C                          
  [5] LC_TIME=English_United States.utf8    
  
  time zone: America/New_York
  tzcode source: internal
  
  attached base packages:
  [1] stats     graphics  grDevices utils     datasets  methods   base     
  
  other attached packages:
  [1] magrittr_2.0.3
  
  loaded via a namespace (and not attached):
   [1] crayon_1.5.2      vctrs_0.6.5       cli_3.6.1         knitr_1.45       
   [5] rlang_1.1.2       xfun_0.41         purrr_1.0.2       styler_1.10.2    
   [9] jsonlite_1.8.8    htmltools_0.5.7   sass_0.4.7        fansi_1.0.5      
  [13] rmarkdown_2.25    R.cache_0.16.0    evaluate_0.23     jquerylib_0.1.4  
  [17] distill_1.6       fastmap_1.1.1     yaml_2.3.7        lifecycle_1.0.4  
  [21] memoise_2.0.1     compiler_4.3.3    prettycode_1.1.0  downlit_0.4.3    
  [25] rstudioapi_0.15.0 R.oo_1.25.0       R.utils_2.12.3    digest_0.6.33    
  [29] R6_2.5.1          R.methodsS3_1.8.2 bslib_0.6.1       tools_4.3.3      
  [33] withr_3.0.0       cachem_1.0.8

  1. You can even see lazy evaluation in action when it receives stop() without erroring.↩︎

  2. Though you have to remove the "srcref" attribute if the function has one. But also don’t actually do this!↩︎

  3. Technically, the "output" stream.↩︎

  4. For the longest time, I thought args() was doing this from how its output looked.↩︎

  5. Essentially acting like identity().↩︎

  6. But note that it has a special behavior of returning NULL for primitive functions (written in C) that clearly have user-facing arguments on the R side. See also formalArgs() for a shortcut to names(formals()).↩︎