In R there is now a function to load packages called use() (from version 4.4.0 onwards). There are of course already library() and require(), but I strongly recommend using use(), or at least understanding why it exists and adjusting how you use library() accordingly.
To understand the usefulness of use(), let us work with an example where you have already loaded a package like {data.table} but also want to use a few functions from {dplyr}. When you use use() to load {dplyr}, you will see that there are functions such as between(), first(), and last() in both packages, and if you call any of these functions, they will be called from {dplyr} (as this is the most recent package you have loaded):
use("dplyr")
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:data.table':
#>
#>     between, first, last
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
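If you want to check which definition wins for a masked name, you can ask R where it finds that name on the search path. The sketch below is only an illustration and makes one loud assumption: instead of requiring {dplyr} and {data.table} to be installed, it attaches a small list under the made-up name demo_pkg, a hypothetical stand-in for a package that exports its own lag().

```r
# Hypothetical stand-in for a package that exports its own lag();
# "demo_pkg" is a made-up name, not a real package.
attach(list(lag = function(x, ...) "lag() from demo_pkg"),
       name = "demo_pkg", warn.conflicts = FALSE)

# Search-path entries that define lag(), in lookup order:
# the most recently attached entry comes first and wins.
where_lag <- utils::find("lag")
print(where_lag)

# An unqualified call hits the masking definition ...
unqualified <- lag(1:3)

# ... while the pkg::fun form always reaches the intended one.
qualified <- stats::lag(ts(1:3))

detach("demo_pkg")  # clean up the search path again
```

The same utils::find() call works for real packages, so after attaching {dplyr} on top of {data.table} you can run it on "between" to see both entries and their order.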
Here is the smart part. You do not need to load all functions in a package. You can specify the few functions you would like to use in your namespace by simply providing a vector of the function names of interest. In the example here (from a fresh R session with {data.table} loaded), let us say we want to use select() and slice() from {dplyr}:
use("dplyr", c("select", "slice"))
Here, we do not get any message about objects being masked in our namespace, since select() and slice() are not in conflict with functions from the other loaded packages. We are specifying the arguments in use() by position, which is fine. Here is the same code, but with all arguments named explicitly:
use(package = "dplyr", include.only = c("select", "slice"))
Let us now take a quick look under the hood of use() and see how it works:
use
#> function (package, include.only)
#> invisible(library(package, lib.loc = NULL, character.only = TRUE,
#>     logical.return = TRUE, include.only = include.only, attach.required = FALSE))
#> <bytecode: 0x12c0c27c8>
#> <environment: namespace:base>
That is straightforward. use() is just a shortcut to library() that only takes the argument for which functions to include and changes a few defaults. Accordingly, we can cut out the middleman and run the same specification as above with library():
library("dplyr", include.only = c("select", "slice"))
This is beautiful. Why do I like this so much? Because I hate name conflicts as much as the next guy, and it is great to be explicit about which functions you rely on and where they come from. It makes code a lot easier to debug, and you gain a much better awareness of what is stored in your active R session. I was reading a blog post at Jumping Rivers about the new features of R version 4.5.0, and I agree with its take on use(), which draws a parallel to how functions are imported in Python.
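To make the point about awareness concrete, here is a minimal sketch of how you can inspect what a selective attach actually puts in your session. It assumes a fresh R session, and it uses {tools} instead of {dplyr} purely because {tools} ships with every R installation; that swap is mine, for illustration only.

```r
# Assumption: a fresh session, so {tools} is not attached yet.
# {tools} stands in for {dplyr} only because it ships with R.
library("tools", include.only = c("file_ext", "toTitleCase"))

# Only the two requested functions were attached ...
attached <- ls("package:tools")
print(attached)

# ... so other exports are absent from the search-path entry
# and remain reachable only through the tools:: prefix.
other_attached <- "file_path_sans_ext" %in% attached
stem <- tools::file_path_sans_ext("report.csv")

# The attached functions can be called without a prefix.
ext <- file_ext("report.csv")
```

Listing the package entry like this is a quick way to confirm that nothing beyond the functions you asked for has ended up on the search path.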
I see no good reason not to include only the functions you are going to use. Not only does it make it easier for the reader of your code to understand why each package is loaded, it also makes it easier for you to spot a package that is not required at all in your script, even if you have not set up a linter for this purpose.
As always, there are exceptions. Do whatever makes sense for you and think about when it makes sense to load all functions instead of just a few. For example, {ggplot2} and {data.table} come to mind as packages where it will be annoying to include each unique function required to make a plot or work with a data.table.
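For those cases, one possible middle ground is the exclude argument of library(), which attaches everything except the names you list. use() itself does not expose this argument, so the sketch below calls library() directly. It assumes a fresh R session and uses {splines}, which ships with R, as a stand-in for a larger package; treat it as an illustration rather than a recommendation for any specific package.

```r
# Assumption: a fresh session, so {splines} is not attached yet.
# {splines} ships with R and stands in for a package such as {dplyr}.
library("splines", exclude = "ns")

attached <- ls("package:splines")

# ns() was left out, so it cannot mask or be masked ...
ns_attached <- "ns" %in% attached

# ... while the rest of the package, e.g. bs(), is available.
bs_attached <- "bs" %in% attached
```

An excluded function is still available through the prefix, here splines::ns(), so you only give up the unqualified name.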
That being said, the ability to be explicit about what functions you want to rely on in your script is a great improvement to the long-term reproducibility of your code. Use use().