The R computing environment uses packages to organize objects into discrete sets. A package may contain a combination of functions and datasets. The base package has the basic R tools.
Packages are collections of R functions, data, and compiled code in a well-defined format. Packages are installed onto your computer with install.packages(), which needs to be done only once per package. Installed packages are updated with update.packages(). Both operations can also be done within RStudio from the Packages tab in the bottom-right pane.
The directory where packages are stored on your computer is called the library. Packages are attached from the library to the search path of your current session using the command library(). For more information see packages vs. libraries and links therein.
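The install-once, attach-every-session pattern described above can be sketched as follows. This is only an illustration: the helper name attach_or_stop is hypothetical (it is not part of R), and the install.packages() and update.packages() calls are commented out because they need network access.

```r
# .libPaths() returns the library directories that R searches for packages.
lib_dirs <- .libPaths()
print(lib_dirs)

# Hypothetical helper (not part of R): attach a package if it is installed,
# otherwise stop with a reminder to install it first.
attach_or_stop <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is not installed; run install.packages(\"",
         pkg, "\") once.")
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# install.packages("dplyr")   # done once per library (needs internet access)
# update.packages()           # refresh installed packages from time to time
attach_or_stop("stats")       # stats ships with R, so it is always installed
```

Checking with requireNamespace() before calling library() gives a clearer error message when a package has not been installed yet.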
To see what packages are attached, use
sessionInfo()
## R version 3.4.1 (2017-06-30)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Sierra 10.12.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] compiler_3.4.1 backports_1.1.0 magrittr_1.5 rprojroot_1.2
## [5] tools_3.4.1 htmltools_0.3.6 yaml_2.1.14 Rcpp_0.12.12
## [9] stringi_1.1.5 rmarkdown_1.6 knitr_1.17 stringr_1.2.0
## [13] digest_0.6.12 evaluate_0.10.1
Besides base, the packages attached automatically are usually stats and graphics, the datasets package described below, and some more arcane helpers: grDevices, utils, and methods. Note that a number of packages may be loaded via a namespace (and not attached), which typically means they are used indirectly by some other package. Each of these is listed with its version number appended after an underscore (_).
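The difference between attached and merely loaded packages can also be inspected directly. The sketch below assumes a fresh session in which the tools package (which ships with R) has not been attached; the variable names are only illustrative.

```r
# Attached packages appear on the search path as "package:<name>".
attached <- sub("^package:", "", grep("^package:", search(), value = TRUE))
print(attached)

# Load the namespace of tools without attaching it.
requireNamespace("tools", quietly = TRUE)
loaded <- loadedNamespaces()

# Namespaces that are loaded but not attached: roughly the set that
# sessionInfo() reports under "loaded via a namespace (and not attached)".
loaded_only <- setdiff(loaded, attached)
print(loaded_only)

# The version of an individual package can be queried on its own.
print(packageVersion("stats"))
```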
datasets package
All R distributions provide the datasets package, which contains only sample datasets. In an interactive session, the help() call shown below will bring up the index of help pages for the datasets package.
This is a collection of datasets, many of them organized in the basic tabular data structure (rows correspond to observations, columns to variables) called a data.frame in R. Others are stored as lists, time series, or other structures.
help(package="datasets")
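The same index can be retrieved programmatically, and the tabular structure checked on a single dataset. The following is a small sketch using data() and the airquality dataset; the variable names idx and aq are only illustrative.

```r
# data(package = ...) returns an index whose $results component is a
# character matrix; its columns include "Item" and "Title".
idx <- data(package = "datasets")$results
print(head(idx[, c("Item", "Title")]))

# Inspect one dataset that is stored as a data.frame.
aq <- datasets::airquality
print(is.data.frame(aq))
print(dim(aq))      # number of rows (observations) and columns (variables)
print(names(aq))    # variable names
print(head(aq, 3))  # first three observations
```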
An alternative is to list the names of objects in a package. Here we use the pattern argument to show only datasets whose names begin with a.
ls("package:datasets", pattern = "^a")
## [1] "ability.cov" "airmiles" "airquality" "anscombe" "attenu"
## [6] "attitude" "austres"
Often of more interest is a listing of the names together with a brief description of each object's structure:
ls.str("package:datasets", pattern = "^a")
## ability.cov : List of 3
## $ cov : num [1:6, 1:6] 24.64 5.99 33.52 6.02 20.75 ...
## $ center: num [1:6] 0 0 0 0 0 0
## $ n.obs : num 112
## airmiles : Time-Series [1:24] from 1937 to 1960: 412 480 683 1052 1385 ...
## airquality : 'data.frame': 153 obs. of 6 variables:
## $ Ozone : int 41 36 12 18 NA 28 23 19 8 NA ...
## $ Solar.R: int 190 118 149 313 NA NA 299 99 19 194 ...
## $ Wind : num 7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
## $ Temp : int 67 72 74 62 56 66 65 59 61 69 ...
## $ Month : int 5 5 5 5 5 5 5 5 5 5 ...
## $ Day : int 1 2 3 4 5 6 7 8 9 10 ...
## anscombe : 'data.frame': 11 obs. of 8 variables:
## $ x1: num 10 8 13 9 11 14 6 4 12 7 ...
## $ x2: num 10 8 13 9 11 14 6 4 12 7 ...
## $ x3: num 10 8 13 9 11 14 6 4 12 7 ...
## $ x4: num 8 8 8 8 8 8 8 19 8 8 ...
## $ y1: num 8.04 6.95 7.58 8.81 8.33 ...
## $ y2: num 9.14 8.14 8.74 8.77 9.26 8.1 6.13 3.1 9.13 7.26 ...
## $ y3: num 7.46 6.77 12.74 7.11 7.81 ...
## $ y4: num 6.58 5.76 7.71 8.84 8.47 7.04 5.25 12.5 5.56 7.91 ...
## attenu : 'data.frame': 182 obs. of 5 variables:
## $ event : num 1 2 2 2 2 2 2 2 2 2 ...
## $ mag : num 7 7.4 7.4 7.4 7.4 7.4 7.4 7.4 7.4 7.4 ...
## $ station: Factor w/ 117 levels "1008","1011",..: 24 13 15 68 39 74 22 1 8 55 ...
## $ dist : num 12 148 42 85 107 109 156 224 293 359 ...
## $ accel : num 0.359 0.014 0.196 0.135 0.062 0.054 0.014 0.018 0.01 0.004 ...
## attitude : 'data.frame': 30 obs. of 7 variables:
## $ rating : num 43 63 71 61 81 43 58 71 72 67 ...
## $ complaints: num 51 64 70 63 78 55 67 75 82 61 ...
## $ privileges: num 30 51 68 45 56 49 42 50 72 45 ...
## $ learning : num 39 54 69 47 66 44 56 55 67 47 ...
## $ raises : num 61 63 76 54 71 54 66 70 71 62 ...
## $ critical : num 92 73 86 84 83 49 68 66 83 80 ...
## $ advance : num 45 47 48 35 47 34 35 41 31 41 ...
## austres : Time-Series [1:89] from 1971 to 1993: 13067 13130 13198 13254 13304 ...
When examining a new R package, ls.str is a useful way to learn what objects are in the package. It lists both datasets and functions. However, its output can be rather verbose; it is often better to use the Packages tab in the bottom-right pane of RStudio. There you will find a list of objects with one-line descriptions, and you can open the help page for each object by clicking on its name. Often, packages have overview documentation toward the top.
Note that in the calls to ls and ls.str the package name is given as a character string, "package:datasets". This convention is also used in describing which packages are attached in a session.
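For a package that mixes functions and data, the same "package:name" string can be used to separate the two kinds of objects. The sketch below uses the stats package as an example; the variable names are only illustrative.

```r
pkg <- "package:stats"
all_names <- ls(pkg)

# Classify each object on that search-path entry as a function or not.
is_fun <- vapply(all_names,
                 function(nm) is.function(get(nm, pos = pkg)),
                 logical(1))
print(table(is_fun))

# lsf.str() is the functions-only counterpart of ls.str().
print(lsf.str(pkg, pattern = "^med"))

# The same check applied to the datasets package.
ds_names <- ls("package:datasets")
ds_is_fun <- vapply(ds_names,
                    function(nm) is.function(get(nm, pos = "package:datasets")),
                    logical(1))
print(table(ds_is_fun))
```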
Packages have a namespace, which determines which objects are visible to users. This is a rather arcane topic, but it is important to understand for those going on to develop their own packages.
Normally, one attaches a package using the library command, which gives direct access to all objects exported from the namespace of that package. It is possible to access objects in a package without attaching it by using the convention packagename::objectname. For instance, the following makes explicit reference to the package datasets to examine the structure of ToothGrowth.
str(datasets::ToothGrowth)
## 'data.frame': 60 obs. of 3 variables:
## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
This is not necessary for already attached packages (such as datasets), but can be helpful to document the source of objects. It is generally used in packages that use a few functions or datasets from another package. A package can also pass along objects from another package: for instance, attaching the dplyr package makes the magrittr pipe (%>%) available without explicitly loading magrittr.
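As a further sketch of the packagename::objectname convention, the following calls a function from the tools package (which ships with R) without attaching it. The file path is a made-up example.

```r
# Call a function from tools without attaching the package.
ext <- tools::file_ext("analysis/report.Rmd")
print(ext)

# Qualified names also document where each object comes from.
mean_len <- base::mean(datasets::ToothGrowth$len)
print(mean_len)

# Using :: loads the namespace of tools, but does not attach the package,
# so this is FALSE unless tools was attached earlier in the session.
print("package:tools" %in% search())
```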
motivation for creating packages
R source package contents
DESCRIPTION text file
R, man, data
inst, vignettes (inst typically has sub-folder doc with Rmd files, etc.)
NAMESPACE
README.md
myname.Rproj
LICENSE
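The layout listed above can be mocked up in a temporary directory to see how the pieces fit together. This is only a sketch, not a buildable package: the package name myname is taken from the list, while every DESCRIPTION field value is a placeholder and the remaining files are left empty.

```r
# Hypothetical package root under a temporary directory.
pkg_dir <- file.path(tempdir(), "myname")
sub_dirs <- c("R", "man", "data", file.path("inst", "doc"), "vignettes")
for (d in sub_dirs) {
  dir.create(file.path(pkg_dir, d), recursive = TRUE, showWarnings = FALSE)
}

# DESCRIPTION is a plain text file of "Field: value" lines.
# All values below are placeholders.
desc <- data.frame(
  Package = "myname",
  Version = "0.0.1",
  Title = "Placeholder Title",
  Description = "Placeholder description of what the package does.",
  License = "file LICENSE",
  stringsAsFactors = FALSE
)
write.dcf(desc, file = file.path(pkg_dir, "DESCRIPTION"))

# Empty placeholders for the remaining top-level files.
top_files <- c("NAMESPACE", "README.md", "LICENSE", "myname.Rproj")
invisible(file.create(file.path(pkg_dir, top_files)))

# Show the resulting tree and read two fields back from DESCRIPTION.
print(list.files(pkg_dir, recursive = TRUE, include.dirs = TRUE))
desc_back <- read.dcf(file.path(pkg_dir, "DESCRIPTION"),
                      fields = c("Package", "Version"))
print(desc_back)
```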
R installed package contents
Jenny Bryan: Writing your own R package
> fortune("installing")