I am a tidyverse enthusiast. The proof is in the pudding: of my six packages on GitHub, only one DESCRIPTION
contains a non-tidyverse package (rcicero, tidyjson). I once contemplated rewriting these packages sans the
tidyverse (for science, learning, growth, bragging rights, and character building), but I broke into a cold
sweat once I typed plot. Admittedly, my reliance on the tidyverse might be considered a crutch. Do I really
know R, or just the conventions of a popular subset? A different question for a different time.
The inspiration for this post was to practice some UNIX command line magic before bringing the data into R. I recently worked through Sean Kross’ excellent Coursera course, The Unix Workbench, and wanted to groove a little in the terminal. My question was this: which tidyverse packages do I use the most?
First, I had to grep for all the tidyverse exports (e.g. “dplyr::”, “purrr::”). To that end, I wrote a
small bash script that loops over its arguments, searches for each one within the package R/ directories,
and then counts the output lines.
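#!/usr/bin/env bash
# For each argument, count the lines in .R files under the current directory that mention it.
for package in "$@"
do
  # -F matches the argument as a literal string; -exec ... {} + copes with spaces in file names
  find . -name "*.R" -exec grep -F -- "$package" {} + | wc -l
done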
I named the script count_namespace_exports.sh, granted it execute permission, called it with the tidyverse
packages, and redirected the output into a file.
chmod +x count_namespace_exports.sh
./count_namespace_exports.sh "dplyr::" "purrr::" "tidyr::" "httr::" "tibble::" "stringr::" "readr::" "xml2::" "rvest::" "jsonlite::" > package_exports.txt
echo "dplyr::" "purrr::" "tidyr::" "httr::" "tibble::" "stringr::" "readr::" "xml2::" "rvest::" "jsonlite::" > packages.txt
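Since the two files line up by position, they could be stitched into a single table before leaving the terminal. Here is a minimal sketch, assuming packages.txt holds the names on one space-separated line (as the echo above writes it) and package_exports.txt holds one count per line in the same order. The helper name pair_counts and the file package_counts.csv are made up for illustration.

```shell
#!/usr/bin/env bash
# pair_counts is a hypothetical helper, not part of the original script.
# $1: file with space-separated package names on one line (e.g. packages.txt)
# $2: file with one count per line, in the same order (e.g. package_exports.txt)
pair_counts() {
  # The first tr puts each name on its own line, paste joins name and count
  # with a comma, and the last tr strips the padding some wc versions add.
  tr -s ' ' '\n' < "$1" | paste -d ',' - "$2" | tr -d ' '
}

# Example: pair_counts packages.txt package_exports.txt > package_counts.csv
```

A two-column file like this is one way to keep each name attached to its count, rather than relying on the order of two separate files once the data is in R.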
Now to read the data into R for cleaning and visualization.
I was surprised to see that purrr overtook dplyr. Now to ascertain from the terminal which functions I use most.
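That last step, finding the most-used functions from the terminal, might look something like the sketch below. It is one possible approach, not necessarily the one I will end up with: the helper name count_functions is hypothetical, and it assumes a grep that supports the -o and -h options (GNU and BSD grep both do).

```shell
#!/usr/bin/env bash
# count_functions is a hypothetical helper, shown only as a sketch.
# $1: a namespace prefix such as "purrr::"
# $2: directory to search (defaults to the current directory)
count_functions() {
  # -h drops file names, -o prints only the matched text, and -E enables the
  # extended regex: the prefix followed by the characters of a function name.
  find "${2:-.}" -name '*.R' \
    -exec grep -hoE -- "$1[A-Za-z0-9_.]+" {} + |
    sort | uniq -c | sort -rn   # tally identical calls, most frequent first
}

# Example: count_functions "purrr::" | head
```

Unlike the line counts above, this tallies every occurrence, so a line with two purrr:: calls contributes twice.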