pipecleaner

pipecleaner is a utility R package for debugging pipelines that use the magrittr %>% pipe. Its debug_pipeline function opens the input pipeline in R's debugging browser in a form that lets the user step through the successive calls of the pipeline, examining the output of each one.

Installation

pipecleaner is not currently on CRAN, but can be installed from GitHub with:

# install.packages("remotes")
remotes::install_github("alistaire47/pipecleaner")

Debugging pipelines

To debug a pipeline, call debug_pipeline on the raw code or a character vector of code. If no input is supplied and it is called from RStudio, it will use whatever code is highlighted in the source editor as input.

debug_pipeline can also be called via an RStudio add-in by highlighting the pipeline to debug and then selecting “Debug pipeline in browser” from the “Addins” menu.

Once called, debug_pipeline will reassemble the pipeline into a function that can be debugged in the browser and call the debugger. Each line adds another call from the pipeline and prints the output, so the user can see the state of the data as it passes through the pipeline by stepping through the function.

The data is also stored to a variable called dot[N] in each line, where [N] is the index of the call, making it easy to compare input and output data of a step in the pipeline and try out new code formulations in the console.

All together, it looks like this:

library(magrittr)
library(pipecleaner)

debug_pipeline(
    x = 1:5 %>% 
        rev %>% 
        {. * 2} %>% 
        sample(replace = TRUE)
)
#> dot1 <- rev(1:5)
#> dot2 <- {dot1 * 2}
#> x <- sample(dot2, replace = TRUE)
#> debugging in: pipeline_function()
#> debug: {
#>     print(dot1 <- rev(1:5))
#>     print(dot2 <- {
#>         dot1 * 2
#>     })
#>     print(x <- sample(dot2, replace = TRUE))
#> }
#> debug at /Users/alistaire/Documents/R_projects/pipecleaner/R/debug_pipeline.R#272: print(dot1 <- rev(1:5))
#> [1] 5 4 3 2 1
#> debug: print(dot2 <- {
#>     dot1 * 2
#> })
#> debug: dot1 * 2
#> [1] 10  8  6  4  2
#> debug: print(x <- sample(dot2, replace = TRUE))
#> [1]  6 10  8  4 10
#> exiting from: pipeline_function()
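
The function that debug_pipeline builds can be approximated by hand, which may help clarify what the browser is stepping through. The sketch below is a hand-written stand-in modelled on the `debug:` output above, not pipecleaner's actual code. It uses only base R, and the `result` name is an assumption made for illustration.

```r
# Hand-written stand-in for the function shown in the `debug:` output above.
# pipecleaner builds its version programmatically; this is only an illustration.
pipeline_function <- function() {
    print(dot1 <- rev(1:5))                     # step 1: reverse the input
    print(dot2 <- {dot1 * 2})                   # step 2: double each element
    print(x <- sample(dot2, replace = TRUE))    # step 3: resample with replacement
}

# Only flag the function for debugging in an interactive session, so the
# script also runs unattended.
if (interactive()) debugonce(pipeline_function)

# print() returns its argument invisibly, so the call returns the final value.
result <- pipeline_function()
```

While paused in the browser, intermediate values such as dot1 and dot2 are ordinary local variables, so an expression like `dot2 / dot1` can be evaluated at the prompt to compare the input and output of a step.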

Bursting pipes

Occasionally it is necessary to restructure code from a piped to an unpiped form. burst_pipes makes this sort of restructuring simple:

burst_pipes(
    x = 1:5 %>% 
        rev %>% 
        {. * 2} %>% 
        .[3] %>% 
        rnorm(1, ., sd = ./10)
)
#> dot1 <- rev(1:5)
#> dot2 <- {dot1 * 2}
#> dot3 <- dot2[3]
#> x <- rnorm(1, dot3, sd = dot3/10)
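
One way to gain confidence in a burst pipeline is to run both forms with the same random seed and compare the results. The following is a minimal sketch that assumes only magrittr is installed; the unpiped half is copied from the output above, and the seed value and the names `piped` and `burst` are arbitrary choices made for illustration.

```r
library(magrittr)

# Piped form, as passed to burst_pipes above.
set.seed(47)
piped <- 1:5 %>%
    rev %>%
    {. * 2} %>%
    .[3] %>%
    rnorm(1, ., sd = ./10)

# Unpiped form, copied from the burst_pipes output above.
# Resetting the seed makes rnorm() draw the same value in both runs.
set.seed(47)
dot1 <- rev(1:5)
dot2 <- {dot1 * 2}
dot3 <- dot2[3]
burst <- rnorm(1, dot3, sd = dot3/10)

# TRUE when both forms compute the same value.
identical(piped, burst)
```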

More specific names can be specified as a character vector:

burst_pipes(
    x <- 1:5 %>% 
        rev %>% 
        {. * 2} %>% 
        .[3] %>% 
        rnorm(1, ., sd = ./10),
    names = c("reversed", "doubled", "third", "x")
)
#> reversed <- rev(1:5)
#> doubled <- {reversed * 2}
#> third <- doubled[3]
#> x <- rnorm(1, third, sd = third/10)

burst_pipes can also be called via a pair of RStudio add-ins, which replace the highlighted code with its restructured form. The “Burst pipes” add-in creates the names itself; the “Burst pipes and set names” add-in allows custom names to be set.

Limitations

pipecleaner should successfully debug most pipelines. However, due to its structure, it does have known limitations:

  • Only the %>% pipe is handled, not more exotic pipes like %$%. For the moment, this is unlikely to change absent significant demand.
  • Nested pipelines—e.g. piping within an anonymous function in purrr::map—are ignored; the whole call is treated as one step.
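
To make the second limitation concrete, the sketch below shows a nested pipeline and one possible way to isolate the inner part. It assumes only magrittr is installed and uses base R's lapply in place of purrr::map to avoid an extra dependency. The helper name `reverse_and_double` is hypothetical, and the refactoring is a suggestion rather than something pipecleaner documents.

```r
library(magrittr)

# A nested pipeline: the anonymous function passed to lapply() contains its
# own %>% chain. lapply() stands in for purrr::map here.
nested <- list(1:3, 4:6) %>%
    lapply(function(v) v %>% rev %>% {. * 2}) %>%
    unlist

# Possible workaround (an assumption, not a documented feature): give the
# inner pipeline a name so it can be examined apart from the outer one.
reverse_and_double <- function(v) {
    v %>% rev %>% {. * 2}
}

flat <- list(1:3, 4:6) %>%
    lapply(reverse_and_double) %>%
    unlist

# Both forms compute the same vector.
identical(nested, flat)
```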

Related

  • ViewPipeSteps is a similar project that calls View() after each step in the pipeline.
  • magrittr itself contains debug_pipe, which is a wrapper around browser that returns its input, allowing it to be inserted within a pipeline.
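
For comparison, a minimal sketch of the debug_pipe approach might look like the following. It assumes only magrittr is installed; `pause_here` is a hypothetical name, and the `interactive()` guard is an addition so that the script also runs unattended.

```r
library(magrittr)

# Use debug_pipe only in interactive sessions; identity() is a no-op stand-in
# elsewhere. `pause_here` is a hypothetical name chosen for this example.
pause_here <- if (interactive()) debug_pipe else identity

doubled <- 1:5 %>%
    rev %>%
    pause_here %>%    # in an interactive session the browser opens here
    {. * 2}
```

Unlike debug_pipeline, this approach requires editing the pipeline itself and pauses only at the point where the call is inserted.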