Improve speed of tk_augment_lags() #143

Open
PabloCanovas opened this issue Mar 23, 2023 · 1 comment

Comments


PabloCanovas commented Mar 23, 2023

When creating several lags at once for a given variable, I've found that a map + partial structure is around 7 times faster than tk_augment_lags() on big datasets with multiple lags (I tried with 16M rows and 10 lags). It could be worth checking out.

For your reference, this is the function I built:

    calculate_lags <- function(df, var, lags) {
      map_lag <- lags %>% map(~ partial(lag, n = .x))

      return(
        df %>%
          mutate(across(.cols = {{ var }}, .fns = map_lag, .names = "{.col}_lag{lags}"))
      )
    }
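For anyone who wants to try this, here is a minimal usage sketch on a toy tibble. The `toy` and `big` data, the column name `value`, and the sizes are made up for illustration; `dplyr::lag` is spelled out so that `stats::lag` is not picked up by accident. The timing part only runs if timetk is installed, assumes the `tk_augment_lags(.data, .value, .lags)` signature, and makes no claim about the numbers you will see.

```r
library(dplyr)
library(purrr)

# Same idea as the function above: one partialised lag function per lag value.
calculate_lags <- function(df, var, lags) {
  map_lag <- lags %>% map(~ partial(dplyr::lag, n = .x))
  df %>%
    mutate(across(.cols = {{ var }}, .fns = map_lag, .names = "{.col}_lag{lags}"))
}

# Toy data (hypothetical): a single integer column.
toy <- tibble(value = 1:5)
out <- calculate_lags(toy, value, lags = 1:2)
print(out)

# Rough timing sketch; skipped when timetk is not available.
if (requireNamespace("timetk", quietly = TRUE)) {
  big <- tibble(value = rnorm(1e5))
  print(system.time(calculate_lags(big, value, lags = 1:10)))
  print(system.time(timetk::tk_augment_lags(big, value, .lags = 1:10)))
}
```

Note that the `"{.col}_lag{lags}"` naming relies on `lags` lining up with a single selected column, so this sketch is meant for one variable at a time.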

@spsanderson

I was literally exploring the same thing this morning.
