Memory issue : cannot allocate vector of size #5

Open
mohseniaref-InSAR opened this issue Oct 17, 2021 · 1 comment

mohseniaref-InSAR commented Oct 17, 2021

Hi,

I got the following error and was wondering if you could help me with it. Is there any way to deal with memory issues and big data?

Best regards,
Mohammad

library(HiClimR)
library(ncdf4)
nc_data <- nc_open('/raid-manaslu/maref/InSAR/S1/QdtFilterSenAT76/geo_timeseries_ramp_demErr_multiply-1_msk.nc')
nc <- ncvar_get(nc_data, "timeseries")
lon <- ncvar_get(nc_data, "longitude")
lat <- ncvar_get(nc_data, "latitude", verbose = F)
t <- ncvar_get(nc_data, "time")
xGrid <- grid2D(lon = unique(lon), lat = unique(lat))
lonn <- c(xGrid$lon)
latt <- c(xGrid$lat)
n <- aperm(nc, c(3,2,1))
x <- t(matrix(n, nrow=dim(n)[1], byrow=FALSE))
y <- HiClimR(x, lon = lonn, lat = latt, lonStep = 1, latStep = 1, geogMask = FALSE, meanThresh = FALSE, varThresh = 0, detrend = FALSE,standardize = TRUE, nPC = NULL, method = "single", hybrid = FALSE, kH = NULL,members=NULL,nSplit = 4,upperTri = TRUE, verbose = TRUE,validClimR = TRUE, k = 12, minSize = 1, alpha = 0.01,plot = TRUE, colPalette = NULL, hang = -1,labels = FALSE)
  

The error

PROCESSING STARTED

Checking Multivariate Clustering (MVC)...
---> x is a matrix
---> single-variate clustering: 1 variable
Checking data...
---> Checking dimensions...
---> Checking row names...
---> Checking column names...
Data filtering...
---> Computing mean for each row...
---> Checking rows with mean bellow meanThresh...
---> 60485454 rows found, mean ≤  FALSE
---> Computing variance for each row...
---> Checking rows with near-zero-variance...
---> 41238667 rows found, variance ≤  0
Data preprocessing...
---> Applying mask...
---> Checking columns with missing values...
---> Standardizing data...
Agglomerative Hierarchical Clustering...
---> Computing correlation/dissimilarity matrix...
Error: cannot allocate vector of size 479897.8 Gb

mohseniaref-InSAR changed the title from "Error in x[, ii] : subscript out of bounds" to "Memory issue : cannot allocate vector of size" on Oct 17, 2021
Owner

hsbadr commented Oct 18, 2021

y <- HiClimR(x, lon = lonn, lat = latt, lonStep = 1, latStep = 1, geogMask = FALSE, meanThresh = FALSE, varThresh = 0, detrend = FALSE,standardize = TRUE, nPC = NULL, method = "single", hybrid = FALSE, kH = NULL,members=NULL,nSplit = 4,upperTri = TRUE, verbose = TRUE,validClimR = TRUE, k = 12, minSize = 1, alpha = 0.01,plot = TRUE, colPalette = NULL, hang = -1,labels = FALSE)

Data filtering...
---> 60485454 rows found, mean ≤ FALSE
---> 41238667 rows found, variance ≤ 0
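
The `mean ≤ FALSE` in the quoted log is consistent with how R compares numbers to logicals: `FALSE` is coerced to 0, so a logical threshold behaves like a threshold of 0. A minimal base R sketch, using made-up row means and assuming (not confirmed from the HiClimR source) that the filter is a plain `<=` comparison against `meanThresh`:

```r
# Hypothetical row means for three grid cells (not taken from the issue's data).
row_means <- c(cell_a = -2.5, cell_b = 0, cell_c = 3.1)

# In a comparison, R coerces the logical FALSE to the number 0,
# so these two expressions flag exactly the same rows.
flagged_logical <- row_means <= FALSE
flagged_numeric <- row_means <= 0

print(flagged_logical)
print(identical(flagged_logical, flagged_numeric))
```

For data that can legitimately be negative, this would flag every row whose mean is at or below zero, which may not be what was intended.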

@mohseniaref-InSAR meanThresh should be numeric, not logical. Also, it seems that a large number of rows have been filtered out with variance ≤ 0. As for the memory allocation error, the dissimilarity matrix for big data requires a large amount of memory. You may try coarsening the spatial resolution (lonStep and/or latStep > 1) or increasing the number of splits (nSplit > 1):

  # Coarsening spatial resolution
  lonStep = 1, latStep = 1,
  
  # Big data support:
  nSplit = 1, upperTri = TRUE, verbose = TRUE,
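
To get a feel for why coarsening helps, here is a rough back-of-the-envelope sketch in base R. It assumes the failed allocation was a single dense square matrix of 8-byte doubles and that R's "Gb" means 2^30 bytes; how HiClimR actually allocates with `upperTri` and `nSplit` is not taken into account. The step value of 4 and the helper names `dense_matrix_gib` and `implied_rows` are hypothetical, chosen only for illustration.

```r
# Size in GiB of a dense n-by-n matrix of doubles (8 bytes per element).
dense_matrix_gib <- function(n) {
  as.numeric(n)^2 * 8 / 2^30
}

# Number of rows implied by an allocation of `gib` GiB, under the same assumption.
implied_rows <- function(gib) {
  sqrt(gib * 2^30 / 8)
}

# Size reported in the error message from the issue.
n_full <- implied_rows(479897.8)

# Hypothetical coarsening: keeping every 4th longitude and every 4th latitude
# leaves about 1/16 of the cells, so a dense matrix shrinks by about 16^2 = 256.
lon_step <- 4
lat_step <- 4
n_coarse <- n_full / (lon_step * lat_step)

cat(sprintf("implied rows: %.0f\n", n_full))
cat(sprintf("dense matrix after coarsening: %.1f GiB\n", dense_matrix_gib(n_coarse)))
```

Because memory grows with the square of the number of retained cells, the step sizes have a much larger effect than their values suggest, but under these assumptions even a step of 4 in each direction would still leave a matrix far larger than typical RAM.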
