Dear authors,
The paper states that training was done "using a single NVIDIA RTX-3090 GPU with a training batch size of 4, employing learning rates of 4 × 10−6 for DB". Is that correct?
With the same hyperparameter settings, training is slow on my V100 GPU (about two and a half hours for a single concept). My gradient accumulation steps are set to 1 and max_train_steps is 1500. I would appreciate any help you can offer.
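For reference, this is the command I am running. It is a sketch assuming the standard diffusers DreamBooth example script (`train_dreambooth.py`); the model path, data directories, and prompt are placeholders from my local setup:

```shell
# Hedged reproduction of my setup: batch size 4, lr 4e-6,
# gradient accumulation 1, 1500 steps, on a single V100.
# Paths and prompt below are placeholders, not from the paper.
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --instance_data_dir="./instance_images" \
  --output_dir="./db_output" \
  --instance_prompt="a photo of sks concept" \
  --resolution=512 \
  --train_batch_size=4 \
  --learning_rate=4e-6 \
  --gradient_accumulation_steps=1 \
  --max_train_steps=1500
```

Please let me know if any of these flags differ from the configuration you used.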