Setting "gradient_clipping" in the DeepSpeed config does not work. Looking at the source code of deepspeed.runtime.engine.DeepSpeedEngine, around line 2101:
def _take_model_step(self, lr_kwargs, block_eigenvalue={}):
    if self.gradient_clipping() > 0.0:
        if not (self.fp16_enabled() or self.bfloat16_enabled() or self.amp_enabled() or self.zero_optimization()):
            self.clip_fp32_gradients()
        elif self.amp_enabled():
            # AMP's recommended way of doing clipping
            # https://nvidia.github.io/apex/advanced.html#gradient-clipping
            master_params = amp.master_params(self.optimizer)
            clip_grad_norm_(parameters=master_params, max_norm=self.gradient_clipping(), mpu=self.mpu)
    self.optimizer.step()
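So when fp16, bf16, or ZeRO is enabled without AMP, neither branch runs and nothing in this function clips the gradients before self.optimizer.step() is called. As far as I can tell, gradient clipping does nothing at all in that case!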
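
To make the branch logic above easier to check, here is a minimal sketch that mirrors only the conditions in the quoted snippet. The helper name `clipping_branch` and its boolean flags are hypothetical stand-ins for the engine's `*_enabled()` methods, not part of DeepSpeed. It only models this one function and does not account for anything the optimizer itself might do inside `step()`.

```python
def clipping_branch(gradient_clipping, fp16=False, bf16=False, amp=False, zero=False):
    """Hypothetical helper: report which branch of the quoted
    _take_model_step snippet would run for a given configuration."""
    if gradient_clipping > 0.0:
        if not (fp16 or bf16 or amp or zero):
            # Plain fp32 training: the engine clips directly.
            return "clip_fp32_gradients"
        elif amp:
            # Apex AMP: clip the master params.
            return "amp clip_grad_norm_"
    # fp16 / bf16 / ZeRO without AMP, or clipping disabled: neither branch runs.
    return "no clipping in this function"


if __name__ == "__main__":
    configs = {
        "fp32 only": {},
        "amp": {"amp": True},
        "fp16": {"fp16": True},
        "bf16": {"bf16": True},
        "zero": {"zero": True},
    }
    for name, flags in configs.items():
        print(f"{name}: {clipping_branch(1.0, **flags)}")
```

With "gradient_clipping" set to 1.0, only the plain fp32 and AMP configurations reach a clipping call in this sketch; fp16, bf16, and ZeRO fall through to the final return.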