Distributed training error #30

Traceback (most recent call last):
  File "/home/cvlab1045/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/pydevd.py", line 3489, in <module>
    main()
  File "/home/cvlab1045/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/pydevd.py", line 3482, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/cvlab1045/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/pydevd.py", line 2510, in run
    return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
  File "/home/cvlab1045/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/pydevd.py", line 2517, in _exec
    globals = pydevd_runpy.run_path(file, globals, '__main__')
  File "/home/cvlab1045/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/cvlab1045/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/cvlab1045/.vscode-server/extensions/ms-python.debugpy-2024.0.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "main.py", line 197, in <module>
    main(config=update_config(config=cfg, option=opt))
  File "main.py", line 179, in main
    train(config=config, logger=logger)
  File "/media/cvlab1045/D1/LJ/MOTIP-main/train_engine.py", line 102, in train
    sampler_train.set_epoch(epoch)
  File "/media/cvlab1045/D1/LJ/MOTIP-main/train_engine.py", line 340, in train_one_epoch
    
  File "/media/cvlab1045/D1/LJ/media/cvlab1045/LJtrack/lib/python3.8/site-packages/torch/_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/media/cvlab1045/D1/LJ/media/cvlab1045/LJtrack/lib/python3.8/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: Expected to mark a variable ready only once. This error is caused by one of the following reasons: 1) Use of a module parameter outside the `forward` function. Please make sure model parameters are not shared across multiple concurrent forward-backward passes. or try to use _set_static_graph() as a workaround if this module graph does not change during training loop.2) Reused parameters in multiple reentrant backward passes. For example, if you use multiple `checkpoint` functions to wrap the same part of your model, it would result in the same set of parameters been used by different reentrant backward passes multiple times, and hence marking a variable ready multiple times. DDP does not support such use cases in default. You can try to use _set_static_graph() as a workaround if your module graph does not change over iterations.
Parameter at index 352 with name seq_decoder.id_decoder.embed_to_word.weight has been marked as ready twice. This means that multiple autograd engine  hooks have fired for this particular parameter during this iteration.
The message points at seq_decoder.id_decoder.embed_to_word.weight, so the weight of this layer seems to be what causes the problem during multi-process training. Could someone please take a look?
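The error text itself names `_set_static_graph()` as a possible workaround. Below is a minimal sketch of what that call might look like. It assumes the process group is already initialized and that the model's graph really does not change between iterations, and it is not confirmed to resolve this particular case. `wrap_with_static_graph` is a hypothetical helper, not something from MOTIP.

```python
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def wrap_with_static_graph(model: nn.Module, **ddp_kwargs) -> DDP:
    """Hypothetical helper: wrap `model` in DDP and declare its graph static.

    Assumes torch.distributed.init_process_group() has already been called
    and that the same parameters are used in the same way every iteration.
    """
    ddp_model = DDP(model, **ddp_kwargs)
    # Workaround named in the error message. It has to be called before the
    # first forward pass. Recent PyTorch releases also accept
    # DDP(model, static_graph=True) in the constructor instead.
    ddp_model._set_static_graph()
    return ddp_model
```

If the graph does change across iterations (for example, a branch that is taken only on some batches), this setting is not appropriate, and the double use of `embed_to_word.weight` would need to be tracked down in the forward pass instead.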
