Examples of using differentiable least squares #254
Comments
@zeroAska Yes, it is supported. You may do something like
Bi-level optimization like this will be directly supported in a future release.
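As a rough illustration of the inner/outer (bi-level) pattern being discussed, here is a minimal sketch in plain NumPy. This is not PyPose's actual API: the inner level is a closed-form least-squares "pose" solve standing in for an iterative GN/LM solver, and the outer level does gradient descent on a single scalar "network" parameter `w`, with the chain rule written out by hand in place of autograd.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 3))        # measurement model
b_base = rng.normal(size=20)        # raw observations
theta_true = np.array([1.0, -2.0, 0.5])

def inner_solve(b):
    # Inner level: closed-form least-squares solve for theta
    # minimizing ||A @ theta - b||^2 (stand-in for a GN/LM solver).
    return np.linalg.lstsq(A, b, rcond=None)[0]

# The inner solution is linear in w (since b = w * b_base), so its
# Jacobian w.r.t. w is simply inner_solve(b_base); we use it for the
# outer-level chain rule instead of autograd.
c = inner_solve(b_base)

w = 0.0                              # outer-level "network" parameter
lr = 0.1 / (c @ c)                   # step size scaled for stable descent
for _ in range(500):
    theta = inner_solve(w * b_base)          # inner optimization
    grad_w = 2.0 * c @ (theta - theta_true)  # outer gradient (chain rule)
    w -= lr * grad_w                         # outer SGD step

# Closed-form minimizer of ||w * c - theta_true||^2, for comparison.
w_opt = (c @ theta_true) / (c @ c)
```

In a real PyTorch/PyPose setting the outer gradient would come from autograd rather than a hand-derived Jacobian; the point here is only the two nested optimization levels.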
Thanks for the quick response. If
You can use
Thanks!! Another question about the above: if a batch of training data has different poses, can we multiply each pose with its corresponding data and launch a different least-squares problem for each element within a batch? For example, in a batch of 2, we have the pose batch
For initialization, it is no different from a neural network; you may perform in-place value assignment for module parameters. For the second question, if you mean activating different parameters for each LM problem to solve, PyPose currently doesn't directly support this, because LM and GN do not work with stochastic inputs: they don't use gradient descent, so the solutions would jump far away from the last iteration and fail to converge. However, technically you can do it by defining different optimizers for different parameters.
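For the batched-poses part of the question, a batch of independent least-squares problems can be solved jointly as long as each batch element keeps its own design matrix and observations. The sketch below (plain NumPy, not PyPose's API) solves a batch of 2 independent linear least-squares problems in one vectorized call via the per-sample normal equations.

```python
import numpy as np

rng = np.random.default_rng(1)
B, m, n = 2, 10, 4                      # batch of 2 independent problems
A = rng.normal(size=(B, m, n))          # per-sample design matrices
theta_true = rng.normal(size=(B, n))    # per-sample "poses"
b = np.einsum('bmn,bn->bm', A, theta_true)  # per-sample observations

# Solve every problem in the batch at once via its normal equations:
# theta_i = (A_i^T A_i)^{-1} A_i^T b_i for each batch index i.
AtA = np.einsum('bmn,bmk->bnk', A, A)   # (B, n, n)
Atb = np.einsum('bmn,bm->bn', A, b)     # (B, n)
theta = np.linalg.solve(AtA, Atb[..., None])[..., 0]
```

Each batch element here recovers its own `theta_true[i]` exactly, since the observations were generated without noise.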
Many thanks!
As a follow-up question: in the above outer-inner loop setup, since the prediction comes from the least-squares solve, how is its gradient w.r.t. the ground truth propagated through the least-squares layer?
We suggest retaining only the gradients from the last iteration of the inner optimization, as this is more efficient and equivalent to back-propagating through the entire inner iterative optimization. For more details, you may refer to Sec. 3.4 of this paper.
An easy way to do this is to perform one more model forward pass after the inner optimization, then run the outer optimization.
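The "one more forward" trick can be checked numerically on a toy linear inner problem (a hypothetical setup, not PyPose code): for a linear residual, a single Gauss-Newton step from any starting point lands exactly at the inner optimum, so differentiating only through that final step reproduces the full gradient. The sketch below compares that last-step gradient against a finite-difference gradient of the whole inner loop.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(15, 3))
b0 = rng.normal(size=15)
target = rng.normal(size=3)

def gn_step(theta, b):
    # One Gauss-Newton step for min ||A @ theta - b||^2.
    r = A @ theta - b
    return theta - np.linalg.solve(A.T @ A, A.T @ r)

def outer_loss(w):
    theta = np.zeros(3)
    for _ in range(5):                 # inner optimization (no grad tracked)
        theta = gn_step(theta, w * b0)
    return np.sum((theta - target) ** 2)

# Gradient through the LAST GN step only: that step maps any theta to
# the exact optimum theta*(w) = w * pinv(A) @ b0, so
# d theta*/dw = pinv(A) @ b0.
w = 0.7
dtheta_dw = np.linalg.lstsq(A, b0, rcond=None)[0]
theta_star = w * dtheta_dw
grad_last_step = 2.0 * dtheta_dw @ (theta_star - target)

# Finite-difference gradient of the full inner loop, for comparison.
eps = 1e-6
grad_fd = (outer_loss(w + eps) - outer_loss(w - eps)) / (2 * eps)
```

For nonlinear residuals the equivalence holds only near convergence of the inner loop, which is why the suggestion is to run the inner optimization to convergence first and retain gradients only for the final step.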
Thanks for the paper link. I will check it out. |
In the provided paper above, does the bi-level optimization (i.e., the inner/outer loops) share the same loss? If the two stages optimize different losses, can we still use the trick of keeping only the last iteration's gradients? For example, the inner loop that optimizes the pose might have a label-free loss, while the outer loop that optimizes the network parameters might have a supervised loss.
They don't have to share the same loss. Another example with different loss functions is this paper.
I noticed that the optimizers in PyPose are decorated with @torch.no_grad() (e.g. in optim.GN.step and optim.LM.step), so how can I back-propagate the gradient through the optimizers to the front-end neural network?
After optimization, we suggest performing another model forward pass.
If the outer level loss is a supervised loss, does the outer level's gradient propagation method in the paper still hold? |
Yes, a supervised loss is an easier case.
📚 The doc issue
In the provided examples, the least-squares problem optimizes over all the parameters. However, in some applications, some of the parameters come from a neural network and should be optimized with SGD, while the others can be optimized directly by the least-squares solvers. In Theseus, this is specified by the "inner loop" and "outer loop". Does the current version of PyPose support this?
Suggest a potential alternative/fix
Provide an example in which the state space is a neural network to be learned and the pose is optimized by least-squares solvers.