Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Fast mul! with Q matrices from QR factorizations. #31581

Merged
merged 6 commits into from
Apr 15, 2019
Merged

RFC: Fast mul! with Q matrices from QR factorizations. #31581

merged 6 commits into from
Apr 15, 2019

Conversation

nrontsis
Copy link
Contributor

@nrontsis nrontsis commented Apr 2, 2019

Purpose

Uses LAPACK's gemqrt! and ormqr! when performing mul! with Q matrices, of QR decompositions, or their transpose. Fixes JuliaLang/LinearAlgebra.jl#612. Acknowledged duplicate of #31163.

Tasks list

  • Add mul! methods
  • Add tests
  • Discuss issues with lmul!/mul! on non-strided arrays.

Notes

No methods for mul! with a tranposed AbstractMatrix/AbstractVector input. This is because I could not find a way to perform this without extra memory allocation.

nrontsis added 2 commits April 2, 2019 12:43
Use BLAS' gemqrt! and ormqr! when performing mul! with Q matrices or their transpose.
@ViralBShah ViralBShah added the linear algebra Linear algebra label Apr 2, 2019
end

function mul!(C::StridedVecOrMat{T}, Q::AbstractQ{T}, B::StridedVecOrMat{T}) where T<:BlasFloat
check_dimensions(C, 'N', 'N', Q, B)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to check the dimensions here? Both copyto! and two-argument lmul! have checks so I wouldn't think it would be needed here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With StridedVecOrMats of BlasFloats, lmul! calls LAPACK.gemqrt! or LAPACK.ormqr! directly. So it depends on whether these assume correct sizes already.

On a different note, there is also function lmul!(A::QRPackedQ, B::AbstractVecOrMat). Should we then also have the proposed mul! methods here for abstract vectors or matrices?

Copy link
Contributor Author

@nrontsis nrontsis Apr 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LAPACK.gemqrt! and LAPACK.ormqr! seem to check dimensions, but copyto!(A, B) only checks that length(A) >= length(B) which is less strict than the checks I have.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks, I think that we should also support AbstractVecOrMat. On a related note, I cannot understand thought why rmul! requires StridedMatrix while lmul! does not, i.e.

function lmul!(A::QRPackedQ, B::AbstractVecOrMat)
function rmul!(A::StridedMatrix, adjQ::Adjoint{<:Any,<:QRPackedQ})

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If one wanted to have stricter checks, one could use copy! instead of copyto!, which requires equal axes. I couldn't find any use of copy! in qr.jl though. That sounds like there's no need for additional size checks then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, it occurred to me that for mutating *mul! functions you need memory to write into. So it could be that the destination arrays need to be strided, but for 3-arg mul!, the source array could be abstract. Does that make sense for a general rule?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dkarrasch thanks I wasn't aware of copy!. I will use that and remove the additional size checks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I thought that since both the LAPACK functions and the generic Julian *mul! functions do size checks, it would suffice to leave it to them and to use the more "generous" copyto! (as suggested by @andreasnoack), independently of whether you go the LAPACK or the generic Julia route. Basically, leave everything as you have, but kick out size checks.

@nrontsis
Copy link
Contributor Author

nrontsis commented Apr 8, 2019

As suggested, I removed the size checks. Also, following @dkarrasch suggestion, I relaxed the type of the "input" of mul! (i.e. its second or third argument) to be AbstractVecOrMat.

@dkarrasch I must say though that I do not understand what you meant by the following:

for mutating *mul! functions you need memory to write into.

On a related note, I do not understand why Julia's version (non-LAPACK) of rmul! requires strided matrices while lmul! does not.

@dkarrasch
Copy link
Member

What I meant was that for mutating versions, you require memory to write the result into. I assume that this is why you need a StridedVecOrMat for the output. For an input, it should be okay to have, say, a Range of appropriate size, it's just that you won't be able to call BLAS in that case, but then we still have the generic multiplication versions. From what I understand, strided arrays are arrays that have their data written out one way or another in the memory, as opposed to arrays which get their data, for instance, from function calls, like in ranges. For that reason, I guess both lmul! and rmul! should have a strided object as the one that is mutated, but no-one noticed because it's not tested? Could try to require the mutated arguments to be all strided, and then see if tests pass. Or wait until an expert tells what to do. 😃

@dkarrasch
Copy link
Member

Consider the following example:

A = rand(4,4)
F = qr(A, Val(true))
F.Q * (1.0:4.0) # works fine
lmul!(F.Q, 1.0:4.0) # yields ERROR: setindex! not defined for UnitRange{Int64}
lmul!(F.Q, collect(1.0:4.0)) # works fine again, Float64.(1:4) gives a StridedVector

Same with F = qr(A), where F.Q::QRCompactWYQ. So I very much assume that the mutated vec or mat should always be strided, but the input can be abstract, unless we call LAPACK. @andreasnoack ?

@nrontsis
Copy link
Contributor Author

nrontsis commented Apr 8, 2019

Thanks alot @dkarrasch for the explanations!

Following the suggestions above I relaxed the definitions of the newly-added methods to be:

mul!(C::StridedVecOrMat{T}, Q::AbstractQ{T}, B::AbstractVecOrMat{T}) where {T}
mul!(C::StridedVecOrMat{T}, A::AbstractVecOrMat{T}, Q::AbstractQ{T}) where {T}
mul!(C::StridedVecOrMat{T}, adjQ::Adjoint{<:Any,<:AbstractQ{T}}, B::AbstractVecOrMat{T}) where {T}
mul!(C::StridedVecOrMat{T}, A::AbstractVecOrMat{T}, adjQ::Adjoint{<:Any,<:AbstractQ{T}}) where {T}

However, the LinearAlgebra/test/matmul.jl now fails on line 508 with the following stacktrace:

Test Failed at /Users/nrontsis/julia/stdlib/LinearAlgebra/test/ambiguous_exec.jl:4
  Expression: detect_ambiguities(LinearAlgebra; imported=true, recursive=true) == []
   Evaluated: Tuple{Method,Method}[(mul!(C::AbstractArray{T,2} where T, ...  in LinearAlgebra at /Users/nrontsis/julia/usr/share/julia/stdlib/v1.2/LinearAlgebra/src/qr.jl:742)] == Any[]
ERROR: Error while loading expression starting at /Users/nrontsis/julia/stdlib/LinearAlgebra/test/ambiguous_exec.jl:4
caused by [exception 1]
There was an error during testing
method ambiguity: Test Failed at /Users/nrontsis/julia/stdlib/LinearAlgebra/test/matmul.jl:508

Any advice would be greatly appreciated.

@dkarrasch
Copy link
Member

dkarrasch commented Apr 8, 2019

I have seen elsewhere in the code (or remember Andreas mention) that sometimes methods are split for vectors and matrices, like

mul!(C::StridedMatrix{T}, Q::AbstractQ{T}, B::AbstractMatrix{T}) where {T}
mul!(C::StridedVector{T}, Q::AbstractQ{T}, B::AbstractVector{T}) where {T}

etc., maybe that helps?

EDIT: You generalized the method signature in commit 3dd2729, and didn't mention the issue before. So that could be an indication that this is indeed the issue.

@nrontsis
Copy link
Contributor Author

nrontsis commented Apr 11, 2019

@dkarrasch thanks for the suggestion. Unfortunately it seems that splitting the methods to vectors and matrices did not resolve the method ambiguity.

However, restricting all the inputs to be strided solved it. This is obviously not ideal, as, according to the discussion above, the second or third input to mul! should be able to be non-strided.

Regardless, I restricted the method definitions to only allow for strided inputs, so as to have a reference commit at which the tests pass.

I have also updated the description of the issue to include a task list, hoping that this will facilitate the discussion.

@dkarrasch
Copy link
Member

Good to know that there was a good reason for the strided requirement. After all, I think this should cover the regular use cases. Vectors that result from other multiplication processes will be likely strided anyway, and the Range example above was very artificial. I think this is a good solution. Probably, if you really want to, you can still pass a non-strided vector, it's just that it will go through the (slow) fallback.

@nrontsis
Copy link
Contributor Author

Okay, in that case lets also restrict lmul! to strided arrays? If you agree with that then I will make the change and, in my point of view, I think we would be ready to merge to master?

@dkarrasch
Copy link
Member

I'm not sure if we should add restrictions without immediate need. You could try to relax the rmul! method, because both the r/lmul! can be called without going through mul!.

@nrontsis nrontsis changed the title Fast mul! with Q matrices from QR factorizations. RFC: Fast mul! with Q matrices from QR factorizations. Apr 12, 2019
@nrontsis
Copy link
Contributor Author

Okay let's leave it as it is then. Do you think this is ready to be merged?

@dkarrasch
Copy link
Member

I don't have authority to merge, but apparently I do have authority to approve, which I have just done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance issue of mul! in Q matrices from QR factorizations
4 participants