-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Tests for Base.cwstring #56123
base: master
Are you sure you want to change the base?
WIP: Tests for Base.cwstring #56123
Conversation
Fixes equality operator in first test of "cwstring" testset Co-authored-by: Sukera <[email protected]>
982b27f
to
4bd0536
Compare
What operating system are you using? How are you interacting with git? The command line interface/VS Code/something else? |
test/strings/basic.jl
Outdated
@@ -1409,3 +1409,18 @@ end | |||
@test transcode(String, transcode(UInt8, transcode(UInt16, str))) == str | |||
end | |||
end | |||
|
|||
@testset "cwstring" begin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Base.cwstring
is only defined on Windows:
Lines 108 to 124 in b0c1525
if ccall(:jl_get_UNAME, Any, ()) === :NT | |
""" | |
Base.cwstring(s) | |
Converts a string `s` to a NUL-terminated `Vector{Cwchar_t}`, suitable for passing to C | |
functions expecting a `Ptr{Cwchar_t}`. The main advantage of using this over the implicit | |
conversion provided by [`Cwstring`](@ref) is if the function is called multiple times with the | |
same argument. | |
This is only available on Windows. | |
""" | |
function cwstring(s::AbstractString) | |
bytes = codeunits(String(s)) | |
0 in bytes && throw(ArgumentError("embedded NULs are not allowed in C strings: $(repr(s))")) | |
return push!(transcode(UInt16, bytes), 0) | |
end | |
end |
if Sys.iswindows()
@testset "cwstring" begin
# ...
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Testset still fails (intentionally) on Windows, will my branch still be merged or should a fix addressing the tests be pushed first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We aren't going to merge a PR with broken tests (unless explicitly marked so), but I'm not clear why we should just add a broken test. What's broken? The test? The code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Base.cwstring() converts an input string s to a NUL-terminated character vector but currently throws an error for any input string s that contains a NUL character anywhere. This is the problem test/input
str_2 = "Wordu\000"
# ...
@test Base.cwstring(str_2) == UInt16[0x0057, 0x006f, 0x0072, 0x0064, 0x0000]
where I input a string with a terminating NUL expecting cwstring() to trim the NUL off the end of the input and proceed to convert the remainder as normal, but instead cwstring() errors. Sorry for not clarifying earlier, but my ultimate question is should we add terminal NUL trimming to cwstring() or just leave the function alone? If the latter, I can just change my broken test to
@test_throws ArgumentError Base.cwstring(str_2)
so all tests in the testset work and we can merge this PR without issue.
Sorry it took me so long to get back to you. Right now I’m on Windows and am just using the CLI (Cygwin) for Git interaction. When I first opened this PR, I had an SSH connection to my GitHub account from Windows Command Prompt but not from Cygwin. I was originally swapping between Command Prompt and Cygwin (building Julia with Cygwin, git push through Command Prompt, for example). It’s been so long I don’t remember exactly how my bad workflow worked, but I know it was likely the cause behind the random filemode changes and made reverting them difficult. I eventually connected Cygwin to my GitHub, used just Cygwin, and have not had similar permission issues or “unable to index file” errors since. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding these tests! This was indeed a hole in our test coverage.
str_3 = "aܣ𒀀" | ||
@test Base.cwstring(str_0) == UInt16[0x0000] | ||
@test_throws ArgumentError Base.cwstring(str_1) | ||
@test Base.cwstring(str_2) == UInt16[0x0057, 0x006f, 0x0072, 0x0064, 0x0000] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@test Base.cwstring(str_2) == UInt16[0x0057, 0x006f, 0x0072, 0x0064, 0x0000] | |
@test_throws ArgumentError Base.cwstring(str_2) |
I do not think we should have automatic NUL trimming on cwstring
.
- There would be a slight performance cost to check for that
- Generally,
cwstring(s)
produces aVector{Cwchar_t}
that representss
exactly. If we trimmed trailing NUL characters then the returned value would not fully describe the input - I don't see a case where someone would have a String or AbstractString that is semantically NUL terminated and want to automatically strip that character.
I made simple unit tests for this function, but I would like some clarification on empty string and NUL terminated string inputs. cwstring currently throws an error for both, but I test for valid return vectors. If that's wrong, I can change the first and third tests.
Also, sorry for the wall of filemode changes. I tried to remove them from the commit but kept getting "unable to index file" errors. Advice on how to fix would be appreciated!