-
Notifications
You must be signed in to change notification settings - Fork 579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Fix terminology JSON schema URLs #6618
Conversation
234e34d
to
22deb0d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense, the logic in vtkSlicerTerminologiesModuleLogic
account for the new URL (support introduced in 56c9d32).
That said, I suggest to prefix the commit title using either STYLE
or ENH
as it is not fixing an actual problem. @cpinter Could you confirm ?
Slicer/Modules/Loadable/Terminologies/Logic/vtkSlicerTerminologiesModuleLogic.cxx
Lines 53 to 56 in 788909d
static std::string ANATOMIC_CONTEXT_SCHEMA = "https://raw.githubusercontent.com/qiicr/dcmqi/master/doc/anatomic-context-schema.json#"; | |
static std::string ANATOMIC_CONTEXT_SCHEMA_1 = "https://raw.githubusercontent.com/qiicr/dcmqi/master/doc/schemas/anatomic-context-schema.json#"; | |
static std::string TERMINOLOGY_CONTEXT_SCHEMA = "https://raw.githubusercontent.com/qiicr/dcmqi/master/doc/segment-context-schema.json#"; | |
static std::string TERMINOLOGY_CONTEXT_SCHEMA_1 = "https://raw.githubusercontent.com/qiicr/dcmqi/master/doc/schemas/segment-context-schema.json#"; |
There is no actual bug, but I figured if the json is being validated, it would fail using the old URL so that would be an error. I can change the prefix, up to you. |
Actually, it seems that the URLs were fixed in the commit you referenced but then broken again in 5ae5564 |
Good catch. That said, since the logic used to check if a specific document is associated with a specific schema account for both URL, it was not strictly a breaking change. Regarding the schema reference, my understanding 12 is that having the URL being valid is helpful but not strictly required as its purpose is to identify the document. The application is then responsible to map that to actual schema and perform validation (which we don't do in Slicer). Also worth noting that the Footnotes |
Changing to |
Yes, the URL does not have to be an actual download URL. Actual server addresses, organization names, etc. (and therefore the download URL) may change, while the schema name is a unique identifier that is supposed to be unchanged forever. |
The schema files have been moved into a subdirectory so the URL became invalid. This commit fixes them.
22deb0d
to
1ef2bf1
Compare
In this case I think they should not be URLs at all, because then it suggests that that's where the schema file can be found, which is not really expected. I changed the prefix. |
This JSON validator also uses the convention with the dollar sign. For reference, I am working on the issue SlicerRt/SlicerRT#218 and am trying to define the schema. This is why I started looking into this. |
That's a good point.
If we change the schema URL then we should use a simpler URL, such as This is what Microsoft does, too. They use a schema names like https://developer.microsoft.com/json-schemas/... (see for example here), but the schema is actually stored on github, at https://github.com/Microsoft/json-schemas. |
It is supposed to be the actual download URL, and from what I see in the dcmqi repository in https://github.com/QIICR/dcmqi/tree/master/doc/schemas, they are such downloadable URLs (unless I don't correctly interpret "downloadable URL" term). Can you clarify Andras? Do you mean it is hard to keep it up to date because of the commit hash tags and versioning of that file? Here's a potentially relevant discussion on schema versioning: json-schema-org/website#197. Separately - the URL below is just invalid - it does not exist in any of the schemas that are available in dcmqi. Is this just a bug/typo that should be fixed? Slicer/Modules/Loadable/Terminologies/Logic/vtkSlicerTerminologiesModuleLogic.cxx Line 53 in 788909d
|
The old (now invalid because it has been moved into the schemas folder) URL is allowed so that terminology files defined before this relocation of the json file are still accepted. You can see in this commit 56c9d32 that the new, valid URL was added, but the old one kept for this sort of backwards compatibility. |
Thanks for the clarification, I missed it. It is unfortunate that I didn't think of this consequence while re-organizing that folder back in 2017. |
It is not a hard requirement to make a schema file avaialable at that URL on the public internet. Its primary role is to serve as a unique identifier. It is just a nice convenience feature that this URI may be used to obtain the schema (because then editors can automatically retrieve the schema and use it for documentation, syntax highlighting, validation checks, etc). But an application may have other mechanisms to obtain a schema file based on a schema URI. If you lose access over the download server and the schema URI does not download the schema file anymore, then it may still make sense to keep the same schema URI, especially if files with this schema are widely circulated. In this specific case, I would not change https://raw.githubusercontent.com/qiicr/dcmqi/master/doc/anatomic-context-schema.json to https://raw.githubusercontent.com/qiicr/dcmqi/master/doc/schemas/anatomic-context-schema.json# because they are both bad. They are both too complex and they are not under QIICR's control. These github URLs will not disappear overnight, but github may change download URL syntax or may arbitrarily limit access for any reasons (security, bandwith use, geographical location, etc.) at any time. If the schema URI is ever changed again, it should be something like https://qiicr.org/schemas/... or https://schema.qiicr.org/..., and they can be redirected to github with redirection rules on the qiicr.org web server. |
Something as simple as renaming the default branch of https://github.com/QIICR/dcmqi from For renaming a branch:
|
Indeed, you convinced me that the current approach is not good and is not reliable. I wish we had this discussion before those schemas were published! but:
Maybe the way to do it, if one would do it over, would be to use versioned DOIs from a service like Zenodo https://zenodo.org/record/1256592 ? |
While discussing this during the 2022-11-01 hangout, three competing priorities were identified:
We also discussed:
|
Considering that ensuring domain like
Next steps:
@cpinter @lassoan @pieper @thewtex @fedorov @jamesobutler How does that sound ? |
Thank you for discussing it and the update about the potential solution! Yes it sounds good to me to have such a mapping. |
I would close this issue, as we know that the updated URLs would not remain a valid URL forever (for example, it does not contain any version identifier). Maybe we could add a low-priority issue in the bugtracker to record the results of this discussion. |
For reference, I am duplicating here a note I also added in the referenced issue:
|
The schema files have been moved into a subdirectory so the URL became invalid. This commit fixes them.