Warning:
This wiki has been archived and is now read-only.
XLIFF Mapping
IMPORTANT: This page/table is currently being progressively moved to XLIFF 1.2 Mapping and XLIFF 2.0 Mapping.
This page will be cleaned up when all the text is moved.
Notes:
- When posting emails about this topic, please refer to Issue 55
- The namespace prefix 'TBD' has been replaced with 'itsx'.
- The 'structural' entries relate to the case where the element with the ITS information is a non-inline (structural) element. For example a <p> in HTML.
- The 'inline' entries relate to the case where the element with the ITS information is an inline element. For example a <span> in HTML.
- In general, this table attempts to map ITS data categories and their attributes first into corresponding attributes or elements in XLIFF 1.2 and 2.0. Only if no native XLIFF equivalent exists is the introduction of a native ITS attribute or element into XLIFF considered, and then only where extensibility is permitted in XLIFF. In other words, the mapping aims to ensure the resulting document is a conformant XLIFF document.
- itsx: is a schema prefix for the namespace http://www.w3.org/ns/its-xliff/.
- Color Code
color | meaning |
---|---|
| stuck in XLIFF TC |
| dependent on an unstable ITS category |
| needs W3C I18N WG review |
The mapping currently takes the following approach:
Data Categories ("driver") | XLIFF 1.2 | XLIFF 2.0 |
---|---|---|
Translate (Yves) |
||
See XLIFF_1.2_Mapping#Translate | See XLIFF_2.0_Mapping#Translate | |
Localization Note (Yves) |
||
See XLIFF_1.2_Mapping#Localization_Note | structural:
<note> |
|
See XLIFF_1.2_Mapping#Localization_Note | inline:
<mrk id='1' type='comment' value='[note]' > should extensibility be introduced here? |
|
Terminology | ||
See XLIFF_1.2_Mapping#Terminology | inline:
<mrk type='term' value='info'|ref='infoRef'> |
|
Directionality | ||
See XLIFF_1.2_Mapping#Directionality | structural: XLIFF2 directionality mechanism | |
See XLIFF_1.2_Mapping#Directionality | inline: XLIFF2 directionality mechanism | |
Language information | ||
structural: fall back on mrk if needed
inline: <mrk mtype='x-its' xml:lang='en'> |
inline: <mrk type='x-its' xml:lang='en'> |
|
Element Within Text (Yves) |
See XLIFF_1.2_Mapping#Element_Within_Text | yes: inline codes no: unit |
Domain | See XLIFF_1.2_Mapping#Domain | itsx:domains |
Text Analysis (Dave and David) |
Recommend only use text analysis inline:
<mrk mtype="phrase" its:taConfidence="0.7" its:taClassRef="http://nerd.eurecom.fr/ontology#Place" its:taIdentRef="http://dbpedia.org/resource/Arizona"> Arizona</mrk> If its:annotatorsRef="text-analysis|http://enrycher.ijs.si" |
inline:
1.2 <mrk mtype='phrase'> using ITS native and (if used) comment for the resolved prose text |
Locale Filter (Yves) |
||
structural: When target locale is undefined:
<trans-unit id='id' its:localeFilterList="*-ca" its:localeFilterType="exclude"> When target locale is known: no extraction or <trans-unit id='id' translate='no|yes'> |
structural:
<trans-unit id='id' its:localeFilterList="*-ca" its:localeFilterType="exclude"> When target locale is known: no extraction or <trans-unit id='id' translate='no|yes'> |
|
inline: When target locale is undefined:
<mrk mtype='x-its' its:localeFilterList="*-ca" its:localeFilterType="exclude"> When target locale is known: Inline code or <mrk mtype="protected">...</mrk> <mrk mtype="x-its-translate-yes">...</mrk> |
inline: When target locale is undefined:
<mrk type='x-its' its:localeFilterList="*-ca" its:localeFilterType="exclude"> When target locale is known: Inline code or <mrk translate='yes|no'> |
|
Provenance (Dave) |
||
structural:
<target its:provenanceRecordsRef="#ph3"> ... <its:provenanceRecords xml:id="ph3"> <its:provenanceRecord person="John Doe" orgRef="http://www.legaltrans-ex.com/" revPerson="Tommy Atkins" revOrgRef="http://www.vistatec.com/" provRef="http://www.examplelsp.com/excontent987/legal/prov/e6354"/> <its:provenanceRecord revPerson="John Smith" revOrgRef="http://john-smith.qa.example.com"/> </its:provenanceRecords> Important note to the example: The natural carriers of the provenance info on the structural level are <source>, <target>, <trans-unit>, <group>, <file>. However, when the <source> and <target> elements are used within an <alt-trans> element (as opposed to <trans-unit>), the parent <alt-trans> element MUST carry all the provenance info relevant for the whole translation candidate. |
structural:
<target its:provenanceRecordsRef="#ph3"> ... <its:provenanceRecords xml:id="ph3"> <its:provenanceRecord person="John Doe" orgRef="http://www.legaltrans-ex.com/" revPerson="Tommy Atkins" revOrgRef="http://www.vistatec.com/" provRef="http://www.examplelsp.com/excontent987/legal/prov/e6354"/> <its:provenanceRecord revPerson="John Smith" revOrgRef="http://john-smith.qa.example.com"/> </its:provenanceRecords> |
|
inline:
<mrk mtype="x-its" its:provenanceRecordsRef="#ph3"> or <mrk mtype="seg" its:provenanceRecordsRef="#ph3"> |
inline:
<mrk id='1' type="its:provenanceRecordsRef" ref="#ph3"> |
|
External Resource (???) |
||
See XLIFF_1.2_Mapping#External_Resource | in unit: NEED TO CHECK RELATIONSHIP TO RESOURCE MODULE | |
See XLIFF_1.2_Mapping#External_Resource | inline:
<mrk id='1' type="itsx:externalResource" ref="[uri]"> |
|
Target Pointer (Yves) |
See XLIFF_1.2_Mapping#Target_Pointer | N/A as mapping. ITS processors working on XLIFF documents should use:
<its:targetLocale selector="//xlf:source" targetPointer="../xlf:target"/> |
Id Value (Yves) |
||
See XLIFF_1.2_Mapping#Id_Value | structural: <unit name="[value]"> | |
See XLIFF_1.2_Mapping#Id_Value | inline: N/A - after deliberation the resolution would be more problematic than resolving the use case | |
Preserve Space (???) |
||
Structural: See XLIFF_1.2_Mapping#Preserve_Space | structural: xml:space | |
inline: See XLIFF_1.2_Mapping#Preserve_Space | inline: | |
Localization Quality Issue (Yves) |
||
structural: may be used in source , seg-source or target elements in a trans-unit or an alt-trans element:
<trans-unit> <target its:locQualityIssuesRef="#lqi1">c'es le contenu </target> </trans-unit> ... <its:locQualityIssues xml:id="lqi1"> <its:locQualityIssue locQualityIssueType="misspelling" locQualityIssueComment="'c'es' is unknown. Could be 'c'est'" locQualityIssueSeverity="50"/> </its:locQualityIssues> It is recommended that only the stand-off mode of annotation is used and that |
in unit:
<unit its:locQualityIssuesRef="#lqi1"> ... <its:locQualityIssues xml:id="lqi1"> ... </its:locQualityIssues> |
|
inline: may be used inline with an mrk within a source , seg-source or target element in a trans-unit or an alt-trans element:
<mrk mtype="x-its" its:locQualityIssuesRef="#lqi1"> It is recommended that only the stand-off mode of annotation is used and that For both structural and inline usage, if the content of an |
inline:
<mrk type="its:lqi" ref="#lqi1"> ... <its:locQualityIssues xml:id="lqi1"> <its:locQualityIssue locQualityIssueType locQualityIssueComment locQualityIssueSeverity locQualityIssueProfileRef locQualityIssueEnabled/> </its:locQualityIssues> |
|
Localization Quality Rating (dF) |
||
structural: may be used to annotate a group , trans-unit or alt-trans
<trans-unit its:locQualityRatingScore="100" its:locQualityRatingScoreThreshold="95" its:locQualityRatingProfileRef="http://example.org/qaModel/v13"> |
structural: | |
inline:
<mrk mtype="x-its" its:locQualityRatingScore="100" its:locQualityRatingScoreThreshold="95" its:locQualityRatingProfileRef="http://example.org/qaModel/v13"> Question: is LQR compatible with inline mark-up, given that it "is used to express an overall measurement of the localization quality of a document or an item in a document."? For both structural and inline usage, if the content of an |
inline: | |
MT Confidence (dF) |
||
Structural: It is recommended that for use in alt-trans the existing xlf:match-quality attribute be used for presenting the value of its:mtConfidence. In this case, the ITS tools information should be given as its:annotatorsRef, e.g.
<alt-trans mid="0" match-quality="0.546" its:annotatorsRef="mtconfidence|http://mlwlt.moravia.com/mlwlt-service-xliff-mt/mlwlt-service.asmx"> Note: Only in cases when the XLIFF files are used with tools that do not consume the XLIFF alt-trans match-quality and origin attributes, should consideration be given to using <alt-trans> <target its:mtConfidence="0.8982">some translated text</target> </alt-trans> In addition, if the content of an <trans-unit> <target its:mtConfidence="0.8982" its:annotatorsRef="mtconfidence|http://mlwlt.moravia.com/mlwlt-service-xliff-mt/mlwlt-service.asmx"> some translated text </target> </trans-unit> If the translation was NOT performed on the whole unit, each segment mrk element must carry the MT confidence metadata (see the inline case). |
structural: | |
inline:
<target> <mrk mtype="seg" its:mtConfidence="0.8982" its:annotatorsRef="mtconfidence|http://mlwlt.moravia.com/mlwlt-service-xliff-mt/mlwlt-service.asmx"> some translated text</mrk> </target> dF: I think that the WG consensus was that mtConfidence makes no sense for a subsegment, so this is really only relevant with mtype="seg" |
inline: | |
Allowed Characters (Yves) |
||
structural: this can be applied only to the source and/or the target elements of a trans-unit element.
<trans-unit> <target its:allowedCharacters="[character spec]"> some translated text </target> </trans-unit> |
structural: this can be applied only to the source and/or the target elements.
<segment> <target its:allowedCharacters="[character spec]"> some translated text </target> </segment> |
|
inline: this can be applied to mrk only within the source and/or the target elements of a trans-unit element.
<trans-unit> <source> <mrk mtype="x-its" its:allowedCharacters="[character spec]">some source text</mrk> </source> </trans-unit>
|
inline: this can be applied to mrk only within the source and/or the target element.
<segment> <source> <mrk mtype="its" its:allowedCharacters="[character spec]">some source text</mrk> </source> </segment> |
|
Storage Size (Yves) |
||
structural: this can be applied only to the source and/or the target elements in a trans-unit .
<trans-unit> <target its:storageSize="12" its:storageEncoding="UTF-16" its:lineBreakType="crlf"> some translated text </target> </trans-unit>
|
structural: this can be applied only to the source and/or the target elements.
<segment> <target its:storageSize="12" its:storageEncoding="UTF-16" its:lineBreakType="crlf"> some translated text </target> </segment>
|
|
inline: this can be applied to mrk only within the source and/or the target elements of a trans-unit element.
<trans-unit> <source> <mrk mtype="x-its" its:storageSize="12" its:storageEncoding="UTF-16" its:lineBreakType="crlf">some source text</mrk> </source> </trans-unit>
|
inline: this can be applied to mrk only within the source and/or the target elements.
<segment> <source> <mrk mtype="x-its" its:storageSize="12" its:storageEncoding="UTF-16" its:lineBreakType="crlf">some source text</mrk> </source> </segment> |
Notes
Provenance mapping
Best Practice
In XLIFF, the ITS provenance annotation should only be added as local stand-off markup, i.e. using an its:provenanceRecords element within the XLIFF file. This makes it easy to add further its:provenanceRecord elements as additional translations, translation revisions or other activities recorded in external provenance records are carried out on the XLIFF file.
If the its:provenanceRecords element referenced by an its:provenanceRecordsRef contains any of the translation or translation revision related attributes, namely: its:person, its:personRef, its:org, its:orgRef, its:tool, its:toolRef, its:revPerson, its:revPersonRef, its:revOrg, its:revOrgRef, its:revTool or its:revToolRef, then the its:provenanceRecordsRef should only be used as local or global annotation selecting xlf:target or xlf:bin-target elements, or an xlf:mrk inline markup within either of those XLIFF elements. This is because the provenance mark-up in this case is appropriate only to translated text.
If the its:provenanceRecords element referenced by an its:provenanceRecordsRef contains only the provRef attribute, then the its:provenanceRecordsRef may be used as local or global annotation selecting any XLIFF element, since the its:provRef attribute may point to external provenance records that could relate to an activity that produced the textual content of any element in an XLIFF file.
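The two placement rules above can be checked mechanically. The following is only a minimal sketch, assuming an XLIFF 1.2 file, local references of the form "#id", and unprefixed attributes on its:provenanceRecord as in the table examples. The function name misplaced_provenance_refs is hypothetical, and the sketch looks at local annotation only, not at global ITS rules.

```python
import xml.etree.ElementTree as ET

XLF = "urn:oasis:names:tc:xliff:document:1.2"
ITS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
REF_ATTR = f"{{{ITS}}}provenanceRecordsRef"
TARGET_TAGS = {f"{{{XLF}}}target", f"{{{XLF}}}bin-target"}
MRK_TAG = f"{{{XLF}}}mrk"

# Attributes that describe translation or translation revision agents.
TRANSLATION_ATTRS = {
    "person", "personRef", "org", "orgRef", "tool", "toolRef",
    "revPerson", "revPersonRef", "revOrg", "revOrgRef", "revTool", "revToolRef",
}


def local_name(name):
    """Drop a '{namespace}' prefix from an ElementTree tag or attribute name."""
    return name.rsplit("}", 1)[-1]


def misplaced_provenance_refs(root):
    """Return elements whose its:provenanceRecordsRef breaks the placement rule."""
    # ElementTree has no parent pointers, so build a child -> parent map once.
    parents = {child: parent for parent in root.iter() for child in parent}
    records = {el.get(XML_ID): el
               for el in root.iter(f"{{{ITS}}}provenanceRecords")}
    misplaced = []
    for el in root.iter():
        ref = el.get(REF_ATTR)
        if ref is None or not ref.startswith("#"):
            continue  # no reference, or not a local stand-off reference
        group = records.get(ref[1:])
        if group is None:
            continue  # dangling reference: out of scope for this check
        restricted = any(
            TRANSLATION_ATTRS & {local_name(a) for a in record.attrib}
            for record in group
        )
        if not restricted:
            continue  # e.g. provRef only: allowed on any XLIFF element
        if el.tag in TARGET_TAGS:
            continue
        if el.tag == MRK_TAG:
            ancestor = parents.get(el)
            while ancestor is not None and ancestor.tag not in TARGET_TAGS:
                ancestor = parents.get(ancestor)
            if ancestor is not None:
                continue  # mrk nested inside target or bin-target
        misplaced.append(el)
    return misplaced
```

An empty result means every local reference to translation-related records sits on a target, a bin-target, or an mrk inside one of them.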
If additional activities upon an XLIFF file cause the its:provenanceRecord values for some elements to fork from those of other elements referencing the same its:provenanceRecords, then that its:provenanceRecords must be copied to a new element with a distinct id, and the reference attribute of the element(s) concerned must be changed to refer to the new its:provenanceRecords id.
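The copy-and-retarget step for forked provenance could look roughly like the sketch below. It assumes local "#id" references and xml:id on the stand-off element, as in the table examples. The function name fork_provenance_records and the choice to place the copy directly after the original are illustrative assumptions, not part of the mapping.

```python
import copy
import xml.etree.ElementTree as ET

ITS = "http://www.w3.org/2005/11/its"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
REF_ATTR = f"{{{ITS}}}provenanceRecordsRef"
RECORDS_TAG = f"{{{ITS}}}provenanceRecords"


def fork_provenance_records(root, element, new_id):
    """Give `element` its own copy of the its:provenanceRecords it references.

    Returns the copy, so that new its:provenanceRecord children can be added
    to it without affecting other elements that still use the original.
    """
    old_id = element.get(REF_ATTR, "").lstrip("#")
    groups = {g.get(XML_ID): g for g in root.iter(RECORDS_TAG)}
    if old_id not in groups:
        raise ValueError(f"no its:provenanceRecords with id {old_id!r}")
    if any(e.get(XML_ID) == new_id for e in root.iter()):
        raise ValueError(f"id {new_id!r} is already in use")
    original = groups[old_id]
    # Locate the parent by identity; ElementTree keeps no parent pointers.
    parent = next((p for p in root.iter() if original in list(p)), None)
    if parent is None:
        raise ValueError("its:provenanceRecords has no parent element")
    clone = copy.deepcopy(original)
    clone.set(XML_ID, new_id)
    parent.insert(list(parent).index(original) + 1, clone)
    element.set(REF_ATTR, "#" + new_id)
    return clone
```

A caller would typically fork first and then append the new its:provenanceRecord to the returned copy, leaving the original stand-off element unchanged for the elements that still reference it.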
Design Note
Note: XLIFF 1.2 supports some constructs that could map to ITS provenance annotation, as outlined below. This mapping is not recommended because these elements are dropped in XLIFF 2.0, so the solution would not be future proof.
<trans-unit phase-name="ph1"> <target phase-name="ph2"> ... <phase-group> <phase phase-name="ph1" process-name="translate" company-name="[value of its:orgRef or its:org]" tool-id="tl1" contact-name="[value of its:person]" contact-email="[value of its:personRef IF it has scheme 'mailto:']" /> <phase phase-name="ph2" ... /> </phase-group> ... <tool tool-id="tl1" tool-name="[value of its:toolRef or its:tool]"/>
One limitation, however, is that an its:personRef value with a scheme other than 'mailto:' cannot be mapped into contact-email. Therefore the proposed mapping uses a reference to the ITS stand-off record, its:provenanceRecords. This also gives a similar mapping for both XLIFF 1.2 and XLIFF 2.0, where phase-group is not available.
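To make the limitation concrete, here is a small sketch of the phase-based mapping that is not recommended. The function name phase_fields_from_record is hypothetical, the record is passed as a plain dict of its:provenanceRecord attribute values, and stripping the 'mailto:' scheme to obtain the address is an assumption rather than something the mapping specifies.

```python
from urllib.parse import urlsplit


def phase_fields_from_record(record):
    """Map its:provenanceRecord attributes to XLIFF 1.2 <phase> attributes.

    Returns (phase_attrs, unmapped), where `unmapped` lists the ITS attribute
    names whose values have no place in <phase>.
    """
    phase = {}
    unmapped = []
    if "person" in record:
        phase["contact-name"] = record["person"]
    # Prefer orgRef over org, following the order used in the design note.
    company = record.get("orgRef") or record.get("org")
    if company:
        phase["company-name"] = company
    person_ref = record.get("personRef")
    if person_ref:
        parts = urlsplit(person_ref)
        if parts.scheme == "mailto" and parts.path:
            # Assumption: contact-email holds the bare address.
            phase["contact-email"] = parts.path
        else:
            unmapped.append("personRef")  # e.g. an http: URI is lost
    return phase, unmapped
```

Any record that ends up with a non-empty unmapped list loses information under the phase-based mapping, which is the case the stand-off its:provenanceRecords reference avoids.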