Compression (Japanese translation)

◎Abstract

This document defines a set of JavaScript APIs to compress and decompress streams of binary data.

1. Introduction

◎Non-normative

The APIs specified in this specification are used to compress and decompress streams of data. They support `deflate$l, `deflate-raw$l and `gzip$l as compression algorithms. These algorithms are widely used by web developers.

2. Infrastructure

This specification depends on `INFRA$r.

A chunk is a piece of data. In the case of `CompressionStream$I and `DecompressionStream$I, the output chunk type is `Uint8Array$I, and any `BufferSource$I type is accepted as input.

A stream represents an ordered sequence of chunks. The terms `ReadableStream$I and `WritableStream$I are defined in `STREAMS$r.
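
The chunk types described above can be observed with a short sketch. The helper names `collectChunks` and `compressChunks` are hypothetical and not part of this specification. The sketch writes only `Uint8Array` chunks for simplicity, although any BufferSource is acceptable as input, and it assumes a runtime that exposes `CompressionStream` as a global.

```javascript
// Collect every chunk produced by a ReadableStream into an array.
async function collectChunks(readable) {
  const chunks = [];
  const reader = readable.getReader();
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    chunks.push(value);
  }
  return chunks;
}

// Hypothetical helper: write the given input chunks in order and
// return the output chunks of the compression stream.
async function compressChunks(inputChunks, format) {
  const cs = new CompressionStream(format);
  const writer = cs.writable.getWriter();
  // Start reading first so that backpressure cannot stall the writes.
  const outputPromise = collectChunks(cs.readable);
  for (const chunk of inputChunks) {
    await writer.write(chunk);
  }
  await writer.close();
  return outputPromise;
}
```

How many output chunks are produced, and where they are split, is left to the implementation, so callers should concatenate them rather than rely on chunk boundaries.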

A compression context is the internal state maintained by a compression or decompression algorithm. The contents of a compression context depend on the format, algorithm and implementation in use. From the point of view of this specification, it is an opaque object. A compression context is initially in a start state, in which it anticipates the first byte of input.

3. Supported formats

`deflate$l

"ZLIB Compressed Data Format" `RFC1950$r

Note: This format is referred to as "deflate" for consistency with HTTP Content-Encodings. See `RFC7230$r section 4.2.2 【now `RFC9110^r `§ 8.4.1.2＠~HTTPsem#deflate.coding$ 】.

Implementations must be "compliant" as described in `RFC1950$r section 2.3.
Field values described as invalid in `RFC1950$r must not be created by `CompressionStream$I, and are errors for `DecompressionStream$I.
The only valid value of the `CM^i (Compression method) part of the `CMF^i field is 8.
The `FDICT^i flag is not supported by these APIs. If it is set 【that is, if its bit is 1】, the stream will be errored.
The `FLEVEL^i flag is ignored by `DecompressionStream$I.
It is an error for `DecompressionStream$I if the `ADLER32^i checksum is not correct.
It is an error if there is additional input data after the `ADLER32^i checksum.
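
The header requirements above can be checked on real output. The following is a minimal sketch, with hypothetical helper names `compressToBytes` and `parseZlibHeader`, that decodes the two header bytes according to the RFC 1950 layout (CM in the low four bits of CMF, FDICT in bit 5 of FLG, FLEVEL in bits 6 and 7). It assumes `Blob`, `Response` and `CompressionStream` are available as globals.

```javascript
// Compress a Uint8Array with the given format and return all output bytes.
async function compressToBytes(bytes, format) {
  const stream = new Blob([bytes]).stream().pipeThrough(new CompressionStream(format));
  return new Uint8Array(await new Response(stream).arrayBuffer());
}

// Decode the 2-byte zlib header (RFC 1950 section 2.2).
function parseZlibHeader(zlibBytes) {
  const cmf = zlibBytes[0];
  const flg = zlibBytes[1];
  return {
    compressionMethod: cmf & 0x0f,            // CM: must be 8
    hasPresetDictionary: (flg & 0x20) !== 0,  // FDICT: must not be set
    compressionLevel: flg >> 6,               // FLEVEL: ignored when decompressing
    headerCheckOk: ((cmf << 8) | flg) % 31 === 0, // FCHECK: CMF*256 + FLG is a multiple of 31
  };
}
```

For output produced by `new CompressionStream('deflate')`, a conforming implementation should yield `compressionMethod` 8 and `hasPresetDictionary` false.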

`deflate-raw$l

"The DEFLATE algorithm" `RFC1951$r

Implementations must be "compliant" as described in `RFC1951$r section 1.4.
Non-`RFC1951$r-conforming blocks must not be created by `CompressionStream$I, and are errors for `DecompressionStream$I.
It is an error if there is additional input data after the final block indicated by the `BFINAL^i flag.

`gzip$l

"GZIP file format" `RFC1952$r

Implementations must be "compliant" as described in `RFC1952$r section 2.3.1.2.
Field values described as invalid in `RFC1952$r must not be created by `CompressionStream$I, and are errors for `DecompressionStream$I.
The only valid value of the `CM^i (Compression Method) field is 8.
The `FTEXT^i flag must be ignored by `DecompressionStream$I.
If the `FHCRC^i field is present, it is an error for it to be incorrect.
The contents of any `FEXTRA^i, `FNAME^i and `FCOMMENT^i fields must be ignored by `DecompressionStream$I, except to verify that they are terminated correctly.
The contents of the `MTIME^i, `XFL^i and `OS^i fields must be ignored by `DecompressionStream$I.
It is an error if `CRC32^i or `ISIZE^i do not match the decompressed data.
A `gzip^i stream may only contain one "member". It is an error if there is additional input data after the end of the "member".
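
The framing fields named above can be read back from compressed output. The sketch below uses hypothetical helpers `gzipBytes` and `parseGzipFraming`, and relies on the RFC 1952 layout: ID1 and ID2 are 0x1f and 0x8b, CM is the third byte, and the member ends with CRC32 followed by ISIZE, both little-endian 32-bit values.

```javascript
// Gzip-compress a Uint8Array and return all output bytes.
async function gzipBytes(bytes) {
  const stream = new Blob([bytes]).stream().pipeThrough(new CompressionStream('gzip'));
  return new Uint8Array(await new Response(stream).arrayBuffer());
}

// Read the fixed header bytes and the 8-byte trailer of a single gzip member.
function parseGzipFraming(gz) {
  const view = new DataView(gz.buffer, gz.byteOffset, gz.byteLength);
  return {
    hasMagic: gz[0] === 0x1f && gz[1] === 0x8b,     // ID1, ID2
    compressionMethod: gz[2],                       // CM: must be 8
    crc32: view.getUint32(gz.byteLength - 8, true), // CRC32 of the uncompressed data
    isize: view.getUint32(gz.byteLength - 4, true), // uncompressed length modulo 2^32
  };
}
```

Since a stream holds exactly one member, the last eight bytes of the output are always that member's trailer.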

4. `CompressionStream^I interface

enum `CompressionFormat@I {
  `deflate@l,
  `deflate-raw@l,
  `gzip@l,
};

[Exposed=*]
interface `CompressionStream@I {
  `constructor＠#dom-compressionstream-compressionstream$(`CompressionFormat$I %format);
};
`CompressionStream$I includes `GenericTransformStream$I;

Each `CompressionStream$I has the following associated values:

format ⇒ a `CompressionFormat$I value
context ⇒ a compression context

The `new CompressionStream(format)@m constructor steps are:

If %format is unsupported in `CompressionStream$I, then throw a `TypeError$E.
Set this's format to %format.
Let %transformAlgorithm be an algorithm which takes a %chunk argument and runs the compress and enqueue a chunk algorithm with this and %chunk.
Let %flushAlgorithm be an algorithm which takes no argument and runs the compress flush and enqueue algorithm with this.
Set this's transform to a new `TransformStream$I.
Set up this's transform with transformAlgorithm set to %transformAlgorithm and flushAlgorithm set to %flushAlgorithm.
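
Because the constructor throws a `TypeError` for an unsupported format, support can be probed without compressing anything. The following is a small sketch. `isFormatSupported` is a hypothetical helper name, and the probe string used in the usage note is simply an arbitrary value that no implementation is expected to recognise.

```javascript
// Return true if both stream constructors accept the given format string.
function isFormatSupported(format) {
  try {
    new CompressionStream(format);
    new DecompressionStream(format);
    return true;
  } catch (error) {
    // An unsupported format is reported as a TypeError; rethrow anything else.
    if (error instanceof TypeError) return false;
    throw error;
  }
}
```

For instance, `isFormatSupported('gzip')` is expected to return true, while `isFormatSupported('definitely-not-a-format')` is expected to return false.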

The compress and enqueue a chunk algorithm, given a `CompressionStream$I object %cs and a %chunk, runs these steps:

If %chunk is not a `BufferSource$I type, then throw a `TypeError$E.
Let %buffer be the result of compressing %chunk with %cs's format and context.
If %buffer is empty, return.
Split %buffer into one or more non-empty pieces and convert them into `Uint8Array$I objects.
For each `Uint8Array$I %array, enqueue %array in %cs's transform.
【Translator's notes: How %buffer is split is not specified (and is therefore presumably implementation-defined); the same applies to the other algorithms. The algorithm also says nothing about the data type of a piece and gives no details of the conversion to `Uint8Array$I, but if a piece is regarded as a byte sequence, the result would be that of creating a buffer source of type `Uint8Array$I from that byte sequence; the same applies to the other algorithms.】

The compress flush and enqueue algorithm, which handles the end of data from the input `ReadableStream$I object, given a `CompressionStream$I object %cs, runs these steps:

Let %buffer be the result of compressing an empty input with %cs's format and context, with the finish flag 【how exactly this flag works is not specified】.
If %buffer is empty, return.
Split %buffer into one or more non-empty pieces and convert them into `Uint8Array$I objects.
For each `Uint8Array$I %array, enqueue %array in %cs's transform.
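
The division of work between the transform and flush algorithms can be sketched with a toy encoder. The example below is a run-length encoder, not DEFLATE, and `createRunLengthEncoderStream` is a hypothetical name. It only illustrates the shape of the steps above: a context that persists across chunks, a transform step that may produce no output for a given chunk, and a flush step that compresses an empty input with a finish flag to emit whatever the context still holds.

```javascript
// Toy run-length encoder: output is a sequence of (count, byte) pairs.
function createRunLengthEncoderStream() {
  // The context starts in a state that anticipates the first input byte.
  const context = { currentByte: -1, runLength: 0 };

  function encode(bytes, finish) {
    const out = [];
    for (const byte of bytes) {
      if (byte === context.currentByte && context.runLength < 255) {
        context.runLength++;
      } else {
        if (context.runLength > 0) out.push(context.runLength, context.currentByte);
        context.currentByte = byte;
        context.runLength = 1;
      }
    }
    if (finish && context.runLength > 0) {
      out.push(context.runLength, context.currentByte);
      context.runLength = 0;
    }
    return new Uint8Array(out);
  }

  return new TransformStream({
    transform(chunk, controller) {
      if (!ArrayBuffer.isView(chunk) && !(chunk instanceof ArrayBuffer)) {
        throw new TypeError('chunk must be a BufferSource');
      }
      const bytes = chunk instanceof ArrayBuffer
        ? new Uint8Array(chunk)
        : new Uint8Array(chunk.buffer, chunk.byteOffset, chunk.byteLength);
      const buffer = encode(bytes, false);
      if (buffer.byteLength === 0) return; // the run is still open, nothing to emit
      controller.enqueue(buffer);
    },
    flush(controller) {
      const buffer = encode(new Uint8Array(0), true); // empty input with the finish flag
      if (buffer.byteLength === 0) return;
      controller.enqueue(buffer);
    },
  });
}
```

Writing the chunks `[1, 1]` and `[1, 2]` produces no output for the first chunk, `[3, 1]` during the second, and `[1, 2]` on flush.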

5. `DecompressionStream^I interface

[Exposed=*]
interface `DecompressionStream@I {
  `constructor＠#dom-decompressionstream-decompressionstream$(`CompressionFormat$I %format);
};
`DecompressionStream$I includes `GenericTransformStream$I;

Each `DecompressionStream$I has the following associated values:

format ⇒ a `CompressionFormat$I value
context ⇒ a compression context

The `new DecompressionStream(format)@m constructor steps are:

If %format is unsupported in `DecompressionStream$I, then throw a `TypeError$E.
Set this's format to %format.
Let %transformAlgorithm be an algorithm which takes a %chunk argument and runs the decompress and enqueue a chunk algorithm with this and %chunk.
Let %flushAlgorithm be an algorithm which takes no argument and runs the decompress flush and enqueue algorithm with this.
Set this's transform to a new `TransformStream$I.
Set up this's transform with transformAlgorithm set to %transformAlgorithm and flushAlgorithm set to %flushAlgorithm.

The decompress and enqueue a chunk algorithm, given a `DecompressionStream$I object %ds and a %chunk, runs these steps:

If %chunk is not a `BufferSource$I type, then throw a `TypeError$E.
Let %buffer be the result of decompressing %chunk with %ds's format and context.
If the previous step results in an error, then throw a `TypeError$E.
If %buffer is empty, return.
Split %buffer into one or more non-empty pieces and convert them into `Uint8Array$I objects.
For each `Uint8Array$I %array, enqueue %array in %ds's transform.

The decompress flush and enqueue algorithm, which handles the end of data from the input `ReadableStream$I object, given a `DecompressionStream$I object %ds, runs these steps:

Let %buffer be the result of decompressing an empty input with %ds's format and context, with the finish flag 【how exactly this flag works is not specified】.
If the end of the compressed input has not been reached 【how exactly this condition is defined is not stated】, then throw a `TypeError$E.
If %buffer is empty, return.
Split %buffer into one or more non-empty pieces and convert them into `Uint8Array$I objects.
For each `Uint8Array$I %array, enqueue %array in %ds's transform.
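
Since malformed or truncated input errors the stream, callers that handle untrusted data usually want to catch the failure. A minimal sketch follows. `tryDecompress` is a hypothetical helper, and it deliberately does not depend on the exact error type reported by the runtime.

```javascript
// Attempt to decompress bytes with the given format.
// Resolves to { ok: true, output } on success or { ok: false, error } on failure.
async function tryDecompress(bytes, format) {
  try {
    const stream = new Blob([bytes]).stream().pipeThrough(new DecompressionStream(format));
    const output = new Uint8Array(await new Response(stream).arrayBuffer());
    return { ok: true, output };
  } catch (error) {
    // Reached when the input is not valid for the format.
    return { ok: false, error };
  }
}
```

Routing the data through `pipeThrough` means a decoding error surfaces as a single rejected promise on the consuming side, instead of as separate rejections on each write.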

6. Privacy and security considerations

The API does not add any new privileges to the web platform.

However, web developers have to pay attention to situations in which attackers can obtain the length of the data. In such situations, attackers may be able to guess the contents of the data.
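
The length concern can be made concrete with a sketch of a hypothetical scenario in which attacker-chosen text is compressed together with a secret. When the attacker's text repeats the secret, the compressor can encode the repetition compactly, so the output is typically shorter than for a wrong guess. The helper names and the sample values are illustrative assumptions, and the exact byte counts depend on the implementation.

```javascript
// Return the number of bytes produced by compressing the given text.
async function compressedLength(text, format = 'deflate') {
  const stream = new Blob([text]).stream().pipeThrough(new CompressionStream(format));
  const compressed = await new Response(stream).arrayBuffer();
  return compressed.byteLength;
}

// Hypothetical scenario: each guess is compressed in the same stream as the secret.
// Returns a map from guess to compressed length.
async function lengthsForGuesses(secret, guesses) {
  const lengths = new Map();
  for (const guess of guesses) {
    lengths.set(guess, await compressedLength(`${guess}\n${secret}`));
  }
  return lengths;
}
```

An observer who sees only these lengths can rank the guesses, which is why secrets and attacker-controlled data should not share a compression context when the compressed size is observable.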

7. Examples

7.1. Compress a stream with `gzip^l

const %compressedReadableStream
    = %inputReadableStream.pipeThrough(new CompressionStream('gzip'));

7.2. Compress an `ArrayBuffer$I to a `Uint8Array$I with `deflate^l

async function compressArrayBuffer(%input) {
  const %cs = new CompressionStream('deflate');

  const %writer = %cs.writable.getWriter();
  %writer.write(%input);
  %writer.close();

  const %output = [];
  let %totalSize = 0;
  for await (const %chunk of %cs.readable) {
    %output.push(%chunk);
    %totalSize += %chunk.byteLength;
  }

  const %concatenated = new Uint8Array(%totalSize);
  let %offset = 0;
  for (const %array of %output) {
    %concatenated.set(%array, %offset);
    %offset += %array.byteLength;
  }

  return %concatenated;
}

7.3. Decompress a `Blob$I to a `Blob$I with `gzip^l

function decompressBlob(%blob) {
  const %ds = new DecompressionStream('gzip');
  const %decompressionStream = %blob.stream().pipeThrough(%ds);
  return new Response(%decompressionStream).blob();
}

Acknowledgements

Thanks to the following people for their contributions:

`_acks1@

Intellectual property rights

`_ipr1@