File API

W3C Working Draft, 26 October 2017

This version:
https://www.w3.org/TR/2017/WD-FileAPI-20171026/
Latest published version:
https://www.w3.org/TR/FileAPI/
Editor's Draft:
https://w3c.github.io/FileAPI/
Previous Versions:
Issue Tracking:
GitHub
Inline In Spec
Editor:
(Google)
Vsevolod Shmyroff (Yandex)
Former Editor:
Arun Ranganathan (Mozilla Corporation)
Tests:
web-platform-tests FileAPI/ (ongoing work)

Abstract

This specification provides an API for representing file objects in web applications, as well as programmatically selecting them and accessing their data.

Additionally, this specification defines objects to be used within threaded web applications for the synchronous reading of files.

§10 Requirements and Use Cases covers the motivation behind this specification.

This API is designed to be used in conjunction with other APIs and elements on the web platform, notably: XMLHttpRequest (e.g. with an overloaded send() method for File or Blob arguments), postMessage(), DataTransfer (part of the drag and drop API defined in [HTML]) and Web Workers. Additionally, it should be possible to programmatically obtain a list of files from the input element when it is in the File Upload state [HTML]. These kinds of behaviors are defined in the appropriate affiliated specifications.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This document was published by the Web Platform Working Group as a Working Draft. This document is intended to become a W3C Recommendation.

GitHub issues are preferred for discussion of this specification. There is also a historical mailing-list archive.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 March 2017 W3C Process Document.

Previous discussion of this specification has taken place on two other mailing lists: [email protected] (archive) and [email protected] (archive). Ongoing discussion will be on the [email protected] mailing list.

This draft consists of changes made to the previous Last Call Working Draft. Please send comments to the [email protected] as described above. You can see Last Call Feedback on the W3C Wiki: https://www.w3.org/wiki/Webapps/LCWD-FileAPI-20130912

1. Introduction

This section is informative.

Web applications should have the ability to manipulate as wide as possible a range of user input, including files that a user may wish to upload to a remote server or manipulate inside a rich web application. This specification defines the basic representations for files, lists of files, errors raised by access to files, and programmatic ways to read files. Additionally, this specification also defines an interface that represents "raw data" which can be asynchronously processed on the main thread of conforming user agents. The interfaces and API defined in this specification can be used with other interfaces and APIs exposed to the web platform.

The File interface represents file data typically obtained from the underlying file system, and the Blob interface ("Binary Large Object" - a name originally introduced to web APIs in Google Gears) represents immutable raw data. File or Blob reads should happen asynchronously on the main thread, with an optional synchronous API used within threaded web applications. An asynchronous API for reading files prevents blocking and UI "freezing" on a user agent’s main thread. This specification defines an asynchronous API based on an event model to read and access a File or Blob’s data. A FileReader object provides asynchronous read methods to access that file’s data through event handler content attributes and the firing of events. The use of events and event handlers allows separate code blocks the ability to monitor the progress of the read (which is particularly useful for remote drives or mounted drives, where file access performance may vary from local drives) and error conditions that may arise during reading of a file. An example will be illustrative.

In the example below, different code blocks handle progress, error, and success conditions.
function startRead() {
  // obtain input element through DOM

  var file = document.getElementById('file').files[0];
  if(file){
    getAsText(file);
  }
}

function getAsText(readFile) {

  var reader = new FileReader();

  // Handle progress, success, and errors
  // (handlers are attached before the read is started)
  reader.onprogress = updateProgress;
  reader.onload = loaded;
  reader.onerror = errorHandler;

  // Read file into memory as UTF-16
  reader.readAsText(readFile, "UTF-16");
}

function updateProgress(evt) {
  if (evt.lengthComputable) {
    // evt.loaded and evt.total are ProgressEvent properties
    var loaded = (evt.loaded / evt.total);
    if (loaded < 1) {
      // Increase the prog bar length
      // style.width = (loaded * 200) + "px";
    }
  }
}

function loaded(evt) {
  // Obtain the read file data
  var fileString = evt.target.result;
  // Handle UTF-16 file dump
  // (utils.regexp.isChinese stands for an application-defined helper)
  if(utils.regexp.isChinese(fileString)) {
    //Chinese Characters + Name validation
  }
  else {
    // run other charset test
  }
  // xhr.send(fileString)
}

function errorHandler(evt) {
  if(evt.target.error.name == "NotReadableError") {
    // The file could not be read
  }
}

2. Terminology and Algorithms

When this specification says to terminate an algorithm the user agent must terminate the algorithm after finishing the step it is on. Asynchronous read methods defined in this specification may return before the algorithm in question is terminated, and can be terminated by an abort() call.

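The algorithms and steps in this specification use the following mathematical operations: max(a, b), which returns the greater of a and b, and min(a, b), which returns the lesser of a and b.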

The term Unix Epoch is used in this specification to refer to the time 00:00:00 UTC on January 1 1970 (or 1970-01-01T00:00:00Z ISO 8601); this is the same time that is conceptually "0" in ECMA-262 [ECMA-262].
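
As a small illustration of the Unix Epoch definition above, the following sketch uses only the standard ECMAScript Date object. The variable names are illustrative and not part of this specification.

```javascript
// Time value 0 in ECMAScript corresponds to the Unix Epoch.
const epoch = new Date(0);
console.log(epoch.toISOString()); // "1970-01-01T00:00:00.000Z"

// Date.UTC returns the number of milliseconds since the Unix Epoch
// for a calendar date expressed in UTC (months are zero-based).
const oneDayInMs = Date.UTC(1970, 0, 2);
console.log(oneDayInMs); // 86400000
```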

3. The Blob Interface and Binary Data

A Blob object refers to a byte sequence, and has a size attribute which is the total number of bytes in the byte sequence, and a type attribute, which is an ASCII-encoded string in lower case representing the media type of the byte sequence.

Each Blob must have an internal snapshot state, which must be initially set to the state of the underlying storage, if any such underlying storage exists. Further normative definition of snapshot state can be found for Files.

[Constructor(optional sequence<BlobPart> blobParts,
             optional BlobPropertyBag options),
 Exposed=(Window,Worker), Serializable]
interface Blob {

  readonly attribute unsigned long long size;
  readonly attribute DOMString type;

  // slice Blob into byte-ranged chunks
  Blob slice([Clamp] optional long long start,
            [Clamp] optional long long end,
            optional DOMString contentType);
};

dictionary BlobPropertyBag {
  DOMString type = "";
};

typedef (BufferSource or Blob or USVString) BlobPart;

Blob objects are serializable objects. Their serialization steps, given value and serialized, are:

  1. Set serialized.[[SnapshotState]] to value’s snapshot state.

  2. Set serialized.[[ByteSequence]] to value’s underlying byte sequence.

Their deserialization steps, given serialized and value, are:

  1. Set value’s snapshot state to serialized.[[SnapshotState]].

  2. Set value’s underlying byte sequence to serialized.[[ByteSequence]].

3.1. Constructors

The Blob() constructor can be invoked with zero or more parameters. When the Blob() constructor is invoked, user agents must run the following steps:

  1. If invoked with zero parameters, return a new Blob object consisting of 0 bytes, with size set to 0, and with type set to the empty string.

  2. Otherwise, the constructor is invoked with a blobParts sequence. Let a be that sequence.

  3. Let bytes be an empty sequence of bytes.

  4. Let length be a’s length. For 0 ≤ i < length, repeat the following steps:

    1. Let element be the ith element of a.

    2. If element is a USVString, run the following substeps:

      1. Append the result of UTF-8 encoding element to bytes.

        Note: The algorithm from WebIDL [WebIDL] replaces unmatched surrogates in an invalid UTF-16 string with U+FFFD replacement characters. Scenarios exist when the Blob constructor may result in some data loss due to lost or scrambled character sequences.

    3. If element is a BufferSource, get a copy of the bytes held by the buffer source, and append those bytes to bytes.

    4. If element is a Blob, append the bytes it represents to bytes. The type of the Blob array element is ignored and will not affect type of returned Blob object.

  5. If the type member of the optional options argument is provided and is not the empty string, run the following sub-steps:

    1. Let t be the type dictionary member. If t contains any characters outside the range U+0020 to U+007E, then set t to the empty string and return from these substeps.

    2. Convert every character in t to ASCII lowercase.

  6. Return a Blob object referring to bytes as its associated byte sequence, with its size set to the length of bytes, and its type set to the value of t from the substeps above.
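
The constructor steps above can be sketched as two small functions. This is an illustrative model, not how any user agent implements Blob: normalizeType and collectBytes are hypothetical helper names, and the sketch handles only string and BufferSource parts (Blob parts are omitted because their bytes are not synchronously accessible from script).

```javascript
// Hypothetical helper mirroring step 5: validate and lowercase the type.
function normalizeType(type) {
  // Any character outside U+0020..U+007E makes the type the empty string.
  if (/[^\u0020-\u007E]/.test(type)) {
    return "";
  }
  // Only printable ASCII remains, so toLowerCase() acts as ASCII lowercase.
  return type.toLowerCase();
}

// Hypothetical helper mirroring steps 3 and 4 for string and
// BufferSource parts only.
function collectBytes(parts) {
  const encoder = new TextEncoder(); // TextEncoder always produces UTF-8
  const chunks = parts.map((part) => {
    if (typeof part === "string") {
      return encoder.encode(part);
    }
    if (ArrayBuffer.isView(part)) {
      // Copy only the bytes covered by the view.
      return new Uint8Array(part.buffer, part.byteOffset, part.byteLength).slice();
    }
    if (part instanceof ArrayBuffer) {
      return new Uint8Array(part.slice(0));
    }
    throw new TypeError("Unsupported part in this sketch");
  });
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    bytes.set(chunk, offset);
    offset += chunk.length;
  }
  return bytes;
}

const collected = collectBytes(["\u00e9", new Uint16Array([1, 2])]);
console.log(collected.length); // 6: two UTF-8 bytes plus four bytes from the view
console.log(normalizeType("Text/Plain;Charset=UTF-8")); // "text/plain;charset=utf-8"
```

The string part contributes two bytes because U+00E9 takes two bytes in UTF-8, which is why size can differ from the number of characters passed in.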

3.1.1. Constructor Parameters

The Blob() constructor can be invoked with the parameters below:

A blobParts sequence
which takes any number of the following types of elements, and in any order: BufferSource elements, Blob elements, and USVString elements.
An optional BlobPropertyBag
which takes one member:
  • type, the ASCII-encoded string in lower case representing the media type of the Blob. Normative conditions for this member are provided in the §3.1 Constructors.

Examples of constructor usage follow.
// Create a new Blob object

var a = new Blob();

// Create a 1024-byte ArrayBuffer
// buffer could also come from reading a File

var buffer = new ArrayBuffer(1024);

// Create ArrayBufferView objects based on buffer

var shorts = new Uint16Array(buffer, 512, 128);
var bytes = new Uint8Array(buffer, shorts.byteOffset + shorts.byteLength);

var b = new Blob(["foobarbazetcetc" + "birdiebirdieboo"], {type: "text/plain;charset=utf-8"});

var c = new Blob([b, shorts]);

var e = new Blob([b, c, bytes]);

var d = new Blob([buffer, b, c, bytes]);

3.2. Attributes

size , of type unsigned long long, readonly
Returns the size of the byte sequence in number of bytes. On getting, conforming user agents must return the total number of bytes that can be read by a FileReader or FileReaderSync object, or 0 if the Blob has no bytes to be read.
type , of type DOMString, readonly
The ASCII-encoded string in lower case representing the media type of the Blob. On getting, user agents must return the type of a Blob as an ASCII-encoded string in lower case, such that when it is converted to a byte sequence, it is a parsable MIME type, or the empty string – 0 bytes – if the type cannot be determined.

The type attribute can be set by the web application itself through constructor invocation and through the slice() call; in these cases, further normative conditions for this attribute are in §3.1 Constructors, §4.1 Constructor, and §3.3.1 The slice method respectively. User agents can also determine the type of a Blob, especially if the byte sequence is from an on-disk file; in this case, further normative conditions are in the file type guidelines.

Note: The type t of a Blob is considered a parsable MIME type, if performing the parse a MIME type algorithm to a byte sequence converted from the ASCII-encoded string representing the Blob object’s type does not return undefined.

Note: Use of the type attribute informs the encoding determination and parsing the Content-Type header when dereferencing Blob URLs.
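
The following sketch shows the size and type attributes on a few constructed Blob objects. It assumes an environment that exposes a global Blob constructor; the variable names are illustrative.

```javascript
// A Blob built from a 30-character ASCII string has 30 bytes.
const text = new Blob(["foobarbazetcetc" + "birdiebirdieboo"],
                      { type: "text/plain;charset=utf-8" });
console.log(text.size); // 30
console.log(text.type); // "text/plain;charset=utf-8"

// A Blob constructed with no arguments has no bytes and an empty type.
const empty = new Blob();
console.log(empty.size); // 0
console.log(empty.type === ""); // true

// Combining parts sums their bytes: 30 + (128 * 2) = 286.
// The type of a Blob part is ignored, so the result has an empty type.
const shorts = new Uint16Array(128);
const combined = new Blob([text, shorts]);
console.log(combined.size); // 286
console.log(combined.type === ""); // true
```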

3.3. Methods and Parameters

3.3.1. The slice method

The slice() method returns a new Blob object with bytes ranging from the optional start parameter up to but not including the optional end parameter, and with a type attribute that is the value of the optional contentType parameter. It must act as follows:

  1. Let O be the Blob context object on which the slice() method is being called.

  2. The optional start parameter is a value for the start point of a slice() call, and must be treated as a byte-order position, with the zeroth position representing the first byte. User agents must process slice() with start normalized according to the following:

    1. If the optional start parameter is not used as a parameter when making this call, let relativeStart be 0.
    2. If start is negative, let relativeStart be max((size + start), 0).
    3. Else, let relativeStart be min(start, size).
  3. The optional end parameter is a value for the end point of a slice() call. User agents must process slice() with end normalized according to the following:

    1. If the optional end parameter is not used as a parameter when making this call, let relativeEnd be size.
    2. If end is negative, let relativeEnd be max((size + end), 0).
    3. Else, let relativeEnd be min(end, size).
  4. The optional contentType parameter is used to set the ASCII-encoded string in lower case representing the media type of the Blob. User agents must process the slice() with contentType normalized according to the following:

    1. If the contentType parameter is not provided, let relativeContentType be set to the empty string.
    2. Else let relativeContentType be set to contentType and run the substeps below:
      1. If relativeContentType contains any characters outside the range of U+0020 to U+007E, then set relativeContentType to the empty string and return from these substeps.

      2. Convert every character in relativeContentType to ASCII lowercase.

  5. Let span be max((relativeEnd - relativeStart), 0).

  6. Return a new Blob object S with the following characteristics:

    1. S refers to span consecutive bytes from O, beginning with the byte at byte-order position relativeStart.
    2. S.size = span.
    3. S.type = relativeContentType.
The examples below illustrate the different types of slice() calls possible. Since the File interface inherits from the Blob interface, examples are based on the use of the File interface.
// obtain input element through DOM

var file = document.getElementById('file').files[0];
if(file)
{
  // create an identical copy of file
  // the two calls below are equivalent

  var fileClone = file.slice();
  var fileClone2 = file.slice(0, file.size);

  // slice file into 1/2 chunk starting at middle of file
  // Note the use of negative number

  var fileChunkFromEnd = file.slice(-(Math.round(file.size/2)));

  // slice file into 1/2 chunk starting at beginning of file

  var fileChunkFromStart = file.slice(0, Math.round(file.size/2));

  // slice file from beginning till 150 bytes before end

  var fileNoMetadata = file.slice(0, -150, "application/experimental");
}
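
The normalization of start and end in the slice() steps can also be sketched as a pure function. normalizeSlice is a hypothetical helper written for illustration; it models only the arithmetic of steps 2, 3, and 5 and does not create a Blob.

```javascript
// Hypothetical helper: computes relativeStart, relativeEnd, and span
// for a Blob of the given size, following the slice() steps.
function normalizeSlice(size, start, end) {
  let relativeStart;
  if (start === undefined) {
    relativeStart = 0;
  } else if (start < 0) {
    relativeStart = Math.max(size + start, 0);
  } else {
    relativeStart = Math.min(start, size);
  }

  let relativeEnd;
  if (end === undefined) {
    relativeEnd = size;
  } else if (end < 0) {
    relativeEnd = Math.max(size + end, 0);
  } else {
    relativeEnd = Math.min(end, size);
  }

  const span = Math.max(relativeEnd - relativeStart, 0);
  return { relativeStart, relativeEnd, span };
}

console.log(normalizeSlice(1000).span);           // 1000 (full copy)
console.log(normalizeSlice(1000, -500).span);     // 500  (second half)
console.log(normalizeSlice(1000, 0, -150).span);  // 850  (all but last 150 bytes)
console.log(normalizeSlice(1000, 900, 100).span); // 0    (end before start)
```

Negative positions count back from the end of the byte sequence, and a range whose end precedes its start yields an empty result rather than an error.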

4. The File Interface

A File object is a Blob object with a name attribute, which is a string; it can be created within the web application via a constructor, or is a reference to a byte sequence from a file from the underlying (OS) file system.

If a File object is a reference to a byte sequence originating from a file on disk, then its snapshot state should be set to the state of the file on disk at the time the File object is created.

Note: This is a non-trivial requirement to implement for user agents, and is thus not a must but a should [RFC2119]. User agents should endeavor to have a File object’s snapshot state set to the state of the underlying storage on disk at the time the reference is taken. If the file is modified on disk following the time a reference has been taken, the File's snapshot state will differ from the state of the underlying storage. User agents may use modification time stamps and other mechanisms to maintain snapshot state, but this is left as an implementation detail.

When a File object refers to a file on disk, user agents must return the type of that file, and must follow the file type guidelines below:

File objects are serializable objects. Their serialization steps, given value and serialized, are:

  1. Set serialized.[[SnapshotState]] to value’s snapshot state.

  2. Set serialized.[[ByteSequence]] to value’s underlying byte sequence.

  3. Set serialized.[[Name]] to the value of value’s name attribute.

  4. Set serialized.[[LastModified]] to the value of value’s lastModified attribute.

Their deserialization steps, given value and serialized, are:

  1. Set value’s snapshot state to serialized.[[SnapshotState]].

  2. Set value’s underlying byte sequence to serialized.[[ByteSequence]].

  3. Initialize the value of value’s name attribute to serialized.[[Name]].

  4. Initialize the value of value’s lastModified attribute to serialized.[[LastModified]].

4.1. Constructor

The File constructor is invoked with two or three parameters, depending on whether the optional dictionary parameter is used. When the File() constructor is invoked, user agents must run the following steps:

  1. Let a be the fileBits sequence argument. Let bytes be an empty sequence of bytes. Let length be a’s length. For 0 ≤ i < length, repeat the following steps:

    1. Let element be the ith element of a.

    2. If element is a USVString, run the following substeps:

      1. Append the result of UTF-8 encoding element to bytes.

      Note: The algorithm from WebIDL [WebIDL] replaces unmatched surrogates in an invalid UTF-16 string with U+FFFD replacement characters. Scenarios exist when the Blob constructor may result in some data loss due to lost or scrambled character sequences.

    3. If element is a BufferSource, get a copy of the bytes held by the buffer source, and append those bytes to bytes.

    4. If element is a Blob, append the bytes it represents to bytes. The type of the Blob argument must be ignored.

  2. Let n be a new string of the same size as the fileName argument to the constructor. Copy every character from fileName to n, replacing any "/" character (U+002F SOLIDUS) with a ":" (U+003A COLON).

    Note: Underlying OS filesystems use differing conventions for file names; with constructed files, mandating UTF-16 lessens ambiguity when file names are converted to byte sequences.

  3. If the optional FilePropertyBag dictionary argument is used, then run the following substeps:

    1. If the type member is provided and is not the empty string, let t be set to the type dictionary member. If t contains any characters outside the range U+0020 to U+007E, then set t to the empty string and return from these substeps.

    2. Convert every character in t to ASCII lowercase.

    3. If the lastModified member is provided, let d be set to the lastModified dictionary member. If it is not provided, set d to the current date and time represented as the number of milliseconds since the Unix Epoch (which is the equivalent of Date.now() [ECMA-262]).

      Note: Since ECMA-262 Date objects convert to long long values representing the number of milliseconds since the Unix Epoch, the lastModified member could be a Date object [ECMA-262].

  4. Return a new File object F such that:

    1. F refers to the bytes byte sequence.

    2. F.size is set to the number of total bytes in bytes.

    3. F.name is set to n.

    4. F.type is set to t.

    5. F.lastModified is set to d.
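
Steps 2 and 3 of the constructor can be sketched as follows. normalizeFileName and resolveLastModified are hypothetical helper names used only for illustration; they follow the text of this draft, and the behavior of shipping user agents may differ.

```javascript
// Hypothetical helper mirroring step 2: replace every "/" (U+002F)
// in the file name with ":" (U+003A).
function normalizeFileName(fileName) {
  return fileName.replace(/\//g, ":");
}

// Hypothetical helper mirroring step 3.3: use the provided lastModified
// value, or the current time in milliseconds since the Unix Epoch.
function resolveLastModified(options = {}) {
  if (options.lastModified !== undefined) {
    // Number() converts a Date object to its millisecond time value.
    return Number(options.lastModified);
  }
  return Date.now();
}

console.log(normalizeFileName("drafts/2013/Draft1.txt")); // "drafts:2013:Draft1.txt"
console.log(resolveLastModified({ lastModified: new Date(5) })); // 5
console.log(resolveLastModified({ lastModified: 42 })); // 42
```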

4.1.1. Constructor Parameters

The File() constructor can be invoked with the parameters below:

A fileBits sequence
which takes any number of the following elements, and in any order: BufferSource elements, Blob elements, and USVString elements.
A fileName parameter
A USVString parameter representing the name of the file; normative conditions for this constructor parameter can be found in §4.1 Constructor.
An optional FilePropertyBag dictionary
which in addition to the members of BlobPropertyBag takes one member:
  • An optional lastModified member, which must be a long long; normative conditions for this member are provided in §4.1 Constructor.

4.2. Attributes

name , of type DOMString, readonly
The name of the file. On getting, this must return the name of the file as a string. There are numerous file name variations and conventions used by different underlying OS file systems; this is merely the name of the file, without path information. On getting, if user agents cannot make this information available, they must return the empty string. If a File object is created using a constructor, further normative conditions for this attribute are found in §4.1 Constructor.
lastModified , of type long long, readonly
The last modified date of the file. On getting, if user agents can make this information available, this must return a long long set to the time the file was last modified as the number of milliseconds since the Unix Epoch. If the last modification date and time are not known, the attribute must return the current date and time as a long long representing the number of milliseconds since the Unix Epoch; this is equivalent to Date.now() [ECMA-262]. If a File object is created using a constructor, further normative conditions for this attribute are found in §4.1 Constructor.

The File interface is available on objects that expose an attribute of type FileList; these objects are defined in HTML [HTML]. The File interface, which inherits from Blob, is immutable, and thus represents file data that can be read into memory at the time a read operation is initiated. User agents must process reads on files that no longer exist at the time of read as errors, throwing a NotFoundError exception if using a FileReaderSync on a Web Worker [Workers] or firing an error event with the error attribute returning a NotFoundError.

In the examples below, metadata from a file object is displayed meaningfully, and a file object is created with a name and a last modified date.
var file = document.getElementById("filePicker").files[0];
var date = new Date(file.lastModified);
console.log("You selected the file " + file.name + " which was modified on " + date.toDateString() + ".");

...

// Generate a file with a specific last modified date

// Months are zero-based in the Date constructor, so 11 is December
var d = new Date(2013, 11, 5, 16, 23, 45, 600);
var generatedFile = new File(["Rough Draft ...."], "Draft1.txt", {type: "text/plain", lastModified: d});

...

5. The FileList Interface

Note: The FileList interface should be considered "at risk" since the general trend on the Web Platform is to replace such interfaces with the Array platform object in ECMAScript [ECMA-262]. In particular, this means syntax of the sort filelist.item(0) is at risk; most other programmatic use of FileList is unlikely to be affected by the eventual migration to an Array type.

This interface is a list of File objects.

[Exposed=(Window,Worker), Serializable]
interface FileList {
  getter File? item(unsigned long index);
  readonly attribute unsigned long length;
};

FileList objects are serializable objects. Their serialization steps, given value and serialized, are:

  1. Set serialized.[[Files]] to an empty list.

  2. For each file in value, append the sub-serialization of file to serialized.[[Files]].

Their deserialization steps, given serialized and value, are:

  1. For each file of serialized.[[Files]], add the sub-deserialization of file to value.

Sample usage typically involves DOM access to the <input type="file"> element within a form, and then accessing selected files.
// uploadData is a form element
// fileChooser is input element of type 'file'
var file = document.forms['uploadData']['fileChooser'].files[0];

// alternative syntax can be
// var file = document.forms['uploadData']['fileChooser'].files.item(0);

if(file)
{
  // Perform file ops
}

5.1. Attributes

length , of type unsigned long, readonly
must return the number of files in the FileList object. If there are no files, this attribute must return 0.

5.2. Methods and Parameters

item(index)
must return the indexth File object in the FileList. If there is no indexth File object in the FileList, then this method must return null.

index must be treated by user agents as value for the position of a File object in the FileList, with 0 representing the first file. Supported property indices are the numbers in the range zero to one less than the number of File objects represented by the FileList object. If there are no such File objects, then there are no supported property indices.

Note: The HTMLInputElement interface has a readonly attribute of type FileList, which is what is being accessed in the above example. Other interfaces with a readonly attribute of type FileList include the DataTransfer interface.
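
The behavior of item(index) and length can be modelled with a plain array standing in for a FileList. This is a sketch under that assumption: the item function and the sample entries below are hypothetical, and a real FileList is obtained from the platform rather than constructed by script.

```javascript
// Hypothetical model of FileList.item(): return the entry at index,
// or null when index is not a supported property index.
function item(files, index) {
  return index < files.length ? files[index] : null;
}

// Plain objects standing in for File objects.
const selected = [{ name: "a.txt" }, { name: "b.txt" }];

console.log(selected.length);         // 2
console.log(item(selected, 0).name);  // "a.txt"
console.log(item(selected, 2));       // null (no such index)
console.log(item([], 0));             // null (empty list)
```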

6. Reading Data

6.1. The Read Operation

The algorithm below defines a read operation, which takes a Blob and a synchronous flag as input, and reads bytes into a byte stream which is returned as the result of the read operation, or else fails along with a failure reason. Methods in this specification invoke the read operation with the synchronous flag either set or unset.

The synchronous flag determines if a read operation is synchronous or asynchronous, and is unset by default. Methods may set it. If it is set, the read operation takes place synchronously. Otherwise, it takes place asynchronously.

To perform a read operation on a Blob and the synchronous flag, run the following steps:

  1. Let s be a new body, b be the Blob to be read from, and bytes initially set to an empty byte sequence. Set the length on s to the size of b. While there are still bytes to be read in b, perform the following substeps:

    1. If the synchronous flag is set, follow the steps below:

      1. Let bytes be the byte sequence that results from reading a chunk from b. If a file read error occurs reading a chunk from b, return s with the error flag set, along with a failure reason, and terminate this algorithm.

        Note: Along with returning failure, the synchronous part of this algorithm must return the failure reason that occurred for throwing an exception by synchronous methods that invoke this algorithm with the synchronous flag set.

      2. If there are no errors, push bytes to s, and increment s’s transmitted [Fetch] by the number of bytes in bytes. Reset bytes to the empty byte sequence and continue reading chunks as above.

      3. When all the bytes of b have been read into s, return s and terminate this algorithm.

    2. Otherwise, the synchronous flag is unset. Return s and process the rest of this algorithm asynchronously.

    3. Let bytes be the byte sequence that results from reading a chunk from b. If a file read error occurs reading a chunk from b, set the error flag on s, and terminate this algorithm with a failure reason.

      Note: The asynchronous part of this algorithm must signal the failure reason that occurred for asynchronous error reporting by methods expecting s and which invoke this algorithm with the synchronous flag unset.

    4. If no file read error occurs, push bytes to s, and increment s’s transmitted [Fetch] by the number of bytes in bytes. Reset bytes to the empty byte sequence and continue reading chunks as above.

To perform an annotated task read operation on a Blob b, perform the steps below:

  1. Perform a read operation on b with the synchronous flag unset, along with the additional steps below.

  2. If the read operation terminates with a failure reason, queue a task to process read error with the failure reason and terminate this algorithm.

  3. When the first chunk is being pushed to the body s during the read operation, queue a task to process read.

  4. Once the body s from the read operation has at least one chunk read into it, or there are no chunks left to read from b, queue a task to process read data. Keep queuing tasks to process read data for every chunk read or every 50ms, whichever is least frequent.

  5. When all of the chunks from b are read into the body s from the read operation, queue a task to process read EOF.

Use the file reading task source for all these tasks.
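
The order in which an annotated task read operation queues its tasks can be illustrated with a synchronous simulation. This sketch makes several simplifying assumptions: simulateAnnotatedRead is a hypothetical function, chunks have a fixed size, no read error occurs, one "process read data" task is recorded per chunk (the 50ms throttling is not modelled), and tasks are collected in an array instead of being queued on a task source.

```javascript
// Hypothetical simulation: records the tasks that would be queued while
// reading `bytes` in chunks of `chunkSize` bytes.
function simulateAnnotatedRead(bytes, chunkSize) {
  if (!(chunkSize > 0)) {
    throw new RangeError("chunkSize must be positive");
  }
  const tasks = [];
  let transmitted = 0;
  while (transmitted < bytes.length) {
    const chunk = bytes.subarray(transmitted, transmitted + chunkSize);
    if (transmitted === 0) {
      // The first chunk is about to be pushed to the body.
      tasks.push({ task: "process read", transmitted: 0 });
    }
    transmitted += chunk.length;
    tasks.push({ task: "process read data", transmitted });
  }
  // All chunks have been read into the body.
  tasks.push({ task: "process read EOF", transmitted });
  return tasks;
}

const queued = simulateAnnotatedRead(new Uint8Array(10), 4);
console.log(queued.map((entry) => entry.task + "@" + entry.transmitted).join(", "));
// process read@0, process read data@4, process read data@8,
// process read data@10, process read EOF@10
```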

6.2. The File Reading Task Source

This specification defines a new generic task source called the file reading task source, which is used for all tasks that are queued in this specification to read byte sequences associated with Blob and File objects. It is to be used for features that trigger in response to asynchronously reading binary data.

6.3. The FileReader API

[Constructor, Exposed=(Window,Worker)]
interface FileReader: EventTarget {

  // async read methods
  void readAsArrayBuffer(Blob blob);
  void readAsBinaryString(Blob blob);
  void readAsText(Blob blob, optional DOMString label);
  void readAsDataURL(Blob blob);

  void abort();

  // states
  const unsigned short EMPTY = 0;
  const unsigned short LOADING = 1;
  const unsigned short DONE = 2;


  readonly attribute unsigned short readyState;

  // File or Blob data
  readonly attribute (DOMString or ArrayBuffer)? result;

  readonly attribute DOMException? error;

  // event handler content attributes
  attribute EventHandler onloadstart;
  attribute EventHandler onprogress;
  attribute EventHandler onload;
  attribute EventHandler onabort;
  attribute EventHandler onerror;
  attribute EventHandler onloadend;

};

6.3.1. Constructor

When the FileReader() constructor is invoked, the user agent must return a new FileReader object.

In environments where the global object is represented by a Window or a WorkerGlobalScope object, the FileReader constructor must be available.

6.3.2. Event Handler Content Attributes

The following are the event handler content attributes (and their corresponding event handler event types) that user agents must support on FileReader as DOM attributes:

event handler content attribute    event handler event type
onloadstart                        loadstart
onprogress                         progress
onabort                            abort
onerror                            error
onload                             load
onloadend                          loadend

6.3.3. FileReader States

The FileReader object can be in one of 3 states. The readyState attribute, on getting, must return the current state, which must be one of the following values:

EMPTY (numeric value 0)
The FileReader object has been constructed, and there are no pending reads. None of the read methods have been called. This is the default state of a newly minted FileReader object, until one of the read methods has been called on it.
LOADING (numeric value 1)
A File or Blob is being read. One of the read methods is being processed, and no error has occurred during the read.
DONE (numeric value 2)
The entire File or Blob has been read into memory, OR a file read error occurred, OR the read was aborted using abort(). The FileReader is no longer reading a File or Blob. If readyState is set to DONE it means at least one of the read methods has been called on this FileReader.

6.3.4. Reading a File or Blob

The FileReader interface makes available several asynchronous read methods: readAsArrayBuffer(), readAsBinaryString(), readAsText() and readAsDataURL(), which read files into memory. If multiple concurrent read methods are called on the same FileReader object, user agents must throw an InvalidStateError on any of the read methods that occur when readyState = LOADING.

(FileReaderSync makes available several synchronous read methods. Collectively, the sync and async read methods of FileReader and FileReaderSync are referred to as just read methods.)
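
The interaction between readyState and concurrent read calls can be modelled with a small state machine. ReaderStateModel is a hypothetical class written for this illustration; it is not the FileReader interface, and it uses a plain Error with its name set to "InvalidStateError" in place of a DOMException.

```javascript
const EMPTY = 0;
const LOADING = 1;
const DONE = 2;

// Hypothetical model of the FileReader state transitions.
class ReaderStateModel {
  constructor() {
    this.readyState = EMPTY; // no read method has been called yet
  }

  // Models the start of any read method.
  startRead() {
    if (this.readyState === LOADING) {
      const error = new Error("A read is already in progress");
      error.name = "InvalidStateError";
      throw error;
    }
    this.readyState = LOADING;
  }

  // Models completion, failure, or abort of the current read.
  finish() {
    this.readyState = DONE;
  }
}

const model = new ReaderStateModel();
model.startRead();
try {
  model.startRead(); // second read while LOADING
} catch (error) {
  console.log(error.name); // "InvalidStateError"
}
model.finish();
model.startRead(); // allowed again once the previous read is DONE
console.log(model.readyState); // 1
```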

6.3.4.1. The result attribute

On getting, the result attribute returns a Blob's data as a DOMString, or as an ArrayBuffer, or null, depending on the read method that has been called on the FileReader, and any errors that may have occurred.

The list below is normative for the result attribute and is the conformance criteria for this attribute:

6.3.4.2. The readAsDataURL() method

When the readAsDataURL(blob) method is called, the user agent must run the steps below.

  1. If readyState = LOADING throw an InvalidStateError exception and terminate this algorithm.

  2. Otherwise set readyState to LOADING.

  3. Initiate an annotated task read operation using the blob argument as input and handle tasks queued on the file reading task source per below.

  4. To process read error with a failure reason, proceed to §6.3.4.6 Error Steps.

  5. To process read fire a progress event called loadstart at the context object.

  6. To process read data fire a progress event called progress at the context object.

  7. To process read EOF run these substeps:

    1. Set readyState to DONE.

    2. Set the result attribute to the body returned by the read operation as a Data URL [RFC2397]; on getting, the result attribute returns the blob as a Data URL [RFC2397].

      • Use the blob’s type attribute as part of the Data URL if it is available in keeping with the Data URL specification [RFC2397].

      • If the type attribute is not available on the blob return a Data URL without a media-type [RFC2397]. Data URLs that do not have media-types [RFC2046] must be treated as plain text by conforming user agents [RFC2397].

    3. Fire a progress event called load at the context object.

    4. Unless readyState is LOADING fire a progress event called loadend at the context object. If readyState is LOADING do NOT fire loadend at the context object.

  8. Terminate this algorithm.
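
This example is informative. The Data URL construction in the process read EOF substeps might be sketched as follows. toDataURL is a hypothetical helper, not part of this API; it assumes base64 encoding and a global btoa() function, and it follows the text above by omitting the media type when type is empty. What a given user agent actually emits for an empty type may differ.

```javascript
// Hypothetical helper: builds a base64 Data URL from raw bytes and a media type.
function toDataURL(bytes, type) {
  // Convert each byte to one character so btoa() can base64-encode it.
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  // An empty type yields a Data URL without a media type.
  const mediaType = type || "";
  return "data:" + mediaType + ";base64," + btoa(binary);
}

// The two bytes below are the ASCII characters "H" and "i".
const withType = toDataURL(new Uint8Array([72, 105]), "text/plain");
const withoutType = toDataURL(new Uint8Array([72, 105]), "");
```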

6.3.4.3. The readAsText() method

The readAsText() method can be called with an optional parameter, label, which is a DOMString argument that represents the label of an encoding [Encoding]; if provided, it must be used as part of the encoding determination used when processing this method call.

When the readAsText(blob, label) method is called, the user agent must run the steps below.

  1. If readyState = LOADING throw an InvalidStateError and terminate this algorithm.

  2. Otherwise set readyState to LOADING.

  3. Initiate an annotated task read operation using the blob argument as input and handle tasks queued on the file reading task source per below.

  4. To process read error with a failure reason, proceed to the §6.3.4.6 Error Steps.

  5. To process read fire a progress event called loadstart at the context object.

  6. To process read data fire a progress event called progress at the context object.

  7. To process read EOF run these substeps:

    1. Set readyState to DONE.

    2. Set the result attribute to the body returned by the read operation, represented as a string in a format determined by the encoding determination.

    3. Fire a progress event called load at the context object.

    4. Unless readyState is LOADING fire a progress event called loadend at the context object. If readyState is LOADING do NOT fire loadend at the context object.

  8. Terminate this algorithm.

6.3.4.4. The readAsArrayBuffer() method

When the readAsArrayBuffer(blob) method is called, the user agent must run the steps below.

  1. If readyState = LOADING throw an InvalidStateError exception and terminate this algorithm.

  2. Otherwise set readyState to LOADING.

  3. Initiate an annotated task read operation using the blob argument as input and handle tasks queued on the file reading task source per below.

  4. To process read error with a failure reason, proceed to the §6.3.4.6 Error Steps.

  5. To process read fire a progress event called loadstart at the context object.

  6. To process read data fire a progress event called progress at the context object.

  7. To process read EOF run these substeps:

    1. Set readyState to DONE.

    2. Set the result attribute to the body returned by the read operation as an ArrayBuffer object.

    3. Fire a progress event called load at the context object.

    4. Unless readyState is LOADING fire a progress event called loadend at the context object. If readyState is LOADING do NOT fire loadend at the context object.

  8. Terminate this algorithm.

6.3.4.5. The readAsBinaryString() method

When the readAsBinaryString(blob) method is called, the user agent must run the steps below.

  1. If readyState = LOADING throw an InvalidStateError exception and terminate this algorithm.

  2. Otherwise set readyState to LOADING.

  3. Initiate an annotated task read operation using the blob argument as input and handle tasks queued on the file reading task source per below.

  4. To process read error with a failure reason, proceed to the §6.3.4.6 Error Steps.

  5. To process read fire a progress event called loadstart at the context object.

  6. To process read data fire a progress event called progress at the context object.

  7. To process read EOF run these substeps:

    1. Set readyState to DONE.

    2. Set the result attribute to the body returned by the read operation as a binary string.

    3. Fire a progress event called load at the context object.

    4. Unless readyState is LOADING fire a progress event called loadend at the context object. If readyState is LOADING do NOT fire loadend at the context object.

  8. Terminate this algorithm.

The use of readAsArrayBuffer() is preferred over readAsBinaryString(), which is provided for backwards compatibility.

6.3.4.6. Error Steps

These error steps are to process read error with a failure reason.

  1. Set the context object’s readyState to DONE and result to null if it is not already set to null.

  2. Set the error attribute on the context object; on getting, the error attribute must be a DOMException object that corresponds to the failure reason. Fire a progress event called error at the context object.

  3. Unless readyState is LOADING, fire a progress event called loadend at the context object. If readyState is LOADING do NOT fire loadend at the context object.

  4. Terminate the algorithm for any read method.

6.3.4.7. The abort() method

When the abort() method is called, the user agent must run the steps below:

  1. If readyState = EMPTY or if readyState = DONE set result to null and terminate this algorithm.

  2. If readyState = LOADING set readyState to DONE and result to null.

  3. If there are any tasks from the context object on the file reading task source in an affiliated task queue, then remove those tasks from that task queue.

  4. Terminate the algorithm for the read method being processed.

  5. Fire a progress event called abort.

  6. Fire a progress event called loadend.
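
This example is informative. The abort() steps above can be modelled on a plain object. In the sketch below, abortSteps, pendingTasks and fired are hypothetical names: pendingTasks stands in for the tasks queued on the file reading task source, and fired records the events that would be fired. It is not how a user agent implements abort().

```javascript
// Hypothetical model of the abort() steps on a plain object.
const EMPTY = 0, LOADING = 1, DONE = 2;

function abortSteps(reader) {
  // Step 1: nothing is being read; clear result and stop.
  if (reader.readyState === EMPTY || reader.readyState === DONE) {
    reader.result = null;
    return;
  }
  // Step 2: a read is in progress; mark it finished and clear result.
  reader.readyState = DONE;
  reader.result = null;
  // Steps 3 and 4: drop queued tasks, which also ends the read in this model.
  reader.pendingTasks.length = 0;
  // Steps 5 and 6: record the events that would be fired, in order.
  reader.fired.push("abort", "loadend");
}

const loading = { readyState: LOADING, result: "partial", pendingTasks: ["progress"], fired: [] };
abortSteps(loading);

const idle = { readyState: EMPTY, result: null, pendingTasks: [], fired: [] };
abortSteps(idle);
```

Calling abortSteps on a reader that is not LOADING fires nothing, which matches step 1 terminating the algorithm early.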

6.3.4.8. Blob Parameters

The asynchronous read methods, the synchronous read methods, and URL.createObjectURL() take a Blob parameter. This section defines this parameter.

blob
This is a Blob argument and must be a reference to a single File in a FileList or a Blob argument not obtained from the underlying OS file system.

6.4. Determining Encoding

When reading Blob objects using the readAsText() read method, the following encoding determination steps must be followed:

  1. Let encoding be null.

  2. If the label argument is present when calling the method, set encoding to the result of getting an encoding from label.

  3. If the getting an encoding steps above return failure, then set encoding to null.

  4. If encoding is null, and the blob argument’s type attribute is present, and it uses a Charset Parameter [RFC2046], set encoding to the result of getting an encoding for the portion of the Charset Parameter that is a label of an encoding.

    If blob has a type attribute of text/plain;charset=utf-8 then getting an encoding is run using "utf-8" as the label. Note that user agents must parse and extract the portion of the Charset Parameter that constitutes a label of an encoding.

  5. If the getting an encoding steps above return failure, then set encoding to null.

  6. If encoding is null, then set encoding to utf-8.

  7. Decode this blob using fallback encoding encoding, and return the result. On getting, the result attribute of the FileReader object returns a string in encoding format. The synchronous readAsText() method of the FileReaderSync object returns a string in encoding format.
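
This example is informative. Steps 1 to 6 above might be sketched as follows. determineEncoding and getEncoding are hypothetical helpers; getting an encoding is approximated here with the TextDecoder constructor, which throws for labels it does not recognize, and the Charset Parameter is extracted with a simple regular expression rather than a full media type parser. The decoding in step 7 is not shown.

```javascript
// Hypothetical helper: approximates "getting an encoding" with TextDecoder.
function getEncoding(label) {
  try {
    return new TextDecoder(label).encoding;
  } catch (e) {
    return null; // failure
  }
}

function determineEncoding(label, type) {
  let encoding = null; // step 1
  // Steps 2 and 3: try the explicit label first.
  if (label !== undefined) {
    encoding = getEncoding(label);
  }
  // Steps 4 and 5: fall back to a charset parameter in the blob's type.
  if (encoding === null && type) {
    const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(type);
    if (match) {
      encoding = getEncoding(match[1]);
    }
  }
  // Step 6: default to utf-8.
  return encoding === null ? "utf-8" : encoding;
}

const fromLabel = determineEncoding("utf8", "");
const fromType = determineEncoding("bogus-label", "text/plain;charset=utf-16le");
const fallback = determineEncoding(undefined, "text/plain");
```

An unrecognized label is treated as failure, so the charset parameter of the type is consulted next.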

6.5. Events

The FileReader object must be the event target for all events in this specification.

When this specification says to fire a progress event called e (for some ProgressEvent e at a given FileReader reader as the context object), the following are normative:

6.5.1. Event Summary

The following are the events that are fired at FileReader objects.

Event name   Interface       Fired when…
loadstart    ProgressEvent   When the read starts.
progress     ProgressEvent   While reading (and decoding) blob.
abort        ProgressEvent   When the read has been aborted. For instance, by invoking the abort() method.
error        ProgressEvent   When the read has failed (see file read errors).
load         ProgressEvent   When the read has successfully completed.
loadend      ProgressEvent   When the request has completed (either in success or failure).

6.5.2. Summary of Event Invariants

This section is informative.

The following are invariants applicable to event firing for a given asynchronous read method in this specification:

  1. Once a loadstart has been fired, a corresponding loadend fires at completion of the read, UNLESS any of the following are true:

    • the read method has been cancelled using abort() and a new read method has been invoked

    • the event handler function for a load event initiates a new read

    • the event handler function for an error event initiates a new read.

    Note: The events loadstart and loadend are not coupled in a one-to-one manner.

    This example showcases "read-chaining": initiating another read from within an event handler while the "first" read continues processing.
    // In code of the sort...
    reader.onload = function(){reader.readAsText(alternateFile);}
    reader.readAsText(file);

    // ...

    //... the loadend event must not fire for the first read

    // The handler is registered before abort() is called, since the
    // abort event is fired while abort() runs
    reader.onabort = function(){reader.readAsText(updatedFile);}
    reader.readAsText(file);
    reader.abort();

    //... the loadend event must not fire for the first read
    
  2. One progress event will fire when blob has been completely read into memory.

  3. No progress event fires before loadstart.

  4. No progress event fires after any one of abort, load, and error have fired. At most one of abort, load, and error fire for a given read.

  5. No abort, load, or error event fires after loadend.
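
This example is informative. Invariants 3 to 5 can be expressed as a check over the sequence of event names observed for a single read. satisfiesInvariants is a hypothetical helper written for illustration; it does not cover invariants 1 and 2.

```javascript
// Hypothetical checker: returns true if a sequence of event names for one
// read is consistent with invariants 3 to 5.
function satisfiesInvariants(events) {
  const terminal = new Set(["abort", "load", "error"]);
  let started = false;      // loadstart seen
  let terminalSeen = false; // one of abort, load, error seen
  let ended = false;        // loadend seen

  for (const name of events) {
    if (name === "loadstart") {
      started = true;
    } else if (name === "progress") {
      // Invariants 3 and 4: only between loadstart and the terminal event.
      if (!started || terminalSeen) return false;
    } else if (terminal.has(name)) {
      // Invariant 4 (at most one) and invariant 5 (none after loadend).
      if (terminalSeen || ended) return false;
      terminalSeen = true;
    } else if (name === "loadend") {
      ended = true;
    }
  }
  return true;
}

const typicalRead = satisfiesInvariants(["loadstart", "progress", "load", "loadend"]);
const progressTooEarly = satisfiesInvariants(["progress", "loadstart"]);
```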

6.6. Reading on Threads

Web Workers allow for the use of synchronous File or Blob read APIs, since such reads on threads do not block the main thread. This section defines a synchronous API, which can be used within Workers [Web Workers]. Workers can use both the asynchronous API (the FileReader object) and the synchronous API (the FileReaderSync object).

6.6.1. The FileReaderSync API

This interface provides methods to synchronously read File or Blob objects into memory.

[Constructor, Exposed=(DedicatedWorker,SharedWorker)]
interface FileReaderSync {
  // Synchronously return strings

  ArrayBuffer readAsArrayBuffer(Blob blob);
  DOMString readAsBinaryString(Blob blob);
  DOMString readAsText(Blob blob, optional DOMString label);
  DOMString readAsDataURL(Blob blob);
};

6.6.1.1. Constructors

When the FileReaderSync() constructor is invoked, the user agent must return a new FileReaderSync object.

In environments where the global object is represented by a WorkerGlobalScope object, the FileReaderSync constructor must be available.

6.6.1.2. The readAsText() method

When the readAsText(blob, label) method is called, the following steps must be followed:

  1. If readyState = LOADING throw an InvalidStateError exception and terminate this algorithm.

  2. Otherwise, initiate a read operation using the blob argument, and with the synchronous flag set. If the read operation returns failure, throw the appropriate exception as defined in §7.1 Throwing an Exception or Returning an Error. Terminate this algorithm.

  3. If no error has occurred, return the result of the read operation represented as a string in a format determined through the encoding determination algorithm.

6.6.1.3. The readAsDataURL() method

When the readAsDataURL(blob) method is called, the following steps must be followed:

  1. If readyState = LOADING throw an InvalidStateError exception and terminate this algorithm.

  2. Otherwise, initiate a read operation using the blob argument, and with the synchronous flag set. If the read operation returns failure, throw the appropriate exception as defined in §7.1 Throwing an Exception or Returning an Error. Terminate this algorithm.

  3. If no error has occurred, return the result of the read operation as a Data URL [RFC2397] subject to the considerations below:

    • Use the blob’s type attribute as part of the Data URL if it is available in keeping with the Data URL specification [RFC2397].

    • If the type attribute is not available on the blob return a Data URL without a media-type [RFC2397]. Data URLs that do not have media-types [RFC2046] must be treated as plain text by conforming user agents [RFC2397].

6.6.1.4. The readAsArrayBuffer() method

When the readAsArrayBuffer(blob) method is called, the following steps must be followed:

  1. If readyState = LOADING throw an InvalidStateError exception and terminate this algorithm.

  2. Otherwise, initiate a read operation using the blob argument, and with the synchronous flag set. If the read operation returns failure, throw the appropriate exception as defined in §7.1 Throwing an Exception or Returning an Error. Terminate this algorithm.

  3. If no error has occurred, return the result of the read operation as an ArrayBuffer.

6.6.1.5. The readAsBinaryString() method

When the readAsBinaryString(blob) method is called, the following steps must be followed:

  1. If readyState = LOADING throw an InvalidStateError exception and terminate this algorithm.

  2. Otherwise, initiate a read operation using the blob argument, and with the synchronous flag set. If the read operation returns failure, throw the appropriate exception as defined in §7.1 Throwing an Exception or Returning an Error. Terminate this algorithm.

  3. If no error has occurred, return the result of the read operation as a binary string.

The use of readAsArrayBuffer() is preferred over readAsBinaryString(), which is provided for backwards compatibility.

7. Errors and Exceptions

File read errors can occur when reading files from the underlying filesystem. The list below of potential error conditions is informative.

7.1. Throwing an Exception or Returning an Error

This section is normative.

Error conditions can arise when reading a File or a Blob.

The read operation can terminate due to error conditions when reading a File or a Blob; the particular error condition that causes a read operation to return failure or queue a task to process read error is called a failure reason.

Synchronous read methods throw exceptions of the type in the table below if there has been an error owing to a particular failure reason.

Asynchronous read methods use the error attribute of the FileReader object, which must return a DOMException object of the most appropriate type from the table below if there has been an error owing to a particular failure reason, or otherwise return null.

Type Description and Failure Reason
NotFoundError If the File or Blob resource could not be found at the time the read was processed, this is the NotFound failure reason.

For asynchronous read methods the error attribute must return a NotFoundError exception and synchronous read methods must throw a NotFoundError exception.

SecurityError If:
  • it is determined that certain files are unsafe for access within a Web application, this is the UnsafeFile failure reason.

  • it is determined that too many read calls are being made on File or Blob resources, this is the TooManyReads failure reason.

For asynchronous read methods the error attribute may return a SecurityError exception and synchronous read methods may throw a SecurityError exception.

This is a security error to be used in situations not covered by any other failure reason.

NotReadableError If:
  • the snapshot state of a File or a Blob does not match the state of the underlying storage, this is the SnapshotState failure reason.

  • the File or Blob cannot be read, typically due to permission problems that occur after a snapshot state has been established (e.g. concurrent lock on the underlying storage with another application), this is the FileLock failure reason.

For asynchronous read methods the error attribute must return a NotReadableError exception and synchronous read methods must throw a NotReadableError exception.
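
This example is informative. The table above can be read as a lookup from failure reason to exception type. The sketch below uses hypothetical names (ERROR_NAME_FOR_REASON, errorNameFor); falling back to SecurityError for an unlisted reason is an assumption based on the statement that SecurityError is to be used in situations not covered by any other failure reason.

```javascript
// Lookup from the failure reasons named above to DOMException names.
const ERROR_NAME_FOR_REASON = new Map([
  ["NotFound", "NotFoundError"],
  ["UnsafeFile", "SecurityError"],
  ["TooManyReads", "SecurityError"],
  ["SnapshotState", "NotReadableError"],
  ["FileLock", "NotReadableError"],
]);

// Hypothetical helper: returns the exception name for a failure reason.
function errorNameFor(reason) {
  // Assumption: unlisted reasons fall back to SecurityError.
  return ERROR_NAME_FOR_REASON.has(reason)
    ? ERROR_NAME_FOR_REASON.get(reason)
    : "SecurityError";
}

const lockError = errorNameFor("FileLock");
const missingError = errorNameFor("NotFound");
```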

8. A URL for Blob and File reference

This section defines a scheme for a URL used to refer to Blob objects (and File objects).

8.1. Requirements for a New Scheme

This specification defines a scheme with URLs of the sort: blob:550e8400-e29b-41d4-a716-446655440000#aboutABBA. This section provides some requirements and is an informative discussion.

8.2. Discussion of Existing Schemes

This section is an informative discussion of existing schemes that may have been repurposed or reused for the use cases for URLs above, and justification for why a new scheme is considered preferable. These schemes include HTTP [RFC7230], file [RFC1630][RFC4266], and a scheme such as urn:uuid [RFC4122]. One broad consideration in determining what scheme to use is providing something with intuitive appeal to web developers.

8.3. The Blob URL

A Blob URL is a URL with a scheme of blob, a host of the origin of the Blob URL, a path with one entry comprised of a UUID [RFC4122] (see An ABNF for UUID), and an object of the associated Blob or File. A Blob URL may contain an optional fragment. The Blob URL is serialized as a string according to the Serialization of a Blob URL algorithm.

A fragment, if used, has a distinct interpretation depending on the media type of the Blob or File resource in question (see §8.3.3 Discussion of Fragment Identifier).

blob = scheme ":" origin "/" UUID [fragIdentifier]

scheme = "blob"

; scheme is always "blob"

; origin is a string representation of the Blob URL’s origin.
; UUID is as defined in [RFC4122] and An ABNF for UUID
; fragIdentifier is optional and as defined in [RFC3986] and An ABNF for Fragment Identifiers

An example of a Blob URL might be blob:https://example.org/9115d58c-bcda-ff47-86e5-083e9a215304.

8.3.1. Origin of Blob URLs

Blob URLs are created using URL.createObjectURL(), and are revoked using URL.revokeObjectURL(). The origin of a Blob URL must be the same as the origin specified by the current settings object at the time the method that created it was called.

There is currently some confusion between the generic definition of the origin of a URL and the specific definition of the origin of a Blob URL. This is tracked in issue #63 and in whatwg/url#127.

Cross-origin requests on Blob URLs must return a network error.

Note: In practice this means that HTTP and HTTPS origins are covered by this specification as valid origins for use with Blob URLs. This specification does not address the case of non-HTTP and non-HTTPS origins. For instance blob:file:///Users/arunranga/702efefb-c234-4988-a73b-6104f1b615ee (which uses the "file:" origin) may have behavior that is undefined, even though user agents may treat such Blob URLs as valid.

8.3.2. Serialization of a Blob URL

The Serialization of a Blob URL is the value returned by the following algorithm, which is invoked by URL.createObjectURL():

  1. Let result be the empty string. Append the string "blob" (that is, the Unicode code point sequence U+0062, U+006C, U+006F, U+0062) to result.

  2. Append the ":" (U+003A COLON) character to result.

  3. Let settings be the current settings object.

  4. Let origin be settings’s origin.

  5. Let serialized be the ASCII serialization of origin.

  6. If serialized is "null", set it to an implementation-defined value.

  7. Append serialized to result.

  8. Append the "/" character (U+002F SOLIDUS) to result.

  9. Generate a UUID [RFC4122] as a Unicode string and append it to result.

  10. Return result.
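
This example is informative. The serialization algorithm above might be sketched as follows. serializeBlobURL and randomUUID are hypothetical helpers: the caller supplies the ASCII serialization of the origin (steps 3 to 5) and a stand-in for the implementation-defined value used when that serialization is "null", and the UUID is generated with Math.random() purely for illustration, which is not suitable where unpredictability matters.

```javascript
// Hypothetical helper: random version-4 style UUID built with Math.random().
// Not cryptographically strong; for illustration only.
function randomUUID() {
  return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, (c) => {
    const r = Math.floor(Math.random() * 16);
    const v = c === "x" ? r : (r & 0x3) | 0x8;
    return v.toString(16);
  });
}

// serializedOrigin is the ASCII serialization of the origin; nullFallback
// stands in for the implementation-defined value used for "null".
function serializeBlobURL(serializedOrigin, nullFallback) {
  let result = "blob";               // step 1
  result += ":";                     // step 2
  let serialized = serializedOrigin; // steps 3 to 5 (supplied by the caller)
  if (serialized === "null") {
    serialized = nullFallback;       // step 6
  }
  result += serialized;              // step 7
  result += "/";                     // step 8
  result += randomUUID();            // step 9
  return result;                     // step 10
}

const exampleURL = serializeBlobURL("https://example.org", "");
```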

8.3.3. Discussion of Fragment Identifier

The fragment’s resolution and processing directives depend on the media type [RFC2046] of a potentially retrieved representation, even though such a retrieval is only performed if the Blob URL is dereferenced. For example, in an HTML file [HTML] the fragment could be used to refer to an anchor within the file. If the user agent does not recognize the media type of the resource, OR if a fragment is not meaningful within the resource, it must ignore the fragment. The fragment must not be used to identify a resource; only the blob scheme and the origin/path of the URL constitute a valid resource identifier.

A valid Blob URL reference could look like: blob:http://example.org:8080/550e8400-e29b-41d4-a716-446655440000#aboutABBA where "#aboutABBA" might be an HTML fragment identifier referring to an element with an id attribute of "aboutABBA".

Note: The fragment is not used to identify the resource. The URL.createObjectURL() method does not generate a fragment.
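
This example is informative. A serialized Blob URL can be separated into the parts named by the grammar in §8.3. splitBlobURL and UUID_PATTERN are hypothetical names; the pattern follows An ABNF for UUID, and the function is a string-level sketch rather than the URL parser that the [URL] specification defines.

```javascript
// Pattern for the UUID production: 8-4-4-4-12 hexadecimal digits.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper: splits a serialized Blob URL into origin, UUID and
// optional fragment. Returns null if the string does not fit the grammar.
function splitBlobURL(url) {
  if (!url.startsWith("blob:")) return null;
  let rest = url.slice("blob:".length);

  // The fragment, if any, starts at the first "#".
  let fragment = null;
  const hash = rest.indexOf("#");
  if (hash !== -1) {
    fragment = rest.slice(hash + 1);
    rest = rest.slice(0, hash);
  }

  // The UUID is the single path entry after the last "/".
  const slash = rest.lastIndexOf("/");
  if (slash === -1) return null;
  const uuid = rest.slice(slash + 1);
  if (!UUID_PATTERN.test(uuid)) return null;

  return { origin: rest.slice(0, slash), uuid, fragment };
}

const parts = splitBlobURL(
  "blob:http://example.org:8080/550e8400-e29b-41d4-a716-446655440000#aboutABBA"
);
```

Only the scheme, origin and UUID identify the resource; the fragment is returned separately so that callers can ignore it when looking up the Blob.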

8.4. Dereferencing Model for Blob URLs

This section is informative.

Note: The [URL] and [Fetch] specifications should be considered normative for parsing and fetching Blob URLs.

Blob URLs are dereferenced when the user agent retrieves the resource identified by the Blob URL and returns it to the requesting entity. This section provides guidance on requests and responses.

Only requests with GET [RFC7231] are supported. Specifically, responses are only a subset of the following from HTTP [RFC7231]:

8.4.1. 200 OK

This response is used if the request has succeeded, and no network errors are generated.

8.4.2. Response Headers

Along with 200 OK responses, user agents use a Content-Type header [RFC7231] that is equal to the value of the Blob's type attribute, if it is not the empty string.

Along with 200 OK responses, user agents use a Content-Length header [RFC7230] that is equal to the value of the Blob's size attribute.

If a Content-Type header [RFC7231] is provided, then user agents obtain and process that media type in a manner consistent with the MIME Sniffing specification [MIMESNIFF].

If a resource identified with a Blob URL is a File object, user agents use that file’s name attribute, as if the response had a Content-Disposition header with the filename parameter set to the File's name attribute [RFC6266].

Note: A corollary to this is a non-normative implementation guideline: that the "File Save As" user interface in user agents takes into account the File's name attribute if presenting a default name to save the file as.
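
This example is informative. The header rules above might be summarized as a function over an object carrying type and size, and optionally name for a File. describeBlobResponse and the property names of its return value are hypothetical; the sketch reports the values a user agent would use and does not build an actual HTTP response.

```javascript
// Hypothetical helper: derives the response metadata described above from an
// object with Blob-like `type` and `size` and an optional File-like `name`.
function describeBlobResponse(resource) {
  const response = { status: 200 };
  // Content-Type is used only when type is not the empty string.
  if (resource.type !== "") {
    response.contentType = resource.type;
  }
  // Content-Length equals the size attribute.
  response.contentLength = resource.size;
  // For a File, the name acts like a Content-Disposition filename parameter.
  if (typeof resource.name === "string") {
    response.filename = resource.name;
  }
  return response;
}

const fileResponse = describeBlobResponse({ type: "image/jpeg", size: 21897, name: "photo.jpg" });
const blobResponse = describeBlobResponse({ type: "", size: 4 });
```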

8.4.3. Network Errors

Responses that do not succeed with a 200 OK act as if a network error has occurred. Network errors are used when:

8.4.4. Sample Request and Response Headers

This section is informative.

This section provides sample exchanges between web applications and user agents using Blob URLs. A request can be triggered using HTML markup of the sort <img src="blob:http://example.org:8080/550e8400-e29b-41d4-a716-446655440000">. These examples merely illustrate the request and response; web developers are not likely to interact with all the headers, but the getAllResponseHeaders() method of XMLHttpRequest, if used, will show relevant response headers.

Requests could look like this:
GET http://example.org:8080/550e8400-e29b-41d4-a716-446655440000

If the Blob has an affiliated media type [RFC2046] represented by its type attribute, then the response message should include the Content-Type header from RFC7231 [RFC7231]. See §8.4.2 Response Headers.

Example response:

200 OK
Content-Type: image/jpeg
Content-Length: 21897

....

If there is a file read error associated with the Blob, then a user agent acts as if a network error has occurred.

8.5. Creating and Revoking a Blob URL

Blob URLs are created and revoked using methods exposed on the URL object, supported by global objects Window [HTML] and WorkerGlobalScope [Web Workers]. Revocation of a Blob URL decouples the Blob URL from the resource it refers to, and if it is dereferenced after it is revoked, user agents must act as if a network error has occurred. This section describes a supplemental interface to the URL specification [URL] and presents methods for Blob URL creation and revocation.

[Exposed=(Window,DedicatedWorker,SharedWorker)]
partial interface URL {
  static DOMString createObjectURL(Blob blob);
  static void revokeObjectURL(DOMString url);
};

ECMAScript user agents of this specification must ensure that they do not expose a prototype property on the URL interface object unless the user agent also implements the URL [URL] specification. In other words, URL.prototype must evaluate to true if the user agent implements the URL [URL] specification, and must NOT evaluate to true otherwise.

8.5.1. Methods and Parameters

The createObjectURL(blob) static method
Returns a unique Blob URL. This method must act as follows:
  1. Let url be the result of the Serialization of a Blob URL algorithm.

  2. Add an entry to the Blob URL Store for url and blob.

  3. Return url.

The revokeObjectURL(url) static method
Revokes the Blob URL provided in the string url by removing the corresponding entry from the Blob URL Store. This method must act as follows:
  1. If the value provided for the url argument is not a Blob URL, OR if the value provided for the url argument does not have an entry in the Blob URL Store, this method call does nothing. User agents may display a message on the error console.

  2. Otherwise, user agents must remove the entry from the Blob URL Store for url.

    Note: Subsequent attempts to dereference url result in a network error, since the entry has been removed from the Blob URL Store.

The url argument to the revokeObjectURL() method is a Blob URL string.

In the example below, window1 and window2 are separate, but in the same origin; window2 could be an iframe inside window1.
myurl = window1.URL.createObjectURL(myblob);
window2.URL.revokeObjectURL(myurl);

Since window1 and window2 are in the same origin and share the same Blob URL Store, the URL.revokeObjectURL() call ensures that subsequent dereferencing of myurl results in the user agent acting as if a network error has occurred.

8.5.2. Examples of Blob URL Creation and Revocation

Blob URLs are strings that are used to dereference Blob objects, and can persist for as long as the document from which they were minted using URL.createObjectURL() exists (see §8.6 Lifetime of Blob URLs).

This section gives sample usage of creation and revocation of Blob URLs with explanations.

In the example below, two img elements [HTML] refer to the same Blob URL:
url = URL.createObjectURL(blob);
img1.src = url;
img2.src = url;

In the example below, URL.revokeObjectURL() is explicitly called.
var blobURLref = URL.createObjectURL(file);
var img1 = new Image();
var img2 = new Image();

// Both assignments below work as expected
img1.src = blobURLref;
img2.src = blobURLref;

// ... Following body load
// Check if both images have loaded
if(img1.complete && img2.complete) {
  // Ensure that subsequent dereferences result in a network error
  URL.revokeObjectURL(blobURLref);
} else {
  // msg() stands for an application-defined notification function
  msg("Images cannot be previewed!");
  // revoke the string-based reference
  URL.revokeObjectURL(blobURLref);
}

The example above allows multiple references to a single Blob URL, and the web developer then revokes the Blob URL string after both image objects have been loaded. While not restricting the number of uses of the Blob URL offers more flexibility, it increases the likelihood of leaks; developers should pair each call to URL.createObjectURL() with a corresponding call to URL.revokeObjectURL().

8.6. Lifetime of Blob URLs

A global object which exposes URL.createObjectURL() must maintain a Blob URL Store which is a list of Blob URLs created by the URL.createObjectURL() method, and the Blob resource that each refers to.

When this specification says to add an entry to the Blob URL Store for a Blob URL and a Blob input, the user agent must add to the Blob URL Store the Blob URL and a reference to the Blob it refers to.

When this specification says to remove an entry from the Blob URL Store for a given Blob URL or for a given Blob, user agents must remove the Blob URL and the Blob it refers to from the Blob URL Store. Subsequent attempts to dereference this URL must result in a network error.

This specification adds an additional unloading document cleanup step: user agents must remove all Blob URLs from the Blob URL Store within that document.

Note: User agents are free to garbage collect resources removed from the Blob URL Store.
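
This example is informative. The Blob URL Store described in this section can be modelled as a map from Blob URL strings to Blob references. BlobURLStoreModel and its method names are hypothetical, and a null return value stands in for a network error; the sketch makes no claim about how user agents store entries.

```javascript
// Hypothetical model of a Blob URL Store backed by a Map.
class BlobURLStoreModel {
  constructor() {
    this.entries = new Map(); // Blob URL string -> Blob reference
  }

  // Add an entry for a Blob URL and the Blob it refers to.
  add(url, blob) {
    this.entries.set(url, blob);
  }

  // Remove an entry; returns false when the URL has no entry.
  remove(url) {
    return this.entries.delete(url);
  }

  // Returns the Blob, or null to stand for a network error.
  dereference(url) {
    return this.entries.has(url) ? this.entries.get(url) : null;
  }

  // Unloading document cleanup step: remove every entry.
  unload() {
    this.entries.clear();
  }
}

const store = new BlobURLStoreModel();
const sampleBlob = { size: 3, type: "text/plain" }; // stand-in for a Blob
const sampleURL = "blob:https://example.org/9115d58c-bcda-ff47-86e5-083e9a215304";
store.add(sampleURL, sampleBlob);
const beforeRevoke = store.dereference(sampleURL);
store.remove(sampleURL);
const afterRevoke = store.dereference(sampleURL);
```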

9. Security and Privacy Considerations

This section is informative.

This specification allows web content to read files from the underlying file system, as well as provides a means for files to be accessed by unique identifiers, and as such is subject to some security considerations. This specification also assumes that the primary user interaction is with the <input type="file"/> element of HTML forms [HTML], and that all files that are being read by FileReader objects have first been selected by the user. Important security considerations include preventing malicious file selection attacks (selection looping), preventing access to system-sensitive files, and guarding against modifications of files on disk after a selection has taken place.

This section is provisional; more security data may supplement this in subsequent drafts.

10. Requirements and Use Cases

This section covers what the requirements are for this API, as well as illustrates some use cases. This version of the API does not satisfy all use cases; subsequent versions may elect to address these.

Appendix A: Blob URL Grammar

This section uses the Augmented Backus-Naur Form (ABNF), defined in [RFC5234], to describe components of blob: URLs.

An ABNF for UUID

The following is an ABNF [RFC5234] for UUID. UUID strings must only use characters in the ranges U+002A to U+002B, U+002D to U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005E to U+007E [Unicode], and should be at least 36 characters long.

UUID                   = time-low "-" time-mid "-"
                        time-high-and-version "-"
                        clock-seq-and-reserved
                        clock-seq-low "-" node
time-low               = 4hexOctet
time-mid               = 2hexOctet
time-high-and-version  = 2hexOctet
clock-seq-and-reserved = hexOctet
clock-seq-low          = hexOctet
node                   = 6hexOctet
hexOctet               = hexDigit hexDigit
hexDigit = "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
          "a" / "b" / "c" / "d" / "e" / "f" /
          "A" / "B" / "C" / "D" / "E" / "F"

An ABNF for Fragment Identifiers

fragIdentifier = "#" fragment

; Fragment Identifiers depend on the media type of the Blob
; fragment is defined in [RFC3986]
; fragment processing for HTML is defined in [HTML]

fragment    = *( pchar / "/" / "?" )

pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"

unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"

pct-encoded   = "%" HEXDIG HEXDIG

sub-delims    = "!" / "$" / "&" / "'" / "(" / ")" /
                "*" / "+" / "," / ";" / "="

Acknowledgements

This specification was originally developed by the SVG Working Group. Many thanks to Mark Baker and Anne van Kesteren for their feedback.

Thanks to Robin Berjon and Jonas Sicking for editing the original specification.

Special thanks to Olli Pettay, Nikunj Mehta, Garrett Smith, Aaron Boodman, Michael Nordman, Jian Li, Dmitry Titov, Ian Hickson, Darin Fisher, Sam Weinig, Adrian Bateman and Julian Reschke.

Thanks to the W3C WebApps WG, and to participants on its mailing list.

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Conformant Algorithms

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

Conformance requirements phrased as algorithms or specific steps can be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to understand and are not intended to be performant. Implementers are encouraged to optimize.

Index

Terms defined by this specification

Terms defined by reference

References

Normative References

[DOM]
Yongsheng Zhu. W3C DOM 4.1. 25 October 2017. W3C Working Draft. URL: https://www.w3.org/TR/dom41/ ED: https://w3c.github.io/dom/
[ECMA-262]
ECMAScript Language Specification. URL: https://tc39.github.io/ecma262/
[Encoding]
Anne van Kesteren; Joshua Bell; Addison Phillips. Encoding. W3C Candidate Recommendation. URL: https://www.w3.org/TR/encoding/ ED: https://encoding.spec.whatwg.org/
[Fetch]
Anne van Kesteren. Fetch Standard. Living Standard. URL: https://fetch.spec.whatwg.org/
[HTML]
Steve Faulkner; Arron Eicholz; Travis Leithead; Alex Danilo; Sangwhan Moon. HTML 5.2. 8 August 2017. W3C Candidate Recommendation. URL: https://www.w3.org/TR/html52/ ED: https://w3c.github.io/html
[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[MIMESNIFF]
Gordon P. Hemsley. MIME Sniffing Standard. Living Standard. URL: https://mimesniff.spec.whatwg.org/
[RFC2046]
N. Freed; N. Borenstein. Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types. November 1996. Draft Standard. URL: https://tools.ietf.org/html/rfc2046
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[RFC2397]
L. Masinter. The "data" URL scheme. August 1998. Proposed Standard. URL: https://tools.ietf.org/html/rfc2397
[RFC5234]
D. Crocker, Ed.; P. Overell. Augmented BNF for Syntax Specifications: ABNF. January 2008. Internet Standard. URL: https://tools.ietf.org/html/rfc5234
[RFC6266]
J. Reschke. Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP). June 2011. Proposed Standard. URL: https://tools.ietf.org/html/rfc6266
[RFC7230]
R. Fielding, Ed.; J. Reschke, Ed. Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing. June 2014. Proposed Standard. URL: https://tools.ietf.org/html/rfc7230
[RFC7231]
R. Fielding, Ed.; J. Reschke, Ed. Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content. June 2014. Proposed Standard. URL: https://tools.ietf.org/html/rfc7231
[STREAMS]
Adam Rice; Domenic Denicola; 吉野剛史 (Takeshi Yoshino). Streams Standard. Living Standard. URL: https://streams.spec.whatwg.org/
[Unicode]
The Unicode Standard. URL: https://www.unicode.org/versions/latest/
[URL]
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/
[WebIDL]
Cameron McCormack; Boris Zbarsky; Tobie Langel. Web IDL. URL: https://heycam.github.io/webidl/
[XHR]
Anne van Kesteren. XMLHttpRequest Standard. Living Standard. URL: https://xhr.spec.whatwg.org/

Informative References

[RFC1630]
T. Berners-Lee. Universal Resource Identifiers in WWW: A Unifying Syntax for the Expression of Names and Addresses of Objects on the Network as used in the World-Wide Web. June 1994. Informational. URL: https://tools.ietf.org/html/rfc1630
[RFC3986]
T. Berners-Lee; R. Fielding; L. Masinter. Uniform Resource Identifier (URI): Generic Syntax. January 2005. Internet Standard. URL: https://tools.ietf.org/html/rfc3986
[RFC4122]
P. Leach; M. Mealling; R. Salz. A Universally Unique IDentifier (UUID) URN Namespace. July 2005. Proposed Standard. URL: https://tools.ietf.org/html/rfc4122
[RFC4266]
P. Hoffman. The gopher URI Scheme. November 2005. Proposed Standard. URL: https://tools.ietf.org/html/rfc4266
[WebRTC]
Adam Bergkvist; et al. WebRTC 1.0: Real-time Communication Between Browsers. URL: https://www.w3.org/TR/webrtc/
[Workers]
Ian Hickson. Web Workers. URL: https://www.w3.org/TR/workers/

IDL Index

[Constructor(optional sequence<BlobPart> blobParts,
             optional BlobPropertyBag options),
 Exposed=(Window,Worker), Serializable]
interface Blob {

  readonly attribute unsigned long long size;
  readonly attribute DOMString type;

  // slice Blob into byte-ranged chunks
  Blob slice([Clamp] optional long long start,
            [Clamp] optional long long end,
            optional DOMString contentType);
};

dictionary BlobPropertyBag {
  DOMString type = "";
};

typedef (BufferSource or Blob or USVString) BlobPart;

[Constructor(sequence<BlobPart> fileBits,
             USVString fileName,
             optional FilePropertyBag options),
 Exposed=(Window,Worker), Serializable]
interface File : Blob {
  readonly attribute DOMString name;
  readonly attribute long long lastModified;
};

dictionary FilePropertyBag : BlobPropertyBag {
  long long lastModified;
};

[Exposed=(Window,Worker), Serializable]
interface FileList {
  getter File? item(unsigned long index);
  readonly attribute unsigned long length;
};

[Constructor, Exposed=(Window,Worker)]
interface FileReader: EventTarget {

  // async read methods
  void readAsArrayBuffer(Blob blob);
  void readAsBinaryString(Blob blob);
  void readAsText(Blob blob, optional DOMString label);
  void readAsDataURL(Blob blob);

  void abort();

  // states
  const unsigned short EMPTY = 0;
  const unsigned short LOADING = 1;
  const unsigned short DONE = 2;


  readonly attribute unsigned short readyState;

  // File or Blob data
  readonly attribute (DOMString or ArrayBuffer)? result;

  readonly attribute DOMException? error;

  // event handler content attributes
  attribute EventHandler onloadstart;
  attribute EventHandler onprogress;
  attribute EventHandler onload;
  attribute EventHandler onabort;
  attribute EventHandler onerror;
  attribute EventHandler onloadend;

};

[Constructor, Exposed=(DedicatedWorker,SharedWorker)]
interface FileReaderSync {
  // Synchronously return the read data (ArrayBuffer or string)

  ArrayBuffer readAsArrayBuffer(Blob blob);
  DOMString readAsBinaryString(Blob blob);
  DOMString readAsText(Blob blob, optional DOMString label);
  DOMString readAsDataURL(Blob blob);
};

[Exposed=(Window,DedicatedWorker,SharedWorker)]
partial interface URL {
  static DOMString createObjectURL(Blob blob);
  static void revokeObjectURL(DOMString url);
};

Issues Index

There is currently some confusion between the generic definition of the origin of a URL and the specific definition of the origin of a Blob URL. This is tracked in issue #63 and in whatwg/url#127.
This section is provisional; more security data may supplement this in subsequent drafts.