Verifiable Credentials Implementation Guidelines 1.0

3. Terminology

This section is non-normative.

The following terms are used to describe concepts in this specification.

claim: An assertion made about a subject.
credential: A set of one or more claims made by an issuer. A verifiable credential is a tamper-evident credential that has authorship that can be cryptographically verified. Verifiable credentials can be used to build verifiable presentations, which can also be cryptographically verified. The claims in a credential can be about different subjects.
data minimization: The act of limiting the amount of shared data strictly to the minimum necessary to successfully accomplish a task or goal.
decentralized identifier: A portable URL-based identifier, also known as a DID, associated with an entity. These identifiers are most often used in a verifiable credential and are associated with subjects such that a verifiable credential itself can be easily ported from one repository to another without the need to reissue the credential. An example of a DID is did:example:123456abcdef.
decentralized identifier document: Also referred to as a DID document, this is a document that is accessible using a verifiable data registry and contains information related to a specific decentralized identifier, such as the associated repository and public key information.
derived predicate: A verifiable, boolean assertion about the value of another attribute in a verifiable credential. These are useful in zero-knowledge-proof-style verifiable presentations because they can limit information disclosure. For example, if a verifiable credential contains an attribute for expressing a specific height in centimeters, a derived predicate might reference the height attribute in the verifiable credential demonstrating that the issuer attests to a height value meeting the minimum height requirement, without actually disclosing the specific height value. For example, the subject is taller than 150 centimeters.
digital signature: A mathematical scheme for demonstrating the authenticity of a digital message.
entity: A thing with distinct and independent existence, such as a person, organization, or device that performs one or more roles in the ecosystem.
graph: A network of information composed of subjects and their relationship to other subjects or data.
hashlink: Hashlink URLs can be used to provide content integrity for links to external resources.
holder: A role an entity might perform by possessing one or more verifiable credentials and generating presentations from them. A holder is usually, but not always, a subject of the verifiable credentials they are holding. Holders store their credentials in credential repositories.
identity: The means for keeping track of entities across contexts. Digital identities enable tracking and customization of entity interactions across digital contexts, typically using identifiers and attributes. Unintended distribution or use of identity information can compromise privacy. Collection and use of such information should follow the principle of data minimization.
identity provider: An identity provider, sometimes abbreviated as IdP, is a system for creating, maintaining, and managing identity information for holders, while providing authentication services to relying party applications within a federation or distributed network. In this case the holder is always the subject. Even if the verifiable credentials are bearer credentials, it is assumed the verifiable credentials remain with the subject, and if they are not, they were stolen by an attacker. This specification does not use this term unless comparing or mapping the concepts in this document to other specifications. This specification decouples the identity provider concept into two distinct concepts: the issuer and the holder.
issuer: A role an entity can perform by asserting claims about one or more subjects, creating a verifiable credential from these claims, and transmitting the verifiable credential to a holder.
presentation: Data derived from one or more verifiable credentials, issued by one or more issuers, that is shared with a specific verifier. A verifiable presentation is a tamper-evident presentation encoded in such a way that authorship of the data can be trusted after a process of cryptographic verification. Certain types of verifiable presentations might contain data that is synthesized from, but do not contain, the original verifiable credentials (for example, zero-knowledge proofs).
repository: A program, such as a storage vault or personal verifiable credential wallet, that stores and protects access to holders' verifiable credentials.
selective disclosure: The ability of a holder to make fine-grained decisions about what information to share.
subject: A thing about which claims are made.
user agent: A program, such as a browser or other Web client, that mediates the communication between holders, issuers, and verifiers.
validation: The assurance that a verifiable credential or a verifiable presentation meets the needs of a verifier and other dependent stakeholders. This specification is constrained to verifying verifiable credentials and verifiable presentations regardless of their usage. Validating verifiable credentials or verifiable presentations is outside the scope of this specification.
verifiable data registry: A role a system might perform by mediating the creation and verification of identifiers, keys, and other relevant data, such as verifiable credential schemas, revocation registries, issuer public keys, and so on, which might be required to use verifiable credentials. Some configurations might require correlatable identifiers for subjects. Some registries, such as ones for UUIDs and public keys, might just act as namespaces for identifiers.
verification: The evaluation of whether a verifiable credential or verifiable presentation is an authentic and timely statement of the issuer or presenter, respectively. This includes checking that: the credential (or presentation) conforms to the specification; the proof method is satisfied; and, if present, the status is successfully checked.
verifier: The entity verifying a claim about a given subject.
URI: A Uniform Resource Identifier, as defined by [RFC3986].

9. Extensions

This section is non-normative.

The Verifiable Credentials Data Model is designed around an open world assumption, meaning that any entity can say anything about another entity. This approach enables permissionless innovation; there is no centralized registry or authority through which an extension author must register themselves nor the specific data models and vocabularies they create.

Instead, credential data model authors are expected to use machine-readable vocabularies through the use of [LINKED-DATA]. This implementation guide provides examples for how to express data models using a data format that is popular with software developers and web page authors called [JSON-LD]. This data format provides features that enable authors to express their data models in idiomatic JSON while also ensuring that their vocabulary terms are unambiguously understood, even by software that does not implement JSON-LD processing.

The Verifiable Credentials data model also uses a graph-based data model, which allows authors to model both simple relationships that describe one or more attributes for a single entity and complex multi-entity relationships.

The rest of this section describes how to author extensions that build on the Verifiable Credentials Data Model.

9.1 Creating New Credential Types

We expect the most common extensions to the Verifiable Credentials Data Model to be new credential types. Whenever someone has something to say about one or more entities and they want their authorship to be verifiable, they should use a Verifiable Credential. Sometimes there may be an existing credential type, that someone else has created, that can be reused to make the statements they want to make. However, there are often cases where new credential types are needed.

New credential types can be created by following a few steps. This guide will also walk you through creating an example new credential type. At a high level, the steps to follow are:

Design the data model.
Create a new JSON-LD context.
Select a publishing location.
Use the new JSON-LD context when issuing new credentials.

So, let's walk through creating a new credential type which we will call ExampleAddressCredential. The purpose of this credential will be to express a person's postal address.

Design the data model

First, we must design a data model for our new credential type. We know that we will need to be able to express the basics of a postal address, things like a person's city, state, and ZIP code. Of course, those items are quite US-centric, so we should consider internationalizing those terms. But before we go further, since we're using [LINKED-DATA] vocabularies, there is a good chance that commonly known concepts may already have a vocabulary that someone else has created that we can leverage.

If we are going to use someone else's vocabulary, we will want to make sure it is stable and unlikely to change in any significant way. There may even be technologies that we can make use of that store immutable vocabularies that we can reference, but those are not the focus of this example. Here we will rely on the inertia that comes from a very popularly used vocabulary on the Web, schema.org. It turns out that this vocabulary has just what we need; it has already modeled a postal address and even has examples for how to express it using JSON-LD.

Please note that schema.org is developed incrementally, meaning that the definition of a term today may differ from a future definition, or even be removed. Although schema.org developers encourage using the latest release, as in the simple non-versioned schema.org URLs such as http://schema.org/Place in structured data applications, there are times in which more precise versioning is important. Schema.org also provides dated snapshots of each release, including both human and machine readable definitions of the schema.org core vocabulary. These are linked from the releases page. For instance, instead of the unversioned URI http://schema.org/Place, you might use the versioned URI https://schema.org/version/3.9/schema-all.html#term_Place. In addition, the schemaVersion property has been defined to provide a way for documents to indicate the specific intended version of schema.org's definitions.

Using the schema.org vocabulary and JSON-LD we can express a person's address like so:

Example 6: Example schema.org address

{
  "@context": [
    "http://schema.org"
  ],
  "type": "Person",
  "address": {
    "type": "PostalAddress",
    "streetAddress": "123 Main St.",
    "addressLocality": "Blacksburg",
    "addressRegion": "VA",
    "postalCode": "24060",
    "addressCountry": "US"
  }
}

Note the above @context key in the JSON. This @context refers to a machine-readable file (also expressed in JSON) that provides term definitions [JSON-LD]. A term definition maps a key or type used in the JSON, such as address or PostalAddress, to a globally unique identifier: a URL.

This ensures that when software sees the @context http://schema.org, it will interpret the keys and types in the JSON in a globally consistent way, without requiring developers to use full URLs in the JSON or in the code that may traverse it. As long as the software is aware of the specific @context used (or if it uses JSON-LD processing to transform it to some other known @context), then it will understand the context in which the JSON was written and meant to be understood. The use of @context also allows [JSON-LD] keywords such as @type to be aliased to the simpler type as is done in the above example.

Note that we could also express the JSON using full URLs, if we want to avoid using @context. Here is what the example would look like if we did that:

Example 7: Example schema.org address with full URLs

{
  "@type": "http://schema.org/Person",
  "http://schema.org/address": {
    "@type": "http://schema.org/PostalAddress",
    "http://schema.org/streetAddress": "123 Main St.",
    "http://schema.org/addressLocality": "Blacksburg",
    "http://schema.org/addressRegion": "VA",
    "http://schema.org/postalCode": "24060",
    "http://schema.org/addressCountry": "US"
  }
}
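
The relationship between the short form and the full-URL form can be illustrated with a small Python sketch. It is not a JSON-LD processor: it assumes a hand-written, flat table of term definitions (TERMS, a hypothetical stand-in for the relevant part of the schema.org context) and only handles nested objects and plain string values, which is enough to reproduce the full-URL example above.

```python
# Hypothetical, hand-written subset of term definitions. A real context is
# a separate document that a JSON-LD library would load and process.
TERMS = {
    "Person": "http://schema.org/Person",
    "PostalAddress": "http://schema.org/PostalAddress",
    "address": "http://schema.org/address",
    "streetAddress": "http://schema.org/streetAddress",
    "addressLocality": "http://schema.org/addressLocality",
    "addressRegion": "http://schema.org/addressRegion",
    "postalCode": "http://schema.org/postalCode",
    "addressCountry": "http://schema.org/addressCountry",
}


def expand(node):
    """Replace short keys and type names with full URLs (simplified)."""
    expanded = {}
    for key, value in node.items():
        if key == "@context":
            continue  # the context is consumed, not copied to the output
        if key == "type":
            # "type" is treated as an alias for the JSON-LD keyword @type
            expanded["@type"] = TERMS[value]
        elif isinstance(value, dict):
            expanded[TERMS[key]] = expand(value)
        else:
            expanded[TERMS[key]] = value
    return expanded


compact = {
    "@context": ["http://schema.org"],
    "type": "Person",
    "address": {
        "type": "PostalAddress",
        "streetAddress": "123 Main St.",
        "addressLocality": "Blacksburg",
        "addressRegion": "VA",
        "postalCode": "24060",
        "addressCountry": "US",
    },
}
full = expand(compact)
```

An unknown key raises a KeyError in this sketch, which is one possible policy for terms the software does not understand; a full JSON-LD processor behaves differently and also handles arrays, typed values, and scoped contexts.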

While this form is an acceptable way to express the information such that it is unambiguous, many software developers would prefer to use more idiomatic JSON. The use of @context enables idiomatic JSON without losing global consistency and without the need for a centralized registry or authority for creating extensions. Note that @context can also have more than one value. In this case, a JSON array is used to express multiple values, where each value references another context that defines terms. Using this mechanism we can first bring in the terms defined in the Verifiable Credentials Data Model specification and then bring in the terms defined by schema.org:

Example 8: Example address credential with schema.org context

{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "http://schema.org"
  ],
  ...
  "credentialSubject": {
    "type": "Person",
    "address": {
      "type": "PostalAddress",
      "streetAddress": "123 Main St.",
      "addressLocality": "Blacksburg",
      "addressRegion": "VA",
      "postalCode": "24060",
      "addressCountry": "US"
    }
  },
  ...
}

Note, however, that each context might have a different definition for the same term, e.g., the JSON key address might map to a different URL in each context. By default, [JSON-LD] allows terms in a @context to be redefined using a last term wins order. While these changes can be safely dealt with by using JSON-LD processing, we want to lower the burden on consumers of Verifiable Credentials. We want consumer software to be able to make assumptions about the meaning of terms by only having to read and understand the string value associated with the @context key. We don't want them to have to worry about terms being redefined in unexpected ways. That way their software can inspect only the @context values and then be hard coded to understand the meaning of the terms.

In order to prevent term redefinition, the [JSON-LD] @protected feature must be applied to term definitions in the @context. All terms in the core Verifiable Credentials @context are already protected in this way. The only time that an existing term is allowed to be redefined is if the new definition is scoped underneath another new term that is defined in a context. This matches developer expectations and ensures that consumer software has strong guarantees about the semantics of the data it is processing; it can be written such that it is never confused about the definition of a term. Note that consumers must determine their own risk profile for how to handle any credentials their software processes that include terms that it does not understand.
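
The difference between the default last term wins behavior and protected terms can be sketched in Python as follows. This is a simplified model under stated assumptions: contexts are flat dictionaries mapping terms to URLs, the function name merge_contexts and the URL https://example.org/vocab#address are hypothetical, and the exception for redefinition inside a scoped context is not modeled.

```python
class ProtectedTermError(ValueError):
    """Raised when a context tries to change a protected term."""


def merge_contexts(contexts):
    """Merge flat term maps in order; later definitions win unless protected."""
    active = {}        # term -> URL currently in force
    protected = set()  # terms that may no longer be changed
    for ctx in contexts:
        is_protected = ctx.get("@protected", False)
        for term, url in ctx.items():
            if term.startswith("@"):
                continue  # skip keywords such as @protected and @version
            # Repeating an identical definition is harmless; changing it is not.
            if term in protected and active[term] != url:
                raise ProtectedTermError(
                    f"cannot redefine protected term {term!r}")
            active[term] = url
            if is_protected:
                protected.add(term)
    return active


first = {"address": "https://example.org/vocab#address"}
second = {"address": "http://schema.org/address"}

# Without protection, the definition from the last context silently wins.
merged = merge_contexts([first, second])

# With protection on the first context, the same redefinition is rejected.
try:
    merge_contexts([{"@protected": True, **first}, second])
    redefinition_blocked = False
except ProtectedTermError:
    redefinition_blocked = True
```

The second call shows why protection lets consumer software rely on the @context values alone: a later context cannot quietly change what address means.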

Create a new JSON-LD context

Given the above, there is at least one reason why we don't want to use the schema.org context: it is designed to be very flexible and thus does not use the @protected feature. There are a few additional reasons we want to create our own [JSON-LD] context though. First, the schema.org context does not define our new credential type: ExampleAddressCredential. Second, it is not served via a secure protocol (e.g., https); rather, it uses http. Note that this is less of a concern than it may seem, as it is recommended that all Verifiable Credential consumer software hard code the @context values it understands and not reach out to the Web to fetch them. Lastly, it is a very large context, containing many more term definitions than are necessary for our purposes.
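
The recommendation to hard code the understood @context values can be sketched as a simple allowlist check in Python. The names KNOWN_CONTEXTS and contexts_understood are hypothetical, and the second URL is a placeholder for the context this walkthrough goes on to create; what to do with a credential that fails the check is left to the consumer's own risk profile.

```python
# Context URLs this consumer has been written to understand. The matching
# context documents would be bundled with the software, not fetched.
KNOWN_CONTEXTS = frozenset({
    "https://www.w3.org/2018/credentials/v1",
    "https://example.org/example-address-credential-context/v1",
})


def contexts_understood(credential):
    """Return True only if every @context entry is a known URL string."""
    contexts = credential.get("@context", [])
    if isinstance(contexts, str):
        contexts = [contexts]  # a single context may appear without an array
    # Inline context objects (dicts) are rejected here, because they could
    # define terms this software was never written to understand.
    return all(isinstance(c, str) and c in KNOWN_CONTEXTS for c in contexts)
```

For example, a credential whose @context lists only the two URLs above passes the check, while one that lists http://schema.org does not.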

So, we will create our own [JSON-LD] context that expresses just those term definitions that we need for our new credential type. Note that this does not mean that we must mint new URLs; we can still reuse the schema.org vocabulary terms. All we are doing is creating a more concise and targeted context. Here's what we'll need in our context:

Example 9: Example address credential context

{
  "@version": 1.1,
  "@protected": true,

  "ExampleAddressCredential":
    "https://example.org/ExampleAddressCredential",

  "Person": {
    "@id": "http://schema.org/Person",
    "@context": {
      "@version": 1.1,
      "@protected": true,

      "address": "http://schema.org/address"
    }
  },
  "PostalAddress": {
    "@id": "http://schema.org/PostalAddress",
    "@context": {
      "@version": 1.1,
      "@protected": true,

      "streetAddress": "http://schema.org/streetAddress",
      "addressLocality": "http://schema.org/addressLocality",
      "addressRegion": "http://schema.org/addressRegion",
      "postalCode": "http://schema.org/postalCode",
      "addressCountry": "http://schema.org/addressCountry"
    }
  }
}

The above context defines a term for our new credential type ExampleAddressCredential, mapping it to the URL https://example.org/ExampleAddressCredential. We could have also chosen a URI like urn:private-example:ExampleAddressCredential, but this approach would not allow us to serve up a Web page to describe it, if we so desire. The context also defines the terms for types Person and PostalAddress, mapping them to their schema.org vocabulary URLs. Furthermore, when those types are used, it also defines protected terms for each of them via a scoped context, mapping terms like address and streetAddress to their schema.org vocabulary URLs. For more information on how to write a JSON-LD context or scoped contexts, see the [JSON-LD] specification.

Select a publishing location

Now that we have a [JSON-LD] context, we must give it a URL. Technically speaking, we could just use a URI, for example, a private URN such as urn:private-example:my-extension. However, if we want people to be able to read and discover it on the Web, we should give it a URL like https://example.org/example-address-credential-context/v1.

When this URL is dereferenced, it should return application/ld+json by default, to allow JSON-LD processors to process the context. However, if a user agent requests HTML, it should return human readable text that explains, to humans, what the term definitions are and what they map to. Since we're reusing an existing vocabulary, schema.org, we can also simply link to the definitions of the meaning of our types and terms via their website. If we had created our own new vocabulary terms, we would describe them on our own site, ideally including machine-readable information as well.
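
The default-to-JSON-LD behavior described above might be sketched in Python like this. The function name choose_media_type is hypothetical, and the sketch deliberately ignores q-values in the Accept header; it returns the first supported media type the client lists and falls back to application/ld+json otherwise.

```python
SUPPORTED = ("application/ld+json", "text/html")


def choose_media_type(accept_header):
    """Pick the representation to serve for a context URL (simplified)."""
    for part in (accept_header or "").split(","):
        # Drop parameters such as ";q=0.9" and normalize case.
        media_type = part.split(";")[0].strip().lower()
        if media_type in SUPPORTED:
            return media_type
    # No header, "*/*", or only unsupported types: serve the context itself.
    return "application/ld+json"
```

With this logic a typical browser request, which lists text/html first, receives the human readable page, while a JSON-LD processor or a client that sends no Accept header receives the machine-readable context.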

Use the new JSON-LD context when issuing new credentials

Now we're ready for our context to be used by anyone who wishes to issue an ExampleAddressCredential!

Example 10: Example address credential with the new context

{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://example.org/example-address-credential-context/v1"
  ],
  "id": "https://example.org/credentials/1234",
  "type": "ExampleAddressCredential",
  "issuer": "https://example.org/people#me",
  "issuanceDate": "2017-12-05T14:27:42Z",
  "credentialSubject": {
    "id": "did:example:1234",
    "type": "Person",
    "address": {
      "type": "PostalAddress",
      "streetAddress": "123 Main St.",
      "addressLocality": "Blacksburg",
      "addressRegion": "VA",
      "postalCode": "24060",
      "addressCountry": "US"
    }
  },
  "proof": { ... }
}

Note that writing this new credential type requires permission from no one; you need only adhere to the standards referenced above.
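
As a rough illustration of the issuing step, the Python sketch below assembles an unsigned credential of the new type. The function name build_address_credential is hypothetical, and the proof is left out entirely because it depends on the chosen proof format. The result mirrors Example 10, except that type also lists VerifiableCredential, which the Verifiable Credentials Data Model expects of every verifiable credential.

```python
VC_CONTEXT = "https://www.w3.org/2018/credentials/v1"
ADDRESS_CONTEXT = "https://example.org/example-address-credential-context/v1"


def build_address_credential(credential_id, issuer, issued, subject_id, address):
    """Return an unsigned ExampleAddressCredential as a plain dict."""
    return {
        # Order matters: the core context comes first, then the extension.
        "@context": [VC_CONTEXT, ADDRESS_CONTEXT],
        "id": credential_id,
        "type": ["VerifiableCredential", "ExampleAddressCredential"],
        "issuer": issuer,
        "issuanceDate": issued,
        "credentialSubject": {
            "id": subject_id,
            "type": "Person",
            "address": {"type": "PostalAddress", **address},
        },
    }


credential = build_address_credential(
    "https://example.org/credentials/1234",
    "https://example.org/people#me",
    "2017-12-05T14:27:42Z",
    "did:example:1234",
    {
        "streetAddress": "123 Main St.",
        "addressLocality": "Blacksburg",
        "addressRegion": "VA",
        "postalCode": "24060",
        "addressCountry": "US",
    },
)
```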

9.2 Extending JWTs

The Verifiable Credentials Data Model 1.0 specifies a minimal set of JWT claim names that are to be used to represent the properties of a verifiable credential and its credentialSubject. Implementers may wish to extend a verifiable credential with some properties that are new (e.g., drivingLicenseNumber, mySpecialProperty) or that are already registered with IANA as JWT claim names (e.g., given_name, phone_number_verified).

As the Verifiable Credentials Data Model 1.0 states, such extension properties are best placed directly in either the JWT vc claim or the credentialSubject property of the vc claim as appropriate, although they MAY be placed directly into their own JWT claims.

If implementers wish to use JWT claim names for these extensions, the following steps are recommended. Note that there are three types of JWT claim name: public, named with a URI; private, named with a local name; and registered with IANA.

First, check with IANA (https://www.iana.org/assignments/jwt/jwt.xhtml) to see if the JWT claim name already exists.
If it does not exist, the implementer may wish to either give it a public name (i.e., a URI), give it a local name (i.e., any string), or register it with IANA.
Once the JWT claim name exists, define encoding/decoding transformation rules to convert the verifiable credential property or credentialSubject property into the JWT claim.
- Encoding: Remove the property from the verifiable credential, encode it according to the defined rule, and place it in the JWT claim.
- Decoding: Remove the value from the JWT claim, decode it according to the defined rule, and place it in the new verifiable credential JSON object, as either a property of the verifiable credential or the credentialSubject, as appropriate.
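
The encoding and decoding steps above can be sketched in Python as a pair of functions. Everything here is illustrative: the function names are hypothetical, the transformation rule defaults to the identity function, and the sketch only covers properties of the credentialSubject. The sample property givenName is an assumed name that is mapped to the IANA-registered claim given_name.

```python
import copy


def encode_property(credential, prop, claim_name, encode=lambda value: value):
    """Move credentialSubject[prop] out of the credential into a JWT claim."""
    vc = copy.deepcopy(credential)              # leave the caller's data intact
    value = vc["credentialSubject"].pop(prop)   # remove from the credential
    return {"vc": vc, claim_name: encode(value)}


def decode_property(claims, claim_name, prop, decode=lambda value: value):
    """Move a JWT claim back into the credentialSubject of the credential."""
    remaining = dict(claims)
    value = remaining.pop(claim_name)           # remove from the JWT claims
    vc = copy.deepcopy(remaining["vc"])
    vc["credentialSubject"][prop] = decode(value)
    return vc


original = {
    "type": ["VerifiableCredential"],
    "credentialSubject": {"id": "did:example:1234", "givenName": "Alice"},
}
claims = encode_property(original, "givenName", "given_name")
restored = decode_property(claims, "given_name", "givenName")
```

A useful property to check for any such rule pair is that decoding what was encoded yields the original credential, as restored does here.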

9.3 Human Readability

The JSON-LD Context declaration mechanism is used by implementations to signal the context in which the data transmission is happening to consuming applications:

Example 11: Use of @context mechanism

{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://www.w3.org/2018/credentials/examples/v1"
  ],
  "id": "http://example.edu/credentials/1872",
  ...

Extension authors are urged to publish two types of information at the context URLs. The first type of information is for machines, and is the machine-readable JSON-LD Context. The second type of information is for humans, and should be an HTML document. It is suggested that the default mode of operation is to serve the machine-readable JSON-LD Context as that is the primary intended use of the URL. If content-negotiation is supported, requests for text/html should result in a human readable document. The human readable document should at least contain usage information for the extension, such as the expected order of URLs associated with the @context property, specifications that elaborate on the extension, and examples of typical usage of the extension.

10. Proof Formats

This section is non-normative.

The verifiable credentials data model is designed to be proof format agnostic. The specification does not normatively require any particular digital proof or signature format. While the data model is the canonical representation of a verifiable credential or verifiable presentation, the proving mechanisms for these are often tied to the syntax used in the transmission of the document between parties. As such, each proofing mechanism has to specify whether the validation of the proof is calculated against the state of the document as transmitted, against the transformed data model, or against another form. At the time of publication, at least two proof formats are being actively utilized by implementers, and the Working Group felt that documenting what these proof formats are and how they are being used would be beneficial to other implementers.

This guide provides tables in section Benefits of JWTs and section Benefits of JSON-LD and LD-Proofs that compare three syntax and proof format ecosystems: JSON+JWTs, JSON-LD+JWTs, and JSON-LD+LD-Proofs.

Because the Verifiable Credentials Data Model is extensible, and agnostic to any particular proof format, the specification and use of additional proof formats is supported.

10.1 Benefits of JWTs

The Verifiable Credentials Data Model is designed to be compatible with a variety of existing and emerging syntaxes and digital proof formats. Each approach has benefits and drawbacks. The following table is intended to summarize a number of these native trade-offs.

The table below compares three syntax and proof format ecosystems: JSON+JWTs, JSON-LD+JWTs, and JSON-LD+LD-Proofs.

Feature	JSON + JWTs	JSON‑LD + JWTs	JSON‑LD + LD‑Proofs
PF1a. Proof format supports Zero-Knowledge Proofs.	✓	✓	✓
PF2a. Proof format supports arbitrary proofs such as Proof of Work, Timestamp Proofs, and Proof of Stake.	✓	✓	✓
PF3a. Based on existing official standards.	✓	✖	✖
PF4a. Designed to be small in size.	✓	✖	✖
PF5a. Offline support without further processing.	✓	✖	✖
PF6a. Wide adoption in other existing standards.	✓	✓	✖
PF7a. No type ambiguity.	✓	✖	✖
PF8a. Broad library support.	✓	✖	✖
PF9a. Easy to understand what is signed.	✓	✓	✖
PF10a. Ability to be used as authn/authz token with existing systems.	✓	✓	✖
PF11a. No additional canonicalization required.	✓	✖	✖
PF12a. No Internet PKI required.	✓	✖	✖
PF13a. No resolution of external documents needed.	✓	✖	✖

Note

Some of the features listed in the table above are debatable, since a feature can always be added to a particular syntax or digital proof format. The table is intended to identify native features of each combination such that no additional language design or extension is required to achieve the identified feature. Features that all languages provide, such as the ability to express numbers, have not been included for brevity. Find more information about different proof formats in the next section.

PF1a: Proof format supports Zero-Knowledge Proofs.: JWTs can embed proof attributes for repudiable proofs such as Zero-Knowledge Proofs. In that case, the JWS will not have a signature element.
PF2a: Proof format supports arbitrary proofs such as Proof of Work, Timestamp Proofs, and Proof of Stake.: JWTs can embed proof attributes for any type of proof, such as Proof of Work, Timestamp Proofs, and Proof of Stake.
PF3a: Based on existing official standards.: JSON and JWT are proposed and mature IETF standards. While JSON-LD 1.0 is in REC state in W3C, JSON-LD 1.1 is still in WD state. LD-Proofs are not standardized at all.
PF4a: Designed to be small in size.: JSON was invented as a simple data format to be transmitted on the wire. A verifiable credential can be expressed by its attributes only, without the necessity to introduce additional meta-information such as @context. This makes the resulting JSON+JWT credential typically also smaller in size.
PF5a: Offline support without further processing.: A JWT can fully describe itself without the need to retrieve or verify any external documents. JSON-LD requires the context to be queryable and requires further documents to be accessible to check the prevalent document, e.g., LD-Proof. Additional caching needs to be implemented to support offline use cases.
PF6a: Wide adoption in other existing standards.: JWT is used in many other existing standards, e.g., OAuth2 and OpenID Connect. This allows for backward compatibility with existing authentication and authorization frameworks with no or only minor modifications to these legacy systems.
PF7a: No type ambiguity.: It is best practice that the attributes in a JSON data structure do not change type. JSON-LD has implicit support for compact form serialization, which collapses an array with a single element into that element and thereby switches its data type. Developers writing parsers have to implement special handling for these data types, which results in more code, is more error-prone, and sometimes rules out parsers based on code generation, which rely on static types.
PF8a: Broad library support.: JWT and JSON, due to their maturity and standardization, have a lot of open-source library support. While JSON-LD 1.0 is a standard and has support for different programming languages, it is still behind JSON, which is often part of the native platform toolchain, e.g., JavaScript. For LD-Proofs, on the other hand, only a few scattered libraries exist.
PF9a: Easy to understand what is signed.: JWT makes it visible what is signed, in contrast to LD-Proofs, e.g., LD Signatures, which are detached from the actual payload and contain links to external documents, making it hard for a developer to figure out what is covered by the signature.
PF10a: Ability to be used as authn/authz token with existing systems.: Many existing applications rely on JWT for authentication and authorization purposes. In theory, developers maintaining these applications could leverage JWT-based verifiable presentations in their current systems with minor or no modifications. LD-Proofs represent a new approach which would require more work to achieve the same result.
PF11a: No additional canonicalization required.: Beyond base64 URL encoding, JSON and JWT don't require any canonicalization to be transmitted on the wire. The JWS can be calculated on any data inside of the payload. This results in less computation, less complexity, and light-weight libraries compared to JSON-LD and LD-Proofs where canonicalization is required.
PF12a: No Internet PKI required.: JSON-LD and LD-Proofs rely on resolving external documents, e.g., @context. This means that a verifiable credential system would rely on existing Internet PKI to a certain extent and cannot be fully decentralized. A JWT-based system does not need to introduce this dependency.
PF13a: No resolution of external documents needed.: JSON-LD and LD-Proofs require the resolution of external documents, which leads to an increased network load for the verifier of a verifiable presentation. This needs to be mitigated through caching strategies.
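
The claims in PF9a and PF11a can be made concrete with a short Python sketch of a compact JWS. It uses only the standard library, so it signs with HS256 and a throwaway demo key purely for illustration; deployments of verifiable credentials would more likely use an asymmetric algorithm through a maintained JWT library. The helper names b64url and sign_hs256 are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in compact JWS."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(header_json: bytes, payload_json: bytes, key: bytes) -> str:
    """Return header.payload.signature over the exact bytes given."""
    signing_input = f"{b64url(header_json)}.{b64url(payload_json)}"
    signature = hmac.new(key, signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"


key = b"demo-key-not-for-production"
header = b'{"alg":"HS256","typ":"JWT"}'

# The same data, serialized with and without extra whitespace.
payload_compact = b'{"vc":{"type":["VerifiableCredential"]}}'
payload_spaced = b'{ "vc": { "type": ["VerifiableCredential"] } }'
same_data = json.loads(payload_compact) == json.loads(payload_spaced)

token_compact = sign_hs256(header, payload_compact, key)
token_spaced = sign_hs256(header, payload_spaced, key)
```

Because the signature covers the encoded bytes exactly as transmitted, what is signed is directly visible in the token and no canonicalization step is needed. The trade-off is that the two payloads above, although they parse to the same data, produce different tokens.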

10.2 Benefits of JSON-LD and LD-Proofs

The table below compares three syntax and proof format ecosystems: JSON+JWTs, JSON-LD+JWTs, and JSON-LD+LD-Proofs. Readers should be aware that Zero-Knowledge Proofs are currently proposed as a sub-type of LD-Proofs and thus fall into the final column below.

Feature	JSON + JWTs	JSON‑LD + JWTs	JSON‑LD + LD‑Proofs
PF1b. Support for open world data modelling.	✖	✓	✓
PF2b. Universal identifier mechanism for JSON objects via the use of URIs.	✖	✓	✓
PF3b. A way to disambiguate properties shared among different JSON documents by mapping them to IRIs via a context.	✖	✓	✓
PF4b. A mechanism to refer to data in an external document, where the data may be merged with the local document without a merge conflict in semantics or structure.	✖	✓	✓
PF5b. The ability to annotate strings with their language.	✖	✓	✓
PF6b. A way to associate arbitrary datatypes, such as dates and times, with arbitrary property values.	✖	✓	✓
PF7b. A facility to express one or more directed graphs, such as a social network, in a single document.	✖	✓	✓
PF8b. Supports signature sets.	✖	✖	✓
PF9b. Embeddable in HTML such that search crawlers will index the machine-readable content.	✖	✖	✓
PF10b. Data on the wire is easy to debug and serialize to database systems.	✖	✖	✓
PF11b. Nesting signed data does not cause data size to double for every embedding.	✖	✖	✓
PF12b. Proof format supports Zero-Knowledge Proofs.	✖	✖	✓
PF13b. Proof format supports arbitrary proofs such as Proof of Work, Timestamp Proofs, and Proof of Stake.	✖	✖	✓
PF14b. Proofs can be expressed unmodified in other data syntaxes such as YAML, N-Quads, and CBOR.	✖	✖	✓
PF15b. Changing property-value ordering, or introducing whitespace does not invalidate signature.	✖	✖	✓
PF16b. Designed to easily support experimental signature systems.	✖	✖	✓
PF17b. Supports signature chaining.	✖	✖	✓
PF18b. Does not require pre-processing or post-processing.	✖	✖	✓
PF19b. Canonicalization requires only base-64 encoding.	✖	✖	✓

Note

PF1b: Support for open world data modelling: An open world data model is one where any entity can make any statement about anything while simultaneously ensuring that the semantics of the statement are unambiguous. This specification is enabled by an open world data model called Linked Data. One defining characteristic of supporting an open world data model is the ability to specify the semantic context in which data is being expressed. JSON-LD provides this mechanism via the @context property. JSON has no such feature.
PF2b: Universal identifier mechanism for JSON objects via the use of URIs.: All entities in a JSON-LD document are identified either via an automatic URI, or via an explicit URI. This enables all entities in a document to be unambiguously referenced. JSON does not have a native URI type nor does it require objects to have one, making it difficult to impossible to unambiguously identify an entity expressed in JSON.
PF3b: A way to disambiguate properties shared among different JSON documents by mapping them to IRIs via a context.: All object properties in a JSON-LD document, such as the property "homepage", are either keywords or they are mapped to an IRI. This feature enables open world systems to identify the semantic meaning of the property in an unambiguous way, which enables seamless merging of data between disparate systems. JSON object properties are not mapped to IRIs, which results in ambiguities with respect to the semantic meaning of the property. For example, one JSON document might use "title" (meaning "book title") in a way that is semantically incompatible with another JSON document using "title" (meaning "job title").
PF4b: A mechanism to refer to data in an external document, where the data may be merged with the local document without a merge conflict in semantics or structure.: JSON-LD provides a mechanism that enables a data value to use a URL to refer to data outside of the local document. This external data may then be automatically merged with the local document without a merge conflict in semantics or structure. This feature enables a system to apply the "follow your nose" principle to discover a richer set of data that is associated with the local document. While a JSON document can contain pointers to external data, interpreting the pointer is often application specific and usually does not support merging the external data to construct a richer data set.
PF5b: The ability to annotate strings with their language.: JSON-LD enables a developer to specify the language, such as English, French, or Japanese, in which a text string is expressed via the use of language tags. JSON does not provide such a feature.
PF6b: A way to associate arbitrary datatypes, such as dates and times, with arbitrary property values.: JSON-LD enables a developer to specify the data type of a property value, such as Date, unsigned integer, or Temperature by specifying it in the JSON-LD Context. JSON does not provide such a feature.
PF7b: A facility to express one or more directed graphs, such as a social network, in a single document.: JSON-LD's abstract data model supports the expression of information as a directed graph of labeled nodes and edges, which enables an open world data model to be supported. JSON's abstract data model only supports the expression of information as a tree of unlabeled nodes and edges, which restricts the types of relationships and structures that can be natively expressed in the language.
PF8b: Supports signature sets.: A signature set is an unordered set of signatures over a data payload. Use cases, such as cryptographic signatures applied to a legal contract, typically require more than one signature to be associated with the contract in order to legally bind two or more parties under the terms of the contract. Linked Data Proofs, including Linked Data Signatures, natively support sets of signatures. JWTs only enable a single signature over a single payload.
PF9b: Embeddable in HTML such that search crawlers will index the machine-readable content.: All major search crawlers natively parse and index information expressed as JSON-LD in HTML pages. LD-Proofs enable the current data format that search engines use to be extended to support digital signatures. JWTs have no mechanism to express data in HTML pages and are currently not indexed by search crawlers.
PF10b: Data on the wire is easy to debug and serialize to database systems.: When developers are debugging software systems, it is beneficial for them to be able to see the data that they are operating on using common debugging tools. Similarly, it is useful to be able to serialize data from the network to a database and then from the database back out to the network using a minimal number of pre- and post-processing steps. LD-Proofs enable developers to use common JSON tooling without having to convert the data into a different format or structure. JWTs base-64 encode payload information, resulting in complicated pre- and post-processing steps to convert the data into JSON data while not destroying the digital signature. Similarly, schema-less databases, which are typically used to index JSON data, cannot index information that is expressed in an opaque base-64 encoded wrapper.
PF11b: Nesting signed data does not cause data size to double for every embedding.: When a JWT is encapsulated by another JWT, the entire payload must be base-64 encoded in the initial JWT, and then base-64 encoded again in the encapsulating JWT. This is often necessary when a cryptographic signature is required on a document that contains a cryptographic signature, such as when a Notary signs a document that has been signed by someone else seeking the Notary's services. LD-Proofs do not require base-64 encoding the signed portion of a document and instead rely on a canonicalization process that is just as secure, and that only requires the cryptographic signature to be encoded instead of the entire payload.
PF12b: Proof format supports Zero-Knowledge Proofs.: The LD-Proof format is capable of modifying the algorithm that generates the hash or hashes that are cryptographically signed. This cryptographic agility enables digital signature systems, such as Zero-Knowledge Proofs, to be layered on top of LD-Proofs instead of requiring an entirely new digital signature container format to be created. JWTs are designed such that an entirely new digital signature container format will be required to support Zero-Knowledge Proofs.
PF13b: Proof format supports arbitrary proofs such as Proof of Work, Timestamp Proofs, and Proof of Stake.: The LD-Proof format was designed with a broader range of proof types in mind and supports cryptographic proofs beyond simple cryptographic signatures. These proof types are in common usage in systems such as decentralized ledgers and provide additional guarantees to verifiable credentials, such as the ability to prove that a particular claim was made at a particular time or that a certain amount of energy was expended to generate a particular credential. The JWT format does not support arbitrary proof formats.
PF14b: Proofs can be expressed unmodified in other data syntaxes such as XML, YAML, N-Quads, and CBOR.: The LD-Proof format utilizes a canonicalization algorithm to generate a cryptographic hash that is used as an input to the cryptographic proof algorithm. This enables the bytes generated as the cryptographic proof to be compact and expressible in a variety of other syntaxes such as XML, YAML, N-Quads, and CBOR. Since JWTs require the use of JSON to be generated, they are inextricably tied to the JSON syntax.
PF15b: Changing property-value ordering, or introducing whitespace does not invalidate signature.: Since LD-Proofs utilize a canonicalization algorithm, the introduction of whitespace, or a change in property-value ordering, that does not change the meaning of the information being expressed has no effect on the final cryptographic hash over the information. This means that simple changes in formatting, such as those made when writing data to a schema-less database and then retrieving the same information from the same database, do not cause the digital signature to fail. JWTs encode the payload using the base-64 format, which is not resistant to formatting changes that have no effect on the information expressed. This shortcoming of JWTs makes it challenging to, for example, express signed data in web pages that search crawlers index.
PF16b: Designed to easily support experimental signature systems.: The LD-Proof format is naturally extensible, not requiring the format to be extended in a formal international standards working group in order to prevent namespace collisions. The JWT format requires entries in a centralized registry in order to avoid naming collisions and does not support experimentation as easily as the LD-Proof format does. LD-Proof format extension is done through the decentralized publication of cryptographic suites that are guaranteed to not conflict with other LD-Proof extensions. This approach enables developers to easily experiment with new cryptographic signature mechanisms that support selective disclosure, zero-knowledge proofs, and post-quantum algorithms.
PF17b: Supports signature chaining.: A signature chain is an ordered set of signatures over a data payload. Use cases, such as cryptographic signatures applied to a notarized document, typically require a signature by the signing party and then an additional one by a notary to be made after the original signing party has made their signature. Linked Data Proofs, including Linked Data Signatures, natively support chains of signatures. JWTs only enable a single signature over a single payload.
PF18b: Does not require pre-processing or post-processing.: In order to encode a verifiable credential or a verifiable presentation in a JWT, an extra set of steps is required to convert the data to and from the JWT format. No such extra conversion steps are required for verifiable credentials and verifiable presentations protected by LD-Proofs.
PF19b: Canonicalization requires only base-64 encoding.: The JWT format utilizes a simple base-64 encoding format to generate the cryptographic hash of the data. The encoding format for LD-Proofs requires a more complex canonicalization algorithm to generate the cryptographic hash. The benefits of the JWT approach are simplicity at the cost of encoding flexibility. The benefits of the LD-Proof approach are flexibility at the cost of implementation complexity.
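
The contrast drawn in PF11b, PF15b, and PF19b can be illustrated with a minimal sketch. It is not an implementation of either format. The sorted-key JSON serialization below is only a stand-in for the canonicalization algorithm that LD-Proofs rely on, and the base-64 step simplifies the real JWT signing input. The names credential, b64_digest, canonical_digest, and nested_sizes are hypothetical.

```python
import base64
import hashlib
import json

# Hypothetical claim data, used only for illustration.
credential = {"name": "Alice", "degree": "BSc"}

# Two serializations of the same data: different key order and whitespace.
compact = json.dumps(credential, separators=(",", ":"))
pretty = json.dumps(dict(reversed(list(credential.items()))), indent=2)


def b64_digest(serialized: str) -> str:
    """Hash the base64url encoding of the exact bytes (JWT-style input, simplified)."""
    encoded = base64.urlsafe_b64encode(serialized.encode("utf-8"))
    return hashlib.sha256(encoded).hexdigest()


def canonical_digest(serialized: str) -> str:
    """Hash a canonical form: parse, sort keys, drop insignificant whitespace.

    Assumption: sorted-key JSON stands in for a real canonicalization algorithm.
    """
    parsed = json.loads(serialized)
    canonical = json.dumps(parsed, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def nested_sizes(payload: bytes, depth: int) -> list:
    """Size of the payload after each successive layer of base64url encoding."""
    sizes = [len(payload)]
    for _ in range(depth):
        payload = base64.urlsafe_b64encode(payload)
        sizes.append(len(payload))
    return sizes


# Byte-level hashing is sensitive to formatting; canonical hashing is not.
print(b64_digest(compact) == b64_digest(pretty))
print(canonical_digest(compact) == canonical_digest(pretty))
# Every extra layer of base-64 encoding grows the payload by about one third.
print(nested_sizes(b"x" * 300, 3))
```

The trade-off matches PF19b: the base-64 path needs no parsing at all, while the canonical path has to understand the data before it can hash it.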

11. Zero-Knowledge Proofs

This section is non-normative.

The Verifiable Credentials Data Model is designed to be compatible with a variety of existing and emerging digital proof formats. Each proof format has benefits and drawbacks. Many proof formats cannot selectively reveal attribute values from a verifiable credential; they can only reveal all (or none).

Zero-Knowledge Proofs (ZKPs) are a proof format that enables data-minimization features in verifiable presentations, such as selective disclosure and predicate proofs.

Full Disclosure

Currently, disclosing data is an all-or-nothing process, whether online or off. Many digital identity systems reveal all the attributes in a digital credential. The simplest method for signing a verifiable credential signs the entire credential and, when the credential is presented, fully discloses all the attributes.

Along with a full disclosure of all the attributes in a verifiable credential, standard verifiable presentations reveal the actual signature. With both the data and signature in hand, a verifier has a complete copy of the credential. Without care, this could enable the verifier to impersonate the holder. Also, since the signature is the same every time this credential is presented, the signature itself is a unique identifier and becomes PII (personally identifiable information).

It is also possible to fully disclose the attributes in a zero-knowledge verifiable credential. Unlike non-ZKP methods, zero-knowledge methods do not reveal the actual signature; instead, they only reveal a cryptographic proof of a valid signature. Only the holder of the signature has the information needed to present the credential to a verifier. This means that zero-knowledge methods provide a holder additional protection from impersonation. Because the signature is not revealed, it also cannot be used as a unique identifier.

Selective Disclosure

Selective disclosure means that a holder doesn't have to reveal all of the attributes contained in a verifiable credential. This reduces the liability of handling or holding data that it is not necessary to share or collect.

Non-ZKP methods for selective disclosure often require the credential issuer to create a unique credential for each individual attribute, or possible combination of attributes. This could quickly become impractical because the number of possible combinations grows exponentially with the number of attributes. Atomic credentials (which only contain a single attribute) may also not guarantee that the data is properly paired when used in a verifiable presentation. For example, a holder has two vehicle credentials, one for a 2018 Mazda with 15,000 miles and the other for a 1965 Lincoln with 350,000 miles. With atomic credentials it may be possible to claim the user has a 1965 Lincoln with 15,000 miles.

Zero-knowledge methods allow a holder to choose which attributes to reveal and which attributes to withhold on a case-by-case basis without involving the issuer. The credential issuer only needs to provide a single verifiable credential that contains all of the attributes. Each attribute is individually incorporated into the signature. This enables two options: to reveal the attribute or to prove that you know the value of the attribute without revealing it. For example, a credential with attributes for name, birthdate, and address can be used in a presentation to reveal only your name.

Non-ZKP methods implementing selective disclosure often require the cooperation of the issuer. Selective disclosure using zero-knowledge methods gives the holder personal control over what to reveal. A verifiable presentation based on zero-knowledge proof mechanisms only contains those attributes and associated values that are required to satisfy the presentation requirements.

Predicate Proofs

A predicate proof is a proof that answers a true-or-false question. For example, "Are you over the age of 18?" Using non-ZKP methods, predicate proofs must be provided by the issuer as one of the attributes of a verifiable credential. This means that in order for a non-ZKP credential to be used to prove age-over-18, it would need to contain the attribute age-over-18. This credential could not be used to reveal your birthdate, unless it also included a birthdate claim. It also couldn't be used to prove age-over-25. To prove age-over-25, the holder would need to have received a credential with an age-over-25 claim.

Using zero-knowledge methods, predicate proofs can be generated by the holder at the time of presentation without issuer involvement. For example, a verifiable credential with the claim birthdate can be used in a verifiable presentation to prove age-over-18. The same credential could then be used in another presentation to prove age-over-25, all without revealing the holder's birthdate.

Revocation

Verifiable credentials may need to be revocable. If an issuer can revoke a credential, verifiers must be able to determine a credential's revocation status.

Non-ZKP methods for checking revocation status may require the verifier to directly contact the issuer. Less restrictive checks could be made against a list of revoked credential identifiers posted in a public registry. The holder is required to disclose the credential identifier to the verifier so that it can be checked. The verifier is then responsible for doing the work to check revocation.
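
As a minimal sketch of the non-ZKP registry check described above, the verifier can look up the disclosed credential identifier in a published list. The identifiers, the REVOKED_IDS set, and the is_revoked helper are hypothetical, and a real deployment would also have to authenticate the list itself.

```python
# Hypothetical list of revoked credential identifiers, as it might be
# fetched from a public registry by the verifier.
REVOKED_IDS = {
    "urn:example:credential:1872",
    "urn:example:credential:2044",
}


def is_revoked(credential: dict, revoked_ids: set) -> bool:
    """Return True if the credential's disclosed identifier is on the list.

    The holder has to reveal credential["id"] for this lookup to work,
    which is the disclosure that the non-ZKP approach requires.
    """
    return credential["id"] in revoked_ids


presented = {"id": "urn:example:credential:3732", "type": ["VerifiableCredential"]}
print(is_revoked(presented, REVOKED_IDS))
```

The lookup itself is cheap. The cost of this approach is that the same identifier is shown to every verifier and can be observed by anyone who sees the lookup.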

Using zero-knowledge methods, the credential identifier can be checked against a list of revoked credential identifiers without revealing the identifier. This reduces the ability of network monitors to correlate a holder's credential presentations, and removes the ability of an issuer to be made aware of the presentation of verifiable credentials they have issued.

Correlation

Correlation is the ability to link data from multiple interactions to a single user. Correlation can be performed by a verifier, by issuers and verifiers working together, or by a third party observing interactions on the network. Correlation is a way to collect data about a holder without the holder's consent or knowledge. It is also a way to deanonymize private transactions. For example, a holder might use a verifiable credential to prove they are authorized to vote, then submit a secret ballot. If it is possible to correlate the holder's credential with the secret ballot, thereby linking a specific vote to a specific voter, it would be detrimental to the democratic process and could enable retaliation.

One way to reduce correlation is through data minimization, by sharing only the information required to complete a transaction. Another way to reduce correlation is to make each interaction look unique. When interactions disclose unique identifiers, an observer can link multiple interactions to a single user. Non-ZKP methods with only a single identifier per user create correlation opportunities by embedding that identifier in multiple credentials or interactions. Zero-knowledge proofs remove this linkability between interactions.
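
The linking step that an observer performs can be sketched as a simple grouping by identifier. The interaction records, the verifier names, and the link_by_identifier helper are hypothetical. The second data set only mimics the effect of unlinkable interactions by using a different identifier each time; it is not a zero-knowledge proof.

```python
from collections import defaultdict


def link_by_identifier(interactions: list) -> dict:
    """Group observed interactions by the identifier each one disclosed."""
    linked = defaultdict(list)
    for interaction in interactions:
        linked[interaction["holder_id"]].append(interaction["verifier"])
    return dict(linked)


# One identifier reused everywhere: every interaction links to one user.
shared_id_interactions = [
    {"holder_id": "did:example:123456abcdef", "verifier": "bank"},
    {"holder_id": "did:example:123456abcdef", "verifier": "pharmacy"},
    {"holder_id": "did:example:123456abcdef", "verifier": "employer"},
]

# Each interaction looks unique: grouping yields nothing to link.
pairwise_interactions = [
    {"holder_id": "did:example:aaa111", "verifier": "bank"},
    {"holder_id": "did:example:bbb222", "verifier": "pharmacy"},
    {"holder_id": "did:example:ccc333", "verifier": "employer"},
]

print(link_by_identifier(shared_id_interactions))
print(link_by_identifier(pairwise_interactions))
```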

Non-ZKP methods that reveal all attributes and use unique identifiers are completely correlatable. Zero-knowledge methods enable data minimization and allow holders to have trusted interactions with verifiers without dependence on unique identifiers.

Although correlation can never be eliminated completely, the goal of zero-knowledge methods is to reduce the probability of correlation and to put control over the level of correlation into the hands of the verifiable credential holder.

Drawbacks

Zero-knowledge methods are more complex than non-ZKP methods. Cryptographic engineers must understand complicated protocols and write code to create libraries that support zero-knowledge methods. System implementers can then use these libraries without being exposed to the underlying complexity, but must trust that the implementation was done correctly. They can utilize the features of selective disclosure and bring the benefits of the method to their customers without a significant increase in effort over using non-ZKP methods.

Due to the underlying complexity, zero-knowledge methods require more CPU and memory to use. This also adds to the time required to create and verify proofs. This should be considered when using less capable devices such as IoT devices or older phones.

Another drawback of zero-knowledge proofs is that they tend to be larger than simple signatures.

There is a perception that zero-knowledge methods are new and untested. Zero-knowledge methods were first introduced in the 1980s as a way to guard secrets. Although they may not be well understood by the general public, they have received considerable review and scrutiny in the cryptographic community. They are considered just as secure as many common cryptographic techniques in use today.

12. Progressive Trust

This section is non-normative.

Entities that use verifiable credentials and verifiable presentations should follow protocols that enable progressive trust. Progressive trust refers to enabling individuals to share information about themselves only on an as-needed basis, slowly building up more trust as more information is shared with another party.

Progressive trust is strongly related to the principle of data minimization, and enabled by technologies such as selective disclosure and predicate proofs. We encourage the use of progressive trust as a guiding principle for implementers as they develop protocols for issuers, holders, and verifiers.

12.1 Data Minimization

Data minimization is a principle that encourages verifiers to request the minimum amount of data necessary from holders, and for holders to only provide the minimum amount of data to verifiers. This "minimum amount of data" depends on the situation and may change over the course of a holder's interaction with a verifier.

For example, a holder may apply for a loan, with a bank acting as the verifier. There are several points at which the bank may want to determine whether the holder is qualified to continue in the process of applying for the loan; for instance, the bank may have a policy of only providing loans to existing account holders. A protocol that follows the principle of data minimization would allow the holder to reveal to the verifier only that they are an existing account holder, before the bank requests any additional information, such as account balances or employment status. In this way, the applicant may progressively entrust the bank with more information, as the data needed by the bank to make its determinations is requested a piece at a time, as needed, rather than as a complete set, up front.
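
The loan example can be sketched as a staged exchange in which each claim is requested only after the previous check has passed. This is a sketch under stated assumptions, not a protocol definition: the claim names, the acceptance checks, the balance threshold, and the run_staged_requests helper are all hypothetical, and plain values stand in for what would be verifiable presentations.

```python
def run_staged_requests(holder_claims: dict, stages: list) -> dict:
    """Disclose one claim per stage and stop as soon as a check fails.

    Returns only the claims that were actually disclosed to the verifier.
    """
    disclosed = {}
    for claim_name, is_acceptable in stages:
        value = holder_claims[claim_name]
        disclosed[claim_name] = value
        if not is_acceptable(value):
            break  # The bank stops here and requests nothing further.
    return disclosed


# Hypothetical bank policy: each stage is (claim name, acceptance check).
LOAN_STAGES = [
    ("existingAccountHolder", lambda value: value is True),
    ("employmentStatus", lambda value: value == "employed"),
    ("accountBalance", lambda value: value >= 1000),
]

applicant = {
    "existingAccountHolder": False,
    "employmentStatus": "employed",
    "accountBalance": 5000,
}

# An applicant who is not an account holder never reveals the other claims.
print(run_staged_requests(applicant, LOAN_STAGES))
```

Requesting the complete set up front would have exposed all three values even though the first one already decided the outcome.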

12.2 Selective Disclosure

Selective disclosure is the ability of a holder to select some elements of a verifiable credential to share with a verifier, without revealing the rest. Several different methods support selective disclosure; three examples are provided here:

Atomic Credentials - These are verifiable credentials which consist of a single claim. An issuer may provide a set of atomic credentials that duplicates the claims of a standard credential. This atomicity allows a holder to disclose only those claims which need to be revealed to a verifier, rather than requiring all of the claims of a standard credential to be revealed.
Selective Disclosure Signatures - Certain signature schemes natively support selective disclosure of verifiable credential claims. One example of these is Camenisch-Lysyanskaya signatures. Such signatures allow a holder to disclose only those claims which need to be revealed to a verifier, rather than requiring all of the credential's claims to be revealed.
Hashed Values - With this method, the issuer issues a single verifiable credential containing all the issuer's claims about the subject. However, each claim value is created by hashing the actual value with a different nonce so that the verifier cannot determine the actual value. There are several different ways of modeling this, and no standard way is currently defined. The holder includes the actual values of the claims that are to be revealed to the verifier in the verifiable presentation.
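
Because no standard model for the hashed values method is defined, the following is only one possible sketch. It assumes that the issuer hashes each value together with a per-claim random nonce using SHA-256, and that the holder reveals both the value and its nonce for each disclosed claim. The function names are hypothetical, and the issuer's signature over the digests is omitted.

```python
import hashlib
import secrets


def hash_claim(value: str, nonce: str) -> str:
    """Digest of one claim value combined with its nonce (assumed construction)."""
    return hashlib.sha256(f"{nonce}:{value}".encode("utf-8")).hexdigest()


def issue(claims: dict) -> tuple:
    """Issuer side: return (digests for the credential, disclosures for the holder)."""
    digests, disclosures = {}, {}
    for name, value in claims.items():
        nonce = secrets.token_hex(16)  # A different nonce for every claim.
        digests[name] = hash_claim(str(value), nonce)
        disclosures[name] = {"value": str(value), "nonce": nonce}
    return digests, disclosures


def verify_disclosure(digests: dict, name: str, disclosure: dict) -> bool:
    """Verifier side: recompute the digest from the revealed value and nonce."""
    expected = hash_claim(disclosure["value"], disclosure["nonce"])
    return digests.get(name) == expected


digests, disclosures = issue({"name": "Alice", "birthdate": "1990-01-01"})

# The holder reveals only the name; the birthdate digest stays opaque.
print(verify_disclosure(digests, "name", disclosures["name"]))
```

The nonce matters because many claim values, such as birthdates, come from a small set that a verifier could otherwise guess by hashing candidates.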

12.3 Predicates

Another technique which may be used to support progressive trust is to use predicates as the values of revealed claims. Predicates allow a holder to provide True/False values to a verifier rather than revealing claim values.

Predicate proofs may be enabled by verifiable credential issuers as claims, e.g., the credentialSubject may include an ageOver18 property rather than a birthdate property. This would allow holders to provide proof that they are over 18 without revealing their birthdates.
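
A minimal sketch of the issuer-provided approach: the issuer computes the predicate from the birthdate at issuance time and places only the boolean in credentialSubject. This is not a zero-knowledge proof. The helper names and the subject identifier are hypothetical, and treating "over 18" as "18 or older" is an assumption.

```python
from datetime import date


def age_on(birthdate: date, today: date) -> int:
    """Whole years elapsed between birthdate and today."""
    years = today.year - birthdate.year
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1  # This year's birthday has not happened yet.
    return years


def predicate_subject(subject_id: str, birthdate: date, today: date, threshold: int = 18) -> dict:
    """Build a credentialSubject that carries the predicate, not the birthdate."""
    return {
        "id": subject_id,
        f"ageOver{threshold}": age_on(birthdate, today) >= threshold,
    }


subject = predicate_subject(
    "did:example:123456abcdef", date(2000, 6, 15), date(2019, 9, 24)
)
print(subject)
```

The limitation is visible in the signature of predicate_subject: the threshold is fixed when the credential is issued, so a different threshold requires a new credential.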

Certain signature types enable predicate proofs by allowing claims from a standard verifiable credential to be presented as predicates. For example, a Camenisch-Lysyanskaya signed verifiable credential that contains a credentialSubject with a birthdate property may be included in a verifiable presentation as a derived credential that contains an ageOver18 property.

12.4 Further Techniques

The examples provided in this section are intended to illustrate some possible mechanisms for supporting progressive trust, not to provide an exhaustive or comprehensive list of all the ways progressive trust may be supported. Research in this area continues with the use of cutting-edge proof techniques such as zk-SNARKs and Bulletproofs, as well as different signature protocols.

A draft report by the Credentials Community Group on data minimization may also be useful reading for implementers looking to enable progressive trust.

Verifiable Credentials Implementation Guidelines 1.0

Implementation guidance for Verifiable Credentials

W3C Working Group Note 24 September 2019
