Published 01 Jan 2024, Updated 01 Jan 2024
Handling importing and exporting as application requirements change.
Andy Edwards
At JCore we have an evolving file format for exporting settings from Clarity Gateway. Just as the latest version of Microsoft Word can still open Word '97 documents, we need to be able to import old versions of the settings while exporting more information in newer versions.
It's common to add additional fields or possible field values in new versions of our settings, but without knowing the future, it's safest to assume we might even decide to move or delete fields someday, and design a system that supports a drastic restructuring in future versions.
At first, I opted for a simple but verbose approach: for each new version, we just copy the code that defined the schema for the previous version and make whatever changes are necessary. We also create functions that convert old versions to new versions:
// settings/v1.ts
import z from 'zod'
export const SettingsV1 = z.object({
  version: z.literal(1),
  metadata: z.array(
    z.object({
      tag: z.string(),
      dataType: z.enum(['number', 'string', 'boolean']),
    })
  ),
})
export type SettingsV1 = z.infer<typeof SettingsV1>
// settings/v2.ts
import z from 'zod'
export const SettingsV2 = z.object({
  version: z.literal(2),
  metadata: z.array(
    z.object({
      tag: z.string(),
      dataType: z.enum(['number', 'string', 'boolean']),
      /**
       * new field in V2
       */
      settable: z.boolean(),
    })
  ),
})
export type SettingsV2 = z.infer<typeof SettingsV2>
// settings/convertSettingsV1ToV2.ts
import { type SettingsV1 } from './v1'
import { type SettingsV2 } from './v2'
export function convertSettingsV1ToV2({ metadata }: SettingsV1): SettingsV2 {
  return {
    version: 2,
    // V1 had no `settable` field, so default it to false
    metadata: metadata.map((item) => ({ ...item, settable: false })),
  }
}
// settings/convertSettingsToLatestVersion.ts
import { type SettingsV1 } from './v1'
import { type SettingsV2 } from './v2'
import { convertSettingsV1ToV2 } from './convertSettingsV1ToV2'
export function convertSettingsToLatestVersion(
  settings: SettingsV1 | SettingsV2
): SettingsV2 {
  switch (settings.version) {
    case 1:
      return convertSettingsV1ToV2(settings)
    case 2:
      return settings
  }
}
This was straightforward and very type-safe, but my coworkers weren't happy when they needed to add a new field to our settings; just adding one field took numerous changes:
settings/v3.ts
settings/convertSettingsV1ToV2.ts to settings/convertSettingsV1ToV3.ts
settings/convertSettingsV2ToV3.ts
settings/convertSettingsToLatestVersion.ts
const MetadataItemV1 = z.object({
  tag: z.string(),
  dataType: z.enum(['number', 'string', 'boolean']),
})
const MetadataItemV2 = MetadataItemV1.extend({
  settable: z.boolean(),
})
export const SettingsV1 = z.object({
  version: z.literal(1),
  metadata: z.array(MetadataItemV1),
})
export const SettingsV2 = SettingsV1.extend({
  version: z.literal(2),
  metadata: z.array(MetadataItemV2),
})
This is less verbose than a bunch of copy and paste, but it has major drawbacks:
Ideally, we wanted to only have to touch one place to add a new field anywhere in the schema, or at least touch as few places as possible. This meant having some kind of single Franken-schema containing all fields for all versions, where each field has attached metadata about which version it belongs to. And then we could filter down to the appropriate fields when processing a specific version.
At first I was concerned that a single-schema system would have major shortcomings:
However, after getting more experience doing advanced mapping on Zod schemas, I started to see a way to accomplish our goals.
There have been numerous requests to add support for custom metadata to Zod schemas. Unfortunately, Colin McDonnell (colinhacks), the author of Zod, has always been opposed to making custom metadata a first-class feature; he advocates wrapping Zod schemas in your own separate constructs that declare the metadata. But to get the convenient solution we want, we need to be able to attach version metadata at any level of the schema, no matter how deep, and deeply filter out anything that isn't in a given version:
const MetadataItem = z.object({
  tag: z.string(),
  dataType: z.enum(['number', 'string', 'boolean']),
  settable: version(z.boolean(), { since: 2 }),
})
// somehow magically removes the `settable` property
const MetadataItemV1 = schemaForVersion(MetadataItem, 1)
// also somehow magically lacks the `settable` property
type MetadataItemV1 = z.infer<typeof MetadataItemV1>
If version returned something like { schema: z.boolean(), since: 2 }, then we wouldn't be able to pass that as the settable property schema because it's not an instance of ZodType. And if we passed .schema, which is a ZodType, we would lose the since: 2 metadata. Maybe we could make a versionableObject that can accept schema-plus-metadata wrappers as properties, but think about it: we'd also need a versionableArray, versionableRecord, versionableUnion, and so on. We'd practically be reimplementing Zod at that point.
So what if version could somehow return a ZodType instance, along with the attached version metadata of { since: 2 }? Then we could use this as a property in z.object(), the element of z.array(), or inside any other Zod schema. Is there a way? Yes! After experimenting I realized I could make a subclass that's essentially a no-op .refine(() => true) with the metadata attached:
import z from 'zod'
export class ZodMetadata<
  T extends z.ZodTypeAny,
  M extends object
> extends z.ZodEffects<T> {
  constructor(def: z.ZodEffectsDef<T>, public metadata: M) {
    super(def)
  }

  // returns the wrapped schema, without the metadata
  unwrap() {
    return this._def.schema
  }
}

export function zodMetadata<T extends z.ZodTypeAny, M extends object>(
  schema: T,
  metadata: M
): ZodMetadata<T, M> {
  // reuse the def of a refinement that always passes
  return new ZodMetadata(schema.refine(() => true)._def, metadata)
}
export type Version = 1 | 2 | 3 // etc
export type VersionRange = { until?: Version; since?: Version }
export function version<T extends z.ZodTypeAny, V extends VersionRange>(
  schema: T,
  version: V
): ZodMetadata<T, { version: V }> {
  return zodMetadata(schema, { version })
}
Now what happens if we inspect the type of our MetadataItem schema?
const MetadataItem: z.ZodObject<{
  tag: z.ZodString;
  dataType: z.ZodEnum<["number", "string", "boolean"]>;
  settable: ZodMetadata<z.ZodBoolean, {
    version: {
      since: number;
    };
  }>;
}, "strip", z.ZodTypeAny, {
  ...;
}, {
  ...;
}>
Voilà! This is something we can work with!
These schemas are a bit weird, though. The awkward thing is that we could end up with a schema like this:
const schema = z.object({
  a: version(z.number(), { until: 2 }),
  b: z.string(),
  c: version(z.boolean(), { since: 3 }),
})
This schema is kind of bogus because it accepts a mishmash of properties from all versions, even though no version accepts all of those properties:
// whoops, this isn't valid for v1, v2, or v3, but no error
schema.parse({
  a: 1,
  b: 'hello',
  c: true,
})
So, we're kind of abusing Zod; we can't think of this as a run-of-the-mill Zod schema that's ready to use for parsing. But we're abusing Zod in a very pragmatic way; as long as we treat this like a proto-schema from which we'll construct the actual schemas for each version, it's very convenient:
const schemaV1 = schemaForVersion(schema, 1)
const schemaV2 = schemaForVersion(schema, 2)
Now that we have metadata we can inspect, we can strip away out-of-version properties using a recursive function:
export function schemaForVersionHelper<
  S extends z.ZodTypeAny,
  V extends Version = any
>(schema: S, version: V): S | undefined {
  switch (schema._def.typeName) {
    case z.ZodFirstPartyTypeKind.ZodObject: {
      const object: z.AnyZodObject = schema as any
      const shape: z.ZodRawShape = {}
      for (const [key, value] of Object.entries(object.shape)) {
        const valueForVersion = schemaForVersionHelper(
          value as z.ZodTypeAny,
          version
        )
        // drop properties that don't exist in this version
        if (valueForVersion == null) continue
        shape[key] = valueForVersion
      }
      const catchall =
        schemaForVersionHelper(object._def.catchall, version) ?? z.never()
      return new z.ZodObject({
        ...object._def,
        shape: () => shape,
        catchall,
      }) as any
    }
    case z.ZodFirstPartyTypeKind.ZodEffects: {
      if (schema instanceof ZodMetadata) {
        const { metadata } = schema
        if (metadata.version) {
          // out of version: signal the caller to drop this schema
          if (!isVersionInRange(version, metadata.version)) {
            return undefined
          }
          if (Object.keys(metadata).length === 1) return schema.unwrap()
        }
        return schema.unwrap()
      }
      const effects: z.ZodEffects<any> = schema as any
      const innerSchema = schemaForVersionHelper(effects._def.schema, version)
      return (
        innerSchema == null
          ? undefined
          : new z.ZodEffects({
              ...effects._def,
              schema: innerSchema,
            })
      ) as any
    }
    case z.ZodFirstPartyTypeKind.ZodOptional: {
      const optional: z.ZodOptional<any> = schema as any
      const unwrapped = schemaForVersionHelper(optional.unwrap(), version)
      return unwrapped == null ? undefined : (unwrapped.optional() as any)
    }
    // etc for other non-primitive schema types
    default:
      return schema
  }
}
export function schemaForVersion<
  S extends z.ZodTypeAny,
  V extends Version = any
>(schema: S, version: V): S {
  const filtered = schemaForVersionHelper(schema, version)
  if (!filtered) throw new Error(`entire schema is out of version`)
  return filtered
}
Note that this is just a stub example, and the complete code will also need to deeply map ZodNullable, ZodDefault, ZodArray, and any other Zod schema types being used.
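The helper above calls a runtime isVersionInRange that isn't shown. Here is a minimal sketch of what it could look like, assuming since is inclusive and until is exclusive (the same convention the type-level IsVersionInRange later in this post follows). The loose number parameter types are an assumption, chosen so that an until one past the latest version is accepted:

```typescript
// Hypothetical runtime counterpart to the type-level IsVersionInRange.
// Assumption: `since` is inclusive and `until` is exclusive.
export type RuntimeVersionRange = { since?: number; until?: number }

export function isVersionInRange(
  version: number,
  range: RuntimeVersionRange
): boolean {
  const { since, until } = range
  // too old: the field/option was introduced in a later version
  if (since != null && version < since) return false
  // too new: the field/option was removed at `until`
  if (until != null && version >= until) return false
  return true
}
```

For example, isVersionInRange(1, { since: 2 }) is false, while isVersionInRange(2, { since: 2, until: 3 }) is true.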
Similarly, we have to use a recursive conditional TS type to get the equivalent of z.output<...>, but for a specific version:
type OutputForVersion<
  S extends z.ZodTypeAny,
  V extends Version
> = S extends ZodMetadata<infer T, infer M>
  ? M extends { version: infer SchemaVersion extends VersionRange }
    ? IsVersionInRange<V, SchemaVersion> extends true
      ? OutputForVersion<T, V>
      : never
    : OutputForVersion<T, V>
  : // bail if output type is any to avoid combinatorial explosion
  IsAny<z.output<S>> extends true
  ? any
  : S extends z.ZodObject<infer T, infer UnknownKeys, infer Catchall>
  ? ObjectOutputForVersion<T, Catchall, UnknownKeys, V>
  : S extends z.ZodOptional<infer T>
  ? OutputForVersion<T, V> extends never
    ? never
    : OutputForVersion<T, V> | undefined
  : // etc for other non-primitive schema types
    z.output<S>
Again, this is just a stub example, and the complete code will also need to deeply map ZodNullable, ZodDefault, ZodArray, and any other Zod schema types being used.
ObjectOutputForVersion is fairly complicated, but basically just adapted from ZodObject's default Output parameter type:
type ObjectOutputForVersion<
  T extends z.ZodRawShape,
  Catchall extends z.ZodTypeAny,
  UnknownKeys extends z.UnknownKeysParam,
  V extends Version
> = z.objectUtil.flatten<
  z.objectUtil.addQuestionMarks<
    RemoveNeverProps<{ [K in keyof T]: OutputForVersion<T[K], V> }>
  >
> &
  CatchallOutputForVersion<Catchall, V> &
  z.PassthroughType<UnknownKeys>

// drops keys whose output is `never`, i.e. properties that are out of version
type RemoveNeverProps<T extends object> = {
  [K in keyof T as T[K] extends never ? never : K]: T[K]
}

type CatchallOutputForVersion<
  Catchall extends z.ZodTypeAny,
  V extends Version
> = z.ZodTypeAny extends Catchall
  ? unknown
  : { [k: string]: OutputForVersion<Catchall, V> }
Now we can improve the return type of schemaForVersion:
export function schemaForVersion<
  S extends z.ZodTypeAny,
  V extends Version = any
>(schema: S, version: V): z.ZodType<OutputForVersion<S, V>>
The previous section left IsVersionInRange unspecified, because it takes a bit of legwork to accomplish in TypeScript. First we need a way to do comparisons like < and >= on number types. It's not pretty, but we can do this for a limited range of versions using lookup tables and tail-recursive conditional types:
export type DecrementVersion = {
  1: never
  2: 1
  3: 2
}
export type IncrementVersion = {
  1: 2
  2: 3
  3: never
}
type Version = keyof IncrementVersion

// walk A upward until it reaches B (true) or runs off the end (false)
type IsLessThanOrEqual<A extends Version, B extends Version> = [A] extends [
  never
]
  ? false
  : [A] extends [B]
  ? true
  : IsLessThanOrEqual<IncrementVersion[A], B>

// walk A downward until it reaches B (true) or runs off the end (false)
type IsGreaterThanOrEqual<A extends Version, B extends Version> = [A] extends [
  never
]
  ? false
  : [A] extends [B]
  ? true
  : IsGreaterThanOrEqual<DecrementVersion[A], B>

type IsLessThan<A extends Version, B extends Version> = IsLessThanOrEqual<
  IncrementVersion[A],
  B
>
Leveraging that and an And<A, B> type, we can define our complete IsVersionInRange type:
export type And<A extends boolean, B extends boolean> = A extends true ? B : A

export type IsVersionInRange<V extends Version, R extends VersionRange> = And<
  [R] extends [{ until: infer Until extends Version }]
    ? IsLessThan<V, Until>
    : true,
  [R] extends [{ since: infer Since extends Version }]
    ? IsGreaterThanOrEqual<V, Since>
    : true
>
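To see how these types resolve, here is a small self-contained sketch that restates the definitions above and pins a few results. The constant names are hypothetical; each annotation only type-checks if the conditional type resolves to the literal being assigned, so a compile error would flag a wrong expectation:

```typescript
type IncrementVersion = { 1: 2; 2: 3; 3: never }
type DecrementVersion = { 1: never; 2: 1; 3: 2 }
type Version = keyof IncrementVersion
type VersionRange = { since?: Version; until?: Version }

type IsLessThanOrEqual<A extends Version, B extends Version> =
  [A] extends [never] ? false
  : [A] extends [B] ? true
  : IsLessThanOrEqual<IncrementVersion[A], B>

type IsGreaterThanOrEqual<A extends Version, B extends Version> =
  [A] extends [never] ? false
  : [A] extends [B] ? true
  : IsGreaterThanOrEqual<DecrementVersion[A], B>

type IsLessThan<A extends Version, B extends Version> =
  IsLessThanOrEqual<IncrementVersion[A], B>

type And<A extends boolean, B extends boolean> = A extends true ? B : A

type IsVersionInRange<V extends Version, R extends VersionRange> = And<
  [R] extends [{ until: infer Until extends Version }]
    ? IsLessThan<V, Until>
    : true,
  [R] extends [{ since: infer Since extends Version }]
    ? IsGreaterThanOrEqual<V, Since>
    : true
>

// 2 >= 2 and 2 < 3, so version 2 is in range
export const v2InSince2Until3: IsVersionInRange<2, { since: 2; until: 3 }> = true
// `until` is exclusive: IsLessThan<3, 3> walks off the end of IncrementVersion
export const v3InSince2Until3: IsVersionInRange<3, { since: 2; until: 3 }> = false
// 1 is below `since`: IsGreaterThanOrEqual<1, 2> walks off the end of DecrementVersion
export const v1InSince2: IsVersionInRange<1, { since: 2 }> = false
// no `since` bound, and 1 < 2
export const v1InUntil2: IsVersionInRange<1, { until: 2 }> = true
```

Because the lookup tables are finite, adding a version 4 means extending IncrementVersion and DecrementVersion by one entry each.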
We can use this same approach with ZodUnion schemas to exclude union options from versions they don't apply to:
const ConnectionSchema = z.union([
  ModbusConnectionSchema,
  version(SparkPlugConnectionSchema, { since: 2 }),
  version(EthernetIPConnectionSchema, { since: 3 }),
])
// should exclude EthernetIPConnectionSchema option
const ConnectionSchemaV2 = schemaForVersion(ConnectionSchema, 2)
To do this we need to filter union options in our schemaForVersion function and OutputForVersion type:
export function schemaForVersionHelper<
  S extends z.ZodTypeAny,
  V extends Version = any
>(schema: S, version: V): z.ZodType<OutputForVersion<S, V>> | undefined {
  switch (schema._def.typeName) {
    // ...
    case z.ZodFirstPartyTypeKind.ZodUnion: {
      const union: z.ZodUnion<any> = schema as any
      const options = (union.options as z.ZodTypeAny[])
        .map((option) => schemaForVersionHelper(option, version))
        .filter((s): s is z.ZodTypeAny => s != null)
      // one option left: no union needed; none left: the union is out of version
      return (
        options.length === 1
          ? options[0]
          : hasTwoOrMore(options)
          ? new z.ZodUnion({ ...union._def, options })
          : undefined
      ) as any
    }
    // ...
  }
}

function hasTwoOrMore<T>(arr: T[]): arr is [T, T, ...T[]] {
  return arr.length >= 2
}

export type OutputForVersion<S extends z.ZodTypeAny, V extends Version> =
  // ...
  S extends z.ZodUnion<infer T>
    ? OutputForVersion<T[number], V>
    : // ...
      z.output<S>
We can use a TypeScript discriminated union type to improve type safety when normalizing an older version of the settings to the latest version:
function normalizeSettings(
  settings: SettingsV1 | SettingsV2 | SettingsV3
): SettingsV3 {
  switch (settings.version) {
    case 1:
      // TypeScript knows settings is SettingsV1
      return normalizeSettingsV1(settings)
    case 2:
      // TypeScript knows settings is SettingsV2
      return normalizeSettingsV2(settings)
    case 3:
      // TypeScript knows settings is SettingsV3
      return settings
  }
}
This is all well and good for the top-level type since it has a version property that serves as a discriminator. But how can we handle nested types, since they don't have a version property?
We can start by passing the version down to the functions that normalize nested types:
function normalizeSettings(
  settings: SettingsV1 | SettingsV2 | SettingsV3
): SettingsV3 {
  const { version, metadata } = settings
  return {
    version: 3,
    metadata: metadata.map((item) => normalizeMetadataItem({ version, item })),
  }
}

function normalizeMetadataItem(
  data:
    | { version: 1; item: MetadataItemV1 }
    | { version: 2; item: MetadataItemV2 }
    | { version: 3; item: MetadataItemV3 }
): MetadataItemV3 {
  switch (data.version) {
    case 1:
      // TypeScript knows data.item is MetadataItemV1
      return normalizeMetadataItemV1(data.item)
    case 2:
      // TypeScript knows data.item is MetadataItemV2
      return normalizeMetadataItemV2(data.item)
    case 3:
      // TypeScript knows data.item is MetadataItemV3
      return data.item
  }
}
However, these { version: 1; item: MetadataItemV1 } | { version: 2; item: MetadataItemV2 } | ... type annotations are going to be a pain to write by hand; we can do better!
export type OutputForVersionMap<
  S extends z.ZodTypeAny,
  V extends Version = Version
> = V extends any ? { version: V; item: OutputForVersion<S, V> } : never
If V extends any ? seems odd, well, that's just TypeScript's weird syntax for distributing over a union:
type MetadataItemData = OutputForVersionMap<typeof MetadataItem, 1 | 2 | 3>
// produces:
type Produced =
  | { version: 1; item: MetadataItemV1 }
  | { version: 2; item: MetadataItemV2 }
  | { version: 3; item: MetadataItemV3 }
(Without the V extends any ?, it would produce { version: 1 | 2 | 3, item: z.output<typeof MetadataItem> }.)
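The difference distribution makes is easiest to see in isolation. The following is a minimal sketch without Zod; ItemByVersion, Distributed, NotDistributed and isSettable are hypothetical names standing in for the per-version output types, not part of the real code:

```typescript
// Hypothetical stand-ins for the per-version item types.
export type ItemByVersion = {
  1: { tag: string }
  2: { tag: string; settable: boolean }
}
export type V = keyof ItemByVersion

// Distributes over V: yields one object type per version, so `version`
// works as a discriminator for `item`.
export type Distributed<T extends V = V> = T extends any
  ? { version: T; item: ItemByVersion[T] }
  : never

// Does not distribute: a single object type whose fields are both unions,
// so checking `version` tells TypeScript nothing about `item`.
export type NotDistributed<T extends V = V> = {
  version: T
  item: ItemByVersion[T]
}

export function isSettable(data: Distributed): boolean {
  switch (data.version) {
    case 1:
      // data.item is narrowed to { tag: string }; version 1 has no `settable`
      return false
    case 2:
      // data.item is narrowed to { tag: string; settable: boolean }
      return data.item.settable
  }
}
```

If isSettable took NotDistributed instead, data.item.settable in case 2 would not type-check, because data.item would still be the union of both item types.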
With that, we can declare
function normalizeMetadataItem(
  // Version defaults to union of all versions
  data: OutputForVersionMap<typeof MetadataItem>
): MetadataItemV3 {
  switch (data.version) {
    case 1:
      // TypeScript knows data.item is MetadataItemV1
      return normalizeMetadataItemV1(data.item)
    ...
  }
}
version schema
Naively, we would declare the version like
const SettingsSchema = z.object({
  version: z.union([z.literal(1), z.literal(2), z.literal(3)]),
})
However, the problem with this is that
// whoops, { version: 1 | 2 | 3 }
type SettingsV1 = OutputForVersion<typeof SettingsSchema, 1>
To get SettingsV1 to have just version: 1, we need
const SettingsSchema = z.object({
  version: z.union([
    version(z.literal(1), { until: 2 }),
    version(z.literal(2), { since: 2, until: 3 }),
    version(z.literal(3), { since: 3 }),
  ]),
})
It would be nice not to have to tweak this boilerplate any time we introduce a new version. Fortunately, we can avoid that with a bit of TypeScript magic:
import { range } from 'lodash'
/**
 * Now we just need to increment this when we introduce a new version,
 * and everything else gets handled for us!
 */
export const SettingsLatestVersion = 3
export type SettingsLatestVersion = typeof SettingsLatestVersion

type VersionsUpToTuple<V extends Version> = V extends 1
  ? [1]
  : [...VersionsUpToTuple<DecrementVersion[V]>, V]

export const SettingsVersions = range(
  1,
  SettingsLatestVersion + 1
) as any as VersionsUpToTuple<SettingsLatestVersion> // [1, 2, 3]

type MakeVersionUnion<T extends Version[]> = {
  [K in keyof T]: ZodMetadata<
    z.ZodLiteral<T[K]>,
    {
      version: IncrementVersion[T[K]] extends never
        ? { since: T[K] }
        : { since: T[K]; until: IncrementVersion[T[K]] }
    }
  >
}

export const SettingsSchema = z.strictObject({
  version: z.union(
    SettingsVersions.map((v) =>
      version(z.literal(v), { since: v, until: (v + 1) as any })
    ) as MakeVersionUnion<typeof SettingsVersions>
  ),
})

type SettingsV1 = OutputForVersion<typeof SettingsSchema, 1> // { version: 1 }
type SettingsV2 = OutputForVersion<typeof SettingsSchema, 2> // { version: 2 }
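To check the runtime half of this trick without Zod, here is a rough, dependency-free sketch. The helper names are hypothetical, Array.from stands in for lodash's range, and it assumes the same inclusive since / exclusive until convention as above. It shows that each version sees exactly one literal once the ranges are generated from the latest version number:

```typescript
// Hypothetical stand-in for lodash's range(1, latest + 1).
export function versionsUpTo(latest: number): number[] {
  return Array.from({ length: latest }, (_, i) => i + 1)
}

export type LiteralRange = { literal: number; since: number; until: number }

// Mirrors the map over SettingsVersions: literal v is valid in [v, v + 1).
export function literalRanges(latest: number): LiteralRange[] {
  return versionsUpTo(latest).map((v) => ({
    literal: v,
    since: v,
    until: v + 1,
  }))
}

// The literals that survive filtering for a given version.
export function literalsForVersion(latest: number, version: number): number[] {
  return literalRanges(latest)
    .filter((r) => version >= r.since && version < r.until)
    .map((r) => r.literal)
}
```

With latest set to 3, literalsForVersion(3, 2) leaves only the literal 2, which is why the union collapses to a single z.literal(2) for the version 2 schema.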