Specify default column values
This page describes how to set a default value for a column in a BigQuery table. When you add a row to a table that doesn't contain data for a column with a default value, the default value is written to the column instead.
Default value expression
The default value expression for a column must be a literal or one of the following functions:
CURRENT_DATE
CURRENT_DATETIME
CURRENT_TIME
CURRENT_TIMESTAMP
GENERATE_UUID
RAND
SESSION_USER
ST_GEOGPOINT
You can compose a STRUCT or ARRAY default value with these functions, such as
[CURRENT_DATE(), DATE '2020-01-01']
.
Functions are evaluated when the data is written to the table.
The type of the default value must match or
coerce
to the type of the column it applies to. If no default value is set, the default
value is NULL
.
Set default values
You can set the default value for columns when you create a new table. You use the
CREATE TABLE
DDL statement
and add the DEFAULT
keyword and default value expression after the column name
and type. The following example creates a table called simple_table
with two
STRING
columns, a
and b
. Column b
has the default value 'hello'
.
CREATE TABLE mydataset.simple_table ( a STRING, b STRING DEFAULT 'hello');
When you insert data into simple_table
that omits column b
, the default
value 'hello'
is used instead—for example:
INSERT mydataset.simple_table (a) VALUES ('val1'), ('val2');
The table simple_table
contains the following values:
+------+-------+ | a | b | +------+-------+ | val1 | hello | | val2 | hello | +------+-------+
If a column has type STRUCT
, then you must set the default value for the
entire STRUCT
field. You cannot set the default value for a subset of the
fields. The
default value for an array cannot be NULL
or contain any NULL
elements.
The following example creates a table called complex_table
and sets a
default value for the column struct_col
, which contains nested fields,
including an ARRAY
type:
CREATE TABLE mydataset.complex_table ( struct_col STRUCT<x STRUCT<x1 TIMESTAMP, x2 NUMERIC>, y ARRAY<DATE>> DEFAULT ((CURRENT_TIMESTAMP(), NULL), [DATE '2022-01-01', CURRENT_DATE()]) );
You can't set default values that violate a constraint on the column, such as
a default value that doesn't conform to a
parameterized type
or a NULL
default value when the column's mode
is REQUIRED
.
Change default values
To change the default value for a column, select one of the following options:
Console
In the Google Cloud console, go to the BigQuery page.
In the Explorer panel, expand your project and dataset, then select the table.
In the details panel, click the Schema tab.
Click Edit schema. You might need to scroll to see this button.
In the Current schema page, locate the top-level field that you want to change.
Enter the default value for that field.
Click Save.
SQL
Use the
ALTER COLUMN SET DEFAULT
DDL statement.
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
ALTER TABLE mydataset.mytable ALTER COLUMN column_name SET DEFAULT default_expression;
Click
Run.
For more information about how to run queries, see Run an interactive query.
Setting the default value for a column only affects future inserts to the table.
It does not change any existing table data. The following example sets the
default value of column a
to SESSION_USER()
;
ALTER TABLE mydataset.simple_table ALTER COLUMN a SET DEFAULT SESSION_USER();
If you insert a row into simple_table
that omits column a
, the current
session user is used instead.
INSERT mydataset.simple_table (b) VALUES ('goodbye');
The table simple_table
contains the following values:
+------------------+---------+ | a | b | +------------------+---------+ | val1 | hello | | val2 | hello | | [email protected] | goodbye | +------------------+---------+
Remove default values
To remove the default value for a column, select one of the following options:
Console
In the Google Cloud console, go to the BigQuery page.
In the Explorer panel, expand your project and dataset, then select the table.
In the details panel, click the Schema tab.
Click Edit schema. You might need to scroll to see this button.
In the Current schema page, locate the top-level field that you want to change.
Enter
NULL
for the default value.Click Save.
SQL
Use the
ALTER COLUMN DROP DEFAULT
DDL statement.
In the Google Cloud console, go to the BigQuery page.
In the query editor, enter the following statement:
ALTER TABLE mydataset.mytable ALTER COLUMN column_name DROP DEFAULT;
You can also remove the default value from a column by changing its value to
NULL
with theALTER COLUMN SET DEFAULT
DDL statement.Click
Run.
For more information about how to run queries, see Run an interactive query.
Use DML statements with default values
You can add rows with default values to a table by using the
INSERT
DML statement.
The default value is used when the value for a column is not specified, or when
the keyword DEFAULT
is used in place of the value expression. The following
example creates a table and inserts a row where every value is the default
value:
CREATE TABLE mydataset.mytable ( x TIME DEFAULT CURRENT_TIME(), y INT64 DEFAULT 5, z BOOL); INSERT mydataset.mytable (x, y, z) VALUES (DEFAULT, DEFAULT, DEFAULT);
The table mytable
looks like the following:
+-----------------+---+------+ | x | y | z | +-----------------+---+------+ | 22:13:24.799555 | 5 | null | +-----------------+---+------+
Column z
doesn't have a default value, so NULL
is used as the default. When
the default value is a function, such as CURRENT_TIME()
, it is evaluated at
the time the value is written. Calling INSERT
with the default value for
column x
again results in a different value for TIME
. In the following
example, only
column z
has a value set explicitly, and the omitted columns use their default
values:
INSERT mydataset.mytable (z) VALUES (TRUE);
The table mytable
looks like the following:
+-----------------+---+------+ | x | y | z | +-----------------+---+------+ | 22:13:24.799555 | 5 | null | | 22:18:29.890547 | 5 | true | +-----------------+---+------+
You can update a table with default values by using the
MERGE
DML statement.
The following example creates two tables and updates one of them with a MERGE
statement:
CREATE TABLE mydataset.target_table ( a STRING, b STRING DEFAULT 'default_b', c STRING DEFAULT SESSION_USER()) AS ( SELECT 'val1' AS a, 'hi' AS b, '[email protected]' AS c UNION ALL SELECT 'val2' AS a, 'goodbye' AS b, SESSION_USER() AS c ); CREATE TABLE mydataset.source_table ( a STRING DEFAULT 'default_val', b STRING DEFAULT 'Happy day!') AS ( SELECT 'val1' AS a, 'Good evening!' AS b UNION ALL SELECT 'val3' AS a, 'Good morning!' AS b ); MERGE mydataset.target_table T USING mydataset.source_table S ON T.a = S.a WHEN NOT MATCHED THEN INSERT(a, b) VALUES (a, DEFAULT);
The result is the following:
+------+-----------+--------------------+ | a | b | c | +------+-----------+--------------------+ | val1 | hi | [email protected] | | val2 | goodbye | [email protected] | | val3 | default_b | [email protected] | +------+-----------+--------------------+
You can update a table with default values by using the
UPDATE
DML statement.
The following example updates the table source_table
so that each row of
column b
is equal to its default value:
UPDATE mydataset.source_table SET b = DEFAULT WHERE TRUE;
The result is the following:
+------+------------+ | a | b | +------+------------+ | val1 | Happy day! | | val3 | Happy day! | +------+------------+
Append a table
You can use the bq query
command with the --append_table
flag to append the
results of a query to a destination table that has default values. If the query
omits a column with a default value, the default value is assigned. The following
example appends data that specifies values only for column z
:
bq query \ --nouse_legacy_sql \ --append_table \ --destination_table=mydataset.mytable \ 'SELECT FALSE AS z UNION ALL SELECT FALSE AS Z'
The table mytable
uses default values for columns x
and y
:
+-----------------+---+-------+ | x | y | z | +-----------------+---+-------+ | 22:13:24.799555 | 5 | NULL | | 22:18:29.890547 | 5 | true | | 23:05:18.841683 | 5 | false | | 23:05:18.841683 | 5 | false | +-----------------+---+-------+
Load data
You can load data
into a table with default values by using the
bq load
command or the
LOAD DATA
statement.
Default values are applied when the loaded data has fewer columns than the
destination table. NULL
values in the loaded data are not converted to default
values.
Binary formats, such as AVRO, Parquet, or ORC, have encoded file schemas. When the file schema omits some columns, default values are applied.
Text formats, such as JSON and CSV, don't have encoded file schema. To specify
their schema using the bq command-line tool, you can use the --autodetect
flag or supply a
JSON schema. To specify
their schema using the LOAD DATA
statement, you must provide a list of
columns. The following is an example that loads only column a
from a CSV file:
LOAD DATA INTO mydataset.insert_table (a) FROM FILES( uris = ['gs://test-bucket/sample.csv'], format = 'CSV');
Write API
The Storage Write API only populates default values when the
write stream
schema is missing a field that is contained in the destination table schema.
In this case, the missing field is populated with the default value on the
column for every write. If the field exists in the write stream schema but is
missing from the data itself, then the missing field is populated with NULL
.
For example, suppose you are writing data
to a BigQuery table with the following schema:
[ { "name": "a", "mode": "NULLABLE", "type": "STRING", }, { "name": "b", "mode": "NULLABLE", "type": "STRING", "defaultValueExpression": "'default_b'" }, { "name": "c", "mode": "NULLABLE", "type": "STRING", "defaultValueExpression": "'default_c'" } ]
The following
write stream schema
is missing the field c
that is present in the destination table:
[ { "name": "a", "type": "STRING", }, { "name": "b", "type": "STRING", } ]
Suppose you stream the following values to the table:
{'a': 'val_a', 'b': 'val_b'} {'a': 'val_a'}
The result is the following:
+-------+-------+-----------+ | a | b | c | +-------+-------+-----------+ | val_a | val_b | default_c | | val_a | NULL | default_c | +-------+-------+-----------+
The write stream schema contains the field b
, so the default value default_b
is not used even when no value is specified for the field. Since the write
stream schema doesn't contain the field c
, every row in column c
is
populated with the destination table's default value default_c
.
The following write stream schema matches the schema of the table you're writing to:
[ { "name": "a", "type": "STRING", }, { "name": "b", "type": "STRING", } { "name": "c", "type": "STRING", } ]
Suppose you stream the following values to the table:
{'a': 'val_a', 'b': 'val_b'} {'a': 'val_a'}
The write stream schema isn't missing any fields contained in the destination table, so none of the columns' default values are applied, regardless of whether the fields are populated in the streamed data:
+-------+-------+------+ | a | b | c | +-------+-------+------+ | val_a | val_b | NULL | | val_a | NULL | NULL | +-------+-------+------+
You can specify connection-level default values settings in
default_missing_value_interpretation
within the AppendRowsRequest
message. If the value is set to
DEFAULT_VALUE
, the missing value will pick up the default value even when the column is
presented in the user schema.
You can also specify request-level default values in the
missing_value_interpretations
map within the
AppendRowsRequest
message.
Each key is the name of a column and its
value
indicates how to interpret missing values.
For example, the map {'col1': NULL_VALUE, 'col2': DEFAULT_VALUE}
means that all missing values in col1
are interpreted as NULL
, and
all missing values in col2
are interpreted as the default value set for col2
in the table schema.
If a field is not in this map and has missing values, then the missing values
are interpreted as NULL
.
Keys can only be top-level column names. Keys can't be struct subfields, such as
col1.subfield1
.
Use the insertAll
API method
The tabledata.insertAll
API method
populates default values at the row level when data is written to a table.
If a row is missing columns with default values, then the default values are
applied to those columns.
For example, suppose you have the following table schema:
[ { "name": "a", "mode": "NULLABLE", "type": "STRING", }, { "name": "b", "mode": "NULLABLE", "type": "STRING", "defaultValueExpression": "'default_b'" }, { "name": "c", "mode": "NULLABLE", "type": "STRING", "defaultValueExpression": "'default_c'" } ]
Suppose you stream the following values to the table:
{'a': 'val_a', 'b': 'val_b'} {'a': 'val_a'} {}
The result is the following:
+-------+------------+-----------+ | a | b | c | +-------+------------+-----------+ | val_a | val_b | default_c | | val_a | default_b | default_c | | NULL | default_b | default_c | +-------+------------+-----------+
The first inserted row doesn't contain a value for the field c
, so the default
value default_c
is written to column c
. The second inserted row doesn't
contain values for the fields b
or c
, so their default values are written to
columns b
and c
. The third inserted row
contains no values. The value written to column a
is NULL
since no other
default value is set. The default values default_b
and default_c
are written
to columns b
and c
.
View default values
To see the default value for a column, query the
INFORMATION_SCHEMA.COLUMNS
view. The column_default
column field
contains the default value for the column. If no default value is set, it is
NULL
. The following example shows the column names and default values for the
table mytable
:
SELECT column_name, column_default FROM mydataset.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'mytable';
The result is similar to the following:
+-------------+----------------+ | column_name | column_default | +-------------+----------------+ | x | CURRENT_TIME() | | y | 5 | | z | NULL | +-------------+----------------+
Limitations
- You can read from tables with default values by using Legacy SQL, but you cannot write to tables with default values using Legacy SQL.
- You cannot add a new column with a default value to an existing table.
However, you can add the column without a default value, then change its
default value by using the
ALTER COLUMN SET DEFAULT
DDL statement. - You cannot copy and append a source table to a destination table that has more
columns than the source table, and the additional columns have default
values. Instead, you can run
INSERT destination_table SELECT * FROM source_table
to copy over the data.
What's next
- For more information about loading data into BigQuery, see Introduction to loading data.