The Spanner query optimizer determines the most efficient way to execute a SQL query. However, the query plan determined by the optimizer may change slightly when the query optimizer itself evolves, or when the database statistics are updated. To minimize any potential for performance regression when the query optimizer or statistics change, Spanner provides the following query options.
-
optimizer_version: Changes to the query optimizer are bundled and released as optimizer versions. Spanner starts using the latest version of the optimizer as the default at least 30 days after that version is released. You can use the query optimizer version option to run queries against an older version of the optimizer.
-
optimizer_statistics_package: Spanner updates optimizer statistics regularly. New statistics are made available as a package. This query option specifies a statistics package for the query optimizer to use when compiling a SQL query. The specified package must have garbage collection disabled:
GoogleSQL
ALTER STATISTICSSET OPTIONS (allow_gc=false)
PostgreSQL
ALTER STATISTICS spanner."<package_name>" SET OPTIONS (allow_gc = true)
This guide shows how to set these individual options at different scopes in Spanner.
List query optimizer options
Spanner stores information about the available optimizer versions and statistics packages that you can select.
Optimizer versions
The query optimizer version is an integer value, incremented by 1 with each update. The latest version of the query optimizer is 7.
Execute the following SQL statement to return a list of all supported optimizer versions, along with their corresponding release dates and whether that version is the default. The largest version number returned is the latest supported version of the optimizer.
SELECT * FROM SPANNER_SYS.SUPPORTED_OPTIMIZER_VERSIONS;
Default version
By default, Spanner starts using the latest version of the optimizer at least 30 days after that version is released. During the 30+ day period between a new release and that release becoming the default, you're encouraged to test queries against the new version to detect any regression.
To find the default version, execute the following SQL statement:
SELECT * FROM SPANNER_SYS.SUPPORTED_OPTIMIZER_VERSIONS;
The query returns a list of all supported optimizer versions. The
IS_DEFAULT
column specifies which version is the current default.
For details about each version, see Query optimizer version history.
Optimizer statistics packages
Each new optimizer statistics package that Spanner creates is assigned a package name that is guaranteed to be unique within the given database.
The format of the package name is auto_{PACKAGE_TIMESTAMP}UTC
.
In GoogleSQL, the ANALYZE
statement triggers the creation of the statistics package name. In
PostgreSQL, the
ANALYZE
statement performs this task. The format of the statistics package name is
analyze_{PACKAGE_TIMESTAMP}UTC
, where
{PACKAGE_TIMESTAMP}
is the timestamp, in UTC timezone, of when the
statistics construction started. Execute the following SQL statement to return a
list of all available optimizer statistics packages.
SELECT * FROM INFORMATION_SCHEMA.SPANNER_STATISTICS;
By default, Spanner uses the latest optimizer statistics package unless the database or query is pinned to an older package using one of the methods described on this page.
Option override precedence
If you're using a GoogleSQL-dialect database, Spanner offers multiple ways to change the optimizer options. For example, you can set the option(s) for a specific query or configure the option in the client library at the process or query level. When an option is set in multiple ways, the following precedence order applies. (Select a link to jump to that section in this document).
Spanner default ← database option ← client app ← environment variable ← client query ← statement hint
For example, here's how to interpret the order of precedence when setting the query optimizer version:
When you create a database, it uses the Spanner default optimizer version. Setting the optimizer version using one of the methods listed above takes precedence over anything to the left of it. For example, setting the optimizer for an app using an environment variable takes precedence over any value you set for the database using the database option. Setting the optimizer version through a statement hint has the highest precedence for the given query, taking precedence over the value set using any other method.
Let's now look at each method in more detail.
Set optimizer options at the database level
You can set the default optimizer version on a database using the following
ALTER DATABASE
DDL command.
GoogleSQL
ALTER DATABASE MyDatabase
SET OPTIONS (optimizer_version = 7);
PostgreSQL
ALTER DATABASE MyDatabase SET spanner.optimizer_version = 5;
You can set the statistics package similarly, as shown in the following example.
GoogleSQL
ALTER DATABASE MyDatabase
SET OPTIONS (optimizer_statistics_package = "auto_20191128_14_47_22UTC");
PostgreSQL
ALTER DATABASE MyDatabase
SET spanner.optimizer_statistics_package = "auto_20191128_14_47_22UTC";
You can also set more than one option at the same time, as shown in the following DDL command.
GoogleSQL
ALTER DATABASE MyDatabase
SET OPTIONS (optimizer_version = 7,
optimizer_statistics_package = "auto_20191128_14_47_22UTC");
You can run ALTER DATABASE
in gcloud CLI with the
gcloud CLI databases ddl update
command as follows.
GoogleSQL
gcloud spanner databases ddl update MyDatabase --instance=test-instance \
--ddl='ALTER DATABASE MyDatabase SET OPTIONS ( optimizer_version = 7 )'
PostgreSQL
gcloud spanner databases ddl update MyDatabase --instance=test-instance \
--ddl='ALTER DATABASE MyDatabase SET spanner.optimizer_version = 7'
Setting a database option to NULL
(in GoogleSQL) or DEFAULT
(in
PostgreSQL) clears it so that the default value is used.
To see the current value of these options for a database, query the
INFORMATION_SCHEMA.DATABASE_OPTIONS
view for GoogleSQL, or the
information_schema database_options
table for PostgreSQL, as follows.
GoogleSQL
SELECT
s.OPTION_NAME,
s.OPTION_VALUE
FROM
INFORMATION_SCHEMA.DATABASE_OPTIONS s
WHERE
s.SCHEMA_NAME=""
AND s.OPTION_NAME IN ('optimizer_version', 'optimizer_statistics_package')
PostgreSQL
SELECT
s.option_name,
s.option_value
FROM
information_schema.database_options s
WHERE
s.schema_name='public'
AND s.option_name IN ('optimizer_version',
'optimizer_statistics_package')
Set optimizer options with client libraries
When you are programmatically interacting with Spanner through client libraries, there are a number of ways to change query options for your client application.
You must be using the latest versions of the client libraries to set optimizer options.
Set optimizer options for a database client
An application can set optimizer options globally on the client library by configuring the query options property as shown in the following code snippets. The optimizer settings are stored in the client instance and are applied to all queries run throughout the lifetime of the client. Even though the options apply at a database level in the backend, when the options are set at a client level, they apply to all databases connected to that client.
C++
C#
Go
Java
Node.js
PHP
Python
Ruby
Set optimizer options with environment variables
To make it easier to try different optimizer settings without having to
recompile your app, you can set the SPANNER_OPTIMIZER_VERSION
and
SPANNER_OPTIMIZER_STATISTICS_PACKAGE
environment variables and run your app,
as shown in the following snippet.
Linux / macOS
export SPANNER_OPTIMIZER_VERSION="7"
export SPANNER_OPTIMIZER_STATISTICS_PACKAGE="auto_20191128_14_47_22UTC"
Windows
set SPANNER_OPTIMIZER_VERSION="7"
set SPANNER_OPTIMIZER_STATISTICS_PACKAGE="auto_20191128_14_47_22UTC"
The specified query optimizer options values are read and stored in the client instance at client initialization time and apply to all queries run throughout the lifetime of the client.
Set optimizer options for a client query
You can specify a value for optimizer version or statistics package version at the query level in your client application by specifying a query options property when building your query.
C++
C#
Go
Java
Node.js
PHP
Python
Ruby
Set optimizer options for a query using a statement hint
A statement hint is a hint on a query statement that changes the execution
of the query from the default behavior. Setting the OPTIMIZER_VERSION
hint on
a statement forces that query to run using the specified query optimizer
version.
The OPTIMIZER_VERSION
hint has the highest optimizer version precedence. If
the statement hint is specified, it will be used regardless of all other
optimizer version settings.
GoogleSQL
@{OPTIMIZER_VERSION=7} SELECT * FROM MyTable;
PostgreSQL
/*@OPTIMIZER_VERSION=7*/ SELECT * FROM MyTable;
You can also use the latest_version literal to set the optimizer version for a query to the latest version as shown here.
GoogleSQL
@{OPTIMIZER_VERSION=latest_version} SELECT * FROM MyTable;
PostgreSQL
/*@OPTIMIZER_VERSION=latest_version*/ SELECT * FROM MyTable;
Setting the OPTIMIZER_STATISTICS_PACKAGE
hint on
a statement forces that query to run using the specified query optimizer
statistics package version. The specified package
must have garbage collection disabled:
GoogleSQL
ALTER STATISTICS <package_name> SET OPTIONS (allow_gc=false)
PostgreSQL
ALTER STATISTICS spanner."package_name" SET OPTIONS (allow_gc=false)
The OPTIMIZER_STATISTICS_PACKAGE
hint has the highest optimizer package
setting precedence. If the statement hint is specified, it will be used
regardless of all other optimizer package version settings.
@{OPTIMIZER_STATISTICS_PACKAGE=auto_20191128_14_47_22UTC} SELECT * FROM MyTable;
You can also use the latest literal to use the latest statistics package.
@{OPTIMIZER_STATISTICS_PACKAGE=latest} SELECT * FROM MyTable;
Both hints can be set in a single statement as shown in the following example.
The default_version literal sets the optimizer version for a query to the default version, which may be different than the latest version. See Default version for details.
GoogleSQL
@{OPTIMIZER_VERSION=default_version, OPTIMIZER_STATISTICS_PACKAGE=auto_20191128_14_47_22UTC} SELECT * FROM MyTable;
PostgreSQL
/*@OPTIMIZER_VERSION=default_version, OPTIMIZER_STATISTICS_PACKAGE=auto_20191128_14_47_22UTC*/ SELECT * FROM KeyValue;
Set optimizer options when using the Spanner JDBC driver
You can override the default value of the optimizer version and statistics package by specifying options in the JDBC connection string as shown in the following example.
These options are only supported in the latest versions of the Spanner JDBC driver.
You can also set the query optimizer version using the SET OPTIMIZER_VERSION
statement as shown in the following example.
For more details on using the open source driver, see Using the open source JDBC driver.
How invalid optimizer versions are handled
Spanner supports a range of optimizer versions.
This range changes over time when the query optimizer is updated. If the version
you specify is out of range, the query fails. For example, if you attempt to run
a query with the statement hint
@{OPTIMIZER_VERSION=8}
,
but the most recent optimizer version number is only
7
, Spanner responds with
this error message:
Query optimizer version: 8 is not
supported
Handle an invalid optimizer statistics package setting
You can pin your database or query to any available statistics package using one of the methods described earlier on this page. A query fails if an invalid statistics package name is provided. A statistics package specified by a query needs to be either:
Determine the query optimizer version used to run a query
The optimizer version used for a query is visible through the Google Cloud console and in the Google Cloud CLI.
Google Cloud console
To view the optimizer version used for a query, run your query in the Spanner Studio page of the Google Cloud console, and then select the Explanation tab. You should see a message similar to the following:
Query optimizer version: 7
gcloud CLI
To see the version used when running a query in gcloud CLI, set
the --query-mode
flag to PROFILE
as shown in the following snippet.
gcloud spanner databases execute-sql MyDatabase --instance=test-instance \
--query-mode=PROFILE --sql='SELECT * FROM MyTable'
Visualize query optimizer version in Metrics Explorer
Cloud Monitoring collects measurements to help you understand how your applications and system services are performing. One of the metrics collected for Spanner is count of queries, which measures the number of queries in an instance, sampled over time. While this metric is very useful to view queries grouped by error code, we can also use it to see what optimizer version was used to run each query.
You can use Metrics Explorer in Google Cloud console to visualize Count of queries for your database instance. Figure 1 shows the count of queries for three databases. You can see which optimizer version is being used in each database.
The table below the chart in this figure shows that my-db-1
attempted to run
a query with an invalid optimizer version, returning the status Bad usage
and resulting in a query count of 0. The other databases ran queries using
versions 1 and 2 of the optimizer respectively.
Figure 1. Count of queries displayed in Metrics Explorer with queries grouped by optimizer version.
To set up a similar chart for your instance:
- Navigate to the Metrics Explorer in the Google Cloud console.
- In the Resource type field, select
Cloud Spanner Instance
. - In the Metric field, select
Count of queries
. - In the Group By field, select
database
,optimizer_version
, andstatus
.
Not shown in this example is the case where a different optimizer version is being used for different queries in the same database. In that case, the chart would display a bar segment for each combination of database and optimizer version.
To learn how to use Cloud Monitoring to monitor your Spanner instances, see Monitoring with Cloud Monitoring