This page explains how to convert fields into the decimals and perform transformations on them when you prepare data in the Wrangler workspace of the Cloud Data Fusion Studio.
Read decimal data
- Go to the Wrangler workspace in Cloud Data Fusion.
-
Open an object, such as a table, from a database or a Cloud Storage file.
- For a database or a BigQuery connection, if the table
has a decimal column, Wrangler converts it into a
BigDecimal
type. When you create the pipeline from Wrangler, the column is then converted to thedecimal
data type. -
If your dataset contains non-decimal data that you want to convert to decimals, use the
set-column
directive:set-column : DECIMAL_COLUMN exp:{new("java.math.BigDecimal", INPUT_COLUMN)}
Replace the following:
DECIMAL_COLUMN
: the decimal column to be transformed. After the directive executes, the column's data type changes toBigDecimal
, and the schema also contains the appropriate data type.INPUT_COLUMN
: the column that gets converted, which can be one of the following types:STRING
,INTEGER
,LONG
,FLOAT
, orDOUBLE
.
If your dataset includes values with varying scale, such as 1.05, 2.698, 5.8745512, set the scale with a Wrangler directive and edit the schema in the pipeline to set the scale for the decimal column.
To set the scale in the Wrangler, use a directive similar to the following:
set-column : OUTPUT_COLUMN exp:{new("java.math.BigDecimal", DECIMAL_COLUMN).setScale()}
The following example converts a column called
cost
from a string to a decimal, sets a scale of 9, and outputs the results to a new column calledoutput-column
:set-column : output-column exp:{new("java.math.BigDecimal", "cost").setScale(9)}
- For a database or a BigQuery connection, if the table
has a decimal column, Wrangler converts it into a
Transform decimal data
Decimal columns in Wrangler use the Java BigDecimal class.
After the columns are converted to the BigDecimal data type, transform the
columns with methods from Class BigDecimal
.
Transformation | Directive |
---|---|
Get the absolute value | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.abs() |
Get the precision of a decimal value | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.precision() |
Get the scale of a decimal value | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.scale() |
Get the unscaled value of a decimal value | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.unscaledValue() |
Add two decimal columns | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.add(DECIMAL_COLUMN_2) |
Subtract a decimal from another | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.subtract(DECIMAL_COLUMN_2) |
Multiply a decimal with another | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.multiply(DECIMAL_COLUMN_2) |
Divide a decimal column by another and return the quotient | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.divide(DECIMAL_COLUMN_2) |
Divide a decimal column by another and return the remainder | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.remainder(DECIMAL_COLUMN_2) |
Convert decimal to a integer | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.intValue() |
Convert decimal to a long | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.longValue() |
Convert decimal to a float | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.floatValue() |
Convert decimal to a double | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.doubleValue() |
Check if a decimal value is equal to another | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.equals(DECIMAL_COLUMN_2) |
Find the maximum of two decimal columns | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.max(DECIMAL_COLUMN_2) |
Find the minimum of two decimal columns | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.min(DECIMAL_COLUMN_2) |
Move the decimal point n places to the left | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.movePointLeft(n) |
Move the decimal point n places to the right | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.movePointRight(n) |
Get the nth power of a decimal | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.pow(n) |
Negate a decimal | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.negate() |
Strip trailing zeros in a decimal | set-column : OUTPUT_COLUMN DECIMAL_COLUMN.stripTrailingZeros() |
Replace the following:
OUTPUT_COLUMN
: the column containing the output of the operation.DECIMAL_COLUMN
: the decimal column that's transformed.DECIMAL_COLUMN_2
: the second decimal column included in the operation, such as when you add the values from two decimal columns together.
What's next
- Learn more about Wrangler directives.