DESCRIPTION > Many times you have data with `null` values and you want to fill them with non-null values. This is how to do **forward and back filling** `null` values on ClickHouse. It involves using some array functions, and we'll explain step-by-step how to do it and how they work. This was also posted on our [blog](https://blog.tinybird.co/2021/05/18/tips-7-filling-null-values-on-clickhouse/). NODE raw_data DESCRIPTION > This is a sample dataset, with a numeric column and a string column with some null values. We want to fill nulls with the previous and next non-null values, something you'd do in Pandas with the [`.fillna`](https://pandas.pydata.org/docs/reference/api/pandas.Series.fillna.html) method or on Postgres [like this](https://hakibenita.com/sql-for-data-analysis#back-and-forward-fill) SQL > SELECT * FROM num_str ORDER BY n NODE final_result DESCRIPTION > This is how you'd do it on ClickHouse. Correlated subqueries as the one used on Postgres aren't available on ClickHouse, but we can do it using array functions. In the next nodes you can see what's happening in each of the parts of this query. SQL > SELECT tuples.1 n, tuples.2 v, tuples.3 v_ffill, tuples.4 v_bfill FROM (SELECT arrayJoin( arrayZip( groupArray(n) AS n, arrayMap(x -> x != 'wadus' ? x : null, groupArray(v_nulls_replaced)) AS v, arrayFill(x -> x != 'wadus', groupArray(v_nulls_replaced)) AS v_ffill, arrayReverseFill(x -> x != 'wadus', groupArray(v_nulls_replaced)) AS v_bfill ) ) tuples FROM (SELECT *, ifNull(v, 'wadus') v_nulls_replaced FROM num_str ORDER BY n ASC) ) NODE replacing_null_values DESCRIPTION > The core part of the previous query are the functions [`arrayFill`](https://clickhouse.tech/docs/en/sql-reference/functions/array-functions/#array-fill) and [`arrayReverseFill`](https://clickhouse.tech/docs/en/sql-reference/functions/array-functions/#array-reverse-fill). They take arrays, and the way to generate an array from a column on CH is with the [`groupArray`](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/grouparray/) function. All of the [aggregate functions](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/) on ClickHouse skip null values, so the workaround is replacing them by a value we know is not present in the column. Make sure that you sort the data properly here. SQL > SELECT *, ifNull(v, 'wadus') v_nulls_replaced FROM num_str ORDER BY n ASC NODE arrays DESCRIPTION > This is the most important part, where the actual forward and back filling takes place. `arrayFill` and `arrayReverseFill` take arrays as inputs and return arrays. In the next nodes we convert those arrays into individual rows. SQL > SELECT groupArray(n) AS n, arrayMap(x -> x != 'wadus' ? x : null, groupArray(v_nulls_replaced)) AS v, arrayFill(x -> x != 'wadus', groupArray(v_nulls_replaced)) AS v_ffill, arrayReverseFill(x -> x != 'wadus', groupArray(v_nulls_replaced)) AS v_bfill FROM replacing_null_values NODE using_arrayZip DESCRIPTION > [`arrayZip`](https://clickhouse.tech/docs/en/sql-reference/functions/array-functions/#arrayzip) combines N arrays with M elements into a single array with M tuples containing N elements each SQL > SELECT arrayZip(n, v, v_bfill, v_ffill) zipped_array FROM arrays NODE using_arrayJoin DESCRIPTION > [`arrayJoin`](https://clickhouse.tech/docs/en/sql-reference/functions/array-join/#functions_arrayjoin) is the ClickHouse equivalent of `unnest` in Postgres. It takes an array of N elements and expands it to N rows with one element each. SQL > SELECT arrayJoin(zipped_array) tuples FROM using_arrayZip NODE result_extracting_values_from_tuples DESCRIPTION > This is the final result. In it, we just have to extract each one of the values from the tuples created in the previous step. SQL > SELECT tuples.1 n, tuples.2 v, tuples.3 v_ffill, tuples.4 v_bfill FROM using_arrayJoin NODE bonus_fill_with_constant DESCRIPTION > If you just need to fill the null values with a constant value, either setting it manually or getting it from the data (here `v_mode_fill` fills the null values with the most common value of thet column), this is how you'd do it. SQL > SELECT n, v, ifNull(v, 'Z') v_simple_fill, ifNull(v, (SELECT topK(1)(v)[1] from raw_data) as mode) as v_mode_fill FROM raw_data NODE the_end DESCRIPTION > [Tinybird](https://tinybird.co) lets you use a notebook-like interface like this to develop real-time endpoints on large amounts of data, [powered by ClickHouse](https://blog.tinybird.co/2021/02/16/tinybird-clickhouse/). If you'd like to use it, sign up for a free account [here](https://www.tinybird.co/signup?utm_source=snapshots&utm_campaign=fillna) ![image](https://media1.giphy.com/media/3o6Zt7g9nH1nFGeBcQ/giphy.gif?cid=ecf05e47j51paam4g69k2cfcql9hk767l1f03rurzygnwjap&rid=giphy.gif&ct=g) SQL > SELECT 'Thanks!'