Description
Hi.
Some context: we run Riak version 3.12 with leveldb, ring_size = 64, n_val = 3, on 5 Riak nodes. We tried switching to Tictac AAE in order to use NextGen replication, and added the following to the config:
anti_entropy = passive
tictacaae_active = active
tictacaae_storeheads = enabled
After the Tictac AAE trees were built, repeated runs of the following fold return different object statistics:
riak_client:aae_fold({object_stats, <<"domainRecord">>, all, all}).
This happens on immutable buckets (no external traffic enters Riak). The bucket holds more than 10 million objects, and object_stats reports a slightly different value on each run. erase_keys also returns varying statistics. For example (the same request repeated at 5-minute intervals):
riak_client:aae_fold({object_stats, <<"any_bucket">>, all, {date, {{2023, 8, 5}, {12, 0, 0}}, {{2023, 8, 5}, {13, 0, 0}}}}).
{ok,[{total_count,3652},
{total_size,19184471},
{sizes,[{2,105},{3,3156},{4,391}]},
{siblings,[{1,3652}]}]}
riak_client:aae_fold({object_stats, <<"any_bucket">>, all, {date, {{2023, 8, 5}, {12, 0, 0}}, {{2023, 8, 5}, {13, 0, 0}}}}).
{ok,[{total_count,3645},
{total_size,19145341},
{sizes,[{2,105},{3,3149},{4,391}]},
{siblings,[{1,3645}]}]}
riak_client:aae_fold({object_stats, <<"any_bucket">>, all, {date, {{2023, 8, 5}, {12, 0, 0}}, {{2023, 8, 5}, {13, 0, 0}}}}).
{ok,[{total_count,3643},
{total_size,19132325},
{sizes,[{2,105},{3,3147},{4,391}]},
{siblings,[{1,3643}]}]}
riak_client:aae_fold({object_stats, <<"any_bucket">>, all, {date, {{2023, 8, 5}, {12, 0, 0}}, {{2023, 8, 5}, {13, 0, 0}}}}).
{ok,[{total_count,3650},
{total_size,19171870},
{sizes,[{2,105},{3,3154},{4,391}]},
{siblings,[{1,3650}]}]}
Over a longer period of time the difference is greater.
Also, when real-time replication is enabled, object_stats differs between the sink cluster and the source cluster. There are no errors in the logs.
Could you comment on this? Is this normal behavior? Without tools for checking data integrity, we cannot be sure that the two clusters stay consistent, so we cannot rely on replication.