What is data sampling
Data sampling is the practice of analyzing a subset of data in order to uncover meaningful information about a larger data set. It lets you retrieve results more quickly with minimal impact on data quality.
For example, if you wanted to estimate the number of trees in a 100-acre area where the distribution of trees was fairly uniform, you could count the trees in 1 acre and multiply by 100, or count the trees in half an acre and multiply by 200, to get a reasonable estimate for the entire 100 acres.
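The tree example above can be sketched in a few lines of Python. This is only an illustration of the scale-up idea, not how Analytics samples data: the function name `estimate_total`, the 10% sample fraction, and the per-acre tree counts are all made up for the example.

```python
import random

def estimate_total(values, sample_fraction, seed=0):
    """Estimate sum(values) from a uniform random sample, scaled up."""
    rng = random.Random(seed)
    sample_size = max(1, int(len(values) * sample_fraction))
    sample = rng.sample(values, sample_size)
    # Scale the sampled sum by the inverse of the fraction actually sampled.
    scale = len(values) / sample_size
    return sum(sample) * scale

# Hypothetical data: 100 acres with a fairly uniform 45 to 55 trees each.
rng = random.Random(42)
trees_per_acre = [rng.randint(45, 55) for _ in range(100)]

actual = sum(trees_per_acre)
estimate = estimate_total(trees_per_acre, sample_fraction=0.10)
print(f"actual={actual}, estimate from 10 acres={estimate:.0f}")
```

Because the per-acre counts are close to uniform, counting only 10 of the 100 acres already lands near the true total. The less uniform the data, the larger the sample needs to be for the same accuracy.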
Why you see data sampling
In Google Analytics, data sampling may occur when the number of events used to create a report, exploration, or request exceeds the quota limit for your property. When this happens, Analytics uses a portion of the data and then scales up to provide directionally accurate results that are representative of all your data.
When your results use sampling, the data quality icon shows the percentage of data used to create the results. The larger the sample, the more accurate the results.
What are the limits
The quota limit for event-level queries is 10 million events for standard Google Analytics properties and up to 1 billion events for Google Analytics 360 properties.
Google Analytics 360 properties have an initial default of 100 million events per query, to provide you with faster, directionally accurate results. When you need greater accuracy, you can access the higher sampling limit in Explore by selecting "more detailed results" from the data quality icon.
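As a rough illustration of how these limits relate to the percentage shown in the data quality icon, the sketch below assumes the sampled share is simply the quota divided by the number of events in the query. That formula is an assumption made for the example, not a documented Analytics calculation, and the constant and function names are hypothetical.

```python
# Quota limits stated above, in events per query.
STANDARD_QUOTA = 10_000_000
GA360_DEFAULT_QUOTA = 100_000_000
GA360_MAX_QUOTA = 1_000_000_000

def sampling_percentage(event_count, quota):
    """Approximate share of events used, assuming sample = quota / events."""
    if event_count <= quota:
        return 100.0  # under the quota, so no sampling is needed
    return 100.0 * quota / event_count

# A hypothetical query that touches 400 million events.
events = 400_000_000
for label, quota in [
    ("standard", STANDARD_QUOTA),
    ("360 default", GA360_DEFAULT_QUOTA),
    ("360 more detailed results", GA360_MAX_QUOTA),
]:
    print(f"{label}: {sampling_percentage(events, quota):.1f}% of events used")
```

Under that assumption, the same query would use 2.5% of events on a standard property, 25% with the 360 default, and 100% with the higher 360 limit.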
What about unsampled data
For unsampled reports, Google Analytics uses HyperLogLog++ (HLL++) to estimate distinct counts for the most frequently used metrics, such as Active users and Sessions. These counts are close approximations rather than exact values. Using HLL++ ensures better performance, higher estimation accuracy, and lower error bounds. You can also use HLL++ with your Google Analytics data in BigQuery. Learn more about Unique count approximation in Google Analytics.
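To give a feel for how this kind of distinct-count estimation works, here is a minimal sketch of the original HyperLogLog algorithm in Python. It is not HLL++ and not the Analytics implementation: HLL++ adds bias correction and a sparse representation that are omitted here. The class name, the precision `p=12`, and the use of SHA-1 as the hash function are all choices made for this example.

```python
import hashlib
import math

class HyperLogLog:
    """Plain HyperLogLog sketch (illustrative, not HLL++)."""

    def __init__(self, p=12):
        self.p = p                    # number of bits used to pick a register
        self.m = 1 << p               # number of registers
        self.registers = [0] * self.m
        # Standard bias constant for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")      # 64-bit hash value
        idx = h >> (64 - self.p)                   # first p bits: register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return estimate

# Hypothetical event stream: 50,000 distinct users, each seen twice.
hll = HyperLogLog()
for _ in range(2):
    for user_id in range(50_000):
        hll.add(f"user-{user_id}")

approx_users = hll.count()
print(f"estimated distinct users: {approx_users:.0f}")
```

The sketch keeps only 4,096 small registers no matter how many events it sees, and repeated users do not change the result. With this precision the typical error is on the order of 1 to 2 percent, which is why the counts are described as estimates.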