In BigQuery when running a query, the costs associated with doing so include a “scan” of the data that may be processed as part of executing the query.
For example, if you have a 100GB table and you try to run the following query
SELECT
*
FROM myproject.mydatset.some100gbtable
Then BigQuery will charge you for scanning 100GB of data.
In the weird world that is BigQuery billing, you will still be charged for scanning 100GB of data if you use a limit clause in the SQL statement
SELECT
*
FROM myproject.mydatset.some100gbtable
LIMIT 100
So what do you do if you just want to see a small subset of rows without being charged for scanning the whole table.
We can use TABLESAMPLE
clause to only scan a fraction of a large table
SELECT
*
FROM myproject.mydatset.some100gbtable
TABLESAMPLE SYSTEM ( 10 PERCENT)

Example
Before TABLESAMPLE
288MB is scanned

After TABLESAMPLE
28MB is scanned (or 10% of total)
