Adding a new column to a large table in long-term storage

should have similar performance

But only with Iceberg? Or also with other formats?

Obviously it’ll depend on the workload, but learning more about Iceberg made me understand BigQuery better. They’re clearly based on similar formats and styles.

I see, I’ll give it a look. Just by moving to physical storage we already saved 10x on a few massive tables.

Iceberg has the best BQ support
https://cloud.google.com/bigquery/docs/query-iceberg-data

If we 10x the data we’ll definitely look for more optimizations

BQ has a private preview running for Managed Iceberg, with several plans moving forward
https://www.youtube.com/watch?v=4d4nqKkANdM&ab_channel=ApacheIceberg

The other reason for us is the ability to use other query engines. Our biggest cost is query analysis, so being able to move some queries to DuckDB, StarRocks, or another engine could save us a boatload.

Super interesting - latency for ad hoc analysis is always our concern - e.g. an analyst's time waiting for a query response, plus the data engineer cost to spin up the federated data source, vs just doing it through BQ

Regarding the BQ colab notebook, does anyone have any good practice material?