Understanding Google's load balancing through gcloud commands and SDN stack

Hey folks! I’m trying to understand how Google does its load balancing.
I think this is correct:
The gcloud compute addresses create command creates/registers an anycast IP address with the Google Front End (aka https://research.google/pubs/maglev-a-fast-and-reliable-software-network-load-balancer/|Maglev?).
Then commands like gcloud compute forwarding-rules create (and things like url-maps and backends) say how to route requests the GFE receives on that IP to the stuff I’ve created in GCP (like my GKE cluster in northamerica-northeast1).
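e.g. what I mean, with made-up names (and assuming the target proxy / url-map pieces already exist):
# reserve the (anycast?) IP
gcloud compute addresses create my-lb-ip --global
# send traffic arriving on that IP at port 80 to my target proxy
gcloud compute forwarding-rules create my-fwd-rule --global --address=my-lb-ip --target-http-proxy=my-proxy --ports=80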

These commands seem kinda disconnected (to me), but if they’re actually a set of commands for programming the GFE via the software-defined networking stack (https://cloud.google.com/blog/products/networking/google-cloud-networking-in-depth-how-andromeda-2-2-enables-high-throughput-vms|Andromeda?), then it feels like the dots are finally connecting.

Am I getting this right?

The 1st command reserves a static IP (regional or global) for the LB setup. The GFE is not within the user’s control; it is an abstraction layer, but yes, it is Maglev. The 2nd command is also part of the LB and routes traffic from the GFE to the actual LB. Routing LB requests to the stuff created in GCP (aka backends) is done via backend services, which are also part of the LB setup.
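Roughly, the pieces in between look like this (hypothetical names, sketching a global external HTTP LB; the docs howtos have the real sequence):
# health check + backend service: where traffic ultimately lands
gcloud compute health-checks create http my-hc --port=80
gcloud compute backend-services create my-backend --global --protocol=HTTP --health-checks=my-hc
gcloud compute backend-services add-backend my-backend --global --instance-group=my-ig --instance-group-zone=northamerica-northeast1-a
# url-map + target proxy: the routing rules applied at the GFE
gcloud compute url-maps create my-url-map --default-service=my-backend
gcloud compute target-http-proxies create my-proxy --url-map=my-url-map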

So when I create an IP, Maglev starts “anycasting” it?

No. Maglev is the GFE at edge locations. When you create an IP, Maglev learns about the LB and routes traffic from the GFE to the LB based on whether it is a global or regional LB.
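The global vs regional distinction shows up right in the commands, e.g. (made-up names):
# global: anycast IP announced at the GFE edges
gcloud compute addresses create my-global-ip --global
gcloud compute forwarding-rules create my-global-rule --global --address=my-global-ip --target-http-proxy=my-proxy --ports=80
# regional: an IP and forwarding rule pinned to one region (e.g. a network LB)
gcloud compute addresses create my-regional-ip --region=northamerica-northeast1
gcloud compute forwarding-rules create my-regional-rule --region=northamerica-northeast1 --address=my-regional-ip --target-pool=my-pool --ports=80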

If you look at some of their Terraform load balancer modules, or some of their howtos on setup that use the CLI, it may help you understand the different parts better. GCP’s load balancers have a lot of separate parts that are kind of hidden if you create them using the CLI or a Kubernetes resource.

Also, GCP has multiple load balancing implementations (internal L7, internal L4, external L7 (both classic and new flavors) and external L4).

They’re very performant and efficient, but understanding all the moving parts can be a lot (and again, both the parts involved and the underlying technology are somewhat different for all of the above). I think the newer flavor of external L7 LBs and the internal L7 LBs are both based on Envoy.
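One place the different implementations show up is the --load-balancing-scheme flag on the backend service / forwarding rule, e.g. (names made up; my-hc is a pre-existing health check):
# classic external L7 (GFE-based) vs the newer Envoy-based flavor
gcloud compute backend-services create bs-classic --global --protocol=HTTP --health-checks=my-hc --load-balancing-scheme=EXTERNAL
gcloud compute backend-services create bs-envoy --global --protocol=HTTP --health-checks=my-hc --load-balancing-scheme=EXTERNAL_MANAGED
# the internal flavors use INTERNAL (L4 passthrough) or INTERNAL_MANAGED (Envoy L7) on regional resources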

and do you have some links (to code, blog posts, videos, or terraform modules) that I should be looking at to get a better understanding?

TF module-wise, you could look at:
https://github.com/terraform-google-modules/terraform-google-lb
https://github.com/terraform-google-modules/terraform-google-lb-http
https://github.com/terraform-google-modules/terraform-google-lb-internal

As far as other posts, there are a bunch of individual howtos in the GCP docs, for example
https://cloud.google.com/load-balancing/docs/https/ext-https-lb-simple etc.
— looking at the CLI commands should give a good idea of what the moving parts are.
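Once one of those howtos is stood up, describing the resulting resources also makes the moving parts visible, e.g. (hypothetical names):
gcloud compute forwarding-rules list
gcloud compute url-maps describe my-url-map
gcloud compute backend-services describe my-backend --global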

Some of this may be now outdated, and there might be some other good Next talks as well, but I believe I saw <https://www.youtube.com/watch?v=HUHBq_VGgFg|this talk> or <https://www.youtube.com/watch?v=J5HJ1y6PeyE|this talk> in person and found it very informative.

https://cloud.google.com/blog/products/gcp/google-shares-software-network-load-balancer-design-powering-gcp-networking
https://cloud.google.com/blog/products/networking/google-cloud-networking-in-depth-cloud-load-balancing-deconstructed
https://research.google.com/pubs/archive/44824.pdf
etc. for some higher level info

I think one of the massive pros is that GCP’s L7 LBs are mostly global from the start (vs. regional), and there are some pretty sophisticated options for how to handle traffic to different regions if you have a multi-region service (vs. having to have LBs in each region and then doing DNS-based global load balancing to determine where traffic goes).
With the Envoy-based products, there are some other cool capabilities including (if memory serves) some facilities for things like traffic mirroring and chaos engineering hooks.
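On the multi-region point, e.g. (if I’m remembering the commands right; names made up) a single global backend service can just have backends from several regions, and the LB steers each user to the closest healthy one:
gcloud compute backend-services add-backend my-backend --global --instance-group=ig-montreal --instance-group-region=northamerica-northeast1
gcloud compute backend-services add-backend my-backend --global --instance-group=ig-belgium --instance-group-region=europe-west1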

OTOH, there is some lack of configurability in terms of things like being able to apply firewall rules (yes, you can apply some restrictions using Cloud Armor, but not at a lower level), choose which ports are listened on, and, in a lot of cases, how error pages show up (though there are some new options there now, I think).
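For reference, the Cloud Armor route looks roughly like this (hypothetical names/ranges), but it’s an L7 policy attached to the backend service rather than a real firewall rule:
gcloud compute security-policies create my-policy
gcloud compute security-policies rules create 1000 --security-policy=my-policy --src-ip-ranges=203.0.113.0/24 --action=deny-403
gcloud compute backend-services update my-backend --global --security-policy=my-policy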