Avoiding Network Policy Interference with Workload Identity on GKE

Today we stumbled upon an interesting case that I want to share, as it might save you some time when debugging a similar problem.

Let's assume you run a GKE cluster with Workload Identity enabled and Kubernetes network policies in place.

Problem

You might think everything is fine, but your service, which talks to Google APIs through a client library (for example the Google Cloud Storage client library), fails with an error like:

Could not load the default credentials.
Browse to https://cloud.google.com/docs/authentication/getting-started for more information.

The first reaction is: how can this be, when I configured Workload Identity as recommended? A look behind the curtain shows that Workload Identity itself is likely not the culprit.

Cause

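In our case, the root cause was a restrictive network policy that blocked egress traffic to the GKE metadata server. With Workload Identity, the client libraries fetch their credentials from that server, so when it is unreachable they cannot find any default credentials.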
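
To confirm this diagnosis from inside an affected pod, a small probe against the metadata server can help. The following is a minimal sketch, not part of the original setup: the function names are made up for illustration, and it assumes the pod has Python available and that DNS egress is allowed so that metadata.google.internal resolves.

```python
import urllib.error
import urllib.request

# Standard metadata endpoint that returns the email of the active service account.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/email"
)


def build_metadata_request(url: str = METADATA_URL) -> urllib.request.Request:
    """Build a request carrying the header the metadata server requires."""
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})


def probe_metadata_server(timeout: float = 3.0) -> str:
    """Return a short, human-readable verdict on metadata server reachability."""
    request = build_metadata_request()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            email = response.read().decode().strip()
            return f"reachable, service account: {email}"
    except urllib.error.HTTPError as exc:
        # The server answered, so the network path is open.
        return f"reachable, but returned HTTP {exc.code}"
    except OSError as exc:
        # Covers timeouts, refused connections and DNS failures.
        return f"unreachable: {exc}"
```

Calling print(probe_metadata_server()) in the pod gives a quick signal: an "unreachable" result with a timeout points towards blocked egress, while a name resolution error suggests that DNS traffic is blocked instead.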

Solution

The solution is to allow egress traffic to the GKE metadata server. The address to allow depends on the GKE version your cluster is running.

1.21.0-gke.1000 and later

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: your-network-policy
  namespace: default
spec:
  podSelector: {}  # selects every pod in the namespace; narrow this as needed
  egress:
    - ports:
        - port: 988
          protocol: TCP
      to:
        - ipBlock:
            cidr: 169.254.169.252/32
  policyTypes:
    - Egress

Prior versions

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: your-network-policy
  namespace: default
spec:
  podSelector: {}  # selects every pod in the namespace; narrow this as needed
  egress:
    - ports:
        - port: 988
          protocol: TCP
      to:
        - ipBlock:
            cidr: 127.0.0.1/32
  policyTypes:
    - Egress
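
If you template your policies for several clusters, the version split above can be captured in a small helper. This is only a sketch: the function names are hypothetical, and it assumes version strings of the form 1.21.0-gke.1000 that can be compared component by component.

```python
import re
from typing import Tuple

# CIDRs taken from the two policies above.
CIDR_1_21_AND_LATER = "169.254.169.252/32"
CIDR_PRIOR_VERSIONS = "127.0.0.1/32"

# First version that uses the link-local address: 1.21.0-gke.1000.
THRESHOLD = (1, 21, 0, 1000)


def parse_gke_version(version: str) -> Tuple[int, ...]:
    """Turn '1.21.0-gke.1000' (optionally prefixed with 'v') into (1, 21, 0, 1000)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)-gke\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unexpected GKE version format: {version!r}")
    return tuple(int(part) for part in match.groups())


def metadata_server_cidr(version: str) -> str:
    """Pick the egress CIDR for the metadata server based on the GKE version."""
    if parse_gke_version(version) >= THRESHOLD:
        return CIDR_1_21_AND_LATER
    return CIDR_PRIOR_VERSIONS
```

For example, metadata_server_cidr("1.20.15-gke.2500") selects the loopback CIDR, while any version from 1.21.0-gke.1000 onwards selects 169.254.169.252/32. The port stays 988 in both cases.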