Cloud Native

Kubernetes Network Observability: Monitoring Container Traffic

Kubernetes networking is complex and dynamic. Learn how to gain visibility into pod-to-pod traffic, service mesh metrics, and network policies.

Why Kubernetes Networking Is Different

Traditional network monitoring tracks physical interfaces and static IP addresses. Kubernetes throws this out the window: pods are ephemeral, IPs change constantly, and traffic flows through virtual overlays. The network fabric itself is software-defined and abstracted.

The Challenge

A pod might exist for 30 seconds. By the time you investigate an alert, the pod is gone, its IP reassigned, and logs scattered across nodes. You need observability built into the platform, not bolted on.

Kubernetes Network Architecture

Understanding what to monitor starts with understanding the layers:

Pod Network

Every pod gets a unique IP. Pods can communicate directly without NAT. Implemented by CNI plugins like Calico, Cilium, or Flannel.

Service Network

ClusterIP services provide stable endpoints. kube-proxy or eBPF handles load balancing. Service IPs exist only in iptables/IPVS rules.
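As a concrete sketch, a minimal ClusterIP service might look like the following (the name, labels, and ports are illustrative, not from a real deployment):

```yaml
# Hypothetical ClusterIP service: routes service port 80 to pods labeled app: web
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web          # pods matching this label become endpoints
  ports:
    - port: 80        # stable virtual service port
      targetPort: 8080  # container port on the backing pods
```

The service IP assigned to `web` never appears on any interface; kube-proxy (or an eBPF datapath) rewrites connections to one of the matching pod IPs, which is why packet captures on a node show pod IPs rather than service IPs.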

Ingress Layer

External traffic enters through Ingress controllers or LoadBalancer services. This is where external monitoring typically starts.

Network Policies

Kubernetes-native firewalling. Defines allowed traffic between pods. Enforcement depends on CNI plugin capabilities.
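A common starting point is a default-deny policy, which blocks all ingress to pods in a namespace until explicit allow rules are added. A minimal sketch (the namespace name is hypothetical):

```yaml
# Default-deny ingress for every pod in a namespace (illustrative)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production   # hypothetical namespace
spec:
  podSelector: {}         # empty selector matches all pods in the namespace
  policyTypes:
    - Ingress             # Ingress listed with no rules = all inbound traffic denied
```

Remember that this only takes effect if the installed CNI plugin enforces NetworkPolicy; on a CNI without policy support, the object is accepted by the API server but silently ignored.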

Key Metrics to Monitor

Metric                                  Source               What It Reveals
container_network_receive_bytes_total   cAdvisor             Pod ingress traffic volume
container_network_transmit_bytes_total  cAdvisor             Pod egress traffic volume
kube_pod_status_phase                   kube-state-metrics   Pod lifecycle state
node_network_receive_drop_total         node_exporter        Node-level packet drops
nginx_ingress_controller_requests       Ingress controller   External request rates

# PromQL: Network bytes by pod (top 10)
topk(10,
  sum by (pod) (
    rate(container_network_receive_bytes_total[5m])
  )
)

# PromQL: Pods with network errors
sum by (pod) (
  rate(container_network_receive_errors_total[5m])
) > 0

CNI Plugin Observability

Your CNI choice determines available network visibility:

CNI       Observability Features
Cilium    eBPF-based flow visibility, Hubble UI, L7 policy metrics
Calico    Flow logs, network policy metrics, Felix stats
Weave     Connection tracking, Prometheus metrics endpoint
Flannel   Basic; relies on node-level monitoring

Recommendation: If network observability is important, choose Cilium. Its eBPF-based Hubble provides the deepest visibility into Kubernetes networking with minimal performance overhead.
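For a sense of what that looks like in practice, here is a sketch of the Hubble CLI (assuming `hubble` is installed and can reach the Hubble Relay; the pod name is illustrative):

```shell
# Show flows that were dropped, e.g. by a network policy
hubble observe --verdict DROPPED

# Show TCP flows involving a hypothetical pod named "frontend"
hubble observe --pod frontend --protocol TCP
```

Each flow record includes source, destination, protocol, and verdict, which makes it possible to answer "who tried to talk to whom, and was it allowed" without packet captures.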

Service Mesh Metrics

If you're running Istio, Linkerd, or similar service mesh:

Request Metrics

Request count, latency histograms, error rates per service pair. The mesh sidecar captures every request without application changes.

TCP Metrics

Bytes sent/received, connection duration, active connections. Useful for non-HTTP traffic.

mTLS Status

Track which connections are encrypted. Identify services that haven't enrolled in the mesh or policy violations.

# Istio: Request rate by source and destination
sum by (source_workload, destination_workload) (
  rate(istio_requests_total[5m])
)

# Istio: P99 latency
histogram_quantile(0.99,
  sum by (destination_service, le) (
    rate(istio_request_duration_milliseconds_bucket[5m])
  )
)

Network Policy Monitoring

Network policies are only useful if you know they're working:

  • Policy hit counts: How often each policy allows or denies traffic. Unused policies might be misconfigured.
  • Denied connections: Alert on unexpected denials. Might indicate legitimate traffic being blocked or attack attempts.
  • Policy coverage: What percentage of pods are protected by policies? Default-deny should be the goal.
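The denied-connections signal above can be wired into a Prometheus alerting rule. A sketch, assuming Cilium's `cilium_drop_count_total` metric (other CNIs expose drop counters under different names, and the exact label values vary by Cilium version):

```yaml
# Prometheus alerting rule (sketch): fire when the CNI reports policy drops
groups:
  - name: network-policy
    rules:
      - alert: PolicyDeniedTraffic
        # "Policy denied" is one of Cilium's drop reasons; verify against your version
        expr: sum by (reason) (rate(cilium_drop_count_total{reason="Policy denied"}[5m])) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Traffic is being denied by network policy (reason: {{ $labels.reason }})"
```

The `for: 10m` clause filters out one-off denials so the alert only fires on sustained blocking, which is the pattern that usually indicates a misconfigured policy.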

Troubleshooting Network Issues

Common Kubernetes networking problems and how to diagnose them:

Symptom                       Possible Cause                              How to Check
Pods can't reach each other   CNI misconfiguration, network policy        kubectl exec + ping/curl, policy audit
Service DNS not resolving     CoreDNS issues, service selector            nslookup from pod, check endpoints
Intermittent timeouts         Node network saturation, conntrack limits   Node metrics, conntrack stats
External traffic failing      Ingress config, load balancer health        Ingress controller logs, LB status
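The "How to Check" column maps to a handful of kubectl commands. A sketch (pod, service, and namespace names are illustrative; the ingress-nginx path assumes a standard ingress-nginx install):

```shell
# Pod-to-pod reachability: curl another pod's IP from inside a client pod
kubectl exec -it client-pod -- curl -sv http://10.244.1.5:8080

# Service DNS: resolve the service name from inside a pod
kubectl exec -it client-pod -- nslookup my-service.default.svc.cluster.local

# Empty endpoints usually mean the service selector matches no pods
kubectl get endpoints my-service

# Ingress problems: read the controller's recent logs
kubectl logs -n ingress-nginx deploy/ingress-nginx-controller --tail=50
```

Running the reachability and DNS checks from inside a pod matters: node-level tools sit outside the pod network namespace and can give misleading results.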

Recommended Observability Stack

Prometheus + Grafana

Standard for Kubernetes metrics. Scrape cAdvisor, kube-state-metrics, and CNI exporters. Pre-built dashboards available.
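A common way to wire this up is annotation-based pod discovery. A sketch of the relevant Prometheus scrape config, following the widely used `prometheus.io/scrape` annotation convention (your cluster may use ServiceMonitors via the Prometheus Operator instead):

```yaml
# Prometheus scrape sketch: discover pods that opt in via annotations
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod           # discover every pod via the Kubernetes API
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
```

cAdvisor metrics are a separate case: they are exposed by the kubelet itself, so they are typically scraped through a `role: node` job rather than per-pod discovery.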

Hubble (with Cilium)

Deep flow visibility. See every connection between pods with source, destination, protocol, and verdict. UI and CLI available.

Kiali (with Istio)

Service mesh visualization. Shows traffic flow between services, health status, and configuration validation.