Introduction
Welcome to the MeshLab repository! In this lab, you will find a setup to validate Istio configurations in a cell-based architecture. Each cell is an architecture block representing a unit of isolation and scalability. The lab defines two cells, named pasta and pizza, each composed of two clusters. Each cluster is configured with a multi-primary Istio control plane for high availability and resilience.
Although the cells share the same root CA for their cryptographic material, each one uses a different SPIFFE trustDomain and each cluster within a cell has its own intermediate CA. Locality failover is possible within the clusters of a cell, and all mTLS cross-cluster traffic flows through east-west Istio gateways because pod networks have non-routable CIDRs.
The purpose of this lab is to test and validate different Istio configurations in a realistic environment.
Helm is used to deploy:
Argo Workflows and ArgoCD are used to deploy:
Quick Start
To quickly get started with the MeshLab repository, follow these simple steps:
./bin/meshlab-multipass create
./bin/meshlab-multipass suspend
./bin/meshlab-multipass delete
Components
Pull-through registries
A pull-through registry is a proxy that sits between your local Docker installation and a remote Docker registry. It caches the images you pull from the remote registry, and if another user on the same network tries to pull the same image, the pull-through registry serves it to them directly rather than pulling it again from the remote registry. The Container Runtime Interface (CRI) in this lab is set up to use local pull-through registries for the remote registries docker.io, quay.io and ghcr.io on each cluster.
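To confirm how each cluster's CRI is wired to the local mirrors, you can inspect the k3s registries configuration on a node (a quick check, assuming the default k3s location for this file):
multipass exec pasta-1 -- sudo cat /etc/rancher/k3s/registries.yaml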
List all images in a registry:
curl -s 127.0.0.1:5011/v2/_catalog | jq # docker.io
curl -s 127.0.0.1:5012/v2/_catalog | jq # quay.io
curl -s 127.0.0.1:5013/v2/_catalog | jq # ghcr.io
List tags for a given image:
curl -s 127.0.0.1:5012/v2/argoproj/argocd/tags/list | jq
Get the manifest for a given image and tag:
curl -s http://127.0.0.1:5012/v2/argoproj/argocd/manifests/v2.4.7 | jq
Multipass
Multipass from Canonical is a tool for launching, managing, and orchestrating Linux virtual machines on local computers, simplifying the process for development, testing, and other purposes. It provides a user-friendly command-line interface and integrates with other tools for automation and customization.
Stop/start multipassd:
sudo launchctl unload /Library/LaunchDaemons/com.canonical.multipassd.plist
sudo launchctl load -w /Library/LaunchDaemons/com.canonical.multipassd.plist
Restart multipassd:
sudo launchctl kickstart -k system/com.canonical.multipassd
Directories of interest:
sudo tree /var/root/Library/Caches/multipassd
sudo tree /var/root/Library/Application\ Support/multipassd
sudo tree /Library/Application\ Support/com.canonical.multipass
List all available instances:
multipass list
Display information about all instances:
multipass info
Open a shell on a running instance:
multipass shell pasta-1
Tail the logs:
sudo tail -f /Library/Logs/Multipass/multipassd.log
Hypervisor.framework
The drivers utilized on macOS, specifically HyperKit and QEMU, rely on macOS' Hypervisor.framework to manage the networking stack for the instances. When an instance is created, the Hypervisor.framework on the host employs macOS' 'Internet Sharing' mechanism to establish a virtual switch. Each instance is then connected to this switch with a subnet address from:
$ sudo cat /Library/preferences/SystemConfiguration/com.apple.vmnet.plist | grep -A1 Shared_Net_Address
Password:
<key>Shared_Net_Address</key>
<string>192.168.65.1</string>
Furthermore, the host provides DHCP and DNS resolution services on this switch through the IP address 192.168.65.1, facilitated by the bootpd and mDNSResponder services running on the host machine. It is worth noting that attempting to manually edit the configuration file /etc/bootpd.plist is futile, as macOS will regenerate it according to its own preferences.
Is the bootpd DHCP server alive?
sudo lsof -iUDP:67 -n -P
Start it:
sudo launchctl load -w /System/Library/LaunchDaemons/bootps.plist
Flush all DHCP leases:
sudo launchctl stop com.apple.bootpd
sudo rm -f /var/db/dhcpd_leases
sudo launchctl start com.apple.bootpd
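Inspect the current DHCP leases handed out to the instances:
sudo cat /var/db/dhcpd_leases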
It appears that at a certain juncture, docker and multipass ceased to share the same network bridge. Whichever starts first will occupy bridge100 with the IP address 192.168.64.1, while the subsequent one will take bridge101 with the IP address 192.168.65.1. Upon repeatedly stopping and starting these services, you will notice a sequential increment in the third octet of the Shared_Net_Address.
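To check which bridge each service currently occupies (the interface only exists while the corresponding service is running):
ifconfig bridge100
ifconfig bridge101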
Cloud-init
cloud-init is a tool used to configure virtual machine instances in the cloud during their first boot. It simplifies the provisioning process, enabling quick setup of new environments with desired configurations. The following commands provide examples for monitoring and inspecting the cloud-init process on various nodes in the system, including logs and scripts run during the instance's first boot.
Tail the cloud-init logs:
multipass exec mnger-1 -- tail -f /var/log/cloud-init-output.log
multipass exec pasta-1 -- tail -f /var/log/cloud-init-output.log
multipass exec pasta-2 -- tail -f /var/log/cloud-init-output.log
Inspect the rendered runcmd:
multipass exec mnger-1 -- sudo cat /var/lib/cloud/instance/scripts/runcmd
multipass exec pasta-1 -- sudo cat /var/lib/cloud/instance/scripts/runcmd
multipass exec pasta-2 -- sudo cat /var/lib/cloud/instance/scripts/runcmd
multipass exec virt-01 -- sudo cat /var/lib/cloud/instance/scripts/runcmd
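Check whether cloud-init finished successfully on a node (any of the instances above can be substituted):
multipass exec mnger-1 -- cloud-init status --long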
k3s
k3s is a lightweight version of Kubernetes designed for resource-constrained environments like IoT devices and edge computing. It requires fewer resources and has additional features such as simplified installation and compatibility with ARM architectures.
Run config check:
multipass exec pasta-1 -- bash -c "sudo k3s check-config"
multipass exec pasta-2 -- bash -c "sudo k3s check-config"
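As a quick sanity check, print the k3s version and node status (this assumes the embedded kubectl shipped with k3s):
multipass exec pasta-1 -- k3s --version
multipass exec pasta-1 -- bash -c "sudo k3s kubectl get nodes -o wide"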
Cilium
Cilium is an open source, cloud native solution for providing, securing, and observing network connectivity between workloads, fueled by the revolutionary Kernel technology eBPF.
Display status:
cilium --context pasta-1 status
Show status of ClusterMesh:
cilium --context pasta-1 clustermesh status
Display status of daemon:
k --context pasta-1 -n kube-system exec ds/cilium -c cilium-agent -- cilium-dbg status
Display full details:
k --context pasta-1 -n kube-system exec ds/cilium -c cilium-agent -- cilium-dbg status --verbose
List services:
k --context pasta-1 -n kube-system exec ds/cilium -c cilium-agent -- cilium-dbg service list
Troubleshoot connectivity towards remote clusters:
k --context pasta-1 -n kube-system exec ds/cilium -c cilium-agent -- cilium-dbg troubleshoot clustermesh
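Optionally, run the Cilium CLI's built-in connectivity test suite; note that it deploys temporary test workloads into the cluster:
cilium --context pasta-1 connectivity test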
ArgoCD
ArgoCD is a GitOps platform for Kubernetes applications that enables continuous delivery with declarative management and automation of deployments from Git repositories to multiple clusters. With its user-friendly interface, robust features, and deep Kubernetes integration, ArgoCD is a popular choice for automating application delivery.
List all the applications:
argocd app list
Manually sync applications:
argocd app sync -l name=istio-issuers --async
argocd app sync -l name=istio-base --async
argocd app sync -l name=istio-cni --async
argocd app sync -l name=istio-istiod --async
argocd app sync -l name=istio-nsgw --async
argocd app sync -l name=istio-ewgw --async
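To block until a group of applications reports healthy and synced, the same label selectors can be reused:
argocd app wait -l name=istio-istiod --health --sync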
CoreDNS
CoreDNS is a flexible, extensible DNS server that can be easily configured to provide custom DNS resolutions in Kubernetes clusters. It allows for dynamic updates, service discovery, and integration with external data sources, making it a popular choice for service discovery and network management in cloud-native environments.
Create DNS records for demo.lab:
k --context pasta-1 -n kube-system create configmap coredns-custom --from-literal=demo.server='demo.lab {
hosts {
ttl 60
192.168.65.3 worker.service-1.demo.lab
192.168.65.3 worker.service-2.demo.lab
fallthrough
}
}'
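To verify the custom records resolve from inside the cluster, a throwaway pod can be used (the busybox image and pod name here are arbitrary choices):
k --context pasta-1 run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup worker.service-1.demo.lab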
Vault
Vault is an open-source tool from HashiCorp for securely storing and accessing secrets such as tokens, passwords, certificates, and encryption keys. Its PKI secrets engine can also act as a certificate authority and issue TLS certificates on demand.
cert-manager
cert-manager is open-source software that helps automate the management and issuance of TLS/SSL certificates in Kubernetes clusters. It integrates with various certificate authorities (CAs) and can automatically renew certificates before they expire, ensuring secure communication between services running in the cluster.
Print the cert-manager CLI version and the deployed cert-manager version:
cmctl --context pasta-1 version
This check attempts to perform a dry-run create of a cert-manager v1alpha2 Certificate resource in order to verify that CRDs are installed and all the required webhooks are reachable by the K8S API server. We use the v1alpha2 API to ensure that the API server has also connected to the cert-manager conversion webhook:
cmctl check api --context pasta-1
Get details about the current status of a cert-manager Certificate resource, including information on related resources like CertificateRequest or Order:
cmctl --context pasta-1 --namespace istio-system status certificate istio-cluster-ica
Mark cert-manager Certificate resources for manual renewal:
cmctl renew --context pasta-1 --namespace istio-system istio-cluster-ica
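List the cert-manager Certificate and CertificateRequest resources in istio-system and check their readiness:
k --context pasta-1 -n istio-system get certificates,certificaterequests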
Istio
Istio is an open-source service mesh platform that provides traffic management, policy enforcement, and telemetry collection for microservices applications. It helps in improving the reliability, security, and observability of service-to-service communication in a cloud-native environment. By integrating with popular platforms such as Kubernetes, Istio makes it easier to manage the complexities of microservices architecture.
List the remote clusters each istiod instance is connected to:
istioctl --context pasta-1 remote-clusters
Access the istiod WebUI:
istioctl --context pasta-1 dashboard controlz deployment/istiod-1-22-2.istio-system
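Verify that every sidecar and gateway is in sync with the control plane:
istioctl --context pasta-1 proxy-status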
klipper-lb
klipper-lb uses a host port for each Service of type LoadBalancer and sets up iptables to forward the request to the cluster IP. The regular k8s scheduler will find a free host port. If there are no free host ports, the Service will stay in pending. There is one DaemonSet per Service of type LoadBalancer, and each Pod has one container per exposed Service port.
List the containers fronting the exposed argocd-server ports:
k --context mnger-1 -n kube-system get ds -l svccontroller.k3s.cattle.io/svcname=argocd-server -o yaml | yq '.items[].spec.template.spec.containers[].name'
List the containers fronting the exposed istio-eastwestgateway ports:
k --context pasta-1 -n kube-system get ds -l svccontroller.k3s.cattle.io/svcname=istio-eastwestgateway -o yaml | yq '.items[].spec.template.spec.containers[].name'
List the containers fronting the exposed istio-ingressgateway ports:
k --context pasta-1 -n kube-system get ds -l svccontroller.k3s.cattle.io/svcname=istio-ingressgateway -o yaml | yq '.items[].spec.template.spec.containers[].name'
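To see which host ports those containers actually bind, the same label selector can be reused:
k --context pasta-1 -n kube-system get ds -l svccontroller.k3s.cattle.io/svcname=istio-ingressgateway -o yaml | yq '.items[].spec.template.spec.containers[].ports'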
Envoy
Envoy is an open-source proxy server designed for modern microservices architectures, providing features such as load balancing, traffic management, and service discovery. It runs standalone or integrated with a service mesh, making it a powerful tool for microservices communication.
Inspect the config_dump of a VM:
multipass exec virt-01 -- curl -s localhost:15000/config_dump | istioctl pc listeners --file -
multipass exec virt-01 -- curl -s localhost:15000/config_dump | istioctl pc routes --file -
multipass exec virt-01 -- curl -s localhost:15000/config_dump | istioctl pc clusters --file -
multipass exec virt-01 -- curl -s localhost:15000/config_dump | istioctl pc secret --file -
Set debug log level on a given proxy:
istioctl pc log sleep-xxx.httpbin --level debug
k --context pasta-1 -n httpbin logs -f sleep-xxx -c istio-proxy
Access the WebUI of a given envoy proxy:
istioctl dashboard envoy sleep-xxx.httpbin
Dump the envoy config of an east-west gateway:
k --context pasta-1 -n istio-system exec -it deployment/istio-eastwestgateway -- curl -s localhost:15000/config_dump
Dump the common_tls_context for a given envoy cluster:
k --context pasta-1 -n httpbin exec -i sleep-xxx -- \
curl -s localhost:15000/config_dump | jq '
.configs[] |
select(."@type"=="type.googleapis.com/envoy.admin.v3.ClustersConfigDump") |
.dynamic_active_clusters[] |
select(.cluster.name=="outbound|80||httpbin.httpbin.svc.cluster.local") |
.cluster.transport_socket_matches[] |
select(.name=="tlsMode-istio") |
.transport_socket.typed_config.common_tls_context
'
List LISTEN ports:
k --context pasta-1 -n istio-system exec istio-eastwestgateway-xxx -- netstat -tuanp | grep LISTEN | sort -u
Check the status-port:
curl -o /dev/null -Isw "%{http_code}" http://10.0.16.124:31123/healthz/ready
Testing
Send requests to service-1 from an unauthenticated out-of-cluster workstation via the north-south Istio ingress gateway:
IP=$(multipass list | awk '/pasta-1/ {print $3}')
curl -sk --resolve service-1.demo.lab:443:${IP} https://service-1.demo.lab/data | jq -r '.podName'
Same as above but with certificate validation:
IP=$(multipass list | awk '/pasta-1/ {print $3}')
k --context pasta-1 -n istio-system get secret cacerts -o json | jq -r '.data."ca.crt"' | base64 -d > /tmp/ca.crt
curl -s --cacert /tmp/ca.crt --resolve service-1.demo.lab:443:${IP} https://service-1.demo.lab/data | jq -r '.podName'
Locality load balancing
Istio's Locality Load Balancing (LLB) is a feature that helps distribute traffic across different geographic locations in a way that minimizes latency and maximizes availability. It routes traffic to the closest available instance of the service, reducing network hops and improving performance, while also providing fault tolerance and resilience. LLB is important for managing microservices architectures.
From the perspective of istio-nsgw: get the endpoints, priority, and weight of service-1:
# Get a running pod name
POD=$(k --context pasta-1 -n istio-system get po -l istio=nsgw --no-headers | awk 'NR==1{print $1}')
# Add an ephemeral container to the running pod
k --context pasta-1 -n istio-system debug -it \
--attach=false --image=istio/base --target=istio-proxy --container=debugger \
${POD} -- bash
# Watch for the endpoints
watch "istioctl --context pasta-1 -n istio-system pc endpoint deploy/istio-nsgw | grep -E '^END|service-1'; echo; k --context pasta-1 -n istio-system exec -it ${POD} -c debugger -- curl -X POST localhost:15000/clusters | grep '^outbound.*service-1' | grep -E 'zone|region|::priority|::weight' | sort | sed -e '/:zone:/s/$/\n/'"
TLS
TLS 1.3 is the latest version of the TLS protocol. TLS, which is used by HTTPS and other network protocols for encryption, is the modern version of SSL. TLS 1.3 dropped support for older, less secure cryptographic features, and it speeds up TLS handshakes, among other improvements.
Setup a place to dump the crypto material:
k --context pasta-1 -n httpbin patch deployment sleep --type merge -p '
spec:
template:
metadata:
annotations:
sidecar.istio.io/userVolume: "[{\"name\":\"sniff\", \"emptyDir\":{\"medium\":\"Memory\"}}]"
sidecar.istio.io/userVolumeMount: "[{\"name\":\"sniff\", \"mountPath\":\"/sniff\"}]"
proxy.istio.io/config: |
proxyMetadata:
OUTPUT_CERTS: /sniff
'
Write the required per-session TLS secrets to a file (source):
k --context pasta-1 apply -f - << EOF
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: httpbin
namespace: httpbin
spec:
workloadSelector:
labels:
app: sleep
configPatches:
- applyTo: CLUSTER
match:
context: SIDECAR_OUTBOUND
cluster:
service: "httpbin.httpbin.svc.cluster.local"
portNumber: 80
patch:
operation: MERGE
value:
transport_socket:
name: "envoy.transport_sockets.tls"
typed_config:
"@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext"
common_tls_context:
key_log:
path: /sniff/keylog
EOF
Restart envoy to kill all TCP connections and force new TLS handshakes:
k --context pasta-1 -n httpbin exec -it deployment/sleep -c istio-proxy -- curl -X POST localhost:15000/quitquitquit
Optionally, use this command to list all available endpoints:
istioctl --context pasta-1 pc endpoint deploy/httpbin.httpbin | egrep '^END|httpbin'
Start tcpdump:
k --context pasta-1 -n httpbin exec -it deployment/sleep -c istio-proxy -- sudo tcpdump -s0 -w /sniff/dump.pcap
Send a few requests to the endpoints listed above:
k --context pasta-1 -n httpbin exec -i deployment/sleep -- curl -s httpbin/hostname | jq -r '.hostname'
Stop tcpdump and download everything:
k --context pasta-1 -n httpbin cp -c istio-proxy sleep-xxx:sniff ~/sniff
Open it with Wireshark:
open ~/sniff/dump.pcap
Filter by tls.handshake.type == 1 and follow the TLS stream of a Client Hello packet.
Right click a TLSv1.3 packet, then Protocol Preferences --> Transport Layer Security --> (Pre)-Master-Secret log filename, and provide the path to the keylog file.
Certificates
Find below a collection of commands to troubleshoot certificate issues.
Connect to the externally exposed istiod service and inspect the certificate bundle it presents:
step certificate inspect --bundle --servername istiod-1-19-6.istio-system.svc https://192.168.65.3:15012 --roots /path/to/root-ca.pem
step certificate inspect --bundle --servername istiod-1-19-6.istio-system.svc https://192.168.65.3:15012 --insecure
Inspect the certificate chain provided by a given workload:
istioctl --context pasta-1 pc secret httpbin-xxxxxxxxxx-yyyyy.httpbin -o json | jq -r '.dynamicActiveSecrets[] | select(.name=="default") | .secret.tlsCertificate.certificateChain.inlineBytes' | base64 -d | step certificate inspect --bundle
Inspect the certificate root CA present in a given workload:
istioctl --context pasta-1 pc secret sleep-xxxxxxxxxx-yyyyy.httpbin -o json | jq -r '.dynamicActiveSecrets[] | select(.name=="ROOTCA") | .secret.validationContext.trustedCa.inlineBytes' | base64 -d | step certificate inspect --bundle
Similar to the above, but this time as a client:
k --context pasta-1 -n httpbin exec -it deployment/sleep -c istio-proxy -- openssl s_client -showcerts httpbin:80
Get details about the status of a cert-manager managed certificate:
cmctl --context pasta-1 --namespace applab-blau status certificate blau
Development
Provision only one VM:
source ./lib/misc.sh && launch_k8s mnger-1
source ./lib/misc.sh && launch_vms virt-01
Debug
Add locality info:
k --context pasta-1 -n httpbin patch workloadentries httpbin-192.168.65.5-vm-network --type merge -p '{"spec":{"locality":"milky-way/solar-system/virt-01"}}'
k --context pasta-1 -n httpbin patch deployment sleep --type merge -p '{"spec":{"template":{"metadata":{"labels":{"istio-locality":"milky-way.solar-system.pasta-1"}}}}}'
k --context pasta-1 -n httpbin label pod sleep-xxxx topology.istio.io/subzone=pasta-1 topology.kubernetes.io/region=milky-way topology.kubernetes.io/zone=solar-system
k --context pasta-1 -n httpbin patch deployment sleep --type merge -p '{"spec":{"template":{"metadata":{"labels":{
"topology.kubernetes.io/region":"milky-way",
"topology.kubernetes.io/zone":"solar-system",
"topology.istio.io/subzone":"pasta-1"
}}}}}'
Delete locality info:
k --context pasta-1 -n httpbin patch workloadentries httpbin-192.168.65.5-vm-network --type json -p '[{"op": "remove", "path": "/spec/locality"}]'
k --context pasta-1 -n httpbin patch deployment sleep --type json -p '[{"op": "remove", "path": "/spec/template/metadata/labels/istio-locality"}]'
k --context pasta-1 -n httpbin label pod sleep-xxxx topology.istio.io/subzone- topology.kubernetes.io/region- topology.kubernetes.io/zone-
Set debug images:
k --context pasta-1 -n istio-system set image deployment/istiod-1-19-6 discovery=docker.io/h0tbird/pilot:1.19.6
k --context pasta-1 -n httpbin patch deployment sleep --type merge -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/proxyImage":"docker.io/h0tbird/proxyv2:1.19.6"}}}}}'
Unset debug images:
k --context pasta-1 -n istio-system set image deployment/istiod-1-19-6 discovery=docker.io/istio/pilot:1.19.6
k --context pasta-1 -n httpbin patch deployment sleep --type merge -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/proxyImage":"docker.io/istio/proxyv2:1.19.6"}}}}}'
Debug:
k --context pasta-1 -n httpbin exec -it deployments/sleep -c istio-proxy -- sudo bash -c 'echo 0 > /proc/sys/kernel/yama/ptrace_scope'
k --context pasta-1 -n istio-system exec -it deployments/istiod-1-19-6 -- dlv dap --listen=:40000 --log=true
k --context pasta-1 -n istio-system port-forward deployments/istiod-1-19-6 40000:40000