A service mesh in Kubernetes is a useful tool. It offers efficient networking, connection routing, encryption, and enhanced monitoring of communication inside the cluster. However, deployment and certificate handling can sometimes be a pain, especially in the GitOps era. This article describes how to deploy Linkerd without using the CLI tool, with full Cert-Manager and Trust-Manager integration, using Argo CD in a Kubernetes cluster.
Background—Argo Rollouts
Any DevOps engineer I’ve talked with asks the same question: “Why do we even need a service mesh?! Just for show?” In our case, no, we have an actual, valid reason: traffic management. In the ContextAI project, we decided to introduce a canary deployment pipeline using Argo Rollouts. From the Argo Rollouts website:
“Argo Rollouts is a Kubernetes controller and set of CRDs which provide advanced deployment capabilities such as blue-green, canary, canary analysis, experimentation, and progressive delivery features to Kubernetes.”
Argo Rollouts uses the service mesh in Kubernetes to route connections to new deployments according to a configured deployment strategy. The decision about the next stage of the deployment is made based on response-code metrics. Each step of deployment propagation is defined by the developer in the application manifest.
Argo Rollouts works both with the native Kubernetes Ingress Controller and with multiple types of service meshes. Native traffic management in Kubernetes doesn’t have the tools needed for fine-grained traffic routing, like redirecting a percentage of traffic to specific pods. For that type of task, we decided to implement a service mesh.
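To make this concrete, here is a minimal sketch of such a Rollout manifest. The application name, image, services, and step weights are illustrative placeholders rather than values from our setup; with Linkerd, Argo Rollouts routes traffic through the SMI TrafficSplit API, so the canary strategy references a stable and a canary service.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app                       # hypothetical application name
spec:
  replicas: 5
  selector:
    matchLabels:
      app: my-app
  strategy:
    canary:
      stableService: my-app-stable   # existing Service pointing at the stable pods
      canaryService: my-app-canary   # existing Service pointing at the canary pods
      trafficRouting:
        smi: {}                      # Linkerd consumes the generated SMI TrafficSplit
      steps:
        - setWeight: 10              # shift 10% of traffic to the new version
        - pause: { duration: 5m }    # hold while response-code metrics are evaluated
        - setWeight: 50
        - pause: { duration: 5m }
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0.0   # hypothetical image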
Service mesh—Linkerd
From Wikipedia:
“Linkerd is Cloud Native Computing Foundation’s fifth member project and the project that coined the term “service mesh.” Linkerd adds observability, security, and reliability features to applications by adding them to the platform rather than the application layer, and features a “micro-proxy” to maximize speed and security of its data plane. Linkerd graduated from CNCF in July 2021.”
Linkerd is one of the most well-known service meshes. Benchmarks have repeatedly shown it to be among the fastest and least resource-hungry service meshes available. Meshing new services is as easy as adding an annotation to the workload or namespace manifest, as in the sketch below. Linkerd Viz offers a dashboard that can be used to monitor traffic in a cluster.
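As a minimal sketch (the Deployment name and image are hypothetical placeholders), meshing a workload looks like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        linkerd.io/inject: enabled   # tells Linkerd to inject its sidecar proxy into these pods
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0.0   # hypothetical image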
Installation by CLI is also simple and straightforward, but nowadays, with GitOps and tools like Argo CD, this kind of installation may feel a little out of date. Another consideration is in-cluster certificate handling, which can be managed more effectively by Cert-Manager.
Cert-Manager and Trust-Manager are designed to simplify the issuing and renewal of certificates in a Kubernetes cluster. They are most widely used in combination with Let’s Encrypt certificates.
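For illustration only (the Linkerd setup below relies on self-signed certificates instead), a Let’s Encrypt-backed ClusterIssuer is roughly the following sketch; the contact email and ingress class are assumptions:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com              # hypothetical contact address
    privateKeySecretRef:
      name: letsencrypt-account-key     # secret that stores the ACME account key
    solvers:
      - http01:
          ingress:
            class: nginx                # assumes an nginx ingress controller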
Deployment with Argo CD
Argo CD is a declarative GitOps continuous delivery tool for Kubernetes. In short, Argo CD keeps your cluster in sync with your repositories. It also offers monitoring features. At RTB House, it’s the default way to deploy services to Kubernetes.
All charts used in this article come from the https://helm.linkerd.io/stable and https://charts.jetstack.io repositories. Our Linkerd deployment is split into four namespaces:
- cert-manager: contains deployments for cert-manager and trust-manager.
- linkerd: contains deployments from the linkerd-control-plane and linkerd-jaeger charts.
- linkerd-crd: contains deployments from the linkerd-crds chart.
- linkerd-viz: contains deployments from the linkerd-viz chart.
Each namespace is a separate application in Argo CD. Each application has its own chart, which includes Linkerd charts as a dependency and additional objects for this namespace (like cert-manager certificates).
# Chart.yaml for the cert-manager application
apiVersion: v2
name: cert-manager
description: Cert-Manager chart for Kubernetes
type: application
version: 1.12.3
appVersion: "1.12.3"
dependencies:
  - name: cert-manager
    version: "v1.16.1"
    repository: "https://charts.jetstack.io"
  - name: trust-manager
    version: "v0.12.0"
    repository: "https://charts.jetstack.io"
# Chart.yaml for the linkerd application
apiVersion: v2
name: linkerd
description: Linkerd chart for Kubernetes
type: application
version: 1.16.11
appVersion: "1.16.11"
dependencies:
  - name: linkerd-control-plane
    version: "1.16.11"
    repository: "https://helm.linkerd.io/stable"
  - name: linkerd-jaeger
    version: "30.12.11"
    repository: "https://helm.linkerd.io/stable"
# Chart.yaml for the linkerd-crd application
apiVersion: v2
name: linkerd-crds
description: Linkerd CRDs chart for Kubernetes
type: application
version: 1.8.0
appVersion: "1.8.0"
dependencies:
  - name: linkerd-crds
    version: "1.8.0"
    repository: "https://helm.linkerd.io/stable"
# Chart.yaml for the linkerd-viz application
apiVersion: v2
name: linkerd-viz
description: Linkerd Viz chart for Kubernetes
type: application
version: 30.12.11
appVersion: "30.12.11"
dependencies:
  - name: linkerd-viz
    version: "30.12.11"
    repository: "https://helm.linkerd.io/stable"
All of the applications are deployed at once, with the auto-sync option, using the ApplicationSet CRD from Argo CD. The initial deployment may take a few minutes because Argo CD first needs to finish applying the Linkerd CRDs and then retry deploying the control plane. Any updates after the first deployment are instant and seamless.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: infra-prod
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://**************/charts.git
        revision: _prod
        directories:
          - path: '*'   # assumption: generate one application per top-level chart directory
  template:
    metadata:
      name: '{{path.basename}}'
      finalizers:
        - resources-finalizer.argocd.argoproj.io
    spec:
      project: cai
      source:
        repoURL: https://**************/charts.git
        targetRevision: _prod
        path: '{{path}}'
        helm:
          releaseName: "{{path.basename}}"
          valueFiles:
            - values.yaml
            - values-prod.yaml
          ignoreMissingValueFiles: true
      destination:
        server: https://**************
        namespace: '{{path.basename}}'
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
          allowEmpty: false
        retry:
          limit: 3
          backoff:
            duration: 5s
            factor: 2
Cert-Manager certificates and Trust-Manager bundle for Linkerd
Linkerd depends on a master certificate called the trust anchor. Normally, it would be fetched to the CI/CD machine from secret storage and used in the deployment. In our setup, we didn’t want to put secrets on random machines; we wanted this to happen inside the cluster.
First, we’ll need to create new ClusterIssuers and Certificates in the cert-manager namespace. This specific part of our system doesn’t use external certificates, so we can rely on self-signed ones.
If you have a trust anchor certificate that you want to use, store it in a Kubernetes secret and pass it as the Certificate Authority directly to the linkerd-trust-anchor ClusterIssuer that will be created in the next steps.
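As a sketch, such a secret could look like the following; the base64 values are placeholders for your own CA material:
apiVersion: v1
kind: Secret
metadata:
  name: linkerd-trust-anchor
  namespace: cert-manager   # ClusterIssuer CA secrets are read from cert-manager's own namespace by default
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded trust anchor certificate>
  tls.key: <base64-encoded trust anchor private key>
In that case, skip the self-signed issuer and the trust anchor Certificate below; the linkerd-trust-anchor ClusterIssuer will pick the secret up via spec.ca.secretName.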
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: linkerd-self-signed
spec:
  selfSigned: {}
With the new self-signed issuer, we can now create a trust anchor certificate that will be used by Linkerd. We’ll also create a webhook issuer certificate.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: linkerd-trust-anchor
  namespace: cert-manager
spec:
  isCA: true
  duration: 336h
  renewBefore: 24h
  issuerRef:
    name: linkerd-self-signed
    kind: ClusterIssuer
  secretName: linkerd-trust-anchor
  secretTemplate:
    labels:
      app.kubernetes.io/part-of: Linkerd
  commonName: identity.linkerd.cluster.local
  dnsNames:
    - identity.linkerd.cluster.local
  privateKey:
    algorithm: ECDSA
    size: 256
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: linkerd-webhook-issuer
  namespace: cert-manager
spec:
  isCA: true
  duration: 336h
  renewBefore: 24h
  issuerRef:
    name: linkerd-self-signed
    kind: ClusterIssuer
  secretName: linkerd-webhook-issuer
  secretTemplate:
    labels:
      app.kubernetes.io/part-of: Linkerd
  commonName: webhook.linkerd.cluster.local
  privateKey:
    algorithm: ECDSA
    size: 256
New certificates will be generated and stored in secret objects in Kubernetes. With the trust anchor certificate in place, we can now create the trust anchor ClusterIssuer. The same needs to be done for the webhook issuer.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: linkerd-trust-anchor
spec:
  ca:
    secretName: linkerd-trust-anchor
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: linkerd-webhook-issuer
spec:
  ca:
    secretName: linkerd-webhook-issuer
Now, we need to generate an identity issuer certificate using the new linkerd-trust-anchor ClusterIssuer.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: linkerd-identity-issuer
  namespace: cert-manager
spec:
  secretName: linkerd-identity-issuer
  secretTemplate:
    labels:
      app.kubernetes.io/part-of: Linkerd
  duration: 336h
  renewBefore: 24h
  issuerRef:
    name: linkerd-trust-anchor
    kind: ClusterIssuer
  commonName: identity.linkerd.cluster.local
  dnsNames:
    - identity.linkerd.cluster.local
  isCA: true
  privateKey:
    algorithm: ECDSA
  usages:
    - cert sign
    - crl sign
    - server auth
    - client auth
The last part of the cert-manager setup is creating a certificate bundle. This can be done with Trust-Manager. Installation is straightforward; once Trust-Manager is installed, we’ll need to declare an identity trust roots certificate bundle. This bundle will be injected into all namespaces and used to validate certificates.
apiVersion: trust.cert-manager.io/v1alpha1
kind: Bundle
metadata:
  name: linkerd-identity-trust-roots
spec:
  sources:
    - secret:
        name: linkerd-identity-issuer
        key: "ca.crt"
  target:
    configMap:
      key: "ca-bundle.crt"
That completes the list of objects that need to be created in the cert-manager namespace. Is that all the certificates we need?
Nope.
Now we need to declare the certificates that will be used by Linkerd itself. We’ll need to create five certificates in the linkerd namespace and two in the linkerd-viz namespace. The important part is to add the “app.kubernetes.io/part-of: Linkerd” label to all secret objects; otherwise, the rotation CronJob that we’ll create later won’t be able to select them.
Certificates in the linkerd namespace:
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: linkerd-identity-issuer
  namespace: linkerd
spec:
  secretName: linkerd-identity-issuer
  secretTemplate:
    labels:
      app.kubernetes.io/part-of: Linkerd
  duration: 168h
  renewBefore: 24h
  issuerRef:
    name: linkerd-trust-anchor
    kind: ClusterIssuer
  commonName: identity.linkerd.cluster.local
  dnsNames:
    - identity.linkerd.cluster.local
  isCA: true
  privateKey:
    algorithm: ECDSA
  usages:
    - cert sign
    - crl sign
    - server auth
    - client auth
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: linkerd-policy-validator
  namespace: linkerd
spec:
  secretName: linkerd-policy-validator-k8s-tls
  secretTemplate:
    labels:
      app.kubernetes.io/part-of: Linkerd
  duration: 168h
  renewBefore: 24h
  issuerRef:
    name: linkerd-webhook-issuer
    kind: ClusterIssuer
  commonName: linkerd-policy-validator.linkerd.svc
  dnsNames:
    - linkerd-policy-validator.linkerd.svc
  isCA: false
  usages:
    - server auth
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: linkerd-proxy-injector
  namespace: linkerd
spec:
  secretName: linkerd-proxy-injector-k8s-tls
  secretTemplate:
    labels:
      app.kubernetes.io/part-of: Linkerd
  duration: 168h
  renewBefore: 24h
  issuerRef:
    name: linkerd-webhook-issuer
    kind: ClusterIssuer
  commonName: linkerd-proxy-injector.linkerd.svc
  dnsNames:
    - linkerd-proxy-injector.linkerd.svc
  isCA: false
  usages:
    - server auth
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: linkerd-sp-validator
  namespace: linkerd
spec:
  secretName: linkerd-sp-validator-k8s-tls
  secretTemplate:
    labels:
      app.kubernetes.io/part-of: Linkerd
  duration: 168h
  renewBefore: 24h
  issuerRef:
    name: linkerd-webhook-issuer
    kind: ClusterIssuer
  commonName: linkerd-sp-validator.linkerd.svc
  dnsNames:
    - linkerd-sp-validator.linkerd.svc
  isCA: false
  usages:
    - server auth
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: jaeger-injector
  namespace: linkerd
spec:
  secretName: jaeger-injector-k8s-tls
  secretTemplate:
    labels:
      app.kubernetes.io/part-of: Linkerd
  duration: 168h
  renewBefore: 24h
  issuerRef:
    name: linkerd-webhook-issuer
    kind: ClusterIssuer
  commonName: jaeger-injector.linkerd-jaeger.svc
  dnsNames:
    - jaeger-injector.linkerd-jaeger.svc
  isCA: false
  usages:
    - server auth
Certificates in the linkerd-viz namespace:
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: tap-injector
  namespace: linkerd-viz
spec:
  secretName: tap-injector-k8s-tls
  secretTemplate:
    labels:
      app.kubernetes.io/part-of: Linkerd
  duration: 168h
  renewBefore: 24h
  issuerRef:
    name: linkerd-webhook-issuer
    kind: ClusterIssuer
  commonName: tap-injector.linkerd-viz.svc
  dnsNames:
    - tap-injector.linkerd-viz.svc
  isCA: false
  usages:
    - server auth
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: tap
  namespace: linkerd-viz
spec:
  secretName: tap-k8s-tls
  secretTemplate:
    labels:
      app.kubernetes.io/part-of: Linkerd
  duration: 168h
  renewBefore: 24h
  issuerRef:
    name: linkerd-webhook-issuer
    kind: ClusterIssuer
  commonName: tap.linkerd-viz.svc
  dnsNames:
    - tap.linkerd-viz.svc
  isCA: false
  usages:
    - server auth
OK! At this point, we have all the required certificates in place, in the structure required by Linkerd. Now, let’s use them in Linkerd itself.
Linkerd and Linkerd Viz integration with Cert-Manager
Linkerd has an option to use external certificates, which we’ll use in this case. To use the certificates created by cert-manager, we need to point all components of Linkerd to the corresponding secrets.
values.yaml for the linkerd deployment:
linkerd-control-plane:
  identityTrustAnchorsPEM: ~
  identity:
    externalCA: true
    issuer:
      scheme: kubernetes.io/tls
  proxyInit:
    runAsRoot: true
  enablePodDisruptionBudget: true
  deploymentStrategy:
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 25%
  proxyInjector:
    externalSecret: true
    injectCaFrom: linkerd/linkerd-proxy-injector
  profileValidator:
    externalSecret: true
    injectCaFrom: linkerd/linkerd-sp-validator
  policyValidator:
    externalSecret: true
    injectCaFrom: linkerd/linkerd-policy-validator
  enablePodAntiAffinity: true
  controllerReplicas: 2
  webhookFailurePolicy: Fail
linkerd-jaeger:
  webhook:
    externalSecret: true
    injectCaFrom: linkerd/jaeger-injector
values.yaml for the linkerd-viz deployment:
linkerd-viz:
  dashboard:
    enforcedHostRegexp: .*
  tap:
    externalSecret: true
    injectCaFrom: linkerd-viz/tap
  tapInjector:
    externalSecret: true
    injectCaFrom: linkerd-viz/tap-injector
Certificate rotation in Linkerd
Here’s the part with a little hack. All certificates should have a defined end date, and all certificates should be rotated regularly. Usually, this is done using the Linkerd CLI tool or via a Helm deployment (with the new certificate in the values). In both cases, new certificate secrets are created, and the Linkerd pods are restarted to load those certificates. Linkerd loads certificates at startup and keeps them in pod memory, so without a restart, the pods will not pick up new certificates.
But we can do the same set of tasks automatically inside the cluster.
Just to be sure, we’ll also force the recreation of the cert-manager certificate secrets by deleting them. Cert-manager will instantly issue fresh certificates and store them in secrets. After that, the last thing to do is a rolling restart of the Linkerd deployments. This way, we achieve zero-downtime certificate rotation.
This CronJob needs to be created in both the linkerd and linkerd-viz namespaces. The secret deletion runs as an init container, which guarantees that it completes before the rolling restart begins.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: linkerd-restarter
  namespace: {{ .Release.Namespace }}
spec:
  concurrencyPolicy: Forbid
  schedule: '25 */6 * * *'
  jobTemplate:
    spec:
      backoffLimit: 2
      activeDeadlineSeconds: 600
      template:
        spec:
          serviceAccountName: linkerd-restarter
          restartPolicy: Never
          # Init containers finish before regular containers start, so the
          # secrets are deleted (and reissued by cert-manager) before the restart.
          initContainers:
            - name: rotate-secrets
              image: bitnami/kubectl
              command:
                - 'kubectl'
                - 'delete'
                - 'secret'
                - '--selector'
                - 'app.kubernetes.io/part-of=Linkerd'
                - '-n'
                - '{{ .Release.Namespace }}'
          containers:
            - name: restart-linkerd
              image: bitnami/kubectl
              command:
                - 'kubectl'
                - 'rollout'
                - 'restart'
                - 'deployment'
                - '--selector'
                - 'app.kubernetes.io/part-of=Linkerd'
                - '-n'
                - '{{ .Release.Namespace }}'
The restarter CronJob requires a dedicated service account, role, and role binding.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: linkerd-restarter
  namespace: {{ .Release.Namespace }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: linkerd-restarter
  namespace: {{ .Release.Namespace }}
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["list", "patch"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["list", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: linkerd-restarter
  namespace: {{ .Release.Namespace }}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: linkerd-restarter
subjects:
  - kind: ServiceAccount
    name: linkerd-restarter
    namespace: {{ .Release.Namespace }}
That’s all! Commit all the objects to the repository, let Argo CD deploy them, and verify that all Linkerd pods are up. The restart occurs every six hours, but it won’t be noticeable.
Final thoughts
With a few tricks, we were able to create a “fire-and-forget” deployment of the Linkerd service mesh. Our certificates are short-lived and rotated regularly, without any operator intervention. All operations happen in the background. We’ve been using this setup for a few months now without any significant issues. Linkerd works great for our canary and blue/green deployments and gives us additional insight into communication inside the cluster.