Troubleshooting

    The “pre-kubernetes-cluster-setup” checks

    These checks only run when the --pre flag is set. This flag is intended for use prior to running linkerd install, to verify your cluster is prepared for installation.

    √ control plane namespace does not already exist

    Example failure:

        × control plane namespace does not already exist
            The "linkerd" namespace already exists

    By default, linkerd install will create a linkerd namespace. Prior to installation, that namespace should not exist. To check with a different namespace, run:

        linkerd check --pre --linkerd-namespace linkerd-test
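
    As a quick sanity check outside of linkerd check, you can also query the namespace directly; a NotFound error is the desired outcome here:

        $ kubectl get namespace linkerd
        Error from server (NotFound): namespaces "linkerd" not found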

    √ can create Kubernetes resources

    The subsequent checks in this section validate whether you have permission to create the Kubernetes resources required for Linkerd installation, specifically:

    1. can create Namespaces
    2. can create ClusterRoles
    3. can create ClusterRoleBindings
    4. can create CustomResourceDefinitions

    The “pre-kubernetes-setup” checks

    These checks only run when the --pre flag is set. This flag is intended for use prior to running linkerd install, to verify you have the correct RBAC permissions to install Linkerd; a quick way to reproduce these checks by hand is sketched after the list.

    1. can create Namespaces
    2. can create ClusterRoles
    3. can create ClusterRoleBindings
    4. can create CustomResourceDefinitions
    5. can create PodSecurityPolicies
    6. can create ServiceAccounts
    7. can create Services
    8. can create Deployments
    9. can create ConfigMaps
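
    If any of these checks fail, you can reproduce each one yourself with kubectl auth can-i. A minimal sketch over the same resource list:

        # report create permission for each resource kind Linkerd needs
        for kind in namespaces clusterroles clusterrolebindings \
            customresourcedefinitions podsecuritypolicies \
            serviceaccounts services deployments configmaps; do
          printf '%-45s %s\n' "can create ${kind}:" "$(kubectl auth can-i create "${kind}")"
        done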

    √ no clock skew detected

    This check verifies whether there is clock skew between the system running the linkerd install command and the Kubernetes node(s), which could cause potential issues.
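
    One rough way to eyeball skew yourself, assuming kubectl access to the cluster, is to compare your local clock against the heartbeat times reported by each kubelet (heartbeats are only refreshed periodically, so small offsets are expected):

        # local clock, in UTC
        date -u
        # per-node Ready-condition heartbeat timestamps
        kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].lastHeartbeatTime}{"\n"}{end}'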

    The “pre-kubernetes-capability” checks

    These checks only run when the --pre flag is set. This flag is intended for use prior to running linkerd install, to verify you have the correct Kubernetes capability permissions to install Linkerd.

    √ has NET_ADMIN capability

    Example failure:

        × has NET_ADMIN capability
            found 3 PodSecurityPolicies, but none provide NET_ADMIN
            see https://linkerd.io/checks/#pre-k8s-cluster-net-admin for hints

    Linkerd installation requires the NET_ADMIN Kubernetes capability, to allow for modification of iptables.

    For more information, see the Kubernetes documentation on Security Contexts.
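
    To see which capabilities your existing PodSecurityPolicies already allow (this covers the NET_RAW check below as well), one option is a jsonpath query:

        kubectl get podsecuritypolicies \
          -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.allowedCapabilities}{"\n"}{end}'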

    √ has NET_RAW capability

    Example failure:

        × has NET_RAW capability
            found 3 PodSecurityPolicies, but none provide NET_RAW
            see https://linkerd.io/checks/#pre-k8s-cluster-net-raw for hints

    Linkerd installation requires the NET_RAW Kubernetes capability, to allow for modification of iptables.

    For more information, see the Kubernetes documentation on Security Contexts.

    The “pre-linkerd-global-resources” checks

    These checks only run when the --pre flag is set. This flag is intended for use prior to running linkerd install, to verify you have not already installed the Linkerd control plane.

    1. no ClusterRoles exist
    2. no ClusterRoleBindings exist
    3. no CustomResourceDefinitions exist
    4. no MutatingWebhookConfigurations exist
    5. no ValidatingWebhookConfigurations exist
    6. no PodSecurityPolicies exist

    If you do not expect to have the permission for a full cluster install, try the --single-namespace flag, which validates if Linkerd can be installed in a single namespace, with limited cluster access:

        linkerd check --pre --single-namespace

    √ control plane namespace exists

    Example failure:

        × control plane namespace exists
            The "linkerd" namespace does not exist

    In --single-namespace mode, linkerd check assumes that the installer does not have permission to create a namespace, so the installation namespace must already exist.

    By default the linkerd namespace is used. To use a different namespace, run:

        linkerd check --pre --single-namespace --linkerd-namespace linkerd-test
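
    If the namespace is missing, someone with sufficient permissions must first create it, e.g. for the example above:

        kubectl create namespace linkerd-test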

    √ can create Kubernetes resources

    The subsequent checks in this section validate whether you have permission to create the Kubernetes resources required for Linkerd --single-namespace installation, specifically:

    1. can create Roles
    2. can create RoleBindings

    For more information on cluster access, see the section above.

    The “kubernetes-api” checks

    Example failures:

        × can initialize the client
            error configuring Kubernetes API client: stat badconfig: no such file or directory
        × can query the Kubernetes API
            Get https://8.8.8.8/version: dial tcp 8.8.8.8:443: i/o timeout

    Ensure that your system is configured to connect to a Kubernetes cluster. Validate that the KUBECONFIG environment variable is set properly, and/or that ~/.kube/config points to a valid cluster.
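
    A quick way to see what the cli is actually pointing at (a minimal sketch; the default config path applies when KUBECONFIG is unset):

        echo "${KUBECONFIG:-~/.kube/config}"
        kubectl config current-context
        kubectl cluster-info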

    For more information, see these pages in the Kubernetes documentation:

    1. kubectl config view
    2. kubectl cluster-info
    3. kubectl version

    Another example failure:

        × can query the Kubernetes API
            Get REDACTED/version: x509: certificate signed by unknown authority

    As an (unsafe) workaround, you may try:

        kubectl config set-cluster ${KUBE_CONTEXT} --insecure-skip-tls-verify=true \
            --server=${KUBE_CONTEXT}

    The “kubernetes-version” checks

    √ is running the minimum Kubernetes API version

    Example failure:

        × is running the minimum Kubernetes API version
            Kubernetes is on version [1.7.16], but version [1.12.0] or more recent is required

    Linkerd requires at least Kubernetes version 1.12.0. Verify your cluster version with:

        kubectl version

    √ is running the minimum kubectl version

    Example failure:

        × is running the minimum kubectl version
            kubectl is on version [1.9.1], but version [1.12.0] or more recent is required
            see https://linkerd.io/checks/#kubectl-version for hints

    Linkerd requires kubectl version 1.12.0 or later. Verify your kubectl version with:

        kubectl version --client --short

    To fix, update your kubectl version.

    For more information on upgrading Kubernetes, see the page in the Kubernetes documentation on Upgrading a cluster.

    The “linkerd-config” checks

    This category of checks validates that Linkerd’s cluster-wide RBAC and related resources have been installed. These checks run via a default linkerd check, and also in the context of a multi-stage setup, for example:

        # install cluster-wide resources (first stage)
        linkerd install config | kubectl apply -f -
        # validate successful cluster-wide resources installation
        linkerd check config
        # install Linkerd control plane
        linkerd install control-plane | kubectl apply -f -
        # validate successful control-plane installation
        linkerd check

    √ control plane Namespace exists

    Example failure:

        × control plane Namespace exists
            The "foo" namespace does not exist
            see https://linkerd.io/checks/#l5d-existence-ns for hints

    Ensure the Linkerd control plane namespace exists:

        kubectl get ns

    The default control plane namespace is linkerd. If you installed Linkerd into a different namespace, specify that in your check command:

        linkerd check --linkerd-namespace linkerdtest

    √ control plane ClusterRoles exist

    Example failure:

        × control plane ClusterRoles exist
            missing ClusterRoles: linkerd-linkerd-controller
            see https://linkerd.io/checks/#l5d-existence-cr for hints

    Ensure the Linkerd ClusterRoles exist:

        $ kubectl get clusterroles | grep linkerd
        linkerd-linkerd-controller       9d
        linkerd-linkerd-identity         9d
        linkerd-linkerd-prometheus       9d
        linkerd-linkerd-proxy-injector   20d
        linkerd-linkerd-sp-validator     9d

    Also ensure you have permission to create ClusterRoles:

        $ kubectl auth can-i create clusterroles
        yes

    √ control plane ClusterRoleBindings exist

    Example failure:

        × control plane ClusterRoleBindings exist
            missing ClusterRoleBindings: linkerd-linkerd-controller
            see https://linkerd.io/checks/#l5d-existence-crb for hints

    Ensure the Linkerd ClusterRoleBindings exist:
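
    By analogy with the ClusterRoles check above, list them with:

        $ kubectl get clusterrolebindings | grep linkerd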

    Also ensure you have permission to create ClusterRoleBindings:

        $ kubectl auth can-i create clusterrolebindings
        yes

    √ control plane ServiceAccounts exist

    Example failure:

        × control plane ServiceAccounts exist
            missing ServiceAccounts: linkerd-controller
            see https://linkerd.io/checks/#l5d-existence-sa for hints

    Ensure the Linkerd ServiceAccounts exist:

        $ kubectl -n linkerd get serviceaccounts
        NAME                     SECRETS   AGE
        default                  1         23m
        linkerd-controller       1         23m
        linkerd-grafana          1         23m
        linkerd-identity         1         23m
        linkerd-prometheus       1         23m
        linkerd-proxy-injector   1         7m
        linkerd-sp-validator     1         23m
        linkerd-web              1         23m

    Also ensure you have permission to create ServiceAccounts in the Linkerd namespace:

        $ kubectl -n linkerd auth can-i create serviceaccounts
        yes

    √ control plane CustomResourceDefinitions exist

    Example failure:

        × control plane CustomResourceDefinitions exist
            missing CustomResourceDefinitions: serviceprofiles.linkerd.io
            see https://linkerd.io/checks/#l5d-existence-crd for hints

    Ensure the Linkerd CRD exists:

        $ kubectl get customresourcedefinitions
        NAME                         CREATED AT
        serviceprofiles.linkerd.io   2019-04-25T21:47:31Z

    Also ensure you have permission to create CRDs:

        $ kubectl auth can-i create customresourcedefinitions
        yes

    √ control plane MutatingWebhookConfigurations exist

    Example failure:

        × control plane MutatingWebhookConfigurations exist
            missing MutatingWebhookConfigurations: linkerd-proxy-injector-webhook-config
            see https://linkerd.io/checks/#l5d-existence-mwc for hints

    Ensure the Linkerd MutatingWebhookConfiguration exists:

        $ kubectl get mutatingwebhookconfigurations | grep linkerd
        linkerd-proxy-injector-webhook-config   2019-07-01T13:13:26Z

    Also ensure you have permission to create MutatingWebhookConfigurations:

        $ kubectl auth can-i create mutatingwebhookconfigurations
        yes

    √ control plane ValidatingWebhookConfigurations exist

    Example failure:

        × control plane ValidatingWebhookConfigurations exist
            missing ValidatingWebhookConfigurations: linkerd-sp-validator-webhook-config
            see https://linkerd.io/checks/#l5d-existence-vwc for hints

    Ensure the Linkerd ValidatingWebhookConfiguration exists:

        $ kubectl get validatingwebhookconfigurations | grep linkerd

    Also ensure you have permission to create ValidatingWebhookConfigurations:

        $ kubectl auth can-i create validatingwebhookconfigurations
        yes

    √ control plane PodSecurityPolicies exist

    Example failure:

        × control plane PodSecurityPolicies exist
            missing PodSecurityPolicies: linkerd-linkerd-control-plane
            see https://linkerd.io/checks/#l5d-existence-psp for hints

    Ensure the Linkerd PodSecurityPolicy exists:

        $ kubectl get podsecuritypolicies | grep linkerd
        linkerd-linkerd-control-plane   false   NET_ADMIN,NET_RAW   RunAsAny   RunAsAny   MustRunAs   MustRunAs   true   configMap,emptyDir,secret,projected,downwardAPI,persistentVolumeClaim

    Also ensure you have permission to create PodSecurityPolicies:

        $ kubectl auth can-i create podsecuritypolicies
        yes

    √ ‘linkerd-config’ config map exists

    Example failure:

        × 'linkerd-config' config map exists
            missing ConfigMaps: linkerd-config
            see https://linkerd.io/checks/#l5d-existence-linkerd-config for hints

    Ensure the Linkerd ConfigMap exists:

        $ kubectl -n linkerd get configmap/linkerd-config
        NAME             DATA   AGE
        linkerd-config   3      61m
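
    To inspect its contents, e.g. when comparing against the output of linkerd install:

        kubectl -n linkerd get configmap linkerd-config -o yaml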

    Also ensure you have permission to create ConfigMaps:

        $ kubectl -n linkerd auth can-i create configmap

    √ control plane replica sets are ready

    This failure occurs when one of Linkerd’s ReplicaSets fails to schedule a pod.

    For more information, see the Kubernetes documentation.

    √ no unschedulable pods

    Example failure:

        × no unschedulable pods
            linkerd-prometheus-6b668f774d-j8ncr: 0/1 nodes are available: 1 Insufficient cpu.
            see https://linkerd.io/checks/#l5d-existence-unschedulable-pods for hints

    For more information, see the Kubernetes documentation.

    √ controller pod is running

    Example failure:

        × controller pod is running
            No running pods for "linkerd-controller"

    Note, it takes a little bit for pods to be scheduled, images to be pulled, and everything to start up. If this is a permanent error, you’ll want to validate the state of the controller pod with:

        $ kubectl -n linkerd get po --selector linkerd.io/control-plane-component=controller
        NAME                                  READY   STATUS    RESTARTS   AGE
        linkerd-controller-7bb8ff5967-zg265   4/4     Running   0          40m

    Check the controller’s logs with:

        linkerd logs --control-plane-component controller

    √ can initialize the client

    Example failure:

        × can initialize the client
            parse http:// bad/: invalid character " " in host name

    Verify that a well-formed --api-addr parameter was specified, if any:

        linkerd check --api-addr " bad"

    √ can query the control plane API

    Example failure:

        × can query the control plane API
            Post http://8.8.8.8/api/v1/Version: context deadline exceeded

    This check indicates a connectivity failure between the cli and the Linkerd control plane. To verify connectivity, manually connect to the controller pod:
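
    For example, using kubectl port-forward. This is a sketch: it assumes the controller pod carries the control-plane-component label used elsewhere on this page, and that 9995 is the admin port targeted by the curl below:

        kubectl -n linkerd port-forward \
          $(kubectl -n linkerd get po \
            --selector=linkerd.io/control-plane-component=controller \
            -o jsonpath='{.items[0].metadata.name}') \
          9995:9995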

    …and then curl the /metrics endpoint:

        curl localhost:9995/metrics

    The “linkerd-api” checks

    √ control plane pods are ready

    Example failure:

        × control plane pods are ready
            No running pods for "linkerd-web"

    Verify the state of the control plane pods with:

        $ kubectl -n linkerd get po
        NAME                                      READY   STATUS    RESTARTS   AGE
        pod/linkerd-controller-b8c4c48c8-pflc9    4/4     Running   0          45m
        pod/linkerd-grafana-776cf777b6-lg2dd      2/2     Running   0          1h
        pod/linkerd-prometheus-74d66f86f6-6t6dh   2/2     Running   0          1h
        pod/linkerd-web-5f6c45d6d9-9hd9j          2/2     Running   0          3m

    √ control plane self-check

    Example failure:

        × control plane self-check
            Post https://localhost:6443/api/v1/namespaces/linkerd/services/linkerd-controller-api:http/proxy/api/v1/SelfCheck: context deadline exceeded

    Check the logs on the control-plane’s public API:

        linkerd logs --control-plane-component controller --container public-api

    √ [kubernetes] control plane can talk to Kubernetes

    Example failure:

        × [kubernetes] control plane can talk to Kubernetes
            Error calling the Kubernetes API: FAIL

    Check the logs on the control-plane’s public API:

        linkerd logs --control-plane-component controller --container public-api

    √ [prometheus] control plane can talk to Prometheus

    Example failure:

        × [prometheus] control plane can talk to Prometheus
            Error calling Prometheus from the control plane: FAIL

    Note: this will fail if you have changed your default cluster domain from cluster.local; see the associated issue for more information and potential workarounds.

    Validate that the Prometheus instance is up and running:

        kubectl -n linkerd get all | grep prometheus

    Check the Prometheus logs:

        linkerd logs --control-plane-component prometheus

    Check the logs on the control-plane’s public API:

        linkerd logs --control-plane-component controller --container public-api

    The “linkerd-service-profile” checks

    √ no invalid service profiles

    Example failure:

        × no invalid service profiles
            ServiceProfile "bad" has invalid name (must be "<service>.<namespace>.svc.cluster.local")

    Validate the structure of your service profiles:

        $ kubectl -n linkerd get sp
        NAME                                               AGE
        bad                                                51s
        linkerd-controller-api.linkerd.svc.cluster.local   1m

    Example failure:

        × no invalid service profiles
            the server could not find the requested resource (get serviceprofiles.linkerd.io)

    Validate that the Service Profile CRD is installed on your cluster and that its linkerd.io/created-by annotation matches your linkerd version client version:

        kubectl get crd/serviceprofiles.linkerd.io -o yaml | grep linkerd.io/created-by

    If the CRD is missing or out-of-date, you can update it:

        linkerd upgrade | kubectl apply -f -

    The “linkerd-version” checks

    √ can determine the latest version

    Example failure:

        × can determine the latest version
            Get https://versioncheck.linkerd.io/version.json?version=edge-19.1.2&uuid=test-uuid&source=cli: context deadline exceeded

    Ensure you can connect to the Linkerd version check endpoint from the environment where the linkerd cli is running:

        $ curl "https://versioncheck.linkerd.io/version.json?version=edge-19.1.2&uuid=test-uuid&source=cli"
        {"stable":"stable-2.1.0","edge":"edge-19.1.2"}

    √ cli is up-to-date

    Example failure:

        cli is up-to-date
            is running version 19.1.1 but the latest edge version is 19.1.2

    See the page on Upgrading Linkerd.

    Example failures:

        control plane is up-to-date
            is running version 19.1.1 but the latest edge version is 19.1.2
        control plane and cli versions match
            mismatched channels: running stable-2.1.0 but retrieved edge-19.1.2

    See the page on Upgrading Linkerd.

    The “linkerd-data-plane” checks

    These checks only run when the --proxy flag is set. This flag is intended for use after running linkerd inject, to verify the injected proxies are operating normally.

    √ data plane namespace exists

    Example failure:

        $ linkerd check --proxy --namespace foo
        ...
        × data plane namespace exists
            The "foo" namespace does not exist

    Ensure the --namespace specified exists, or omit the parameter to check all namespaces.

    √ data plane proxies are ready

    Example failure:

        × data plane proxies are ready
            No "linkerd-proxy" containers found

    Ensure you have injected the Linkerd proxy into your application via the linkerd inject command.

    For more information on linkerd inject, see the Getting Started guide.
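
    A typical invocation pipes an existing manifest through the injector and re-applies it (my-app is a hypothetical deployment name):

        kubectl get deploy my-app -o yaml | linkerd inject - | kubectl apply -f -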

    √ data plane proxy metrics are present in Prometheus

    Example failure:

        × data plane proxy metrics are present in Prometheus
            Data plane metrics not found for linkerd/linkerd-controller-b8c4c48c8-pflc9.

    Ensure Prometheus can connect to each linkerd-proxy via the Prometheus dashboard:

        kubectl -n linkerd port-forward svc/linkerd-prometheus 9090

    …and then browse to http://localhost:9090/targets, and validate the linkerd-proxy section.

    You should see all your pods here. If they are not:

    • Prometheus might be experiencing connectivity issues with the k8s api server. Check out the logs and delete the pod to flush any possible transient errors, as sketched below.
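
    One way to do that, assuming the Prometheus pod carries the same control-plane-component label convention used above:

        kubectl -n linkerd delete po \
          -l linkerd.io/control-plane-component=prometheus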

    √ data plane is up-to-date

    Example failure:

        data plane is up-to-date
            linkerd/linkerd-prometheus-74d66f86f6-6t6dh: is running version 19.1.2 but the latest edge version is 19.1.3

    See the page on Upgrading Linkerd.

    √ data plane and cli versions match

    Example failure:

        linkerd/linkerd-web-5f6c45d6d9-9hd9j: is running version 19.1.2 but the latest edge version is 19.1.3

    See the page on Upgrading Linkerd.

    √ data plane proxies certificate match the CA’s certificates

    If the trust anchor has changed while data plane proxies were running, they will need to be restarted in order to refresh this information and use the latest trust anchor from the Linkerd configuration.
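
    A minimal way to trigger that restart, assuming kubectl v1.15+ and that the injected workloads are Deployments (<your-namespace> is a placeholder):

        kubectl -n <your-namespace> rollout restart deploy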