Use PD Recover to Recover the PD Cluster

    1. Download the official TiDB package:

      In the command above, ${version} is the version of the TiDB cluster, such as v5.4.0.

    2. Unpack the TiDB package for installation:

      tar -xzf tidb-${version}-linux-amd64.tar.gz

      pd-recover is in the tidb-${version}-linux-amd64/bin directory.

    Recover the PD cluster

    This section introduces how to recover the PD cluster using PD Recover.

      Step 1. Get Cluster ID

      Example:

      kubectl get tc test -n test -o='go-template={{.status.clusterID}}{{"\n"}}'
      6821434242797747735

      Step 2. Get Alloc ID

      When you use pd-recover to recover the PD cluster, you need to specify alloc-id. The value of alloc-id must be larger than the largest allocated ID (Alloc ID) of the original cluster.

      1. Access the Prometheus monitoring data of the TiDB cluster by following the steps in Access the Prometheus monitoring data.

      2. Enter pd_cluster_id in the input box and click the Execute button to make a query. Get the largest value in the query result.
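The recover step later in this document sets alloc-id to the largest pd_cluster_id value multiplied by 100. The arithmetic can be sketched in shell as follows, assuming the queried values are copied by hand from the Prometheus result. The function name compute_alloc_id and the sample values are hypothetical and are not part of PD Recover.

```shell
# Print the largest of the given integer values multiplied by 100.
# The arguments stand in for the pd_cluster_id values read from Prometheus.
compute_alloc_id() {
  max=0
  for v in "$@"; do
    if [ "$v" -gt "$max" ]; then
      max=$v
    fi
  done
  echo $((max * 100))
}

# Hypothetical sample values, not taken from a real cluster.
alloc_id=$(compute_alloc_id 1000 4000 3000)
echo "alloc-id: ${alloc_id}"
```

The multiplication leaves a wide margin, so the new alloc-id stays larger than any ID the original cluster might have allocated after the last sample was recorded.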

      Step 3. Recover the PD Pod

      1. Delete the Pod of the PD cluster.

        Execute the following command to set the value of spec.pd.replicas to 0:

        kubectl patch tc ${cluster_name} -n ${namespace} --type merge -p '{"spec":{"pd":{"replicas": 0}}}'

        Because the PD cluster is in an abnormal state, TiDB Operator cannot synchronize the change above to the PD StatefulSet. You need to execute the following command to set the spec.replicas of the PD StatefulSet to 0.

        kubectl patch sts ${cluster_name}-pd -n ${namespace} -p '{"spec":{"replicas": 0}}'

        Execute the following command to confirm that the PD Pod is deleted:

        kubectl get pod -n ${namespace}

      2. After confirming that all PD Pods are deleted, execute the following command to delete the PVCs bound to the PD Pods:

        kubectl delete pvc -l app.kubernetes.io/component=pd,app.kubernetes.io/instance=${cluster_name} -n ${namespace}

      3. After the PVCs are deleted, scale out the PD cluster to one Pod:

        Execute the following command to set the value of spec.pd.replicas to 1:

        kubectl patch tc ${cluster_name} -n ${namespace} --type merge -p '{"spec":{"pd":{"replicas": 1}}}'

        Because the PD cluster is in an abnormal state, TiDB Operator cannot synchronize the change above to the PD StatefulSet. You need to execute the following command to set the spec.replicas of the PD StatefulSet to 1.

        kubectl patch sts ${cluster_name}-pd -n ${namespace} -p '{"spec":{"replicas": 1}}'

        Execute the following command to confirm that the PD Pod is started:

        kubectl get pod -n ${namespace}

        Step 4. Recover the cluster

        Before you run pd-recover, execute the port-forward command in one terminal to expose the PD service, and keep it running:

          kubectl port-forward -n ${namespace} svc/${cluster_name}-pd 2379:2379

        1. Open a new terminal tab or window, enter the directory where pd-recover is located, and execute the pd-recover command to recover the PD cluster:

          In the command above, ${cluster_id} is the cluster ID obtained in Get Cluster ID. ${alloc_id} is the largest value of pd_cluster_id (obtained in Get Alloc ID) multiplied by 100.

          After the pd-recover command is successfully executed, the following result is printed:

          recover success! please restart the PD cluster

        2. Go back to the window where the port-forward command is executed, and then press Ctrl+C to stop and exit.
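The pd-recover invocation for this step can be sketched as below. The sketch assumes that PD is reachable at 127.0.0.1:2379 through the port-forward session and that your pd-recover build accepts the -endpoints, -cluster-id, and -alloc-id flags. Check ./pd-recover --help for your version before relying on it. The helper build_pd_recover_cmd is hypothetical and only prints the command, so that it can be reviewed before it is run.

```shell
# Build the pd-recover invocation as a string for review; nothing is executed.
# Assumes PD is exposed locally on port 2379 through kubectl port-forward.
build_pd_recover_cmd() {
  cluster_id=$1
  alloc_id=$2
  echo "./pd-recover -endpoints http://127.0.0.1:2379 -cluster-id ${cluster_id} -alloc-id ${alloc_id}"
}

# The cluster ID is the one from the example earlier in this document;
# the alloc ID is a hypothetical value.
build_pd_recover_cmd 6821434242797747735 400000
```

Printing the command first makes it easy to compare both IDs against the values recorded in the earlier steps, since a wrong cluster ID or a too-small alloc-id is hard to correct afterwards.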

        Step 5. Restart the PD Pod

        1. Delete the PD Pod:

          kubectl delete pod ${cluster_name}-pd-0 -n ${namespace}

        2. After the Pod is started successfully, execute the port-forward command to expose the PD service:

          kubectl port-forward -n ${namespace} svc/${cluster_name}-pd 2379:2379

        3. Open a new terminal tab or window, and execute the following command to confirm that the cluster ID is the one obtained in Get Cluster ID:

          curl 127.0.0.1:2379/pd/api/v1/cluster
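The comparison in the last step can also be scripted. The sketch below assumes the API returns a JSON object with a numeric "id" field. The sample response, including its max_peer_count field, is an assumed shape for illustration, and extract_cluster_id is a hypothetical helper.

```shell
# Extract the numeric "id" field from a PD cluster API response read on stdin.
extract_cluster_id() {
  sed -n 's/.*"id"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Cluster ID recorded in Get Cluster ID (value from the example in this document).
expected=6821434242797747735

# Sample response for illustration. Against a live cluster, pipe curl in instead:
#   curl -s 127.0.0.1:2379/pd/api/v1/cluster | extract_cluster_id
response='{"id": 6821434242797747735, "max_peer_count": 3}'
actual=$(printf '%s\n' "$response" | extract_cluster_id)

# Compare as strings to avoid any integer-size limits in the shell.
if [ "$actual" = "$expected" ]; then
  echo "cluster ID matches"
else
  echo "cluster ID mismatch: got ${actual}"
fi
```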

        Step 6. Scale out the PD cluster

        Execute the following command to set the value of spec.pd.replicas to the desired number of Pods:
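The command for this step is expected to mirror the earlier kubectl patch tc commands, with the target replica count in place of 0 or 1. The sketch below builds that merge patch for a given count and prints the resulting command. The helper pd_replicas_patch and the replica count of 3 are hypothetical.

```shell
# Build the merge patch that sets spec.pd.replicas to the given count.
pd_replicas_patch() {
  printf '{"spec":{"pd":{"replicas": %d}}}' "$1"
}

# Hypothetical target; use the number of PD Pods you want.
replicas=3

# Print the command for review. ${cluster_name} and ${namespace} are left
# as placeholders, as elsewhere in this document.
echo "kubectl patch tc \${cluster_name} -n \${namespace} --type merge -p '$(pd_replicas_patch "$replicas")'"
```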