Understanding persistent storage

    PVCs are specific to a project, and are created and used by developers as a means to use a PV. PV resources on their own are not scoped to any single project; they can be shared across the entire OKD cluster and claimed from any project. After a PV is bound to a PVC, that PV cannot be bound to additional PVCs. This has the effect of scoping a bound PV to a single namespace, that of the binding project.

    PVs are defined by a PersistentVolume API object, which represents a piece of existing storage in the cluster that was either statically provisioned by the cluster administrator or dynamically provisioned using a StorageClass object. A PV is a resource in the cluster just like a node is a cluster resource.

    PVs are volume plug-ins like Volumes but have a lifecycle that is independent of any individual pod that uses the PV. PV objects capture the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.

    PVCs are defined by a PersistentVolumeClaim API object, which represents a request for storage by a developer. It is similar to a pod in that pods consume node resources and PVCs consume PV resources. For example, pods can request specific levels of resources, such as CPU and memory, while PVCs can request specific storage capacity and access modes, such as being mounted once read-write or many times read-only.

    PVs are resources in the cluster. PVCs are requests for those resources and also act as claim checks to the resource. The interaction between PVs and PVCs has the following lifecycle.

    In response to requests from a developer defined in a PVC, a cluster administrator configures one or more dynamic provisioners that provision storage and a matching PV.
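
As a rough illustration of dynamic provisioning, a cluster administrator might define a storage class along the following lines. This is a minimal sketch, not a recommended configuration: the class name gold, the kubernetes.io/aws-ebs provisioner, and the gp2 volume type are assumptions chosen for illustration, and suitable values depend on the platform in use.

```yaml
# Hypothetical storage class; name, provisioner, and parameters are illustrative assumptions.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gold
provisioner: kubernetes.io/aws-ebs   # in-tree AWS EBS provisioner, assumed for this sketch
parameters:
  type: gp2                          # EBS volume type, assumed for this sketch
```

A PVC that names this class in its storageClassName attribute would then be served by the provisioner when no matching PV already exists.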

    Alternatively, a cluster administrator can create a number of PVs in advance that carry the details of the real storage that is available for use. PVs exist in the API and are available for use.

    Bind claims

    When you create a PVC, you request a specific amount of storage, specify the required access mode, and specify a storage class that describes and classifies the storage. The control loop in the master watches for new PVCs and binds each new PVC to an appropriate PV. If an appropriate PV does not exist, a provisioner for the storage class creates one.

    The size of the bound PV might exceed the size requested in your PVC. This is especially true with manually provisioned PVs. To minimize the excess, OKD binds to the smallest PV that matches all other criteria.

    Claims remain unbound indefinitely if a matching volume does not exist or cannot be created with any available provisioner servicing a storage class. Claims are bound as matching volumes become available. For example, a cluster with many manually provisioned 50Gi volumes would not match a PVC requesting 100Gi. The PVC can be bound when a 100Gi PV is added to the cluster.

    Use pods and claimed PVs

    Pods use claims as volumes. The cluster inspects the claim to find the bound volume and mounts that volume for a pod. For those volumes that support multiple access modes, you must specify which mode applies when you use the claim as a volume in a pod.

    Once you have a claim and that claim is bound, the bound PV belongs to you for as long as you need it. You can schedule pods and access claimed PVs by including persistentVolumeClaim in the pod’s volumes block.

    Storage Object in Use Protection

    The Storage Object in Use Protection feature ensures that PVCs in active use by a pod and PVs that are bound to PVCs are not removed from the system, as this can result in data loss.

    Storage Object in Use Protection is enabled by default.

    A PVC is in active use by a pod when a Pod object exists that uses the PVC.

    If a user deletes a PVC that is in active use by a pod, the PVC is not removed immediately. PVC removal is postponed until the PVC is no longer actively used by any pods. Also, if a cluster administrator deletes a PV that is bound to a PVC, the PV is not removed immediately. PV removal is postponed until the PV is no longer bound to a PVC.
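
One way to picture the postponed removal: while a deleted PVC is still in use, the object typically stays in the API with a deletion timestamp and a protection finalizer. The fragment below is a sketch of what the metadata of such a PVC might look like; the claim name and the timestamp are made up for illustration.

```yaml
# Fragment of a PVC that was deleted while a pod still uses it (illustrative values).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim                              # hypothetical claim name
  deletionTimestamp: "2024-01-01T00:00:00Z"  # made-up value; set when deletion is requested
  finalizers:
    - kubernetes.io/pvc-protection           # removed once no pod uses the claim
```

Bound PVs are protected in the same way with the kubernetes.io/pv-protection finalizer.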

    Release a persistent volume

    When you are finished with a volume, you can delete the PVC object from the API, which allows reclamation of the resource. The volume is considered released when the claim is deleted, but it is not yet available for another claim. The previous claimant’s data remains on the volume and must be handled according to policy.

    The reclaim policy of a persistent volume tells the cluster what to do with the volume after it is released. A volume’s reclaim policy can be Retain, Recycle, or Delete.

    • Retain reclaim policy allows manual reclamation of the resource for those volume plug-ins that support it.

    • Recycle reclaim policy recycles the volume back into the pool of unbound persistent volumes once it is released from its claim.

    The Recycle reclaim policy is deprecated in OKD 4. Dynamic provisioning is recommended for equivalent and better functionality.

    • Delete reclaim policy deletes both the PersistentVolume object from OKD and the associated storage asset in external infrastructure, such as AWS EBS or VMware vSphere.

    Dynamically provisioned volumes are always deleted.

    Reclaiming a persistent volume manually

    When a persistent volume claim (PVC) is deleted, the persistent volume (PV) still exists and is considered “released”. However, the PV is not yet available for another claim because the data of the previous claimant remains on the volume.

    Procedure

    To manually reclaim the PV as a cluster administrator:

    1. Delete the PV.

      The associated storage asset in the external infrastructure, such as an AWS EBS, GCE PD, Azure Disk, or Cinder volume, still exists after the PV is deleted.

    2. Clean up the data on the associated storage asset.

    3. Delete the associated storage asset. Alternatively, to reuse the same storage asset, create a new PV with the storage asset definition.

    The reclaimed PV is now available for use by another PVC.
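
To make the reuse option in step 3 more concrete, a new PV that points at the cleaned-up storage asset could look roughly like this. It is a sketch under assumptions: the PV name, the capacity, and the <volume-id> placeholder are hypothetical, and the awsElasticBlockStore source applies only if the asset is an AWS EBS volume.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-reused                 # hypothetical name
spec:
  capacity:
    storage: 4Gi                  # assumed; should match the size of the existing asset
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  awsElasticBlockStore:
    volumeID: <volume-id>         # placeholder for the ID of the existing EBS volume
    fsType: ext4                  # assumed file system
```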

    Changing the reclaim policy of a persistent volume

    To change the reclaim policy of a persistent volume:

    1. List the persistent volumes in your cluster:

      $ oc get pv

      Example output

      NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS   CLAIM            STORAGECLASS   REASON   AGE
      pvc-b6efd8da-b7b5-11e6-9d58-0ed433a7dd94   4Gi        RWO           Delete          Bound    default/claim1   manual                  10s
      pvc-b95650f8-b7b5-11e6-9d58-0ed433a7dd94   4Gi        RWO           Delete          Bound    default/claim2   manual                  6s
      pvc-bb3ca71d-b7b5-11e6-9d58-0ed433a7dd94   4Gi        RWO           Delete          Bound    default/claim3   manual                  3s

    2. Choose one of your persistent volumes and change its reclaim policy:

      $ oc patch pv <your-pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

    3. Verify that your chosen persistent volume has the right policy:

      $ oc get pv

      Example output

      NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS   CLAIM            STORAGECLASS   REASON   AGE
      pvc-b6efd8da-b7b5-11e6-9d58-0ed433a7dd94   4Gi        RWO           Delete          Bound    default/claim1   manual                  10s
      pvc-b95650f8-b7b5-11e6-9d58-0ed433a7dd94   4Gi        RWO           Delete          Bound    default/claim2   manual                  6s
      pvc-bb3ca71d-b7b5-11e6-9d58-0ed433a7dd94   4Gi        RWO           Retain          Bound    default/claim3   manual                  3s

      In the preceding output, the volume bound to claim default/claim3 now has a Retain reclaim policy. The volume will not be automatically deleted when a user deletes claim default/claim3.

    Each PV contains a spec and status, which are the specification and status of the volume, for example:

    PersistentVolume object definition example

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv0001  # (1)
    spec:
      capacity:
        storage: 5Gi  # (2)
      accessModes:
        - ReadWriteOnce  # (3)
      persistentVolumeReclaimPolicy: Retain  # (4)
      ...
    status:
      ...

    (1) Name of the persistent volume.
    (2) The amount of storage available to the volume.
    (3) The access mode, defining the read-write and mount permissions.
    (4) The reclaim policy, indicating how the resource should be handled once it is released.

    Types of PVs

    OKD supports the following persistent volume plug-ins:

    • AWS Elastic Block Store (EBS)

    • Azure Disk

    • Azure File

    • Cinder

    • Fibre Channel

    • GCE Persistent Disk

    • HostPath

    • iSCSI

    • Local volume

    • NFS

    • OpenStack Manila

    • Red Hat OpenShift Container Storage

    • VMware vSphere

    Capacity

    Generally, a persistent volume (PV) has a specific storage capacity. This is set by using the capacity attribute of the PV.

    Currently, storage capacity is the only resource that can be set or requested. Future attributes may include IOPS, throughput, and so on.

    A persistent volume can be mounted on a host in any way supported by the resource provider. Providers have different capabilities and each PV’s access modes are set to the specific modes supported by that particular volume. For example, NFS can support multiple read-write clients, but a specific NFS PV might be exported on the server as read-only. Each PV gets its own set of access modes describing that specific PV’s capabilities.

    Claims are matched to volumes with similar access modes. The only two matching criteria are access modes and size. A claim’s access modes represent a request. Therefore, you might be granted more, but never less. For example, if a claim requests RWO, but the only volume available is an NFS PV (RWO+ROX+RWX), the claim would then match NFS because it supports RWO.

    Direct matches are always attempted first. The volume’s modes must match or contain more modes than you requested. The size must be greater than or equal to what is expected. If two types of volumes, such as NFS and iSCSI, have the same set of access modes, either of them can match a claim with those modes. There is no ordering between types of volumes and no way to choose one type over another.

    All volumes with the same modes are grouped, and then sorted by size, smallest to largest. The binder gets the group with matching modes and iterates over each, in size order, until one size matches.
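
The matching rules above can be illustrated with a pair of manifests. This is a sketch only: the object names, the NFS server and export path, and the sizes are hypothetical, and the empty storageClassName assumes the PV has no storage class.

```yaml
# PV that advertises all three access modes (server and path are hypothetical).
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
    - ReadOnlyMany
    - ReadWriteMany
  nfs:
    server: nfs.example.com
    path: /exports/data
---
# Claim that asks only for RWO and 5Gi. The PV above offers a superset of the
# requested modes and at least the requested capacity, so it is a candidate.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rwo-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: ""   # assumes the PV has no storage class
```

If a smaller PV with the same modes and at least 5Gi also existed, the size ordering described above means that smaller PV would be tried first.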

    The following table lists the access modes:

    Table 1. Access modes

    Access mode     CLI abbreviation   Description
    ReadWriteOnce   RWO                The volume can be mounted as read-write by a single node.
    ReadOnlyMany    ROX                The volume can be mounted as read-only by many nodes.
    ReadWriteMany   RWX                The volume can be mounted as read-write by many nodes.

    Volume access modes are descriptors of volume capabilities. They are not enforced constraints. The storage provider is responsible for runtime errors resulting from invalid use of the resource.

    For example, NFS offers ReadWriteOnce access mode. You must mark the claims as read-only if you want to use the volume’s ROX capability. Errors in the provider show up at runtime as mount errors.
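
A claim can be marked read-only at the point where a pod uses it as a volume. The fragment below sketches that idea with the readOnly field of the persistentVolumeClaim volume source; the pod, volume, and claim names and the mount path are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: reader                    # hypothetical pod name
spec:
  containers:
    - name: app
      image: fedora:26
      command: ["/bin/sh", "-c", "tail -f /dev/null"]
      volumeMounts:
        - name: shared-data
          mountPath: /data        # hypothetical mount path
          readOnly: true          # mount read-only inside the container
  volumes:
    - name: shared-data
      persistentVolumeClaim:
        claimName: rox-claim      # hypothetical claim bound to a ROX-capable PV
        readOnly: true            # use the claim read-only
```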

    iSCSI and Fibre Channel volumes do not currently have any fencing mechanisms. You must ensure the volumes are only used by one node at a time. In certain situations, such as draining a node, the volumes can be used simultaneously by two nodes. Before draining the node, first ensure the pods that use these volumes are deleted.

    Table 2. Supported access modes for PVs

    Volume plug-in                        ReadWriteOnce [1]   ReadOnlyMany   ReadWriteMany
    AWS EBS [2]                           Yes                 -              -
    Azure File                            Yes                 Yes            Yes
    Azure Disk                            Yes                 -              -
    Cinder                                Yes                 -              -
    Fibre Channel                         Yes                 Yes            -
    GCE Persistent Disk                   Yes                 -              -
    HostPath                              Yes                 -              -
    iSCSI                                 Yes                 Yes            -
    Local volume                          Yes                 -              -
    NFS                                   Yes                 Yes            Yes
    OpenStack Manila                      -                   -              Yes
    Red Hat OpenShift Container Storage   Yes                 -              Yes
    VMware vSphere                        Yes                 -              -

    [1] ReadWriteOnce (RWO) volumes cannot be mounted on multiple nodes. If a node fails, the system does not allow the attached RWO volume to be mounted on a new node because it is already assigned to the failed node. If you encounter a multi-attach error message as a result, force delete the pod on a shutdown or crashed node to avoid data loss in critical workloads, such as when dynamic persistent volumes are attached.

    [2] Use a recreate deployment strategy for pods that rely on AWS EBS.
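
As an illustration of the recreate strategy mentioned for AWS EBS, a Deployment can set strategy.type to Recreate so that the old pod is stopped before its replacement starts, which avoids two pods on different nodes needing the same RWO volume at once. The following is a minimal sketch; the deployment name, labels, mount path, and claim name are hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ebs-app                   # hypothetical name
spec:
  replicas: 1
  strategy:
    type: Recreate                # terminate the old pod before creating the new one
  selector:
    matchLabels:
      app: ebs-app
  template:
    metadata:
      labels:
        app: ebs-app
    spec:
      containers:
        - name: app
          image: fedora:26
          command: ["/bin/sh", "-c", "tail -f /dev/null"]
          volumeMounts:
            - name: data
              mountPath: /data    # hypothetical mount path
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: myclaim    # hypothetical claim bound to an EBS-backed PV
```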

    Phase

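    Volumes can be found in one of the following phases:

    Table 3. Volume phases

    Phase       Description
    Available   Free resource not yet bound to a claim.
    Bound       The volume is bound to a claim.
    Released    The claim was deleted, but the resource is not yet reclaimed by the cluster.
    Failed      The volume has failed its automatic reclamation.

    You can view the name of the PVC bound to the PV by running:

    $ oc get pv <pv-claim>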

    Mount options

    You can specify mount options while mounting a PV by using the attribute mountOptions.

    For example:

    Mount options example

    (1) Specified mount options are used while mounting the PV to the disk.
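
A PV that sets mountOptions might look like the following sketch. The NFS server address, the export path, and the nfsvers=4.1 option are assumptions picked for illustration; valid options depend on the volume type and the storage back end.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  mountOptions:                   # (1) options passed to the mount operation
    - nfsvers=4.1                 # assumed NFS option
  nfs:
    path: /tmp                    # hypothetical export path
    server: 172.17.0.2            # hypothetical server address
  persistentVolumeReclaimPolicy: Retain
```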

    The following PV types support mount options:

    • AWS Elastic Block Store (EBS)

    • Azure Disk

    • Azure File

    • Cinder

    • GCE Persistent Disk

    • iSCSI

    • Local volume

    • NFS

    • Red Hat OpenShift Container Storage (Ceph RBD only)

    • VMware vSphere

    Fibre Channel and HostPath PVs do not support mount options.

    Each PersistentVolumeClaim object contains a spec and status, which are the specification and status of the persistent volume claim (PVC), for example:

    PersistentVolumeClaim object definition example

    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: myclaim  # (1)
    spec:
      accessModes:
        - ReadWriteOnce  # (2)
      resources:
        requests:
          storage: 8Gi  # (3)
      storageClassName: gold  # (4)
    status:
      ...

    (1) Name of the PVC.
    (2) The access mode, defining the read-write and mount permissions.
    (3) The amount of storage available to the PVC.
    (4) Name of the StorageClass required by the claim.

    Storage classes

    Claims can optionally request a specific storage class by specifying the storage class’s name in the storageClassName attribute. Only PVs of the requested class, ones with the same storageClassName as the PVC, can be bound to the PVC. The cluster administrator can configure dynamic provisioners to service one or more storage classes. The cluster administrator can create a PV on demand that matches the specifications in the PVC.

    The Cluster Storage Operator might install a default storage class depending on the platform in use. This storage class is owned and controlled by the operator. It cannot be deleted or modified beyond defining annotations and labels. If different behavior is desired, you must define a custom storage class.

    The cluster administrator can also set a default storage class for all PVCs. When a default storage class is configured, a PVC must explicitly set the StorageClass or storageClassName annotation to "" in order to be bound to a PV without a storage class.
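
For instance, a claim that opts out of the default storage class could be written roughly as follows. This is a sketch; the claim name and the requested size are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: no-class-claim            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi                # hypothetical size
  storageClassName: ""            # empty string: bind only to PVs without a storage class
```

Leaving storageClassName out entirely is different from setting it to "": an omitted value lets the default storage class, if one is configured, apply to the claim.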

    Access modes

    Claims use the same conventions as volumes when requesting storage with specific access modes.

    Resources

    Claims, like pods, can request specific quantities of a resource. In this case, the request is for storage. The same resource model applies to volumes and claims.

    Pods access storage by using the claim as a volume. Claims must exist in the same namespace as the pod using the claim. The cluster finds the claim in the pod’s namespace and uses it to get the PersistentVolume backing the claim. The volume is mounted to the host and into the pod, for example:

    Mount volume to the host and into the pod example

    kind: Pod
    apiVersion: v1
    metadata:
      name: mypod
    spec:
      containers:
        - name: myfrontend
          image: dockerfile/nginx
          volumeMounts:
            - mountPath: "/var/www/html"  # (1)
              name: mypd  # (2)
      volumes:
        - name: mypd
          persistentVolumeClaim:
            claimName: myclaim  # (3)

    (1) Path to mount the volume inside the pod.
    (2) Name of the volume to mount. Do not mount to the container root, /, or any path that is the same in the host and the container. This can corrupt your host system if the container is sufficiently privileged, such as the host /dev/pts files. It is safe to mount the host by using /host.
    (3) Name of the PVC to use. The PVC must exist in the same namespace as the pod.

    OKD can statically provision raw block volumes. These volumes do not have a file system, and can provide performance benefits for applications that either write to the disk directly or implement their own storage service.

    Raw block volumes are provisioned by specifying volumeMode: Block in the PV and PVC specification.

    Pods using raw block volumes must be configured to allow privileged containers.

    The following table displays which volume plug-ins support block volumes.

    Any of the block volumes that can be provisioned manually, but are not provided as fully supported, are included as a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see the Red Hat Technology Preview Features Support Scope.

    Block volume examples

    PV example

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: block-pv
    spec:
      capacity:
        storage: 10Gi
      accessModes:
        - ReadWriteOnce
      volumeMode: Block  # (1)
      persistentVolumeReclaimPolicy: Retain
      fc:
        targetWWNs: ["50060e801049cfd1"]
        lun: 0
        readOnly: false

    (1) volumeMode must be set to Block to indicate that this PV is a raw block volume.

    PVC example

    (1) volumeMode must be set to Block to indicate that a raw block PVC is requested.
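
A claim that requests a raw block volume could be sketched as follows. The claim name block-pvc and the requested size are assumptions made for illustration; the size is chosen to fit a 10Gi block PV.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc                 # assumed name
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block               # (1) request a raw block volume
  resources:
    requests:
      storage: 10Gi               # assumed size
```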

    Pod specification example

    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-with-block-volume
    spec:
      containers:
        - name: fc-container
          image: fedora:26
          command: ["/bin/sh", "-c"]
          args: [ "tail -f /dev/null" ]
          volumeDevices:  # (1)
            - name: data
              devicePath: /dev/xvda  # (2)
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: block-pvc  # (3)

    (1) volumeDevices, instead of volumeMounts, is used for block devices. Only PersistentVolumeClaim sources can be used with raw block volumes.
    (2) devicePath, instead of mountPath, represents the path to the physical device where the raw block is mapped to the system.
    (3) The volume source must be of type persistentVolumeClaim and must match the name of the PVC as expected.

    Table 5. Accepted values for volumeMode

    Value        Default
    Filesystem   Yes
    Block        No

    Table 6. Binding scenarios for block volumes

    PV volumeMode   PVC volumeMode   Binding result
    Filesystem      Filesystem       Bind
    Unspecified     Unspecified      Bind
    Filesystem      Unspecified      Bind
    Unspecified     Filesystem       Bind
    Block           Block            Bind
    Unspecified     Block            No Bind
    Block           Unspecified      No Bind
    Filesystem      Block            No Bind
    Block           Filesystem       No Bind

    Unspecified values result in the default value of Filesystem.