Block Devices and OpenStack

    To use Ceph Block Devices with OpenStack, you must install QEMU, libvirt, and OpenStack first. We recommend using a separate physical node for your OpenStack installation. OpenStack recommends a minimum of 8GB of RAM and a quad-core processor. The following diagram depicts the OpenStack/Ceph technology stack.

    Important

    To use Ceph Block Devices with OpenStack, you must have access to a running Ceph Storage Cluster.

    Three parts of OpenStack integrate with Ceph’s block devices:

    • Images: OpenStack Glance manages images for VMs. Images are immutable. OpenStack treats images as binary blobs and downloads them accordingly.

    • Volumes: Volumes are block devices. OpenStack uses volumes to boot VMs, or to attach volumes to running VMs. OpenStack manages volumes using Cinder services.

    • Guest Disks: Guest disks are guest operating system disks. By default, when you boot a virtual machine, its disk appears as a file on the file system of the hypervisor (usually under /var/lib/nova/instances/<uuid>/). Prior to OpenStack Havana, the only way to boot a VM in Ceph was to use the boot-from-volume functionality of Cinder. However, now it is possible to boot every virtual machine inside Ceph directly without using Cinder, which is advantageous because it allows you to perform maintenance operations easily with the live-migration process. Additionally, if your hypervisor dies it is also convenient to trigger nova evacuate and run the virtual machine elsewhere almost seamlessly. In doing so, exclusive locks prevent multiple compute nodes from concurrently accessing the guest disk.

    You can use OpenStack Glance to store images in a Ceph Block Device, and you can use Cinder to boot a VM using a copy-on-write clone of an image.

    The instructions below detail the setup for Glance, Cinder and Nova, although they do not have to be used together. You may store images in Ceph block devices while running VMs using a local disk, or vice versa.

    Important

    Using QCOW2 for hosting a virtual machine disk is NOT recommended. If you want to boot virtual machines in Ceph (ephemeral backend or boot from volume), please use the raw image format within Glance.

    By default, Ceph block devices use the rbd pool. You may use any available pool. We recommend creating a pool for Cinder and a pool for Glance. Ensure your Ceph cluster is running, then create the pools.
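
    For example, the four pools referenced below could be created with the following commands; this is a minimal sketch that relies on your cluster's default placement-group settings (older Ceph releases require an explicit PG count, e.g. ceph osd pool create volumes 128):

    ceph osd pool create volumes
    ceph osd pool create images
    ceph osd pool create backups
    ceph osd pool create vms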

    See Create a Pool for detail on specifying the number of placement groups for your pools, and Placement Groups for details on the number of placement groups you should set for your pools.

    Newly created pools must be initialized prior to use. Use the rbd tool to initialize the pools:

    rbd pool init volumes
    rbd pool init images
    rbd pool init backups
    rbd pool init vms

    The nodes running glance-api, cinder-volume, nova-compute and cinder-backup act as Ceph clients. Each requires the ceph.conf file:

    ssh {your-openstack-server} sudo tee /etc/ceph/ceph.conf </etc/ceph/ceph.conf

    On the glance-api node, you will need the Python bindings for librbd:

    sudo apt-get install python-rbd
    sudo yum install python-rbd

    On the nova-compute, cinder-backup and on the cinder-volume node, use both the Python bindings and the client command line tools:

    sudo apt-get install ceph-common
    sudo yum install ceph-common

    Setup Ceph Client Authentication

    If you have cephx authentication enabled, create a new user for Nova/Cinder and Glance. Execute the following:

    ceph auth get-or-create client.glance mon 'profile rbd' osd 'profile rbd pool=images' mgr 'profile rbd pool=images'
    ceph auth get-or-create client.cinder mon 'profile rbd' osd 'profile rbd pool=volumes, profile rbd pool=vms, profile rbd-read-only pool=images' mgr 'profile rbd pool=volumes, profile rbd pool=vms'
    ceph auth get-or-create client.cinder-backup mon 'profile rbd' osd 'profile rbd pool=backups' mgr 'profile rbd pool=backups'

    Add the keyrings for client.cinder, client.glance, and client.cinder-backup to the appropriate nodes and change their ownership:

    ceph auth get-or-create client.glance | ssh {your-glance-api-server} sudo tee /etc/ceph/ceph.client.glance.keyring
    ssh {your-glance-api-server} sudo chown glance:glance /etc/ceph/ceph.client.glance.keyring
    ceph auth get-or-create client.cinder | ssh {your-volume-server} sudo tee /etc/ceph/ceph.client.cinder.keyring
    ssh {your-cinder-volume-server} sudo chown cinder:cinder /etc/ceph/ceph.client.cinder.keyring
    ceph auth get-or-create client.cinder-backup | ssh {your-cinder-backup-server} sudo tee /etc/ceph/ceph.client.cinder-backup.keyring
    ssh {your-cinder-backup-server} sudo chown cinder:cinder /etc/ceph/ceph.client.cinder-backup.keyring

      Nodes running nova-compute also need to store the secret key of the client.cinder user in libvirt. The libvirt process needs it to access the cluster while attaching a block device from Cinder.

      Create a temporary copy of the secret key on the nodes running nova-compute:

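      For example, the key can be pushed to each compute node over SSH; {your-compute-node} below is a placeholder for the compute host's name:

      ceph auth get-key client.cinder | ssh {your-compute-node} tee client.cinder.key
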
      Then, on the compute nodes, add the secret key to libvirt and remove the temporary copy of the key:

      uuidgen
      457eb676-33da-42ec-9a8c-9293d545c337

      cat > secret.xml <<EOF
      <secret ephemeral='no' private='no'>
        <uuid>457eb676-33da-42ec-9a8c-9293d545c337</uuid>
        <usage type='ceph'>
          <name>client.cinder secret</name>
        </usage>
      </secret>
      EOF
      sudo virsh secret-define --file secret.xml
      Secret 457eb676-33da-42ec-9a8c-9293d545c337 created
      sudo virsh secret-set-value --secret 457eb676-33da-42ec-9a8c-9293d545c337 --base64 $(cat client.cinder.key) && rm client.cinder.key secret.xml

      Save the uuid of the secret for configuring nova-compute later.

      Important

      You don’t necessarily need the UUID on all the compute nodes. However, from a platform consistency perspective, it’s better to keep the same UUID.

      Configuring Glance

      Glance can use multiple back ends to store images. To use Ceph block devices by default, configure Glance like the following.

      Kilo and after

      Edit /etc/glance/glance-api.conf and add under the [glance_store] section:

      [glance_store]
      stores = rbd
      default_store = rbd
      rbd_store_pool = images
      rbd_store_user = glance
      rbd_store_ceph_conf = /etc/ceph/ceph.conf
      rbd_store_chunk_size = 8

      For more information about the configuration options available in Glance please refer to the OpenStack Configuration Reference: http://docs.openstack.org/.

      Enable copy-on-write cloning of images

      Note that this exposes the back end location via Glance’s API, so the endpoint with this option enabled should not be publicly accessible.

      Any OpenStack version except Mitaka

      If you want to enable copy-on-write cloning of images, also add under the [DEFAULT] section:

      show_image_direct_url = True

      Disable cache management (any OpenStack version)

      Disable the Glance cache management to avoid images getting cached under /var/lib/glance/image-cache/, assuming your configuration file has flavor = keystone+cachemanagement:

      [paste_deploy]
      flavor = keystone

      Image properties

      We recommend using the following properties for your images; an example command for setting them follows the list:

      • hw_scsi_model=virtio-scsi: adds the virtio-scsi controller, which provides better performance and support for the discard operation

      • hw_disk_bus=scsi: connects every Cinder block device to that controller

      • hw_qemu_guest_agent=yes: enables the QEMU guest agent

      • os_require_quiesce=yes: sends fs-freeze/thaw calls through the QEMU guest agent

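      For example, the properties can be applied to an existing image with the unified openstack client (a sketch; {image-id} is a placeholder for your image's ID, and the older glance client offers an equivalent image-update command):

      openstack image set --property hw_scsi_model=virtio-scsi \
                          --property hw_disk_bus=scsi \
                          --property hw_qemu_guest_agent=yes \
                          --property os_require_quiesce=yes {image-id}
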
      Configuring Cinder

      OpenStack requires a driver to interact with Ceph block devices. You must also specify the pool name for the block device. On your OpenStack node, edit /etc/cinder/cinder.conf by adding:

      [DEFAULT]
      ...
      enabled_backends = ceph
      glance_api_version = 2
      ...
      [ceph]
      volume_driver = cinder.volume.drivers.rbd.RBDDriver
      volume_backend_name = ceph
      rbd_pool = volumes
      rbd_ceph_conf = /etc/ceph/ceph.conf
      rbd_max_clone_depth = 5
      rbd_store_chunk_size = 4
      rados_connect_timeout = -1

      If you are using cephx authentication, also configure the user and the UUID of the secret you added to libvirt as documented earlier:

      [ceph]
      ...
      rbd_user = cinder
      rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337

      Note that if you are configuring multiple Cinder back ends, glance_api_version = 2 must be in the [DEFAULT] section.

      Configuring Cinder Backup

      OpenStack Cinder Backup requires a specific daemon, so don't forget to install it. On your cinder-backup node, edit /etc/cinder/cinder.conf and add:

      backup_driver = cinder.backup.drivers.ceph
      backup_ceph_conf = /etc/ceph/ceph.conf
      backup_ceph_user = cinder-backup
      backup_ceph_chunk_size = 134217728
      backup_ceph_pool = backups
      backup_ceph_stripe_unit = 0
      backup_ceph_stripe_count = 0
      restore_discard_excess_bytes = true

      Configuring Nova to attach Ceph RBD block device

      In order to attach Cinder devices (either normal block devices or by issuing a boot from volume), you must tell Nova (and libvirt) which user and UUID to refer to when attaching the device. libvirt will refer to this user when connecting and authenticating with the Ceph cluster.

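      In /etc/nova/nova.conf, this typically means adding the Cinder user and the secret UUID generated earlier to the [libvirt] section (a sketch; reuse the UUID you actually created):

      [libvirt]
      ...
      rbd_user = cinder
      rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337
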
      These two flags are also used by the Nova ephemeral backend.

      In order to boot all the virtual machines directly into Ceph, you must configure the ephemeral backend for Nova.

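      As a sketch of what this looks like, the following options go in the [libvirt] section of /etc/nova/nova.conf (adjust the pool name if yours differs):

      [libvirt]
      images_type = rbd
      images_rbd_pool = vms
      images_rbd_ceph_conf = /etc/ceph/ceph.conf
      disk_cachemodes = "network=writeback"
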
      It is recommended to enable the RBD cache in your Ceph configuration file (enabled by default since Giant). Moreover, enabling the admin socket brings a lot of benefits while troubleshooting. Having one socket per virtual machine using a Ceph block device will help when investigating performance problems and/or unexpected behavior.

      This socket can be accessed like this:

      ceph daemon /var/run/ceph/ceph-client.cinder.19195.32310016.asok help

      Now, on every compute node, edit your Ceph configuration file:

      [client]
      rbd cache = true
      rbd cache writethrough until flush = true
      admin socket = /var/run/ceph/guests/$cluster-$type.$id.$pid.$cctid.asok
      log file = /var/log/qemu/qemu-guest-$pid.log
      rbd concurrent management ops = 20

      Configure the permissions of these paths:

      mkdir -p /var/run/ceph/guests/ /var/log/qemu/
      chown qemu:libvirtd /var/run/ceph/guests /var/log/qemu/

      Note that the user qemu and group libvirtd can vary depending on your system. The provided example works for Red Hat-based systems.

      Tip

      If your virtual machine is already running, you can simply restart it to get the socket.

      To activate the Ceph block device driver and load the block device pool name into the configuration, you must restart OpenStack. Thus, for Debian-based systems execute these commands on the appropriate nodes:

      sudo glance-control api restart
      sudo service nova-compute restart
      sudo service cinder-volume restart
      sudo service cinder-backup restart

      For Red Hat based systems execute:

      sudo service openstack-glance-api restart
      sudo service openstack-nova-compute restart
      sudo service openstack-cinder-volume restart
      sudo service openstack-cinder-backup restart

      Once OpenStack is up and running, you should be able to create a volume and boot from it.

      You can create a volume from an image using the Cinder command line tool:

      cinder create --image-id {id of image} --display-name {name of volume} {size of volume}

      You can use qemu-img to convert from one format to another. For example:

      qemu-img convert -f {source-format} -O {output-format} {source-filename} {output-filename}
      qemu-img convert -f qcow2 -O raw precise-cloudimg.img precise-cloudimg.raw

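      The converted raw image can then be uploaded to Glance; for example, with the unified openstack client (a sketch, the image name is arbitrary):

      openstack image create --disk-format raw --container-format bare --file precise-cloudimg.raw precise-cloudimg
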
      When Glance and Cinder are both using Ceph block devices, the image is a copy-on-write clone, so it can create a new volume quickly. In the OpenStack dashboard, you can boot from that volume by performing the following steps:

      • Launch a new instance.

      • Choose the image associated with the copy-on-write clone.

      • Select ‘boot from volume’.
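
      Alternatively, the same can be done from the command line; the following is a sketch using the legacy nova client, where {volume-id}, the flavor name and the instance name are placeholders:

      nova boot --flavor m1.small --boot-volume {volume-id} {name of instance}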