IOChaos Experiment

    IOChaos allows you to simulate file system faults such as IO delay and read/write errors. It can inject delay and errno when you use IO system calls such as open, read, and write.

    Chaos Mesh uses wait-fush.sh to ensure that the fuse-daemon server is running normally before the application starts.

    Therefore, wait-fush.sh needs to be injected into the startup command of the container. If the application process is not started by the container's startup command, IOChaos cannot work properly.

    Admission controller

    IOChaos needs to inject a sidecar container into user pods. The sidecar container can be added to applicable Kubernetes pods using an admission controller provided by Chaos Mesh.

    Chaos Mesh uses a template mechanism to simplify the configuration of sidecar injection.

    Because Go template syntax conflicts with Helm, the common template is not included in the Helm chart. However, it is deployed automatically if you install Chaos Mesh via the .

    Data directory

    The data directory of the application in the target pod should be a subdirectory of the PersistentVolume mount path.

    Example:

    # PersistentVolume mount configuration for TiKV
    volumeMounts:
      - name: datadir
        mountPath: /var/lib/tikv

    # Arguments used to start TiKV; --data-dir is a subdirectory of the mount path
    ARGS="--pd=${CLUSTER_NAME}-pd:2379 \
    --advertise-addr=${HOSTNAME}.${HEADLESS_SERVICE_NAME}.${NAMESPACE}.svc:20160 \
    --addr=0.0.0.0:20160 \
    --data-dir=/var/lib/tikv/data \
    --capacity=${CAPACITY} \
    --config=/etc/tikv/tikv.toml"
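
    The subdirectory requirement can be checked before deploying. The following is a minimal sketch, assuming a hypothetical helper named is_subdir (it is not part of Chaos Mesh) and the paths from the example above. It only compares path strings and does not resolve symlinks or ".." components.

```shell
#!/bin/sh
# Hypothetical helper (not part of Chaos Mesh): succeeds when $2 lies
# strictly below $1. This is a plain string-prefix check; it does not
# resolve symlinks or ".." components.
is_subdir() {
  case "$2" in
    "$1"/?*) return 0 ;;
    *) return 1 ;;
  esac
}

MOUNT_PATH=/var/lib/tikv        # mountPath of the datadir volume
DATA_DIR=/var/lib/tikv/data     # value passed to --data-dir

if is_subdir "$MOUNT_PATH" "$DATA_DIR"; then
  echo "ok: $DATA_DIR is under $MOUNT_PATH"
else
  echo "error: $DATA_DIR is not under $MOUNT_PATH" >&2
fi
```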

    Injection configuration

    The injection configuration is a separate ConfigMap that IOChaos requires.

    Define a ConfigMap for your application before starting your chaos experiment.

    You can apply the ConfigMap defined for your application to the Kubernetes cluster with the following command:

    kubectl apply -f app-configmap.yaml # app-configmap.yaml is the ConfigMap file

    Below is a sample YAML file of IOChaos:

    For more sample files, see examples. You can edit them as needed.

    Usage

    Before the application is created, you need to enable the admission webhook on the application namespace:

    kubectl create ns app-ns
    kubectl label ns app-ns admission-webhook=enabled

    There are then two ways to mark the pods into which you want to inject IOChaos:

    1. Set the annotation admission-webhook.chaos-mesh.org/init-request on the namespace. All pods in this namespace that meet the selector requirements are then injected.

       # set annotation
       kubectl annotate ns app-ns admission-webhook.chaos-mesh.org/init-request=chaosfs-tikv
       # create your application
       ...

    2. Set the annotation admission-webhook.chaos-mesh.org/request on the pods. For details, you can check this example.

    Then, you can start your application and define a YAML file to start your chaos experiment.

    Assuming that you are using examples/io-mixed-example.yaml, you can create the chaos experiment by applying that file with kubectl apply -f examples/io-mixed-example.yaml.

    IOChaos currently supports the following actions:

    • delay: IO delay action. You can specify the latency before the IO operation returns a result.
    • errno: IO errno action. In this mode, read/write IO operations return an error.
    • mixed: Both delay and errno actions.

    delay

    If you are using the delay mode, you can edit spec as below:

    spec:
      action: delay
      delay: '1ms'

    If delay is not specified, it is generated randomly at runtime.

    errno

    If you are using the errno mode, you can edit spec as below:

    spec:
      action: errno
      errno: '32'

    If errno is not specified, it is generated randomly at runtime.

    mixed

    If you are using the mixed mode, you can edit spec as below:

    The mixed mode defines the delay and errno actions in one spec.
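
    The mixed spec itself is not shown here. The following is a minimal sketch, assuming that it simply combines the delay and errno fields from the two preceding modes. The file name io-mixed-spec.yaml is hypothetical, and the fragment still has to be merged into a complete experiment definition.

```shell
#!/bin/sh
# Write a mixed-mode spec fragment to a file. The file name
# io-mixed-spec.yaml is hypothetical; the field values are copied from
# the delay and errno examples in this document.
cat > io-mixed-spec.yaml <<'EOF'
spec:
  action: mixed
  delay: '1ms'
  errno: '32'
EOF
```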

    Common Linux system errors

    Common Linux system errors are as below:

    • 1: Operation not permitted
    • 2: No such file or directory
    • 5: I/O error
    • 6: No such device or address
    • 12: Out of memory
    • 16: Device or resource busy
    • 17: File exists
    • 20: Not a directory
    • 22: Invalid argument
    • 24: Too many open files

    Refer to Errors: Linux System Errors for more.

    The IO methods that IOChaos can inject faults into are as below:

    • open
    • read
    • write
    • mkdir
    • rmdir
    • opendir
    • fsync
    • flush
    • release
    • truncate
    • getattr
    • chown
    • chmod
    • utimens
    • allocate
    • getlk
    • setlk
    • setlkw
    • statfs
    • readlink
    • symlink
    • create
    • access
    • link
    • mknod
    • rename
    • unlink
    • getxattr
    • listxattr
    • setxattr
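
    When experiments are scripted, it can help to check a method name against this list before using it. The following is a minimal sketch, assuming that the names above are the methods IOChaos can target. The helper is_supported_method is hypothetical and not part of Chaos Mesh.

```shell
#!/bin/sh
# Hypothetical helper (not part of Chaos Mesh): succeeds when $1 is one
# of the method names listed in this document.
is_supported_method() {
  case "$1" in
    open|read|write|mkdir|rmdir|opendir|fsync|flush|release|truncate|\
    getattr|chown|chmod|utimens|allocate|getlk|setlk|setlkw|statfs|\
    readlink|symlink|create|access|link|mknod|rename|unlink|\
    getxattr|listxattr|setxattr) return 0 ;;
    *) return 1 ;;
  esac
}

# Report which candidate names appear in the list.
for m in read write ioctl; do
  if is_supported_method "$m"; then
    echo "$m: listed"
  else
    echo "$m: not listed"
  fi
done
```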