KernelChaos Experiment

    Although KernelChaos targets a specific pod, the performance of other pods is also affected, to a degree that depends on the specific call chain and the injection frequency. This is because all pods on the same host share the same kernel.

    Below is a sample KernelChaos configuration file:

    For more sample files, see examples. You can edit them as needed.
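
    As a rough sketch of how the fields described on this page fit together, a KernelChaos manifest might look like the following. The apiVersion, metadata values, mode, selector, the funcname key, and the function name are assumptions for illustration and are not taken from this page. The exact key casing is also assumed: this page writes failkernRequest, while the sketch uses failKernRequest. Check all of these against the examples for your version.

```yaml
# Sketch only: every value marked "assumed" or "hypothetical" is illustrative.
apiVersion: chaos-mesh.org/v1alpha1   # assumed API group/version
kind: KernelChaos
metadata:
  name: kernel-chaos-example          # hypothetical name
  namespace: chaos-testing            # hypothetical namespace
spec:
  mode: one                           # assumed: inject into one matching pod
  selector:
    namespaces:
      - chaos-mount                   # hypothetical target namespace
  failKernRequest:                    # key casing assumed
    failtype: 0                       # 0 = slab allocation (should_failslab)
    callchain:
      - funcname: "__x64_sys_mount"   # assumed key naming the kernel function
    times: 1                          # maximum number of injected failures
  duration: "10s"                     # each experiment lasts 10 seconds
  scheduler:
    cron: "@every 60s"                # robfig/cron syntax
```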

    Description:

    • failkernRequest defines the specified injection mode (kmalloc, bio, etc.) with a call chain and an optional set of predicates. The fields are:

      • failtype indicates what to fail. It can be set to 0, 1, or 2.

        • 0 makes slab allocation fail (should_failslab)
        • 1 makes alloc_page fail (should_fail_alloc_page)
        • 2 makes bio fail (should_fail_bio)

        For more information, see inject_example.

      • callchain indicates a special call chain, such as:

        The call chain can carry an optional set of predicates and an optional set of parameters, which are used with the predicates. If there is no special call chain, leave callchain empty, which means the failure is injected at any call chain with slab alloc (e.g., kmalloc).

        The callchain’s type is an array of frames. Each frame has three fields:

        • parameters is used with predicate. For example, to inject a slab error in d_alloc_parallel(struct dentry *parent, const struct qstr *name) only for the special name bananas, set parameters to the function’s parameter list (struct dentry *parent, const struct qstr *name); otherwise, omit it.
        • predicate accesses the arguments of this frame. Continuing the parameters example, set it to STRNCMP(name->name, "bananas", 8) to inject only for that name, or omit it to inject for every d_alloc_parallel call chain.
      • headers indicates the kernel headers you need, for example “linux/mmzone.h” and “linux/blkdev.h”.

      • times indicates the maximum number of failures to inject.

    • duration defines the duration of each chaos experiment. In the sample file above, the chaos lasts for 10 seconds.

    • scheduler defines the scheduling rules for when the chaos experiment runs. For more information about the rules, see robfig/cron.
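
    To make the frame fields concrete, the d_alloc_parallel case from the description could be written roughly as the fragment below. This is a sketch under assumptions: the funcname key, the failKernRequest casing, and the choice of linux/dcache.h as the header are not stated on this page and should be verified before use.

```yaml
# Fragment of spec only; a sketch, not a verified configuration.
failKernRequest:                      # key casing assumed
  failtype: 0                         # 0 = slab allocation (should_failslab)
  headers:
    - "linux/dcache.h"                # assumed header for struct dentry / struct qstr
  callchain:
    - funcname: "d_alloc_parallel"    # assumed key naming the kernel function
      # Parameter list of the frame, so that the predicate can refer to `name`.
      parameters: "struct dentry *parent, const struct qstr *name"
      # Inject only when the looked-up name is "bananas".
      predicate: 'STRNCMP(name->name, "bananas", 8)'
  times: 1                            # maximum number of injected failures
```

    Omitting parameters and predicate from the frame would inject for every d_alloc_parallel call chain, as described above.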

    KernelChaos works like the inject tool: it guarantees the appropriate erroneous return of the specified injection mode (kmalloc, bio, etc.) given a call chain and an optional set of predicates.

    You can read inject_example.txt to learn more.

    Below is a sample program:

    During the injection, the output is similar to this:

    When failtype is 1, physical page allocation fails. If this happens continuously within a very short time (e.g., `while (1) {memset(malloc(1M), '1', 1M)}`), the system’s oom-killer is woken up to release memory. The container_id limit then no longer applies, because the oom-killer is not restricted to that container.