Experimental Features

    We can re-interpret a pre-partitioned data stream as a keyed stream to avoid shuffling.

    One use case is a materialized shuffle between two jobs: the first job performs a keyBy shuffle and materializes each output into a partition. A second job has sources that, for each parallel instance, read from the corresponding partitions created by the first job. Those sources can then be re-interpreted as keyed streams, e.g. to apply windowing. Note that this trick makes the second job embarrassingly parallel, which can be helpful for a fine-grained recovery scheme.
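
    The materialized shuffle described above can be illustrated with a minimal plain-Java sketch that uses no streaming framework. All names here (MaterializedShuffleSketch, Event, partitionFor, firstJob, secondJob) are hypothetical, and the hash-based partitioner is an assumption chosen for illustration; it is not the key assignment of any real system. The sketch only shows why no second shuffle is needed: once every key lives in exactly one partition, each parallel instance can aggregate its own partition independently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration of a materialized shuffle; all names are hypothetical. */
public class MaterializedShuffleSketch {

    /** A keyed record used only for this illustration. */
    public static final class Event {
        final String key;
        final int value;

        public Event(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Assumed partitioner: maps a key to one of numPartitions partitions. */
    public static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** "First job": shuffle by key once and materialize each output as a partition. */
    public static List<List<Event>> firstJob(List<Event> input, int numPartitions) {
        List<List<Event>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (Event e : input) {
            partitions.get(partitionFor(e.key, numPartitions)).add(e);
        }
        return partitions;
    }

    /** One parallel instance of the "second job": reads only its own partition. */
    public static Map<String, Integer> secondJobInstance(List<Event> partition) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Event e : partition) {
            sums.merge(e.key, e.value, Integer::sum);
        }
        return sums;
    }

    /** Runs every instance independently and collects the per-key results. */
    public static Map<String, Integer> secondJob(List<List<Event>> partitions) {
        Map<String, Integer> all = new TreeMap<>();
        for (List<Event> partition : partitions) {
            for (Map.Entry<String, Integer> e : secondJobInstance(partition).entrySet()) {
                // A key seen in two partitions would mean the data was not pre-partitioned by key.
                if (all.putIfAbsent(e.getKey(), e.getValue()) != null) {
                    throw new IllegalStateException("Key in more than one partition: " + e.getKey());
                }
            }
        }
        return all;
    }

    public static void main(String[] args) {
        List<Event> input = List.of(
                new Event("a", 1), new Event("b", 2), new Event("a", 3),
                new Event("c", 4), new Event("b", 5));
        System.out.println(secondJob(firstJob(input, 3))); // prints {a=4, b=7, c=4}
    }
}
```

    Because the instances of the second job never exchange records, a failed instance can in principle be restarted by re-reading only its own partition, which is the property the fine-grained recovery remark relies on. In a real deployment the partitions would have to match the key assignment that the framework itself uses, which this sketch does not attempt to reproduce.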

    Given a base stream, a key selector, and type information, the method creates a keyed stream from the base stream.