Broadcast variables let us make a dataset available, as a collection, to all parallel instances of an operator. They are generally used when we want to refer to a small amount of data frequently in a certain operation. Those who are familiar with Spark broadcast variables will be able to use the same feature in Flink as well.
We just need to broadcast a dataset under a specific name, and it will be available on every node that runs the operation. Broadcast variables are kept in memory, so we have to be cautious in using them and keep the broadcast dataset small. The following code snippet shows how to broadcast a dataset and use it as needed.
    // Get a data set to be broadcasted
    DataSet<Integer> toBroadcast = env.fromElements(1, 2, 3);
    DataSet<String> data = env.fromElements("India", "USA", "UK").map(new RichMapFunction<String, String>() {
        private List<Integer> toBroadcast;

        // We have to use open method to get broadcast set from the context
        @Override
        public void open(Configuration parameters...
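To make the lifecycle easier to follow, here is a minimal plain-Java sketch of the idea. It is a model only and does not use Flink: the class `BroadcastSketch`, the `SumAppender` function, the registry map, and the name "numbers" are all hypothetical and chosen for illustration. It shows a small dataset being registered under a name, fetched once in an `open` step, and then reused for every record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the broadcast-variable lifecycle. Nothing here is Flink API;
// every class, method, and name below is hypothetical.
final class BroadcastSketch {

    // Stand-in for a rich map function: open() runs once per task,
    // map() runs once per record.
    static final class SumAppender {
        private List<Integer> broadcast;

        void open(Map<String, List<Integer>> registry) {
            // Fetch the broadcast collection by the name it was registered under.
            broadcast = registry.get("numbers");
        }

        String map(String input) {
            int sum = 0;
            for (int n : broadcast) {
                sum += n;
            }
            return input + ":" + sum;
        }
    }

    // Registers the small data set under a name, opens the function once,
    // then applies it to every record.
    static List<String> run(List<String> records, List<Integer> toBroadcast) {
        Map<String, List<Integer>> registry = new HashMap<>();
        // Read-only copy held in memory, which is why the set should stay small.
        registry.put("numbers", Collections.unmodifiableList(new ArrayList<>(toBroadcast)));

        SumAppender fn = new SumAppender();
        fn.open(registry);

        List<String> out = new ArrayList<>();
        for (String record : records) {
            out.add(fn.map(record));
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [India:6, USA:6, UK:6]
        System.out.println(run(Arrays.asList("India", "USA", "UK"), Arrays.asList(1, 2, 3)));
    }
}
```

In Flink's DataSet API the same two steps correspond to calling `withBroadcastSet(dataSet, name)` on the operator and `getRuntimeContext().getBroadcastVariable(name)` inside `open`. Fetching the collection in `open` rather than in `map` means the lookup happens once per task instead of once per record.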