To operate on datasets or datastreams, we first need to register a table in TableEnvironment. Once the table is registered under a unique name, it can be easily accessed from TableEnvironment.
TableEnvironment maintains an internal table catalogue for table registration. The following diagram shows the details:
Table names must be unique; otherwise, you will get an exception.
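The uniqueness rule can be pictured with a small stand-alone sketch. This is a toy model of a name-keyed catalogue, not Flink's actual implementation: the class `TableCatalogSketch`, its methods `registerTable` and `getTable`, and the choice of `IllegalArgumentException` are all hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy stand-in for a table catalogue; NOT Flink's implementation.
// All names here are hypothetical and exist only for illustration.
final class TableCatalogSketch {
    // Maps each registered table name to its table object.
    private final Map<String, Object> tables = new LinkedHashMap<>();

    // Registers a table under a name, rejecting names that are already taken.
    void registerTable(String name, Object table) {
        if (tables.containsKey(name)) {
            throw new IllegalArgumentException("Table '" + name + "' is already registered");
        }
        tables.put(name, table);
    }

    // Looks up a previously registered table by its unique name.
    Object getTable(String name) {
        if (!tables.containsKey(name)) {
            throw new IllegalArgumentException("Table '" + name + "' is not registered");
        }
        return tables.get(name);
    }

    public static void main(String[] args) {
        TableCatalogSketch catalog = new TableCatalogSketch();
        catalog.registerTable("WordCount", "first dataset");
        try {
            // Reusing the same name is rejected with an exception.
            catalog.registerTable("WordCount", "second dataset");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected duplicate: " + e.getMessage());
        }
    }
}
```

In this sketch the name is the only lookup key, which is why a second registration under the same name has to fail rather than silently replace the first table.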
To perform SQL operations on a dataset, we need to register it as a table in BatchTableEnvironment. We also need to define a Java POJO class when registering the table.
For instance, let's say we need to register a dataset called Word Count. Each record in this table will have word and frequency attributes. The corresponding Java POJO would look like the following:
public static class WC {
    public String word;
    public long frequency;

    public WC() { }

    public WC(String word, long frequency) {
        this.word = word;
        this.frequency...