So far, the Cascalog queries you've seen have all returned tables of results. Sometimes, however, you'll want to aggregate those tables, either to boil them down to a single value or to produce a table in which groups from the original data are aggregated.
Cascalog also makes this easy to do, and it includes a number of aggregate functions. For this recipe, we'll only use two, cascalog.logic.ops/distinct-count and cascalog.logic.ops/sum, but you can easily find more in the API documentation on the Cascalog website (http://nathanmarz.github.io/cascalog/cascalog.logic.ops.html).
We'll use the same dependencies and imports as we did in Parsing CSV Files with Cascalog. We'll also use the same data that we defined in that recipe.
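As a quick preview of the two kinds of aggregation described above, here is a minimal sketch. It assumes Cascalog 2.x is on the classpath, and it uses a small, hypothetical in-memory dataset named scores rather than the data from the earlier recipe; the variable names are illustrative only.

```clojure
;; Assumption: Cascalog 2.x, where the aggregators live in cascalog.logic.ops.
(require '[cascalog.api :refer [??<-]]
         '[cascalog.logic.ops :as c])

;; Hypothetical data: [house points] tuples, not the recipe's dataset.
(def scores
  [["red" 10]
   ["blue" 5]
   ["red" 7]
   ["blue" 3]])

;; Boil the whole table down to a single value: the total of all points.
(??<- [?total]
      (scores _ ?points)
      (c/sum ?points :> ?total))

;; Another single value: how many distinct houses appear in the data.
(??<- [?house-count]
      (scores ?house _)
      (c/distinct-count ?house :> ?house-count))

;; Aggregate per group: because ?house is an output variable,
;; Cascalog groups by it and sums the points within each group.
(??<- [?house ?total]
      (scores ?house ?points)
      (c/sum ?points :> ?total))
```

In each query, the variables listed in the output vector that are not produced by an aggregator act as the grouping keys, which is why the first two queries return one row and the third returns one row per house.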