The preceding Incanter dataset is an easily comprehensible representation of our data, but to extract the numbers for each of the groups individually we'll want to store the data in a more readily accessible data structure. Let's write a function to convert the dataset to a series of nested maps:
(defn frequency-map [sum-column group-cols dataset]
  (let [f (fn [freq-map row]
            (let [groups (map row group-cols)]
              (->> (get row sum-column)
                   (assoc-in freq-map groups))))]
    (->> (frequency-table sum-column group-cols dataset)
         (:rows)
         (reduce f {}))))
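The reducing step at the heart of this function can be tried on its own, without Incanter. What follows is a minimal sketch, assuming the rows arrive as plain Clojure maps. The names `rows` and `nest-counts` are hypothetical and not part of the original code; the counts are borrowed from the Titanic output shown in this section.

```clojure
;; Hypothetical stand-in for the :rows of a frequency table.
(def rows
  [{:sex "female" :survived "y" :count 339}
   {:sex "female" :survived "n" :count 127}
   {:sex "male"   :survived "y" :count 161}
   {:sex "male"   :survived "n" :count 682}])

(defn nest-counts
  "Folds rows into nested maps keyed by each row's values for group-cols."
  [sum-column group-cols rows]
  (reduce (fn [freq-map row]
            ;; (map row group-cols) looks up each grouping column in the row,
            ;; giving the key path at which assoc-in stores the count.
            (assoc-in freq-map (map row group-cols) (get row sum-column)))
          {}
          rows))

(nest-counts :count [:sex :survived] rows)
;; => {"female" {"y" 339, "n" 127}, "male" {"y" 161, "n" 682}}
```

Because `assoc-in` creates intermediate maps as needed, the same fold works for any number of grouping columns.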
For example, we can use the frequency-map function as follows to calculate a nested map of :sex and :survived:
(defn ex-4-3 []
  (->> (load-data "titanic.tsv")
       (frequency-map :count [:sex :survived])))

;; => {"female" {"y" 339, "n" 127}, "male" {"y" 161, "n" 682}}
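With the counts in nested maps, the number for an individual group can be read with `get-in`. The sketch below assumes the result above has been bound to a var named `freq-map` (a hypothetical name used only for illustration):

```clojure
;; The nested map returned by ex-4-3, written out literally.
(def freq-map
  {"female" {"y" 339, "n" 127}
   "male"   {"y" 161, "n" 682}})

;; Look up a single group by its path of grouping values.
(get-in freq-map ["female" "y"]) ;; => 339
(get-in freq-map ["male" "n"])   ;; => 682

;; Total passengers in one top-level group.
(reduce + (vals (get freq-map "female"))) ;; => 466
```

The key path follows the order of the grouping columns, so here the :sex value comes first and the :survived value second.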
More generally, given any dataset and sequence of columns...