To understand the relationships between city airports and the flights that connect them, we can use motifs to find patterns of airports linked by flights. The result is a DataFrame whose column names are given by the motif keys.
To make it easier to view our data within the context of motifs, let's first create a smaller GraphFrame called graphSmall:
edgesSubset = deptsDelays_GEO.select("tripid", "delay", "src", "dst")
graphSmall = GraphFrame(vertices, edgesSubset)
To run a motif query, execute the following command:
motifs = (
    graphSmall
    .find("(a)-[ab]->(b); (b)-[bc]->(c)")
    .filter("""
        (b.id = 'SFO')
        and (ab.delay > 500 or bc.delay > 500)
        and bc.tripid > ab.tripid
        and bc.tripid < ab.tripid + 10000
    """)
)
display(motifs)
The result of this query can be seen as follows:
Output of the motif query
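To make the meaning of the motif and its filter more concrete, the following is a minimal pure-Python sketch of the same logic. It does not use Spark or GraphFrames. The edge values are invented for illustration, and the helper names find_two_hop and keep are hypothetical. Each result row is keyed by the motif names a, ab, b, bc, and c, mirroring the column names of the DataFrame described above. Here the vertex entries carry only an id, whereas a real GraphFrame vertex column would carry all vertex attributes.

```python
# Pure-Python sketch of the motif "(a)-[ab]->(b); (b)-[bc]->(c)" and its filter.
# The edges below are made-up sample data, not taken from the flight dataset.
edges = [
    {"tripid": 1010710, "delay": 600, "src": "SEA", "dst": "SFO"},
    {"tripid": 1011030, "delay": 15, "src": "SFO", "dst": "JFK"},
    {"tripid": 1012000, "delay": 700, "src": "SFO", "dst": "LAX"},
    {"tripid": 1030900, "delay": 5, "src": "SFO", "dst": "ORD"},
    {"tripid": 1010800, "delay": 20, "src": "PDX", "dst": "SFO"},
    {"tripid": 1010900, "delay": 900, "src": "SEA", "dst": "LAX"},
]


def find_two_hop(edges):
    """Yield one row per pair of edges (ab, bc) that share the middle vertex b."""
    for ab in edges:
        for bc in edges:
            if ab["dst"] == bc["src"]:
                yield {
                    "a": {"id": ab["src"]},
                    "ab": ab,
                    "b": {"id": ab["dst"]},
                    "bc": bc,
                    "c": {"id": bc["dst"]},
                }


def keep(row):
    """Apply the same conditions as the filter expression in the motif query."""
    ab, bc = row["ab"], row["bc"]
    return (
        row["b"]["id"] == "SFO"
        and (ab["delay"] > 500 or bc["delay"] > 500)
        and bc["tripid"] > ab["tripid"]
        and bc["tripid"] < ab["tripid"] + 10000
    )


motif_rows = [row for row in find_two_hop(edges) if keep(row)]

for row in motif_rows:
    print(row["a"]["id"], "->", row["b"]["id"], "->", row["c"]["id"])
```

With this sample data, three two-leg paths through SFO survive the filter. The pair ending in ORD is dropped because its second tripid falls outside the 10000 window, and the pairs where neither leg is delayed by more than 500 are dropped by the delay condition.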