Mastering Gephi Network Visualization

Primary windows


The three main operating windows discussed earlier are covered in the following sections. While there will be some details provided within each of these sections, this book will not provide a comprehensive guide to the functionality of each and every option. Additional information is available via the Gephi documentation and forums, as well as through my introductory book on Gephi—Network Analysis and Visualization with Gephi, Packt Publishing.

Let's begin with the data laboratory, which will be the repository for our graph data.

Data laboratory

All data that feeds our network graphs will reside in the data laboratory. The laboratory is built around the concepts of nodes and edges, which we covered extensively earlier in this chapter. While the data laboratory might have a spreadsheet-like appearance, do not confuse it with the likes of Excel, Calc, or Google Spreadsheet. Certain aspects of data manipulation can be done here, but it is best to have your base data largely prepared prior to importing it into Gephi. For instance, I find it much easier to utilize a spreadsheet tool when there is a need to create distinct values within a categorical field. Likewise, any field values that are based on a specific sorting scheme might be best created outside of Gephi.

This is not to say that data held within the laboratory is fully static. For example, all statistical and clustering calculations will automatically append new values to each node when a process is run. You can also add columns, copy data from one column into another, and so on. Still, making individual node or edge-level changes here can become tedious (and very time consuming), especially if your dataset consists of hundreds or thousands of values.

There are several ways we can add data to the laboratory, from the very basic to the more complex (albeit more powerful). Here are a few of them:

  • Manual entry

  • CSV import

  • Excel import

  • MySQL import

  • Graph file imports

Let's briefly discuss each of these options. I will not attempt to go into each and every use case, as that alone could fill an entire book. Instead, we'll look at some generic examples, and I recommend the Gephi forums for cases that are beyond the scope of this book. Examples for each of these processes will be provided in the Appendix, Data Sources and Other Web Resources.

Manual entry

If you are working with a very small dataset, or are very skilled at data entry, there is a manual option to create a Gephi dataset. This approach can be useful for those who wish to experiment, but it is discouraged for all but the smallest networks. Importing data from a .csv format is so easy that it makes little sense to choose the manual option beyond the simplest of scenarios.

CSV import

One of the simplest ways to move data into Gephi is through the use of comma-separated values (.csv) files. You can prepare the data in Excel, Calc, Google Spreadsheet, or any other application that can save or export files in the .csv format. To make the data transfer even simpler, Gephi needs only an edge file, as it will create the nodes automatically from the edge endpoints. However, if you wish to add more detail to describe your nodes, I recommend that you create separate node and edge worksheets, each saved as its own .csv file.
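
As a rough illustration of the node and edge files described above, the following Python sketch builds both tables from a small edge list. The sample names, the Department column, and the output file names are made up for this example; the Id, Label, Source, Target, and Weight headers follow the column names Gephi's spreadsheet import conventionally recognizes, so treat the layout as a starting point rather than a fixed requirement.

```python
import csv
import io
import tempfile
from pathlib import Path

# Hypothetical sample data, invented purely for illustration.
edges = [
    ("alice", "bob", 3),
    ("bob", "carol", 1),
    ("alice", "carol", 2),
]
node_details = {"alice": "Sales", "bob": "Engineering", "carol": "Sales"}


def edges_to_csv(rows):
    """Return edge-table CSV text with Source, Target, and Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(rows)
    return buffer.getvalue()


def nodes_to_csv(rows, details):
    """Return node-table CSV text; node ids are collected from the edge rows."""
    ids = sorted({end for source, target, _ in rows for end in (source, target)})
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Id", "Label", "Department"])
    for node_id in ids:
        writer.writerow([node_id, node_id.title(), details.get(node_id, "")])
    return buffer.getvalue()


edge_csv = edges_to_csv(edges)
node_csv = nodes_to_csv(edges, node_details)

# A temporary folder keeps the sketch self-contained; use your own project folder.
output_dir = Path(tempfile.mkdtemp())
(output_dir / "edges.csv").write_text(edge_csv)
(output_dir / "nodes.csv").write_text(node_csv)
print(edge_csv)
print(node_csv)
```

Collecting the node ids from the edge rows mirrors what Gephi does when only an edge file is supplied; the separate node file matters only when you want extra descriptive columns such as the Department field used here.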

Excel import

Excel users can easily load data into Gephi using the Excel/csv converter to network plugin from the Gephi Marketplace. This plugin takes a more flexible approach than the data laboratory import spreadsheet process. More information on this approach can be found at the Gephi Marketplace at https://marketplace.gephi.org/.

MySQL import

Gephi users who have data housed in the open source MySQL database can also import it directly by creating specific tables for nodes and edges, and then pointing Gephi to the database using connection parameters.
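
To make the idea of dedicated node and edge tables more concrete, here is a minimal Python sketch of how such tables might be laid out and queried. Python's built-in sqlite3 module is used only as a stand-in so that the example runs without a database server; it is not MySQL. The table names, column names, and sample rows are assumptions chosen for illustration, and you should check the database import dialog in your Gephi version for the exact columns it expects.

```python
import sqlite3

# sqlite3 is a stand-in here; against MySQL you would create equivalent tables
# and supply the host, port, database name, and credentials to Gephi instead.
connection = sqlite3.connect(":memory:")
connection.executescript(
    """
    CREATE TABLE nodes (id VARCHAR(32) PRIMARY KEY, label VARCHAR(64));
    CREATE TABLE edges (source VARCHAR(32), target VARCHAR(32), weight REAL);
    """
)

# Hypothetical sample rows, invented for illustration.
connection.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [("a", "Alice"), ("b", "Bob"), ("c", "Carol")],
)
connection.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [("a", "b", 3.0), ("b", "c", 1.0), ("a", "c", 2.0)],
)

# One query per table: the first describes the nodes, the second the edges.
NODE_QUERY = "SELECT id, label FROM nodes ORDER BY id"
EDGE_QUERY = "SELECT source, target, weight FROM edges ORDER BY source, target"

node_rows = connection.execute(NODE_QUERY).fetchall()
edge_rows = connection.execute(EDGE_QUERY).fetchall()

# Sanity check before importing: every edge endpoint should be a known node id.
known_ids = {node_id for node_id, _ in node_rows}
dangling = [
    (source, target)
    for source, target, _ in edge_rows
    if source not in known_ids or target not in known_ids
]
print(node_rows)
print(edge_rows)
print("dangling edges:", dangling)
```

Checking for dangling endpoints before the import is a small habit that can save time, since edges that point at missing node ids are a common source of surprises once the graph is loaded.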

Graph file import

Gephi also enables the use of multiple graph file formats, making it very easy for the users of other software (UCINET, Pajek, GUESS, and so on) to import existing files into Gephi.
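
For readers who have not seen a graph file before, the following Python sketch writes a very small network in GraphML, one of the graph formats Gephi can open. GraphML is chosen here only as an example; the node ids, the edge list, and the build_graphml helper are hypothetical and exist purely for this illustration.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def build_graphml(nodes, edges, directed=False):
    """Return a minimal GraphML document as text (hypothetical helper)."""
    root = ET.Element("graphml", {"xmlns": GRAPHML_NS})
    graph = ET.SubElement(
        root,
        "graph",
        {"id": "G", "edgedefault": "directed" if directed else "undirected"},
    )
    # Each node needs a unique id; edges refer to nodes by that id.
    for node_id in nodes:
        ET.SubElement(graph, "node", {"id": node_id})
    for source, target in edges:
        ET.SubElement(graph, "edge", {"source": source, "target": target})
    return ET.tostring(root, encoding="unicode")


# Hypothetical three-node network.
graphml_text = build_graphml(
    nodes=["n0", "n1", "n2"],
    edges=[("n0", "n1"), ("n1", "n2")],
)
print(graphml_text)
```

Saving the returned text with a .graphml extension should give a file that can be opened from Gephi's file menu. Unlike a .csv edge list, a graph file carries the nodes, the edges, and the directed or undirected setting together in one place.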

As a final note, we also have the ability to merge files in Gephi, either by using the data laboratory or simply by opening a second (or subsequent) file.

Graph window

All visual output is initially viewed using the graph window, with Gephi providing a somewhat crude initial view of your network. The initial view is very simple given that we have not selected any sort of layout at this stage; this is an issue that will soon be rectified. It is highly likely that the majority of your time will be spent working within the graph window, observing the patterns within your network. All applications of filtering, partitioning, sizing, coloring, and any layout adjustments will be seen here first, so it is wise to become very familiar with this space, if you haven't already done so.

You will observe that the graph window is adjacent to multiple toolbars, each with an array of functions. The functionality behind each of these options is generally intuitive, and should be explored for further understanding. This book will not spend considerable time on each of these functions. For a primer on these, my introductory book on Gephi provides greater detail, or, alternatively, take some time to play with each option and observe what happens to your graph.

Preview window

The Gephi preview window allows the user to adjust a variety of graph attributes that have been created in the original graph window. Here, we can customize node labels by adjusting font size, font color, and outlines, specifying whether to use boxes for the labels, and electing whether to display the labels at all. These decisions can be made based on the density and complexity of our graph; dense graphs might benefit from labeling only the critical nodes. Using Inkscape or Adobe Illustrator to create labels after exporting the graph is another option that allows greater customization.

Node appearance is also addressed through border width, border color, and opacity options. As with the node labels, you can elect to use external tools to provide a greater degree of customization, where you have the ability to color individual nodes and edges rather than applying a one-size-fits-all approach. Remember that you can always toggle to the overview window to do many of these customizations in Gephi, and then simply refresh the preview window.

Additional options are provided for adjusting the appearance of graph edges. Edge thickness, color, opacity, radius, and curved edges are all available options. Likewise, edge arrows (for directed graphs) and edge labels are customizable.

The preview window is also where some of Gephi's built-in export options reside, specifically, in the SVG, PDF, and PNG formats. Let's briefly consider each of these options, and why you might select one over another:

  • PNG: This represents the simplest choice. It creates an image of your network, making it easy to share it online or elsewhere, provided you have no desire to further enhance the graph. This option is ideal for sharing a quick snapshot of your work on the Web or via e-mail, but is obviously limited from an editing standpoint.

  • SVG: The SVG export creates a scalable vector graphic that can be edited in other programs such as Inkscape. Because the resulting files can be large, this format might be most suitable for graphs without a high degree of complexity.

  • PDF: The PDF export offers some of the advantages of SVG minus the large footprint. This format is also editable in Inkscape and Illustrator, and will ultimately allow you to customize every aspect of your graph, as well as to add titles or other notations describing the graph.