Throughout the book, I will continue to reinforce the need for a data dictionary to help with the analysis of data. As with any data we have uncovered so far, a data dictionary can come in all shapes and sizes. It could be documented outside the source file, commonly on a help page, a wiki, or a blog, or within the source data itself, as we discussed with XML and JSON files.
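As a minimal sketch of what a data dictionary stored within the source data might look like, the following example assumes a JSON document that carries its own field definitions next to the records. The key names `data_dictionary` and `records`, the sample fields, and the `undocumented_fields` helper are all hypothetical and chosen only for illustration; real sources will organize this differently.

```python
import json

# Hypothetical JSON payload: the "data_dictionary" and "records" keys are
# illustrative names, not a standard layout.
raw = """
{
  "data_dictionary": {
    "customer_id": {"type": "integer", "description": "Unique customer identifier"},
    "signup_date": {"type": "string", "description": "Date the account was created, YYYY-MM-DD"}
  },
  "records": [
    {"customer_id": 101, "signup_date": "2019-03-14", "region": "EMEA"},
    {"customer_id": 102, "signup_date": "2019-07-02", "region": "APAC"}
  ]
}
"""

payload = json.loads(raw)
dictionary = payload["data_dictionary"]
records = payload["records"]


def undocumented_fields(records, dictionary):
    """Return the sorted field names found in records but missing from the dictionary."""
    seen = set()
    for record in records:
        seen.update(record.keys())
    return sorted(seen - set(dictionary))


# Print each documented field with its type and description.
for name, meta in dictionary.items():
    print(f"{name} ({meta['type']}): {meta['description']}")

# Flag any field that appears in the data without a definition.
print("Undocumented fields:", undocumented_fields(records, dictionary))
```

In this sample, the `region` field appears in the records without a definition, which is the kind of gap worth raising with whoever owns the data before relying on that field in an analysis.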
Having the data defined and documented will aid you in the journey to understand it, but it will not be the only method required to become a domain expert for a dataset. Domain expertise comes from experience with how the data is used, along with an understanding of the business or purpose behind the underlying source data. We covered some of these concepts in Chapter 1, Fundamentals of Data Analysis, looking at how Know Your Data (KYD) and having a data dictionary available aid in the effort to learn more about the underlying dataset.