Working with Stack Exchange data dumps
The Stack Exchange network also provides complete dumps of its data, available for download through the Internet Archive (https://archive.org/details/stackexchange). The data is distributed in 7Z, a compressed archive format with a high compression ratio (http://www.7-zip.org). To read and extract this format, you need to download the 7-Zip utility for Windows, or one of its ports for Linux/Unix and macOS.
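As a minimal sketch of how the extraction step could be scripted, the following Python code looks for a 7-Zip executable on the system path and invokes it on an archive. The function names (find_7zip, build_extract_command, extract_dump) are hypothetical helpers introduced for illustration, and the list of executable names is an assumption: which one is present depends on the platform and on how 7-Zip or its port was installed.

```python
import shutil
import subprocess
from pathlib import Path

# Executable names commonly used by 7-Zip and its ports. Which one is
# installed depends on the platform and package (an assumption here).
CANDIDATE_BINARIES = ("7z", "7zz", "7za", "7zr")


def find_7zip():
    """Return the path of the first 7-Zip executable found on PATH, or None."""
    for name in CANDIDATE_BINARIES:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_extract_command(binary, archive, out_dir):
    """Build the 7-Zip command line.

    'x' extracts with full paths, '-o<dir>' sets the output directory
    (no space after -o), and '-y' answers yes to any prompt.
    """
    return [binary, "x", str(archive), f"-o{out_dir}", "-y"]


def extract_dump(archive, out_dir="."):
    """Extract a .7z archive into out_dir using the system 7-Zip utility."""
    binary = find_7zip()
    if binary is None:
        raise RuntimeError("No 7-Zip executable found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if 7-Zip reports a failure
    subprocess.run(build_extract_command(binary, archive, out_dir), check=True)
```

With a downloaded archive such as stackoverflow.com-Posts.7z in the current directory, calling extract_dump("stackoverflow.com-Posts.7z", "data") would place the uncompressed file in a data folder. Delegating to the external utility, rather than decompressing in Python, keeps the sketch free of third-party dependencies.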
At the time of writing, the data dumps for Stack Overflow are provided as separate compressed files, with each file representing an entity or table in the dataset. For example, the stackoverflow.com-Posts.7z file contains the dump for the Posts table (that is, questions and answers). The first version of this file published in 2016 is about 7.9 GB, which uncompresses to a file of about 39 GB (approximately five times the size of the compressed version). All the other Stack Exchange websites have a much smaller data dump, and...