In this second example of fuzzy hashing, we are going to implement a similar script using the ssdeep (version 3.1.1) Python library. This allows us to leverage the ssdeep tool and the spamsum algorithm, both of which are widely used and accepted in the fields of digital forensics and information security. This code will be the preferred method for fuzzy hashing in most scenarios, as it uses resources more efficiently and produces more accurate results. The tool is widely supported in the community, and many ssdeep signatures are available online. For example, the website http://VirusTotal.com lists ssdeep hashes under the additional information for submitted files. This public information can be used to check whether executable files on a host machine match, or are similar to, known malicious files, without the need to download the malicious files themselves.
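As a minimal sketch of the signature-matching idea described above, the following compares one ssdeep signature against a collection of known signatures and keeps those scoring at or above a cutoff. The function name `find_similar`, the `known_sigs` dictionary layout, and the default threshold of 50 are illustrative assumptions, not part of the ssdeep library; only `ssdeep.compare` is a library call.

```python
try:
    import ssdeep  # third-party binding, assumed to be installed
except ImportError:
    ssdeep = None


def find_similar(candidate_sig, known_sigs, threshold=50, compare=None):
    """Return (label, score) pairs for known signatures that score
    at or above the threshold, highest score first.

    known_sigs is a hypothetical mapping of label -> ssdeep signature,
    for example signatures collected from a public source.
    """
    if compare is None:
        if ssdeep is None:
            raise RuntimeError("the ssdeep library is not installed")
        # ssdeep.compare returns an integer similarity score from 0 to 100
        compare = ssdeep.compare

    matches = []
    for label, sig in known_sigs.items():
        score = compare(candidate_sig, sig)
        if score >= threshold:
            matches.append((label, score))

    # Sort by score so the closest matches are listed first
    return sorted(matches, key=lambda match: match[1], reverse=True)
```

With the library installed, the signature of a local file could be generated with `ssdeep.hash_from_file()` and passed in as `candidate_sig`. The optional `compare` argument exists only so the comparison function can be swapped out, for instance when testing without the library present.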
One weakness of ssdeep is that it does not provide information beyond the matching percentage...