Bash Cookbook

By: Ron Brash, Ganesh Sanjiv Naik

Overview of this book

In Linux, one of the most commonly used and most powerful tools is the Bash shell. With its collection of engaging recipes, Bash Cookbook takes you through a series of exercises designed to teach you how to use the Bash shell effectively to create and execute your own scripts. The book starts by introducing you to the basics of the Bash shell and the fundamentals of generating output from a command. With the help of a number of exercises, you will get to grips with automating daily tasks for sysadmins and power users. Once you have a hands-on understanding of the subject, you will move on to more advanced projects that solve real-world problems comprehensively on a Linux system. In addition, you will discover projects such as creating an application with a menu, launching scripts on startup, parsing and displaying human-readable information, and executing remote commands with authentication using self-generated Secure Shell (SSH) keys. By the end of this book, you will have gained significant experience in solving real-world problems, from automating routine tasks to managing your systems and creating your own scripts.

Finding and deleting duplicate files or directories


Earlier, we talked about checking whether the strings inside a file were unique and how we could sort them, but we haven't yet performed a similar operation on files themselves. Before diving in, though, let's make an assumption about what constitutes a duplicate file for the purpose of this recipe: a duplicate file is one that may have a different name, but the same contents as another.
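For example (using hypothetical file names), a copy of a file is a duplicate under this definition even though its name differs, and a content hash makes that visible:

    $ echo "same contents" > report.txt
    $ cp report.txt report-backup.txt
    $ md5sum report.txt report-backup.txt   # both lines print the identical hash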

One way to investigate the contents of a file would be to strip all white space and compare only the strings contained within, but a simpler approach is to use a tool such as sha512sum or md5sum to generate a unique hash (think of a unique string full of gibberish) of each file's contents. The general flow, implemented in the sketch after this list, would be as follows:

  1. Using this hash, we can compare it against the list of hashes already computed.
  2. If the hash matches, we have seen the contents of this file before, and so we can delete it.
  3. If the hash is new, we can record the entry and move on to...
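Here is a minimal sketch of that flow in Bash. The script name, the target directory argument, and the choice of md5sum over sha512sum are illustrative assumptions; it expects GNU find and coreutils, and Bash 4 or later for associative arrays.

    #!/usr/bin/env bash
    # find-duplicates.sh - delete files whose contents duplicate a file seen earlier.
    # Usage: ./find-duplicates.sh /path/to/dir

    TARGET_DIR="${1:-.}"

    declare -A seen_hashes   # maps content hash -> first file seen with that hash

    while IFS= read -r -d '' file; do
        # Compute the hash of the file's contents; the first field is the hash itself
        hash=$(md5sum "$file" | awk '{ print $1 }')

        if [[ -n "${seen_hashes[$hash]}" ]]; then
            # Hash matches one we have recorded: the contents are a duplicate
            echo "Duplicate: $file (same contents as ${seen_hashes[$hash]})"
            rm -- "$file"   # comment this line out to do a dry run first
        else
            # New hash: record the entry and move on to the next file
            seen_hashes[$hash]="$file"
        fi
    done < <(find "$TARGET_DIR" -type f -print0)

Note that find -print0 and read -d '' are paired so that filenames containing spaces or newlines are handled safely, and the associative array gives a quick lookup for each hash already seen. Before trusting the script with real data, you may want to disable the rm line and review the printed duplicates first.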