Mastering Python Scripting for System Administrators

By: Ganesh Sanjiv Naik

Overview of this book

Python has evolved over time and extended its features to cover almost every IT operation. It is simple to learn, yet its powerful libraries let you build scripts that solve real-world problems and automate administrators' routine activities. This book walks through a series of projects, each of which teaches an aspect of Python scripting. It begins with Python installation and a quick revision of basic to advanced programming fundamentals, then focuses on the development process as a whole, from setup to planning to building different tools. It covers IT administrators' routine activities (text processing, regular expressions, file archiving, and encryption), network administration (socket programming, email handling, remote control of devices using Telnet/SSH, and protocols such as SNMP/DHCP), building graphical user interfaces, working with websites (Apache log file processing, SOAP and REST API communication, and web scraping), and database administration (MySQL and similar database administration, data analytics, and reporting). By the end of this book, you will be able to use the latest features of Python and build powerful tools that solve challenging, real-world tasks.
Table of Contents (21 chapters)

Metadata: data about data

In this section, we are going to learn about the pyPdf module, which helps in extracting the metadata from a PDF file. But first, what is metadata? Metadata is data about data: structured information that describes and summarizes the primary data. It contains basic information about your actual data and helps in finding a particular instance of it.
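To make the idea of "data about data" concrete, here is a minimal sketch that uses only the standard library to describe a file without reading its contents. The helper name describe_file is hypothetical and is not part of the book's script; it only illustrates that the size and modification time are metadata, while the bytes inside the file are the primary data.

```python
import os
from datetime import datetime


def describe_file(path):
    """Return metadata about a file without reading its contents.

    Hypothetical helper for illustration only.
    """
    stats = os.stat(path)
    return {
        # The file name, size, and modification time describe the file;
        # none of them is part of the data stored inside it.
        "name": os.path.basename(path),
        "size_bytes": stats.st_size,
        "modified": datetime.fromtimestamp(stats.st_mtime).isoformat(),
    }
```

Calling describe_file on any existing path returns a small dictionary that summarizes the file. PDF metadata works the same way, except that the descriptive fields are stored inside the document itself.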

Make sure the PDF file from which you want to extract the information is present in your directory.

First, we have to install the pyPdf module, as follows:

pip install pyPdf

Now, we will write a metadata_example.py script and see how to get the metadata information from the PDF file. We are going to write this script in Python 2:

import pyPdf

def main():
    file_name = '/home/student/sample_pdf.pdf'
    pdfFile = pyPdf.PdfFileReader(file...
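
For readers working in Python 3, the same idea can be sketched as follows. This is not the book's script: it assumes the separately installed pypdf package (the maintained successor to pyPdf) in place of the pyPdf module used above, and the helper names read_pdf_metadata and format_metadata are hypothetical.

```python
def format_metadata(info):
    """Turn a PDF document-information mapping into printable lines."""
    lines = []
    for key in sorted(info):
        # Keys in the PDF information dictionary start with '/',
        # for example '/Author' or '/Title'.
        lines.append("%s: %s" % (key.lstrip("/"), info[key]))
    return lines


def read_pdf_metadata(path):
    """Return (metadata dict, page count) for the PDF at `path`.

    Assumption: relies on pypdf rather than the pyPdf module from the text.
    """
    # Imported here so format_metadata can be used without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    # reader.metadata is None when the PDF has no information dictionary.
    info = reader.metadata or {}
    return dict(info), len(reader.pages)
```

With pypdf installed, read_pdf_metadata('/home/student/sample_pdf.pdf') would return the document information and the page count, and format_metadata turns that information into lines suitable for printing. Which fields appear depends on what the tool that produced the PDF wrote into it.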