Linux Shell Scripting Cookbook, Second Edition

Overview of this book

The shell remains one of the most powerful tools on a computer system, yet a large number of users are unaware of how much one can accomplish with it. Using a combination of simple commands, we will see how to solve complex problems in day-to-day computer usage. Linux Shell Scripting Cookbook, Second Edition will take you through useful real-world recipes designed to make your daily life easy when working with the shell. The book covers the basics of using the shell and general commands, then shows the reader how to combine them to perform complex tasks with ease. Starting with the basics of the shell, we will learn simple commands and their usage, allowing us to perform operations on different kinds of files. The book then proceeds to explain text processing and web interaction, and concludes with backups, monitoring, and other sysadmin tasks. Linux Shell Scripting Cookbook, Second Edition serves as an excellent guide to solving day-to-day problems by using the shell and a few powerful commands together to create solutions.

Parsing e-mail addresses and URLs from text


Parsing required text out of a given file is a common task in text processing. Items such as e-mail addresses and URLs can be extracted with the help of the right regular expressions. Typically, we need to parse e-mail addresses from the contact list of an e-mail client, which contains many unwanted characters and words, or from an HTML web page.

How to do it...

The regular expression pattern to match an e-mail address is as follows:

[A-Za-z0-9._]+@[A-Za-z0-9.]+\.[a-zA-Z]{2,4}

For example:

$ cat url_email.txt 
this is a line of text contains,<email> #[email protected]. </email> and email address, blog "http://www.google.com", [email protected] dfdfdfdddfdf;[email protected]<br />
<a href="http://code.google.com"><h1>Heading</h1>
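To see how the pattern behaves on individual strings, it can help to split it into named parts and try it on a few lines. The following is only a minimal sketch: the variable names, the extract_emails helper, and the example.com addresses are hypothetical and do not come from the recipe. It uses grep -E, which accepts the same extended regular expression syntax as egrep.

```shell
#!/usr/bin/env bash
# Hypothetical breakdown of the e-mail pattern from the text into named parts.
local_part='[A-Za-z0-9._]+'   # characters allowed before the @
domain='[A-Za-z0-9.]+'        # host name, dots included
tld='[a-zA-Z]{2,4}'           # final label: 2 to 4 letters
email_re="${local_part}@${domain}\.${tld}"

# extract_emails (hypothetical helper): read text on stdin and
# print every match of the pattern on its own line.
extract_emails() {
    grep -E -o "$email_re"
}

# Made-up sample input; example.com and example.org are reserved for documentation.
printf '%s\n' \
    'contact <alice@example.com>, or bob.smith@mail.example.org;' \
    'not an address: foo@bar' \
    | extract_emails
```

With this input, the two complete addresses are printed and foo@bar is skipped, because the pattern requires a dot followed by two to four letters after the host name. The same bound also means that an address with a longer final label would be cut off after four letters, so the pattern is a practical approximation rather than a full e-mail validator.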

As the pattern uses extended regular expression syntax (+, for instance), we should use egrep, which is equivalent to grep -E.

$ egrep -o '[A-Za-z0-9._]+@[A-Za-z0-9.]+\.[a-zA-Z]{2,4}'  url_email.txt
[email protected] 
test@yahoo...