BMC Control-M 7: A Journey from Traditional Batch Scheduling to Workload Automation

By: Qiang Ding

Overview of this book

Control-M is one of the most widely used enterprise-class batch workload automation platforms. With a strong knowledge of Control-M, you will be able to use the tool to meet ever-growing batch needs. There has been no book that can guide you to implement and manage this powerful tool successfully... until now. With this book you will quickly master Control-M and be able to call yourself a "Control-M specialist"! "BMC Control-M 7: A Journey from Traditional Batch Scheduling to Workload Automation" will lead you into the world of Control-M and guide you to implement and maintain a Control-M environment successfully. By mastering this workload automation tool, you will see new opportunities opening up before you. With this book you will be able to take away and put into practice knowledge from every aspect of Control-M: implementation, administration, design and management of Control-M job flows, and, more importantly, how to move into workload automation and let batch processing utilize the cloud. You will start off with batch processing and workload automation, and then get an understanding of how Control-M meets these needs. Then we will look in more depth at the technical details of Control-M, and finally look at how to work with it to meet critical business needs. Throughout the book, you will learn important concepts and features, and also learn from the author's experience, accumulated over many years. By the end of the book you will be set up to work efficiently with this tool and will understand how to utilize the latest features of Control-M.
Table of Contents (16 chapters)
BMC Control-M 7: A Journey from Traditional Batch Scheduling to Workload Automation
Credits
About the Author
Acknowledgement
About the Reviewers
www.PacktPub.com
Preface
Index

Automating batch processing


Modern batch processing is far more complicated than simply feeding punched cards in sequence into the mainframe, as it was in the old days. Because batch processing is time-consuming and task-intensive, many more factors need to be taken into consideration when running it. Batch scheduling tools were born to automate such processing tasks, thus reducing the possibility of human error and easing security concerns.

Home-grown toolsets were developed on mainframe computers for automating JCL scripts. Modern distributed computer systems also come with some general ability to automate batch processing tasks. On a Unix or Linux computer, cron is a utility provided as part of the operating system for automating the triggering of executables. The equivalent tool on Windows is called Task Scheduler. With these tools, the user can define programs or scripts to run at a certain time or at various time intervals. These tools are mainly used for basic scheduling needs, such as running backups or system maintenance at a given time.

These tools cannot execute tasks according to prerequisites other than time. Because of their limited features and unfriendly user interfaces, users normally find it challenging to use these tools for complex scheduling scenarios, such as when there is a predefined execution sequence for a group of related program tasks.

Over the years, major software vendors developed dedicated commercial batch scheduling tools such as BMC Control-M to meet the growing needs in batch processing. These tools are designed to automate complicated batch processing requirements by offering the ability to trigger task executions according to the logical dependencies between them.

Basic elements of a job

As with cron, users are first required to define each processing task in the batch scheduling tool together with its triggering conditions. Such definitions are commonly known as "Job Definitions", which are stored and managed by the scheduling tool. The three essential elements within each job definition are:

  • What to trigger — the executable program's physical location on the file system

  • When to trigger — the job's scheduling criteria

  • Dependencies — the job's predecessors and dependents
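
As a rough illustration, the three elements above could be captured in a small record like the one below. This is only a sketch: the class name, the field names, the schedule string, and the script path are hypothetical and do not reflect how Control-M or any other scheduler actually stores its job definitions.

```python
from dataclasses import dataclass, field


@dataclass
class JobDefinition:
    """Illustrative job definition; every field name here is hypothetical."""
    name: str
    command: str    # what to trigger: location of the executable
    schedule: str   # when to trigger: the job's scheduling criteria
    predecessors: list[str] = field(default_factory=list)  # jobs that must finish first
    dependents: list[str] = field(default_factory=list)    # jobs waiting on this one


# A made-up job that runs after "extract_sales" and before "report_sales".
load_sales = JobDefinition(
    name="load_sales",
    command="/opt/batch/load_sales.sh",
    schedule="daily 02:00",
    predecessors=["extract_sales"],
    dependents=["report_sales"],
)
```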

What to trigger

The batch scheduling tool needs to know which object is to be triggered. It can be JCL on the mainframe, a Unix shell script, a Perl program, or a Windows executable file. A job can also be a database query, a stored procedure that performs a data lookup or update, or even a file transfer task. There are also application-specific execution objects, such as SAP or PeopleSoft tasks.

When to trigger (Job's scheduling criteria)

Each job has its own scheduling criteria, which tell the batch scheduling tool when the job should be submitted for execution. Job scheduling criteria contain the job's execution date and time. A job can be a daily, monthly, or quarterly job, or set to run on a particular date (for example, at the end of each month when it is not a weekend or public holiday). The job can also be set to run at set intervals (running cyclic). In such cases, the job definition needs to indicate how often the job should run and, optionally, the start time for its first occurrence and the end time for its last occurrence (for example, between 3pm and 9pm, run the job every five minutes). Most job schedulers also allow users to specify the job's priority, how to handle the job's output, and what action to take if the job fails.
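
To make the cyclic example concrete, the short sketch below lists the run times for a job that runs every five minutes between 3pm and 9pm. The function name and the date are made up for illustration, and treating the 9pm end time as inclusive is an assumption; a real scheduler may handle the boundary differently.

```python
from datetime import datetime, timedelta


def cyclic_run_times(start, end, interval):
    """Return every run time from start to end (end inclusive), one interval apart."""
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += interval
    return times


runs = cyclic_run_times(
    datetime(2024, 1, 15, 15, 0),   # first occurrence: 3pm
    datetime(2024, 1, 15, 21, 0),   # last occurrence: 9pm
    timedelta(minutes=5),           # how often the job runs
)
print(len(runs), runs[0].time(), runs[-1].time())  # 73 15:00:00 21:00:00
```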

Dependencies (Job's predecessors and dependents)

Job dependency is the logic between jobs that tells which jobs are inter-related. According to the job dependency information, the batch scheduling tool groups individual but inter-related jobs together into a batch flow. Depending on the business and technical requirements, they can be defined to run one after another or in parallel. The common inter-job relationships are:

  • One to one

  • One to many (with or without if-then, else)

  • Many to one (AND/OR)

A one to one relationship simply means the jobs run one after another, for example, when Job A is completed, then Job B starts.

A one to many relationship means many child jobs depend on the parent job; once the parent job is completed, the child jobs will execute. Sometimes there is a degree of decision making within it, such as if Job A's return code is 1, then run Job B, or if the return code of Job A is greater than 1, then run Job C and Job D.

A many to one relationship refers to one child job that depends on many parent jobs. The dependency on the parent jobs' completion can be an AND relationship, an OR relationship, or a mix of AND and OR. In an AND scenario, Job D will run only if Job A, Job B, and Job C are all completed. In an OR scenario, Job D will run if any of Job A, Job B, or Job C is completed. In a mixed scenario, Job D will run if Job A or Job B and Job C are completed.
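
These relationships can be expressed as simple checks against the set of completed jobs. The sketch below is illustrative only: the function names are made up, and because the mixed scenario can be read more than one way, the grouping "(Job A or Job B) and Job C" used here is an assumption.

```python
def and_ready(parents, completed):
    """Many to one, AND: run only when every parent job has completed."""
    return all(parent in completed for parent in parents)


def or_ready(parents, completed):
    """Many to one, OR: run as soon as any parent job has completed."""
    return any(parent in completed for parent in parents)


def mixed_ready(completed):
    """Mixed: assumed grouping is (Job A or Job B) and Job C."""
    return or_ready(["A", "B"], completed) and and_ready(["C"], completed)


def children_for_return_code(return_code):
    """One to many with if-then-else, following the Job A example."""
    if return_code == 1:
        return ["B"]
    if return_code > 1:
        return ["C", "D"]
    return []  # no rule is given for other return codes


completed = {"A", "C"}
print(and_ready(["A", "B", "C"], completed))  # False: Job B has not completed
print(or_ready(["A", "B", "C"], completed))   # True
print(mixed_ready(completed))                 # True
```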

During normal running, the batch scheduling tool constantly looks at its record of jobs to find out which jobs are eligible to be triggered according to each job's scheduling criteria, and then automatically submits those jobs to the operating system for execution. In most cases, the batch scheduling tool can control the total number of parallel running jobs to keep the machine from being overloaded. The execution sequence among parallel jobs can be based on each job's predefined priority, that is, the higher priority jobs can be triggered before the lower priority ones. After each execution, the job scheduling tool gets immediate feedback (such as an operating system return code) from the job's execution. Based on the feedback, the job scheduling tool decides the next action, such as running the next job according to the predefined inter-job dependency or rerunning the current job if it is cyclic. Batch scheduling tools may also provide a user interface for the system operator to monitor and manage batch jobs, which lets the user manually pause, rerun, or edit a batch job.
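
The selection step described above can be sketched as follows: from the eligible jobs, pick the highest priority ones that fit within the limit on parallel running jobs. The function name is hypothetical, and the convention that a larger number means a higher priority is an assumption made for this example.

```python
def select_jobs_to_submit(eligible, running_count, max_parallel):
    """Pick the jobs to submit now.

    eligible is a list of (job_name, priority) tuples; a larger priority
    number is assumed to mean "more urgent".
    """
    free_slots = max(0, max_parallel - running_count)
    ranked = sorted(eligible, key=lambda job: job[1], reverse=True)
    return [name for name, _priority in ranked[:free_slots]]


eligible_jobs = [("backup", 1), ("billing", 9), ("report", 5)]
# One job is already running and at most three may run in parallel,
# so only the two highest priority jobs are submitted.
print(select_jobs_to_submit(eligible_jobs, running_count=1, max_parallel=3))
# ['billing', 'report']
```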

More advanced features of scheduling tools

Driven by business and user demand, modern scheduling tools provide more sophisticated features beyond automating the execution of batch jobs, such as:

  • The ability to generate an alert message for error events

  • The ability to handle external event-driven batch

  • Intelligent scheduling — decision making based on predefined conditions

  • Security features

  • Additional reporting, auditing, and history tracking features

Ability to generate notifications for specified events

Because notifications can be generated on specified events, operators are freed from continuous 24x7 monitoring and only need to monitor jobs by exception. Users can set up rules so that a notification is sent out when a defined event occurs, such as when a job fails, starts late, or runs longer than expected. Depending on the ability of the scheduling tool, the destination for the notification can be an alert console, an e-mail inbox, or an SMS to a mobile phone. Some scheduling tools can also integrate with third-party IT Service Management (ITSM) tools to automatically generate an IT incident ticket. Job-related information can be included in the alert message, for example, the name of the job, the location of the job, the reason for failure, and the time of the failure.
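
A minimal sketch of such notification rules is shown below. It checks one job record against the three events mentioned above: failure, late start, and running longer than expected. The dictionary keys, the host name, and the idea that a non-zero return code means failure are all assumptions for illustration; sending the message to a console, e-mail, or SMS is left out.

```python
from datetime import datetime, timedelta


def alerts_for_job(job, now):
    """Return alert messages for one job record (a dict with hypothetical keys)."""
    label = f"{job['name']} on {job['location']}"
    alerts = []
    # Rule 1: the job failed (a non-zero return code is assumed to mean failure).
    if job.get("return_code") not in (None, 0):
        alerts.append(f"{label} failed with return code {job['return_code']} at {now}")
    started = job.get("started_at")
    finished = job.get("finished_at")
    # Rule 2: the job has not started although its scheduled time has passed.
    if started is None and now > job["scheduled_start"]:
        alerts.append(f"{label} is late")
    # Rule 3: the job is still running past its expected duration.
    if started is not None and finished is None:
        if now - started > job["expected_duration"]:
            alerts.append(f"{label} is running longer than expected")
    return alerts


late_job = {
    "name": "load_sales",
    "location": "host01",
    "scheduled_start": datetime(2024, 1, 15, 2, 0),
    "started_at": None,
    "expected_duration": timedelta(minutes=30),
}
print(alerts_for_job(late_job, datetime(2024, 1, 15, 2, 10)))
# ['load_sales on host01 is late']
```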

Ability to handle an external event-driven batch

Event-driven batch jobs are defined to run only when the expected external event has occurred. In order to have this capability, special interfaces are developed within a batch scheduling tool to detect such an event. Detecting a file's creation or arrival is a typical interface for event-triggered batch. Some batch schedulers also provide their own application programming interface (API), or can act as a web service or use a message queue to accept an external request. The user needs to specify in advance which event triggers which job or action within the batch scheduling tool, and during what time frame. During the defined time frame, the scheduler will listen for the event and trigger the corresponding batch job or action accordingly. External events can be unpredictable and may happen at any time. Batch scheduling tools therefore also need the ability to limit the number of concurrently running event-triggered batch jobs to prevent the machine from overloading during peak periods.
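
The decision a scheduler makes for a file-arrival event can be sketched as a single check that combines the three conditions above: the event has occurred, the current time is inside the defined time frame, and the concurrency limit has not been reached. The function and parameter names are hypothetical, and a real tool would typically watch for the file continuously rather than test it once.

```python
import os
from datetime import time


def should_trigger(watch_path, now, window_start, window_end,
                   running_jobs, max_concurrent):
    """Decide whether a file-arrival event may trigger its job right now."""
    in_window = window_start <= now <= window_end   # inside the defined time frame
    file_arrived = os.path.exists(watch_path)       # the external event occurred
    has_capacity = running_jobs < max_concurrent    # concurrency limit not reached
    return in_window and file_arrived and has_capacity


# The current directory stands in for an "arrived" file in this example.
print(should_trigger(os.getcwd(), time(10, 0), time(8, 0), time(18, 0),
                     running_jobs=2, max_concurrent=5))  # True
```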

Intelligent scheduling — decision-making based on predefined conditions

Besides generating notifications for events, most advanced batch scheduling tools can also perform intelligent scheduling by automatically deciding which action to perform next, based on a given condition. With this feature, many of the repetitive manual actions for handling events can be automated, such as:

  • Automatically rerun a failed job

  • Trigger job B when job A's output contains the message "Processing completed", or trigger job C when job A's output contains the message "Processing ended with warning"

  • Skip job C if job B is not finished by 5pm

This feature removes the human response time for handling such tasks and minimizes possible human mistakes. It contributes significantly to shortening the overall batch processing time. However, this is not a one-size-fits-all approach, as some events still need a human decision. This approach can free the user from repetitive tasks, but it can also increase maintenance overhead: each time the output of a program changes, the condition for the automatic reaction most likely needs to be changed accordingly.
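
The first two rules in the list above can be sketched as one decision function. The function name, the tuple it returns, the rerun limit, and the assumption that a non-zero return code means failure are all illustrative. Note how the rules depend on the exact wording of the output messages, which is the maintenance overhead just mentioned.

```python
def next_action(return_code, output, reruns_so_far, max_reruns=1):
    """Decide what to do after job A finishes; returns an (action, target) tuple."""
    if return_code != 0:  # assumed: non-zero means the job failed
        if reruns_so_far < max_reruns:
            return ("rerun", "job A")   # automatically rerun a failed job
        return ("alert", "job A")       # give up and leave it to a human
    if "Processing completed" in output:
        return ("trigger", "job B")
    if "Processing ended with warning" in output:
        return ("trigger", "job C")
    return ("none", None)               # no rule matched


print(next_action(0, "12:00 Processing completed", reruns_so_far=0))
# ('trigger', 'job B')
print(next_action(8, "12:00 Abnormal end", reruns_so_far=0))
# ('rerun', 'job A')
```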

Security features

For information security reasons, files that reside on the computer system are normally protected with permissions. A script or executable file needs to run under the corresponding user or group in order to read files as its inputs and write to a file as its output. In the case of a database or FTP job, the login information needs to be recorded in the script for authentication during runtime. The people who manage the batch processing require full access to the user accounts to trigger the script or executable, which means they will also have access to data that they are not allowed to see. There is also a risk that people with user access rights may modify and run the executables without authorization. Batch scheduling tools eliminate this concern by providing additional security features from the batch processing perspective, that is, they provide user authentication for accessing the batch scheduling console, and group users into different levels of privileges according to their job role. For example, users in the application development group are allowed to define and modify jobs, users in the operation group are allowed to trigger or rerun jobs, and some third-party users may only have the rights to monitor a certain group of jobs.
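
The grouping of users into privilege levels can be sketched as a simple lookup from role to permitted actions, mirroring the example in the paragraph above. The role and action names are made up for illustration, and restricting third-party users to a particular group of jobs is not modeled here.

```python
# Hypothetical roles and actions, taken from the example in the text.
ROLE_PERMISSIONS = {
    "development": {"define", "modify"},
    "operations": {"trigger", "rerun"},
    "third_party": {"monitor"},
}


def is_allowed(role, action):
    """Return True if users in this role may perform the action."""
    return action in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("operations", "rerun"))    # True
print(is_allowed("third_party", "modify"))  # False
```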

Additional reporting, auditing, and history tracking features

Besides providing great scheduling capability and user-friendly features, the batch scheduling tool also makes it possible to track historical job executions and audit user actions. This is because all jobs that run on the same machines are managed from a central location. By using the reports, rather than getting logs from each job's output directory and searching for relevant system logs, the user can directly track problematic jobs, know when a job failed and who triggered what job at what time, or create a series of reports to review the job execution history trend. Apart from being handy for troubleshooting and optimizing batch runs, the information can also help the organization meet IT-related regulatory compliance requirements.
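
To give a feel for such reports, the sketch below answers two of the questions above from a central history log: which jobs failed most often, and who triggered what job at what time. The log records, field names, and the "scheduler" user that stands for automatic submissions are all made up for illustration.

```python
from collections import Counter

# Made-up history records, as a central scheduler might keep them.
audit_log = [
    {"time": "2024-01-15 02:00", "user": "scheduler", "action": "trigger",
     "job": "load_sales", "status": "failed"},
    {"time": "2024-01-15 02:15", "user": "alice", "action": "rerun",
     "job": "load_sales", "status": "failed"},
    {"time": "2024-01-15 02:40", "user": "alice", "action": "rerun",
     "job": "load_sales", "status": "ok"},
    {"time": "2024-01-15 03:00", "user": "scheduler", "action": "trigger",
     "job": "report_sales", "status": "ok"},
]


def failures_per_job(log):
    """Count failed executions per job to spot problematic jobs."""
    return Counter(record["job"] for record in log if record["status"] == "failed")


def manual_actions(log):
    """List who did what to which job and when, excluding automatic runs."""
    return [(r["user"], r["action"], r["job"], r["time"])
            for r in log if r["user"] != "scheduler"]


print(failures_per_job(audit_log).most_common(1))  # [('load_sales', 2)]
print(len(manual_actions(audit_log)))              # 2
```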