accomplished to maintain production schedules and
minimize cost and lost computer time.
By monitoring the console (CRT or console
printer), you can determine whether a job aborted
because of invalid data or during processing. On some
systems, the operating system software will display on
the console the reason for the job's cancellation or the
point at which the abort of the program took place. If the
job aborted during the input phase, you may conclude
that bad input data was at fault. If the input data was
accepted and processing had begun, you may conclude that
a program malfunction was encountered (barring any
hardware problems) and caused the job to be
automatically flushed (canceled) from the system.
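The rule of thumb above can be written as a small decision function.
This is only an illustrative sketch in Python; the Phase names and the
diagnose_abort function are hypothetical and do not correspond to any
particular operating system.

```python
from enum import Enum


class Phase(Enum):
    """Hypothetical job phases, named here for illustration only."""
    INPUT = "input"
    PROCESSING = "processing"


def diagnose_abort(phase: Phase, hardware_fault: bool = False) -> str:
    """Return the likely cause of an abort, following the rule of thumb in the text."""
    if hardware_fault:
        # The text's conclusions hold only when hardware problems are ruled out.
        return "hardware problem"
    if phase is Phase.INPUT:
        # The job aborted before the input data was accepted.
        return "bad input data"
    # Input was accepted and processing had begun.
    return "program malfunction"


print(diagnose_abort(Phase.INPUT))       # bad input data
print(diagnose_abort(Phase.PROCESSING))  # program malfunction
```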
Regardless of why the job aborted, ultimately, you
are responsible for initiating recovery from the job
cancellation, using one of a number of methods. In
many cases, the operator's manual or run manual will
provide you with the proper procedures necessary to
recover or restart a job. One method is to rerun the
entire job. However, this could be very costly and
time-consuming, especially if the master file(s) must be
restored to their original state. You might have to
recreate files from backup files and rerun programs that
added, changed, and deleted records. This is especially
true when working with disk files.
When the operating system supports checkpoint
restart routines, a job can be restarted near the point
where the problem occurred without having to rerun the
entire job (or system). The logical point to take a
checkpoint is at the end of reading or writing a tape file,
after a predetermined number of records (say, 15,000)
have been processed, or after a set amount of processing
time (say, 30 minutes) has elapsed. The
programmer determines the points in the program at
which the checkpoints are to occur. This way, if the
program cancels (aborts), it can be started again at the
last checkpoint.
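The idea of checkpointing after a fixed number of records can be
sketched in a few lines of Python. This is a minimal illustration, not
the routine of any actual operating system: the function names, the
JSON checkpoint file, and the fail_at parameter (which simulates an
abort) are all assumptions made for the example. Only the 15,000-record
interval comes from the text.

```python
import json
import os
import tempfile

CHECKPOINT_INTERVAL = 15_000  # records between checkpoints (figure from the text)


def save_checkpoint(path, next_record, total):
    """Write the checkpoint via a temporary file so it is never half-written."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_record": next_record, "total": total}, f)
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_record": 0, "total": 0}


def run_job(records, path, interval=CHECKPOINT_INTERVAL, fail_at=None):
    """Sum the records, taking a checkpoint every `interval` records.

    `fail_at` simulates an abort at that record index (demonstration only).
    """
    state = load_checkpoint(path)
    total = state["total"]
    for i in range(state["next_record"], len(records)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError(f"job aborted at record {i}")
        total += records[i]
        if (i + 1) % interval == 0:
            save_checkpoint(path, i + 1, total)
    if os.path.exists(path):
        os.remove(path)  # job finished; the checkpoint is no longer needed
    return total


# Demonstration with a small interval: abort at record 6, then restart.
records = list(range(10))
ckpt = os.path.join(tempfile.mkdtemp(), "job.ckpt")
try:
    run_job(records, ckpt, interval=4, fail_at=6)
except RuntimeError as err:
    print(err)                           # job aborted at record 6
print(run_job(records, ckpt, interval=4))  # 45 (resumed at record 4, not record 0)
```

On the restart, records 4 and 5 are processed a second time because
they came after the last checkpoint; records 0 through 3 are not.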
Even if the system provides for an automatic restart
at the last good checkpoint, you still must authorize the
restart. Usually, a message will appear on the console
indicating the job (or task) to be executed and the
checkpoint for restarting the job. It is then up to you to
either restart the job, postpone the restart until the cause
of the problem can be determined, or indicate that the
job is not to be restarted. Under no circumstances
should the termination or cancellation of a job interfere
with the continuous flow of processing within the
system.
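The three choices open to you at the restart prompt can be modeled as
follows. This is a hypothetical sketch; the Decision names, the
respond_to_restart_prompt function, and the job name are invented for
illustration and do not reproduce any real console dialogue.

```python
from enum import Enum


class Decision(Enum):
    """The three operator responses described in the text."""
    RESTART = "restart"
    POSTPONE = "postpone"
    DO_NOT_RESTART = "do not restart"


def respond_to_restart_prompt(job: str, checkpoint: int, decision: Decision) -> str:
    """Return the action taken; nothing restarts without an explicit decision."""
    if decision is Decision.RESTART:
        return f"{job}: restarting at checkpoint {checkpoint}"
    if decision is Decision.POSTPONE:
        # Keep the checkpoint so the job can be restarted once the cause is known.
        return f"{job}: restart postponed; checkpoint {checkpoint} retained"
    return f"{job}: will not be restarted"


print(respond_to_restart_prompt("JOB42", 3, Decision.POSTPONE))
# JOB42: restart postponed; checkpoint 3 retained
```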
CANCELING A JOB
Among the tasks you may be asked to initiate via
the console is cancellation of a job currently running
within the system. The purpose of the cancel operation
is to allow you to halt (stop) the processing of an
application program and remove it from the system. A
program can be canceled either by the supervisor
control program or by you. Should the supervisor
control program determine that an application program
is not executable, it automatically directs the computer
to cancel the program and, thereby, halt its processing.
There are times when you must intervene in normal
processing and flush a job from the system even though
the program being executed may not have an error in it.
For example, you could be instructed to process a
higher-priority job immediately. Because you cannot wait
for the current program (job) to complete, you must
abort it. Don't become confused by the terms cancel,
flush, and abort; they all have the same meaning. You
may also be required to cancel a job because it has
entered a continuous loop, has run well beyond its
allotted time, or is trying to access a restricted file. You
will find that there are many
such reasons for having to cancel a program. At times,
when you cancel a program or a program abnormally
terminates (ABEND), you will be required to dump (print
out) the contents of storage. This is known as a
post-mortem dump. The system prints the
contents of all the storage areas used by the program in
the processing. This post-mortem dump is used as a
debugging aid to help the programmer analyze the
program.
Whenever a job is canceled or abnormally
terminates, it is your responsibility to make an entry in
the error/trouble log, giving the cause of the problem
and as much detail as possible.
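An error/trouble log entry might be structured along these lines. The
text calls only for the cause of the problem and as much detail as
possible; the field names, the layout, and the sample job shown here
are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TroubleLogEntry:
    """Hypothetical log layout; the text specifies only cause and detail."""
    job_name: str
    event: str        # "canceled" or "ABEND"
    cause: str
    details: str
    logged_at: datetime

    def format(self) -> str:
        """Render the entry as a single log line."""
        stamp = self.logged_at.strftime("%Y-%m-%d %H:%M")
        return f"{stamp} | {self.job_name} | {self.event} | {self.cause} | {self.details}"


# Made-up sample entry, for illustration only.
entry = TroubleLogEntry(
    job_name="PAYROLL01",
    event="canceled",
    cause="continuous loop",
    details="ran past allotted time; post-mortem dump taken",
    logged_at=datetime(1990, 1, 15, 14, 30),
)
print(entry.format())
# 1990-01-15 14:30 | PAYROLL01 | canceled | continuous loop | ran past allotted time; post-mortem dump taken
```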
DOCUMENTATION
Documentation, who needs it? In data processing,
we all do, for without it, we would quickly find
ourselves in serious trouble. As a computer operator, if
you want to know how to run a particular procedure,
job, or system, or to learn more about it, the operator's
manual or run manual is a good place to start. It can
provide you with a wealth of information. Examples
include a written
overview of the system and systems flowchart, in-depth
coverage pertaining to I/O requirements, file
specifications (layouts), processing methods, job setup,
error messages that might be generated,