schulz.gms : Termination routine to ensure solvers stay within resource limit

Description

Experimental solvers sometimes fail to terminate within the
prespecified time limit, or never terminate at all. Running batches of
models with such a solver (e.g. for performance testing) then requires
frequent attention to terminate the hanging processes. This small GAMS
program periodically scans the list of processes and checks whether the
CPU time of each watched process exceeds the preset limit (on Windows,
the time elapsed since the process was created is used instead). If the
limit is exceeded, 'schulz' sends a terminate signal to the process.
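
The check described above amounts to converting each process's time into seconds and comparing it with the limit. A minimal Python sketch of that logic follows, assuming input lines of the form "name pid hh:mm:ss". The helper names cpu_seconds and over_limit and the sample process names are illustrative and not part of schulz.gms.

```python
def cpu_seconds(hms):
    """Convert an 'hh:mm:ss' time string to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def over_limit(ps_lines, res=1010):
    """Return {pid: seconds} for processes whose time exceeds res."""
    over = {}
    for line in ps_lines:
        name, pid, hms = line.split()
        secs = cpu_seconds(hms)
        if secs > res:  # strictly greater, as in the GAMS assignment to pover
            over[int(pid)] = secs
    return over


# Hypothetical process list; names and PIDs are made up for illustration.
sample = ["gmscplex 4711 00:20:05", "gmsconopt 4712 00:01:30"]
print(over_limit(sample))  # {4711: 1205}
```

Only the first sample process is reported, because 1205 seconds exceeds the default limit of 1010 seconds.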

gams schulz --watch ^gms --res 1010 --sleep 60

--watch  the process names to watch (a regular expression)

--res    the maximum time, in seconds, that a watched process may use
         before a termination signal is sent

--sleep  the interval, in seconds, between two process checks

The values in the example call of 'schulz' above are the defaults
for these options.
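
Because --watch is a regular expression, the default ^gms selects every process whose name starts with "gms". A rough Python illustration of that filter follows, assuming the pattern is applied to lines that begin with the process name. The function name watched and the sample process names are made up for this sketch.

```python
import re


def watched(ps_lines, watch=r"^gms"):
    """Keep only the lines that match the watch pattern."""
    pattern = re.compile(watch)
    return [line for line in ps_lines if pattern.search(line)]


# Hypothetical process list for illustration.
sample = [
    "gmscplex 4711 00:20:05",
    "bash 100 00:00:01",
    "mygms 101 00:00:02",
]
print(watched(sample))  # ['gmscplex 4711 00:20:05']
```

The anchor ^ keeps "mygms" out of the result. A pattern without the anchor, such as gms, would match it as well.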

Terminate 'schulz' with Ctrl-C.

Keywords: GAMS language features, termination routine


Small Model of Type : GAMS


Category : GAMS Model library


Main file : schulz.gms

$title Termination Routine to ensure Solvers stay within Resource Limit (SCHULZ,SEQ=322)

$onText
Experimental solvers sometimes fail to terminate within the
prespecified time limit, or never terminate at all. Running batches of
models with such a solver (e.g. for performance testing) then requires
frequent attention to terminate the hanging processes. This small GAMS
program periodically scans the list of processes and checks whether the
CPU time of each watched process exceeds the preset limit (on Windows,
the time elapsed since the process was created is used instead). If the
limit is exceeded, 'schulz' sends a terminate signal to the process.

gams schulz --watch ^gms --res 1010 --sleep 60

--watch  the process names to watch (a regular expression)

--res    the maximum time, in seconds, that a watched process may use
         before a termination signal is sent

--sleep  the interval, in seconds, between two process checks

The values in the example call of 'schulz' above are the defaults
for these options.

Terminate 'schulz' with Ctrl-C.

Keywords: GAMS language features, termination routine
$offText

* Defaults for the command line options --sleep, --res and --watch
$if not set sleep $set sleep   60
$if not set res   $set res   1010
$if not set watch $set watch ^gms

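* Scratch files: the process list (p) and the generated helper program (runp)
$set ptxt '"%gams.scrdir%p.%gams.scrext%"'
$set rgms '"%gams.scrdir%runp.%gams.scrext%"'
* Kill command: taskkill on Windows, kill -9 on Unix
$set kill taskkill /F /PID
$if %system.filesys% == UNIX $set kill kill -9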

* Helper GAMS program that lists the running processes matching the watch
* pattern and flags those that exceed the resource limit
$onEcho > %rgms%
$offListing
Set p / 0*99999/, pover(p);
$onEmpty


* Unix: ps reports the CPU time as hh:mm:ss, which awk converts to seconds
$ifThen  %system.filesys% == UNIX
$call ps -eofname,pid,time | grep %watch% | sed s/:/" "/g | awk '{print $2 " " ($3*3600+$4*60+$5)}' > %ptxt%
Parameter pactive(p)
/
$include %ptxt%
/
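$else
* Windows: wmic reports the creation date, so the time elapsed since the
* process was created is used instead of the CPU time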
Parameter pactive(p);
$call wmic process get CreationDate,Name,ProcessId | awk "{print $2,$3,$1}" | grep %watch% | awk "{print $2,$3}" | cut -d. -f1 > %ptxt%
$onEmbeddedCode Python:
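from datetime import datetime

# Each line of the process file holds a process ID and its creation time
with open(r%ptxt%) as f:
    lines = [line.strip() for line in f if line.strip()]
now = datetime.now()
pa = []
for l in lines:
    pid, t = l.split()
    creation_time = datetime.strptime(t, "%Y%m%d%H%M%S")
    # total_seconds() counts full days too; .seconds wraps around after 24 hours
    elapsed = int((now - creation_time).total_seconds())
    print(f'{now} {creation_time} {elapsed}')
    pa.append((pid, elapsed))
gams.set('pactive', pa)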
$offEmbeddedCode pactive
$endIf
$offEmpty
* Flag the processes whose time exceeds the limit and export the results
pover(p) = pactive(p) > %res%;
execute_unload "pover", pactive, pover;
$offEcho

* Process IDs are assumed to lie in the range 0..99999
Set p / 0*99999/, pover(p);

Parameter pactive(p);
scalar x;
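* Main loop: wait, rerun the helper program, report, and terminate offenders
repeat
*  Sleep, then run the helper program, which writes pover.gdx
   execute 'sleep %sleep% && gams %rgms% lo=0';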
   execute_load 'pover', pover, pactive;
   put_utility$card(pactive) 'log' / 'Active processes:';
   loop(p$pactive(p), put_utility 'log' / ' ':6 p.tl:6 ' ' pactive(p):5:0 'secs';);
   put_utility$card(pover) 'log' / 'Terminating processes:';
   loop(p$pover(p), put_utility 'exec' / '%kill% ' p.tl:0;);
   x = errorlevel;
   display x;
until errorlevel;