Finding and Fixing Execution Errors and Performance Problems

This tutorial is structured in four main parts: we discuss how to find and resolve errors that are detected during the execution of a GAMS model, we give some guidance for model development and debugging, and we present techniques to increase efficiency, first by reducing GAMS execution time and then by reducing memory use.

Resolving Execution Errors

Recall that GAMS passes through a program file several times in the process of generating and solving a model. Errors may occur in each phase. In this section we will give some guidance on how to resolve errors that occur during execution, that is, after compilation. For advice on resolving compilation errors, see the tutorial Fixing Compilation Errors. For more information on the process of generating and solving a model in GAMS, see the introduction to chapter GAMS Output.

At execution, several things could go wrong and cause an error. We will look at these potential error sources separately in this section. First we look at arithmetic errors and exceeded internal limits during data manipulation, then we continue with problems during model generation and model solution. At the end, we will briefly discuss how execution errors may be managed with the function execError.

Arithmetic Errors

GAMS execution errors may be caused by illegal arithmetic operations like a negative argument for log, division by zero and exponentiation where the base is a negative number. The following simple example serves as illustration:

Set       s     / s1*s5 / ;
Parameter p(s)  "data to be exponentiated"
          d(s)  "divisors"
          r(s)  "result";

p(s)    = 1;
p("s2") = -1;
d(s)    = 1;
d("s3") = 0;
r(s)    = p(s)**2.1 / d(s);
display r;

The first sign that something in the execution went wrong is the following flag in the log output:

*** Status: Execution error(s)

The resulting execution output will contain the following lines:

E x e c u t i o n

**** Exec Error at line 10: rPower: FUNC DOMAIN: x**y, x < 0
**** Exec Error at line 10: division by zero (0)

----     11 PARAMETER r  result

s1 1.000,    s2  UNDF,    s3  UNDF,    s4 1.000,    s5 1.000

Observe that the execution output begins with two error messages that can be easily found since they are marked with four asterisks ****. The error messages are very informative: they indicate the line where the errors occurred and provide details about the nature of the errors. Further, the output generated by the display statement shows that the errors occurred when the values for r("s2") and r("s3") were computed. Inspecting the assignment statement for these two values, we realize that in the first instance the base for the exponentiation is -1, which obviously is a negative number and hence is not allowed in this operation. In the second instance, the problem is that we divide by d("s3") which equals zero.

In this example, the errors are easily resolved with data revisions. In general, we recommend using conditional assignments to prevent errors like these.
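
As a minimal sketch of such a conditional assignment, the example above could be guarded as follows. The particular conditions (a non-negative base and a nonzero divisor) are our assumption about what counts as valid data here; entries that fail them are simply not assigned and keep their default value of zero.

```gams
Set       s     / s1*s5 / ;
Parameter p(s)  "data to be exponentiated"
          d(s)  "divisors"
          r(s)  "result";

p(s)    = 1;
p("s2") = -1;
d(s)    = 1;
d("s3") = 0;

* Assign r only where the base is non-negative and the divisor is nonzero
r(s)$(p(s) >= 0 and d(s) <> 0) = p(s)**2.1 / d(s);
display r;
```

With this guard, the assignment should no longer raise execution errors, and r("s2") and r("s3") stay at zero instead of becoming UNDF. Whether zero is a sensible fallback depends on how r is used later in the program.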

Note that in the example above the error messages indicated exactly where the problem was and it was easy to find the cause of the error. However, this is not always the case. In particular, if the problem is within a multi-dimensional item the user will need more patience. Usually it helps to display the results of the problematic operation and look for faulty entries. In addition, displaying the input data to the respective operation will help to investigate the numerical properties of the data that was entered in the computation. Often more displays will be needed to trace faulty input data through the program. Eventually this will lead the user to understand why the data has taken on the specific numerical values it has.

Exceeding GAMS Limits

By default, GAMS stops the solve of a model after 1e10 seconds (wall clock time) or 2e9 iterations. These limits may be adjusted with the options reslim and iterlim respectively. Note that both options are also available as command line parameters and model attributes. In addition, the workspace may be limited with the command line parameters WorkFactor and WorkSpace. Note that these options are also available as model attributes. If any of these limits are exceeded, the execution of the solve statement will be interrupted.

For example, we could add the following option statement somewhere before the solve statement in the production model [CHENERY]:

option iterlim = 20;

Note that this statement reduces the iteration limit to just 20. The log output will contain the following lines:

 ** Feasible solution. Value of objective =    1033.34069261
 
 ** The iteration limit has been reached.
 
--- Restarting execution
--- chenery.gms(228) 2 Mb
--- Reading solution for model chenrad
*** Status: Normal completion
--- Job chenery.gms Stop 11/21/16 16:52:43 elapsed 0:00:00.106

Also the solve summary in the listing file notes the interrupt:

               S O L V E      S U M M A R Y

     MODEL   chenrad             OBJECTIVE  td
     TYPE    NLP                 DIRECTION  MAXIMIZE
     SOLVER  CONOPT              FROM LINE  228

**** SOLVER STATUS     2 Iteration Interrupt
**** MODEL STATUS      7 Feasible Solution
**** OBJECTIVE VALUE             1058.9199

 RESOURCE USAGE, LIMIT          0.078      1000.000
 ITERATION COUNT, LIMIT        20            20
 EVALUATION ERRORS              0             0

Observe that the solver status Iteration Interrupt indicates that the execution terminated because the iteration limit was reached. The result is a feasible solution, but not the optimal solution. The line ITERATION COUNT, LIMIT ... reports that 20 iterations were performed and that 20 was also the limit for the number of iterations. Setting iterlim to a larger value will resolve this issue.
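
The following sketch shows where such limits may be set. The model tiny and the specific limit values are purely illustrative assumptions; the point is only the placement of the option statements and of the model attribute, which takes precedence over the global option for that particular model.

```gams
Positive Variable x  "illustrative activity level";
Variable          z  "illustrative objective";

Equations obj  "objective definition"
          cap  "capacity limit";

obj..  z =e= 2*x;
cap..  x =l= 10;

Model tiny / all /;

* Global limits that apply to all subsequent solve statements
option iterlim = 100000;
option reslim  = 600;

* Model-specific limit that overrides the global iterlim for this model
tiny.iterlim = 5000;

solve tiny using lp maximizing z;
```

The same limits could also be passed on the command line, for example with iterlim=100000 reslim=600 after the name of the GAMS file.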

Similarly, allocating too little work space will cause the solver to terminate with no solution. For example, restricting the work space for the nonlinear test model [MHW4D] to just 0.1 MB and running it with the solver MINOS will produce the following lines in the log output:

 Work space requested by solver --     0.77 Mb
 Work space requested by user   --     0.10 Mb
 Work space allocated           --     0.10 Mb

  Reading Rows...
  Reading Columns...
  Reading Instructions...


 EXIT - Not enough storage to solve the model.
        Request at least    0.19 Mbytes.

The solve summary in the listing file will contain the following information:

               S O L V E      S U M M A R Y

     MODEL   wright              OBJECTIVE  m
     TYPE    NLP                 DIRECTION  MINIMIZE
     SOLVER  MINOS               FROM LINE  32

**** SOLVER STATUS     9 Setup Failure
**** MODEL STATUS      13 Error No Solution

Note that increasing the work space to at least the minimum amount requested by the solver will resolve this issue.

When dealing with large nonlinear expressions defined over a very large domain, one may encounter the following error:

*** Status: Terminated due to limits in NLCodeAdd
***         Cannot handle more than 2147483647 instruction in NL code
***         Inspect listing file for more information

The first thing to check in this case is the correctness of the model, i.e., whether more is being generated than necessary. If the model is correct and indeed this large, the general advice is to introduce intermediate variables to get a smaller code size per block of equations. One can also consider partitioning the domains to obtain a larger number of smaller blocks.

Resolving Model Generation Errors

Further execution errors may be detected when GAMS is generating the model before passing it to the solver. These errors may be arithmetic errors in the body of equations or errors in the structure of the model that cause the model to be inherently infeasible.

Consider the following simple example with arithmetic errors in the body of the equations. They are similar to the errors in the assignment in the example in section Arithmetic Errors above.

Set       s     / s1*s5 / ;
Parameter p(s)  "data to be exponentiated"
          d(s)  "divisors"
          m(s)  "multipliers";

p(s)    = 1;
p("s2") = -1;
d(s)    = 1;
d("s3") = 0;
m(s)    = 1;
m("s4") = 0;

Positive variable x(s);
Variable          z;

Equations obj  "objective function"
          xlim;

obj..      z =e= sum(s,p(s)**2.2*x(s));
xlim(s)..  m(s) / d(s)*x(s) =e= 1;

Model mymodel / all /;
solve mymodel using lp maximizing z;

If we run this model, the log output will contain the following lines:

*** SOLVE aborted
--- Executing CPLEX: elapsed 0:00:00.006
--- test.gms(23) 4 Mb 3 Errors
*** Status: Execution error(s)
--- Job test.gms Stop 11/21/16 19:10:12 elapsed 0:00:00.006

Observe that the solve was aborted since there are 3 execution errors. The equation listing in the listing file will contain further details about these execution errors:

Equation Listing    SOLVE mymodel Using LP From line 23

**** Exec Error at line 19: rPower: FUNC DOMAIN: x**y, x < 0

---- obj  =E=  objective function

obj..  - x(s1) + UNDF*x(s2) - x(s3) - x(s4) - x(s5) + z =E= UNDF ; (LHS = UNDF)

**** Exec Error at line 20: division by zero (0)
**** Exec Error at line 20: Equation infeasible due to rhs value

**** INFEASIBLE EQUATIONS ...

---- xlim  =E=

xlim(s4)..  0 =E= 1 ; (LHS = 0, INFES = 1 ****)

REMAINING 4 ENTRIES SKIPPED

Note that there is an arithmetic error relating to exponentiation in the first equation and an arithmetic error and an infeasibility in the second equation.

In our example, it was easy to detect the execution errors and their cause. However, an error in a multi-dimensional equation block may be much more difficult to find. Note that by default, only the first three entries in each equation block are shown in the equation listing. We recommend using the option limrow to get a full listing, as this is the easiest way to inspect execution errors in the body of equations.
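
For instance, a statement like the following could be placed before the solve statement of the example above. The value 100 is an arbitrary assumption; any value at least as large as the biggest equation block of interest will do.

```gams
* List up to 100 rows per equation block in the equation listing
option limrow = 100;
```

For very large models a full equation listing can make the listing file huge, so it may be sensible to raise limrow only while debugging.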

Resolving Solve Errors

In the solution phase, an external solver program processes the model and creates output with details about the solution process. Solve errors may be either function evaluation errors or presolve errors.

Resolving Function Evaluation Errors

Some solve statements require the evaluation of nonlinear functions and the computation of derivatives. Since these calculations are not carried out by GAMS but by other subsystems not under the direct control of GAMS, errors associated with these calculations are reported in the solution report.

Function evaluation errors are numerical errors like those discussed in section Arithmetic Errors above. Other examples include square roots of negative variables and squaring a negative term, say x, using the syntax x**2.

Attention
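Squaring a negative term, say x, using the syntax x**2 will cause an error, because the operator ** is evaluated as a real power (the function rPower named in the error messages above), which is not defined for a negative base. However, the alternatives sqr(x) and x*x will work.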

Note that by default the solver subsystems will interrupt the solution process if arithmetic errors are encountered. Users may wish to permit a certain number of arithmetic errors and have warnings reported instead. The option domlim facilitates this modification. Note that the default value for domlim is zero.

The best way to avoid evaluating functions outside their domain of definition is to specify reasonable variable bounds. However, there are cases when bounds are not enough. Consider the following simple example:

Set       i  / i1*i5 /;
Variables x(i), z;
Equations r1, r2(i);

r1..     z =e= log(sum(i, x(i)));
r2(i)..  x(i) =l= 10;
x.lo(i) = 0;
x.l(i)  = 5;

Model takelog / all /;
solve takelog using nlp minimizing z;

If we try to solve this little program with the solver MINOS, the log output will contain the following line:

EXIT - Function evaluation error limit exceeded.

The solution report in the listing file will have more detailed information:

               S O L V E      S U M M A R Y

     MODEL   takelog             OBJECTIVE  z
     TYPE    NLP                 DIRECTION  MINIMIZE
     SOLVER  MINOS               FROM LINE  12

**** SOLVER STATUS     5 Evaluation Interrupt
**** MODEL STATUS      7 Feasible Solution
**** OBJECTIVE VALUE                0.0000

 RESOURCE USAGE, LIMIT          0.183      1000.000
 ITERATION COUNT, LIMIT         0    2000000000
 EVALUATION ERRORS              2             0

...

 EXIT - Function evaluation error limit exceeded.

**** ERRORS/WARNINGS IN EQUATION r1
     2 error(s): log: FUNC DOMAIN: x < 0 (RETURNED 0)

...

**** REPORT SUMMARY :        1     NONOPT ( NOPT)
                             0 INFEASIBLE
                             0  UNBOUNDED
                             1     ERRORS ( ****)

Note that the solver status has a value of 5 (Evaluation Interrupt), which means that the solver has been interrupted because more evaluation errors were encountered than permitted by the option domlim. In our case domlim equals its default value zero, thus one error is enough to cause the interruption. The equation in which the evaluation error occurred and the type of error are reported a few lines later. In our example, the equation r1 is problematic, since we take the logarithm of the expression sum(i, x(i)), an expression which may become zero.

Note that in models such as this each individual variable x(i) should be allowed to become zero, but the sum should not. This may be achieved by introducing an intermediate variable, say xsum, adding a lower bound greater than zero for it and using this variable as the argument for the function log:

Variable xsum;
xsum.lo = 0.0001;

Equations defxsum, r1;

defxsum ..    xsum =e= sum(i, x(i));
r1..          z =e= log(xsum);

For more information on intermediate variables, see section Avoiding Expressions in Nonlinear Functions in the tutorial Good NLP Formulations.
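
Put together, the reformulated model might look like the sketch below. It merges the fragment above with the original takelog example; the initial value assigned to xsum.l is our addition, made on the assumption that a starting point consistent with x.l is helpful.

```gams
Set       i  / i1*i5 /;
Variables x(i), z, xsum;
Equations defxsum, r1, r2(i);

defxsum..  xsum =e= sum(i, x(i));
r1..       z    =e= log(xsum);
r2(i)..    x(i) =l= 10;

* Each x(i) may still become zero, but their sum may not
x.lo(i) = 0;
x.l(i)  = 5;
xsum.lo = 0.0001;
xsum.l  = sum(i, x.l(i));

Model takelog / all /;
solve takelog using nlp minimizing z;
```

The lower bound 0.0001 is the value used in the fragment above; in a real model it should be chosen with the scaling of the data in mind.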

Observe that solvers report the type of arithmetic problem encountered and the problematic equation, but do not identify the particular offending variable or the labels in the index of an equation that cause the error. If the cause is not obvious, users will have to investigate the numerical properties of the variables, labels and parameters in the body of the respective equation. This may involve the following:

  • Displaying the input data items to the nonlinear terms in the respective equation.
  • Searching the solution for equations that are infeasible (INFES) and variables that are nonoptimal (NOPT) in order to see where problems are present and which variables were being manipulated at the end of the run.
  • Investigating variables and equations whose level values are zero, negative or very large at the end of the run.
  • Deactivating part of the code to narrow down the problem as discussed in section Isolating Terms in Slow Statements below.

Resolving function evaluation errors will usually entail the following techniques:

  • Adding lower bounds to variables to keep them above zero.
  • Adding upper bounds to variables to prevent them from getting too large.
  • Reformulating the model, for example, introducing intermediate variables.
  • Providing better starting points that direct the solver search to a more relevant region. See section Specifying Initial Values in tutorial Good NLP Formulations for details.
  • Fixing faulty input data.

Presolve Errors

Some solvers use a pre-processing step where the program is presolved to make the main solution process faster. During this step model errors could already be discovered, as in the following example:

Variables         z;
Integer Variables y1,y2;

Equations         r1,r2,r3,r4;

   r1..  z=e=y1+y2;
   r2..  y1=g=0.10;
   r3..  y2=g=0.10;
   r4..  y1+y2=l=1;

Model badpresol /all/;
solve badpresol using mip maximizing z;

For this problem, Cplex already detects during presolve that there is no feasible integer solution. This is reported in the log:

Row 'r4' infeasible, all entries at implied bounds.
Presolve time = 0.00 sec. (0.00 ticks)

...

CPLEX Error  1217: No solution exists.
Problem is integer infeasible.

Here, Cplex makes it clear where the problem is: row r4 is infeasible because all entries are at their "implied bounds". Let's look at r2 and r3 to see what this means: these equations set a lower bound of 0.1 for y1 and y2. Since both variables are declared as integer variables, this implies a lower bound of 1 for each of them. The left-hand side of r4 is therefore at least 2, so equation r4 must be infeasible.

Solver Specific Limits

Many solvers have internal limits that may be exceeded and may cause the listing file to report an execution error. These errors may be resolved by using either GAMS options or solver specific options to increase the respective limits. Usually, the listing file will contain information about which options to use. Note that the solver manuals distributed with GAMS list the options that may be specified for each solver. For example, to relax the MINOS major iteration limit, the user may create a file named minos.opt with the following line:

Major iterations 1000

More about solver option files can be found in section The Solver Options File.
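
As a hedged sketch of how such an option file might be put to use: the file is written at compile time with $onecho, and the model attribute optfile tells the solver to read it. The model small and its objective are hypothetical and serve only to make the example complete.

```gams
* Write the solver option file next to the model at compile time
$onecho > minos.opt
Major iterations 1000
$offecho

Variables x, z;
Equation  obj  "illustrative objective";

obj..  z =e= sqr(x - 2);

Model small / all /;

* Select MINOS for NLP models and instruct it to read minos.opt
option nlp = minos;
small.optfile = 1;

solve small using nlp minimizing z;
```

Without the assignment small.optfile = 1; the file minos.opt would be ignored, which is a common reason for solver options seemingly having no effect.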

Managing Execution Errors with the Function execError

The function execError facilitates implementing procedures that manage execution errors. Consider the following example, which is an extension of the example in section Arithmetic Errors above.

Set       s     / s1*s5 / ;
Parameter p(s)  "data to be exponentiated"
          d(s)  "divisors"
          r(s)  "result";

p(s)    = 1;
p("s2") = -1;
d(s)    = 1;
d("s3") = 0;
r(s)    = p(s)**2.1 / d(s);
display r;

*cause z to be undefined
Scalar z;
z = 1/0;

if(execError > 0,
   r(s)$(r(s) = z) = 0;);
display r;

Observe that we introduced a new scalar z that is deliberately undefined. In the if statement that follows, we use the function execError in the logical condition and the undefined scalar in the conditional assignment. The if statement has the effect that undefined entries are removed from the array of the parameter r, as illustrated in the following lines of the execution output:

E x e c u t i o n

**** Exec Error at line 10: rPower: FUNC DOMAIN: x**y, x < 0
**** Exec Error at line 10: division by zero (0)

----     11 PARAMETER r  result

s1 1.000,    s2  UNDF,    s3  UNDF,    s4 1.000,    s5 1.000

**** Exec Error at line 16: division by zero (0)

----     20 PARAMETER r  result

s1 1.000,    s4 1.000,    s5 1.000

In addition, the function execError may be used to reset the count of the number of execution errors. Typically, it is reset to zero so that GAMS will terminate with the status message Normal completion. For example, we could add the following line at the end of the code in the example above:

execError = 0;

Note
By default, a solve statement is not executed if execution errors occurred earlier in the run. Thus setting execError = 0; does more than produce a normal completion in the example above: placed before a solve statement, it allows the solve to be executed despite earlier execution errors.

Setting execError = 0; also results in a notification in the log:

********************************
*** Errors have been cleared ***
********************************
*** Status: Normal completion
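
The following sketch combines both uses of execError: the error count is recorded, the faulty data is repaired, and the count is reset so that a subsequent solve statement is executed. All identifiers (a, b, nErr, tiny) are hypothetical, and replacing the undefined value by zero is an assumption that will not suit every model.

```gams
Scalar a     "divisor"                     / 0 /
       b     "result of a faulty division"
       nErr  "number of execution errors";

* This division by zero raises an execution error and leaves b undefined
b = 1 / a;

* Record the error count and repair the data if anything went wrong
nErr = execError;
if(nErr > 0,
   b = 0;
);

* Clear the error count so that the solve below is not skipped
execError = 0;
display nErr, b;

Positive Variable x;
Variable          z;
Equations obj, cap;

obj..  z =e= x + b;
cap..  x =l= 10;

Model tiny / all /;
solve tiny using lp maximizing z;
```

Resetting the count hides the original error from the final status message, so it is advisable to log or display what was repaired, as done here with nErr.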

Small to Large: Aid in Development and Debugging

Many GAMS users are overly impressed with how easily GAMS handles large models. Modelers often feel such a facility means they should always work on the full model. The result is often a large, sometimes extremely large, model in the early stages of model development. Debugging such large formulations is not easy.

The algebraic modeling style employed in GAMS is inherently expandable. This offers interesting possibilities in terms of the strategy that may be employed for model development and debugging which are discussed herein.

An Illustrative Example

The set based algebraic modeling style implemented in GAMS is by its very nature easy to expand. It is easy to use the same model formulation on differently sized data sets. We will illustrate this based on the transportation model [TRNSPORT]. Note that we included some post-solution calculations at the end.

* Data section
Sets  i   "canning plants"   / Seattle, San-Diego / 
      j   "markets"          / New-York, Chicago, Topeka / ;
     
Parameters  a(i)  "capacity of plant i in cases"
               / Seattle   350, San-Diego  600/

            b(j)  "demand at market j in cases"
               / New-York 325, Chicago 300, Topeka 275 /;
    
Table       d(i,j)  "distance in thousands of miles"

                          New-York       Chicago      Topeka
           Seattle          2.5             1.7         1.8
           San-Diego        2.5             1.8         1.4  ;
    
Scalar f  "freight in dollars per case per thousand miles" /90/ ;

Parameter   c(i,j)  "transport cost in thousands of dollars per case" ;
c(i,j) = f * d(i,j) / 1000 ;

* Model Section
Positive Variable  x(i,j)  "shipment quantities in cases";
Variable           z       "total transportation costs in thousands of dollars";

Equations cost        "define objective function"
          supply(i)   "observe supply limit at plant i"
          demand(j)   "satisfy demand at market j";
    
cost ..        z  =e=  sum((i,j), c(i,j) * x(i,j)) ;
supply(i) ..   sum(j, x(i,j))  =l=  a(i) ;
demand(j) ..   sum(i, x(i,j))  =g=  b(j) ; 

Model transport /all/ ;
solve transport using lp minimizing z ;

Parameter   m(*,*)  "commodity movement";
m(i,j) = x.l(i,j);
m("total",j) = sum(i, x.l(i,j));
m(i,"total") = sum(j, x.l(i,j));
m("total","total") = sum(j, m("total",j));
option decimals = 0;
display m;

This model may be easily extended by adding more data:

* Data section
Sets  i   "canning plants"   / Seattle, San-Diego, Baltimore, Dallas    /
      j   "markets"          / New-York, Chicago, Topeka, Boston, Miami /;
     
Parameters  a(i)  "capacity of plant i in cases"
               / Seattle   350, San-Diego  600, Baltimore 450, Dallas  750 /

            b(j)  "demand at market j in cases"
               / New-York 325, Chicago 300, Topeka 275, Boston 330, Miami 290 /;
    
Table       d(i,j)  "distance in thousands of miles"

                          New-York    Chicago   Topeka   Boston   Miami
           Seattle          2.5          1.7      1.8     3.1      3.3
           San-Diego        2.5          1.8      1.4     3.0      2.7
           Baltimore        0.2          0.7      1.8     0.4      1.1
           Dallas           1.5          0.9      0.5     1.8      1.3 ;

Scalar f  "freight in dollars per case per thousand miles" /90/ ;

Parameter   c(i,j)  "transport cost in thousands of dollars per case" ;
c(i,j) = f * d(i,j) / 1000 ;

* Model Section
Positive Variable  x(i,j)  "shipment quantities in cases";
Variable           z       "total transportation costs in thousands of dollars";

Equations cost        "define objective function"
          supply(i)   "observe supply limit at plant i"
          demand(j)   "satisfy demand at market j";
    
cost ..        z  =e=  sum((i,j), c(i,j) * x(i,j)) ;
supply(i) ..   sum(j, x(i,j))  =l=  a(i) ;
demand(j) ..   sum(i, x(i,j))  =g=  b(j) ; 

Model transport /all/ ;
solve transport using lp minimizing z ;

Parameter   m(*,*)  "commodity movement";
m(i,j) = x.l(i,j);
m("total",j) = sum(i, x.l(i,j));
m(i,"total") = sum(j, x.l(i,j));
m("total","total")=sum(j, m("total",j));
option decimals = 0;
display m;

Observe that the two sets (i and j) were enlarged, the capacity (a) and demand (b) data were expanded to cover the new plant and market entries and the distance table (d) was adjusted accordingly. However, the data calculation, equations, model definition, model solution and report writing sections are identical in the two models.

Motivation and Step by Step Guide

As we have demonstrated in the example above, GAMS allows the model structure, calculations and report writing to be developed and implemented using a small data set that may be easily expanded to larger data sets. Thus, we strongly recommend starting with a purposefully small but representative data set and enlarging it to its full size once the work of model development, testing and debugging has been done. In short: work from small to large.

The larger the model, the longer it takes to compile, generate, execute and solve it. Run time often grows much faster than the size of the data set. Working with a large model from the start will often lead to frustration, even when the user is only trying to find some relatively small data problems.

If a model that has already been completed needs some modification, it will be tempting to use the large data set instead of developing the modifications on a small data set. We strongly advise using a small data set in this case, as experience shows that this way a considerable amount of time may be saved.

We recommend following these steps in model development:

  1. Set up a small data set representing the full model with all structural features, set names, parameters etc.
  2. Implement all data calculations, model features and report writing calculations.
  3. Test the results of step 2 thoroughly.
  4. Save the small model. Then implement a larger version with the full data set. Create separate files for data, calculation, model definition and report writing to maintain size independence. Use include files or the save and restart feature.
  5. Test the larger model. Use the modeling techniques discussed below to facilitate your work.
  6. Keep the small model current. As additional structural features are added to the large model, use it to test them. See section Introducing Strategical Subsets below for an easy way to maintain a small model.

Modeling Techniques

If users follow the steps for model development outlined in section Motivation and Step by Step Guide above, they will notice that it will not always be possible to model every needed feature with the small model. It is important to carefully choose the small data set so that it has all features of the larger data set. However, occasionally the peculiarities and interrelationships of the full data set cannot be reproduced in the small data set. In this section we will introduce some modeling techniques for finding problems that arise only when the full data set is used. They include saving and restarting to isolate the problem area, strategically introducing subsets and data reduction.

Isolating Problem Areas through Saving and Restarting

Suppose we have a model with a large data set that takes several hours to run and we wish to add some lines of code in a relatively small segment. The best way to do this is by isolating the relevant part. Isolating the part we wish to modify makes it possible to do tests and repairs without having to input data, do initial calculations and solve the whole model with each run. We recommend using save and restart files for this purpose.

For example, in chapter The Save and Restart Feature we demonstrate how to split the transportation model [TRNSPORT] into three parts: the file tranmodel.gms contains the data and the model, the file transolve.gms contains the solve statement and the file tranreport.gms contains a display statement. To run the whole model we use the following sequence, saving and restarting from the saved file:

   > gams tranmodel s=s1
   > gams transolve  r=s1  s=s2
   > gams tranreport r=s2

Assume we want a more elaborate report than just the display of some level values. As the file tranreport.gms contains the code relevant for reporting, we will modify only this file. Then we will test the result by running only this file, restarting from s2, without having to solve the whole model repeatedly.

Introducing Strategical Subsets

When full data sets are used in debugging or development, it is often helpful to narrow the focus on a few items in a set by introducing subsets. The following example is a modified version of the extended transportation model from section An Illustrative Example above.

* Data section
Sets  i   "canning plants"   / Seattle, San-Diego, Baltimore, Dallas / 
      j   "markets"          / New-York, Chicago, Topeka, Boston, Miami / ;

Sets  plants(i)   "a reduced set of canning plants"
                           / Seattle, San-Diego / 
      markets(j)  "a reduced set of demand markets"
                           / New-York, Chicago, Topeka / ;
        
*plants(i) = yes; markets(j) = yes;

Parameters  a(i)  "capacity of plant i in cases"
               / Seattle   350, San-Diego  600, Baltimore 450, Dallas  750 /

            b(j)  "demand at market j in cases"
               / New-York 325, Chicago 300, Topeka 275, Boston 330, Miami 290 /;
    
Table       d(i,j)  "distance in thousands of miles"

                          New-York    Chicago   Topeka   Boston   Miami 
           Seattle          2.5          1.7      1.8     3.1      3.3
           San-Diego        2.5          1.8      1.4     3.0      2.7
           Baltimore        0.2          0.7      1.8     0.4      1.1
           Dallas           1.5          0.9      0.5     1.8      1.3;

Scalar f  "freight in dollars per case per thousand miles" /90/ ;

Parameter   c(i,j)  "transport cost in thousands of dollars per case" ;
c(plants,markets) = f * d(plants,markets) / 1000 ;

* Model section
Positive Variable  x(i,j)  "shipment quantities in cases";
Variable           z       "total transportation costs in thousands of dollars";

Equations cost        "define objective function"
          supply(i)   "observe supply limit at plant i"
          demand(j)   "satisfy demand at market j";
    
cost ..              z  =e=  sum((plants,markets), c(plants,markets) * x(plants,markets)) ;
supply(plants) ..    sum(markets, x(plants,markets))  =l=  a(plants) ;
demand(markets) ..   sum(plants, x(plants,markets))   =g=  b(markets) ; 

Model transport /all/ ;
solve transport using lp minimizing z ;

Observe that we introduced the subsets plants and markets that contain only some of the elements of their supersets i and j. Note that all tables, parameters and variables are defined with the supersets, the equations are declared over the supersets but defined over the subsets, and the calculation of the parameter c is also restricted to the subsets. Hence the model is restricted to the elements of the subsets. However, it is easy to change the restricted model back to the full model by removing the asterisk that marks the following line as a comment:

plants(i) = yes; markets(j) = yes;

Observe that the sets plants and markets are now dynamic sets. Note that this assignment could be inserted anywhere in the code. Thus, introducing strategic subsets may be combined with isolating problem areas, as detailed in section Isolating Problem Areas through Saving and Restarting above.

Introducing strategic subsets has proven to be an effective way of maintaining a small data set with little effort. Users only have to choose elements that are representative for model development and debugging from the full sets.
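
Instead of editing the comment line by hand, the switch between the small and the full model could be driven from the command line. The sketch below assumes a user-defined double-dash parameter named full, which is a hypothetical name chosen for this illustration; only the set handling is shown.

```gams
Sets  i   "canning plants"   / Seattle, San-Diego, Baltimore, Dallas /
      j   "markets"          / New-York, Chicago, Topeka, Boston, Miami / ;

Sets  plants(i)   "a reduced set of canning plants"
                           / Seattle, San-Diego /
      markets(j)  "a reduced set of demand markets"
                           / New-York, Chicago, Topeka / ;

* Expand the subsets to the full sets only if --full=... was given
$if set full plants(i)  = yes;
$if set full markets(j) = yes;

display plants, markets;
```

Running the file without extra parameters would keep the reduced subsets, while adding --full=1 to the GAMS call would activate all elements.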

Reducing Data

Recall that GAMS skips cases where data items equal zero. Thus a large model may be reduced by temporarily removing data from data sets by simply setting items to zero. Consider the following example:

Sets      o         'origin'                           / o1*o100 /
          d         'destination'                      / d1*d100 /;
Parameter dist(o,d) 'distance';
dist(o,d) = 120 + 50*ord(d) - 0.5*ord(o);
 
Sets      so(o)     'small set of origins for testing' / o4, o47, o91 /
          sd(d)     'small set of destinations'        / d3, d44, d99 /;
 
dist(o,d) $ (not (so(o) and sd(d))) = 0;

Parameter cost(o,d) 'transportation cost';
cost(o,d) $ dist(o,d) = 3 + 2*dist(o,d);
display cost, dist;

Note that we introduced strategic subsets and used them in the logical condition of a conditional assignment to set almost all entries of the parameter dist to zero. Note further, that the assignment for the parameter cost is conditioned on nonzero entries for the distance. Now, if the model were conditioned on nonzero transportation costs, the size of the whole model would be greatly reduced.
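
To illustrate the last point, the sketch below conditions a small transportation model on nonzero costs, so that only the nine retained origin and destination pairs generate variables and constraints. The variable and equation names, the capacity of 100 and the demand of 50 are hypothetical values chosen for this illustration.

```gams
Sets      o         'origin'                           / o1*o100 /
          d         'destination'                      / d1*d100 /
          so(o)     'small set of origins for testing' / o4, o47, o91 /
          sd(d)     'small set of destinations'        / d3, d44, d99 /;

Parameter dist(o,d) 'distance'
          cost(o,d) 'transportation cost';

dist(o,d) = 120 + 50*ord(d) - 0.5*ord(o);
dist(o,d) $ (not (so(o) and sd(d))) = 0;
cost(o,d) $ dist(o,d) = 3 + 2*dist(o,d);

Positive Variable ship(o,d) 'shipments';
Variable          total     'total cost';

Equations defcost    'objective definition'
          supply(o)  'assumed capacity of 100 per origin'
          demand(d)  'assumed demand of 50 per destination';

* All sums and equation domains skip pairs without a cost entry
defcost..                          total =e= sum((o,d)$cost(o,d), cost(o,d)*ship(o,d));
supply(o)$(sum(d, cost(o,d)))..    sum(d$cost(o,d), ship(o,d)) =l= 100;
demand(d)$(sum(o, cost(o,d)))..    sum(o$cost(o,d), ship(o,d)) =g= 50;

Model reduced / all /;
solve reduced using lp minimizing total;
```

Removing the assignment that sets most entries of dist to zero would restore the full model without any change to the equations.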

Increasing Efficiency: Reducing GAMS Execution Time

GAMS can take a long time for computations and model generation. Some signs indicate that it may be possible to reduce the execution time, e.g., an execution time that is unexpectedly long overall, or a single line that takes long to execute, which is visible when the log shows the same line number for a long time.

In this section we will discuss how to find the causes for slow program execution and how to eliminate the main causes for slow execution.

Finding the Causes for Slow Program Execution

The best strategy for discovering the causes for slow execution is a combination of the techniques discussed in section Small to Large: Aid in Development and Debugging above and the techniques that we will introduce in this section, including generating an execution profile and isolating terms in slow statements. We will also touch briefly on observing the log file and we will point out why this is not the first choice.

Generating an Execution Profile

The quickest way to find GAMS statements that take particularly long to execute is to generate an execution profile in the output file. The execution profile contains the individual and cumulative time required to execute the sections of the GAMS model as well as information on memory use. An execution profile is generated when the option profile is assigned a value larger than zero (zero is the default). This can be done either by setting a command line parameter or by using the option statement. We will show an example of an execution profile below. For more information on execution profiles, further examples and details on the values the option profile may take, see the detailed description of the option profile.

Consider the following example:

option profile = 1;
option limrow = 0; option limcol = 0;
option solprint = off;

Sets    a / 1*22 /, b / 1*22 /, c / 1*20 /,
        d / 1*20 /, e / 1*22 /;

Parameters x(e,d,c,b,a), y, z(a,b,c,d,e);
x(e,d,c,b,a) = 10;
z(a,b,c,d,e) = x(e,d,c,b,a);
y            = sum((a,b,c,d,e), z(a,b,c,d,e)*x(e,d,c,b,a));

Variable obj;
Positive Variable var(e,b,a);

Equations objeq, r(b,c,d), q(a,b,c);

objeq..      obj =e= sum((a,b,c,d,e), z(a,b,c,d,e)*x(e,d,c,b,a) * var(e,b,a));
r(b,c,d)..   sum((a,e), var(e,b,a)) =l= sum((a,e), x(e,d,c,b,a)*z(a,b,c,d,e));
q(a,b,c)..   sum((d,e), var(e,b,a)/x(e,d,c,b,a)*z(a,b,c,d,e)) =l= 20;

Model slow /all/;
solve slow maximizing obj using lp;

Parameter sumofvar;
sumofvar = sum((a,b,c,d,e), z(a,b,c,d,e)*x(e,d,c,b,a)*var.l(e,b,a));
display sumofvar;

The listing file will contain an execution profile like this (spread over the file):

----      9 Assignment x             0.374     0.374 SECS    109 MB  4259200
----     10 Assignment z             2.231     2.605 SECS    286 MB  4259200
----     11 Assignment y             2.324     4.929 SECS    286 MB      0
----     23 Solve Init slow          0.000     4.961 SECS    286 MB
----     18 Equation   objeq         3.510     8.471 SECS    287 MB      1
----     19 Equation   r             3.088    11.559 SECS    464 MB   8800
----     20 Equation   q             5.741    17.300 SECS    470 MB   9680
----     23 Solve Fini slow          0.780    18.080 SECS    470 MB  4482809
----     23 GAMS Fini                0.359     0.359 SECS    470 MB
----      1 InitE                    0.032     0.032 SECS    213 MB
----      1 ExecInit                 0.000     0.032 SECS    213 MB
----     23 Solve Alg  slow          0.000     0.032 SECS    213 MB
----     23 Solve Read slow          0.000     0.032 SECS    215 MB
----     26 Assignment sumofvar      2.620     2.652 SECS    287 MB      0
----     27 Display                  0.032     2.684 SECS    287 MB
----     27 GAMS Fini                0.000     0.000 SECS    287 MB

The first column shows the line number in the input file of the GAMS statement that is executed. The second column reports the type of statement. For an overview of all GAMS statements, see section Classification of GAMS Statements. The next two columns give the individual time needed to execute the respective statement and the cumulative time spent so far. The memory use follows and finally, the number of assignments generated in the respective line is shown.

In addition, there is a Profile Summary at the end of the lst file showing the most expensive statements:

---- Profile Summary (19 records processed)
     5.741   0.470GB        20 Equation   q (9680)
     3.510   0.287GB        18 Equation   objeq (1)
     3.088   0.464GB        19 Equation   r (8800)
     2.620   0.287GB        26 Assignment sumofvar (0)
     2.324   0.286GB        11 Assignment y (0)
     2.231   0.286GB        10 Assignment z (4259200)
     0.780   0.470GB        23 Solve Fini slow (4482809)
     0.374   0.109GB         9 Assignment x (4259200)
     0.359   0.470GB        23 GAMS Fini
     0.032   0.213GB         1 InitE

This shows that the statements in lines 20, 18, 19, 26, 11 and 10 are the most expensive ones (in this order). One reason is an inconsistent order when sets are referenced; we will discuss this topic in section Ordering Indices Consistently below.

Note that the execution profile may contain many lines that are not informative since the execution times reported are negligible. These lines may be suppressed by using the option profileTol to specify the minimum execution time (in seconds) that is to be included. Observe that the option profileTol is available as command line parameter and option statement.

Note further that the command line parameter profileFile facilitates writing the profiling information to a separate file (instead of the listing file).
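
The following minimal sketch shows how profile and profileTol might be combined in option statements. The threshold of 0.5 seconds and the symbol names are arbitrary assumptions chosen for illustration; which statements actually appear in the profile depends on the machine.

```gams
* Report only statements that take at least 0.5 seconds (assumed threshold)
option profile    = 1;
option profileTol = 0.5;

* Hypothetical command line form with made-up file names:
*   gams mymodel.gms profile=1 profileTol=0.5 profileFile=myprofile.txt

Sets a / 1*50 /, b / 1*50 /, c / 1*50 /;
Parameter x(a,b,c) 'larger assignment'
          tiny     'cheap assignment, expected to fall below the tolerance';

tiny     = 1;
x(a,b,c) = 10;
display tiny;
```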

Isolating Terms in Slow Statements

In some cases the execution profile shows that the cause for a long execution time is connected with a very long statement. For example, the objective function in some models and some report calculations may take hundreds of lines of code and can contain many terms that are added. If such a long statement is problematic in terms of execution time, it will be necessary to deactivate parts of the code and run the program repeatedly to find the precise lines that are at the root of the problem. This can be done by using comments.
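
A minimal sketch of this technique follows. It assumes a hypothetical report calculation with three terms; the second term is deactivated with a comment line (an asterisk in the first column), so that the profile of this run reflects only the remaining terms.

```gams
Set i / i1*i3 /;
Parameter a(i), b(i), c(i), report;
a(i) = ord(i);
b(i) = 2*ord(i);
c(i) = 3*ord(i);

option profile = 1;

* The middle term is temporarily deactivated to measure the other two
report =   sum(i, a(i))
*        + sum(i, b(i))
         + sum(i, c(i));
display report;
```

Reactivating the terms one at a time and comparing the reported execution times narrows the search to the term that causes the slowdown.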

Observing the Log File

Some modelers choose to examine the log file or watch the screen during execution to find the causes for slow program execution. However, we advise against this approach for the following reasons:

  • Statements that are executed slowly are easily missed and often statements are misidentified. In addition, screen watchers may be distracted and will have to repeat the process.
  • GAMS line reporting can be misleading if flow control statements like if statements and loop statements are executed. For example, individual calculations in a loop are not reported to the screen. A user watching the screen would notice that the loop takes a lot of time, but there is no indication which statement within the loop is problematic. This applies to all GAMS control structures.

Therefore we recommend using the option profile as the main tool for finding the causes for slow program execution. For details, see section Generating an Execution Profile above. In addition, see the techniques outlined in section Advice for Repairing Puzzling Nonworking Code below.

Eliminating the Main Causes for Slow Program Execution

The main reasons for a slow program execution include an inconsistent index order when sets are referenced and taking irrelevant cases into consideration. In this section we will give some guidance on how to eliminate these causes, and also point to problems due to the scaling of a model which could cause an unnecessarily long execution time for the solver.

Ordering Indices Consistently

GAMS employs a sparse matrix data storage scheme. For example, consider the parameter p(a,b,c). Assume that the set a has \(k\) elements, the set b has \(n\) elements and c has \(m\) elements. Then the entries for p are stored in the following order:

a1 b1 c1
a1 b1 c2
...
a1 b1 cm
a1 b2 c1
...
a1 b2 cm
...
a1 bn cm
a2 b1 c1
...
ak bn cm

Note that it is a systematic order where the last entry varies the fastest and the first the slowest. Observe that GAMS will retrieve entries from memory fastest if they are referenced in an order consistent with the storage order. Thus, in the following example, the first assignment statement will be processed faster than the second assignment statement.

x(a,b,c) = p(a,b,c);
y(b,c,a) = p(a,b,c);

Note: GAMS will execute a program fastest if the sets are always referenced in the same order in definitions, assignments and equations.

The example that follows illustrates this principle. First we will solve a program where the indices appear in an arbitrary order and we will record the output generated by setting the option profile to 1. Then we will reformulate the program so that the indices will always appear in an alphabetical order and solve it again, recording the profile output. In the final step, we will compare the execution times of the two runs. We will use the example introduced above.

Note that the indices in the parameters and equations appear in a random order. Here is the profile from the six most expensive statements again:

----     10 Assignment z             2.231     2.605 SECS    286 MB  4259200
----     11 Assignment y             2.324     4.929 SECS    286 MB      0
----     18 Equation   objeq         3.510     8.471 SECS    287 MB      1
----     19 Equation   r             3.088    11.559 SECS    464 MB   8800
----     20 Equation   q             5.741    17.300 SECS    470 MB   9680
----     26 Assignment sumofvar      2.620     2.652 SECS    287 MB      0

In the next step we reformulate the program such that the indices always appear in the same order. For example, we define the parameter x as x(a,b,c,d,e) instead of x(e,d,c,b,a). Here is the complete rewritten model:

option profile = 1;
option limrow = 0; option limcol = 0;
option solprint = off;

Sets    a / 1*22 /, b / 1*22 /, c / 1*20 /,
        d / 1*20 /, e / 1*22 /;

Parameters x(a,b,c,d,e), y, z(a,b,c,d,e);
x(a,b,c,d,e) = 10;
z(a,b,c,d,e) = x(a,b,c,d,e);
y            = sum((a,b,c,d,e), z(a,b,c,d,e)*x(a,b,c,d,e));

Variable obj;
Positive Variable var(a,b,e);

Equations objeq, r(b,c,d), q(a,b,c);

objeq..      obj =e= sum((a,b,c,d,e), z(a,b,c,d,e)*x(a,b,c,d,e) * var(a,b,e));
r(b,c,d)..   sum((a,e), var(a,b,e)) =l= sum((a,e), x(a,b,c,d,e)*z(a,b,c,d,e));
q(a,b,c)..   sum((d,e), var(a,b,e)/x(a,b,c,d,e)*z(a,b,c,d,e)) =l= 20;

Model slow /all/;
solve slow maximizing obj using lp;

Parameter sumofvar;
sumofvar = sum((a,b,c,d,e), z(a,b,c,d,e)*x(a,b,c,d,e)*var.l(a,b,e));
display sumofvar;

After running the modified program, the profile for expensive statements looks like this:

----     10 Assignment z             0.593     0.983 SECS    215 MB  4259200
----     11 Assignment y             0.671     1.654 SECS    215 MB      0
----     18 Equation   objeq         1.778     3.432 SECS    215 MB      1
----     19 Equation   r             2.215     5.647 SECS    392 MB   8800
----     20 Equation   q             1.763     7.410 SECS    398 MB   9680
----     26 Assignment sumofvar      0.952     0.983 SECS    216 MB      0

Observe that executing for example the assignment to z took just 0.593 seconds compared to 2.231 seconds in the first run. Substantial percentage reductions were achieved in all time consuming cases by consistently referencing the sets in the same order.

Replace loops with assignments

The following statement assigns a constant value to a parameter.

loop((i,j,k), p(i,j,k) = 2;);

The following assignment is preferred instead.

p(i,j,k)=2;
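
A minimal sketch for comparing both forms with the execution profile is given below. The set sizes and the parameter names pLoop and pAssign are assumptions chosen for illustration; the timings will differ from machine to machine.

```gams
option profile = 1;

Sets i / i1*i40 /, j / j1*j40 /, k / k1*k40 /;
Parameter pLoop(i,j,k)   'filled element by element in a loop'
          pAssign(i,j,k) 'filled by one indexed assignment';

* Element-by-element version
loop((i,j,k), pLoop(i,j,k) = 2;);

* Preferred version: a single indexed assignment
pAssign(i,j,k) = 2;

* Both versions produce the same data
Scalar diff 'sum of absolute differences';
diff = sum((i,j,k), abs(pLoop(i,j,k) - pAssign(i,j,k)));
display diff;
```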

Restricting Assignments and Equations to Relevant Cases

Assignments

Assume that we have a set of cities with different production capacities and demands for various products. We want to know the maximum transportation cost (which depends on the distance, the amount shipped and a fixed factor) from each city to all others. This cost can be calculated in the following way:

Sets c "cities"   / c1*c800 /
     p "products" / p1*p10  /;
Alias (c,cc);

Parameter capacity(c,p)  "Production capacity for product p in city c"
          demand(c,p)    "Demand for product p in city c"
          distance(c,cc) "Distance between two cities";

*Generate some sparse, random data
capacity(c,p)$(uniform(0,1)<0.05) = uniformInt(150,250);
demand(c,p)$(uniform(0,1)<0.025)  = uniformInt(50,150);
distance(c,cc)$(not sameas(c,cc)) = uniformInt(10,800);

Parameter maxCost(c,cc) "Maximum transportation costs between two cities";

maxCost(c,cc) = sum(p, min(capacity(c,p), demand(cc,p))*distance(c,cc)*90);

The performance profile will tell us something like this:

----     16 Assignment maxCost       0.265     0.436 SECS     19 MB   8756

Since we know that the parameter maxCost will be zero for a pair of cities if there is no product with production capacity in the first city and demand in the second one, we could reduce the execution time for the last assignment:

Sets c "cities"   / c1*c800 /
     p "products" / p1*p10 / ;
Alias (c,cc);

Parameter capacity(c,p)  "Production capacity for product p in city c"
          demand(c,p)    "Demand for product p in city c"
          distance(c,cc) "Distance between two cities";

*Generate some sparse, random data
capacity(c,p)$(uniform(0,1)<0.05) = uniformInt(150,250);
demand(c,p)$(uniform(0,1)<0.025)  = uniformInt(50,150);
distance(c,cc)$(not sameas(c,cc)) = uniformInt(10,800);

Parameter maxCost(c,cc) "Maximum transportation costs between two cities";

maxCost(c,cc)$sum(p, capacity(c,p)*demand(cc,p))
  = sum(p, min(capacity(c,p), demand(cc,p))*distance(c,cc)*90);

Thus the calculation of maxCost is skipped for all pairs of cities where we know in advance that the result must be zero. This results in a reduced runtime:

----     17 Assignment maxCost       0.031     0.187 SECS     19 MB   8756

Note: To restrict computations in assignments to the relevant cases, we recommend using dollar conditions and filtering sets. These concepts are introduced and discussed in detail in chapter Conditional Expressions, Assignments and Equations.

For more examples, see sections Conditional Assignments and Conditional Indexed Operations.
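
The example above uses a dollar condition. As a complement, the following minimal sketch shows how a filtering set might be used for the same purpose. The set link and the parameter maxCostAll are hypothetical names introduced here, and the number of cities is reduced to 50 to keep the sketch small.

```gams
Sets c 'cities' / c1*c50 /, p 'products' / p1*p10 /;
Alias (c,cc);

Parameter capacity(c,p)  'Production capacity for product p in city c'
          demand(c,p)    'Demand for product p in city c'
          distance(c,cc) 'Distance between two cities';

capacity(c,p)$(uniform(0,1)<0.05) = uniformInt(150,250);
demand(c,p)$(uniform(0,1)<0.025)  = uniformInt(50,150);
distance(c,cc)$(not sameas(c,cc)) = uniformInt(10,800);

* Hypothetical filtering set: pairs with capacity in c and demand in cc
Set link(c,cc) 'city pairs sharing at least one product';
link(c,cc) = yes$sum(p, capacity(c,p)*demand(cc,p));

Parameter maxCost(c,cc)    'computed for relevant pairs only'
          maxCostAll(c,cc) 'computed for all pairs, for comparison';

maxCost(link(c,cc)) = sum(p, min(capacity(c,p), demand(cc,p))*distance(c,cc)*90);
maxCostAll(c,cc)    = sum(p, min(capacity(c,p), demand(cc,p))*distance(c,cc)*90);

Scalar diff 'sum of absolute differences';
diff = sum((c,cc), abs(maxCost(c,cc) - maxCostAll(c,cc)));
display diff;
```

An advantage of the filtering set is that it is computed once and can be reused in later assignments and equations.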

Variables and Equations

Like assignments, variables and equations need to be restricted to relevant cases to avoid unnecessary inefficiencies. Dollar conditions and filtering sets may be used over the domain of definition as well as in the body of an equation.

Let's extend the assignment example from the previous paragraph and use the generated data in a transportation model:

Sets c "cities"   / c1*c800 /
     p "products" / p1*p10 / ;
Alias (c,cc);

Parameter capacity(c,p)  "Production capacity for product p in city c"
          demand(c,p)    "Demand for product p in city c"
          distance(c,cc) "Distance between two cities";

*Generate some sparse, random data
capacity(c,p)$(uniform(0,1)<0.05) = uniformInt(150,250);
demand(c,p)$(uniform(0,1)<0.025)  = uniformInt(50,150);
distance(c,cc)$(not sameas(c,cc)) = uniformInt(10,800);

Parameter shipCost(c,cc) "Transportation costs between two cities per case"
          maxCost(c,cc)  "Maximum transportation costs between two cities";

shipCost(c,cc) = distance(c,cc)*90;
maxCost(c,cc)$sum(p, capacity(c,p)*demand(cc,p))
  = sum(p, min(capacity(c,p), demand(cc,p))*shipCost(c,cc));

Variables
     x(c,cc,p)  "shipment quantities in cases"
     z          "total transportation costs in thousands of dollars" ;

Positive Variable x ;

Equations
     cost         "define objective function"
     supply(c,p)  "observe supply limit for product p in city c"
     dem(cc,p)    "satisfy demand for product p in city cc" ;

cost..         z  =e=  sum((c,cc,p), shipCost(c,cc)*x(c,cc,p)) ;
supply(c,p)..  sum(cc, x(c,cc,p)) =l=  capacity(c,p) ;
dem(cc,p)..    sum(c,  x(c,cc,p)) =g=  demand(cc,p) ;

Model transport /all/ ;

Solve transport using lp minimizing z ;

The Profile Summary tells us that the equations are rather expensive to generate and that reading the solution also takes some time because of the size of the model:

---- Profile Summary (18 records processed)
    98.780   1.070GB        34 Equation   dem (8000)
    26.864   0.515GB        38 Solve Read transport
    25.303   0.454GB        32 Equation   cost (1)
     6.599   0.813GB        33 Equation   supply (8000)

However, as in the previous example, we know that a product p won't be shipped from city c to city cc if there is either no production capacity in the first city or no demand in the second one. So we could reduce the size of our model by not generating variables and equations that we know to be irrelevant for the solution. Here is an improved formulation of the equations:

cost..                       z  =e=  sum((c,cc,p)$(capacity(c,p)*demand(cc,p)), shipCost(c,cc)*x(c,cc,p)) ;
supply(c,p)$capacity(c,p)..  sum(cc$demand(cc,p), x(c,cc,p))  =l=  capacity(c,p) ;
dem(cc,p)$demand(cc,p)..     sum(c$capacity(c,p),  x(c,cc,p)) =g=  demand(cc,p) ;

This decreases the size of the model and thus the execution time to generate the model and load the solution significantly:

---- Profile Summary (18 records processed)
     0.031   0.035GB        33 Equation   cost (1)
     0.031   0.034GB        39 Solve Read transport
     0.016   0.035GB        34 Equation   supply (380)

Note that the equation dem does not even show up in the summary anymore since its generation was done too quickly.

For more details on conditions in equations, see section Conditional Equations.
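
The same restriction might also be expressed with a filtering set, as in the following minimal sketch. It uses a tiny, deterministic data set that is an assumption made for illustration (the data in the example above is random), and the set arc is a hypothetical name.

```gams
Sets c 'cities' / c1*c4 /, p 'products' / p1*p2 /;
Alias (c,cc);

* Assumed data, chosen so that the model is feasible
Parameter capacity(c,p)  'production capacity' / c1.p1 200, c2.p2 200 /
          demand(c,p)    'demand'              / c3.p1 100, c4.p1 50, c4.p2 100 /
          shipCost(c,cc) 'transportation cost per case';
shipCost(c,cc) = 90*10*abs(ord(c) - ord(cc));

* Hypothetical filtering set of shipments that can be nonzero
Set arc(c,cc,p) 'relevant shipments';
arc(c,cc,p) = yes$(capacity(c,p) and demand(cc,p));

Positive Variable x(c,cc,p) 'shipment quantities in cases';
Variable          z         'total transportation costs';
Equations cost, supply(c,p), dem(cc,p);

cost..                       z =e= sum(arc(c,cc,p), shipCost(c,cc)*x(c,cc,p));
supply(c,p)$capacity(c,p)..  sum(cc$arc(c,cc,p), x(c,cc,p)) =l= capacity(c,p);
dem(cc,p)$demand(cc,p)..     sum(c$arc(c,cc,p),  x(c,cc,p)) =g= demand(cc,p);

Model transport / all /;
solve transport using lp minimizing z;

Scalar nArcs 'number of relevant shipments';
nArcs = card(arc);
display nArcs;
```

Only the combinations in arc enter the model, so three shipment variables are generated here instead of all 32 combinations of c, cc and p.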

Keep the model well scaled

Model solutions within GAMS frequently require manipulation of large matrices and many computations. The heart of most solvers includes many numerical procedures such as a sparse matrix inverter and sets of convergence and infeasibility tolerances. Numerical problems often arise within such procedures. Poorly scaled models can cause excessive time to be taken in solving or can cause the solver to fail. GAMS can assist the user to formulate a well scaled model. Details about this can be found in the sections Model Scaling - The Scale Option and Scaling Variables and Equations.
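
As a minimal sketch of what this may look like, the following hypothetical model has variables whose levels are in the thousands while their coefficients are small. It assumes that scaling is activated through the model attribute scaleOpt and the variable attribute scale, as described in the sections mentioned above; all symbol names are made up for illustration.

```gams
Set i / i1*i3 /;
Positive Variable x(i) 'activity with levels in the thousands';
Variable          z    'objective';
Equations obj, lim(i);

obj..     z =e= sum(i, x(i));
lim(i)..  0.001*x(i) =l= 5;

* The solver works with x/1000, so the coefficient in lim becomes 1
x.scale(i) = 1000;

Model scaleDemo / obj, lim /;
scaleDemo.scaleOpt = 1;
solve scaleDemo using lp maximizing z;
display x.l, z.l;
```

The levels reported by GAMS remain in the original units of the user; the scale factors only affect the problem that is passed to the solver.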

Other Approaches

In addition to the techniques discussed in section Eliminating the Main Causes for Slow Program Execution above, the following approaches may help to reduce the time needed for program execution:

  • Trying another appropriate solver.
  • Reformulating the model. This may yield particularly good results if the model is reformulated in such a way that another model type is used that is easier to solve or for which more advanced solver technology is available.
  • Using starting points for NLP models, as discussed in section Specifying Initial Values.
  • Trading memory for time.

We conclude the discussion of this topic with an example that demonstrates how memory may be traded for time. If an extensive calculation is repeated many times in a model, it may be possible to restructure the code so that the calculation is performed only once, then the result is saved and accessed later. Consider the following equation:

obj..   z =e= sum[(i,j,k,l), a(i,j,k,l)*sum(m, u(m,i))];

The execution time may be substantially reduced by defining a new parameter, say p, for the second sum and using this parameter in the equation:

Parameter p(i);
p(i) = sum(m, u(m,i));
...
obj..   z =e= sum[(i,j,k,l), a(i,j,k,l)*p(i)];

There is only one caveat: Users need to carefully consider whether the input data, here u(m,i), is modified between the assignment for the new parameter p and the equation where p is used. If u is updated, then the assignment statement needs to be repeated, otherwise the data that enters the equation will not be current.
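
The caveat can be illustrated with the following minimal sketch. The sets and the data for u are assumptions made for illustration; the point is that p has to be recomputed after u is changed.

```gams
Sets i / i1*i3 /, m / m1*m4 /;
Parameter u(m,i) 'input data'
          p(i)   'precomputed sum over m';

u(m,i) = ord(m) + ord(i);
p(i)   = sum(m, u(m,i));

* The input data is modified later in the program ...
u(m,i) = 2*u(m,i);

* ... so the precomputed parameter must be refreshed before it is used again
p(i)   = sum(m, u(m,i));
display p;
```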

Increasing Efficiency: Reducing Memory Use

Besides slow program execution, excessive memory use may be of concern for modelers. In this section we will present some approaches on how to find the causes for extraordinary memory use and give some advice on eliminating the main causes for it.

Finding the Causes for Excessive Memory Use

The main techniques for finding the causes for excessive memory use are the same as those for finding the causes for slow program execution. We discussed these techniques in section Finding the Causes for Slow Program Execution above.

In addition, the option dmpUserSym is useful in this context. GAMS will report the number of records stored for each symbol at the point in the program where the option dmpUserSym is inserted together with some rough memory estimate.

Consider the following example:

Sets i /1*5 /,  j /1*5 /,  k /1*5 /,  l /1*5 /,
     m /1*5 /,  n /1*5 /,  o /1*5 /;
Parameters y(i,j,k,l,m,n,o)
           q(i,j,k);
Variables  x(i,j,k,l,m,n,o)
           f(i,j,k)
           obj;
y(i,j,k,l,m,n,o)       = 10;
q(i,j,k)               = 10;
x.up(i,j,k,l,m,n,o)    = 10;
x.scale(i,j,k,l,m,n,o) = 1000;
 
Equations  z(i,j,k,l,m,n,o)
           res(i,j,k)
           ob;
    
ob..                obj  =e= sum((i,j,k,l,m,n,o), x(i,j,k,l,m,n,o));
z(i,j,k,l,m,n,o)..  x(i,j,k,l,m,n,o) =l= 8;
res(i,j,k)..        f(i,j,k) =l= 7;

Model memory /all/;
option dmpUserSym;
solve memory maximizing obj using lp;

Note that an option statement with dmpUserSym was added before the solve statement. It generates the following memory dump that is included in the execution output of the listing file:

SYMBOL TABLE DUMP (USER SYMBOLS ONLY), NR ENTRIES = 16
ENTRY                               ID   TYPE DIM  LENGTH MEMORYEST DEFINED ASSIGNED DATAKNOWN
  135                                i    SET   1       5      0 MB    TRUE    FALSE      TRUE
  136                                j    SET   1       5      0 MB    TRUE    FALSE      TRUE
  137                                k    SET   1       5      0 MB    TRUE    FALSE      TRUE
  138                                l    SET   1       5      0 MB    TRUE    FALSE      TRUE
  139                                m    SET   1       5      0 MB    TRUE    FALSE      TRUE
  140                                n    SET   1       5      0 MB    TRUE    FALSE      TRUE
  141                                o    SET   1       5      0 MB    TRUE    FALSE      TRUE
  142                                y  PARAM   7   78125      3 MB   FALSE     TRUE     FALSE
  143                                q  PARAM   3     125      0 MB   FALSE     TRUE     FALSE
  144                                x    VAR   7   78125      5 MB   FALSE     TRUE     FALSE
  145                                f    VAR   3       0      0 MB   FALSE     TRUE     FALSE
  146                              obj    VAR   0       0      0 MB   FALSE     TRUE     FALSE
  147                                z    EQU   7       0      0 MB   FALSE     TRUE     FALSE
  148                              res    EQU   3       0      0 MB   FALSE     TRUE     FALSE
  149                               ob    EQU   0       0      0 MB   FALSE     TRUE     FALSE
  150                           memory  MODEL   0       3              TRUE     TRUE      TRUE
END OF SYMBOL TABLE DUMP

The column ID contains the names of the symbols, the column TYPE gives the data type of the respective entry, the column DIM reports the number of indices and the column LENGTH gives the number of records that is related to memory use, which is estimated in the column MEMORYEST. Note that the other columns are not relevant for this discussion.

Observe that the rows with high counts in column LENGTH indicate symbols within the GAMS program which have large numbers of internal records that must be stored. This is associated with corresponding memory requirements. Note also that not all length counts are of equal significance. In particular, variables and equations use more memory per element than parameters, since they have bounds, levels, marginals and scales that are associated with them. Parameters use more memory per element than sets, since sets may need just one indicator for yes or no. However, the explanatory text for set elements might increase the memory requirements for set elements.

Nevertheless, users may use this report to identify items with many records and verify that all of them are actually needed. For more details, see section Eliminating the Main Causes for Excessive Memory Use below.

Eliminating the Main Causes for Excessive Memory Use

As detailed in section Eliminating the Main Causes for Slow Program Execution above, the main causes for a slow program execution include an inconsistent index order when sets are referenced and taking irrelevant cases into consideration. These programming habits also tend to cause excessive memory use. In this section we will give some advice on avoiding memory traps and show how the memory may be cleared of data that is no longer needed.

Avoiding Memory Traps

Users may inadvertently use a lot of memory if they import data from a database with long explanatory text for sets or set elements. In addition, setting variable attributes for scaling or bounds may be problematic. Consider the following example:

x.scale(i,j,k,l,m) = 100;
x.lo(i,j,k,l,m) = 10;
x.up(i,j,k,l,m) = 77;

These assignments will probably set many more values than are relevant for a particular problem. Therefore we recommend carefully considering which label combinations are actually necessary and restricting the assignments to these cases by the use of dollar conditions or filtering sets. For more information, see section Conditional Assignments.
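
A minimal sketch of such a restriction follows. It uses two indices instead of five, and the set active is a hypothetical stand-in for whatever rule defines the relevant label combinations in a particular model.

```gams
Sets i / i1*i10 /, j / j1*j10 /;

* Hypothetical rule: only the diagonal combinations are relevant
Set active(i,j) 'relevant label combinations';
active(i,j) = yes$(ord(i) = ord(j));

Positive Variable x(i,j);

* Attributes are set for the relevant combinations only
x.scale(i,j)$active(i,j) = 100;
x.lo(i,j)$active(i,j)    = 10;
x.up(i,j)$active(i,j)    = 77;

Scalar nBounds 'number of combinations with an upper bound of 77';
nBounds = sum((i,j)$(x.up(i,j) = 77), 1);
display nBounds;
```

Here attribute records are created for 10 combinations instead of 100.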

Clearing Memory of Unneeded Data

Sometimes a lot of memory space is used for data that is needed at some point, but not later. Consider the following simple example:

set i /1*1000/
    j /1*1000/;
parameter distance(i,j)
          cost(i,j);
distance(i,j) = 100+ord(i)+ord(j);
cost(i,j)     = 4+8*distance(i,j);

Assume that the parameter distance is used only here, but nowhere else in the program. Therefore users may wish to free the memory space occupied by the data connected with distance. The option clear may be used to achieve this:

option clear = distance;

This will reset all entries in the matrix associated with distance to zero.

Alternatively, an identifier that is no longer needed could be reset to its default value(s) with an assignment statement. In the example above, we could write:

distance(i,j) = 0;

This statement will have the same effect as the option statement. The advantage of the option statements is that they offer a more compact alternative that is particularly useful if equations or variables are to be cleared and multiple equation attributes or variable attributes are affected.
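
The effect of the option clear can be checked with the function card, as in the following minimal sketch. The sets are smaller than in the example above, and the scalars nBefore and nAfter are hypothetical names.

```gams
Set i / 1*100 /
    j / 1*100 /;
Parameter distance(i,j)
          cost(i,j);
distance(i,j) = 100+ord(i)+ord(j);
cost(i,j)     = 4+8*distance(i,j);

Scalar nBefore 'records stored for distance before clearing'
       nAfter  'records stored for distance after clearing';

nBefore = card(distance);
option clear = distance;
nAfter  = card(distance);
display nBefore, nAfter;
```

The parameter cost is not affected, since its values were computed before distance was cleared.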

Note that the dollar control options $clear and $kill may also be used to free memory. These are compile time directives, which have a similar effect on the memory consumption but have different side effects: while $clear will reset the values to their defaults, $kill will completely remove the identifier from the program. Hence an identifier that was "killed" may be used later in another declaration and definition statement. For example, the following code snippet is legal:

Set i /1, 2, 3/;
$kill i
Set i /a, b, c/;

With $clear instead of $kill this would cause a compilation error.

Setting Memory Limits with HEAPLIMIT

In a server environment and in other cases (e.g. to avoid the use of virtual memory) the amount of memory a GAMS run is allowed to use may have to be limited. The command line parameter heapLimit serves this purpose: the amount of memory for GAMS compilation and execution is limited to a specified number (in MB). If the data storage exceeds this limit, the job will be terminated with return code 10, out of memory. In addition, the function heapLimit may be used to interrogate the current limit and to reset it.

Note that limiting memory use for solver execution is not possible from within the GAMS program. However, some solvers like the NLP solver CONOPT have their own heapLimit option which ensures that the solver will not use more dynamic memory than specified.

Advice for Repairing Puzzling Nonworking Code

Assume a GAMS run was terminated and we cannot get a profile output (e.g. because GAMS ran out of memory and crashed). When a job fails with a memory overrun error, the buffer handling of the operating system generally loses the last few lines of profile information. How do we find the problem in this case?

We recommend using the techniques outlined in section Modeling Techniques above. In addition, successively deactivating code in search for the last GAMS statement that worked will help in most cases. This can be done by using comments. Once the run terminates properly, the user gradually reactivates parts of the statements that were deactivated last until the problem reappears. By iteratively activating and deactivating terms, the precise problematic terms may be found. The save and restart feature could also be used to save the results up to a certain statement and then to execute only the statements that are suspected to be problematic.
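
A minimal sketch of deactivating a suspect block with comments follows. It assumes a hypothetical program in which the assignment to b is suspected to cause the failure; the block between $onText and $offText is skipped by GAMS, so the rest of the program can be checked on its own.

```gams
Set i / i1*i5 /;
Parameter a(i), b(i), c(i);
a(i) = ord(i);

* Suspect statement, temporarily deactivated
$onText
b(i) = 1/(a(i) - 3);
$offText

c(i) = 2*a(i);
display a, c;
```

If the run terminates properly with the block deactivated, the block is reactivated piece by piece until the failure returns.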