A GAMS Tutorial by Richard E. Rosenthal

Introduction

Richard E. Rosenthal of the Naval Postgraduate School in Monterey, California wrote a detailed example of the use of GAMS for formulating, solving, and analyzing a small and simple optimization problem. The example is a quick overview of GAMS and its main features. Many references are made to other parts of the documentation, but they are only to tell you where to look for more details; the material here can be read profitably without reference to the rest of the documentation.

The example is an instance of the transportation problem of linear programming, which has historically served as a 'laboratory animal' in the development of optimization technology. [See, for example, Dantzig (1963) 1) ] It is a good choice for illustrating the power of algebraic modeling languages like GAMS because the transportation problem, no matter how large the instance at hand, possesses a simple, exploitable algebraic structure. You will see that almost all of the statements in the GAMS input file we are about to present would remain unchanged if a much larger transportation problem were considered.

In the familiar transportation problem, we are given the supplies at several plants and the demands at several markets for a single commodity, and we are given the unit costs of shipping the commodity from plants to markets. The economic question is: how much shipment should there be between each plant and each market so as to minimize total transport cost?

The algebraic representation of this problem is usually presented in a format similar to the following.

Indices:

$$i$$ = plants
$$j$$ = markets

Given Data:

$$a_i$$ = supply of commodity of plant $$i$$ (in cases)
$$b_j$$ = demand for commodity at market $$j$$
$$c_{ij}$$ = cost per unit shipment between plant $$i$$ and market $$j$$

Decision Variables:

$$x_{ij}$$ = amount of commodity to ship from plant $$i$$ to market $$j$$
where $$x_{ij} \geq 0$$, for all $$i$$, $$j$$

Constraints:

Observe supply limit at plant $$i$$:    $$\sum_j x_{ij} \leq a_i$$ for all $$i$$    (cases)
Satisfy demand at market $$j$$:    $$\sum_i x_{ij} \geq b_j$$ for all $$j$$    (cases)
Objective Function:    Minimize $$\sum_i \sum_j c_{ij} x_{ij}$$    ($K) Note that this simple example reveals some modeling practices that we regard as good habits in general and that are consistent with the design of GAMS. First, all the entities of the model are identified (and grouped) by type. Second, the ordering of entities is chosen so that no symbol is referred to before it is defined. Third, the units of all entities are specified, and, fourth, the units are chosen to a scale such that the numerical values to be encountered by the optimizer have relatively small absolute orders of magnitude. (The symbol$K here means thousands of dollars.)

The names of the types of entities may differ among modelers. For example, economists use the terms exogenous variable and endogenous variable for given data and decision variable, respectively. In GAMS, the terminology adopted is as follows: indices are called sets, given data are called parameters, decision variables are called variables, and constraints and the objective function are called equations.

The GAMS representation of the transportation problem closely resembles the algebraic representation above. The most important difference, however, is that the GAMS version can be read and processed by a computer.

Table 1: Data for the transportation problem (adapted from Dantzig, 1963) illustrates Shipping Distances from Plants to Markets (1000 miles) as well as Market Demands and Plant Supplies.

PlantsNew York Chicago Topeka Markets
Seattle 2.5 1.7 1.8 350
San Diego 2.5 1.8 1.4 600
Demands325 300 275 Supplies

The $title statement causes the subsequent text to be printed at the top of each page of output. Other available instructions are given in chapter Dollar Control Options. Error Messages When the GAMS compiler encounters an error in the input file, it inserts a coded error message inside the echo print on the line immediately following the scene of the offense. These messages always start with **** and contain a '$' directly below the point at which the compiler thinks the error occurred. The $is followed by a numerical error code, which is explained after the echo print. Several examples follow. Example 1 : Entering the statement set q quarterly time periods / spring, sos1, fall, wtr / ; results in the echo 1 set q quarterly time periods / spring, sos1, fall, wtr / ; ****$160

In this case, the GAMS compiler indicates that something is wrong with the set element sum. At the bottom of the echo print, we see the interpretation of error code 160:

Error Message
160  Unique element expected....

The problem is that sos1 is a reserved word, which can not be used as identifiers in general. The complete list of reserved words is shown in this chapter. Thus our set element must have a unique name like 'summer'. This is a common beginner's error.

Example 2 : Another common error is the omission of a semicolon preceding a direct assignment or equation definition. In our transportation example, suppose we omit the semicolon prior to the assignment of c(i,j), as follows.

Parameter c(i,j)  transport cost in 1000s of dollars per case
c(i,j)  = f * d(i,j)  /  1000  ;

Here is the resulting output.

16   Parameter c(i,j) transport cost in 1000s of dollars per case
17             c(i,j)  = f * d(i,j)/1000
****                     $97$195,96,409
Error Message
96  Blank needed between identifier and text
(-or- illegal character in identifier)
(-or- check for missing ';' on previous line)
97  Explanatory text can not start with '$', '=', or '..' (-or- check for missing ';' on previous line) 195 Symbol redefined with a different type 409 Unrecognizable item - skip to find a new statement looking for a ';' or a key word to get started again It is not uncommon for one little offense like our missing semicolon to generate five intimidating error messages. The lesson here is: concentrate on fixing the first error and ignore the other! The first error detected (in line 17), code 97, indicate that GAMS thinks the symbols in line 17 are a continuation of the documentary text at the end of line 16 rather than a direct assignment as we intended. The error message also appropriately advises us to check the preceding line for a missing semicolon. Unfortunately, you cannot always expect error messages to be so accurate in their advice. The compiler cannot read your mind. It will at times fail to comprehend your intentions, so learn to detect the causes of errors by picking up the clues that abound in the GAMS output. For example, the missing semicolon could have been detected by looking up the c entry in the cross-reference list (to be explained in the next section) and noticing that it was never assigned. SYMBOL TYPE REFERENCES c PARAM declared 15 ref 17 Example 3 : Many errors are caused merely by spelling mistakes and are caught before they can be damaging. For example, with 'Seattle' spelled in the table differently from the way it was introduced in the set declaration, we get the following error message. 4 sets 5 i canning plants /seattle, san-diego / 6 j markets /new-york, chicago, topeka / ; 7 8 table d(i,j) distance in thousand of miles 9 new-york chicago topeka 10 seatle 2.5 1.7 1.8 ****$170
11      san-diego    2.5      1.8            1.4   ;
Error Message
170  Domain violation for element

Example 4 : Similarly, if we mistakenly enter dem(j) instead of b(j) as the right-hand side of the demand constraint, the result is

45   demand(j)  ..   sum(i,  x(i,j)  )   =g=  dem(j)  ;
****                                             $140 Error Message 140 Unknown symbol Example 5 : The next example is a mathematical error, which is sometimes committed by novice modelers and which GAMS is adept at catching. The following is mathematically inconsistent and, hence, is not an interpretable statement. $\mbox{For all i},\quad \sum_i x_{ij} = 100$ There are two errors in this equation, both having to do with the co ntrol of indices. Index $$i$$ is over-controlled and index $$j$$ is under-controlled. You should see that index $$i$$ is getting conflicting orders. By appearing in the quantifier 'for all $$i$$', it is supposed to remain fixed for each instance of the equation. Yet, by appearing as an index of summation, it is supposed to vary. It can't do both. On the other hand, index $$j$$ is not controlled in any way, so we have no way of knowing which of its possible values to use. If we enter this meaningless equation into GAMS, both errors are correctly diagnosed. meaninglss(i) .. sum(i, x(i,j)) =e= 100 ; ****$125   $149 Error Messages 125 Set is under control already [This refers to set i.] 149 Uncontrolled set entered as constant [This refers to set j.] More information about error reporting is given in section Error Reporting. Comprehensive error detection and well-designed error messages are a big help in getting models implemented quickly and correctly. Reference Maps The next section of output, which is the last if errors have been detected, is a pair of reference maps that contain summaries and analyses of the input file for the purposes of debugging and documentation. The first reference map is a cross-reference map such as one finds in most modern compilers. It is an alphabetical, cross-referenced list of all the entities (sets, parameters, variables, and equations) of the model. The list shows the type of each entity and a coded reference for each appearance of the entity in the input. The cross-reference map for our transportation example is as follows (we do not display all tables). To turn the cross-reference map on, please add the line$onSymXRef

SYMBOL      TYPE   REFERENCES
a            PARAM declared        9  defined       10      ref       42
b            PARAM declared       13  defined       14      ref       44
c            PARAM declared       25 assigned       27      ref       40
cost         EQU   declared       36  defined       40 impl-asn       48
ref       46
d            PARAM declared       18  defined       18      ref       27
demand       EQU   declared       38  defined       44 impl-asn       48
ref       46
f            PARAM declared       23  defined       23      ref       27
i            SET   declared        4  defined        4      ref        9
18       25       27       30       37     2*40
2*42       44  control       27       40       42
44
j            SET   declared        5  defined        5      ref       13
18       25       27       30       38     2*40
42     2*44  control       27       40       42
44
supply       EQU   declared       37  defined       42 impl-asn       48
ref       46
transport    MODEL declared       46  defined       46 impl-asn       48
ref       48
x            VAR   declared       30 impl-asn       48      ref       33
40       42       44     2*50
z            VAR   declared       31 impl-asn       48      ref       40
48

For example, the cross-reference list tells us that the symbol a is a parameter that was declared in line 9, defined (assigned value) in line 10, and referenced in line 42. The symbol i has a more complicated entry in the cross-reference list. It is shown to be a set that was declared and defined in line 5. It is referenced once in lines 10, 19, 26, 28, 31, 38, 45 and referenced twice in lines 41 and 43. Set i is also used as a controlling index in a summation, equation definition or direct parameter assignment in lines 28, 41, 43 and 45.

For the GAMS novice, the detailed analysis of the cross-reference list may not be important. Perhaps the most likely benefit he or she will get from the reference maps will be the discovery of an unwanted entity that mistakenly entered the model owing to a punctuation or syntax error.

The second part of the reference map is a list of model entities grouped by type and listed with their associated documentary text. For example, this list is as follows.

SETS

i             canning plants
j             markets

PARAMETERS

a             capacity of plant i in cases
b             demand at market j in cases
c             transport cost in thousands of dollars per case
d             distance in thousands of miles
f             freight in dollars per case per thousand miles

VARIABLES

x             shipment quantities in cases
z             total transportation costs in thousands of dollars

EQUATIONS

cost          define objective function
demand        satisfy demand at market j
supply        observe supply limit at plant i

MODELS

transport

Equation Listings

Once you succeed in building an input file devoid of compilation errors, GAMS is able to generate a model. The question remains, and only you can answer it, does GAMS generate the model you intended?

The equation listing is probably the best device for studying this extremely important question. A product of the solve command, the equation listing shows the specific instance of the model that is created when the current values of the sets and parameters are plugged into the general algebraic form of the model. For example, the generic demand constraint given in the input file for the transportation model is

demand(j)  ..   sum(i, x(i,j))  =g=  b(j)  ;

while the equation listing of specific constraints is

--------demand   =g=   satisfy demand at market  j
demand(new-york)..  x(seattle, new-york) +x(san-diego, new-york)  =g=  325  ;
demand(chicago)..  x(seattle, chicago) +x(san-diego, chicago )  =g=  300  ;
demand(topeka)..  x(seattle, topeka) +x(san-diego, topeka)  =g=  275 ;

The default output is a maximum of three specific equations for each generic equation. To change the default, insert an input statement prior to the solve statement:

option limrow  =  r ;

where r is the desired number.

The default output also contains a section called the column listing, analogous to the equation listing, which shows the coefficients of three specific variables for each generic variable. This listing would be particularly useful for verifying a GAMS model that was previously implemented in MPS format. To change the default number of specific column printouts per generic variable, the above command can be extended:

option limrow  =  r,  limcol  = c ;

where c is the desired number of columns. (Setting limrow and limcol to 0 is a good way to reduce the size of your lst file after your model has been debugged.) In nonlinear models, the GAMS equation listing shows first-order Taylor approximations of the nonlinear equations. The approximations are taken at the starting values of the variables.

Model Statistics

The last section of output that GAMS produces before invoking the solver is a group of statistics about the model's size, as shown below for the transportation example.

MODEL STATISTICS

BLOCKS OF EQUATIONS       3     SINGLE EQUATIONS        6
BLOCKS OF VARIABLES       2     SINGLE VARIABLES        7
NON ZERO ELEMENTS        19

The BLOCK counts refer to the number of generic equations and variables. The SINGLE counts refer to individual rows and columns in the specific model instance being generated. For nonlinear models, some other statistics are given to describe the degree of non-linearity in the problem.

Status Reports

After the solver executes, GAMS prints out a brief Solve Summary whose two most important entries are SOLVER STATUS and the MODEL STATUS. For our transportation problem the solve summary is as follows:

S O L V E      S U M M A R Y

MODEL   TRANSPORT           OBJECTIVE  z
TYPE    LP                  DIRECTION  MINIMIZE
SOLVER  CPLEX               FROM LINE  49

**** SOLVER STATUS     1 Normal Completion
**** MODEL STATUS      1 Optimal
**** OBJECTIVE VALUE              153.6750

RESOURCE USAGE, LIMIT          0.031      1000.000
ITERATION COUNT, LIMIT         4    2000000000

The status reports are preceded by the same **** string as an error message, so you should probably develop the habit of searching for all occurrences of this string whenever you look at an output file for the first time. The desired solver status is 1 NORMAL COMPLETION, but there are other possibilities, documented in section The Solve Summary, which relate to various types of errors and mishaps.

There are eleven possible model statuses, including the usual linear programming termination states (1 Optimal, 3 Unbounded, 4 Infeasible), and others relating to nonlinear and integer programming. In nonlinear programming, the status to look for is 2 Locally Optimal. The most the software can guarantee for nonlinear programming is a local optimum. The user is responsible for analyzing the convexity of the problem to determine whether local optimality is sufficient for global optimality. In integer programming, the status to look for is 8 Integer Solution. This means that a feasible integer solution has been found. More detail follows as to whether the solution meets the relative and absolute optimality tolerances that the user specifies.

Solution Reports

If the solver status and model status are acceptable, then you will be interested in examining the results of the optimization. The results are first presented in as standard mathematical programming output format, with the added feature that rows and columns are grouped and labeled according to names that are appropriate for the specific model just solved. In this format, there is a line of printout for each row and column giving the lower limit, level, upper limit, and marginal. Generic equation block and the column output group the row output by generic variable block. Set element names are embedded in the output for easy reading. In the transportation example, the solver outputs for supply(i), demand(j), and x(i,j) are as follows:

---- EQU supply      observe supply limit at plant i

LOWER     LEVEL     UPPER    MARGINAL

seattle       -INF    350.000   350.000      EPS
san-diego     -INF    550.000   600.000      .

---- EQU demand      satisfy demand at market j

LOWER     LEVEL     UPPER    MARGINAL

new-york   325.000   325.000     +INF      0.225
chicago    300.000   300.000     +INF      0.153
topeka     275.000   275.000     +INF      0.126

---- VAR x           shipment quantities in cases

LOWER     LEVEL     UPPER    MARGINAL

seattle  .new-york      .       50.000     +INF       .
seattle  .chicago       .      300.000     +INF       .
seattle  .topeka        .         .        +INF      0.036
san-diego.new-york      .      275.000     +INF       .
san-diego.chicago       .         .        +INF      0.009
san-diego.topeka        .      275.000     +INF       .

The single dots '.' in the output represent zeroes. The entry EPS, which stands for epsilon, mean very small but nonzero. In this case, EPS indicates degeneracy. (The slack variable for the Seattle supply constraint is in the basis at zero level. The marginal is marked with EPS rather than zero to facilitate restarting the optimizer from the old basis.)

If the solvers results contain either infeasibilities or marginal costs of the wrong sign, then the offending entries are marked with INFES or NOPT, respectively. If the problem terminates unbounded, then the rows and columns corresponding to extreme rays are marked UNBND.

At the end of the solvers solution report is a very important report summary, which gives a tally of the total number of non-optimal, infeasible, and unbounded rows and columns. For our example, the report summary shows all zero tallies as desired.

**** REPORT SUMMARY :        0     NONOPT
0 INFEASIBLE
0  UNBOUNDED

After the solver's report is written, control is returned from the solver back to GAMS. All the levels and marginals obtained by the solver are entered into the GAMS database in the .l and .m fields. These values can then be transformed and displayed in any desired report. As noted earlier, the user merely lists the quantities to be displayed, and GAMS automatically formats and labels an appropriate array. For example, the input statement.

display x.l, x.m  ;

results in the following output.

----     50 VARIABLE  x.L           shipment quantities in cases

new-york     chicago      topeka

seattle        50.000     300.000
san-diego     275.000                 275.000

----     50 VARIABLE  x.M           shipment quantities in cases

chicago      topeka

seattle                     0.036
san-diego       0.009

As seen in reference maps, equation listings, solution reports, and optional displays, GAMS saves the documentary text and 'parrots' it back throughout the output to help keep the model well documented.

Summary

This tutorial has demonstrated several of the design features of GAMS that enable you to build practical optimization models quickly and effectively. The following discussion summarizes the advantages of using an algebraic modeling language such as GAMS versus a matrix generator or conversational solver.

• By using an algebra-based notation, you can describe an optimization model to a computer nearly as easily as you can describe it to another mathematically trained person.
• Because an algebraic description of a problem has generality, most of the statements in a GAMS model are reusable when new instances of the same or related problems arise. This is especially important in environments where models are constantly changing.
• You save time and reduce generation errors by creating whole sets of closely related constraints in one statement.
• You can save time and reduce input errors by providing formulae for calculating the data rather than entering them explicitly.
• The model is self-documenting. Since the tasks of model development and model documentation can be done simultaneously, the modeler is much more likely to be conscientious about keeping the documentation accurate and up to date.
• The output of GAMS is easy to read and use. The solution report from the solver is automatically reformatted so that related equations and variables are grouped together and appropriately labeled. Also, the display command allows you to modify and tabulate results very easily.
• If you are teaching or learning modeling, you can benefit from the insistence of the GAMS compiler that every equation be mathematically consistent. Even if you are an experienced modeler, the hundreds of ways in which errors are detected should greatly reduce development time.
• By using the dollar operator and other advanced features not covered in this tutorial, one can efficiently implement large-scale models. Specific applications of the dollar operator include:
1. It can enforce logical restrictions on the allowable combinations of indices for the variables and equations to be included in the model. You can thereby screen out unnecessary rows and columns and keep the size of the problem within the range of solvability.
2. It can be used to build complex summations and products, which can then be used in equations or customized reports.
3. It can be used for issuing warning messages or for terminating prematurely conditioned upon context-specific data edits.

1) Dantzig, George B. (1963). Linear Programming and Extensions, Princeton University Press, Princeton N.J.