# Misbehaving Model – Infeasible

by Tim Johannessen Excerpts from Bruce McCarl's Newsletter infeasible artificial variables

I was talking to the GAMS staff and they informed me as to what their most common supportcalls involve. One of them involves fixing bad results. In this and the next couple of newsletters. I will cover how to diagnose problems within models that don’t work right. Also in the nextsection I will cover an integer programming issue that commonly comes up.

Years ago I started writing a book length item called - So Your GAMS Model Didn’t WorkRight: A Guide to Model Repair but life got busy and I never finished it. (If someone wanted tohelp finish it please contact me). Here I will adapt material in it on infeasible models forpresentation here. So here we go.

Infeasibility is always a possible outcome when solving models. Simplex based linearprogramming solvers handle infeasibility through a two or three step solution approach. First,there may be some presolution calculations which may determine a model cannot be madefeasible (as done for example by the CPLEX and CONOPT PRESOLVEs). Second, there isusually a Phase I operation wherein the sum of a set of implicitly added artificial variables isminimized. During this phase, the problem is artificially rendered feasible. Third, if theartificial variable values are all driven to zero, then the problem is declared feasible and thesolver turns to the real objective function and proceeds toward optimally.

However, the problem may be declared infeasible by the presolve or the Phase I. In such cases, the information content of the output differs between solvers and may not be very helpful.

### Causes of Infeasible Models

Causes of infeasibility are not always easily identified. Solvers may report a particular equation as infeasible in cases where an entirely different equation is the cause. Consider the following example,

Max 50 x1 +50 x2
x1 + x2 ≤  50
50 x1 + x2 ≤  65
x1      ≥  20
x1,  x2 ≥   0

In this example, the interaction between the constraint x1 ≥ 20, the constraint immediately above it, and the nonnegativity condition on x2 render the model infeasible while the first constraint has nothing to do with the infeasibility. There may be several potential explanations as to why the infeasibility is present. The 65 on the right hand side of the second constraint may be a data entry error, perhaps a number in excess of 1000 was intended. Similarly, the 50 coefficient multiplying x1 in the second constraint may be an error with a number more like 0.50 or a negative entry intended. Third, the limit requiring x1 ≥ 20 may be misspecified with the RHS really intended to be 0.20. Fourth, perhaps the X2 variable should have been allowed to be negative. Fifth, there could be multiple errors involving several of the above cases. Runs with CPLEX, BDMLP and MINOS5 resulted in the marking of either the x1 ≥ 20 or the second constraint as the infeasible item. This may or may not be a proper identification of the problem causing mistake.

This illustrates a general point that infeasibilities occur because of the interaction of multiple balance on variables and the restrictions imposed by equations. In more complex models a larger set of constraints and bounds could be involved and there may be thousands of other variable bounds and constraints that have nothing to do with the infeasibility. Thus, we need procedures to identify the infeasibility causing set of variable and equation restrictions. In turn then we can look for the root cause of the infeasibility in that narrowed down set.

#### Finding Causes of Infeasibility -- Basic Theory

There are two approaches I recommend for finding the set of infeasibility causing equations. The first approach relies on “artificial” variables and the second involves use of infeasibility finders that appear in a few solvers (i.e. the CPLEX conflict refiner, the IIS as in BARON, GUROBI and XPRESS; and the feasibility relaxation in CPLEX and GUROBI). I will only cover artificial variables here as their usage works with all solvers.

Artificial variables are covered in virtually every introductory linear programming course or book. An artificial is an added variable that did not appear in the original model which is structured to make sure that the where it is inserted can all always be satisfied. Additionally, the model objective function is modified to provide a strong incentive to drive the artificial variables to zero. There are two ways such incentives are entered: through the “Big M” penalty method or through the “Phase I/ Phase II” optimization approach. Here we only cover the Big M method. In that case we augment the above model with an artificial variable as follows

Max  50 x1 + 50 x2 -1000000 A
x1 +    x2             ≤ 50
50 x1 +    x2             ≤ 65
x1 +             A     ≥ 20
x1,     x2,      A     ≥  0

with A being the artificial variable and the -1000000 being the objective function penalty. Here we only add one such variable but more generally artificials would be entered for each model equation which is not be satisfied when all decision variables equal zero (i.e., when x1 = 0 the x1 ≥ 20 constraint is not satisfied). In general, the added artificials would have a large, undesirable objective function coefficient (the so called “Big M value”) and an entry in an associated potentially infeasible equation.

#### Identifying Infeasible Causes

Now let us go introduce the procedure for identifying the set of infeasibility causing constraints and variable bounds. This is done by solving the problem with artificials added and then using the solution information to identify the infeasibility causing equation and variable bound set. Now we illustrate the procedure using the example.

A GAMS formulation of the above problem after including the artificial is

     variable            objmax;     positive variables  x1, x2, A     equations           obj,r1,r2,r3;     obj..    objmax =e= 50*x1 +50*x2-1000000*A       ;     r1..                   x1 +   x2           =L= 50;     r2..                50*x1 +   x2           =L= 65;     r3..                   x1               +A =G= 20;     model infe /all/;     solve infe using lp maximizing objmax;

Note here it is possible that a solver with a presolve can eliminate some equations and in turn possibly not report the proper shadow prices. (when using a solver with a presolve one might suppress it, for example using the CPLEX option preind 0 or the XPRESS option presolve 0).

The resultant relevant part of the Big M solution is

     **** SOLVER STATUS      1 Normal Completion
**** MODEL STATUS       1 Optimal
**** OBJECTIVE VALUE         -18699935.0000
LOWER    LEVEL    UPPER    MARGINAL
---- EQU obj              .        .        .        1.000
---- EQU r1             -INF      1.300   50.000      .
---- EQU r2             -INF     65.000   65.000 20001.000
---- EQU r3             20.000   20.000    +INF  -1.000E+6

LOWER    LEVEL    UPPER    MARGINAL
---- VAR objmax        -INF   -1.870E+7    +INF       .
---- VAR x1               .       1.300    +INF       .
---- VAR x2               .        .       +INF  -1.995E+4
---- VAR A                .      18.700    +INF       .

Where note the marginals on r2, r3 and x2 are large.

The question then is: So what? When a linear program is solved, the optimum solution contains items which are influenced by the objective function parameters for the non-zero variables. In particular the GAMS marginals or more classically, the shadow prices, reduced costs and objective function value.

So in this case with the artificial is in the basis, then relaxation of some of the right hand sides will cause that artificial to become smaller and they will have artificially large shadow prices. Similarly the reduced costs on some variables will be influenced by the presence of the artificial indicating that the artificial would get smaller if that variable could go below its lower bound (going negative in this case) or above its upper. Note this is the case for the marginals on r2, r3 and x2 which jointly indicates equations r1, and r2 plus x2>0 is the infeasibility causing set.

#### General procedure for finding the infeasibility causing set

The following gives the steps for finding infeasibility causes using the Big M, artificial variable approach.

Step 1 Identify the relevant equations and/or variable bounds for which artificials are needed to be added (details about this in next section)

Step 2 Add artificial variables to those equations and bounds. These artificials each have a Big M penalty in the objective function and an entry in a single constraint.

Step 3 Solve the model

Step 4 Examine the model solution. Where the marginals (the reduced costs for the variables and the shadow prices for the equations) are distorted by the presence of the artificials, identify those as the variables and equations to be examined for the cause of infeasibility (note once identified then the modeler needs to examine those in the context of the problem hopefully finding the issue).

Step 5 Fix the model and repeat the process if needed

There are several questions inherent in the above procedure. In particular: Where should artificial variables be added? How should the artificial variables be structured? and How does one find a “distorted” marginal? Each is discussed below.

### Where Should one add Artificial Variables?

The places where artificial variables should be added can be determined in several ways. One could look at the model solution and enter artificials in the equations and/or variable bounds marked by the solver as infeasible. However, while this sometimes points to proper places, it does not always do such. The approach advocated here is to add artificials in all possible infeasible locations although a different approach is in order for newly modified models as discussed below.

Programming models will only be infeasible when setting all the decision variables equal to zero is not feasible. This occurs when: a) the interval between variable upper and lower bounds does not include zero; or b) equations appear which are not satisfied when all variables are set to zero.

The equation cases which are not satisfied when the variables are set equal to zero are:

1. Less than or equal to constraints with a negative right side i.e. x ≤ -1
2. Greater than or equal to constraints with a positive right side. i.e. x ≥1
3. Equality constraints with a nonzero right side i.e. x = 1 or x = -1

Additionally, when the interval between a variable’s lower and upper bounds does not include zero then those bounds need to be converted to constraints with artificials. This will occur when:

1. The lower bound is positive, or
2. The upper bound is negative

The ADVISORY and NONOPT -- IDENTIFY procedures in GAMSCHK have been written to create a list of all occurrences of these five cases.

#### Adding artificials in newly modified models

When a model that was feasible before has been newly modified and goes infeasible this raises a different artificial adding procedure. Namely, one can just add the artificials into the newly added constraints and/or bounds. Therein one may need to add artificials to newly added constraints or bounds that in fact are satisfied when all the variables are set to zero. This arises because the new constraints are possible members of the infeasibility causing set through their interaction with other constraints that previously could be satisfied. Here artificials would be added to the new =L= constraints with positive right hand sides or new =G= constraints with a negative right hand side. The exact structure these artificials follows the rules given in the next section.

#### Entering Artificial Variables in GAMS

Once one has found where to add the artificial variables one still has to address: How they should be added? and What should they look like? I recommend using the following general rules address this question.

1. In general a new GAMS variable that is specified to be positive should be defined for each identified potential infeasibility causing equation. This variable should have the same dimension as does the equation. Thus, if artificials are added to RESOUREQ(PLANT,RESOURCE), then the artificial should look like ARTRESOURQ(PLANT,RESOURCE).
2. Artificials should be entered on the left-hand side of the equation with a coefficient of +1 in =L= equations and -1 in =G= equations. When the equation is an =E= the artificial should be a +1 if the equation right hand side is positive and -1 if it is negative.

Note, one cannot add an artificial into variable bounds which are defined using LO, .UP or .FX syntax. One needs to convert these to =G=, to =L= and =E= equations and then add the artificials following the above rules on signs.

One will also need to enter a large objective function penalty for the artificials. The coefficient will be negative when the objective function is maximized and when it is minimized. The magnitude of this penalty is entirely problem dependent and can cause numerical problems in the solver. All that can be said in general is that the penalty should dwarf the other objective function coefficients and should be large enough so that the artificial is driven to zero in any feasible model.

Alternatively, if numerical problems are plaguing the solution with the artificials entered, one can modify the objective function to one which minimizes some of the artificials. In the above example the simplest way to do this is to multiply the original objective function by zero and convert the penalty on the artificial to something like hundred so that the shadow prices are not extremely small. This means the model becomes

     variable              objmax;
positive variables    x1, x2, A
equations             obj,r1,r2,r3;
obj..   objmax =e= 0*(50*x1 +50*x2)-100*A       ;
r1..                     x1 +   x2        =L= 50;
r2..                  50*x1 +   x2        =L= 65;
r3..                     x1            +A =G= 20;
model infe /all/;
OPTION LP=BDMLP;
solve infe using lp maximizing objmax;
And yields a solution
     **** OBJECTIVE VALUE     -1870.0000
LOWER     LEVEL     UPPER     MARGINAL
---- EQU obj           .         .         .        1.000
---- EQU r1          -INF       1.300     50.000     .
---- EQU r2          -INF      65.000     65.000    2.000
---- EQU r3          20.000    20.000     +INF   -100.000
LOWER     LEVEL     UPPER    MARGINAL
---- VAR objmax      -INF   -1870.000    +INF       .
---- VAR x1            .        1.300    +INF       .
---- VAR x2            .         .       +INF     -2.000
---- VAR A             .       18.700    +INF       .
Which again identifies the same infeasibility causing set.

#### How Are Distorted Marginals Identified?

The next question involves finding the distorted marginals. Under the BIG M method one reviews the output in the GAMS LST file reproduced above looking for marginals that have large absolute values or on our nonzero when working with the case just above where the original objective function was multiplied by zero. However, in models with thousands of variables and equations this information can be widely dispersed and difficult to find. The GAMSCHK procedure NONOPT can do this for you as when run it will automatically list out all items with marginals larger in absolute value than 10 to a filter value that is set through the Gams check option file (MARGFILT) for example adding the following to your code.

$onEOLcom Modelname.optfile=1; //replace red part with the name of your mode) *write solver option file for gamschk File opt /gamschk.opt/; Put opt$onput
Margfilt 1
$offput Putclose; *write gck file telling gamschk what to do File gck /%system.fn%.gck/; Put gck$onput
nonopt
\$offput
Putclose;
*invoke gamschk as lp solver
Option LP=GAMSCHK;

### Can you find what I did to the model?

If you want to play around with this a little bit here is a challenge. I made a version of the gams model library model Egypt (where I lengthened the abominably short choice of parameter and set names so the model was more easily comprehended). In that model I made a few subtle modifications that caused to be infeasible. I have also added in the needed code to add needed artificials (to get them you need to activate the set global statement in first line in the code) plus I added in stuff to start up gamschk (again needing activation by removing the * from column one in the line just before the word gamschk appears in the code). As a hint my modifications involve nutrition and land availability.

from Bruce McCarl's GAMS Newsletter No 40, May 2017