Scaling

by Tim Johannessen Excerpts from Bruce McCarl's Newsletter scaling


Sometimes models do odd things like reporting problems as infeasible, stuck or falsely optimal when scaling is the real issue. To avoid this or correct such issues it is often desirable to check scaling and in turn rescale the model or ask the solvers to employ more aggressive scaling.

In terms of solver scaling most LP/MIP solvers do automatic scaling and a number have the option to apply a more aggressive scaling to numerically difficult models, e.g. Cplex option scanind, Gurobi option scaleflag, or Xpress option scaling. However, modelers can typically do better because they know how certain variables and equations are interrelated and can scale with common factors while the solvers do not have that knowledge. Generally, for difficult problems it is desirable to do the scaling as discussed below then let the solver scale as well. Scaling is typically not a concern for small problems.

Ideally, in doing a scaling exercise, one should modify the model so the absolute value of the constraint matrix coefficients are centered around one including the derivatives of any nonlinear terms (at the starting point). This involves manipulating variable and equation scaling. The target of this manipulation is after scaling when dividing the largest matrix coefficient by the smallest the ratio should be no more than 1000 to 10000 ie the coefficient would span from 0.01 to 100 or something like that. Arne Drud author of CONOPT in his CONOPT3 solver manual indicates

  1. Basic and superbasic solution values are expected to be around 1, e.g. from 0.01 to 100. Nonbasic variables will be at a bound, and the bound values should not be larger than say 100.
  2. Dual variables (or marginals) on active constraints are expected to be around 1, e.g. from 0.01 to 100. Dual variables on non-binding constraints will of course be zero.
  3. Derivatives (or Jacobian elements) are expected to be around 1, e.g. from 0.01 to 100.
  4. You should select the unit of measurement for the variables so their expected value is around unity.
  5. After you have selected units for the variables you should select the unit of measurement for the equations so the expected values of the individual terms are around one.

One can do this external to GAMS or can employ scaling within GAMS. The basic concept is discussed in chapter 17 of McCarl and Spreen here and involves altering all coefficients for a variable by the same amount and doing the same for all coefficients in an equation. Consider the following example

Max   X1 -   500X2   -   400X3   -   500X4
s.t.  X1 - 10000X2   -  8000X3               ≤   0
                X2   +     4X3   -    50X4   ≤   0
            1500X2   +  2000X3               ≤  1200
              50X2   +    45X3               ≤   60
      X1        X2          X3          X4    ≥   0
Which can be scaled to become
Max 20X1 -      X2   -      X3   -   0.1X4
s.t.  X1 -      X2   -      X3               ≤   0
                X2   +      X3   -      X4   ≤   0
                X2   + 1.667X3               ≤   0.8
                X2   + 1.125X3               ≤   1.2
       X1'       X2'         X3'         X4'   ≥   0

This would be done by 

  1. Dividing the coefficients in the first constraint by 1000.  
  2. Dividing the coefficients in the second constraint by 5
  3. Dividing the coefficients in the third constraint by 1500, 
  4. Dividing the coefficients in the fourth constraint by 50.
  5. Dividing the coefficients in the objective coefficient by 50.
  6. Multiplying the coefficients for X1 by 1000
  7. Leaving the coefficients for X2 alone 
  8. Multiplying the coefficients for X3 by 1.25
  9. Multiplying the coefficients for X4 by 1/10

Both the original model and the manually scaled one are in the example scale2.gms.

In turn after solution of this scaled model, one can recover the scaled solution by appropriately multiplying and dividing by the scaling factors as explained in McCarl and Spreen.  Specifically one would recover the solution to the original model by multiplying the scaled model solution values for variables by the variable scaling factor (so X1 would be the scaled solution level for X1 times 1000). Similarly the variable marginals would be multiplied by the objective function scaling factor in turn divided by the variable scaling factor (so for X1 the marginal would be the scaled marginal times  0/10000).  At the same time, the equation slacks or levels would be multiplied by the row scaling factor (for the first equation multiplied by 1000) and the equation marginals times the objective function scaler divided by the constraint scaler (for equation 1 multiply by 50/1000).  Finally, the objective function value would be multiplied by the objective function scalar.

Two questions arise from this.  First, do you really need to do the manual scaling and descaling? Second, how do I form the scaling factors?

On the first question, you are in luck.  GAMS has an easy way of specifying the scaling factors and automatically descales the solution so no manual procedures are required.  This is covered in the McCarl Expanded User Guide in the scaling section. For the example above the example scale2.gms contains an implementation of the GAMS scaling procedure and the relevant part is.

PARAMETER SCALEPROC(PROCESS)  VARIABLE SCALING
           / x1 10000, x2        1, x3        1.25,x4        0.1/
PARAMETER SCALERESOR(RESOURCE)  EQUATION SCALING
   /c1 10000, c2 5, c3 1500,c4 50/;
scalar rhsscale /1/
scalar objscale /500/;
PRODUCTION.scale(PROCESS)=SCALEPROC(PROCESS);
objt.scale=objscale;
AVAILABLE.scale(RESOURCE)=SCALERESOR(RESOURCE);
RESALLOC.scaleopt=1;

Where the code specifies the scaling factors, copies them into the GAMS scaling mechanism in turn specifies a model attribute that tells GAMS to solve the scaled model.

On the second question, one can get support on identifying coefficient magnitudes via the following approaches:

  1. One can following Drud's CONOPT3 document, use LIMROW, LIMCOL and search for large or small magnitude numbers within columns and rows then devise scaling factors. 
  2. One can use GAMSCHK (see the solver manual on this here) with first the BLOCKLIST or BLOCKPIC commands to find blocks of variables or equations with large or small numbers or consistent departures from values of one and then later MATCHIT to find individual cases.  These GAMSCHK procedures give largest and smallest magnitudes of coefficients in equations and variables so you don’t have to search.  This can also be supported by use of DISPLAYCR.
  3. Additionally GAMS personnel mentioned use of the CONVERT solver with the option 'jacobian'. This gives a gdx file with that contains the matrix coefficients which for nonlinear terms the gradients of the nonlinear terms (evaluated at the starting point- as limrow/col and GAMSCHK does). Then one can export that file to Excel and manipulate. Note when doing this Convert replaces all the internal variable and equation names. Subsequently the variables are named x1,x2,x3,... and the equations e1,e2,e3,...>  To get back to the original names one needs to manually back translate using the CONVERT generated dictionary file.  I have no experience with this but it just does not seem practical in large models as the GDX and Excel might be unwieldy and the back translation awkward.
  4. GAMS personnel also told me about a new option in Cplex 12.7 has a new option (datacheck=2) that "looks and reports" suspicious item in a model. In turn the log file (not the LST file) contains  information about large and small coefficients, rhs values and anything else Cplex thinks can cause numerical instability.  I tried this on what I considered an intermediate sized model and got messages like
CPLEX Warning  1043: Detected righthand side <= CPX_FEAS_TOL at constraint 'AGTILLSTART(US.TxTranspec.Cropland.Zero)'.
CPLEX Warning  1045: Detected nonzero <= CPX_CANCEL_TOL at constraint 'WELFAR', variable 'AGDEMANDS(2015.US.dom_demand.GrpfrtFrsh_White_Fla.2)'.
CPLEX Warning  1045: Detected nonzero <= CPX_CANCEL_TOL at constraint 'WELFAR', variable 'AGDEMANDS(2015.US.dom_demand.GrpfrtFrsh_White_Fla.18)'.
CPLEX Warning  1045: Detected nonzero <= CPX_CANCEL_TOL at constraint 'WELFAR', variable 'AGDEMANDS(2015.US.dom_demand.GrpfrtFrsh_White_Fla.26)'.
CPLEX Warning  1045: Detected nonzero <= CPX_CANCEL_TOL at constraint 'WELFAR', variable 'AGDEMANDS(2015.US.dom_demand.GrpfrtFrsh_White_Fla.41)'.
CPLEX Warning  1045: Detected nonzero <= CPX_CANCEL_TOL at constraint 'WELFAR', variable 'AGDEMANDS(2015.US.dom_demand.GrpfrtFrsh_White_Fla.45)'.
CPLEX Warning  1045: Detected nonzero <= CPX_CANCEL_TOL at constraint 'WELFAR', variable 'AGDEMANDS(2015.US.dom_demand.GrpfrtFrsh_White_Fla.49)'.
CPLEX Warning  1045: Detected nonzero <= CPX_CANCEL_TOL at constraint 'WELFAR', variable 'AGDEMANDS(2015.US.dom_demand.GrpfrtFrsh_White_Fla.60)'.
CPLEX Warning  1045: Detected nonzero <= CPX_CANCEL_TOL at constraint 'WELFAR', variable 'AGDEMANDS(2015.US.dom_demand.GrpfrtFrsh_White_Fla.71)'.
CPLEX Warning  1045: Detected nonzero <= CPX_CANCEL_TOL at constraint 'WELFAR', variable 'AGDEMANDS(2015.US.dom_demand.GrpfrtFrsh_Red_Tex.2)'.
CPLEX Warning  1045: Detected nonzero <= CPX_CANCEL_TOL at constraint 'WELFAR', variable 'AGDEMANDS(2015.US.dom_demand.GrpfrtFrsh_Red_Tex.18)'.
CPLEX Warning  1045: Too many warnings of this type have been detected.  All further warnings of this type will be ignored.
CPLEX Warning  1047: Decimal part of coefficients in constraint 'WELFAR' are fractions and can be scaled with 17/1.
CPLEX Warning  1047: Decimal part of coefficients in constraint 'AGPRODBAL(2015.US.CB.RefSugar.base)' are fractions and can be scaled with 47/1.
CPLEX Warning  1047: Decimal part of coefficients in constraint 'AGPRODBAL(2015.US.CB.CornforDairyCattle.base)' are fractions and can be scaled with 70/1.
CPLEX Warning  1047: Decimal part of coefficients in constraint 'AGPRODBAL(2015.US.GP.Oats.base)' are fractions and can be scaled with 71/1.
CPLEX Warning  1047: Decimal part of coefficients in constraint 'AGPRODBAL(2015.US.GP.Canola.base)' are fractions and can be scaled with 11/1.
CPLEX Warning  1047: Decimal part of coefficients in constraint 'AGPRODBAL(2015.US.GP.RefSugar.base)' are fractions and can be scaled with 47/1.
CPLEX Warning  1047: Decimal part of coefficients in constraint 'AGPRODBAL(2015.US.GP.CornforDairyCattle.base)' are fractions and can be scaled with 70/1.
CPLEX Warning  1047: Decimal part of coefficients in constraint 'AGPRODBAL(2015.US.LS.Oats.base)' are fractions and can be scaled with 71/1.
CPLEX Warning  1047: Decimal part of coefficients in constraint 'AGPRODBAL(2015.US.LS.RefSugar.base)' are fractions and can be scaled with 47/1.
CPLEX Warning  1047: Decimal part of coefficients in constraint 'AGPRODBAL(2015.US.LS.CornforDairyCattle.base)' are fractions and can be scaled with 70/1.
CPLEX Warning  1047: Too many warnings of this type have been detected.  All further warnings of this type will be ignored.
CPLEX Warning  1048: Detected constraint with wide range of coefficients. In constraint 'WELFAR' the ratio of largest and smallest (in absolute value) coefficients is 4.48036e+020.
CPLEX Warning  1048: Detected constraint with wide range of coefficients. In constraint 'AGPRODBAL(2015.US.CB.Hay.base)' the ratio of largest and smallest (in absolute value) coefficients is 100000.
CPLEX Warning  1048: Detected constraint with wide range of coefficients. In constraint 'AGPRODBAL(2015.US.CB.CottonseedMeal.base)' the ratio of largest and smallest (in absolute value) coefficients is 100000.
CPLEX Warning  1048: Detected constraint with wide range of coefficients. In constraint 'AGPRODBAL(2015.US.CB.CottonseedOil.base)' the ratio of largest and smallest (in absolute value) coefficients is 776300.
CPLEX Warning  1048: Detected constraint with wide range of coefficients. In constraint 'AGPRODBAL(2015.US.GP.Hay.base)' the ratio of largest and smallest (in absolute value) coefficients is 100000.
CPLEX Warning  1048: Detected constraint with wide range of coefficients. In constraint 'AGPRODBAL(2015.US.GP.CottonseedMeal.base)' the ratio of largest and smallest (in absolute value) coefficients is 100000.
CPLEX Warning  1048: Detected constraint with wide range of coefficients. In constraint 'AGPRODBAL(2015.US.GP.CottonseedOil.base)' the ratio of largest and smallest (in absolute value) coefficients is 776300.
CPLEX Warning  1048: Detected constraint with wide range of coefficients. In constraint 'AGPRODBAL(2015.US.LS.Silage.base)' the ratio of largest and smallest (in absolute value) coefficients is 101350.
CPLEX Warning  1048: Detected constraint with wide range of coefficients. In constraint 'AGPRODBAL(2015.US.LS.CottonseedOil.base)' the ratio of largest and smallest (in absolute value) coefficients is 776300.
CPLEX Warning  1048: Detected constraint with wide range of coefficients. In constraint 'AGPRODBAL(2015.US.NE.CottonseedOil.base)' the ratio of largest and smallest (in absolute value) coefficients is 776300.
CPLEX Warning  1048: Too many warnings of this type have been detected.  All further warnings of this type will be ignored

Across these, not surprisingly since I wrote it, I prefer the GAMSCHK where BLOCKLIST and MATCHIT tell you where the big and small numbers are and then I use targeted displays (through DISPLAYCR) to look at things.  The CPLEX information backed by GAMSCHK DISPLAYCR also looks god but is not available for other solver.Note when using the GAMS internal scaling feature the all of the above mentioned procedures report out the coefficients after the GAMS internal scaling has been applied. The above GAMSCHK features are implemented at the bottom of the scale2.gms example and LIMROW is set large enough in there to display the model. 


from Bruce McCarl's GAMS Newsletter No 41, July 2017

Archive of all Newsletters