|
GAMS/DEA |
| Slice Modeling in GAMS |
|
Adding GAMS/DEA to your GAMS system enables you to solve linear and
mixed integer Data Envelopment Analysis (DEA) programs, as well as other
linear, mixed integer and simple quadratic slice models, efficiently.
GAMS/DEA is part of the regular distribution since 21.6.
|
| Introduction |
A slice model is a group of mathematical programs with the same structure but
different data. The name slice comes from the idea that individual programs
within the model are defined by identifying their corresponding slices of data.
The mathematical formulation for the k-th slice model is given by
| (1) |
minx | fk(x) | (objective slice) |
| subject to | Akx = bk | (slice constraints) |
| x in X | (core constraints) |
DEA models are examples of slice models. The basic (CCR) DEA model is
defined by
| (2) |
maxu,v | uTY*,k | (objective slice) |
| subject to | | vTX*,k = 1 | (slice constraint) |
| uTY <= vTX | (core constraint) |
| u,v >= 0 | (core constraint) |
where X,Y are data matrices.
The GAMS/DEA interface was originally developed to solve large scale DEA
models efficiently. But it is general enough to be applied to any linear,
mixed integer or simple quadratic slice model.
Without the GAMS/DEA interface in GAMS, a slice model would be defined and
solved in a loop over k, requiring the model to be generated multiple times.
With the slice interface in GAMS, the slice
constraints are identified by a special alias (slice) and are
generated all at once. The GAMS/DEA interface then handles defining the
individual programs and passing them to the underlying LP solver (CPLEX). In
this way, individual programs are not re-generated, but are instead defined
as data modifications of each other. This reduces overall model generation
time. Further, previous solutions can be used as starting points in later
solves to speed up overall solve time.
Some DEA examples compare the two formulations.
|
| DEA Examples |
Original GAMS formulation (without the GAMS/DEA interface):
$title Data Envelopment Analysis - DEA
Sets i units / Depot1*Depot20 /
is(i) selected unit
j inputs and outputs / stock, wages, issues, receipts, reqs /
ji(j) inputs / stock, wages /
jo(j) outputs / issues, receipts, reqs /;
Table data(i,j)
stock wages issues receipts reqs
Depot1 3 5 40 55 30
Depot2 2.5 4.5 45 50 40
Depot3 4 6 55 45 30
Depot4 6 7 48 20 60
Depot5 2.3 3.5 28 50 25
Depot6 4 6.5 48 20 65
Depot7 7 10 80 65 57
Depot8 4.4 6.4 25 48 30
Depot9 3 5 45 64 42
Depot10 5 7 70 65 48
Depot11 5 7 45 65 40
Depot12 2 4 45 40 44
Depot13 5 7 65 25 35
Depot14 4 4 38 18 64
Depot15 2 3 20 50 15
Depot16 3 6 38 20 60
Depot17 7 11 68 64 54
Depot18 4 6 25 38 20
Depot19 3 4 45 67 32
Depot20 3 6 57 60 40
;
Positive variables v(ji) input weights
u(jo) output weights;
Variable eff efficiency;
Equations defe(i) efficiency definition - weighted output
denom(i) weighted input
lime(i) 'output / input < 1';
defe(is).. eff =e= sum(jo, u(jo)*data(is,jo));
denom(is).. sum(ji, v(ji)*data(is,ji)) =e= 1;
lime(i).. sum(jo, u(jo)*data(i,jo)) =l= sum(ji, v(ji)*data(i,ji));
model dea /defe, denom, lime /;
set ii(i) set of units to analyze;
ii(i) = yes;
is(i) = no;
loop(ii,
is(ii) = yes;
solve dea using lp max eff;
is(ii) = no
);
|
GAMS/DEA formulation:
$title Data Envelopment Analysis - DEA
Sets i units / Depot1*Depot20 /
is(i) selected unit
j inputs and outputs / stock, wages, issues, receipts, reqs /
ji(j) inputs / stock, wages /
jo(j) outputs / issues, receipts, reqs /;
Table data(i,j)
stock wages issues receipts reqs
Depot1 3 5 40 55 30
Depot2 2.5 4.5 45 50 40
Depot3 4 6 55 45 30
Depot4 6 7 48 20 60
Depot5 2.3 3.5 28 50 25
Depot6 4 6.5 48 20 65
Depot7 7 10 80 65 57
Depot8 4.4 6.4 25 48 30
Depot9 3 5 45 64 42
Depot10 5 7 70 65 48
Depot11 5 7 45 65 40
Depot12 2 4 45 40 44
Depot13 5 7 65 25 35
Depot14 4 4 38 18 64
Depot15 2 3 20 50 15
Depot16 3 6 38 20 60
Depot17 7 11 68 64 54
Depot18 4 6 25 38 20
Depot19 3 4 45 67 32
Depot20 3 6 57 60 40
;
Positive variables v(ji) input weights
u(jo) output weights;
Variables eff efficiency;
alias (slice,i);
Equations defe(slice) efficiency definition - weighted output
denom(slice) weighted input
lime(i) 'output / input < 1';
defe(is).. eff =e= sum(jo, u(jo)*data(is,jo));
denom(is).. sum(ji, v(ji)*data(is,ji)) =e= 1;
lime(i).. sum(jo, u(jo)*data(i,jo)) =l= sum(ji, v(ji)*data(i,ji));
model dea /defe, denom, lime/;
*set to run over: run over all
is(i) = yes;
option lp=dea;
solve dea using lp max eff;
|
The special alias name slice identifies the set which defines the
particular
data slices. All slice constraints must be declared over the alias
slice,
but can be defined over any subset of the original set.
Slice equations can also be multi-dimensional (resulting in blocks of slices).
This is shown by the dual of the basic DEA model, given by
| (3) |
minz,lambda | z | (objective) |
| subject to | X*lambda <= zX*,k | (slice constraint) |
| Y*lambda >= Y*,k | (slice constraint) |
| lambda >= 0 | (core constraint) |
The next example compares the two formulations for this model.
Original GAMS formulation (without the GAMS/DEA interface):
$title Data Envelopment Analysis - DEA
sets i units / Depot1*Depot20 /
is(i) selected unit
j inputs and outputs / stock, wages, issues, receipts, reqs /
ji(j) inputs / stock, wages /
jo(j) outputs / issues, receipts, reqs /;
Table data(i,j)
stock wages issues receipts reqs
Depot1 3 5 40 55 30
Depot2 2.5 4.5 45 50 40
Depot3 4 6 55 45 30
Depot4 6 7 48 20 60
Depot5 2.3 3.5 28 50 25
Depot6 4 6.5 48 20 65
Depot7 7 10 80 65 57
Depot8 4.4 6.4 25 48 30
Depot9 3 5 45 64 42
Depot10 5 7 70 65 48
Depot11 5 7 45 65 40
Depot12 2 4 45 40 44
Depot13 5 7 65 25 35
Depot14 4 4 38 18 64
Depot15 2 3 20 50 15
Depot16 3 6 38 20 60
Depot17 7 11 68 64 54
Depot18 4 6 25 38 20
Depot19 3 4 45 67 32
Depot20 3 6 57 60 40
;
Positive variable lam(i) dual weights;
Variable eff efficiency;
Equations dii(i,ji) input duals
dio(i,jo) output duals;
* dual model
dii(is,ji).. sum(i, lam(i)*data(i,ji)) =l= eff*data(is,ji);
dio(is,jo).. sum(i, lam(i)*data(i,jo)) =g= data(is,jo);
model dea / dii, dio /;
*set to run over: run over all
set ii(i);
ii(i) = yes;
is(i) = no;
loop(ii,
is(ii) = yes;
solve dea using lp min eff;
is(ii) = no;
);
|
GAMS/DEA formulation:
$title Data Envelopment Analysis - DEA
sets i units / Depot1*Depot20 /
is(i) selected unit
j inputs and outputs / stock, wages, issues, receipts, reqs /
ji(j) inputs / stock, wages /
jo(j) outputs / issues, receipts, reqs /;
Table data(i,j)
stock wages issues receipts reqs
Depot1 3 5 40 55 30
Depot2 2.5 4.5 45 50 40
Depot3 4 6 55 45 30
Depot4 6 7 48 20 60
Depot5 2.3 3.5 28 50 25
Depot6 4 6.5 48 20 65
Depot7 7 10 80 65 57
Depot8 4.4 6.4 25 48 30
Depot9 3 5 45 64 42
Depot10 5 7 70 65 48
Depot11 5 7 45 65 40
Depot12 2 4 45 40 44
Depot13 5 7 65 25 35
Depot14 4 4 38 18 64
Depot15 2 3 20 50 15
Depot16 3 6 38 20 60
Depot17 7 11 68 64 54
Depot18 4 6 25 38 20
Depot19 3 4 45 67 32
Depot20 3 6 57 60 40
;
Positive variable lam(i) dual weights;
Variables f(j) auxiliary variables
eff efficiency;
alias (slice,i);
Equations dii(slice,ji) input duals
dio(slice,jo) output duals
deff(j) auxiliary equations;
* dual model
dii(is,ji).. f(ji) =l= eff*data(is,ji);
dio(is,jo).. f(jo) =g= data(is,jo);
deff(j).. f(j) =e= sum(i, lam(i)*data(i,j));
model dea / dii, dio, deff /;
*set to run over: run over all
is(i) = yes;
option lp=dea;
solve dea using lp min eff;
|
Note that in the GAMS/DEA formulation, we use the auxiliary variables
f for the sums over the units set. This is equivalent to the
original formulation but speeds up the model generation time, particularly
for large slice models, since we generate all of the equations at once.
(The original GAMS formulation only generates
the equations as needed, so there is no need for f there.)
|
| Another Example: Cross-Validation |
|
DEA models are examples of addition slice models: each individual program is defined by specifying the slice(s) which are added to the core constraints in the model.
By default, GAMS/DEA treats the slices in this manner.
However, some slice models (like cross-validation models) are more easily defined by deleting slices from the entire constraint set.
To deal with these types of models, the option slicetype can be set to 0 in the options file.
The following example compares the two formulations for a feature-selection
model under cross-validation.
Original GAMS formulation (without the GAMS/DEA interface):
$title Ten-fold cross validation example
$eolcom !
$setglobal num_folds 10
set a /1*1505/; ! set for category 1
set b /1*957/; ! set for category 2
set o /1*14/; ! observations
set p /1*%num_folds%/; ! folds to perform
set f /1*10/; ! maximum features to select
! Read in the data from the data files
parameter a_data(a, o) /
$offlisting
$include "a_data.inc"
$onlisting
/;
parameter b_data(b, o) /
$offlisting
$include "b_data.inc"
$onlisting
/;
! Declare testing and training sets
set a_test(a);
set a_trai(a);
set b_test(b);
set b_trai(b);
! Define problem
scalar w_tol /1/;
scalar features;
positive variables a_err(a),
b_err(b);
variables c,
weight(o),
gamma;
binary variable y(o);
equations w_def1(o),
w_def2(o),
y_def,
c_def,
a_def(a),
b_def(b);
w_def1(o)..
weight(o) =l= w_tol*y(o);
w_def2(o)..
weight(o) =g= -w_tol*y(o);
y_def..
sum(o, y(o)) =e= features;
c_def..
c =e= sum(a_trai, a_err(a_trai)) + sum(b_trai, b_err(b_trai));
a_def(a_trai)..
-sum(o, a_data(a_trai, o)*weight(o)) + gamma + 1 =l= a_err(a_trai);
b_def(b_trai)..
sum(o, b_data(b_trai, o)*weight(o)) - gamma + 1 =l= b_err(b_trai);
model train /all/;
train.optfile = 1;
train.optca = 0;
train.optcr = 0;
! Solve
features = 6;
loop(p,
! Generate testing and training sets
$batinclude gentestset.inc a b
a_trai(a) = not a_test(a);
b_trai(b) = not b_test(b);
solve train using mip minimizing c;
);
|
Options file for the original formulation: cplex.opt
The batinclude file gentestset.inc
gives instructions for generating the testing sets.
In the original formulation, it is used to generate the testing sets
dynamically.
In the GAMS/DEA formulation,
gentestset.inc is used not only to
generate the testing sets (all at once), but also to build a mapping between
the category sets and the slice set: a_test and b_test now
describe which equations are left out on solve number p.
GAMS/DEA formulation:
$title Ten-fold cross validation example using GAMS/DEA
$eolcom !
$setglobal num_folds 10
set a /1*1505/; ! set for category 1
set b /1*957/; ! set for category 2
set o /1*14/; ! observations
set p /1*%num_folds%/; ! folds to perform
set f /1*10/; ! maximum features to select
alias(p,slice); ! slice set is fold set
! Read in the data from the data files
parameter a_data(a, o) /
$offlisting
$include "a_data.inc"
$onlisting
/;
parameter b_data(b, o) /
$offlisting
$include "b_data.inc"
$onlisting
/;
! Declare testing sets
set a_test(a,p);
set b_test(b,p);
! Define problem
scalar w_tol /1/;
scalar features;
positive variables a_err(a),
b_err(b);
variables c,
weight(o),
gamma;
binary variable y(o);
equations w_def1(o),
w_def2(o),
y_def,
c_def,
a_def(a,slice),
b_def(b,slice);
w_def1(o)..
weight(o) =l= w_tol*y(o);
w_def2(o)..
weight(o) =g= -w_tol*y(o);
y_def..
sum(o, y(o)) =e= features;
c_def..
c =e= sum(a, a_err(a)) + sum(b, b_err(b));
a_def(a,p)$a_test(a,p)..
-sum(o, a_data(a, o)*weight(o)) + gamma + 1 =l= a_err(a);
b_def(b,p)$b_test(b,p)..
sum(o, b_data(b, o)*weight(o)) - gamma + 1 =l= b_err(b);
model train /all/;
train.optfile = 1;
train.optca = 0;
train.optcr = 0;
! Solve
features = 6;
option mip = dea;
! Generate testing sets (to be deleted in each problem)
loop(p,
$batinclude gentestset.inc "a,p" "b,p"
);
solve train using mip minimizing c;
|
Options file for GAMS/DEA formulation: dea.opt
mipstart 0
slicetype 0
eqnchk 0
|
Note that equation c_def changes from a sum over the training sets to
a sum over the entire sets in the GAMS/DEA formulation. We can make this
change because variables related to the testing sets in c_def
will be eliminated by presolve (since they appear no where else in the model).
By default, the GAMS/DEA interface checks that the number of equations in
each program is the same. Most slice models have this property, since they
involve different instances of the same program.
However, in the cross-validation example, card(p) does not
evenly divide card(a) or card(b), and so the last program
has fewer equations than the rest. Because of this,
we use the option eqnchk to turn off the
automatic equation check.
|
| The DEA Solver |
|
If you specify a DEA solver with the OPTION LP=DEA statement,
the OPTION MIP=DEA statement, or the OPTION LP=DEAQP
statement
the model will be solved as a slice model. GAMS will call the GAMS/DEA
interface, which defines and solves the individual programs (based on the
generated slice constraints). CPLEX is used as the underlying LP solver.
Because only one call is made to the solver, the resulting listing file will
contain only one of the solutions (the solution for the last program).
Solutions for all of the programs are written to the solution file dea.sol.
Inside dea.sol, the solutions are given as GAMS parameters, indexed by the
slice set.
|
|
|