Data Entry: Parameters, Scalars and Tables

Introduction

Data handling is of crucial importance in all modeling applications. The quality of the numbers and the intelligence with which they are used is likely to be at least as important as the logic of the model in determining if an application is successful or not. GAMS has been designed to have a complete set of facilities for entering information, manipulating it and reporting on the results. In this chapter we will concentrate on data entry. Chapter Data Manipulations with Parameters introduces and discusses data manipulations. For details on reporting, see chapters GAMS Output, The Display Statement, The Put Writing Facility, and GAMS Data eXchange (GDX).

One very important principle will motivate all our discussions on data:

Note
Data should be entered in its most basic form and each data item should be entered only once.

There are two reasons for adopting this principle. Numbers are almost certain to change, and when they do we want to be able to make the process of changing them as easy and safe as possible. We also want to make our model easy for others to read and understand. Keeping the amount of data as small as possible will certainly help. All the data transformations are shown explicitly in the GAMS representation, which makes it possible to reproduce the results of a study and shows the reader all the assumptions made during data manipulation. Another advantage is that everything needed to run or change the model is included in one program that can easily be moved from place to place or from one machine to another.

This chapter deals with the data type parameter. For other data types, see section Data Types and Definitions. Data for parameters can be entered in three basic formats: scalar, list oriented, or tables of two or more dimensions. For each of these formats, GAMS offers a separate keyword:

Keyword      Description
Scalar       Single (scalar) data entry.
Parameter    List oriented data, defined over one or more sets.
Table        Table oriented data, must involve two or more dimensions.

Table 1: Parameters, Scalars and Tables

Note that the term parameter is used in two ways: as data type and as keyword, so one could also see scalars and tables as special formats of parameters. Each of the data input formats will be introduced and discussed in the following sections. At the end of the chapter the special data type acronym is introduced.

Note
  • By default, parameters in all input formats may only be initialized once; thereafter, data must be modified with assignment statements. This can be changed using the dollar control option $onMulti.
  • This chapter explains the complete syntax for declaring parameters, which includes the optional initialization. While it is possible to initialize the data at declaration, the data is often read from other sources such as databases or spreadsheets. More information about this can be found in the chapter Data Exchange with Other Applications.
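
As a minimal sketch of the first point, the following hypothetical snippet initializes the same scalar twice. It uses the scalar statement that is introduced in the next section; the name rho and both values are chosen purely for illustration. Without $onMulti, the second data statement would be expected to trigger a compilation error.

```gams
* First initialization of the scalar in its declaration.
Scalar rho "discount rate" / 0.15 /;

* Allow further data statements for symbols that already have data.
$onMulti
Scalar rho / 0.20 /;
$offMulti

* rho is expected to hold the value from the second data statement.
display rho;
```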

Scalars

The scalar statement is used to declare and (optionally) initialize a GAMS parameter of dimensionality zero. This means that there are no associated sets, so there is exactly one number associated with the parameter.

The Syntax

In general, the syntax for a scalar declaration in GAMS is as follows:

scalar[s] scalar_name [text] [/numerical_value/]
       {  scalar_name [text] [/numerical_value/]} ;

The keyword scalar[s] indicates that this is a scalar statement, and scalar_name is the internal name of the scalar in GAMS; it is an identifier. The optional explanatory text is used to describe the scalar, and the optional numerical_value is assigned to be the value of scalar_name. Numerical_value can be given as a fixed number or as a constant evaluation. Alternatively, the special data type acronym may be used as value. For details on acronyms, see section Acronyms.

Note that more than one scalar may be declared in one scalar statement. The entries have to be separated by commas or by end of line. For advice on explanatory text and how to choose a scalar_name, see the tutorial Good Coding Practices.

Note that scalars may be declared but not initialized in the scalar statement. A value can also be assigned later as illustrated in the example that follows.

An Illustrative Example

An example of a scalar definition in GAMS is shown below.

Scalar   
    rho  "discount rate"                           / .15 /
    irr  "internal rate of return"
    life "financial lifetime of productive units"  / 20  /;

The statement above initializes rho and life, but not irr. Later on another scalar statement can be used to initialize irr or an assignment statement could be used to provide the value:

irr  =  0.07;

For more on scalar assignments and parameter assignments in general, see section Data Entry by Assignment.

Parameters

The parameter format is used to enter list oriented data which can be indexed over one or several sets.

The Syntax

In general, the syntax for a parameter declaration in GAMS is as follows:

parameter[s] param_name[(index_list)] [text] [/ element [=] numerical_value
                                              {,element [=] numerical_value} /]
           {,param_name[(index_list)] [text] [/ element [=] numerical_value
                                              {,element [=] numerical_value} /]} ;

The keyword parameter[s] indicates that this is a parameter statement, and param_name is the internal name of the parameter in GAMS; it is an identifier. A parameter may be defined over one or more sets that may be specified in the index_list. Note that the specification of the index list in the declaration is optional. However, it is usually advisable to specify it for reasons of clarity and to enable domain checking. For more on domain checking, see section Domain Checking. The optional explanatory text is used to describe the parameter.

Parameter initialization requires a list of data elements, each consisting of a label or label-tuple and a value. Element is an element of the defining set or, if there is more than one defining set, a combination of the elements of the defining sets. The referenced set elements must belong to the set that the parameter is indexed over. Finally, numerical_value is the value assigned to the record defined by the set element or element tuple. It can be given as a fixed number or as a constant evaluation. Alternatively, the special data type acronym may be entered as value. For details on acronyms, see section Acronyms.

Note
The default value of a parameter is 0.

Slashes must be used at the beginning and end of the list, and commas must be used if several data elements are listed in one line. An equals sign or a blank separates the label tuple from its associated value. A parameter can be defined in a similar syntax to that used for a set. For advice on explanatory text and how to choose a parameter name, see the tutorial Good Coding Practices.

Note
Several parameters may be declared in one parameter statement.

Illustrative Examples

The following example illustrates the parameter statement. It is adapted from [MEXSS]. We also show the set definitions because they make the example clearer. For more on set definitions, see chapter Set Definition.

Set i     "steel plants"   / hylsa      monterrey
                             hylsap     puebla  /
    j     "markets"        / mexico-df, monterrey, guadalaja /;

Parameter  
    dd(j) "distribution of demand"
                         /  mexico-df   55,
                            guadalaja   15 /;

The index specification for the parameter dd means that there will be a vector of data associated with it, one number corresponding to every member of the set j. The numbers are specified along with the declaration in a format very reminiscent of the way we specify sets: in this simple case a label followed by a blank separator and then a value. Any of the legal number entry formats are allowable for the value. For details on number formats in GAMS, see subsection Numbers. The default data value is zero. Since monterrey has been left out of the data list, the value associated with dd('monterrey') is zero. As with sets, commas are optional at end of line.
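
The effect of the default value can be checked with a short sketch. The scalar total and the use of sum and display are additions for illustration and are not part of [MEXSS].

```gams
Set j "markets" / mexico-df, monterrey, guadalaja /;

Parameter dd(j) "distribution of demand"
    / mexico-df  55,
      guadalaja  15 /;

* monterrey is not listed, so dd('monterrey') takes the default value 0
* and contributes nothing to the sum: total = 55 + 0 + 15 = 70.
Scalar total "sum of dd over all markets";
total = sum(j, dd(j));

display dd, total;
```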

We may also list several data elements on a line, separated by commas as in the following example:

Parameter  
    a(i)  / seattle  =  350,  san-diego  =  600 /
    b(i)  / seattle    2000,  san-diego    4500 /;

If a parameter is defined over a set and all elements of the set are assigned the same value, then the following notation may be used as a shortcut:

parameter param_name[(set_name)]  [text]  /(#|set.)set_name numerical_value/;

Here set is a reserved word and set_name is the name of the set as it has been declared in a previous set declaration statement. Instead of set. one could also use the # sign. The following artificial example illustrates this notation:

Set        j      /j1, j2/;
Parameter  hh(j)  /set.j 10/
           gg     /#j    10/;

This results in hh('j1') = hh('j2') = gg('j1') = gg('j2') = 10.

Note
By default it is not possible to define an empty parameter at declaration. This may be changed using the dollar control option $onEmpty, as shown in the following example:
Set i     / seattle,  san-diego /;
$onEmpty
Parameter  
    a(i)  /  /;

That initializes a('seattle') and a('san-diego') to 0. So it is not the same as this:

Set i     / seattle,  san-diego /;
Parameter  
    a(i);

Here, a is declared, but not initialized (so, it is not defined yet) and one would get an error when trying to read it.

Parameter Data for Higher Dimensions

A parameter may have several dimensions. For the current maximum number of permitted dimensions, see Dimensions. The list oriented data initialization through the parameter statement can be easily extended to data of higher dimensionality. The label that appears on each line in the one-dimensional case is replaced by a label tuple for higher dimensions. The elements in the \(n\)-tuple are separated by dots (.) just like in the case of multi-dimensional sets.

The following example illustrates the use of parameter data for higher dimensions:

Parameter  
    salaries(employee,manager,department)
        / anderson  .murphy  .toy          = 6000
          hendry    .smith   .toy          = 9000
          hoffman   .morgan  .cosmetics    = 8000 /;

All the mechanisms using asterisks and parenthesized lists that we introduced in our discussion of sets are available here as well. For details see section Multi-Dimensional Sets. Below is an artificial example, in which a very small fraction of the total data points are initialized. GAMS will mark an error if the same label combination (or label-tuple) appears more than once in a data list.

Set row / row1*row10 /
    col / col1*col10 /;
Parameter  
    a(row, col)
        /  (row1,row4) . col2*col7    12
            row10      . col10        17
            row1*row7  . col10        33 /;

In this example the twelve elements row1.col2 to row1.col7 and row4.col2 to row4.col7 are all initialized at 12, the single element row10.col10 at 17, and the seven elements row1.col10 to row7.col10 at 33. The other 80 elements (out of a total of 100) remain at their default value, which is 0. This example shows the ability of GAMS to provide a concise initialization or definition for a sparse data structure.
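
The count of initialized elements can be verified with a small sketch. It assumes that the function card, applied to a parameter, returns the number of stored (nonzero) records; the scalar nEntries is a hypothetical name introduced only for this check.

```gams
Set row / row1*row10 /
    col / col1*col10 /;

Parameter
    a(row, col)
        /  (row1,row4) . col2*col7    12
            row10      . col10        17
            row1*row7  . col10        33 /;

* 12 records from the first line, 1 from the second, 7 from the third.
Scalar nEntries "number of stored records in a";
nEntries = card(a);

* Under the assumption above, nEntries is expected to be 20.
display nEntries;
```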

Tables

Tabular data can be declared and initialized in GAMS using a table statement. For two and higher-dimensional parameters this provides an easier and more concise method of data entry than the list based approach, since - at least in smaller tables - each label appears only once.

The Syntax

In general, the syntax for a table declaration in GAMS is as follows:

table table_name[(index_list)]  [text] EOL
                element               { element }     EOL
  element    numerical_value       { numerical_value} EOL
 {element    numerical_value       { numerical_value} EOL} ;

The keyword table indicates that this is a table declaration, and table_name is the internal name of the table in GAMS; it is an identifier. The name of the parameter can be followed by the index_list. In the index_list the sets are specified over which the table is defined. Note that the specification of the index list in the declaration is optional. However, it is usually advisable to specify it for reasons of clarity and to enable domain checking. For more on domain checking, see section Domain Checking. The optional explanatory text is used to describe the table, followed by EOL which means "end of line", a line break. Element is an element of one of the driving sets. More details follow below. Numerical_value is the value of the entry associated with the corresponding element combination. It can be given as a fixed number or as a constant evaluation. Alternatively, the special data type acronym may be used as value. For details on acronyms, see section Acronyms. For advice on explanatory text and how to choose a table_name, see the tutorial Good Coding Practices.

Attention
By default, the table statement is the only statement in the GAMS language that is not free format. This may be changed using the dollar control option $onDelim.

The following rules apply:

  • The relative positions of all entries in a table are significant. This is the only statement where end of line (EOL) has meaning. The character positions of the numeric table entries must overlap the character positions of the column headings.
  • The column section has to fit on one line.
  • The sequence of values forming a row must be on the same line.
  • The element definition of a row can span more than one line.
  • A specific column can appear only once in the entire table.

The rules for building simple tables are straightforward. The components of the header line are

keyword - identifier - index_list - text

Note that the index_list and the text are optional. Labels are used on the top and the left to map out a rectangular grid that contains the data values. The order of labels is unimportant, but if domain checking has been specified (i.e. the index_list has been given in the first line of the table declaration) each label must match one in the associated set. Labels must not be repeated, but can be left out if the corresponding numbers are all zero or not needed. At least one blank must separate all labels and data entries. Blank entries imply that the default value (zero) will be associated with that label combination.

Note
  • Tables must have at least two dimensions. For the current maximum number of permitted dimensions, see Dimensions.
  • In contrast to the set, scalar, and parameter statements, only one identifier may be declared and initialized in a table statement.

An Illustrative Example

In the following example a simple table is presented. It is adapted from [KORPET]; the relevant set definitions are also given.

Set i   "plants"
        / inchon, ulsan, yosu /
    m   "productive units" 
        / atmos-dist   "atmospheric distillation unit"
          steam-cr     "steam cracker"
          aromatics    "aromatics unit"
          hydrodeal    "hydrodealkylator"  /;

Table ka(m,i) "initial cap. of productive units (100 tons per yr)"
                     inchon      ulsan      yosu
    atmos-dist         3702      12910      9875
    steam-cr                       517      1207
    aromatics                      181       148
    hydrodeal                      180          
;

In this example the row labels are drawn from the set m and those on the column from the set i. Note that the data for each row is aligned under the corresponding column headings. Entries that are not specified are assigned the default value zero.
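
For comparison, the same data could be entered in list format with a parameter statement. The sketch below uses the hypothetical name kaList and omits the explanatory text of the set elements; it illustrates why the table format is more concise, since every record in the list has to repeat its full label tuple.

```gams
Set i   "plants"           / inchon, ulsan, yosu /
    m   "productive units" / atmos-dist, steam-cr, aromatics, hydrodeal /;

* Same eight nonzero entries as in the table ka, written as a list.
* Combinations that are not listed keep the default value zero.
Parameter kaList(m,i) "initial cap. of productive units (100 tons per yr)"
    / atmos-dist.inchon    3702
      atmos-dist.ulsan    12910
      atmos-dist.yosu      9875
      steam-cr.ulsan        517
      steam-cr.yosu        1207
      aromatics.ulsan       181
      aromatics.yosu        148
      hydrodeal.ulsan       180 /;

display kaList;
```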

Note
If there is any uncertainty about which column a number belongs to, GAMS will protest with an error message and mark the ambiguous entry.

Attention
Special care has to be taken if tabs are used. The GAMS command line option TabIn controls the tab spacing. Note that this spacing might be different from the spacing that the editor is showing; hence the visible alignment might differ from the alignment that GAMS is actually using.

Continued Tables

If a table has too many columns to fit nicely on a single line, then the columns that don't fit may be continued on additional lines. We use the same example to illustrate:

Table ka(m,i) "initial cap. of productive units (100 tons per yr)"
                   inchon    ulsan
    atmos-dist     3702      12910
    steam-cr                   517
    aromatics                  181
    hydrodeal                  180
    
        +          yosu
    atmos-dist     9875
    steam-cr       1207
    aromatics       148  
;

The crucial item is the plus '+' sign above the row labels and to the left of the column labels in the continued part of the table. The row labels have been duplicated, except that hydrodeal has been left out, since it does not have any associated data. Tables may be continued as many times as necessary.

Tables with more than Two Dimensions

Tables may have more than two dimensions. For the current maximum number of permitted dimensions, see Dimensions. As usual, dots are used to separate adjacent labels and may be used in the row or column position. The label on the left of the row corresponds to the first set in the index list, and that on the right of each column header to the last. Obviously, there must be the same number of labels associated with each number in the table, as there are sets in the index list.

The best layout depends on the size of the defining sets and the amount of data. It should provide the most intuitively satisfactory way of organizing and inspecting the data. For most people it is easier to look down a column of numbers than across a row. However, putting extra labels on the row has the advantage of greater density of information.

The following example, adapted from [MARCO], illustrates the use of tables with more than two dimensions.

Set ci   "commodities :   intermediate"
         / naphtha    "naphtha"
           dist       "distillate"
           gas-oil    "gas-oil"  /
    cr   "commodities :   crude oils"
         / mid-c       "mid-continent"
           w-tex       "west-texas"  /
    q    "attributes of intermediate products"
         / density, sulfur /;

Table attrib(ci, cr, q) "blending attributes"
                          density    sulfur
    naphtha. mid-c         272        .283
    naphtha. w-tex         272       1.48
    dist   . mid-c         292        .526
    dist   . w-tex         297       2.83
    gas-oil. mid-c         295        .98
    gas-oil. w-tex         303       5.05   
;

The table attrib could also be laid out as shown below:

Table attrib (ci,cr,q) "blending attributes"
            w-tex.density  mid-c.density  w-tex.sulfur  mid-c.sulfur
     naphtha      272             272         1.48         .283
     dist         297             292         2.83         .526
     gas-oil      303             295         5.05         .98   
;

Condensing Tables

All the mechanisms using asterisks and parenthesized lists that were introduced in the discussion of sets are available here as well. For details on these mechanisms, see section Multi-Dimensional Sets. The following example shows how repeated columns or rows can be condensed with asterisks and lists in parentheses. The set membership is not shown, but can easily be inferred.

Table upgrade(strat,size,tech)
                small.tech1  small.tech2  medium.tech1  medium.tech2
    strategy-1       .05          .05           .05           .05
    strategy-2       .2           .2             .2           .2
    strategy-3       .2           .2             .2           .2
    strategy-4                                   .2           .2
;

Table upgradex(strat,size,tech) "alternative way of writing table"
                                         tech1*tech2
    strategy-1.(small,medium)                 .05
    strategy-2*strategy-3.(small,medium)      .2
    strategy-4.medium                         .2;
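
The tables above rely on sets that are not shown. One possible set of declarations, inferred from the labels used in the tables and therefore an assumption rather than part of the original model, is sketched here. The explanatory texts are hypothetical. These declarations would have to precede the table statements.

```gams
* Sets inferred from the row and column labels of upgrade and upgradex.
Set strat "strategies"   / strategy-1*strategy-4 /
    size  "plant sizes"  / small, medium /
    tech  "technologies" / tech1*tech2 /;
```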

Handling Long Row Labels

It is possible to continue the row labels in a table on a second, or even third line in order to accommodate a reasonable number of columns. The break must come after a dot, and the rest of each line containing an incomplete row label-tuple must be blank.

The following example, adapted from [INDUS], is used to illustrate. This table actually has nine columns and many rows; here we have reproduced just a small part to show continued row label-tuples.

Table yield (c,t,s,w,z) "crop yield (metric tons per acre)"
                                                         nwfp     pmw
    wheat.(bullock, semi-mech).la-plant.
                                     (heavy, january)   .385     .338
    wheat.(bullock, semi-mech).la-plant. light          .506     .446
    wheat.(bullock, semi-mech).la-plant. standard       .592     .524
    wheat.(bullock, semi-mech).(qk-harv, standard).
                                     (heavy, january)   .439     .387

Constant Evaluation

Instead of fixed numerical values, one can also use constant expressions to assign values to parameters in a data statement. The syntax of constant expressions used in data statements follows the GAMS syntax as described in Data Manipulations with Parameters, but is restricted to scalar values and a subset of the GAMS intrinsic functions, as summarized below:

  • Real numbers only
  • Evaluation left to right
  • Operator precedence:
    • ^ **
    • * /
    • + - binary and unary
    • < <= = <> >= > LT LE EQ NE GE GT
    • NOT
    • AND
    • OR XOR EQV IMP
  • See Functions for list of supported functions

When used in a data statement, the constant expressions have to be enclosed in a pair of square brackets [ ] or curly brackets { }. Spaces can be used freely inside those brackets. Here is a small example:

Scalars x "PI half"       / [pi/2] /
        e "famous number" / [ exp( 1 ) ] /;

Parameter y "demo" / USA.(high,low) [1/3]
                     USA.medium {1/4}    /;
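
The operator precedence listed above can be illustrated with a further sketch. The scalar names p1 and p2 are hypothetical; the expected values follow from multiplication and division binding more tightly than addition and subtraction.

```gams
Scalar
    p1 "multiplication before addition" / [2 + 3 * 4]  /
    p2 "division before subtraction"    / [10 - 6 / 2] /;

* By the precedence rules: p1 = 2 + 12 = 14 and p2 = 10 - 3 = 7.
display p1, p2;
```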

Data Entry by Assignment

Data may also be entered using assignment statements. Assignments are introduced and discussed in detail in section The Assignment Statement. This section here is a short outlook and shows how parameters that have already been declared may be assigned values. The general assignment statement has the following form:

parameter_name[(index_list)] = expression;

Here parameter_name is the name of a parameter that has been declared previously in a scalar, parameter or table statement, index_list indicates the controlling indices and may either contain a set or sets, a label or label tuple or a combination of those, and expression may be a number, a numerical expression or an acronym. For details on numerical expressions, see section Expressions.

The following examples illustrate how assignments may be used for data entry.

Set            j       /j1, j2, j3/;
Scalar         a1;
Scalars        a2      /11/;
Parameter      cc(j),
               bc(j)   /j2 22/;
a1 = 10;
a2 = 5;
cc(j) = bc(j)+10;
cc("j1") = 1;

The scalar a1 is declared but not initialized in the first scalar statement. It is assigned the value of 10 in the first assignment. The scalar a2 is initialized in the second scalar statement and this value is changed to 5 in the second assignment. Note that the original data is not retained. In the parameter statement the parameter cc(j) is declared but not initialized and the parameter bc(j) is only initialized for j2. This means that bc('j2') = 22 and bc('j1') = bc('j3') = 0, the default value. Now, the third assignment sets the parameter cc(j) and assigns to all elements of the set j the value of the parameter bc(j) plus 10. So we have cc('j2') = 32 and cc('j1') = cc('j3') = 10. Note that in this example the set j has only three elements, so only three assignments are made simultaneously. However, if the number of set elements is large, say 100,000, a value is still assigned to each element with just one assignment statement. Finally, the value of cc('j1') is changed to 1.

Observe that in the examples above assignments either refer to one specific set element or to the whole set. It is also possible to make assignments to only a part of the set. The mechanisms for partial set references are discussed in section Restricting the Domain in Assignments. Set elements that are not assigned new values in an assignment with a partial set reference retain their previous values. Recall that these may be the default value, values from the parameter or table statement, or values resulting from previous calculations.
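
A minimal sketch of an assignment with a partial set reference follows. The subset sub and all values are hypothetical and serve only to show that elements outside the subset keep their previous values.

```gams
Set j      / j1, j2, j3 /
    sub(j) "subset of j used to restrict the assignment" / j1, j2 /;

Parameter cc(j) / j1 1, j2 2, j3 3 /;

* Only the elements of sub are assigned; cc('j3') keeps its value 3.
cc(sub) = 100;

display cc;
```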

Acronyms

An acronym is a special data type that allows the use of strings as values. Note that acronyms have no numeric values and are treated as character strings only.

The Syntax

The declaration for an acronym is similar to a set or parameter declaration. The basic format is as follows:

Acronym[s]  acronym_name [text] {, acronym_name [text]};

The keyword acronym[s] indicates that this is an acronym statement and acronym_name is the internal name of the acronym in GAMS, it is an identifier. The optional explanatory text is used to describe the acronym. For advice on explanatory text and how to choose an acronym_name, see the tutorial Good Coding Practices.

Note that more than one acronym may be declared in one acronym statement. The entries have to be separated by commas or by end of line. A simple example illustrates this:

Acronym Monday, Tuesday, Wednesday, Thursday, Friday;

Acronym Usage

Acronyms may be used as data in scalar, parameter and table statements. An example for acronyms in a parameter statement follows.

Set machines / m-1*m-5 / ;
Acronym 
    Monday, Tuesday, Wednesday, Thursday, Friday;
Parameter 
    shutdown(machines) 
     /   m-1  Tuesday
         m-2  Wednesday
         m-3  Friday
         m-4  Monday
         m-5  Thursday /;

Acronyms may also be used in assignments as in the example below. For more on assignments, see section The Assignment Statement.

Acronym Monday, Tuesday, Wednesday, Thursday, Friday;
Scalar dayOfWeek;
dayOfWeek = Wednesday;

Note that numerical operations like addition or subtraction are not allowed with acronyms. Such operations would be meaningless since acronyms do not have numeric values.

Another context where acronyms may be used is in logical conditions. For more on logical conditions, see chapter Conditional Expressions, Assignments and Equations. This is shown in the following example:

Acronym Monday, Tuesday, Wednesday, Thursday, Friday;
Scalar dayOfWeek
       workHours /6/;
dayOfWeek = Wednesday;
workHours$(dayOfWeek <> Friday) = 8;

Note that equality (=) and inequality (<>) are the only operators that may be used with acronyms, as in the condition above.

Acronyms are specific to GAMS and hence difficult to deal with when exchanging data with other systems. Users often replace parameters that contain acronyms with dynamic sets that have an additional index whose values correspond to the acronyms found in the original parameter. The machine shutdown data from above can be represented via a two-dimensional set as follows:

Set machines / m-1*m-5 / 
    weekdays / Monday, Tuesday, Wednesday, Thursday, Friday /
    shutdown(machines,weekdays) 
     /   m-1.Tuesday
         m-2.Wednesday
         m-3.Friday
         m-4.Monday
         m-5.Thursday /;
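
With this representation, a membership test on the two-dimensional set takes the place of an acronym comparison. The following sketch repeats the set definitions so that it is self-contained; the scalar workHours and its values are hypothetical.

```gams
Set machines / m-1*m-5 /
    weekdays / Monday, Tuesday, Wednesday, Thursday, Friday /
    shutdown(machines,weekdays)
     /   m-1.Tuesday
         m-2.Wednesday
         m-3.Friday
         m-4.Monday
         m-5.Thursday /;

Scalar workHours "working hours of machine m-3 on Friday" / 8 /;

* The condition holds because (m-3,Friday) is a member of shutdown,
* so workHours is set to 0.
workHours$shutdown('m-3','Friday') = 0;

display workHours;
```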

Summary

In this chapter, the declaration and initialization of parameters with the Scalar, Parameter, and Table statements have been discussed. Chapter Data Manipulations with Parameters will describe how this data can be changed with assignment statements.