Table of Contents
- Overview
- Requirements
- Batch Usage
- Multi-Query Batch Usage
- Interactive Usage
- Strategies
- Command Files
- Notes
Attention
MDB2GMS is deprecated (see GAMS 40 MDB2GMS release notes). Please use the Connect agent SQLReader instead.
Overview
MDB2GMS is a tool to convert data from a Microsoft Access database into a GAMS-readable format. The source is an MS Access database file (.mdb or .accdb) and the target is a GAMS include file (.inc) or a GAMS GDX file (.gdx).
When the executable mdb2gms.exe is run without command-line arguments, the tool runs interactively with a built-in GUI. Alternatively, MDB2GMS can be run in batch mode, which is useful for calling it directly from a GAMS model without user intervention, using the $call command at compile time or the execute command at execution time.
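The examples further down use $call at compile time. For comparison, the following is a minimal sketch of the execution-time variant. It assumes the Sample.mdb database and the distances table used in Example 1. It also assumes that a failed call returns a nonzero exit code, which this text does not state. Because the call runs after compilation, the data cannot be brought in with $include and is instead loaded from a GDX file with execute_load.

```gams
Set
   i 'canning plants' / seattle, san-diego /
   j 'markets'        / new-york, chicago, topeka /;

Parameter d(i,j) 'distance in thousands of miles';

* Execution time: the call runs when GAMS reaches this statement.
* X= names the GDX file, P= names the parameter stored inside it.
execute 'mdb2gms I=Sample.mdb Q="SELECT city1, city2, distance FROM distances" X=distances.gdx P=d';

* Assumption: mdb2gms signals failure through a nonzero exit code.
abort$errorLevel 'mdb2gms call failed';

* Load the parameter d from the GDX file at execution time.
execute_load 'distances.gdx', d;
display d;
```

The GDX route is used here because an include file can only be read at compile time, before the execute statement has run.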
Database tables can be considered a generalization of a GAMS parameter: a GAMS parameter has multiple index columns but just one value column. If the table is organized as a multi-valued table, a UNION operation in the SQL statement can be used to generate the correct GAMS file.
There are no special requirements on the data types used in the database. The data is converted to strings, which is almost always possible. Data types like LONG BINARY may not be convertible to a string, in which case an exception is raised. In general, NULLs should not be allowed to get into a GAMS data structure. The handling of NULLs can be specified in an option.
Besides parameters it is also possible to generate set data.
Requirements
MDB2GMS runs only on Windows PCs with MS Access installed. MS Access comes with certain versions of MS Office, but some MS Office versions do not include Access. The actual retrieval of the database records is performed by [DAO](https://en.wikipedia.org/wiki/Data_access_object) (Data Access Objects), an object layer for accessing the database. The actual database is the Jet engine, which performs the queries and retrieves the data. Consider using SQL2GMS instead of MDB2GMS if MS Access is not installed on your system.
To use this tool effectively you will need a working knowledge of [SQL](https://en.wikipedia.org/wiki/SQL) in order to formulate proper database queries.
Batch Usage
MDB2GMS can be run in batch mode without user intervention, either from within a GAMS model using the $call or execute statements, or directly from the command prompt, with all arguments specified on the command line. An MDB2GMS batch call is of the following form:
mdb2gms inputFile outputFile queryString
A proper batch call will contain at least the following three command-line arguments:
- The name of the MS Access database inputFile must be specified (.mdb or .accdb format). Use the I argument to enter the file name, i.e. I=inputFile.
- The name of outputFile, either an include file (.inc) or a GDX file (.gdx), must be specified. Using an include file to store the results of the query is indicated by the option O, i.e. O=outputFile.inc, while the use of a GDX file is indicated by the option X, i.e. X=outputFile.gdx.
- The SQL queryString, containing the SQL statement to be executed on the database, must be specified within the option Q, i.e. Q=queryString.
See also Command-line Arguments below for a complete list of all possible command-line arguments. Note that the $call or execute usage is rather error-prone, and you will need to spend a little time to get the call correct and reliable. Alternatively, use the interactive built-in GUI or enter the command-line arguments in an external text file in order to write a more structured and readable command. The use of an external parameter file is indicated by preceding the file name with an @ (at sign).
See also the section Strategies, which mentions some of the drawbacks of batch usage and how to overcome them.
If you only specify I=inputFile, the interactive user interface is started with the input file name edit box preset to the name given in the command-line argument. The call is treated as a batch invocation only if an input file, an output file and a query string are all provided.
Command-line Arguments
The following table summarizes the command-line arguments that can be specified when using MDB2GMS directly from the GAMS model or command prompt.
Argument | Interpretation | Default | Description |
---|---|---|---|
I | inputFile | none | Specify the name of the input file (required). |
O | outputIncludeFile | none | Specify the name of the output file (.inc). Either O= or X= must be specified (or both). |
On | n-th outputIncludeFile | none | Match the nth query with the nth output file (.inc format) if multiple queries are used. |
X | outputGDXFile | none | Specify the name of the output file (.gdx). Either O= or X= must be specified (or both). |
Q | Query | none | This option can be used to specify a SQL query (required). |
Qn | n-th Query | none | Match the nth query with the nth output file (.inc) format or with the nth set- or parameter name when writing to GDX if multiple queries are used. |
S | setName | none | If we write to a GDX file, use this option to specify the name of a set to be used inside the GDX file. |
Sn | n-th setName | none | Match the nth query with the nth set in the GDX file if multiple queries are used. |
Y | setName (with expl. text) | none | If we write to a GDX file, use this option to specify the name of a set to be used inside the GDX file. Use this argument to store a set with explanatory text. |
Yn | n-th setName (with expl. text) | none | Match the nth query with the nth set (with explanatory text) in the GDX file if multiple queries are used. |
P | parameterName | none | If we write to a GDX file, use this option to specify the name of a parameter to be used inside the GDX file. |
Pn | n-th parameterName | none | Match the nth query with the nth parameter in the GDX file if multiple queries are used. |
D | Debug | disabled | Generate debug information. |
B | Quote Blanks | disabled | Quote strings if they contain blanks or embedded quotes. |
M | Mute | disabled | Controls if additional information is written to the log and include file. |
L | Listing | disabled | Controls if the data is embedded in the listing file. |
@fileName | ext. options file | none | Causes the program to read options from an external text file. |
N | iniFileName | mdb2gms.ini | Indicates the usage of a different INI file. |
F | formatString | none | Specify a format string. |
W | Wiring | none | Maps database columns to GAMS index positions. |
R | rowBatchSize | 100 | Row batch size; the default is 100 records. |
Some more detailed remarks on the command-line arguments:
I = string (inputFile, default = none)
This option is required for batch processing and specifies the name of the .mdb or .accdb file containing the Access database. If the file name contains blanks, the name should be surrounded by double quotes. It is advised to use absolute paths, so Access has no confusion about which file to open. On a network, UNC names can be used, and files from another computer can be accessed, e.g.
"\\hostname\c\my documents\a.mdb"
To specify a path equal to the location where the .gms file is located, you can use:
I=%system.fp%mydb.mdb
This option is demonstrated in most examples, see Example 1 - Reading a single valued Table for instance.
O = string (outputIncludeFile, default = none)
This option specifies the name of the output file. The format of the output file will be a GAMS include file for a parameter or set statement. Make sure the directory is writable. UNC names can be used. An output file must be specified for batch operation: i.e. either O= or X= needs to be specified (or both). The include file will be an ASCII file that can be read by GAMS using the $include command within the data definition of a set, parameter or scalar. If the include file already exists, it will be overwritten. This option is demonstrated in most examples, see Example 1 - Reading a single valued Table for instance.
On = string (outputIncludeFile, default = none)
When using multiple queries in a single MDB2GMS call, you can append a number to match a query with an output file. This is needed because an include file storing the results of multiple queries cannot be interpreted later on in your GAMS model when the include file is used in a set or parameter definition:
Q1="SELECT a, b FROM table"
O1=ab.inc
Q2="SELECT c, d FROM table"
O2=cd.inc
See also section Multi-Query Batch Usage or Example 7 - Multi-Query Batch Example for instance.
X = string (outputGDXFile, default = none)
This option specifies the name of the output file. The format of the output file will be a GAMS GDX file. Make sure the directory is writable. UNC names can be used. If the GDX file already exists it will be overwritten; it is not possible to append to a GDX file. An output file must be specified for batch operation: i.e. either O= or X= needs to be specified (or both). This option is demonstrated in Example 5 - Reading Sets with Explanatory Text or Example 7 - Multi-Query Batch Example for instance.
Q = string (Query, default = none)
This option can be used to specify an SQL query. Queries can contain spaces and thus have to be surrounded by double quotes. For the exact syntax of the queries that is accepted by Access we refer to the documentation that comes with MS Access. The query is passed on directly to the Jet database engine, so the complete power and expressiveness of Access SQL is available. For an exact description of allowed expressions consult a text on MS Access.
One notable syntax feature is that when field names or table names contain blanks, they can be specified in square brackets. Examples:
Q="SELECT * FROM mytable"
Q="SELECT year, production FROM [production table]"
Q="SELECT [GAMS City], value FROM [example table], CityMapper WHERE [Access City]=city"
This option is demonstrated in Example 1 - Reading a single valued Table for instance.
Qn = string (Query, default = none)
When using multiple queries in a single MDB2GMS call, you can append a number to match a query with an output file. This is needed because an include file storing the results of multiple queries cannot be interpreted later on in your GAMS model when the include file is used in a set or parameter definition. In addition, you can match the results of a query with a specific set or parameter name when writing to GDX.
Q1="SELECT a, b FROM table"
O1=ab.inc
Q2="SELECT c, d FROM table"
O2=cd.inc
or (GDX output file format - where several sets and parameters can be stored in a single file):
Q1="SELECT a, b FROM table"
P1=abParameter
Q2="SELECT c FROM table"
S2=cSet
Note the use of the arguments Pn and Sn to store the results as a parameter or a set, respectively, and to specify the names of the symbols. See also section Multi-Query Batch Usage or Example 7 - Multi-Query Batch Example for instance.
S = string (setName, default = none)
If we write to a GDX file, use this option to specify the name of a set to be stored in the GDX file (containing the results of the query). This option is demonstrated in Example 4 - Reading a multi dimensional Set.
Sn = string (setName, default = none)
If multiple queries are used in a single MDB2GMS call while writing to a GDX file, use this option to specify the name of the nth set to be stored in the GDX file (containing the results of the nth query), e.g.
Q1="SELECT i FROM table"
S1=iSet
Q2="SELECT j FROM table"
S2=jSet
See also section Multi-Query Batch Usage or Example 7 - Multi-Query Batch Example for instance.
Y = string (setName, default = none)
If we write to a GDX file, use this option to specify the name of a set to be used inside the GDX file. The last column specified within the select clause in the SQL statement will be used as explanatory text. This option is demonstrated in Example 5 - Reading Sets with Explanatory Text for instance.
Yn = string (setName, default = none)
If multiple queries are used in a single MDB2GMS call while writing to a GDX file, use this option to specify the name of the nth set (with explanatory text) to be stored in the GDX file (containing the results of the nth query), e.g.
Q1="SELECT i, explTextForSeti FROM table"
Y1=iSet
Q2="SELECT j, explTextForSetj FROM table"
Y2=jSet
The last column specified within the select clause in the SQL statements will be used as explanatory text.
P = string (parameterName, default = none)
If we write to a GDX file, use this option to specify the name of a parameter to be stored in the GDX file (containing the results of the query).
Pn = string (parameterName, default = none)
If multiple queries are used in a single MDB2GMS call while writing to a GDX file, use this option to specify the name of the nth parameter to be stored in the GDX file (containing the results of the nth query), e.g.
Q1="SELECT i, j, value FROM table"
P1=ijValue
Q2="SELECT n, m, value FROM table"
P2=nmValue
See also section Multi-Query Batch Usage or Example 7 - Multi-Query Batch Example for instance.
D (Debug, default = disabled)
This option can be used for debugging purposes. If specified, the import filter will not run minimized but "restored", i.e. as a normal window. In addition, the program will not terminate until the user clicks the Close button. This allows you to monitor possible errors during execution of MDB2GMS.
B (Quote Blanks, default = disabled)
If this parameter is specified, strings that have blanks in them will be quoted. If the string is already quoted this step is not performed. If the name contains an embedded single quote, the surrounding quotes will be double quotes. If the name already contains a double quote, the surrounding quotes will be single quotes. If both single and double quotes are present in the string, then all double quotes are replaced by single quotes and the surrounding quotes will be double quotes. By default this option is turned off. For more information see subsection Quotes. This option only applies to an output include file.
M (Mute, default = disabled)
Run in modest or mute mode: no additional information, such as the version number, number of rows in the data, elapsed time or the query used, is written to the log and include file.
L (Listing, default = disabled)
Embed the data between the $offListing and $onListing dollar control options, so the data will not be listed in the listing file. This is a quick way to reduce the size of the listing file when including very large data files into the model. Otherwise the listing file would become too large to be handled comfortably.
@fileName = string (fileName, default = none)
Causes the program to read options from an external text file. If the file name contains blanks, it can be surrounded by double quotes. The option file contains one option per line, in the same syntax as if they were specified directly on the command-line. See also Command Files for some further details.
N = string (fileName, default = mdb2gms.ini)
Use a different INI file than the standard mdb2gms.ini located in the same directory as the executable mdb2gms.exe.
F = string (formatString, default = none)
In special cases we can apply a format string on the include file output (not for GDX output). Each column in the result set is a string and can be represented by a %s in the format string.
W = string (wiring, default = none)
By using the W option, one can map database columns to GAMS index positions. See model [Wiring] for reference.
R = integer (rowBatchSize, default = 100)
Row batch size; the default is 100 records. This option must be specified in an INI file when using the interactive mode of MDB2GMS.
Example 1 - Reading a single valued Table
Suppose we want to read the distances parameter of the [trnsport] model from the GAMS Model Library. The data is stored in the Microsoft Access Database format (file Sample.mdb).
The data can be queried with a simple SQL statement:
SELECT city1, city2, distance FROM distances
By running the following MDB2GMS statement, the connection to the database Sample.mdb is established, the data is queried, and the results are written to a GAMS include file (.inc):
mdb2gms I=Sample.mdb Q="SELECT city1, city2, distance FROM distances" O=distances.inc
The MS Access database file name is specified using the argument I. Note that the query string is enclosed in quotes because it contains blanks. The arguments Q and O are used to specify the query and the output file name (and format).
The generated include file distances.inc looks like:
* -----------------------------------------------------
* MDB2GMS 24.8.5 r61358 Released May 10, 2017 VS8 x86 32bit/MS Windows
* Erwin Kalvelagen, GAMS Development Corp
* -----------------------------------------------------
* DAO version: 14.0
* Jet version: 4.0
* Database: F:\datalib\Sample.mdb
* Query: SELECT city1, city2, distance FROM distances
* -----------------------------------------------------
SEATTLE.NEW-YORK 2.5
SAN-DIEGO.NEW-YORK 2.5
SEATTLE.CHICAGO 1.7
SAN-DIEGO.CHICAGO 1.8
SEATTLE.TOPEKA 1.8
SAN-DIEGO.TOPEKA 1.4
* -----------------------------------------------------
The commented header section summarizes some information about the MDB2GMS and GAMS version and about the executed database query. The standard export format treats the last column as the value column (containing the distances) and the preceding columns as the indices (containing the city names). The indices are separated by a dot, allowing the generated include file to be used as part of a parameter declaration statement in your GAMS model.
Retrieving the data from the database using MDB2GMS and including the queried data in your GAMS model within the parameter declaration statement (at compile time) can be combined in the following way:
Set
i 'canning plants' / seattle, san-diego /
j 'markets' / new-york, chicago, topeka /;
$call mdb2gms I=Sample.mdb Q="SELECT city1, city2, distance FROM distances" O=distances.inc
Parameter d(i,j) 'distance in thousands of miles' /
$include distances.inc
/;
display d;
Finally, the values of the parameter d are displayed:
            new-york     chicago      topeka
seattle        2.500       1.700       1.800
san-diego      2.500       1.800       1.400
This example is also part of the GAMS Data Utilities Library, see model [Distances1] for reference. Note that the library model additionally writes the query results to a GDX file.
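The GDX route can be sketched as follows. This is not the code of the library model but a minimal illustration using the X and P arguments described above; the GDX file name distances.gdx and the symbol name d are chosen here for illustration only.

```gams
Set
   i 'canning plants' / seattle, san-diego /
   j 'markets'        / new-york, chicago, topeka /;

* Write the query result to a GDX file as a parameter named d.
$call mdb2gms I=Sample.mdb Q="SELECT city1, city2, distance FROM distances" X=distances.gdx P=d

Parameter d(i,j) 'distance in thousands of miles';

* Load the parameter from the GDX file at compile time.
$gdxIn distances.gdx
$load d
$gdxIn

display d;
```

With GDX output the parameter is declared without a data block, and the values are filled in by $load rather than by $include.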
Example 2 - Reading a multi valued Table
In this scenario, we want to read the three index columns year, loc, prod and the value columns sales and profit from the database file Sample.mdb. To do so, we use either two separate parameters and queries, or a single parameter with an extra index position (for sales or profit) and a UNION select.
Consider the table with two value columns sales and profit:
Two separate Parameters
A simple way to import this into GAMS is to use two parameters and two SQL queries. The SQL queries can look like:
SELECT year, loc, prod, sales FROM data
SELECT year, loc, prod, profit FROM data
We can generate an include file sales.inc by running the following command:
mdb2gms I=Sample.mdb Q="SELECT year, loc, prod, sales FROM data" O=sales.inc
Note that we specify the first query within the Q argument in order to select the sales and the relevant index columns. The query results are written to sales.inc using the O argument. Analogously, we generate an include file profit.inc by running the following command, specifying the second query in order to obtain the profits and the relevant index columns:
mdb2gms I=Sample.mdb Q="SELECT year, loc, prod, profit FROM data" O=profit.inc
Retrieving the data from the database Sample.mdb using MDB2GMS and including the queried data in your GAMS model within the parameter declaration statements of sales and profit (at compile time) can be combined in the following way:
Set
year 'years' / 1997*1998 /
loc 'locations' / nyc, was, la, sfo /
prd 'products' / hardware, software /;
$call mdb2gms I=Sample.mdb Q="SELECT year, loc, prod, sales FROM data" O=sales.inc
Parameter sales(year,loc,prd) /
$include sales.inc
/;
$call mdb2gms I=Sample.mdb Q="SELECT year, loc, prod, profit FROM data" O=profit.inc
Parameter profit(year,loc,prd) /
$include profit.inc
/;
This example is also part of the GAMS Data Utilities Library, see model [SalesProfitDB1] for reference.
Single Parameter with extra Index Position
The operation can also be performed in one fell swoop by using a different GAMS data structure: a single parameter is defined with an extra index type to indicate the data type (sales or profit). The index and value columns are selected by the following SQL statement. Note the UNION operation used to combine the results, and the strings 'sales' and 'profit' used to identify the data type later on.
SELECT year, loc, prod, 'sales', sales FROM data
UNION
SELECT year, loc, prod, 'profit', profit FROM data
The data is accessed, queried and written to data.inc by running the following command:
mdb2gms @howToRead.txt
Note the use of the external parameter file howToRead.txt, shown below, to increase the readability of the command (one argument per line, quotes can be omitted):
I=Sample.mdb
Q=SELECT year, loc, prod, 'sales', sales FROM data UNION SELECT year, loc, prod, 'profit', profit FROM data
O=data.inc
The generated include file data.inc looks like (shortened for presentation):
* -----------------------------------------------------
* MDB2GMS 24.8.5 r61358 Released May 10, 2017 VS8 x86 32bit/MS Windows
* Erwin Kalvelagen, GAMS Development Corp
* -----------------------------------------------------
* DAO version: 14.0
* Jet version: 4.0
* Database: F:\datalib\Sample.mdb
* Query: SELECT year, loc, prod, 'sales', sales FROM data UNION SELECT year, loc, prod, 'profit', profit FROM data
* -----------------------------------------------------
1997.la.hardware.profit 8
1997.la.hardware.sales 80
1997.la.software.profit 16
1997.la.software.sales 60
1997.nyc.hardware.profit 5
1997.nyc.hardware.sales 110
1997.nyc.software.profit 10
1997.nyc.software.sales 100
1997.sfo.hardware.profit 9
1997.sfo.hardware.sales 80
1997.sfo.software.profit 10
1997.sfo.software.sales 50
1997.was.hardware.profit 7
1997.was.hardware.sales 120
1997.was.software.profit 20
1997.was.software.sales 70
1998.la.hardware.profit 6
1998.la.hardware.sales 70
* -----------------------------------------------------
Retrieving the data from the database using MDB2GMS and including the queried data in your GAMS model within the parameter declaration (at compile time) can be combined in the following way (note that the parameter has a fourth index type in order to access the data type sales or profit):
$onEcho > howToRead.txt
I=Sample.mdb
Q=SELECT year, loc, prod, 'sales', sales FROM data UNION SELECT year, loc, prod, 'profit', profit FROM data
O=data.inc
$offEcho
Set
year 'years' / 1997*1998 /
loc 'locations' / nyc, was, la, sfo /
prd 'products' / hardware, software /
type 'data type' / sales, profit /;
$call mdb2gms @howToRead.txt
Parameter data(year,loc,prd,type) /
$include data.inc
/;
This example is also part of the GAMS Data Utilities Library, see model [SalesProfitDB2c] for reference.
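If the rest of the model expects separate sales and profit parameters, the combined parameter can be split again by fixing the extra index. The following sketch is self-contained and does not call MDB2GMS: it uses a handful of records copied from the data.inc listing above in place of the $include.

```gams
Set
   year 'years'     / 1997*1998 /
   loc  'locations' / nyc, was, la, sfo /
   prd  'products'  / hardware, software /
   type 'data type' / sales, profit /;

* A few records copied from data.inc; in the real model these
* would come from "$include data.inc".
Parameter data(year,loc,prd,type) /
   1997.la.hardware.profit     8
   1997.la.hardware.sales     80
   1997.nyc.software.profit   10
   1997.nyc.software.sales   100
/;

Parameter
   sales(year,loc,prd)  'sales taken from the combined parameter'
   profit(year,loc,prd) 'profit taken from the combined parameter';

* Fix the fourth index to select one data type at a time.
sales(year,loc,prd)  = data(year,loc,prd,'sales');
profit(year,loc,prd) = data(year,loc,prd,'profit');

display sales, profit;
```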
Example 3 - Reading a one dimensional Set
This example demonstrates how to read the elements of a one dimensional set from a single column of an MS Access database file. Suppose we want to read the column city1 (see table distances) in order to define the set i in the GAMS model. Make sure the elements are unique by using the distinct operation within the SQL statement; otherwise there will be an error when including the file within the set definition in the GAMS model, as some set elements would be redefined:
SELECT distinct(city1) FROM distances
The include file city_i.inc looks like (header information is removed in order to shorten the representation):
* -----------------------------------------------------
SAN-DIEGO
SEATTLE
* -----------------------------------------------------
All steps (data access via MDB2GMS, set definition) can be combined:
$call mdb2gms I=Sample.mdb Q="SELECT distinct(city1) FROM distances" O=city_i.inc
Set i 'canning plants' /
$include city_i.inc
/;
display i;
The display statement generates the following output in the listing file:
---- 56 SET i seattle , san-diego
Example 4 - Reading a multi dimensional Set
When reading a multi dimensional set from a database and writing the results to an include file by using the O argument, the elements in the include file must have the correct format in order to be interpreted as elements of a multi dimensional set. For instance, a line containing a b c is not recognized as a proper element of a three dimensional set. In particular, one has to add periods between the single elements, i.e. a.b.c will be interpreted correctly.
There are different ways to add these periods explicitly within the SQL statement. E.g. add a dummy value field by adding a quoted blank to the select clause (index1, index2, index3 and dataTable are placeholders):
SELECT index1, index2, index3, ' ' FROM dataTable
or by adding the periods explicitly within the select clause (|| or & depending on DBMS):
SELECT index1 & '.' & index2 & '.' & index3 FROM dataTable
For instance, suppose we want to define a two dimensional set
Set ij(i,j) 'canning plants - markets';
based on the data of the table distances stored in Sample.mdb. The following MDB2GMS statement connects to the database, queries the columns with the city names and adds an empty value field in order to create periods between the set elements:
mdb2gms I=Sample.mdb Q="SELECT city1, city2, ' ' FROM distances" O=city_ij.inc
The include file city_ij.inc looks like (header information is removed in order to shorten the representation):
* -----------------------------------------------------
SEATTLE.NEW-YORK ' '
SAN-DIEGO.NEW-YORK ' '
SEATTLE.CHICAGO ' '
SAN-DIEGO.CHICAGO ' '
SEATTLE.TOPEKA ' '
SAN-DIEGO.TOPEKA ' '
* -----------------------------------------------------
Without adding the empty value field, the resulting include file would look like (shortened):
* -----------------------------------------------------
SEATTLE NEW-YORK
SAN-DIEGO NEW-YORK
* -----------------------------------------------------
Since the periods are missing, the lines are not recognized as valid elements of a two dimensional set. All steps can be combined in the following way:
Set
i 'canning plants' / seattle, san-diego /
j 'markets' / new-york, chicago, topeka /;
$call mdb2gms I=Sample.mdb Q="SELECT city1, city2, ' ' FROM distances" O=city_ij.inc
Set ij(i,j) 'two dimensional set' /
$include city_ij.inc
/;
display ij;
The display statement generates the following output in the listing file:
----     75 SET ij  two dimensional set
             new-york     chicago      topeka
SAN-DIEGO         YES         YES         YES
SEATTLE           YES         YES         YES
Note that there is no need to add periods explicitly when reading multi dimensional sets if the results are written only to a GDX file by using the X and S or Y arguments, i.e. there is no need to modify the query:
SELECT index1, index2, index3 FROM datatable
when using MDB2GMS in the following way:
mdb2gms I=Sample.mdb Q="SELECT index1, index2, index3 FROM datatable" X=setData.gdx S=setName
which will generate the file setData.gdx with a three dimensional set named setName containing the results of the query.
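Applied to the two dimensional set ij from this example, the GDX variant might look like the following sketch. The GDX file name city_ij.gdx is chosen here for illustration; the query needs no dummy value field because the set is not passed through an include file.

```gams
Set
   i 'canning plants' / seattle, san-diego /
   j 'markets'        / new-york, chicago, topeka /;

* Plain two-column query: no ' ' dummy column and no explicit periods.
$call mdb2gms I=Sample.mdb Q="SELECT city1, city2 FROM distances" X=city_ij.gdx S=ij

Set ij(i,j) 'canning plants - markets';

* Load the set from the GDX file at compile time.
$gdxIn city_ij.gdx
$load ij
$gdxIn

display ij;
```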
Example 5 - Reading Sets with Explanatory Text
In this example, we demonstrate how to read set elements with explanatory text from an MS Access database file using MDB2GMS. First, we write the query results to an include file; afterwards, we use the Y argument in order to store the query results as a set with explanatory text in a GDX file.
Note the blanks and the mixed quotes in the column containing the explanatory text. The data can be accessed by the following query:
SELECT setElement, explText FROM setData
Writing the Query Results in an include File
The last column in the select clause will be used as explanatory text. Keep in mind to add the argument B in order to handle text strings with embedded blanks or quotes. The following GAMS code accesses the data and writes the results to an include file setData.inc:
$call mdb2gms I=Sample.mdb B Q="SELECT setElement, explText FROM setData" O=setData.inc
Set a /
$include setData.inc
/;
The resulting include file will look like (header information is removed in order to shorten the representation):
* -----------------------------------------------------
firstSetElement "Explanatory text for the first 'set element'"
secondSetElement 'Explanatory text for the second "set element"'
thirdSetElement "Explanatory text for the third 'set element'"
fourthSetElement 'Explanatory text for the fourth set element'
* -----------------------------------------------------
Note the handling of the quotes according to the description in B.
Writing the Query Results in a GDX File
When storing the results of the query as a set with explanatory text in a GDX file, there is no need to handle embedded blanks or quotes manually; instead, one can use the Y argument. The last column specified in the select clause of the SQL statement will be interpreted as explanatory text. The following GAMS code accesses the data and writes the results to a GDX file setData.gdx:
$call mdb2gms I=Sample.mdb Q="SELECT setElement, explText FROM setData" X=setData.gdx Y=set_b
Set b;
$gdxIn setData.gdx
$load b = set_b
$gdxIn
Note that the name of the set in the GDX file is set_b (specified within the Y argument), while the name of the GDX file was specified within the X argument.
Example 6 - Index Mapping
In some cases the index elements used in the database are not the same as in the GAMS model. E.g. consider the case where the GAMS model has defined a set as:
Set i / NY, DC, LA, SF /;
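Now assume a data table looks like:
city | value |
---|---|
new york | 100 |
los angeles | 120 |
san francisco | 105 |
washington dc | 102 |
This means we have to map 'new york' to 'NY' etc. This mapping can be done in two places: either in GAMS or in the database.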
Index Mapping done in GAMS
When we export the table directly, we get the following include file (header information is removed in order to shorten the representation):
* -----------------------------------------------------
'new york' 100
'los angeles' 120
'san francisco' 105
'washington dc' 102
* -----------------------------------------------------
Note that the single quotes are added by activating the option B (quote blanks), as the index elements contain blanks. Accessing the data, importing the resulting include file and converting it to a different index space can be done by the following GAMS code:
Set i / NY, DC, LA, SF /;
Set idb 'from database' / 'new york', 'washington dc', 'los angeles', 'san francisco' /;
$call mdb2gms I=Sample.mdb B O="city1.inc" Q="SELECT city, value FROM [example table]"
Parameter dbdata(idb) /
$include city1.inc
/;
Set mapindx(i,idb) / NY.'new york', DC.'washington dc', LA.'los angeles', SF.'san francisco' /;
Parameter data(i);
data(i) = sum(mapindx(i,idb), dbdata(idb));
display data;
The display statement generates the following output in the listing file:
---- 47 PARAMETER data NY 100.000, DC 102.000, LA 120.000, SF 105.000
This example is also part of the GAMS Data Utilities Library, see model [IndexMapping1] for reference.
Index mapping done in Database
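The second approach is to handle the mapping inside the database. We can introduce a mapping table named CityMapper that looks like:
Access City | GAMS City |
---|---|
new york | ny |
los angeles | la |
san francisco | sf |
washington dc | dc |
This table can be used in a join to export the data in a format we can use by executing the query:
SELECT [GAMS City], [value] FROM example_table, CityMapper WHERE CityMapper.[Access City]=example_table.city
The resulting include file looks like (header information is removed in order to shorten the representation):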
* -----------------------------------------------------
la 120
ny 100
sf 105
dc 102
* -----------------------------------------------------
All steps can be combined in the GAMS model:
Set i / NY, DC, LA, SF /;
$onEcho > howToRead.txt
I=Sample.mdb
Q=SELECT [GAMS City], [value] FROM example_table, CityMapper WHERE CityMapper.[Access City]=example_table.city
O=city2.inc
$offEcho
$call mdb2gms @howToRead.txt
Parameter data(i) /
$include city2.inc
/;
display data;
The display statement generates the following output in the listing file:
---- 38 PARAMETER data NY 100.000, DC 102.000, LA 120.000, SF 105.000
Note: MS Access allows table names with embedded blanks. In that case the table name can be surrounded by square brackets. Other databases may not allow this.
This example is also part of the GAMS Data Utilities Library, see model [IndexMapping2] for reference.
Multi-Query Batch Usage
In some cases a number of small queries need to be performed on the same database. However, running MDB2GMS once per query can become expensive, since there is significant overhead in starting Access and opening the database. For these cases, we have added the option to do multiple queries in one call. To execute several queries in a single MDB2GMS call and write several GAMS include files containing the results of the queries, we can use the command-line arguments Qn and On. The structure of a multi-query call looks like:
I=sample.mdb
Q1=firstQuery
O1=outputFileName.inc
Q2=secondQuery
O2=outputFileName.inc
Q3=thirdQuery
O3=outputFileName.inc
The terms firstQuery, secondQuery etc. are placeholders for SQL statements. Each argument Qn is matched by an argument On, which means that the results of the n-th query are written to the n-th output file.
In case we want to store the results of a multi-query call in a single GDX file, we can use the command-line arguments Qn, Sn and Pn. The structure of a multi-query call when writing to a GDX file looks like:
I=sample.mdb
X=sample.gdx
Q1=firstQuery
S1=setName
Q2=secondQuery
S2=setName
Q3=thirdQuery
P3=parameterName
Q4=fourthQuery
P4=parameterName
Again, the terms firstQuery, secondQuery etc. are placeholders for SQL statements. Here a query Qn is matched by either a set name Sn or a parameter name Pn, i.e. the results of the first query will be stored as a set whose name is specified in the S1 argument, the results of the third query will be stored as a parameter whose name is specified in the P3 argument, etc. The X argument is used to specify the name of the GDX file.
For a complete example see section Example 7 - Multi-Query Batch Example.
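When the list of queries is produced by a script, the numbering of the paired arguments is easy to get wrong. The following Python sketch shows one way to generate the lines of such a command file. The helper multi_query_lines is hypothetical and not part of MDB2GMS; it only relies on the pairing rules described above (Qn with On for include files, Qn with Sn or Pn plus X for a GDX file).

```python
def multi_query_lines(database, queries, gdx=None):
    """Return command-file lines for a multi-query call.

    queries is a list of (sql, kind, target) tuples, where kind is
    "O" (include file), "S" (set name) or "P" (parameter name).
    """
    lines = [f"I={database}"]
    if gdx is not None:
        lines.append(f"X={gdx}")
    for n, (sql, kind, target) in enumerate(queries, start=1):
        if kind not in ("O", "S", "P"):
            raise ValueError(f"unknown target kind: {kind!r}")
        # The n-th query and its target share the same number.
        lines.append(f"Q{n}={sql}")
        lines.append(f"{kind}{n}={target}")
    return lines


lines = multi_query_lines(
    "Sample.mdb",
    [
        ("SELECT distinct(year) FROM data", "S", "year"),
        ("SELECT prod, loc, year, sales FROM data", "P", "sales"),
    ],
    gdx="Sample.gdx",
)
print("\n".join(lines))
```

The printed lines can be placed between $onEcho and $offEcho in a GAMS model, in the same way as in the examples that follow.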
Example 7 - Multi-Query Batch Example
As an example database we use the following Access table (file Sample.mdb):
We want to extract the following information:
- The set year
- The set loc
- The set prd
- The parameter sales
- The parameter profit
Output: Several include Files
This can be accomplished with the following GAMS code, which uses multiple queries in a single MDB2GMS call. Queries whose results will be used as sets use the distinct operator in the select clause in order to keep the set elements unique:
$onEcho > howToRead.txt
I=Sample.mdb
Q1=SELECT distinct(year) FROM data
O1=year.inc
Q2=SELECT distinct(loc) FROM data
O2=loc.inc
Q3=SELECT distinct(prod) FROM data
O3=prod.inc
Q4=SELECT prod, loc, year, sales FROM data
O4=sales.inc
Q5=SELECT prod, loc, year, profit FROM data
O5=profit.inc
$offEcho
$call =mdb2gms @howToRead.txt
Set y 'years' /
$include year.inc
/;
Set loc 'locations' /
$include loc.inc
/;
Set prd 'products' /
$include prod.inc
/;
Parameter sales(prd,loc,y) /
$include sales.inc
/;
display sales;
Parameter profit(prd,loc,y) /
$include profit.inc
/;
display profit;
This example is also part of the GAMS Data Utilities Library, see model [SalesProfitDB3] for reference.
Output: A single GDX File
The same example imported through a GDX file can look like:
$onEcho > howToRead.txt
I=Sample.mdb
X=Sample.gdx
Q1=SELECT distinct(year) FROM data
S1=year
Q2=SELECT distinct(loc) FROM data
S2=loc
Q3=SELECT distinct(prod) FROM data
S3=prd
Q4=SELECT prod, loc, year, sales FROM data
P4=sales
Q5=SELECT prod, loc, year, profit FROM data
P5=profit
$offEcho
$call =mdb2gms @howToRead.txt
Set
y 'years'
loc 'locations'
prd 'products';
Parameter
sales(prd,loc,y)
profit(prd,loc,y);
$gdxIn Sample.gdx
$load y=year prd loc sales profit
$gdxIn
display sales, profit;
The call of the GDXViewer will display the GDX file in the stand-alone GDX viewer. This example is also part of the GAMS Data Utilities Library, see model [SalesProfitDB4] for reference.
Interactive Usage
When the tool is called without command-line parameters, it will start up interactively. Used this way, one can specify the database file (.mdb or .accdb file), the query and the final destination file (a GAMS include file or a GDX file) using the built-in interactive environment. The main screen (see figure below) contains a number of buttons and edit boxes, which are explained below.
- Input file (.mdb or .accdb). This is the combo box to specify the input file. See also inputFile for some more detailed remarks. The browse button can be used to launch a file open dialog which makes it easier to specify a file. The file may be located on a remote machine using the notation \\machine\directory\file.mdb.
- Output GAMS Include file (*.inc). If you want to create a GAMS include file, then specify here the destination file. See also outputIncludeFile for some more detailed remarks.
- Output GDX file (*.gdx). As an alternative to a GAMS include file, the tool can also generate a GDX file. One or both of the output files need to be specified. See also outputGDXFile for some more detailed notes.
- SQL Query. The SQL Query box is the place to provide the query. Note that the actual area for text can be larger than is displayed: use the cursor-keys to scroll. See also Q for some more detailed remarks. For an exact description of allowed expressions consult a text on MS Access.
- Progress Memo. This memo field is used to show progress of the application. Also error messages from the database are printed here. This is a read-only field.
- The edit boxes above all have a drop-down list which can be used to quickly access file names and queries that have been used earlier (even in a previous session).
- The Tables button will pop up a new window with the contents of the database file selected in the input file edit line. This allows you to see all table names and field names needed to specify a correct SQL query. An exception will be generated if no database file name is specified in the input edit line.
- The Options button will pop up a window where you can specify a number of options.
- Pressing the Help button will show this documentation.
- Pressing the OK button will execute the query and an include file or GDX file will be generated.
- Pressing the Batch button will give information on how the current query can be executed directly from GAMS in a batch environment. The batch call will be displayed and can be copied to the clipboard. In the IDE press Ctrl-V or choose Edit|Paste to copy the contents of the clipboard to a GAMS text file.
- Pressing the Close button will exit the application. The current settings will be saved in an INI file, so that when you run MDB2GMS again all current settings will be restored.
Options
The Options window can be created by pressing the options button:
The following options are available in the options window:
- Quote blanks: Quote strings if they contain blanks or embedded quotes. See also B for some more detailed notes.
- Mute: Don't include the extra informational text (such as used query etc.) in the include file.
- No listing: Surround the include file by $offListing and $onListing so that the data will not be echoed to the listing file. The equivalent command-line argument is L.
- Format SQL: If an SQL text is reloaded in the SQL Edit Box, it will be formatted: keywords will be printed in CAPS and the FROM and WHERE clause will be printed on their own line. If this check box is unchecked this formatting will not take place and SQL queries will be shown as is.
The following options are only needed in special cases:
- NULL: This radio box determines how NULL's are handled. A NULL in an index position or a value column will usually make the result of the query useless: the GAMS record will be invalid. The default, throwing an exception, is a safe choice because it alerts you to NULL's. In special cases you may want to map NULL's to an empty string or a 'NULL' string.
- Output Lines: By default output lines are created as follows: the first n-1 fields are considered indices and the last (n-th) column is the value. The format corresponding to this situation is '%s.%s.%s %s' (for a three-dimensional parameter). In special cases you may want to tinker with the format string being used. The fields are all considered strings, so only use s as the format placeholder. Make sure you specify exactly as many s's as there are columns in the result set.
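The effect of the output line format can be illustrated outside the tool. Python's % operator happens to use the same %s placeholder, so the sketch below builds the default pattern for n columns and applies it to one record. The helper names and the sample record are hypothetical, and the handling of a single-column result (a plain set element) is an assumption; this is not the tool's actual implementation.

```python
def default_format(n_columns):
    """Default pattern: n-1 dot-separated indices followed by the value."""
    if n_columns < 1:
        raise ValueError("a record needs at least one column")
    if n_columns == 1:
        return "%s"  # assumption: a one-column result is a set element
    return ".".join(["%s"] * (n_columns - 1)) + " %s"


def format_record(fields, fmt=None):
    """Format one query record; every field is treated as a string."""
    fields = [str(f) for f in fields]
    if fmt is None:
        fmt = default_format(len(fields))
    # Python raises TypeError if the number of %s placeholders
    # does not match the number of fields.
    return fmt % tuple(fields)


print(default_format(4))
print(format_record(["hardware", "boston", "1997", 120]))
```

A custom pattern can be passed as the second argument, for instance "%s.%s %s" for a two-dimensional parameter.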
The buttons have an obvious functionality:
- OK button will accept the changes made.
- Cancel button will ignore the changes made, and all option settings will revert to their previous values.
- Help button will show this help text.
Strategies
Including SQL statements to extract data from a database inside your model can lead to a number of difficulties:
- The database can change between runs, leading to results that are not reproducible. A possible scenario is a user calling you with a complaint: "the model is giving strange results". You run the model to verify and now the results are ok. The reason may be that the data in the database has changed.
- There is significant overhead in extracting data from a database. If there is no need to get new data from the database it is better to use a snapshot stored locally in a format directly accessible by GAMS.
- It is often beneficial to look at the extracted data. A first reason is simply to make sure the data arrived correctly. Another is that viewing data in a different way may lead to a better understanding of the data. A complete "under-the-hood" approach may cause difficulties in understanding certain model behavior.
Often it is a good strategy to separate the data extraction step from the rest of the model logic.
If the sub-models form a chain or a tree, like in:
Data Extraction --> Data Manipulation --> Model Definition --> Model Solution --> Report Writing
we can conveniently use the save/restart facility. The individual submodels are coded as:
- Step 0: sr0.gms
$onText
step 0: data extraction from database
execute as: > gams sr0 save=s0
$offText
Set
i 'suppliers'
j 'demand centers';
Parameter
demand(j)
supply(i)
dist(i,j) 'distances';
$onEcho > howtoRead.txt
I=transportation.mdb
Q1=select name from suppliers
O1=i.inc
Q2=select name from demandcenters
O2=j.inc
Q3=select name,demand from demandcenters
O3=demand.inc
Q4=select name,supply from suppliers
O4=supply.inc
Q5=select supplier,demandcenter,distance from distances
O5=dist.inc
$offEcho
$call =mdb2gms.exe @howtoRead.txt
Set i /
$include i.inc
/;
Set j /
$include j.inc
/;
Parameter demand /
$include demand.inc
/;
Parameter supply /
$include supply.inc
/;
Parameter dist /
$include dist.inc
/;
display i, j, demand, supply, dist;
- Step 1: sr1.gms
$onText
step 1: data manipulation step
execute as: > gams sr1 restart=s0 save=s1
$offText
Scalar f 'freight in dollars per case per thousand miles' / 90 /;
Parameter c(i,j) 'transport cost in thousands of dollars per case';
c(i,j) = f*dist(i,j)/1000;
- Step 2: sr2.gms
$onText
step 2: model definition
execute as: > gams sr2 restart=s1 save=s2
$offText
Variable
x(i,j) 'shipment quantities in cases'
z 'total transportation costs in thousands of dollars';
Positive Variable x;
Equation
ecost 'define objective function'
esupply(i) 'observe supply limit at plant i'
edemand(j) 'satisfy demand at market j';
ecost.. z =e= sum((i,j), c(i,j)*x(i,j));
esupply(i).. sum(j, x(i,j)) =l= supply(i);
edemand(j).. sum(i, x(i,j)) =g= demand(j);
- Step 3: sr3.gms
$onText
step 3: model solution
execute as: > gams sr3 restart=s2 save=s3
$offText
option lp = cplex;
Model transport / all /;
solve transport using lp minimizing z;
- Step 4: sr4.gms
$onText
step 4: report writing
execute as: > gams sr4 restart=s3
$offText
abort$(transport.modelStat <> 1) "model not solved to optimality";
display x.l, z.l;
A model that executes all steps can be written as:
execute '=gams.exe sr0 lo=3 save=s0';
abort$errorLevel "step 0 failed";
execute '=gams.exe sr1 lo=3 restart=s0 save=s1';
abort$errorLevel "step 1 failed";
execute '=gams.exe sr2 lo=3 restart=s1 save=s2';
abort$errorLevel "step 2 failed";
execute '=gams.exe sr3 lo=3 restart=s2 save=s3';
abort$errorLevel "step 3 failed";
execute '=gams.exe sr4 lo=3 restart=s3';
abort$errorLevel "step 4 failed";
If you only change the reporting step, for example to generate some output using PUT statements, then you only need to change and re-execute step 4. If you change the solver or solver options, then only steps 3 and 4 need to be redone. For a small model like this one the exercise may not be very useful, but when the model is large and every step is complex and expensive, this is a convenient way to achieve quicker turn-around times.
The model [MDBSr5] is also part of the GAMS Data Utilities Library.
In some cases the save/restart facility is not appropriate. A more general approach is to save the data from the database in a GDX file, which can then be used by other models. We can use the model from step 0 to store the data in a GDX file:
MDB2GDX1.gms
execute '=gams.exe sr0 lo=3 gdx=trnsport.gdx';
abort$errorLevel "step 0 failed";
execute '=gdxviewer.exe trnsport.gdx';
The model [MDB2GDX1] is also part of the GAMS Data Utilities Library.
We can also let MDB2GMS create the GDX file:
MDB2GDX2.gms
This model demonstrates how to store data from Access database (file Transportation.mdb) into a GDX file.
$onEcho > howToRead.txt
I=Transportation.mdb
X=Transportation.gdx
Q1=SELECT name FROM suppliers
S1=i
Q2=SELECT name FROM demandcenters
S2=j
Q3=SELECT name, demand FROM demandcenters
P3=demand
Q4=SELECT name, supply FROM suppliers
P4=supply
Q5=SELECT supplier, demandcenter, distance FROM distances
P5=dist
$offEcho
$call =mdb2gms.exe @howToRead.txt
The first approach has the advantage that a complete audit record is available from the data moved from the database to the GDX file in the sr0.lst listing file. If someone ever wonders what came out of the database and how this was stored in the GDX file, that file gives the answer.
The model [MDB2GDX2] is also part of the GAMS Data Utilities Library.
To load the GDX data the following fragment can be used:
GDXTRNSPORT.gms
This model demonstrates how to load the transportation data from GDX file at compile time.
Set
i 'suppliers'
j 'demand centers';
Parameter
demand(j)
supply(i)
dist(i,j) 'distances';
$gdxIn transportation.gdx
$load i j demand supply dist
display i, j, demand, supply, dist;
DBTimestamp1.gms
In one application I had to retrieve data from the database each morning, at the first run of the model. The rest of the day, the data extracted that morning could be used. The following logic can implement this:
$onText
Retrieve data from data base first run each morning.
$offText
$onEcho > getdate.txt
I=%system.fp%transportation.mdb
Q=select day(now())
O=dbtimestamp.inc
$offEcho
$if not exist dbtimestamp.inc $call "echo 0 > dbtimestamp.inc"
Scalar dbtimestamp 'day of month when data was retrieved' /
$include dbtimestamp.inc
/;
Scalar currentday 'day of this run';
currentday = gday(jnow);
display "compare", dbtimestamp, currentday;
if(dbtimestamp <> currentday,
execute '=gams.exe sr0 lo=3 gdx=transportation.gdx';
abort$errorLevel "step 0 (database access) failed";
execute '=mdb2gms.exe @getdate.txt'
);
The include file dbtimestamp.inc contains the day of the month (1,..,31) on which the data was extracted from the database. If this file does not exist, we initialize it with 0. We then compare this number with the current day of the month. If the numbers do not agree, we execute the database extraction step and rewrite the dbtimestamp.inc file. This last operation could be done using a PUT statement, but in this case we used an SQL statement.
The model [DBTimestamp1] is also part of the GAMS Data Utilities Library.
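The decision logic of this model (read the stored day, default to 0, compare with today) is small enough to sketch in Python. The function names below are hypothetical, and the assumption that the include file may start with comment lines beginning with * is based on the include file listings shown earlier in this document.

```python
from datetime import date
from pathlib import Path


def stored_day(path):
    """Day of month recorded in the timestamp file, or 0 if unavailable."""
    p = Path(path)
    if not p.exists():
        return 0
    for line in p.read_text().splitlines():
        line = line.strip()
        # Skip blank lines and '*' comment lines (assumed header format).
        if line and not line.startswith("*"):
            return int(float(line))
    return 0


def needs_refresh(path, today=None):
    """True if the data was not extracted on today's day of the month."""
    today = today or date.today()
    return stored_day(path) != today.day


print(needs_refresh("dbtimestamp.inc"))
```

Because only the day of the month is stored, a stamp written on the 17th of one month is indistinguishable from the 17th of the next; this limitation is inherent in the scheme and applies to the GAMS model as well.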
Command Files
Parameters can be specified in a command file. This is important if the length of the command-line exceeds 255 characters, which is a hard limit on the length that GAMS allows for command-lines. Instead of specifying a long command-line as in:
$call =mdb2gms I="c:\My Documents\test.mdb" O="c:\My Documents\data.inc" Q="SELECT * FROM mytable"
we can use a command-line like:
$call =mdb2gms @"c:\My Documents\options.txt"
The command file c:\My Documents\options.txt can look like:
I=c:\My Documents\test.mdb
O=c:\My Documents\data.inc
Q=SELECT * FROM mytable
It is possible to write the command file from inside a GAMS model using the $echo command. The following example illustrates this:
$set cmdfile "c:\windows\temp\commands.txt"
$echo "I=E:\models\labordata.mdb" > "%cmdfile%"
$echo "O=E:\models\labor.INC" >> "%cmdfile%"
$echo "Q=SELECT * FROM labor" >> "%cmdfile%"
$call =mdb2gms @"%cmdfile%"
Parameter p /
$include "E:\models\labor.INC"
/;
display p;
Newer versions of GAMS allow the usage of the $onEcho and $offEcho commands:
$set cmdfile "c:\windows\temp\commands.txt"
$onEcho > "%cmdfile%"
I=E:\models\labordata.mdb
O=E:\models\labor.INC
Q=SELECT * FROM labor
$offEcho
$call =mdb2gms @"%cmdfile%"
Parameter p /
$include "E:\models\labor.INC"
/;
display p;
Note that the quotes enclosing strings with blanks like Q=SELECT * FROM labor can be omitted when using an external parameter file.
If a query becomes very long, it is possible to spread it out over several lines. To signal a setting will continue on the next line insert the character \ as the last character. E.g.:
Q=SELECT prod, loc, year, 'sales', sales FROM data \
UNION \
SELECT prod, loc, year, 'profit', profit FROM data
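To make the continuation rule concrete, the following Python sketch reads KEY=value settings and joins lines that end in a backslash. It is a hypothetical reader written for illustration, not the parser used by MDB2GMS; in particular, joining the pieces with a single blank is an assumption, and the O=data.inc line is made up for the example.

```python
def read_settings(text):
    """Parse KEY=value lines; a trailing backslash continues the setting."""
    settings = {}
    pending = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line and not pending:
            continue  # ignore blank lines between settings
        if line.endswith("\\"):
            # Drop the continuation character and keep collecting.
            pending += line[:-1].rstrip() + " "
            continue
        logical = pending + line
        pending = ""
        key, sep, value = logical.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=value setting: {logical!r}")
        settings[key.strip()] = value.strip()
    if pending:
        raise ValueError("file ends with a continuation character")
    return settings


example = "\n".join([
    "I=Sample.mdb",
    "Q=SELECT prod, loc, year, 'sales', sales FROM data \\",
    "UNION \\",
    "SELECT prod, loc, year, 'profit', profit FROM data",
    "O=data.inc",
])
settings = read_settings(example)
print(settings["Q"])
```

Splitting on the first = only means that a query containing comparisons such as a=b stays intact.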
Notes
GDX Files
A GDX file contains GAMS data in binary format. The following GAMS commands will operate on GDX files: $gdxIn, $load, execute_load, execute_unload. The GDX=filename command-line argument will save all data to a GDX file. A GDX file can be viewed in the GAMS IDE using File|Open.
UNC Names
UNC means Universal Naming Convention. UNC names are a Microsoft convention to name files across a network. The general format is:
\\<server>\<share>\<path>\<file>
Examples:
\\athlon\c\My Documents\MDB2GMS.rtf
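The parts of a UNC name can be inspected with Python's standard pathlib module, which parses Windows paths on any platform. The sketch below takes the example name apart; it is only meant to illustrate the \\<server>\<share>\<path>\<file> layout and has no connection to how MDB2GMS handles file names.

```python
from pathlib import PureWindowsPath

# Example UNC name from the text; PureWindowsPath parses it on any platform.
unc = PureWindowsPath(r"\\athlon\c\My Documents\MDB2GMS.rtf")

# For UNC names the "drive" is the \\server\share prefix.
server, share = unc.drive.lstrip("\\").split("\\")

print("server:", server)
print("share: ", share)
print("path:  ", "\\".join(unc.parts[1:-1]))
print("file:  ", unc.name)
```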
Quotes
Examples of handling of indices when the option B for quoting strings containing blanks is used:
Input | Output | Remarks |
---|---|---|
hello | hello | no blanks or embedded quotes |
"hello" | "hello" | untouched, is quoted already |
'hello' | 'hello' | id. |
"hello' | "hello' | id., but will generate an error in GAMS |
o'brien | "o'brien" | |
'o'brien' | 'o'brien' | untouched, will generate an error in GAMS |
art"ificial | 'art"ificial' | |
art"ifi'cial | "art'ifi'cial" | |
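The rows of the table above are consistent with a simple rule, sketched here in Python. The rule is inferred from the table and from the 'new york' example earlier in this document; it is an assumption about the tool's behavior, not its actual source code, and the function name quote_label is hypothetical.

```python
QUOTE_CHARS = ("'", '"')


def quote_label(s):
    """Quote a label the way option B appears to, judging by the table."""
    # Already starts and ends with a quote character: leave untouched,
    # even if the result is not a valid GAMS label.
    if len(s) >= 2 and s[0] in QUOTE_CHARS and s[-1] in QUOTE_CHARS:
        return s
    has_single = "'" in s
    has_double = '"' in s
    if has_single and has_double:
        # Both kinds present: turn double quotes into single ones,
        # then wrap in double quotes.
        return '"' + s.replace('"', "'") + '"'
    if has_single:
        return '"' + s + '"'
    if has_double or " " in s:
        return "'" + s + "'"
    return s


for label in ["hello", "new york", "o'brien", 'art"ificial', "art\"ifi'cial"]:
    print(label, "->", quote_label(label))
```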
$CALL Command
The $call command in GAMS will execute an external program at compile time. There are two forms:
$call externalProgram
$call =externalProgram
The version without the leading '=' calls the external program through the command processor (command.com or cmd.exe). The second version with the '=' bypasses the command processor and directly executes the external program. We mention some of the differences:
- Some commands are not external programs but built-in commands of the command processor. Examples are COPY, DIR, DEL, ERASE, CD, MKDIR, MD, REN, TYPE. If you want to execute these commands you will need to use the form $call externalProgram, which uses the command processor.
- If you want to execute a batch file (.bat or .cmd file) then you will need to use the form $call externalProgram.
- If it is important to stop with an appropriate error message if the external program does not exist, only use the form $call =externalProgram. The other form is not reliable in this respect. This can lead to surprising results and the situation is often difficult to debug, so in general we recommend using the form $call =externalProgram.
- When calling pure Windows programs it is important to use the second form. The first form will not wait until the external Windows program has finished. If it is important to use a command processor in the invocation of a Windows program, use the START command, as in $call start /w externalWindowsProgram. Otherwise, it is preferred to use $call =externalWindowsProgram.
Attention
In general it is recommended to use the $call =externalProgram version for its better error-handling.
When command line arguments need to be passed to the external program, they can be added to the line, separated by blanks:
$call externalProgram parameter1 parameter2
$call =externalProgram parameter1 parameter2
The total length of the command line cannot exceed 255 characters. If the program name or the parameters contain blanks or quotes you will need to quote them. You can use single or double quotes. In general the following syntax will work:
$call '"external program" "parameter 1" "parameter 2"'
$call ="external program" "parameter 1" "parameter 2"
Note that the first form needs additional quotes around the whole command line due to bugs in the parsing of $call in GAMS. The second form works without additional quotes only if the = appears outside the double quotes.
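When $call lines are generated by a script, the quoting and the length limit can be checked before the line is written. The Python helper below is a hypothetical sketch of the second form: it quotes the program name and any parameter containing a blank, keeps the = outside the quotes, and rejects lines that are too long. Measuring the limit on the text after "$call =" is an assumption, since the text above does not say exactly which characters count.

```python
MAX_COMMAND_LINE = 255  # limit quoted in the text above


def build_call(program, args):
    """Build a '$call =program args' line with double-quoted blanks."""
    def quote(item):
        if '"' in item:
            raise ValueError("embedded double quotes are not handled here")
        return f'"{item}"' if " " in item else item

    command = " ".join([quote(program)] + [quote(a) for a in args])
    if len(command) > MAX_COMMAND_LINE:
        raise ValueError("command line too long; use a command file instead")
    # The '=' stays outside the quotes, as required for the second form.
    return "$call =" + command


print(build_call("external program", ["parameter 1", "parameter 2"]))
```

If the check fails, the parameters can be moved into a command file as described in the section Command Files.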
Compile Time Commands
All $ commands in GAMS are performed at compile time. All other statements are executed at execution time. This means that a compile-time command will be executed before an execution-time command, even if it appears below it in the source. As an example consider:
File batchfile / x.bat /;
putClose batchfile "dir"/;
$call x.bat
This fragment does not work correctly because the $call is already executed during compilation, while the put statements are only executed after the compilation phase has ended and GAMS has started the execution phase. The above code can be fixed by moving the writing of the batch file to compilation time as in
$echo "dir" > x.bat
$call x.bat
or by moving the external program invocation to execution time:
File batchfile / x.bat /;
putClose batchfile "dir"/;
execute 'x.bat';
Note that $ commands are not terminated by a semicolon but by the end of the line.
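The ordering issue can be pictured with a toy model in Python: lines starting with $ are acted on in a first pass, everything else in a second pass. This is a deliberate simplification for illustration (real GAMS compilation does much more than this), and the function run_order is hypothetical.

```python
def run_order(source_lines):
    """Order in which a simplified two-phase processor acts on the lines."""
    def is_dollar(line):
        return line.lstrip().startswith("$")

    # Phase 1: all $ commands, in source order.
    compile_time = [line for line in source_lines if is_dollar(line)]
    # Phase 2: all remaining statements, in source order.
    execution_time = [line for line in source_lines if not is_dollar(line)]
    return compile_time + execution_time


fragment = [
    "File batchfile / x.bat /;",
    'putClose batchfile "dir"/;',
    "$call x.bat",
]
for step, line in enumerate(run_order(fragment), start=1):
    print(step, line)
```

For the failing fragment the $call line comes out first, before the statements that would have written x.bat.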