CSV2GDX is a program that can read a CSV file (comma separated values) and writes to a GDX file.
Many CSV files can be read by a GAMS program directly using a table statement as illustrated in Example 21, but a number of features available in CSV2GDX make it possible to read a CVS file where the table statement cannot be used.
CSV2GDX data.csv ID=ID1 UseHeader=y Index=(1,2,6) Values=(4,5)
Read the file
data.csv and write the file
data.gdx. The csv file has a header row, columns one, two and six are to be used for the index. The values can be found in the columns four and five with the labels on the first row of columns four and five used for the last index.
CSV2GDX filename parameters
The input file; the .csv file extension is assumed when no extension has been specified. Reading of a compressed input file with an optional password is supported. The gzip program in the gbin sub-directory can be used to compress a file.
Parameters can also be read from a text file; the use of a file for parameters is indicated by preceding the file name with a @ (At sign.). When reading parameters from a text file, lines starting with an asterisk (*) will be ignored and act as a comment.
Identifier for the symbol in the gdx file. Additional symbols, dim1, dim2, ... will be added to the gdx file, representing the domain sets for the symbol id.
Optional output filename. If no output file is specified, the program will use the input file name and change the file extension to .gdx. If a path is not specified, the output file will be created in the current directory
Sets the amount of information written to the log. Higher values will generate more output. Valid range is
Index=<list of columns>
Identify columns to get UELS from. The columns are represented as a list of integers separated by commas. For example
Index=(1,2,3,4);in this case the notation (
1..4) is allowed. The order in which the columns are specified is used to specify the sequence for the unique elements in the key to store the data.
Specify the column to get the value from. See also DecimalSep below.
Specify header columns to use for UEL/Value. When using a list of columns for the values, the field in the first line of the column is used as the unique element to store the value in that column. See also UseHeader and AutoCol below. If the number of fields varies, the symbolic constant
LastColcan be useful:
Indicate if the first row is a header row. The fields in the header row are used as unique elements when the Values option is used. A header row is not needed or can be ignored when using the AutoCol parameter.
Indicate if zero values are ignored or written as
EPS; an empty field is always ignored.
Number of columns in the input file. This parameter is required if there is no header line.
Write gdx file only if the csv file is more recent than the gdx file
Generate automatic UELs for each row. The AutoRow string is used as the prefix for the row label numbers. The generated unique elements will be used in the first index position shifting other elements to the right. Using AutoRow can be helpful when there are no labels that can be used as unique elements but also to store entries that would be a duplicate entry without a unique row label. See Example2.
Generate automatic UELs for each column. The AutoCol value is used as the prefix for the column label numbers. This option overrides the use of a header line.
FieldSep=[Comma, SemiColon, Tab]
Specify a field separator. Fields are normally separated by a comma, but this parameter allows for some additional choices.
Specify a decimal separator. The decimal is normally a period, but this parameter allows a comma as the decimal. Special values recognized are
Undef. A string that is not recognized as a valid number will be stored as
Password for an encrypted input file.
Reading the file
data.csv shown below:
one,two,three,four,five,six red,red,1.1,2.2,3.3,red red,red,4.4,5.5,Eps,green red,green,7.7,8.8,9.9,blue blue,blue,10,0,NA,purple
csv2gdx data.csv id=A Index=(1,2,6) Values=(3..5) UseHeader=Y StoreZero=Y
Resulting GDX file shown in the IDE:
Note the sets
Dim4 that correspond to the domains of index positions one to four.
Assuming we want to declare the parameter A as:
A(color, color, color, number) we can proceed as follows:
$call csv2gdx data.csv id=A Index=(1,2,6) Values=(3..5) UseHeader=Y StoreZero=Y set color(*), number(*); $gdxin data.gdx $load color=dim1 $loadm color=dim2 $loadm color=dim3 $load number=dim4 parameter A(color, color, color, number); $load A $gdxin display color, number, A;
Consider the following input file; note the duplicate key in the first two lines:
red,red,1 red,red,2 red,green,3 blue,blue,4
The following call uses the AutoRow parameter to add a unique row label to each row. This way a GAMS program can deal with duplicate entries and prepare for better error messages.
csv2gdx example22 id=A Index=(1,2) Value=3 ColCount=3 Autorow=Row