Computer programs often need to read data that was prepared by a different program or even a different system. This data is obtained from data storage devices such as hard disks, floppy disks, CD-ROMs, magnetic tape and information servers on a network. There are many generations of data storage devices and many kinds of computer systems that use them. The specifics of each situation depend upon the technology of the device, the computer system, the language, the server and the network. We will focus on a few principles and techniques that can be used immediately and that create a foundation for future work in this area.
IDL has tools that permit data to be read and written to files. The details that depend on the hardware and operating system are largely hidden from the user. We will concentrate on a subset of the tools that provide basic functionality. Further information can be obtained by reading the section Building IDL Applications -- Files Input and Output found under the Contents tab in IDL Online Help.
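The basic steps in reading data from a file are to open the file, set up an array to hold the data, read the data into the array, and close the file.
As an example, suppose that the file exdata.dat contains the following table of three columns and five rows:
   10    13.5     11
    5    19.1      3
   14    11.91     4
  -17     5.7      8
  295   -14.2      6
The following procedure carries out these steps for this file: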
PRO READ_DATA1,H,filename
OPENR,1,filename
H=FLTARR(3,5)
READF,1,H
CLOSE,1
END
The OPENR statement tells IDL where to find the data. It also associates the file with a logical unit number (LUN) that other statements will refer to in reading from the file. In this case the LUN is equal to 1.
The array H has been set up to hold the numerical data. It has the same dimensions as the table in the data file: 3 columns and 5 rows. Later we will look at ways to get around the need to know the size of the data file before you read it.
The READF statement reads the file and stores the data in H.
The CLOSE statement is used to close the path to the file. This should always be done when reading is complete.
We could now use the above procedure to print out the contents of the file by entering the following statements:
fname='/cis/myname/idl/data/exdata.dat'
READ_DATA1,A,fname
PRINT,A
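READ_DATA1 hard-codes unit number 1, which fails if that unit is already in use. A minimal sketch of the same four steps, letting IDL choose a free unit through the GET_LUN keyword and releasing it with FREE_LUN, might look like this. The procedure name READ_DATA1_LUN and the variable name lun are illustrative and do not appear elsewhere in this text.

```idl
; Sketch only: same steps as READ_DATA1, but IDL picks the unit number.
; READ_DATA1_LUN and lun are illustrative names.
PRO READ_DATA1_LUN, H, filename
  OPENR, lun, filename, /GET_LUN   ; open the file; IDL assigns a free LUN to lun
  H = FLTARR(3,5)                  ; array for 3 columns by 5 rows
  READF, lun, H                    ; read all 15 values into H
  FREE_LUN, lun                    ; close the file and release the LUN
END
```

It is called in the same way as READ_DATA1, for example READ_DATA1_LUN,A,fname.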
Exercise 1
Create a directory named data within your account to hold the
data files that you will be creating. You can use the "Save Frame As" option
in the Netscape File menu to save the data file exdata.dat
to this directory.
Create a procedure READ_DATA1 using the above example and verify that you can read the data from this file. Print out the values of the array H after you have read the file to check that you can access the file correctly.
Exercise 2
Change the statement H=FLTARR(3,5) to H=INTARR(3,5) in the READ_DATA1
procedure. Now test the procedure. What was the effect of this change?
What do you think was the cause? It makes sense to use the FLTARR function
to set up the array so that the fractional parts of the numbers are not lost
when they are stored. There are also times when one may want to use double
or complex data types.
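The effect of the array type can be explored at the command line without a file. The lines below are a small illustration, not part of the exercise: they convert three values taken from the second column of exdata.dat to integers with FIX, and then create double and complex arrays with DBLARR and COMPLEXARR. The variable names x, Hd and Hc are chosen here for illustration. Whether READF converts values in exactly the same way as FIX is something to confirm by running the exercise.

```idl
; Illustration only: integer conversion drops the fractional part.
x = [13.5, 19.1, -14.2]   ; sample values from the second column of exdata.dat
PRINT, FIX(x)             ; FIX converts to integer, truncating toward zero

; Array constructors for other data types
Hd = DBLARR(3,5)          ; double-precision floating point
Hc = COMPLEXARR(3,5)      ; complex
HELP, Hd, Hc              ; report the type and dimensions of each array
```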
Exercise 3
In the above exercises we needed to know the size of the file that
we want to read before it is read. Suppose that all that we know is that
the data is stored in three columns, and that the number of rows is unknown.
This is often the case with a data file. Let us investigate the idea of
reading the file a line at a time and then saving the results. We do this
until the end of the file is found.
We might try a procedure like the following. The idea is to construct an array that has more rows than we think will possibly be used. Then we can read one line at a time into a vector, here called S, and store that into the current row of the array. We expect the reading to stop when we hit the end of the file.
Put the following program in a file named read_data2.pro and test it on the exdata.dat file. Do not be surprised when it breaks. We will fix up the problems once we see them.
PRO READ_DATA2, H, filename
OPENR,1,filename
H=FLTARR(3,1000) ;A big array to hold the data
S=FLTARR(3) ;A small array to read a line
FOR n=0,999 DO BEGIN
  READF,1,S ;Read a line of data
  PRINT,S ;Print the line
  H[*,n]=S ;Store it in H
ENDFOR
CLOSE,1
END
You can run the procedure with statements like the following:
fname='/cis/myname/idl/data/exdata.dat'
READ_DATA2,A,fname
The results will be a list of the file contents (produced by the print statement) and then a lot of complaints and error messages that are produced by an encounter with the end of the file. In fact, it quits so ungracefully that it does not return any values in the array H. As you would expect, there is a way to overcome this kind of problem.
First, type CLOSE,1 in the IDL command line to close the file that was left open when the program stopped. If you don't do this then you will run into complaints below.
We will make use of a procedure called ON_IOERROR, which tells IDL what to do when an error occurs while reading or writing a file. It can be used to cause the program to jump to another statement (which we will use to close the file and continue gracefully). The statement ON_IOERROR,stmt causes the program to jump to the line that begins with stmt: when an I/O error is detected.
Create the following procedure. The key changes are the ON_IOERROR statement, the WHILE loop with its counter n, the labeled statement ers:, and the final statement that shortens H. A WHILE loop is used to read data. The counter n is incremented on each cycle of the loop. Reading stops when 1000 lines have been read or an end of file is encountered. After the end of file, the array is shortened to the number of rows that were actually read.
PRO READ_DATA3, H, filename
OPENR,1,filename
H=FLTARR(3,1000) ;A big array to hold the data
S=FLTARR(3) ;A small array to read a line
ON_IOERROR,ers ;Jump to statement ers when I/O error is detected
n=0 ;Create a counter
WHILE n LT 1000 DO BEGIN
  READF,1,S ;Read a line of data
  PRINT,S ;Print the line
  H[*,n]=S ;Store it in H
  n=n+1 ;Increment the counter
ENDWHILE ;End of while loop
ers: CLOSE,1 ;Jump to this statement when an end of file is detected
H=H[*,0:n-1]
END
Test this program on the exdata.dat file.
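ON_IOERROR reacts to the error after it has happened. A different approach, not the one used in this text, is to test for the end of the file before each read with the EOF function. The sketch below is a variant of READ_DATA3 written that way; the name READ_DATA3_EOF and the variable lun are illustrative. It assumes the file contains only complete three-value rows, with no blank lines at the end.

```idl
; Sketch only: READ_DATA3 rewritten to test EOF instead of using ON_IOERROR.
; READ_DATA3_EOF and lun are illustrative names.
PRO READ_DATA3_EOF, H, filename
  OPENR, lun, filename, /GET_LUN     ; open the file on a free unit
  H = FLTARR(3,1000)                 ; a big array to hold the data
  S = FLTARR(3)                      ; a small array to read a line
  n = 0                              ; row counter
  ; Continue while there is room in H and the file has more data
  WHILE (n LT 1000) AND (EOF(lun) EQ 0) DO BEGIN
    READF, lun, S                    ; read a line of data
    H[*,n] = S                       ; store it in H
    n = n + 1
  ENDWHILE
  FREE_LUN, lun                      ; close the file
  IF n GT 0 THEN H = H[*,0:n-1]      ; keep only the rows actually read
END
```

If the file ends with blank lines, EOF is still false before the last READF and that read can fail, so the ON_IOERROR version is the more forgiving of the two.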
Exercise 4
The above procedure has the drawback that the maximum size of the file
that can be read is 1000 lines. It also is restricted to files with 3 columns.
We will remove these restrictions by adding the keyword options COLUMNS and ROWS
and setting default values for them. Create the file shown below.
Note that the print statement has been eliminated, since you can print
the array after you use the procedure to read the file.
PRO READ_DATA, H, filename,COLUMNS=cols,ROWS=rows
OPENR,1,filename
IF N_ELEMENTS(cols) LE 0 THEN cols=1 ;Default value for cols
IF N_ELEMENTS(rows) LE 0 THEN rows=1000 ;Default value for rows
H=FLTARR(cols,rows) ;A big array to hold the data
S=FLTARR(cols) ;A small array to read a line
ON_IOERROR,ers ;Jump to statement ers when I/O error is detected
n=0 ;Create a counter
WHILE n LT rows DO BEGIN
  READF,1,S ;Read a line of data
  H[*,n]=S ;Store it in H
  n=n+1 ;Increment the counter
ENDWHILE ;End of while loop
ers: CLOSE,1 ;Jump to this statement when an end of file is detected
H=H[*,0:n-1]
END
Save the above procedure in a file read_data.pro. Then test it on the
exdata.dat file. Because cols defaults to 1, specify COLUMNS=3 to read
all three columns of this file.
fname='/cis/myname/idl/data/exdata.dat'
READ_DATA,A,fname,COLUMNS=3
PRINT,A
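Another way to avoid guessing the number of rows is to ask IDL how many lines the file has before creating the array. The sketch below assumes a version of IDL that provides the FILE_LINES function, and assumes that every line of the file is a data row (no header and no blank lines). The procedure name READ_DATA_N and the variable lun are illustrative.

```idl
; Sketch only: size the array from the file itself.
; READ_DATA_N and lun are illustrative names.
PRO READ_DATA_N, H, filename, COLUMNS=cols
  IF N_ELEMENTS(cols) LE 0 THEN cols = 1   ; default value for cols
  rows = FILE_LINES(filename)              ; number of lines of text in the file
  H = FLTARR(cols, rows)                   ; array of exactly the right size
  OPENR, lun, filename, /GET_LUN           ; open the file on a free unit
  READF, lun, H                            ; read the whole table at once
  FREE_LUN, lun                            ; close the file
END
```

For the example file this would be called as READ_DATA_N,A,fname,COLUMNS=3. An empty file is not handled, since FLTARR cannot create an array with zero rows.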
Exercise 5
Save the data file called sunspot.dat to
a file in your data directory. Then read this data and draw a graph. This
represents the annual sunspot readings since the year 1700. The first column
is the year and the second is the sunspot activity. Your graph should look
something like the one shown below.
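One possible outline for this exercise is sketched below. It assumes that the READ_DATA procedure from Exercise 4 has been compiled, that sunspot.dat was saved under the same directory used in the earlier examples, and that the file has fewer than 1000 rows so the default value of ROWS is enough. The variable names S, year and spots and the axis titles are chosen here for illustration.

```idl
; Sketch only: read the two-column sunspot file and plot it.
fname = '/cis/myname/idl/data/sunspot.dat'   ; assumed location of the file
READ_DATA, S, fname, COLUMNS=2               ; column 0 is year, column 1 is activity
year  = REFORM(S[0,*])                       ; turn the 1 x n slice into a vector
spots = REFORM(S[1,*])
PLOT, year, spots, XTITLE='Year', YTITLE='Sunspot activity'
```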
Exercise 6: Use of READ_ASCII
The function READ_ASCII can be used to read files that have an unknown
number of columns and rows. If the files are arranged in columns and rows,
then READ_ASCII will discover the arrangement and read the files appropriately.
This is a very useful tool.
If you look up the reference information about READ_ASCII in IDL online help you will see that it is a function with many keyword options and that it has extensive flexibility. Here we will describe its use with files that are tables of numbers. The only requirement is that the tables be in ASCII text format.
The following statements will read the data from exdata.dat and print them out. Note that no knowledge of the structure of the file is used. The printed data looks like a table except for the curly brackets that open and close the whole array. The reason for the braces is that ASTR is an IDL structure.
fname='/cis/myname/idl/data/exdata.dat'
ASTR=READ_ASCII(fname)
PRINT,ASTR
{      10.0000      13.5000      11.0000
      5.00000      19.1000      3.00000
      14.0000      11.9100      4.00000
     -17.0000      5.70000      8.00000
      295.000     -14.2000      6.00000
}
The data can be extracted from the structure by the statement A=ASTR.(0). The structure ASTR contains only one item, the array enclosed by the curly brackets, and it can be accessed by using the index 0. A reference to an element of a structure uses a dot followed by the index (in parentheses) or the name tag of the item. If we type the following commands we will extract the data from the structure and put it into an array A. The array can then be printed or used in any way that one wants.
A=ASTR.(0)
PRINT,A
      10.0000      13.5000      11.0000
      5.00000      19.1000      3.00000
      14.0000      11.9100      4.00000
     -17.0000      5.70000      8.00000
      295.000     -14.2000      6.00000
In summary, the READ_ASCII function provides a means to capture the elements of a file when we do not know how many columns or rows it contains. It puts the results into an IDL structure. To extract them from the structure, use the reference ".(0)" as illustrated above.
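Since READ_ASCII works out the layout of the file on its own, it can be useful to ask IDL what it found. The lines below are a small sketch using the standard routines N_TAGS, TAG_NAMES and SIZE; the variable name dims is chosen here for illustration, and the path is the one used in the earlier examples.

```idl
; Sketch only: inspect what READ_ASCII returned.
fname = '/cis/myname/idl/data/exdata.dat'
ASTR = READ_ASCII(fname)
PRINT, N_TAGS(ASTR)              ; number of items in the structure
PRINT, TAG_NAMES(ASTR)           ; name tag of each item
A = ASTR.(0)                     ; extract the first (and only) item
dims = SIZE(A, /DIMENSIONS)      ; dims[0] is columns, dims[1] is rows
PRINT, dims
```

The dimensions can then be used in place of hard-coded sizes, for example in a loop that runs from row 0 to row dims[1]-1.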
Use the READ_ASCII function to read the sunspot data and put it into an array so it can be plotted as you did in the previous problem.