Working with Files – Part 1

Sequential File Processing Statements and Functions

Processing a Comma-Delimited File

 

Visual Basic .NET provides the capability of processing three types of files:

 

sequential files        Files that must be read in the same order in which they were written, one after the other with no skipping around

 

binary files            "Unstructured" files that are read from or written to as a series of bytes, where it is up to the programmer to specify the format of the file

 

random files            Files that support "direct access" by record number

 

The next several topics address VB.NET's sequential file processing capabilities. Binary and Random files will be covered in later topics.

 

The following sequential file-related functions will be discussed:

 

FileOpen:               Opens a file for input or output.

FreeFile:               Returns an Integer value representing the next file number available for use by the FileOpen function.

Input:                  Reads data from an open sequential file and assigns the data to variables.

LineInput:              Reads a single line from an open sequential file and assigns it to a String variable.

EOF:                    Returns a Boolean value True when the end of a file opened for Random or sequential Input has been reached.

Write & WriteLine:      Write data to a sequential file. Data written with Write is usually read from a file with Input.

Print & PrintLine:      Write display-formatted data to a sequential file.

FileClose:              Concludes input/output (I/O) to a file opened using the FileOpen function.

 

As you know, a data file consists of records, which consist of fields.  The file that will be used for all examples in this section is a simplified employee file, which consists of the following fields:

 

            Field                   Data Type

            Employee Name           String

            Department Number       Integer

            Job Title               String

            Hire Date               Date

            Hourly Rate             Single

 

Suppose there were five records in the file.  A graphic representation of the file populated with the five data records follows (the field names are not stored in the file):

 

Employee Name        Dept #   Job Title            Hire Date     Hourly Rate
ANDY ANDERSON        100      PROGRAMMER           3/4/1997      25.00
BILLY BABCOCK        110      SYSTEMS ANALYST      2/16/1996     33.50
CHARLIE CHEESEMAN    100      COMPUTER OPERATOR    3/1/1996      15.00
DARLENE DUNCAN       200      RECEPTIONIST         10/11/1998    12.75
ERNIE EACHUS         300      MAIL ROOM CLERK      8/19/1997     10.00

 

Please note that the data types listed for these fields are the data types of the variables into which the fields will be stored.  In the sequential file itself, every field is represented as a string of characters.

 

Following are three different ways that the data in this sequential file might be stored; for example, if you opened up a sequential data file in a text editor such as Notepad, this is what you might see.

 

Scenario 1: Comma-Delimited Format

 

Each field is separated by a comma.  Both string and numeric fields are "trimmed" (they contain no extraneous spaces or zeroes).  String fields are enclosed in quotes.  (Note: The quotes enclosing the string fields are optional; VB and other applications that can read comma-delimited files will access the string fields properly with or without them.  The only time a string field MUST be enclosed in quotes is when it contains an embedded comma.)  If Date fields are enclosed in pound signs (#), VB will automatically recognize the field as the Date data type.  If the Date fields are enclosed in quotes instead, you need to use the CDate function to convert the date from string format to the Date data type.

 

"ANDY ANDERSON",100,"PROGRAMMER",#3/4/1997#,25

"BILLY BABCOCK",110,"SYSTEMS ANALYST",#2/16/1996#,33.5

"CHARLIE CHEESEMAN",100,"COMPUTER OPERATOR",#3/1/1996#,15

"DARLENE DUNCAN",200,"RECEPTIONIST",#10/11/1998#,12.75

"ERNIE EACHUS",300,"MAIL ROOM CLERK",#8/19/1997#,10

 

Scenario 2: Fixed-Width ("Print" Format)

 

In some sequential data files, fields are stored in a fixed position.  On each record, a particular field starts and ends in the same position and occupies the same amount of space.  In a "print" format file, each line (record) of the file consists of a formatted "detail line" containing each field (as if the lines were intended to be printed on a hard-copy report).

 

In the example below, a column position guide is shown above the records.  From the example, it should be clear that the employee name occupies positions 1 through 20 of each record (note that names shorter than 20 characters are padded with blank spaces); the department number occupies positions 21 through 24; the job title occupies positions 30 through 50; the hire date occupies positions 51 through 60; and the hourly rate occupies positions 61 through 65.

 

         1    1    2    2    3    3    4    4    5    5    6    6
1...5....0....5....0....5....0....5....0....5....0....5....0....5
ANDY ANDERSON        100     PROGRAMMER           3/4/1997  25.00
BILLY BABCOCK        110     SYSTEMS ANALYST      2/16/1996 33.50
CHARLIE CHEESEMAN    100     COMPUTER OPERATOR    3/1/1996  15.00
DARLENE DUNCAN       200     RECEPTIONIST         10/11/199812.75
ERNIE EACHUS         300     MAIL ROOM CLERK      8/19/1997 10.00
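As a rough illustration of how a program might pick the fields out of one "print" format line, here is a minimal sketch that slices the line at the column positions listed above. The helper name Field and the module name are hypothetical (they are not part of VB.NET), and the use of Substring with invariant-culture parsing is one possible approach rather than the only one.

```vbnet
Imports System
Imports System.Globalization

Module PrintFormatSketch
    ' Hypothetical helper: returns the trimmed text found between two
    ' 1-based, inclusive column positions of a fixed-width line.
    Function Field(line As String, firstPos As Integer, lastPos As Integer) As String
        ' Substring is 0-based, so the start index is firstPos - 1.
        Return line.Substring(firstPos - 1, lastPos - firstPos + 1).Trim()
    End Function

    Sub Main()
        ' Build a 65-character sample line laid out at the positions given in the text.
        Dim line As String = "ANDY ANDERSON".PadRight(20) & " 100" & New String(" "c, 5) &
                             "PROGRAMMER".PadRight(21) & "3/4/1997".PadRight(10) & "25.00"

        Dim empName As String = Field(line, 1, 20)
        Dim deptNbr As Integer = Integer.Parse(Field(line, 21, 24))
        Dim jobTitle As String = Field(line, 30, 50)
        ' Explicit formats keep the parsing independent of regional settings.
        Dim hireDate As Date = Date.ParseExact(Field(line, 51, 60), "M/d/yyyy",
                                               CultureInfo.InvariantCulture)
        Dim hourlyRate As Single = Single.Parse(Field(line, 61, 65),
                                                CultureInfo.InvariantCulture)

        Console.WriteLine(empName & " | " & deptNbr & " | " & jobTitle & " | " &
                          hireDate.ToString("yyyy-MM-dd") & " | " & hourlyRate)
    End Sub
End Module
```

Trimming inside the helper removes the blank padding, so a short name such as ANDY ANDERSON comes back without its trailing spaces.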

 

Scenario 3: Fixed-Width ("COBOL-Style" Format)

 

Typical of sequential files originating on mainframe computers and processed by languages such as COBOL, fields are stored one after the other in a continuous string with no distinguishing marks or white space between them.  Although some of the character-string fields can be picked out easily, the numbers are run together and are difficult to interpret unless you know something about the record.  Also, numeric fields containing a decimal portion are typically stored without the decimal point (they have an implied decimal point).  For example, the employee file might look something like this:

 

         1    1    2    2    3    3    4    4    5    5    6    6
1...5....0....5....0....5....0....5....0....5....0....5....0....5
ANDY ANDERSON       0100PROGRAMMER       030419972500
BILLY BABCOCK       0110SYSTEMS ANALYST  021619963350
CHARLIE CHEESEMAN   0100COMPUTER OPERATOR030119961500
DARLENE DUNCAN      0200RECEPTIONIST     101119981275
ERNIE EACHUS        0300MAIL ROOM CLERK  081919971000

 

In the example above, the employee name occupies the first 20 positions of each record; the department number occupies the next four bytes (note that it contains a leading zero); the job title occupies the next 17 bytes; the hire date (stored in MMDDYYYY format with no slashes) occupies the next eight bytes; and finally, the hourly rate occupies the last four bytes of the record.  Note that the hourly rate does not contain a physical decimal point; however, the program that processes this file must "know" that the decimal point is implied (i.e., "2500" means "25.00").  Given the proper data definition, COBOL can interpret the implied decimal point just fine; in VB, we have to convert the string "2500" to a number and then divide it by 100.  This technique is shown further below.
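The implied decimal point and the slash-free date can be handled along the following lines. This is only a sketch under stated assumptions: the helper name ImpliedDecimal and the module name are hypothetical, two decimal places are assumed, and the record layout (20 + 4 + 17 + 8 + 4 bytes) is taken from the description above.

```vbnet
Imports System
Imports System.Globalization

Module CobolStyleSketch
    ' Hypothetical helper: converts a digit string with an implied decimal
    ' point (two decimal places assumed) to a Single, so "2500" becomes 25.
    Function ImpliedDecimal(digits As String) As Single
        Return Single.Parse(digits, CultureInfo.InvariantCulture) / 100.0F
    End Function

    Sub Main()
        ' Build a 53-character sample record: name(20) dept(4) title(17) date(8) rate(4).
        Dim record As String = "ANDY ANDERSON".PadRight(20) & "0100" &
                               "PROGRAMMER".PadRight(17) & "03041997" & "2500"

        Dim empName As String = record.Substring(0, 20).TrimEnd()
        Dim deptNbr As Integer = Integer.Parse(record.Substring(20, 4))   ' "0100" gives 100
        Dim jobTitle As String = record.Substring(24, 17).TrimEnd()
        ' MMddyyyy matches the date layout with no slashes.
        Dim hireDate As Date = Date.ParseExact(record.Substring(41, 8), "MMddyyyy",
                                               CultureInfo.InvariantCulture)
        Dim hourlyRate As Single = ImpliedDecimal(record.Substring(49, 4))

        Console.WriteLine(empName & " | " & deptNbr & " | " & jobTitle & " | " &
                          hireDate.ToString("yyyy-MM-dd") & " | " & hourlyRate)
    End Sub
End Module
```

Substring takes a 0-based start index and a length, so each start index is simply the sum of the preceding field widths.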

 

VB.NET Functions for Sequential File Processing

 

FileOpen Function

Opens a file for input or output.

Syntax:

Public Sub FileOpen( _
   ByVal FileNumber As Integer, _
   ByVal FileName As String, _
   ByVal Mode As OpenMode, _
   Optional ByVal Access As OpenAccess = OpenAccess.Default, _
   Optional ByVal Share As OpenShare = OpenShare.Default, _
   Optional ByVal RecordLength As Integer = -1 _
)
 
Parameters:
 

FileNumber:             Required. Any valid file number. Use the FreeFile function to obtain the next available file number.

FileName:               Required. String expression that specifies a file name; may include directory or folder, and drive.

Mode:                   Required. Enum specifying the file mode: Append, Binary, Input, Output, or Random.

                        When a file is opened for Input, that file must already exist.

                        When a file is opened for Output, if it does not exist, it will be created; if it does exist, its previous contents will be overwritten.

                        When a file is opened for Append, if it does not exist, it will be created; if it does exist, records will be added to the file after the last record in the file (the previous contents of the file will not be overwritten).

Access:                 Optional. Keyword specifying the operations permitted on the open file: Read, Write, or ReadWrite. Defaults to ReadWrite.

Share:                  Optional. Enum specifying the operations restricted on the open file by other processes: Shared, Lock Read, Lock Write, and Lock Read Write. Defaults to Shared.

RecordLength:           Optional. Number less than or equal to 32,767 (bytes). For files opened for random access, this value is the record length. For sequential files, this value is the number of characters buffered.

 

Examples:

This example illustrates various uses of the FileOpen function to enable input and output to a file.

The following code opens the file TESTFILE in Input mode.

FileOpen(1, "TESTFILE", OpenMode.Input)
' Close before reopening in another mode.
FileClose(1)

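This code example opens the file in Output mode; any process can read or write to the file. Because Share is the fifth parameter (after Access), it is passed here as a named argument.

FileOpen(1, "TESTFILE", OpenMode.Output, Share:=OpenShare.Shared)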
' Close before reopening in another mode.
FileClose(1)
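The difference between the Output and Append modes described in the parameter list can be seen in a small experiment. The following is a minimal sketch, not a definitive implementation: the helper names WriteLineTo and CountLines, the module name, and the temporary file name are all hypothetical.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic

Module OpenModeSketch
    ' Hypothetical helper: opens the file in the given mode, writes one line, closes it.
    Sub WriteLineTo(filePath As String, mode As OpenMode, text As String)
        Dim fileNumber As Integer = FreeFile()
        FileOpen(fileNumber, filePath, mode)
        PrintLine(fileNumber, text)
        FileClose(fileNumber)
    End Sub

    ' Hypothetical helper: counts the lines in a file using LineInput and EOF.
    Function CountLines(filePath As String) As Integer
        Dim fileNumber As Integer = FreeFile()
        Dim count As Integer = 0
        FileOpen(fileNumber, filePath, OpenMode.Input)
        Do While Not EOF(fileNumber)
            Dim ignored As String = LineInput(fileNumber)
            count += 1
        Loop
        FileClose(fileNumber)
        Return count
    End Function

    Sub Main()
        Dim filePath As String = Path.Combine(Path.GetTempPath(), "openmode_sketch.txt")

        WriteLineTo(filePath, OpenMode.Output, "first")    ' creates or overwrites the file
        WriteLineTo(filePath, OpenMode.Output, "second")   ' overwrites: "first" is gone
        Console.WriteLine(CountLines(filePath))            ' one line in the file

        WriteLineTo(filePath, OpenMode.Append, "third")    ' adds after the last record
        Console.WriteLine(CountLines(filePath))            ' two lines in the file
    End Sub
End Module
```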

 

 

FreeFile Function

Returns an Integer value representing the next file number available for use by the FileOpen function. Use FreeFile to supply a file number that is not already in use.

Syntax:

Public Function FreeFile() As Integer

Example:

This example uses the FreeFile function to return the next available file number. Five files are opened for output within the loop, and some sample data is written to each.

Dim count As Integer
Dim fileNumber As Integer
For count = 1 To 5   
   fileNumber = FreeFile()
   FileOpen(fileNumber, "TEST" & count & ".TXT", OpenMode.Output)
   PrintLine(fileNumber, "This is a sample.")
   FileClose(fileNumber)
Next
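In the loop above, each file is closed before FreeFile is called again, so the same number can be handed out every time. FreeFile becomes more useful when several files are open at once. The sketch below, with a hypothetical helper named CopyLines and hypothetical temporary file names, copies one text file to another while both are open.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic

Module FreeFileSketch
    ' Hypothetical helper: copies a text file line by line, keeping the
    ' source and target files open at the same time.
    Sub CopyLines(sourcePath As String, targetPath As String)
        Dim inFile As Integer = FreeFile()
        FileOpen(inFile, sourcePath, OpenMode.Input)

        ' inFile is now in use, so FreeFile hands back a different number.
        Dim outFile As Integer = FreeFile()
        FileOpen(outFile, targetPath, OpenMode.Output)

        Do While Not EOF(inFile)
            PrintLine(outFile, LineInput(inFile))
        Loop

        ' FileClose accepts several file numbers in one call.
        FileClose(inFile, outFile)
    End Sub

    Sub Main()
        Dim sourcePath As String = Path.Combine(Path.GetTempPath(), "freefile_source.txt")
        Dim targetPath As String = Path.Combine(Path.GetTempPath(), "freefile_target.txt")
        File.WriteAllLines(sourcePath, {"line one", "line two"})

        CopyLines(sourcePath, targetPath)
        Console.WriteLine(File.ReadAllText(targetPath))
    End Sub
End Module
```

Calling FreeFile only after the first FileOpen matters here: two calls made back to back, before either file is opened, could return the same number.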

 

Input Function

Reads data from an open sequential file and assigns the data to variables.

Data read with Input is usually written to a file with Write. Use this function only with files opened in Input or Binary mode.

When read, standard string or numeric data is assigned to variables without modification. The following table illustrates how other input data are treated:

Data                              Value assigned to variable

Delimiting comma or blank line    Empty

#NULL#                            DBNull

#TRUE# or #FALSE#                 True or False

#yyyy-mm-dd hh:mm:ss#             The date and/or time represented by the expression

#ERROR errornumber#               errornumber (variable is an object tagged as an error)

If you reach the end of the file while you are inputting a data item, the input is terminated and an error occurs.

Note:   The Input function is not localized. For example, in the German version, if you input 3,14159, it will return only 3, since the comma is treated as a variable separator, rather than a decimal point.

Syntax:

Public Sub Input( _
   FileNumber As Integer, _
   ByRef Value As Object _
)

Parameters:

 

FileNumber:             Required. Any valid file number.

Value:                  Required. Variable that is assigned the values read from the file; can't be an array or object variable.

Example:

This example uses the Input function to read data from a file into two variables. It first creates TESTFILE with the Write function, which encloses the string in quotation marks and separates the items with commas, and then reads the two values back with Input.

FileOpen(1, "TESTFILE", OpenMode.Output)
Write(1, "hello")
Write(1, 14)
FileClose(1)
 
Dim s As String
Dim i As Integer
FileOpen(1, "TESTFILE", OpenMode.Input)
Input(1, s)
Debug.WriteLine(s)
Input(1, i)
Debug.WriteLine(i)
FileClose(1)

 

Recall the comma-delimited version of the employee file shown earlier:

 

"ANDY ANDERSON",100,"PROGRAMMER",#3/4/1997#,25

"BILLY BABCOCK",110,"SYSTEMS ANALYST",#2/16/1996#,33.5

"CHARLIE CHEESEMAN",100,"COMPUTER OPERATOR",#3/1/1996#,15

"DARLENE DUNCAN",200,"RECEPTIONIST",#10/11/1998#,12.75

"ERNIE EACHUS",300,"MAIL ROOM CLERK",#8/19/1997#,10

 

Assume you declare the following variables in your program:

 

Dim strEmpName As String

Dim intDeptNbr As Integer

Dim strJobTitle As String

Dim dtmHireDate As Date

Dim sngHrlyRate As Single

 

The following statements

 

Input(intEmpFileNbr, strEmpName)

Input(intEmpFileNbr, intDeptNbr)

Input(intEmpFileNbr, strJobTitle)

Input(intEmpFileNbr, dtmHireDate)

Input(intEmpFileNbr, sngHrlyRate)

 

would cause ANDY ANDERSON to be stored in strEmpName, 100 to be stored in intDeptNbr, PROGRAMMER to be stored in strJobTitle, 3/4/1997 to be stored in dtmHireDate, and 25 to be stored in sngHrlyRate the first time the statements are executed.

 

The second time, BILLY BABCOCK, 110, SYSTEMS ANALYST, 2/16/1996, and 33.5 would be stored respectively in strEmpName, intDeptNbr, strJobTitle, dtmHireDate, sngHrlyRate; and so on. 

 

As VB.NET reads each field into its respective variable, it automatically performs the conversion to the correct data type (Integer, Date, Single, etc.).  As mentioned earlier, VB.NET will only convert an incoming field to the Date data type if that field is enclosed in pound signs (#) – if the field was enclosed in quotes, it would be treated as a string and the CDate function would have to be used to convert it to a Date.

 

VB.NET "knows" that the data is to be read from the employee file (for example, "EMPLOYEE.DAT") because each Input call refers to file number intEmpFileNbr, and that file number was associated with the file when it was opened with the FileOpen function.
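Putting the pieces together, a complete read loop over the comma-delimited employee file might look like the sketch below. It is an illustration under assumptions rather than the definitive program: the helper name ListEmployees, the module name, and the temporary location of EMPLOYEE.DAT are hypothetical, and the sample file is written first only so that the sketch can run on its own.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic

Module EmployeeReadSketch
    ' Hypothetical helper: reads every record of a comma-delimited employee
    ' file, prints it, and returns the number of records read.
    Function ListEmployees(filePath As String) As Integer
        Dim strEmpName As String = ""
        Dim intDeptNbr As Integer
        Dim strJobTitle As String = ""
        Dim dtmHireDate As Date
        Dim sngHrlyRate As Single
        Dim recordCount As Integer = 0

        ' FileOpen ties the file number to the file; Input then refers only to the number.
        Dim intEmpFileNbr As Integer = FreeFile()
        FileOpen(intEmpFileNbr, filePath, OpenMode.Input)

        Do While Not EOF(intEmpFileNbr)
            ' One pass through these five calls consumes one record.
            Input(intEmpFileNbr, strEmpName)
            Input(intEmpFileNbr, intDeptNbr)
            Input(intEmpFileNbr, strJobTitle)
            Input(intEmpFileNbr, dtmHireDate)
            Input(intEmpFileNbr, sngHrlyRate)
            recordCount += 1
            Console.WriteLine(strEmpName & ", dept " & intDeptNbr & ", " & strJobTitle &
                              ", hired " & dtmHireDate.ToString("yyyy-MM-dd") &
                              ", rate " & sngHrlyRate)
        Loop

        FileClose(intEmpFileNbr)
        Return recordCount
    End Function

    Sub Main()
        ' Write two sample records (quotes are doubled inside VB string literals).
        Dim filePath As String = Path.Combine(Path.GetTempPath(), "EMPLOYEE.DAT")
        File.WriteAllLines(filePath, {
            """ANDY ANDERSON"",100,""PROGRAMMER"",#3/4/1997#,25",
            """BILLY BABCOCK"",110,""SYSTEMS ANALYST"",#2/16/1996#,33.5"})

        Console.WriteLine(ListEmployees(filePath) & " records read")
    End Sub
End Module
```

Testing EOF at the top of the loop, before the first Input call, also covers the case of an empty file.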

 

EOF Function

Returns a Boolean value True when the end of a file opened for Random or sequential Input has been reached.

Use EOF to avoid the error generated by attempting to get input past the end of a file.

The EOF function returns False until the end of the file has been reached. With files opened for Random or Binary access, EOF returns False until the last executed FileGet function is unable to read an entire record.

With files opened for Binary access, an attempt to read through the file using the Input function until EOF returns True generates an error. Use the LOF and Loc functions instead of EOF when reading binary files with Input, or use FileGet when using the EOF function. With files opened for Output, EOF always returns True.

Syntax:

Public Function EOF(ByVal FileNumber As Integer) As Boolean

Parameter:

 

FileNumber:             Required. An Integer containing any valid file number.

 

Example:

This example uses the EOF function to detect the end of a file. This example assumes that TESTFILE is a text file with a few lines of text.

Dim TextLine As String
FileOpen(1, "TESTFILE", OpenMode.Input)   ' Open file.
Do While Not EOF(1)   ' Loop until end of file.
   TextLine = LineInput(1)   ' Read line into variable.
   Debug.WriteLine(TextLine)   ' Write to the debug output.
Loop
FileClose(1)   ' Close file.

 

FileClose Function

Concludes input/output (I/O) to a file opened using the FileOpen function.

If you omit FileNumbers(), all active files opened by the FileOpen function are closed.

When you close files that were opened for Output or Append, the final buffer of output is written to the operating system buffer for that file. All buffer space associated with the closed file is released.

When the FileClose function is executed, the association of a file with its file number ends.

Syntax:

Public Sub FileClose(ParamArray FileNumbers() As Integer)

Parameter:

 

FileNumbers():        Optional. Parameter array of 0 or more file numbers to be closed.

 

Example:

This example uses the FileClose function to close a file opened for Input.

Dim TextLine As String
FileOpen(1, "TESTFILE", OpenMode.Input)   ' Open file.
Do While Not EOF(1)   ' Loop until end of file.
   TextLine = LineInput(1)   ' Read line into variable.
   Debug.WriteLine(TextLine)   ' Write to the debug output.
Loop
FileClose(1)   ' Close file.

 
