Unit 7: File Handling
It is all well and good that data can be created within a program via variable assignments and user input. However, we must also be able to deal with data stored in files. In this unit, we will introduce methods for reading data from and writing data to a file. At its heart, Python is an object-oriented language. Pay attention to the syntax we use here, which will prepare you for the rest of the course.
Completing this unit should take you approximately 2 hours.
Upon successful completion of this unit, you will be able to:
- use file handling and file handling modes to read and write to text files;
- write programs that use file handling modes, such as reading from, writing to, appending, and creating files;
- write programs that use file handling methods; and
- apply file handling to the data analysis and visualization programs written in Units 3–6.
7.1: File Input and Output
File input and output (or File I/O) is the ability to read data from and write data to files stored in a location such as a directory or a folder. The ability to handle files is actually a pretty deep subject that requires some measure of interaction with the computer operating system. Fortunately, for high-level languages such as Python, the nuts and bolts of file I/O are absorbed into a relatively simple set of methods. There are three major steps to referencing a file:
- Open the file: This lets the operating system know the name and location of the file being referenced and how the file is to be used (such as read or write).
- Perform operations on the file data (such as read, write, or append): Now that the operating system has opened the file, it is ready to be used for the purpose specified in step 1.
- Close the file: After the desired set of operations has been completed, the operating system must be informed that access to the file is no longer necessary.
Here is an example of the three steps you can try in Repl.it:
# Step 1: open the file for writing
fhandle = open('examp.txt','w')
# Step 2: write text to the file
fhandle.write('This is a write example. ')
fhandle.write('Text will be sequentially written until a newline control character occurs. \n')
fhandle.write('Then a new line will begin with \n')
fhandle.write('and another new line, etc \n')
# Step 3: close the file
fhandle.close()
The syntax for implementing the above steps is fairly straightforward:
- The "open" command creates and opens a file "examp.txt" where "w" means that data will be written to the file. If a file with that name already exists, opening it with "w" erases its previous contents. If we were to read data, we would use an "r" instead (the "r" is optional; if the mode is omitted, a file read is assumed). In short, the first argument to the "open" command is the file name, and the second argument indicates the operation to be performed. An object (named "fhandle" in this example) is created that allows access to a host of methods that will be practiced in this unit. In Repl.it, the file location will be in the leftmost column under the "main.py" reference. Since Repl.it is web-based, this column effectively acts like your local directory from which files can be downloaded or uploaded. You should notice that when the "open" command executes, the file "examp.txt" is created. The file is still empty, but it is now ready to be written to.
- Since the file was opened with the parameter "w", data will be written to the file. This operation can be accomplished using the "write" method.
- Finally, the "close" method closes the file. After writing the data to the file, you should be able to click on the filename in the left window and see the text that was written in step 2.
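To see how the second argument to "open" changes what happens to a file, the short sketch below may help. It writes with "w", adds to the end with "a" (append), and then reads everything back with "r". The file name "modes_demo.txt" and the text written to it are made up for this illustration and are not part of the course files.

```python
# 'modes_demo.txt' is a hypothetical file name used only for this sketch.
fname = 'modes_demo.txt'

# 'w' creates the file, or empties it if it already exists.
fhandle = open(fname, 'w')
fhandle.write('first line\n')
fhandle.close()

# 'a' (append) keeps the existing contents and adds new text at the end.
fhandle = open(fname, 'a')
fhandle.write('second line\n')
fhandle.close()

# 'r' opens the file for reading; no data in the file is changed.
fhandle = open(fname, 'r')
contents = fhandle.read()
fhandle.close()

print(contents)
```

If you changed the second "open" call to use "w" instead of "a", only the second line would remain in the file, since "w" starts over with an empty file.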
Try adding the following code to the above script:
f2 = open('examp.txt','r')
print(f2.read()) #the 'read' method reads the file
f2.close()
You now have the ability to create a file, as well as read data from and write data to a file. The files source.txt, source2.txt, and source3.txt are provided here. You should upload them into the leftmost window in Repl.it for your code to reference them for a file read. You can then practice the examples on this page.
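The "read" method returns the whole file as one string. When a file should be processed one line at a time, one common approach is to loop over the file object itself. The sketch below first creates its own small file so that it can run on its own; the file name "lines_demo.txt" and its contents are assumptions made for this example, not one of the provided source files.

```python
# 'lines_demo.txt' is a hypothetical file name used only for this sketch.
fname = 'lines_demo.txt'

# Create a small file with three lines of text.
fout = open(fname, 'w')
fout.write('alpha\n')
fout.write('beta\n')
fout.write('gamma\n')
fout.close()

# Read the file back one line at a time.
lines = []
fin = open(fname, 'r')
for line in fin:
    # Each line still ends with its newline character, so strip it off.
    lines.append(line.rstrip('\n'))
fin.close()

print(lines)
```

Once you have uploaded one of the provided files, you could replace the file name in the read portion with that file's name and skip the part that creates the file.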
- Now that you're familiar with file input and output, read this for more on syntax and usage.
7.2: Visualizing Data from a File
The next project will require a couple of steps to set up. First, download these files. Then, start a new Repl.it session and either upload or 'drag and drop' these three files into the leftmost window. The "csv" file extension stands for comma-separated values. This is a common data file type that is readable by programs such as Excel. We do not need to grab the other data files as we will challenge the Repl.it graphics capability with these three files.
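To get a feel for what a comma-separated values file looks like before working with the downloaded data, here is a minimal sketch that writes a tiny CSV file and reads it back into two lists of numbers using Python's built-in "csv" module. The file name "demo.csv" and the numbers in it are invented for this illustration; the real data files may have a different number of columns or include a header row.

```python
import csv

# 'demo.csv' and its values are hypothetical, made up for this sketch.
fname = 'demo.csv'

# Each line of a CSV file is one row; commas separate the columns.
fhandle = open(fname, 'w')
fhandle.write('1.0,2.5\n')
fhandle.write('2.0,3.5\n')
fhandle.write('3.0,5.0\n')
fhandle.close()

x = []
y = []
# newline='' is the setting the csv module documentation recommends.
fhandle = open(fname, 'r', newline='')
for row in csv.reader(fhandle):
    # Every value arrives as a string, so convert it to a number.
    x.append(float(row[0]))
    y.append(float(row[1]))
fhandle.close()

print(x)
print(y)
```

Lists such as "x" and "y" are the kind of data that could then be handed to a plotting routine.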
Once you can see the data files listed in the Repl.it leftmost window, feel free to copy the code in the example provided into the Repl.it run window and run the code. This example is very instructive as it ties together the reading of multiple data files and the use of numpy combined with matplotlib introduced earlier in the course.
After the code runs, you should see a graph appear in the rightmost window. On the graphic, click on the resize box in the upper left-hand corner to resize the figure. Make sure you see plots similar to those given in the example.
After you download the files above, complete this exercise.
Study Session Video Review
Unit 7 Review and Assessment
In this video, course designer Eric Sakk walks through the major topics we covered in Unit 7. As you watch, work through the exercises to try them out yourself.
Take this assessment to see how well you understood this unit.
- This assessment does not count towards your grade. It is just for practice!
- You will see the correct answers when you submit your answers. Use this to help you study for the final exam!
- You can take this assessment as many times as you want, whenever you want.