Eric Brauer
Being able to read and write files is probably one of the most important things you can do.
/etc
as you have discovered in previous
semesters.In this course we will focus on plaintext files
because they are a universal format. If you decided to, for example,
write a Python script to work with a proprietary format such as
.docx
you will notice that it comes with a lot of pain!
All file operations have (at least) three major components:
A file object requires a valid filename, and will open that filename for a specified operation: reading from, writing into, or append into.
f
breaks that rule, but since file
operations are so common, any experienced Python programmer will know
what it is referring to. Feel free to name it something else, if you
like.'datafile'
is a filename relative to
your currently running script. If you want to be extra safe, use an
absolute filepath.'r'
specifies that you are opening this
file for a read operation. Note that the actual reading will
only occur in the next step.Now that we’ve created a file object called f
, we can
use one of its methods to read all data in the file and save it
as a string.
The simplest method is to read a file, and save it to a string object. This will take all the data and combine into one string, regardless of how many lines there are.
This closes the file. f
will no longer be usable. While
not technically necessary, it’s bad form to leave a file open. It can
cause data corruption if things go wrong!
If the data we’ve read has multiple lines, we may want to take that
string and convert it to a list. Making each line an item in the list is
common. When data is read from a text file, the separate lines you see
in vi are coded using an invisible control character, \n
.
We’ll use that to separate the string using the .split()
method.
We can also read data straight into a list object, bypassing that conversion. This leads to cleaner code and faster run times. There are two major methods:
Output:
['First line\n', 'Second line\n', 'Third line\n']
There are two operations that will write to a file. It’s vitally important to understand the difference between them:
f = open('data.txt', 'w')
w
option opens a file and immediately erases everything
inside, before writing what the code is told to do later on.
(i.e. overwrite)
f = open('data.txt', 'a')
a
option appends data to the bottom of the file without
erasing that file’s contents.
Writing to a file is similar to reading. There’s an open operation, the write operation(s), and the close file operation.
f = open('file2.txt', 'w')
f.write('Line 1\nLine 2 is a little longer\nLine 3 is as well\n')
f.write('This is the 4th line\n')
f.write('Last line in file\n')
f.close()
Notice that there are several write operations. When using the ‘w’ option in the open, the file only gets overwritten at that point. As the file is now open, you can continue to add lines and it won’t erase the previous ones you just added.
Only strings are accepted when writing to a file. You must convert any other object type to a string before starting the write operation. This includes extracting items from objects like lists or dictionaries.
An exception is a type of error that occurs during the run of a Python program. These are separate from syntax errors, which will prevent the script from running.
Often an exception is the best outcome: rather than doing something ambiguous, the Python interpreter is telling you that an unexpected result might occur. It forces you to be more specific in your instructions.
print('5' + 10)
---------------------------------------------------------
Traceback (most recent call last)
# Field "<stdin>", line 1, in <module>
TypeError: Can't convert 'int' object to str implicitly
The exception here is caused by a mismatch in types. The
"+
" sign has different meaning when using integers
vs. strings.
Are you expecting concatenation (510
) or
addition (15
)?
Rather than do the wrong thing, Python forces you to make the choice, and add conversion to your code.
Up until now, you’ve mostly seen exceptions occur because of your code. But plenty of things can go wrong that have nothing to do with your code!
For example, what can go wrong with a file operation? The given filename could be wrong, you might not permission to open it, the drive might not be mounted, data might be corrupt, etc.
By implementing error handling, we can prevent these issues from causing worse errors down the line.
Python exceptions tend to be cryptic and use technical language to describe a problem. Our error handling can be as simple as printing a more user-friendly message to tell the user what has occurred, and how they can fix the issue.
Handling an error is more than just customizing the exception message. You can create code that will run only if a specific error occurs, possibly allowing you to fix the problem on the fly.
In this statement, python will attempt the code in the
try
block. If any line of code in the try
block fails, it will skip the rest of it and jump to the code in the
except
block.
This isn’t a great solution because the error message has no useful information. How will the user fix the problem?
We can also test for exact errors, known as types, or concrete exceptions. Many are built-in to the interpreter. (Refer to the References slide for a list.)
try:
f = open('not_a_file.txt', 'r')
except FileNotFoundError:
print('File not found! Please check your spelling and try again.')
exit()
In this statement, we will only enter the except
block
if the file is not found. This allows us to provide a more
descriptive error message for the user.
Please note: unhandled exceptions will
terminate the execution of your code. Using exception
handling will allow the code to continue running, unless you
specifically call something like exit()
.
try:
f = open('not_a_file.txt', 'r')
data = f.read()
f.close()
except FileNotFoundError:
print('File not found! Please check your spelling and try again.')
print(data)
Hint: in what case will the last line of code fail?
There are many built-in types. Here are a few examples:
It’s important to understand what the most common exception types are triggered by.
Many types have sub-types. As you get more specific, it’s important to keep the type hierarchy in mind. Here’s an example:
+-- OSError
| +-- BlockingIOError
| +-- ChildProcessError
| +-- ConnectionError
| | +-- BrokenPipeError
| | +-- ConnectionAbortedError
| | +-- ConnectionRefusedError
| | +-- ConnectionResetError
| +-- FileExistsError
| +-- FileNotFoundError
| +-- InterruptedError
| +-- IsADirectoryError
| +-- NotADirectoryError
| +-- PermissionError
| +-- ProcessLookupError
| +-- TimeoutErrorOSError
We can combine many except
blocks, allowing us to cover
specific errors, and then anything else. A bare except should
always go last.
Just like a conditional statement, the first except that matches the error will be path you take, just as how the first condition in an if statement that returns true. Always start with most specific, and then go more general.
Reading and Writing to Files Handling Errors and Exceptions Built-In Exceptions