Files

Reading Files

Up to now, the only way we’ve used large amounts of data in our code is to put the data directly into the code.

Let’s learn how to read data from files.

Let’s read from the file declaration-of-independence.txt.

>>> declaration_file = open('declaration-of-independence.txt')
>>> print(declaration_file.read())
>>> declaration_file.close()

First we open the file, then we read the contents of the file and print them out, then we close the file.

Let’s make a program file_stats.py that reads a given file and gives us statistics on its text.

filename = input('What is the name of the file you want to know about? ')

def print_file_stats(filename):
    stat_file = open(filename)
    contents = stat_file.read()
    stat_file.close()
    word_count = len(contents.split())
    print("Number of Words: {}".format(word_count))

print_file_stats(filename)

Let’s try it out:

$ python3.6 file_stats.py
What is the name of the file you want to know about? declaration-of-independence.txt
Number of Words: 1338

It works!

Note

Hey, what’s that def print_file_stats(filename) thing about? That’s a function! Functions allow us to put a bunch of code in one block, and call it later with a single line. It’s great for functionality you need to use over and over again, because it keeps you from having to repeat yourself. Learn more about functions in the bonus section.

Closing Files

We need to remember to always close our files. This isn’t as important when reading files, but it will be very important when writing files, because written data may not reach the file on disk until the file is closed.

This is such a common concern that Python has a special syntax for it, the with statement, and the open function supports it.

filename = input('What is the name of the file you want to know about? ')

def print_file_stats(filename):
    with open(filename) as stat_file:
        contents = stat_file.read()
    # No stat_file.close() needed: the with block closes the file on exit.
    word_count = len(contents.split())
    print("Number of Words: {}".format(word_count))

print_file_stats(filename)

This with block uses a context manager. Context managers allow us to ensure that particular cleanup tasks occur whenever a block of code is exited. Basically, as soon as our with block is exited, the stat_file file object is closed, even if an error occurred inside the block.
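To see the closing behaviour for yourself, here is a minimal sketch. It assumes a scratch file named example.txt in a temporary folder (a made-up name, not part of the workshop files) and checks the file object’s closed attribute inside and after the with block. The first few lines only create the scratch file; writing is covered in the Writing Files section.

```python
import os
import tempfile

# Setup only: create a hypothetical scratch file so the example runs anywhere.
path = os.path.join(tempfile.mkdtemp(), 'example.txt')
with open(path, mode='wt', encoding='utf-8') as example_file:
    example_file.write("Hello world!\n")

with open(path, encoding='utf-8') as example_file:
    contents = example_file.read()
    closed_inside = example_file.closed  # still open inside the block

closed_after = example_file.closed  # closed once the block is exited

print(closed_inside)  # False
print(closed_after)   # True
```

Notice that we never called close() ourselves.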

Don’t worry about understanding context managers fully. Just remember that from now on we will always use the with open syntax for opening files.

Mode and Encoding

The last things we’ll learn about are the mode and encoding arguments. By default, files are opened in read text mode ('rt'), and the encoding is the system default. That is “utf-8” on my machine, but it can be different on yours.

Let’s make our code a little more explicit about these values:

filename = input('What is the name of the file you want to know about? ')

def print_file_stats(filename):
    with open(filename, mode='rt', encoding='utf-8') as stat_file:
        contents = stat_file.read()
    # No stat_file.close() needed: the with block closes the file on exit.
    word_count = len(contents.split())
    print("Number of Words: {}".format(word_count))

print_file_stats(filename)
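If you are curious which encoding your own system uses when you leave the encoding argument out, this small sketch asks Python directly. It relies on the standard library’s locale module; the variable name default_encoding is just for illustration, and the value printed will depend on your machine.

```python
import locale

# The encoding that open() falls back to in text mode when no
# encoding argument is given. The result varies between systems.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)
```

Passing encoding='utf-8' explicitly, as in the program above, means the program does not rely on this system setting.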

Writing Files

Let’s make a program that writes to a file.

To write to a file, we need to open it with a w in the mode argument. Then we can use the write() method to write to the file:

>>> with open('test.txt', mode='wt', encoding='utf-8') as test_file:
...     test_file.write("Hello world!\n")
...
13

The write method on our file object writes every character we give it to the file. It returns the number of characters it wrote.
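One thing worth seeing in action: opening an existing file with w starts it over from empty. The sketch below assumes a scratch test.txt in a temporary folder (so it won’t touch any of your real files), writes to it twice, and then reads it back.

```python
import os
import tempfile

# Hypothetical scratch location so no real file is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')

with open(path, mode='wt', encoding='utf-8') as test_file:
    first_count = test_file.write("Hello world!\n")

# Opening with 'w' again empties the file before writing.
with open(path, mode='wt', encoding='utf-8') as test_file:
    second_count = test_file.write("Goodbye!\n")

with open(path, mode='rt', encoding='utf-8') as test_file:
    contents = test_file.read()

print(first_count)     # 13
print(second_count)    # 9
print(repr(contents))  # 'Goodbye!\n'
```

Only the second message is left in the file, because the second open with w replaced what was there.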

Let’s pause here a second so you can all try this on your own, and we’ll see what kind of problems arise.

And now for exercises!

Your Turn: Files 🏁

File Exercises

Contact Creator

Create a program my_contacts.py that allows the user to enter a name, email, phone number, and Twitter handle, and writes them to a file contacts.csv.

Make sure that it appends to the end of the file instead of overwriting the whole file so you can call the program more than once.

Example usage:

$ python3.6 my_contacts.py
What is the contact's name? Brenda
What is Brenda's email address? brenda@bebrenda.com
What is Brenda's phone number? 619-867-5309
What is Brenda's Twitter handle? @bebrenda

Output:

Brenda, brenda@bebrenda.com, 619-867-5309, @bebrenda

Tip

Opening a file with mode='a' allows you to append to a file, instead of writing over it.
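As a rough illustration of append mode (not a solution to the exercise), the sketch below assumes a scratch file called notes.txt in a temporary folder and opens it twice with mode='at'. Each write lands after whatever is already in the file.

```python
import os
import tempfile

# Hypothetical scratch file; the name notes.txt is only for illustration.
path = os.path.join(tempfile.mkdtemp(), 'notes.txt')

for line in ["first line\n", "second line\n"]:
    # 'a' creates the file if it is missing and adds to the end otherwise.
    with open(path, mode='at', encoding='utf-8') as notes_file:
        notes_file.write(line)

with open(path, mode='rt', encoding='utf-8') as notes_file:
    contents = notes_file.read()

print(repr(contents))  # 'first line\nsecond line\n'
```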

Tidier Capital Guesser

Update your program capital_guesser.py to use the file us-state-capitals.py instead of the list of lists of states and their capitals.

Nice Troll

Make a program nice_troll.py that asks for a file, reads it, and replaces any adjectives that appear in an angry_words list with a random adjective from a nice_words list.

Note

We’ve been asking for file names as input thus far, but you can also do this with sys.argv. If you’re ready for a new challenge, feel free to read the Command Line Arguments section in the bonus material.

ASCIIbetical Contacts

Add to your program my_contacts.py so it sorts the contacts ASCIIbetically after each new contact is added.

Tip

Hint: sorted() will be of use here.
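In case “ASCIIbetically” is new to you: sorted() compares strings character by character using their character codes, so every uppercase letter comes before every lowercase letter. A tiny sketch with a made-up list of names:

```python
# Made-up names, only to show the ordering sorted() produces.
names = ['brenda', 'Zed', 'alice', 'Brenda']
sorted_names = sorted(names)

print(sorted_names)  # ['Brenda', 'Zed', 'alice', 'brenda']
```

sorted() returns a new list and leaves the original names list unchanged.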

Present Moment Journal with Files

Update your present moment journal program to write the input from the user to a file.