ENGI E1006: Introduction to Computing for Engineers and Applied Scientists


Lecture 5: Lists, Sets, and Dictionaries

Reading: Punch and Enbody Chapters 7 and 9

Lists

Recall: A list is a built-in Python collections type used for storing objects in sequence. The truth is that the list doesn't really store the objects themselves but rather a sequence of object references to objects. This is an important distinction that we want to keep in mind.

We can use the same built-in functions on lists that we used on strings. So len, max, and min, when they make sense, will work on lists. Like strings, lists also have many methods of their own. Unlike strings, lists are mutable. What does this mean? It means that you can change a list. For example:

In [1]:
l=[1,2,3]
l[1]=5
l
Out[1]:
[1, 5, 3]

Notice the list, l, has been changed without reassigning the object reference l. Instead we only changed the middle term l[1]. Contrast this with strings. So what? Well so here's something that could never happen with strings:

In [2]:
l2=l
l[0]=100
l2
Out[2]:
[100, 5, 3]

Unlike with strings or any other immutable object, the list l2 changed when we changed l. This is because these variables are really just object references providing the location of the list object in memory. So when we do something to a variable, we're really just following the reference to the object and doing something to the object. When you set two object references to refer to the same object, as with l2=l, whatever you do to one will also affect the other. That's because they refer to the same object. There is only one list. It has two object references, l and l2, that both reference it.

So why didn't this happen with strings or numeric types? Because they are immutable. We could never actually change the objects their variables referred to. We could only reassign the variables to new objects. If a variable is reassigned then it refers to an entirely different object. When we execute l[0]=100 above we are reassigning the first object reference in the list l but not the object reference l itself. So l still points to the same object, namely a container for 3 other object references. We've just switched out the first of those three object references. This affects l2 because it is also referencing the same container.
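One way to check whether two names reference the same object is the is operator. The following is a minimal sketch (the name l3 and the slice copy are illustrative additions, not part of the lecture's examples) contrasting an alias with a separate copy:

```python
# Two names, one list: l2 is just another reference to the same object.
l = [1, 5, 3]
l2 = l
alias_is_same = l2 is l    # True: both names reference one list object

# A full slice builds a new list that holds the same item references.
l3 = l[:]
copy_is_same = l3 is l     # False: l3 is a separate container

l[0] = 100                 # mutate the list through the name l
print(l2)                  # [100, 5, 3]  the alias sees the change
print(l3)                  # [1, 5, 3]    the copy does not
```

If you need a list that will not change when the original does, make a copy rather than a second reference.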

list methods: So we have established that lists are mutable. We've also seen that we can mutate (change) a list by reassigning one of the object references contained inside the list. We do this using the familiar array notation l[i]. What else you got? Here's where the methods that come with the list type really shine. There are two kinds of methods in the list class: methods that change the list and methods that don't change the list. See your textbook pages 292 and 293 for a partial list of these methods. Follow the link for a more complete treatment.
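As a rough illustration of the two kinds of methods, here is a short sketch using a few standard list methods (this selection is ours and is not the textbook's table):

```python
l = [3, 1, 6, 2]

# Methods that change the list in place
l.append(9)            # l is now [3, 1, 6, 2, 9]
last = l.pop()         # removes and returns 9; l is [3, 1, 6, 2] again
l.reverse()            # l is now [2, 6, 1, 3]

# Methods that only report on the list and leave it unchanged
position = l.index(6)  # index of the first 6, which is 1
ones = l.count(1)      # how many times 1 appears, which is 1

print(l, last, position, ones)
```

Notice that append and reverse return nothing useful; their whole effect is the change they make to the list.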

One important nuance to be fully cognizant of is the difference between the function sorted and the list method sort. Ask yourself right now whether you understand the difference. How do you use them? How do they behave differently? Why? Let's try them out.

In [3]:
l=[3,1,6,2]
sorted(l)
l
Out[3]:
[3, 1, 6, 2]

What just happened? It would appear that sorted did nothing. In fact, sorted returns a brand new list with the same items as l, only in sorted order. We just didn't keep track of what the sorted function returned. Let's try again.

In [4]:
l=[3,1,6,2]
l2=sorted(l)
l2
Out[4]:
[1, 2, 3, 6]

So l2 is a different list that the function returned. Its elements are different object references referring to the same numerical objects.

Contrast this with the method sort which actually changes the list. Observe

In [5]:
l=[3,1,6,2]
l.sort()
l
Out[5]:
[1, 2, 3, 6]

Make sure you understand what just happened and why it's important. I'll test you on it for sure!
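A side-by-side sketch may help. It also shows a common pitfall that follows from the difference: sort changes the list in place and returns None, so assigning its result to a variable throws the list away.

```python
l = [3, 1, 6, 2]

new_list = sorted(l)   # returns a new sorted list; l is untouched
print(new_list, l)     # [1, 2, 3, 6] [3, 1, 6, 2]

result = l.sort()      # sorts l in place and returns None
print(result, l)       # None [1, 2, 3, 6]
```

So writing l = l.sort() leaves l set to None, which is almost never what you want.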

Dictionaries

A dictionary is a Python collections type where arbitrary (immutable) keys are mapped to values. What does that mean? Well, think about a list. A list is a collection of values that are indexed by their position in the list, namely an integer. A dictionary is a collection of stuff, but not necessarily indexed by integers. Another way to think about dictionaries is as a collection of key-value pairs. The keys must be unique and they each map to some value. Here are some examples:

In [6]:
d={} # This makes an empty dictionary
d['apple']='red'
d['cherry']='red'
d['banana']='yellow'
d
Out[6]:
{'apple': 'red', 'banana': 'yellow', 'cherry': 'red'}

So what we created was a collection of fruits (the keys) together with their colors (the values). Both were represented as strings. Notice the values did not need to be unique, but the keys do. Dictionaries are incredibly useful. Think about the contact information in your phone. An individual contact could be maintained as a dictionary. What would some of the keys be? Maybe name, phone, and email.

In [7]:
contact={}
contact['name']='Joe Lion'
contact['phone']=2125551212
contact['email']='joe.lion@columbia.edu'
#now we can access the value associated with any key in contact using the usual array notation
contact['phone']
Out[7]:
2125551212

We can use integers, strings, or even tuples as keys in dictionaries. The values can be anything and don't have to be unique. Dictionaries are iterable (they iterate through keys) and mutable. I can change the value associated with a key at any time just by reassigning it. There are many useful features of the dict type (dict is its official Python name, like int for integer). You can read about these here. To get the keys use the method keys(). To get the values use the method values(). In Python 3 these return view objects rather than lists, as the output shows; wrap one in list() if you need an actual list.

In [8]:
d.keys()
Out[8]:
dict_keys(['apple', 'cherry', 'banana'])
In [9]:
d.values()
Out[9]:
dict_values(['red', 'red', 'yellow'])
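The points above about mutability, tuple keys, and iteration can be sketched as follows. The tuple key ('granny', 'smith') and the name key_list are made up for illustration.

```python
d = {'apple': 'red', 'cherry': 'red', 'banana': 'yellow'}

d['apple'] = 'green'               # reassign the value for an existing key
d[('granny', 'smith')] = 'green'   # a tuple is immutable, so it can be a key

# Iterating over a dictionary visits its keys
for fruit in d:
    print(fruit, '->', d[fruit])

key_list = list(d.keys())          # turn the keys view into an actual list
has_kiwi = 'kiwi' in d             # the in operator tests the keys
```

A list could not be used as a key here because lists are mutable.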

Sets

Sets are another Python collections type. Sets are used to model the mathematical set, that is, an unordered collection of stuff with no repeats. We create sets like this:

In [10]:
a={1,2,3}
a
Out[10]:
{1, 2, 3}

To create the empty set we use the function set() since {} is already used to create a dictionary. Sets are iterable but they are not ordered in any particular way. All of the usual set operations from math are present in Python:

  • intersection: a & b or a.intersection(b)
  • union: a | b or a.union(b)
  • difference: a-b or a.difference(b)
  • symmetric difference: a^b or a.symmetric_difference(b)

Also the usual set relations are present in Python:

  • superset: a >= b or a.issuperset(b)
  • subset: a<=b or a.issubset(b)

There's more. You can learn more about sets in your text and by reading here
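Here is a small sketch of these operations and relations on two example sets (the particular sets a and b are chosen just for illustration):

```python
a = {1, 2, 3}
b = {3, 4}

common = a & b            # intersection: elements in both
everything = a | b        # union: elements in either
only_a = a - b            # difference: elements in a but not in b
exactly_one = a ^ b       # symmetric difference: elements in exactly one

contains = a >= {1, 2}    # True: a is a superset of {1, 2}
inside = {3} <= b         # True: {3} is a subset of b

empty = set()             # {} would create an empty dictionary instead
no_repeats = set([1, 1, 2, 2, 3])   # duplicates collapse to one copy each
```

Because sets are unordered, compare them with == rather than relying on the order in which they print.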

Image Editor Project

Reading: Homework 3 Assignment

In this project we are using text files to represent images. The GIMP image editor may be used to view these files. We use the extension ppm to indicate that these text files represent images.

Object Filter: Read the assignment sheet in courseworks. The objectfilter effect is local to the RGB values of each file, which makes it very simple to implement. For any specific RGB value, simply compare the value across all of the given files and write the value that appears in the majority of the input files to the new output file. What's new here is the use of the *filter_files parameter in the function definition. The * before the parameter name means that the `filter_files` variable will be of type `tuple` and will be populated with as many arguments as the caller provides beyond the first three. This is called a variable length positional parameter and must always come after the regular named positional parameters. It allows a function call to have a variable number of arguments.
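To see how a starred parameter behaves, consider this minimal sketch. The functions majority_value and describe are hypothetical helpers written for illustration; they are not the signature required by the assignment. Python collects the extra positional arguments into a tuple.

```python
from collections import Counter

def majority_value(*values):
    # values is a tuple holding every positional argument that was passed
    most_common_value, count = Counter(values).most_common(1)[0]
    return most_common_value

def describe(required, *extras):
    # required takes the first argument; extras collects all the rest
    return required, extras

print(majority_value(12, 200, 12))            # 12
print(describe('out.ppm'))                    # ('out.ppm', ())
print(describe('out.ppm', 'a.ppm', 'b.ppm'))  # ('out.ppm', ('a.ppm', 'b.ppm'))
```

In the object filter, a comparison like majority_value would be applied to the values found at the same position in each input file.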

Shades of Gray: This effect is only slightly less local than the object filter. Instead of only focusing on each individual RGB value now we must focus on each pixel's three RGB values. So we basically just copy the input file except instead of writing the exact RGB values to the new file we average them over each pixel.
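For a single pixel the computation might look like the sketch below. The helper name gray_pixel is made up, and the use of integer division is an assumption; check the assignment sheet for the rounding it expects.

```python
def gray_pixel(r, g, b):
    # Integer average, so the result is still a whole-number color value
    average = (r + g + b) // 3
    # A gray pixel has the same value in all three components
    return average, average, average

print(gray_pixel(30, 60, 90))   # (60, 60, 60)
```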

Negate Red: This effect is also easy to implement. Let Max_x be the maximum color value found on the third line of the file. Now in the new file, copy the original file exactly except for the red component values. For those, if x_r is the original value, write (Max_x-x_r) to the new file. That's it. You can do the same for the negate green and negate blue effects.
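The per-pixel arithmetic can be sketched as follows, assuming each pixel has already been read into three integers. The helper name negate_red and its parameters are hypothetical.

```python
def negate_red(pixel, max_value):
    r, g, b = pixel
    # Only the red component is flipped; green and blue are copied as-is
    return (max_value - r, g, b)

print(negate_red((200, 10, 20), 255))   # (55, 10, 20)
```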

Mirror: This is probably the toughest effect to implement. Do this one last. For this effect you must read in an entire line of the image (not the same as a single line in the ppm file). When writing the line to the new file you must put the last pixel first, the second to last second, etc. Remember, pixels are represented by groups of three numbers so it's not the same as just reversing the order of the numbers.
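One possible way to reverse a row pixel by pixel is sketched below. It assumes the row has already been read into a flat list of integers; the helper name mirror_row is made up for illustration.

```python
def mirror_row(values):
    # Group the flat list of numbers into [r, g, b] pixels
    pixels = [values[i:i + 3] for i in range(0, len(values), 3)]
    # Reverse the order of the pixels, not of the individual numbers
    pixels.reverse()
    # Flatten back into a single list of numbers
    return [value for pixel in pixels for value in pixel]

row = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # three pixels
print(mirror_row(row))              # [7, 8, 9, 4, 5, 6, 1, 2, 3]
```

Compare this with row[::-1], which would give [9, 8, 7, ...] and scramble the colors within each pixel.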

Lecture 14: More on Text Files

Reading: Punch and Enbody Chapters 5 & 14

Consider the text file coffee.txt listed here:

Dark Roast
30.0
The Good Stuff
100.0
Kona
50.0
Super Duper
20.0

The file lists the inventory for a coffee shop in a specific format. It lists the coffee name and then on the next line the pounds of coffee of that type left. In the following examples we will learn how to search and modify that file.

Example

show_coffee_records.py

In [12]:
# This program displays the records in the
# coffee.txt file.

def main():
    # Open the coffee.txt file.
    coffee_file = open('coffee.txt', 'r')

    # Read the first record's description field.
    descr = coffee_file.readline()

    # Read the rest of the file.
    while descr != '':
        # Read the quantity field.
        qty = float(coffee_file.readline())

        # Strip the \n from the description.
        descr = descr.rstrip('\n')

        # Display the record.
        print('Description:', descr)
        print('Quantity:', qty)

        # Read the next description.
        descr = coffee_file.readline()

    # Close the file.
    coffee_file.close()

# Call the main function.
main()
Description: Dark Roast
Quantity: 30.0
Description: The Good Stuff
Quantity: 100.0
Description: Kona
Quantity: 50.0
Description: Super Duper
Quantity: 20.0

What just happened?
Just as before, we first create a file object using the open function. We then read the first line of the file and store it in the variable descr. Then we enter a while loop under the condition that the descr variable is not empty. We now read the next line, which our inventory format dictates will be numeric, so we store it as a float. We strip the special character sequence \n from descr and then print the two lines we have read so far. Finally we read the next line of the file. When we call readline() at the end of the file, the method returns the empty string, which leads to the while loop condition evaluating to false. Finally we close the file.
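The same read loop can be packaged as a function that returns the records instead of printing them. This is a sketch under a few assumptions: the function name read_records, the sample file name, and the sample data are all made up, and it uses a with block (which closes the file automatically) in place of the explicit close() call.

```python
import os
import tempfile

def read_records(path):
    records = []
    with open(path, 'r') as coffee_file:   # the file is closed when the block ends
        descr = coffee_file.readline()
        while descr != '':                 # readline() returns '' at end of file
            qty = float(coffee_file.readline())
            records.append((descr.rstrip('\n'), qty))
            descr = coffee_file.readline()
    return records

# Build a small sample file in a temporary directory so the sketch is self-contained
sample_path = os.path.join(tempfile.mkdtemp(), 'coffee_sample.txt')
with open(sample_path, 'w') as sample_file:
    sample_file.write('Dark Roast\n30.0\nKona\n50.0\n')

records = read_records(sample_path)
print(records)   # [('Dark Roast', 30.0), ('Kona', 50.0)]
```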

So what if we only want to search for a single record? Then we'll have to get input from the user for which record we're searching for, then we'll read lines until one of the descriptions matches. We'll only print data for matching descriptions.

Example

search_coffee_records.py

In [13]:
# This program allows the user to search the
# coffee.txt file for records matching a
# description.

def main():
    # Create a bool variable to use as a flag.
    found = False

    # Get the search value.
    search = input('Enter a description to search for: ')

    # Open the coffee.txt file.
    coffee_file = open('coffee.txt', 'r')

    # Read the first record's description field.
    descr = coffee_file.readline()

    # Read the rest of the file.
    while descr != '':
        # Read the quantity field.
        qty = float(coffee_file.readline())

        # Strip the \n from the description.
        descr = descr.rstrip('\n')

        # Determine whether this record matches
        # the search value.
        if descr == search:
            # Display the record.
            print('Description:', descr)
            print('Quantity:', qty, '\n')
            # Set the found flag to True.
            found = True

        # Read the next description.
        descr = coffee_file.readline()

    # Close the file.
    coffee_file.close()

    # If the search value was not found in the file
    # display a message.
    if not found:
        print('That item was not found in the file.')

# Call the main function.
main()
Enter a description to search for: kona
That item was not found in the file.

What just happened?
The pattern is similar to the simple show_coffee_records.py program except this time we ask the user for a description and store it in a variable search. Then, we proceed in the same way but only print the data when the description line in the inventory file matches the search term. It's as simple as putting an if statement before the print statements.
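The sample run reports that kona was not found even though the file listing includes Kona. That is consistent with == being case-sensitive for strings. If a case-insensitive search were wanted, one possible variation is to compare lowercase copies, as in this sketch (the helper name matches is hypothetical and is not part of the program above):

```python
def matches(descr, search):
    # Compare lowercase copies so that 'kona' matches 'Kona'
    return descr.lower() == search.lower()

exact = ('Kona' == 'kona')         # False: == distinguishes upper and lower case
relaxed = matches('Kona', 'kona')  # True
print(exact, relaxed)
```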

So now what if we want to delete certain types of coffee? Hmmm, so this means changing the file, right? Well, not exactly. What we'll do is create a new file named temp.txt. We will proceed by copying the old file to temp.txt record by record except for the record matching the search description. Then at the end, we will replace the old coffee.txt with temp.txt by first deleting coffee.txt and then renaming temp.txt to coffee.txt. To do this we'll need to use the Python module os.

Example

delete_coffee_records.py

In [14]:
# This program allows the user to delete
# a record in the coffee.txt file.

import os  # Needed for the remove and rename functions

def main():
    # Create a bool variable to use as a flag.
    found = False

    # Get the coffee to delete.
    search = input('Which coffee do you want to delete? ')
    
    # Open the original coffee.txt file.
    coffee_file = open('coffee.txt', 'r')

    # Open the temporary file.
    temp_file = open('temp.txt', 'w')

    # Read the first record's description field.
    descr = coffee_file.readline()

    # Read the rest of the file.
    while descr:
        # Read the quantity field.
        qty = float(coffee_file.readline())

        # Strip the \n from the description.
        descr = descr.rstrip('\n')

        # If this is not the record to delete, then
        # write it to the temporary file.
        if descr != search:
            # Write the record to the temp file.
            temp_file.write(descr + '\n')
            temp_file.write(str(qty) + '\n')
        else:
            # Set the found flag to True.
            found = True

        # Read the next description.
        descr = coffee_file.readline()

    # Close the coffee file and the temporary file.
    coffee_file.close()
    temp_file.close()

    # Delete the original coffee.txt file.
    os.remove('coffee.txt')

    # Rename the temporary file.
    os.rename('temp.txt', 'coffee.txt')

    # If the search value was not found in the file
    # display a message.
    if found:
        print('The file has been updated.')
    else:
        print('That item was not found in the file.')

# Call the main function.
main()
Which coffee do you want to delete? Dark Roast
The file has been updated.

What just happened?
We proceeded exactly as described above. We used a very similar pattern as we did in the search example with a few interesting differences. First, notice we needed to import the os module. This is so we could use the functions remove and rename found in that module. Second, notice we created two separate file objects, one for reading and one for writing. Finally, notice the while loop condition. It's simply the descr variable. How does that have a True or False value? Well, it turns out that when expecting a boolean, Python will interpret an empty string as False and a nonempty string as True. This is very useful in instances like this. Indeed, that's part of the reason this convenient shortcut exists!
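You can check this behavior directly with the built-in bool function. A quick sketch (the variable names are just for illustration):

```python
empty_is_true = bool('')            # False: the empty string counts as False
text_is_true = bool('Kona\n')       # True: any nonempty string counts as True
blank_line_is_true = bool('\n')     # True: a blank line still holds a newline

print(empty_is_true, text_is_true, blank_line_is_true)   # False True True
```

The last case matters for the loop: a blank line in the middle of the file does not end it, since only the empty string returned at end of file is treated as False.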

Finally let's look at a program that can actually modify a record in the inventory file. The pattern of the program is nearly identical to the delete example only instead of just not writing the record when a match is found, we will write in the new quantity.

Example

modify_coffee_record.py

In [15]:
# This program allows the user to modify the quantity
# in a record in the coffee.txt file.

import os  # Needed for the remove and rename functions

def main():
    # Create a bool variable to use as a flag.
    found = False

    # Get the search value and the new quantity.
    search = input('Enter a description to search for: ')
    new_qty = float(input('Enter the new quantity: '))
    
    # Open the original coffee.txt file.
    coffee_file = open('coffee.txt', 'r')

    # Open the temporary file.
    temp_file = open('temp.txt', 'w')

    # Read the first record's description field.
    descr = coffee_file.readline()

    # Read the rest of the file.
    while descr:
        # Read the quantity field.
        qty = float(coffee_file.readline())

        # Strip the \n from the description.
        descr = descr.rstrip('\n')

        # Write either this record to the temporary file,
        # or the new record if this is the one that is
        # to be modified.
        if descr == search:
            # Write the modified record to the temp file.
            temp_file.write(descr + '\n')
            temp_file.write(str(new_qty) + '\n')
            
            # Set the found flag to True.
            found = True
        else:
            # Write the original record to the temp file.
            temp_file.write(descr + '\n')
            temp_file.write(str(qty) + '\n')

        # Read the next description.
        descr = coffee_file.readline()

    # Close the coffee file and the temporary file.
    coffee_file.close()
    temp_file.close()

    # Delete the original coffee.txt file.
    os.remove('coffee.txt')

    # Rename the temporary file.
    os.rename('temp.txt', 'coffee.txt')

    # If the search value was not found in the file
    # display a message.
    if found:
        print('The file has been updated.')
    else:
        print('That item was not found in the file.')

# Call the main function.
main()
Enter a description to search for: The Good Stuff
Enter the new quantity: 50
The file has been updated.

What just happened?

We proceeded as outlined above, almost identically to the delete example, except that whenever we found our search term we wrote the new quantity to the temporary file. Notice this will work for every occurrence of the search term.