Homework 9

Due: Tuesday, April 29 at 11:59 pm

Submission: On Courseworks

Instructions

Please read the instructions carefully. An incorrectly formatted submission may not be counted.

There are several questions in this assignment. This assignment starts from skeleton code: eda.py and oop.py. Please include comments in your code detailing your thought process where appropriate. Put these files in a folder called uni-hw9 (so for me this would be tkp2108-hw9). Then compress this folder into a .zip file. On most operating systems, you can right-click the folder and choose a “compress” option from the context menu. This should create a file called tkp2108-hw9.zip (with your UNI in place of mine). Submit this file on Courseworks.
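If the context-menu option is not available, the same folder-and-zip layout can be produced from Python. The following is only a sketch: the helper name make_submission is hypothetical, the files it copies are empty placeholders, and tkp2108 stands in for your own UNI.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def make_submission(uni: str, source_files: list[Path], out_dir: Path) -> Path:
    """Copy the given files into <uni>-hw9/ and zip that folder (hypothetical helper)."""
    folder = out_dir / f"{uni}-hw9"
    folder.mkdir(parents=True, exist_ok=True)
    for src in source_files:
        shutil.copy(src, folder / src.name)
    # root_dir/base_dir make the archive contain the folder itself,
    # not just the files inside it.
    archive = shutil.make_archive(
        str(out_dir / folder.name), "zip", root_dir=out_dir, base_dir=folder.name
    )
    return Path(archive)


# Demo in a throwaway directory, using empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "work"
    out = Path(tmp) / "out"
    work.mkdir()
    out.mkdir()
    for filename in ("eda.py", "oop.py"):
        (work / filename).write_text("# placeholder\n")

    archive_path = make_submission("tkp2108", [work / "eda.py", work / "oop.py"], out)
    archive_name = archive_path.name
    with zipfile.ZipFile(archive_path) as zf:
        names = sorted(zf.namelist())

print(archive_name)
print(names)
```

Listing the archive contents before uploading is a quick way to confirm that both files sit inside the uni-hw9 folder rather than at the top level of the zip.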

Exploratory Data Analysis (E.D.A.)

In the first part of this homework, we will do some exploratory data analysis, or E.D.A. for short.

We’ll be using a dataset called “Wisconsin Diagnostic Breast Cancer”. This is a collection of 30 measurements of 569 breast cancer tumors. These measurements are numerical calculations of things like tumor radius, area, perimeter, and more.

For each set of measurements, there is also a tumor outcome: B for benign or M for malignant.

Additionally, there is an ID. We will drop this column for our analysis.
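As a rough illustration of reading a CSV and dropping an identifier column with pandas, here is a sketch on a tiny inline stand-in for the real file. The column name id and the three rows are assumptions made up for the example; check the header of the actual dataset before relying on them.

```python
import io

import pandas as pd

# Tiny stand-in for the real file; column names and values are assumptions.
csv_text = """id,diagnosis,radius_mean,texture_mean
101,M,17.99,10.38
102,B,13.54,14.36
103,B,12.45,15.70
"""

df = pd.read_csv(io.StringIO(csv_text))
df = df.drop(columns=["id"])  # "id" is the assumed name of the ID column

print(df.columns.tolist())
print(df["diagnosis"].value_counts())
```

With a real file, io.StringIO(csv_text) would be replaced by the path to the CSV.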

You can download the dataset here.

For this part, you have to implement two analysis functions with pandas and two analysis functions with matplotlib/seaborn. One of each is well-defined; the other is for you to implement however you want. Note that you can change the arguments for customPandasAnalysis and customPlot. Please provide comments about what you are doing, and note that creativity counts.
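To make "the mean of the columns by diagnosis" concrete, here is a sketch on a four-row toy frame (the values are invented). It shows two pandas idioms that give the same per-group means, pivot_table and groupby; it is not necessarily the approach the graders expect.

```python
import pandas as pd

# Invented toy data: two benign rows and two malignant rows.
toy = pd.DataFrame(
    {
        "diagnosis": ["B", "B", "M", "M"],
        "radius_mean": [12.0, 14.0, 18.0, 20.0],
        "texture_mean": [15.0, 17.0, 21.0, 23.0],
    }
)

columns = ["radius_mean", "texture_mean"]

# Option 1: pivot_table with the grouping column as the index.
by_pivot = toy.pivot_table(index="diagnosis", values=columns, aggfunc="mean")

# Option 2: groupby, then select the columns and average them.
by_group = toy.groupby("diagnosis")[columns].mean()

print(by_pivot)
print(by_group)
```

Both results are indexed by diagnosis, with one row per outcome and one column per requested measurement.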

Skeleton Code

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def load_data(file_path):
    # read the csv file, dropping the id column
    pass

def pivotByDiagnosisAndShowMean(df, columns):
    # pivot the dataframe by diagnosis and show the mean of the columns
    pass

def customPandasAnalysis(df):
    # TODO: Note you can change the arguments and usage as needed
    pass

def showPairPlot(df, columns):
    # show the pair plot of the columns
    pass

def customPlot(df):
    # TODO: Note you can change the arguments and usage as needed
    pass


if __name__ == "__main__":
    df = load_data("wdbc.csv")
    print(df.head())
    assert len(df.columns) == 31
    assert df.columns.tolist() == [
        "diagnosis",
        "radius_mean",
        "radius_worst",
        "radius_se",
        "texture_mean",
        "texture_worst",
        "texture_se",
        "perimeter_mean",
        "perimeter_worst",
        "perimeter_se",
        "area_mean",
        "area_worst",
        "area_se",
        "smoothness_mean",
        "smoothness_worst",
        "smoothness_se",
        "compactness_mean",
        "compactness_worst",
        "compactness_se",
        "concavity_mean",
        "concavity_worst",
        "concavity_se",
        "concavepoints_mean",
        "concavepoints_worst",
        "concavepoints_se",
        "symmetry_mean",
        "symmetry_worst",
        "symmetry_se",
        "fractaldimension_mean",
        "fractaldimension_worst",
        "fractaldimension_se",
    ]

    # show the mean of the columns grouped by diagnosis
    mean_df = pivotByDiagnosisAndShowMean(df, ["radius_mean", "smoothness_mean", "texture_mean"])
    print(mean_df)

    # show the pair plot of the columns
    showPairPlot(df, ["radius_mean", "perimeter_mean", "smoothness_mean"])

    # show the custom pandas analysis
    # customPandasAnalysis(df)

    # show the custom plot
    # customPlot(df)

Example Output

  diagnosis  radius_mean  radius_worst  radius_se  ...  symmetry_se  fractaldimension_mean  fractaldimension_worst  fractaldimension_se
0         M        17.99        122.80      10.38  ...       0.6656                 0.2654                 0.11890               0.4601
1         M        20.57        132.90      17.77  ...       0.1866                 0.1860                 0.08902               0.2750
2         M        19.69        130.00      21.25  ...       0.4245                 0.2430                 0.08758               0.3613
3         M        11.42         77.58      20.38  ...       0.8663                 0.2575                 0.17300               0.6638
4         M        20.29        135.10      14.34  ...       0.2050                 0.1625                 0.07678               0.2364

[5 rows x 31 columns]
           radius_mean  smoothness_mean  texture_mean
diagnosis
B            12.146524         2.000321    462.790196
M            17.462830         4.323929    978.376415

Object Oriented Programming (OOP)

In this part of the homework, we will create a zoo management system using object-oriented programming principles. The system will include classes for animals, exhibits, and the zoo itself. The main features of the system can be inferred from the skeleton code and example output below.

Skeleton code and an example run are provided. Note that this is an exercise in reading and understanding example code, and reverse-engineering the solution. Our tester code will vary slightly from what is provided below.

N.B. Remember that you can call your parent's method with super().<the method>(), and if a class doesn't need any details of its own you can use pass, e.g.

class Parent:
    def __init__(self, name):
        self.name = name

class Child(Parent):
    pass
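The snippet above shows only the pass case. As a sketch of the super() case, the classes below extend the same Parent with an extra attribute and an overridden method; the describe method, the age attribute, and the Sibling class are hypothetical additions for illustration.

```python
class Parent:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return self.name


class Child(Parent):
    def __init__(self, name, age):
        super().__init__(name)  # let Parent set self.name
        self.age = age

    def describe(self):
        # Reuse the parent's result and add to it.
        return f"{super().describe()} (age {self.age})"


class Sibling(Parent):
    pass  # inherits __init__ and describe unchanged


kid = Child("Ada", 7)
sib = Sibling("Bo")

print(kid.describe())
print(sib.describe())
print(isinstance(kid, Parent), isinstance(sib, Child))
```

isinstance follows the inheritance chain, so an instance of a subclass also counts as an instance of every class above it, but not of its sibling classes.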

Skeleton Code

def main():
    # Create a new zoo
    my_zoo = Zoo()

    # Create some animals
    lion = Lion("Leo", 5)
    tiger = Tiger("Tiggy", 3)
    elephant = Elephant("Dumbo", 10)
    zebra = Zebra("Zara", 4)
    zebra_2 = Zebra("Zuzu", 3)

    # Add animals to the zoo
    my_zoo.add_animal(lion)
    my_zoo.add_animal(tiger)
    my_zoo.add_animal(elephant)
    my_zoo.add_animal(zebra)
    my_zoo.add_animal(zebra_2)

    print(f"My Zoo: {my_zoo}")
    print(f"Number of animals in the zoo: {len(my_zoo)}")
    print("")

    # Duplicates not added!
    my_zoo.add_animal(zebra)
    print(f"After attempting to add a duplicate zebra: {my_zoo}")
    print(f"Number of animals in the zoo after duplicate attempt: {len(my_zoo)}")
    print("")

    # Check if the animals are instances of Animal
    print(f"Is {lion.name} an Animal? {isinstance(lion, Animal)}")
    print(f"Is {tiger.name} an Animal? {isinstance(tiger, Animal)}")
    print(f"Is {elephant.name} an Animal? {isinstance(elephant, Animal)}")
    print(f"Is {zebra.name} an Animal? {isinstance(zebra, Animal)}")
    print(f"Is {zebra_2.name} an Animal? {isinstance(zebra_2, Animal)}")
    print("")

    # Check if the animals are instances of Carnivore or Herbivore
    print(f"Is {lion.name} a Carnivore? {isinstance(lion, Carnivore)}")
    print(f"Is {tiger.name} a Carnivore? {isinstance(tiger, Carnivore)}")
    print(f"Is {elephant.name} a Herbivore? {isinstance(elephant, Herbivore)}")
    print(f"Is {zebra.name} a Herbivore? {isinstance(zebra, Herbivore)}")
    print(f"Is {zebra_2.name} a Herbivore? {isinstance(zebra_2, Herbivore)}")
    print("")

    # Make an exhibit
    exhibit = Exhibit("Savannah")
    print(f"Exhibit before adding animals: {exhibit}")
    print(f"Exhibit is a subclass of Zoo: {isinstance(exhibit, Zoo)}")
    print("")

    # Add animals to the exhibit
    exhibit.add_animal(elephant)
    exhibit.add_animal(zebra)
    print(f"Exhibit after adding animals: {exhibit}")
    print(f"Number of animals in the exhibit: {len(exhibit)}")
    print("")

    # Attempt to add a carnivore to the herbivore exhibit
    print("Attempting to add a carnivore to the herbivore exhibit...")
    exhibit.add_animal(lion)
    print(f"Exhibit after attempting to add a carnivore: {exhibit}")
    print(f"Number of animals in the exhibit after attempting to add a carnivore: {len(exhibit)}")

if __name__ == "__main__":
    main()

Example Output

My Zoo: {'Carnivore': [<Leo the Lion, Age: 5>, <Tiggy the Tiger, Age: 3>], 'Herbivore': [<Dumbo the Elephant, Age: 10>, <Zara the Zebra, Age: 4>, <Zuzu the Zebra, Age: 3>]}
Number of animals in the zoo: 5

After attempting to add a duplicate zebra: {'Carnivore': [<Leo the Lion, Age: 5>, <Tiggy the Tiger, Age: 3>], 'Herbivore': [<Dumbo the Elephant, Age: 10>, <Zara the Zebra, Age: 4>, <Zuzu the Zebra, Age: 3>]}
Number of animals in the zoo after duplicate attempt: 5

Is Leo an Animal? True
Is Tiggy an Animal? True
Is Dumbo an Animal? True
Is Zara an Animal? True
Is Zuzu an Animal? True

Is Leo a Carnivore? True
Is Tiggy a Carnivore? True
Is Dumbo a Herbivore? True
Is Zara a Herbivore? True
Is Zuzu a Herbivore? True

Exhibit before adding animals: Savannah Exhibit: {}
Exhibit is a subclass of Zoo: True

Exhibit after adding animals: Savannah Exhibit: {'Herbivore': [<Dumbo the Elephant, Age: 10>, <Zara the Zebra, Age: 4>]}
Number of animals in the exhibit: 2

Attempting to add a carnivore to the herbivore exhibit...
Cannot add Leo to Savannah exhibit. It is a carnivore.
Exhibit after attempting to add a carnivore: Savannah Exhibit: {'Herbivore': [<Dumbo the Elephant, Age: 10>, <Zara the Zebra, Age: 4>]}
Number of animals in the exhibit after attempting to add a carnivore: 2
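The example run relies on len(...) and f-string formatting working on your own objects, and on the same object not being stored twice. A sketch of the Python hooks behind that behavior, in a deliberately different domain (the Book and Library classes are hypothetical and are not part of the assignment):

```python
class Book:
    def __init__(self, title):
        self.title = title

    def __repr__(self):
        # Lists and dicts display their elements with repr(), not str().
        return f"<Book: {self.title}>"


class Library:
    def __init__(self):
        self.books = []

    def add_book(self, book):
        # Without a custom __eq__, "in" compares by identity,
        # so the same object is only stored once.
        if book not in self.books:
            self.books.append(book)

    def __len__(self):
        # Called by len(library).
        return len(self.books)

    def __str__(self):
        # Called by print(library) and by f"{library}".
        return str(self.books)


library = Library()
dune = Book("Dune")
library.add_book(dune)
library.add_book(dune)  # duplicate, silently ignored
library.add_book(Book("Emma"))

print(f"Library: {library}")
print(f"Number of books: {len(library)}")
```

Defining __repr__ on the element class is what makes the contents readable when the container is printed; without it, each element would show up as a default object representation with a memory address.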