Introduction to Information Science

Week 15

May 4, 2014

Review

Final Exam

  • Date: Monday, May 11th
  • Time: 9:00AM - 12:00PM
  • Location: 503 Hamilton
  • Instructions: No textbooks or notes. No calculators

Week 13

April 20, 2015 April 22, 2015

Week 12

April 13, 2015 April 15, 2015

Reference

  • Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. Third Edition
  • Pearson Custom Computer Science COMS W1001 Introduction to Information Science, Columbia University. Chapter 11 Introduction to Computer and Programming by Tony Gaddis.

This week's topics

Sorting Algorithms

A sorting problem is defined as:
Input: A sequence of n numbers <a1,a2,...,an>.
Output: A permutation (reordering) <a'1,a'2,...,a'n> of the input sequence such that a'1 ≤ a'2 ≤ ... ≤ a'n.


  • Insertion Sort
    • Pseudocode
      Insertion_Sort(A)
          for j = 2 to A.length
              key = A[j]
              // Insert A[j] into the sorted sequence A[1..j-1].
              i = j - 1
              while i > 0 and A[i] > key
                  A[i + 1] = A[i]
                  i = i - 1
                  A[i + 1] = key
    • Demo (Courtesy of Wikipedia)
    • Complexity
      • Best case: O(N)
      • Average case: O(N2)
      • Worst case: O(N2)
    • Python code for download
      [insertion_sort.py]

  • Bubble Sort
    • Pseudocode
      Bubble_Sort(A)
          for i = A.length to 1
              for j = 1 to i - 1
                  if A[j] > A[j + 1]
                      swap(A[j], A[j + 1]
    • Demo (Courtesy of Wikipedia)
    • Complexity
      • Best case: O(N)
      • Average case: O(N2)
      • Worst case: O(N2)
    • Python code for download
      [bubble_sort.py]

  • Selection Sort
    • Pseudocode
      Selection_Sort_by_Max(A)
          for i = A.length to 1
          	max_index = i
              for j = 1 to i
                  if A[j] > A[max_index]
                      max_index = j
              swap(A[i], A[max_index]
      or
      Selection_Sort_by_Min(A)
          for i = 1 to A.length
          	min_index = i
              for j = i to A.length
                  if A[j] < A[min_index]
                      min_index = j
              swap(A[i], A[min_index]
    • Demo (Courtesy of Wikipedia)
    • Complexity
      • Best case: O(N2)
      • Average case: O(N2)
      • Worst case: O(N2)
    • Python code for download
      [selection_sort.py]

  • Merge Sort
    • Pseudocode
      Merge_Sort(A, p, r)
          // when there are only two elements
          if p == r - 1
              if A[p] > A[q]
                  swap(A[p], A[q])
          // when there are more than two elements
          if p < r - 1
              q = (p + r) / 2
              Merge_Sort(A, p, q)
              Merge_Sort(A, q + 1, r)
              Merge(A, p, q, r)
              
      Merge(A, p, q, r)
          L = A[p .. q]
          R = A[q + 1 .. r]
          i = 1
          j = 1
          k = p
          while i < L.length and j < R.length
          	if L[i] <= R[j]
          	    A[k] = L[i]
          	    i = i + 1
          	    k = k + 1
          	else
          	    A[k] = R[j]
          	    j = j + 1
          	    k = k + 1
          if i < L.length
              A[k .. r] = L[i .. L.length]
          else
              A[k .. r] = R[j .. R.length]
    • Demo (Courtesy of Wikipedia)
    • Complexity
      • Best case: O(NlogN)
      • Average case: O(NlogN)
      • Worst case: O(NlogN)
    • Python code for download
      [merge_sort.py]

  • Quicksort
    • Pseudocode
      Quicksort(A, p, r)
          if p < r
              p = Partition(A, p, r)
              Quicksort(A, p, q - 1)
              Quicksort(A, q + 1, r)
      
      Partition(A, p, r)
          pivot = A[r]
          i = p - 1
          for j = p to r - 1
              if A[j] <= pivot
                  i = i + 1
                  swap(A[i], A[j])
          swap(A[i + 1], A[r])
          return i + 1
    • Demo (Courtesy of Wikipedia)
    • Complexity
      • Best case: O(NlogN)
      • Average case: O(NlogN)
      • Worst case: O(N2)
    • Python code for download
      [quicksort.py]

  • Python code to compare the above algorithms
    [eval_sorting.py]

Week 11

April 6, 2015 April 8, 2015

Reference

  • Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. Third Edition
  • Pearson Custom Computer Science COMS W1001 Introduction to Information Science, Columbia University. Chapter 11 Introduction to Computer and Programming by Tony Gaddis.

This week's topics

Data Structures in Python

A data structure is a way to store and organize data in order to facilitate access and modifications. When we are learning Python programming, we will focus on the following Python built-in data structures.

  • List: [item0, item1, item2, ..., itemN]
    Implement a queue (FIFO) using a list.
    Sample applications:
    • Waiting list for course registration
    • Subway train scheduling
    Implement a stack (LIFO) using a list.
    Sample applications:
    • Storing and displaying posts in reverse chronological order
  • Tuple: (attribute0, attribute1, attribute2, ..., attributeN)
    Sample applications:
    • GPS coordinates
    • Represent colors based on RGB
  • Set: ([key0, key1, key2, ..., keyN])
    Sample applications:
    • Distinct items in a shopping basket
    • Who (IP addresses) have visited my website
  • Dictionary: {key0=value0, key1=value1, key2=value2, ..., keyN=valueN}
    Sample applications:
    • Record the number of votes for each candidate
    • Record the number of sales for each item
  • Composite structures
    Sample applications:
    • Stock price of a company: a list of tuples - [(price_i, date_i), (price_j, date_j), ..., (price_k, date_k)]
    • Customer purchase history: a dictionary with the key being the customer id and the value being a list of items purchased - {customer_i=[item_i1, item_i2, ...], customer_j=[item_j1, item_j2, ...]}

Algorithms

An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. An algorithm is thus a sequence of computational steps that transform the input into the output.

  • Sorting problems
  • Searching problems
  • Graph algorithms
  • etc.

Week 10

March 30, 2015 April 1, 2015

Reference

  • Pearson Custom Computer Science COMS W1001 Introduction to Information Science, Columbia University. Chapter 13-17 Introduction to Computer and Programming by Tony Gaddis.

This week's topics

Variable Naming Rules

  • No Python keywords
  • No space
  • First character: a-z, A-Z, or _
  • After the first character: a-z, A-Z, 0-9, _
  • Case sensitive

Variable Naming Conventions

  • Use underscore (_) for space. e.g. tax_rate
  • camelCase. e.g. taxRate

Math Calculations

  • Operator: +, -, *, /, %, **
  • Integer division, floating point division
  • Data type conversion
  • Operator precedence: 1. **; 2. *, /, %; 3. +, -

Input/Output

  • print statement
  • Single quote, double quotes, triple quotes
  • input
  • raw_input
  • Output format

Data Types/Data Structures

  • Number: integer and floating point
  • String
  • List
  • Tuple
  • Set
  • Dictionary

Control Flow

  • if
  • if else
  • if elif else

Functions

  • def
  • arguments, keyword arguments
  • scope of variables: local variables and global variables
  • return value/values

Loops

  • for
  • while
  • break and continue

Recursion

Demo for Loop vs. Recursion: [loop_vs_recursion.py]

Week 9

March 23, 2015 March 25, 2015

Reference

  • Pearson Custom Computer Science COMS W1001 Introduction to Information Science, Columbia University. Chapter 12 Introduction to Computer and Programming by Tony Gaddis.
  • Compilers, Principles, Techniques, and Tools. Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman.

This week's topics

Why We Need Programming Languages

  • Low-level Programming Language
  • High-level Programming Language

How a Program Works

  • Compiler
  • Interpreters

Slides

Download: pdf

Week 8

March 9, 2015 March 11, 2015

Week 7

March 2, 2015 March 4, 2015

Databases are widely used in many applications. We will provide an overview of the nature and purpose of database system. The focus of this study is how to use Entity-Relationship (E-R) model to design a database, and how to use relational model to create databases and tables in a database-management system (DBMS). We will also study SQL (Structured Query Language), a "query language" originally developed by IBM and now a standard programming language for managing data held in a relational database-management system.

Reference

  • Abraham Silberschatz, Henry F. Korth, S. Sudarshan. Database System Concepts. McGraw-Hill.
  • Brookshear, J. Glenn (2011-04-13). Computer Science: An Overview (11th Edition). Prentice Hall.

This week's topics

Database

  • Introduction to Database
  • Entity-Relational Model
  • Relational Model
  • SQL

Sample Database

E-R Diagram

Schema Diagram

Slides

Download: pdf

Week 6

February 23, 2015 February 25, 2015

This week we will explore the basics of computer architecture and learn how computers are programmed by means of encoded instructions, called machine language instructions.

Reference

  • Brookshear, J. Glenn (2011-04-13). Computer Science: An Overview (11th Edition). Prentice Hall.

Announcement

  • Supplementary reading materials for Chapter 5 and 6 are available for download on Courseworks, under the hw3 folder.

This week's topics

  • Machine Language
  • Program Execution

Slides

Download: pdf

Week 5

February 16, 2015 February 18, 2015

Week 4

February 9, 2015 February 11, 2015

This week we discuss the topics associated with data representation and the storage of data within a computer. We will mainly focus on the concepts of bits and bytes, and how the binary system is used in computer to store and manipulate data. You will learn how to do computation in binary space and the conversions among binary, decimal, octal and hexadecimal numbers.

Reference

  • Brookshear, J. Glenn (2011-04-13). Computer Science: An Overview (11th Edition). Prentice Hall.

Announcement

  • Homework 2 is available. It's due in two weeks, on Feb 23, before class (10:10AM).
  • Supplementary reading materials for Chapter 5 and 6 are available for download on Courseworks, under the hw2 folder.

This week's topics

  • Bits and Logic Gates
  • Bits, Bytes and the Binary System
  • Binary, Decimal, Octal, Hexadecimal
  • Representing Numbers
  • Representing Text

Sample HTML+CSS for HW2 (click here to download)

Slides

Download: pdf

Week 3

February 2, 2015 February 4, 2015

Computer networks encompasses the study of how computers can be linked together to share information and resources. We will discuss the construction and operation of networks, how networks can be connected to form an internet, the applications and protocols of networks, and security issues. We will also introduce World Wide Web and HTML. You will get chance to use basic HTML elements to create a webpage and publish it.

Reference

  • Brookshear, J. Glenn (2011-04-13). Computer Science: An Overview (11th Edition). Prentice Hall.
  • Andrew S. Tanenbaum. Computer Networks (4th Edition)

This week's topics

HTML

  • <html>, <head>, <body>
  • <h1>, ..., <h6>
  • <p>
  • <a>
  • <ul>, <ol>, <li>
  • <table>, <thead>, <tbody>, <th>, <tr>, <td>
  • <img>
  • comment: <!-- -->

Network

  • Network classification
  • Methods of process communication
  • Devices
  • Distributed systems
  • Internet protocols
  • Internet addressing
  • Internet security

Slides

Download: pdf

Week 2

January 26, 2015 January 28, 2015

We'll discuss operating systems this week, which are software packages that coordinate a computer’s internal activities as well as oversee its communication with the outside world. It is a computer’s operating system that transforms the computer hardware into a useful tool. The goal is to understand what operating systems do and how they do it. Such a background is central to being an enlightened computer user.

Reference

This week's topics

  • The History of Operating Systems
  • Getting Operating System Started
  • Kernel and User Space
  • Process Administration
  • Coordinating the Machine’s Activities
  • Handling Competition Among Processes
  • Memory
  • File System
  • Security

Announcement

Homework 1 is available. It's due in two weeks, on February 9, before class.

Useful Resources

Sample HTML (click here to download)

Software that I recommend for writing HTML for hw1

  • Windows user, use Notepad++
  • Mac user, use TextWrangler
  • Emacs
  • If you have trouble running the above editors on your computer, please email me and I will give you other suggestions.

How to connect to CUNIX

  • Server: cunix.columbia.edu
  • Username: your uni
  • For Windows users, use PuTTY.
  • For Mac users, open Terminal, and use ssh, e.g. ssh uni@cunix.columbia.edu

CUNIX tutorial

How to publish your webpage

How to upload your file (e.g. index.html) to CUNIX

Slides

Download: pdf

Week 1

January 21, 2015

I'm glad you're interested in this class. It's the first week and I'll describe the objectives of the course, provide general course information, and give a quick overview of what will be covered for this semester.

Course Objectives

  • To provide an introduction to using Information technology and the World Wide Web
  • To explore some of the fundamental concepts involved with information storage and retrieval
  • To introduce the fundamentals of algorithmic problem solving
  • To provide a basic introduction to computer programming in Python

What We Will Cover This Semester

  • Operating System Basics
  • Networks and the Internet
  • Web Design Basics - HTML
  • Data Storage and Manipulation
  • Spreadsheets
  • Database
  • Data Structure & Algorithm
  • Programming in Python

Most importantly, if you register for this class, please take 5 minutes to complete the following survey and email me [xie at cs.columbia.edu] your answer. The goal of the survey is to help me know more about you, so that I have an opportunity to adjust the course materials to fit your interests.

Use one or two sentences to describe:

  1. Tell me about you: name, school, major, year
  2. Why do you want to take this course?
  3. What do you expect to learn?
  4. How do you think this course can be relevant to your current major, future study, or career?
  5. Do you have any existing knowledge in computer science, e.g. programming language, web design, database, etc?

Slides

Download: pdf

Week 0

January 14, 2015

Welcome to the class! This is an introductory course to information science. The class content is self-contained and there's no prerequisites. In this class, students will learn the basics of computer science and the recent advancements that have been applied to real world problems.

Class name: Introduction to Information Science, Columbia University

Class code: COMS W1001

Instructor: Boyi Xie

Semester: Spring 2015

Day/Time: Monday and Wednesday 10:10AM-11:25AM

Classroom: 503 Hamilton Hall

Office hour: Tuesday 9:30AM-11:30AM at CCLS

TA: Sahana Subbanna (ss4737@columbia.edu); Location: TA Room; Office hour: Thursday 4:15PM-6:15PM

TA: Mei-Vern Then (mt2837@columbia.edu); Location: TA Room; Office hour: Monday 1:00PM-3:00PM

Prerequisites: None

Textbook:

Title Author Publisher
Pearson Custom Computer Science COMS W1001 Introduction to Information Science Brookshear, Snyder, and Gaddis Pearson

The following is a tentative schedule. It is subject to change as the course proceeds, according to progress of the class and the interests of the students.

Class Schedule
Date Day Session Topic Required Reading HW
Jan 21 Wed 1 Introduction to Information Science: What we will cover Chapter 1 of Brookshear
Jan 26 Mon 2 Operating System Basics Chapter 2 of Brookshear HW1 available
Jan 28 Wed 3 More on Operating System Basics and CUNIX Chapter 2 of Brookshear
Feb 2 Mon 4 Your Homepage Chapter 4 of Snyder
Feb 4 Wed 5 The Basics of Networks and the World Wide Web Chapter 3 of Brookshear
Feb 9 Mon 6 More on HTML and CSS Chapter 5 of Brookshear HW1 due; HW2 available
Feb 11 Wed 7 Binary: Bits, bytes and all the rest Chapter 5 of Brookshear
Feb 16 Mon 8 Representing Information Digitally Chapter 5 of Brookshear
Feb 18 Wed 9 The Anatomy of a Computer Chapter 6 of Brookshear
Feb 23 Mon 10 Introduction to Database Chapter 9 of Snyder HW2 due; HW3 available
Feb 25 Wed 11 Database (continued) Chapter 9 of Snyder
Mar 2 Mon 12 Database (continued) Chapter 9 of Snyder
Mar 4 Wed 13 Midterm Review Chapter 7 of Snyder
Mar 9 Mon 14 The Big Picture So Far HW3 due
Mar 11 Wed 15 Midterm Exam Chapter 12,13,18,19 of Gaddis
Mar 23 Mon 16 Introduction to Programming Languages HW4 available
Mar 25 Wed 17 Python I: Python Basics: IO, variables, conditionals Chapter 14,15 of Gaddis
Mar 30 Mon 18 Python II: Python Basics: functions Chapter 16,17 of Gaddis
Apr 1 Wed 19 Python III: Python Basics: iterations, recursions Chapter 16,17 of Gaddis
Apr 6 Mon 20 Data Structure Chapter 11 of Brookshear
Apr 8 Wed 21 Algorithms I: Searching Chapter 11 of Brookshear
Apr 13 Mon 22 Algorithms II: Sorting Chapter 11 of Brookshear
Apr 15 Wed 23 Algorithms III: Complexity Analysis Chapter 11 of Brookshear HW4 due; HW5 available
Apr 20 Mon 24 Python IV: Implementing Algorithms using Python
Apr 22 Wed 25 Python V: Regular Expression
Apr 27 Mon 26 Python VI: Object-Oriented Programming
Apr 29 Wed 27 Problem Solving using What We've Learned HW5 due
May 4 Mon 28 Final Review
1