CMSC 14100 — Lecture 15

Classes

As we've discussed them, abstract data types are a good idea in the abstract (ha ha) but practically speaking, they are nothing more than convention. In our implementation of a stack, we use a list to manage the stack, but there's nothing stopping someone from ignoring convention and treating that stack like a list, knowingly or not. We would like some mechanisms to formalize the separation between the interface for our abstract data type and its implementation.

This is the same idea that we discussed with colours. We've already seen the example of colours as tuples of three integers—not every triple of integers is necessarily a colour! So though we may, by convention, treat a certain triple as a colour, there's nothing stopping someone from ignoring this and treating the tuple of integers like any other.

This may seem like a distinction that doesn't matter, but it becomes important if we want to enforce what we're allowed to do with each of these things. For another example, we've seen that tables can be represented as a list of lists, but we can also represent matrices in the same way and we can represent images in the same way. We have defined some operations for tables but these operations would not make sense for matrices or images. Yet we have no way to tell them apart because they appear identical.

This goes back to our understanding of the need for types. Types tell us what values our data can take on and what functions we can apply to them. Ideally, we would be able to define a type for a table and another type for an image. We can then define functions that specify the appropriate types for their inputs and outputs.

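In object-oriented programming languages like Python, a class is a type definition. A class contains two kinds of data:

  - attributes, which are the values stored in an object of the class, and
  - methods, which are functions that act on an object of the class.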

Once we have defined a class, we can create objects of that class. An object is a particular instance of the class. In non-object terms, a class is a type and an object is a value.

Everything is an object

You may wonder what distinguishes a class from a type. In Python, the answer is that there is no distinction: every type is a class. For some types, this is not surprising. For example, you are by now very familiar with the fact that strings and lists have methods. But what about, say, integers or booleans?

Yes, even those are classes, though Python tries its best to hide this from us. We've already discussed the fact that every variable in Python really stores a reference to a value, whether that's a list or an integer. The truth is that those variables store references to objects.
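
One way to check this claim for yourself is with the built-in functions type and isinstance. The following is a small sketch that uses only built-ins; the variable names are chosen purely for illustration.

```python
# Every value reports a class as its type, including integers and booleans.
n = 5
flag = True
words = ["a", "b"]

print(type(n))      # <class 'int'>
print(type(flag))   # <class 'bool'>
print(type(words))  # <class 'list'>

# object is the class that all other classes are built on,
# so every value is an instance of it.
print(isinstance(n, object))     # True
print(isinstance(flag, object))  # True
```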

What does it mean that a particular list or integer is an object? Clearly, there's an associated value that those objects store. But what about methods? We've seen list methods, but what about integer methods? It turns out that many of the "operations" we use on integers are really syntactic sugar that hides the fact that we're really applying an object's methods.


    >>> [2, 3, "a", True, "ooo"].pop()
    'ooo'
    >>> True.__eq__(False)
    False
    >>> (23).__add__(-4)
    19
    

We'll be discussing some of these funny looking methods later.
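
To make the "syntactic sugar" claim concrete, here is a short sketch that puts a few operators side by side with the methods they correspond to. It covers only these simple cases; Python's full rules for choosing a method are more involved than this comparison suggests.

```python
x = 23
y = -4

# The + operator and the __add__ method give the same result.
print(x + y)         # 19
print(x.__add__(y))  # 19

# The same holds for == and __eq__ ...
print(x == y)        # False
print(x.__eq__(y))   # False

# ... and for indexing a list, which uses __getitem__.
items = [2, 3, "a"]
print(items[1])              # 3
print(items.__getitem__(1))  # 3
```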

Defining a class

How do we define a class? We need to know a few things. First, we need a name for our class. For instance, if we want to create a class for representing students, a good name might be Student. By convention, class names are capitalized. To define a class, we use the keyword class. So our class definition begins with


    class Student:
        # Rest of the definition
    

Like other statements in Python, the body of the class definition is denoted by indentation.

Secondly, we need a way to create an object of that class. The creation of an object is performed by a method called the constructor.

Typically, an object is created by using its class name in the same way as a function. This does not really square with your experience in Python because the built-in classes behave slightly differently. But they also have explicit constructors that can be used, which we have used without knowing it.


    >>> s = str([3, 1, 5])
    >>> s
    '[3, 1, 5]'
    >>> n = int("500000000")
    >>> n
    500000000
    

What we've called casting (i.e. "converting types") is really constructing a new object using the class's constructor and supplying our existing object as an argument to it. These constructors produce objects even when we don't pass any arguments.


    >>> str()
    ''
    >>> int()
    0
    >>> list()
    []
    

So to define a class, we provide a name and define a constructor. The constructor will then create an object based on the provided arguments. So if we want to create a Student, we will want to call something like


    >>> s = Student('Oliver Evans')
    

Constructors

In Python, the constructor for a class is a method named __init__, defined inside the class definition.


    class Student:
        """
        Represents a university student.
        """
        def __init__(self, name):
            """
            Inputs:
                name (str): Full name of student
            """
            self.name = name
            self.institution = 'University of Chicago'
    

Notice that the constructor is a function and we define it in the same way, using def. However, this function definition is inside the definition of the class. It must be one indentation level in to denote that it belongs in this class.

We can then create and use a Student in the following way.


    >>> s1 = Student('Oliver Evans')
    >>> s2 = Student('Sonny Brisko')
    >>> s3 = Student('Kanae')
    >>> s1.name
    'Oliver Evans'
    >>> s1.institution
    'University of Chicago'
    >>> s2.institution
    'University of Chicago'
    

That is, we treat the class name like a function and pass in arguments intended for the constructor. What does our constructor do? It uses the provided arguments to set the attributes of the student object it creates, name and institution.

Of course, the constructor doesn't have to do just that—it's a function and we can define a function to do whatever we want. So the constructor can do some processing on the arguments, like performing checks in the process of setting up the object.
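
As a sketch of what such processing might look like, the version of Student below checks and tidies its argument before storing it. The particular checks (requiring a string, stripping whitespace, rejecting an empty name) are assumptions made for illustration, not requirements from the class as defined above.

```python
class Student:
    """
    Represents a university student.
    """
    def __init__(self, name):
        # Hypothetical checks, added for illustration only.
        if not isinstance(name, str):
            raise TypeError("name must be a string")
        cleaned = name.strip()
        if cleaned == "":
            raise ValueError("name must not be empty")
        self.name = cleaned
        self.institution = 'University of Chicago'


s = Student('  Oliver Evans ')
print(s.name)  # Oliver Evans

try:
    Student('')
except ValueError as err:
    print(err)  # name must not be empty
```

If a check fails, the constructor raises an error and no Student is produced, so every Student that does exist is known to have a usable name.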

Here, we also note the appearance of a new concept, the parameter self. You may have noticed that while we passed one argument to the constructor, the definition of __init__ has two parameters. The missing argument is self, which is the first parameter of every method that gets defined in the class.

The parameter self represents the object that is using the method. Strictly speaking, self is not a keyword but a naming convention that Python programmers follow. Whenever a method is called, the object that we used to call the method is always passed to the method as an argument, as self.

So a call to a constructor is really syntactic sugar for the following process:

  1. We write the assignment s1 = Student('Oliver Evans'). The right-hand side is evaluated first.
  2. An empty object is created. This empty object is our new object. Denote it by our_object.
  3. Student.__init__(our_object, 'Oliver Evans') is called.
  4. The constructor mutates our_object by setting name and institution in the way that we defined.
  5. The reference for our_object is assigned to s1.
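
The creation and initialization steps can be imitated by hand, which may make the role of self clearer. This is only a sketch: object.__new__ and the __dict__ attribute are Python features not otherwise covered in these notes, and in practice we would simply write Student('Oliver Evans').

```python
class Student:
    def __init__(self, name):
        self.name = name
        self.institution = 'University of Chicago'


# Create an object of class Student without running the constructor.
our_object = object.__new__(Student)
print(our_object.__dict__)  # {}  (no attributes yet)

# Call the constructor by hand, passing the object in as self.
Student.__init__(our_object, 'Oliver Evans')
print(our_object.__dict__)
# {'name': 'Oliver Evans', 'institution': 'University of Chicago'}

# Make s1 refer to the initialized object.
s1 = our_object
print(s1.name)  # Oliver Evans
```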

Attributes

Let's flesh out our student representation. Something you may notice about our class is that there's only one name attribute for the full name of the student. Why not separate this into two attributes: first name and last name? Many systems do this, but not all names have this format. There are lots of different ways that names are used. Sometimes, the simplest thing to do is not make any assumptions where you don't need to—do we really need the concept of first and last names separately?

One important assumption about names that we definitely know not to make is that they are unique. This means that we need some other way to distinguish students. We have two ways to do this at this university: your UCID and your CNetID, both commonly confused with each other. The UCID is what you can think of as your student number (though people who work at the university also have a UCID) while the CNetID is your username used for logging into computer systems and email.


    class Student:
        """
        Represents a University of Chicago student.
        """
        def __init__(self, name, ucid, cnetid):
            """
            Inputs:
                name (str): Full name of student
                ucid (int): UCID number
                cnetid (str): CNetID
            """
            self.name = name
            self.institution = 'University of Chicago'
            self.id_number = ucid
            self.email = cnetid + '@uchicago.edu'
    

    >>> s1 = Student('Oliver Evans', 20210722, 'oliverd23')
    >>> s2 = Student('Sonny Brisko', 20220227, 'sbrisko')
    >>> s3 = Student('Kanae', 20180502, 'kanae')
    >>> s1.id_number
    20210722
    >>> s2.name
    'Sonny Brisko'
    >>> s3.email
    'kanae@uchicago.edu'
    

Notice the following:

So far, what we've done is create a structure that is just a collection of values. This is something that seems like we could've accomplished by using tuples or dictionaries. But the value in this is to formally define this concept and give it a name. This allows us to separate the idea of a Student from just another tuple and signals to the user that this value needs to be treated differently. But we can take this idea further.
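
The difference between a named class and a bare tuple can be seen in a short comparison. This is a sketch: the tuple layout (name, UCID, email) is a hypothetical convention invented for the example.

```python
class Student:
    def __init__(self, name, ucid, cnetid):
        self.name = name
        self.institution = 'University of Chicago'
        self.id_number = ucid
        self.email = cnetid + '@uchicago.edu'


# Hypothetical tuple convention: (name, UCID, email)
as_tuple = ('Kanae', 20180502, 'kanae@uchicago.edu')
as_object = Student('Kanae', 20180502, 'kanae')

# The tuple looks like any other tuple of three values, and we have
# to remember what each position means.
print(type(as_tuple))  # <class 'tuple'>
print(as_tuple[1])     # 20180502

# The object carries its type with it, and its parts have names.
print(isinstance(as_object, Student))  # True
print(isinstance(as_tuple, Student))   # False
print(as_object.id_number)             # 20180502
```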

Methods

The other kind of data that we can define for a class is its methods: functions that act on an object. We've already seen and used many examples of methods. These are the functions that "belong" to an object and appear distinct from functions that are defined independently of an object and are provided arguments.

For example, the dict class has a method get that either retrieves the value associated with a given key if it exists in the dictionary, or produces a given default value if it doesn't.


    >>> d1 = {'n': 24, 'j': 34, 's': 31}
    >>> d2 = {'n': 42, 'm': 43, 's': 13}
    >>> d1.get('n', -99)
    24
    >>> d2.get('n', -99)
    42
    >>> d2.get('m', -99)
    43
    >>> d1.get('m', -99)
    -99
    

Notice that a method always operates on the object that it is called on. To see why, let's see how to define a method. We define it in the usual way we would for a function, but inside the class (i.e. one indentation level in).


    class Student:
        """
        Represents a University of Chicago student.
        """
        def __init__(self, name, ucid, cnetid):
            """
            Inputs:
                name (str): Full name of student
                ucid (int): UCID number
                cnetid (str): CNetID
            """
            self.name = name
            self.institution = 'University of Chicago'
            self.id_number = ucid
            self.email = cnetid + '@uchicago.edu'

        def cnetid(self):
            """
            Retrieves the CNetID of the student based on their email

            Output (str): the CNetID
            """
            cnetid = self.email.removesuffix('@uchicago.edu')
            return cnetid
    

Notice that just like with __init__, a method definition must have a self parameter. Then a call c = s1.cnetid() is really turned into something like

    c = Student.cnetid(s1)

In fact, we can call this directly.


    >>> s1.cnetid()
    'oliverd23'
    >>> Student.cnetid(s1)
    'oliverd23'
    

Just as with the constructor, here Python will pass a reference to the object on which the method is acting as the parameter self. The method then has access to the object via this reference.

This is not so different from what we were originally doing with our stack ADT, where we had to explicitly pass in the stack as an argument. However, this syntax makes it clearer that we should think of methods as functions that belong to a particular class and are applied to the object they are called on.

Stacks

With this in mind, let's try to define a stack that we can use without having to pretend that it's a list.

It turns out that in working through the ADT and its implementation, we've already got most of the actual class definition finished. The ADT gives us our desired methods and our implementation gives us the idea to store a list maintaining the stack as an attribute.

One change that we make is that create is an obvious choice for our constructor, so we replace it with __init__.


    class Stack:
        """
        A collection of items with controlled last in, first out access.
        """
        def __init__(self):
            """
            Initializes an empty stack.
            """
            self.items = []

        def is_empty(self):
            """
            Return whether this stack is empty.

            Output (bool): True if stack contains no items, False otherwise
            """
            return self.items == []

        def push(self, item):
            """
            Add the given item to the top of this stack.

            Input:
                item (Any): an item to put on the top of the stack
            """
            self.items.append(item)

        def pop(self):
            """
            Remove the item at the top of this stack and return it.
            The stack cannot be empty.

            Output (Any): the item that is removed from the top of the stack
            """
            return self.items.pop()
    

Now, we can use the stack as a Stack.


    >>> st = Stack()
    >>> st.push(3)
    >>> st.push(-23)
    >>> st.push("hey")
    >>> st.items
    [3, -23, 'hey']
    >>> st.pop()
    'hey'
    >>> st.items
    [3, -23]
    

Now it is very clear how to use the stack and much more convenient to use it in the way that's prescribed by our ADT.


    def matched_brackets(expr):
        """
        Given an expression, determine whether the brackets in the 
        expression are correctly matched.
        
        Input:
            expr (str): the expression to check

        Output (bool): True if brackets are correctly matched, False 
            otherwise
        """
        brackets = {
            '(' : ')',
            '[' : ']',
            '{' : '}'
        }

        st = Stack()
        for c in expr:
            if c in brackets:
                st.push(c)
            elif c in brackets.values():
                if st.is_empty():
                    return False
                matching = st.pop()
                if brackets[matching] != c:
                    return False
        return st.is_empty()
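
To try the function out, the sketch below repeats condensed copies of Stack and matched_brackets (docstrings shortened, comments added) so that it can be run on its own. The test expressions are made up for illustration.

```python
class Stack:
    """Condensed copy of the Stack class above."""
    def __init__(self):
        self.items = []

    def is_empty(self):
        return self.items == []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()


def matched_brackets(expr):
    """Return True if the brackets in expr are correctly matched."""
    brackets = {'(': ')', '[': ']', '{': '}'}
    st = Stack()
    for c in expr:
        if c in brackets:
            # Opening bracket: remember it until its partner shows up.
            st.push(c)
        elif c in brackets.values():
            # Closing bracket: it must match the most recent opening one.
            if st.is_empty():
                return False
            if brackets[st.pop()] != c:
                return False
    # Any opening brackets left over were never closed.
    return st.is_empty()


print(matched_brackets('(a + [b * c])'))  # True
print(matched_brackets('(a + [b * c)]'))  # False
print(matched_brackets('((a)'))           # False
print(matched_brackets(')('))             # False
```

Notice that matched_brackets only ever uses push, pop and is_empty, so it depends on the stack interface rather than on the list stored inside.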