Type Safety#

Twitter Handle LinkedIn Profile GitHub Profile Tag Tag

Type Safety#

A subtype is a foundational concept in type theory and object-oriented programming that facilitates type safety and polymorphism. The relationship between a subtype \(S\) and a supertype \(T\) is denoted as \(S <: T\), \(S \subseteq T\), or \(S ≤: T\). Before we detail the criterion[1] for subtype in the next section, we state an important implication of subtype - type safety.

Definition 22 (Subtype and Type Safety)

If \(S\) is a subtype of \(T\), the subtyping relation (written as \(S \leq T\), \(S <: T\), \(S \subseteq T\), or \(S \leq: T\)) means that any term of type \(S\) can safely be used in any context where a term of type \(T\) is expected.

In other words, we say that \(S\) is a subtype of \(T\) if a piece of code written for variables of type \(T\) can also safely be used on variables of type \(S\) [2].

What does this mean?

This means that if \(S\) is a subtype of \(T\), you can use an instance of \(S\) in any place where an instance of \(T\) is required, without any issues related to type compatibility. That is what we call subtype polymorphism.

This concept of subtyping forms the basis of subtype polymorphism in object-oriented programming. Subtype polymorphism allows objects of a subtype to be treated as objects of a supertype, enabling methods to operate on objects of different types as long as they share a common supertype. This mechanism is critical for implementing interfaces and abstract classes in a type-safe manner.

More formally, in subtype polymorphism, if \(S\) is a subtype of \(T\) (denoted as \(S <: T\)), then objects of type \(S\) can be used in contexts expecting objects of type \(T\). This interoperability is guaranteed without loss of integrity or behavior of the original type \(S\), ensuring that operations performed on \(T\) are valid on \(S\). This allows for greater flexibility and code reuse while maintaining strict type safety, as it ensures that the substitutability of subtypes for their supertypes does not lead to runtime type errors or unexpected behaviors.

A Type Safe Example#

Assume for a moment that the class Cat and Dog are both valid subtype of the class Animal through class inheritance, which we have learned earlier to be called nominal subtyping (i.e. subclasses are subtypes).

 1class Animal:
 2    def describe(self) -> str:
 3        return str(self.__class__.__name__)
 4
 5    def make_sound(self) -> str:
 6        return "Generic Animal Sound!"
 7
 8
 9class Dog(Animal):
10    def make_sound(self) -> str:
11        return "Woof!"
12
13    def fetch(self) -> str:
14        return "Happily fetching balls!"
15
16
17class Cat(Animal):
18    def make_sound(self) -> str:
19        return "Meow"
20
21    def how_many_lives(self) -> str:
22        return "I have 9 lives!"

Then this means that any instance of Dog or Cat can safely be used in a context where an instance of Animal is expected.

For example, consider the following function describe_animal that takes in an animal of type Animal. This is telling developers that we can pass in any animal as long as it is a subtype of the Animal class.

 1def describe_animal(animal: Animal) -> str:
 2    return animal.describe() + " makes sound " + animal.make_sound()
 3
 4generic_animal = Animal()
 5generic_animal_sound = describe_animal(generic_animal)
 6print(generic_animal_sound)
 7
 8generic_dog = Dog()
 9generic_dog_sound = describe_animal(generic_dog)
10print(generic_dog_sound)
11
12generic_cat = Cat()
13generic_cat_sound = describe_animal(generic_cat)
14print(generic_cat_sound)
Animal makes sound Generic Animal Sound!
Dog makes sound Woof!
Cat makes sound Meow

In fact, what we have described is also inherently related with variable assignment. You can also think of variable assigning from the function describe_animal above. How so? When you pass in an instance of Dog, say generic_dog, to the function describe_animal, you are essentially assigning generic_dog to the parameter/variable animal in the function.

By extension, the following assignment:

1generic_animal = generic_dog  # Safe because Dog <: Animal

is allowed and considered safe because we are substituting (assigning) an expression generic_dog of type instance Dog to the variable generic_animal is allowed because we established that Dog is a subtype of Animal - so it is safe. A static type checker such as mypy will not raise an error here. However, if you were to do the reverse:

1generic_dog = generic_animal  # Unsafe because Animal is not a subtype of Dog

Then the static type checker will raise an error:

error: Incompatible types in assignment (expression has type "Animal", variable has type "Dog")  [assignment]
    generic_dog = generic_animal

indicating that you are trying to assign an expression generic_animal of type Animal to the variable generic_dog of type Dog. This is unsafe because there is no guarantee that the Animal class has say, all methods that a Dog instance might have!

In short, we have:

Therefore, it’s safe to assign an instance of Dog to a variable of type Animal since Dog contains all functionalities (make_sound) of Animal and possibly more (fetch) so there won’t be any surprise here. But it is deemed unsafe to assign generic_animal to generic_dog because not every Animal is a Dog. While every Dog instance is an Animal (fulfilling the subtype criteria), the reverse isn’t true. An Animal instance might not have all functionalities of a Dog (like fetch()), leading to potential errors or undefined behaviors if treated as a Dog. This violates the principle that the subtype should be able to handle everything the supertype can, plus potentially more.

Violating Type Safety#

Consider one example that violates type safety:

 1class Robot:
 2    def describe(self) -> str:
 3        return str(self.__class__.__name__)
 4
 5    def make_sound(self) -> int:
 6        return 1
 7
 8robot = Robot()
 9
10try:
11    robot_sound = describe_animal(robot)
12    print(robot_sound)
13except Exception as err:
14    print(f"Error: {err}")
Error: can only concatenate str (not "int") to str

In python there is no notion of type checking during compile time unless you have a static type checker. Consequently, the above code will only throw an error during runtime because we are adding an integer 1 to the string animal.describe() + " makes sound ". This is because we are errorneously passing in an instance of Robot to a function that accepts Animal only. Since Robot is not a subtype of Animal, there is no type safety guarantee. This example can be way worse if we were to just change the describe_animal to return a f-string instead - which will not throw any error at all, leading to hidden bugs!

1def describe_animal(animal: Animal) -> str:
2    return f"{animal.describe()} makes sound {animal.make_sound()}"
3
4robot = Robot()
5robot_sound = describe_animal(robot)
6print(robot_sound)
Robot makes sound 1

and this can often happen in code, sometimes unknowingly.

In the for loop below, we iterate over entities that are presumed to be Animal types. However, including a Robot in the list leads to a violation of type safety that might only be caught at runtime, or worse, not caught at all, potentially allowing a bug to go unnoticed until it causes a failure in a production environment.

entities = [Dog, Cat, Robot]
for entity in entities:
    describe_animal(entity)

Further Violation of Type Safety#

1old: float = 3.01
2new: int = 5
3old = new  # Safe because int <: float
1old: int = 3
2new: float = 3.03
3old = new  # Unsafe because int <: float
4# assume the static language doesnt compile error then old will truncate to 3 silently because it is defined as an `int`!

On Dynamic vs Static Type Checking#

We have come quite a long way in understanding subtyping schemes and the concept of type safety. We would like to end this discussion with a brief note on the difference between dynamic and static type checking. In what follows, we would turn to the seminal work by Jeremy Siek, titled What is Gradual Typing, as a reference.

Dynamic Type Checking#

Dynamic type checking is a form of type checking that is performed at runtime. Consider the following Python code, where we erroneously pass a Car object to the can_we_fly function, which expects a Flyable object. The error manifests as an AttributeError at runtime, indicating that the Car object does not have the fly method.

 1from __future__ import annotations
 2
 3from typing import Protocol, runtime_checkable
 4
 5@runtime_checkable
 6class Flyable(Protocol):
 7    def fly(self) -> str:
 8        ...
 9
10class Bird:
11    def fly(self) -> str:
12        return "Bird flying"
13
14class Car:
15    def drive(self) -> str:
16        return "Car driving"
17
18
19def can_we_fly(obj: Flyable) -> str | None:
20    try:
21        return obj.fly()
22    except AttributeError as err:
23        print(err)
24        return None
25
26bird = Bird()
27car = Car()
28
29_ = can_we_fly(bird)
30_ = can_we_fly(car)
'Car' object has no attribute 'fly'

This is deemed as dynamic type checking because the type of the object is only checked at runtime and the error is only caught when the code is executed.

Static Type Checking#

Static type checking, on the other hand, is a form of type checking that is performed at compile time. Consider the same piece of Python code, but now we use the mypy static type checker to catch the error at compile time (a better example would be to use a language like Java or C# that has a more robust static type checking system).

sandbox.py:30: error: Argument 1 to "can_we_fly" has incompatible type "Car"; expected "Flyable"  [arg-type]
    _ = can_we_fly(car)
                   ^~~
Found 1 error in 1 file (checked 1 source file)

It is worth noting that static type checking make a conservative approximation[3] of what can happen to the code at runtime, and raise potential type errors that may happen at runtime. In fact, the the halting problem implies that we cannot be 100% sure whether a type error will really occur during runtime before execution, and thus impossible to build a type checker that can “predict” what type errors will happen in runtime[3]. Consequently, static type checkers often make conservative approximations to ensure that the code is type-safe and result in false positives (i.e. raising type errors that will not actually occur at runtime).

We quote verbatim the example given by Jeremy Siek in his article. Consider the following Java code:

class A {
    int add1(int x) {
        return x + 1;
    }
    public static void main(String args[]) {
        A a = new A();
        if (false)
            add1(a);
        else
            System.out.println("Hello World!");
    }
}

The Java compiler rejects the following program even though it would not actually result in a type error because the if (false) branch is never taken. However,the Java type checker does not try to figure out which branch of an if statement will be taken at runtime. Instead it conservatively assumes that either branch could be taken and therefore checks both branches[3].

Comparison between Dynamic and Static Type Checking#

Some reasons Jeremy gave includes, but not limited to:

  • Static type checking enhances execution speed by eliminating the need for type verification during runtime and allowing for the utilization of more efficient data storage formats.

  • Dynamic type checking simplifies the handling of scenarios where the type of a value is determined by information available at runtime.

You can find more reasons in his article.

References and Further Readings#