Type Safety#
Type Safety#
A subtype is a foundational concept in type theory and object-oriented programming that facilitates type safety and polymorphism. The relationship between a subtype \(S\) and a supertype \(T\) is denoted as \(S <: T\), \(S \subseteq T\), or \(S ≤: T\). Before we detail the criterion[1] for subtype in the next section, we state an important implication of subtype - type safety.
(Subtype and Type Safety)
If \(S\) is a subtype of \(T\), the subtyping relation (written as \(S \leq T\), \(S <: T\), \(S \subseteq T\), or \(S \leq: T\)) means that any term of type \(S\) can safely be used in any context where a term of type \(T\) is expected.
In other words, we say that \(S\) is a subtype of \(T\) if a piece of code written for variables of type \(T\) can also safely be used on variables of type \(S\) [2].
What does this mean?
This means that if \(S\) is a subtype of \(T\), you can use an instance of \(S\) in any place where an instance of \(T\) is required, without any issues related to type compatibility. That is what we call subtype polymorphism.
This concept of subtyping forms the basis of subtype polymorphism in object-oriented programming. Subtype polymorphism allows objects of a subtype to be treated as objects of a supertype, enabling methods to operate on objects of different types as long as they share a common supertype. This mechanism is critical for implementing interfaces and abstract classes in a type-safe manner.
More formally, in subtype polymorphism, if \(S\) is a subtype of \(T\) (denoted as \(S <: T\)), then objects of type \(S\) can be used in contexts expecting objects of type \(T\). This interoperability is guaranteed without loss of integrity or behavior of the original type \(S\), ensuring that operations performed on \(T\) are valid on \(S\). This allows for greater flexibility and code reuse while maintaining strict type safety, as it ensures that the substitutability of subtypes for their supertypes does not lead to runtime type errors or unexpected behaviors.
A Type Safe Example#
Assume for a moment that the class Cat
and Dog
are both valid subtype of
the class Animal
through class inheritance, which we have learned earlier to
be called nominal subtyping (i.e.
subclasses are subtypes).
1class Animal:
2 def describe(self) -> str:
3 return str(self.__class__.__name__)
4
5 def make_sound(self) -> str:
6 return "Generic Animal Sound!"
7
8
9class Dog(Animal):
10 def make_sound(self) -> str:
11 return "Woof!"
12
13 def fetch(self) -> str:
14 return "Happily fetching balls!"
15
16
17class Cat(Animal):
18 def make_sound(self) -> str:
19 return "Meow"
20
21 def how_many_lives(self) -> str:
22 return "I have 9 lives!"
Then this means that any instance of Dog
or Cat
can safely be used in a
context where an instance of Animal
is expected.
For example, consider the following function describe_animal
that takes in an
animal
of type Animal
. This is telling developers that we can pass in any
animal as long as it is a subtype of the Animal
class.
1def describe_animal(animal: Animal) -> str:
2 return animal.describe() + " makes sound " + animal.make_sound()
3
4generic_animal = Animal()
5generic_animal_sound = describe_animal(generic_animal)
6print(generic_animal_sound)
7
8generic_dog = Dog()
9generic_dog_sound = describe_animal(generic_dog)
10print(generic_dog_sound)
11
12generic_cat = Cat()
13generic_cat_sound = describe_animal(generic_cat)
14print(generic_cat_sound)
Animal makes sound Generic Animal Sound!
Dog makes sound Woof!
Cat makes sound Meow
In fact, what we have described is also inherently related with variable
assignment. You can also think of variable assigning from the function
describe_animal
above. How so? When you pass in an instance of Dog
, say
generic_dog
, to the function describe_animal
, you are essentially assigning
generic_dog
to the parameter/variable animal
in the function.
By extension, the following assignment:
1generic_animal = generic_dog # Safe because Dog <: Animal
is allowed and considered safe because we are substituting (assigning) an
expression generic_dog
of type instance Dog
to the variable
generic_animal
is allowed because we established that Dog
is a subtype of
Animal
- so it is safe. A static type checker such as mypy
will not raise an
error here. However, if you were to do the reverse:
1generic_dog = generic_animal # Unsafe because Animal is not a subtype of Dog
Then the static type checker will raise an error:
error: Incompatible types in assignment (expression has type "Animal", variable has type "Dog") [assignment]
generic_dog = generic_animal
indicating that you are trying to assign an expression generic_animal
of type
Animal
to the variable generic_dog
of type Dog
. This is unsafe because
there is no guarantee that the Animal
class has say, all methods that a Dog
instance might have!
In short, we have:
Therefore, it’s safe to assign an instance of
Dog
to a variable of typeAnimal
sinceDog
contains all functionalities (make_sound
) ofAnimal
and possibly more (fetch
) so there won’t be any surprise here. But it is deemed unsafe to assigngeneric_animal
togeneric_dog
because not everyAnimal
is aDog
. While everyDog
instance is anAnimal
(fulfilling the subtype criteria), the reverse isn’t true. AnAnimal
instance might not have all functionalities of aDog
(likefetch()
), leading to potential errors or undefined behaviors if treated as aDog
. This violates the principle that the subtype should be able to handle everything the supertype can, plus potentially more.
Violating Type Safety#
Consider one example that violates type safety:
1class Robot:
2 def describe(self) -> str:
3 return str(self.__class__.__name__)
4
5 def make_sound(self) -> int:
6 return 1
7
8robot = Robot()
9
10try:
11 robot_sound = describe_animal(robot)
12 print(robot_sound)
13except Exception as err:
14 print(f"Error: {err}")
Error: can only concatenate str (not "int") to str
In python there is no notion of type checking during compile time unless you
have a static type checker. Consequently, the above code will only throw an
error during runtime because we are adding an integer 1
to the string
animal.describe() + " makes sound "
. This is because we are errorneously
passing in an instance of Robot
to a function that accepts Animal
only.
Since Robot
is not a subtype of Animal
, there is no type safety guarantee.
This example can be way worse if we were to just change the describe_animal
to
return a f-string
instead - which will not throw any error at all, leading to
hidden bugs!
1def describe_animal(animal: Animal) -> str:
2 return f"{animal.describe()} makes sound {animal.make_sound()}"
3
4robot = Robot()
5robot_sound = describe_animal(robot)
6print(robot_sound)
Robot makes sound 1
and this can often happen in code, sometimes unknowingly.
In the for loop below, we iterate over entities that are presumed to be Animal
types. However, including a Robot
in the list leads to a violation of type
safety that might only be caught at runtime, or worse, not caught at all,
potentially allowing a bug to go unnoticed until it causes a failure in a
production environment.
entities = [Dog, Cat, Robot]
for entity in entities:
describe_animal(entity)
Further Violation of Type Safety#
1old: float = 3.01
2new: int = 5
3old = new # Safe because int <: float
1old: int = 3
2new: float = 3.03
3old = new # Unsafe because int <: float
4# assume the static language doesnt compile error then old will truncate to 3 silently because it is defined as an `int`!
On Dynamic vs Static Type Checking#
We have come quite a long way in understanding subtyping schemes and the concept of type safety. We would like to end this discussion with a brief note on the difference between dynamic and static type checking. In what follows, we would turn to the seminal work by Jeremy Siek, titled What is Gradual Typing, as a reference.
Dynamic Type Checking#
Dynamic type checking is a form of type checking that is performed at runtime.
Consider the following Python code, where we erroneously pass a Car
object to
the can_we_fly
function, which expects a Flyable
object. The error manifests
as an AttributeError
at runtime, indicating that the Car
object does not
have the fly
method.
1from __future__ import annotations
2
3from typing import Protocol, runtime_checkable
4
5@runtime_checkable
6class Flyable(Protocol):
7 def fly(self) -> str:
8 ...
9
10class Bird:
11 def fly(self) -> str:
12 return "Bird flying"
13
14class Car:
15 def drive(self) -> str:
16 return "Car driving"
17
18
19def can_we_fly(obj: Flyable) -> str | None:
20 try:
21 return obj.fly()
22 except AttributeError as err:
23 print(err)
24 return None
25
26bird = Bird()
27car = Car()
28
29_ = can_we_fly(bird)
30_ = can_we_fly(car)
'Car' object has no attribute 'fly'
This is deemed as dynamic type checking because the type of the object is only checked at runtime and the error is only caught when the code is executed.
Static Type Checking#
Static type checking, on the other hand, is a form of type checking that is
performed at compile time. Consider the same piece of Python code, but now we
use the mypy
static type checker to catch the error at compile time (a better
example would be to use a language like Java or C# that has a more robust static
type checking system).
sandbox.py:30: error: Argument 1 to "can_we_fly" has incompatible type "Car"; expected "Flyable" [arg-type]
_ = can_we_fly(car)
^~~
Found 1 error in 1 file (checked 1 source file)
It is worth noting that static type checking make a conservative approximation[3] of what can happen to the code at runtime, and raise potential type errors that may happen at runtime. In fact, the the halting problem implies that we cannot be 100% sure whether a type error will really occur during runtime before execution, and thus impossible to build a type checker that can “predict” what type errors will happen in runtime[3]. Consequently, static type checkers often make conservative approximations to ensure that the code is type-safe and result in false positives (i.e. raising type errors that will not actually occur at runtime).
We quote verbatim the example given by Jeremy Siek in his article. Consider the following Java code:
class A {
int add1(int x) {
return x + 1;
}
public static void main(String args[]) {
A a = new A();
if (false)
add1(a);
else
System.out.println("Hello World!");
}
}
The Java compiler rejects the following program even though it would not
actually result in a type error because the if (false)
branch is never taken.
However,the Java type checker does not try to figure out which branch of an if
statement will be taken at runtime. Instead it conservatively assumes that
either branch could be taken and therefore checks both
branches[3].
Comparison between Dynamic and Static Type Checking#
Some reasons Jeremy gave includes, but not limited to:
Static type checking enhances execution speed by eliminating the need for type verification during runtime and allowing for the utilization of more efficient data storage formats.
Dynamic type checking simplifies the handling of scenarios where the type of a value is determined by information available at runtime.
You can find more reasons in his article.