Bound and Constraint in Generics and Type Variables#
Recall that in the Generic Functions section we made up a simple example, add:
T = TypeVar('T')
def add(x: T, y: T) -> T:
    return x + y
Running mypy on the code above yields two errors:
4: error: Returning Any from function declared to return "T" [no-any-return]
return x + y
^~~~~~~~~~~~
4: error: Unsupported left operand type for + ("T") [operator]
return x + y
^~~~~
As a recap, this is because T can represent literally any type, and not all types support addition. For instance, two dictionaries cannot be added together because dict does not define __add__. This mypy error forces you to rethink your code design, specifically which types the type variable T is allowed to take on. Assume, for the sake of simplicity, that your program’s add would only operate on a few types:
int
float
NDArray[np.float64]
If we can somehow tell our type variable T to take on any one of the 3 types above, then our job is done. This is where constraints come in.
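As a quick runtime illustration of the recap above, the following minimal sketch checks which of a few built-in types define __add__ and shows that adding two dictionaries fails. The helper name supports_add and the variable names are hypothetical and exist only for this illustration.

```python
def supports_add(value: object) -> bool:
    """Return True if the type of `value` defines __add__ (hypothetical helper)."""
    return hasattr(type(value), "__add__")


print(supports_add(1))    # int defines __add__
print(supports_add(1.5))  # float defines __add__
print(supports_add({}))   # dict does not define __add__

left = {"a": 1}
right = {"b": 2}
try:
    left + right  # type: ignore[operator]
except TypeError as exc:
    # Python refuses the operation at runtime because dict has no __add__.
    dict_add_error = str(exc)
    print(dict_add_error)
```

An unconstrained T gives the type checker no way to rule out arguments like these, which is why it reports the operator error.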
Constraining Type Variable#
Constraints allow you to specify a list of explicit types that a type variable (TypeVar) can take. This is akin to saying that the type variable can represent any one of these specified types, and no others.
Syntax:
T = TypeVar('T', Type1, Type2, ...)
Meaning: The type variable T can be any one of Type1, Type2, etc.
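The syntax above can be sketched with a small, self-contained example. The names S and concat are assumptions made for illustration (the idea mirrors typing.AnyStr). The sketch also shows that the constraints are recorded on the type variable, and that Python rejects a type variable with only a single constraint.

```python
from typing import TypeVar

# S may be str or bytes, and nothing else.
S = TypeVar("S", str, bytes)


def concat(first: S, second: S) -> S:
    """Concatenate two values that share one of the constrained types."""
    return first + second


print(S.__constraints__)       # the tuple of allowed types
print(concat("foo", "bar"))    # both arguments are str
print(concat(b"foo", b"bar"))  # both arguments are bytes

# A single constraint is rejected when the TypeVar is created; a plain
# annotation (or a bound) expresses that intent instead.
try:
    TypeVar("Single", int)
except TypeError as exc:
    single_constraint_error = str(exc)
    print(single_constraint_error)
```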
Revisiting Add Example#
In our add example, since we want the type variable T to take on 3, and only 3, types, we can rewrite the type variable as such:
T = TypeVar('T', int, float, NDArray[np.float64])

def add(a: T, b: T) -> T:
    return a + b
Running mypy again yields no errors, because the static type checker knows that T can only take on int, float, or NDArray[np.float64], all of which have the add operator well defined.
Union Type versus Constrained Type Variable#
From the definition alone, it may not be immediately clear why we need a constrained type variable when we could just use a Union type.
Consider the following union type:
U = Union[int, float, NDArray[np.float64]]

def add(a: U, b: U) -> U:
    return a + b


add_two_int = add(a=3, b=4)
pprint(add_two_int)

add_int_and_float = add(a=3, b=9.99)
pprint(add_int_and_float)

add_int_and_ndarray = add(a=3, b=np.array([1, 2, 3]))
pprint(add_int_and_ndarray)
7
12.99
array([4, 5, 6])
With Union, each argument (i.e. a and b) can independently be any member of the union, so every permutation of the member types is allowed[1]. As we can see, we added two int, added an int and a float, and lastly added an int and an NDArray! Is this really what we want? Do we really want to allow int and NDArray to be added together freely? (They can be, but some may regard it as unsafe because broadcasting might lead to undesirable consequences.) Since every combination is permitted, no error is raised.
Furthermore, what if your program logic changes and add should now support only int and str? This is problematic with a union type, because the function could then be called with an int and a str, which is likely to lead to a type error saying that the add operator is unsupported between int and str.
U = Union[int, str]

def add(a: U, b: U) -> U:
    return a + b
Our static type checker is quick to spot this potential error and aptly raises the following:
4: error: Unsupported operand types for + ("int" and "str") [operator]
return a + b
^
4: error: Unsupported operand types for + ("str" and "int") [operator]
return a + b
^
4: note: Both left and right operands are unions
This is where a constrained type variable proves to be more type safe.
T = TypeVar("T", int, float, NDArray[np.float64])


def add(a: T, b: T) -> T:
    return a + b


add_int_and_ndarray = add(a=3, b=np.array([1, 2, 3]))
Now mypy raises an error here, telling you that you need to abide by the contract: within the scope of the function add, every occurrence of the type variable must be the same type! So the add_int_and_ndarray call is flagged as an error.
Furthermore, the example below is now okay, because the static type checker has peace of mind that both a and b must be of the same type and no mixing is involved.
T = TypeVar("T", int, str)


def add(a: T, b: T) -> T:
    return a + b
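To see how the constrained version behaves when actually called, here is a minimal runnable sketch. The mixed call at the end is the kind of call a static checker such as mypy is expected to reject. It is kept only to show what would happen at runtime if the check were ignored, and the variable mixed_error is a name made up for this illustration.

```python
from typing import TypeVar

T = TypeVar("T", int, str)


def add(a: T, b: T) -> T:
    return a + b


print(add(1, 2))      # both arguments are int
print(add("a", "b"))  # both arguments are str

# Mixing the two constraint types breaks the contract. Python itself only
# notices when the line executes, which is what the checker guards against.
try:
    add(1, "b")  # type: ignore
except TypeError as exc:
    mixed_error = str(exc)
    print(mixed_error)
```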
Upper Bounding Type Variables#
Bounds specify an upper bound for the type variable. This means that the type variable can represent any type that is a subtype of the specified bound.
Syntax:
T = TypeVar('T', bound=SuperType)
Meaning: The type variable T can be any type that is a subtype of SuperType (including SuperType itself).
Defining Type Variables with Upper Bounds#
In Python’s type hinting system, you can define a type variable that restricts which types can be used in place of it by specifying an upper bound. This is done using the bound=<type> argument of TypeVar. The key point is that any type that replaces this type variable must be a subtype of the specified boundary type. It’s important to note that the boundary type itself cannot be another type variable, nor can it be parameterized by type variables.
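As a small sketch of how a bound is recorded, the snippet below creates a bounded type variable and inspects it. The names SizedT and combine_error are assumptions for illustration. It also shows one related restriction that is not discussed above: Python refuses to create a type variable that has both constraints and a bound.

```python
from typing import Sized, TypeVar

SizedT = TypeVar("SizedT", bound=Sized)

print(SizedT.__bound__)        # the upper bound given via bound=
print(SizedT.__constraints__)  # empty tuple: no explicit constraints

# Bounds and constraints are mutually exclusive on a single type variable.
try:
    TypeVar("Mixed", int, str, bound=object)
except TypeError as exc:
    combine_error = str(exc)
    print(combine_error)
```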
Example: Ensuring Type Safety with Sized#
Consider the Sized protocol from Python’s typing module, which represents any type that supports the len() function. We define a type variable ST with Sized as its upper bound:
from typing import TypeVar, Sized
ST = TypeVar('ST', bound=Sized)
This definition means that ST can be replaced by any type that implements __len__ (and therefore works with len()), ensuring that objects of type ST can be measured for their size.
The function longer takes two parameters, x and y, both of type ST. It returns the object with the greater length:
def longer(x: ST, y: ST) -> ST:
    if len(x) > len(y):
        return x
    else:
        return y
Because ST is bound to Sized, we can safely use len() on x and y. This allows the function to work with any sized collection, such as lists or sets.
longer([1], [1, 2]) correctly returns the longer list, with the return type being List[int].
longer({1}, {1, 2}) operates on sets, returning the larger set as Set[int].
longer([1], {1, 2}) is okay as well and returns the type Collection[int]. This is because, unlike with constraints, x and y do not need to be of exactly the same type; they just need to be subtypes of the bound.
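The three calls above can be run directly. The following is a minimal sketch that repeats the definitions so it is self-contained; the final str call is an extra case added here for illustration.

```python
from typing import Sized, TypeVar

ST = TypeVar("ST", bound=Sized)


def longer(x: ST, y: ST) -> ST:
    """Return whichever argument has the greater length (y on a tie)."""
    if len(x) > len(y):
        return x
    return y


print(longer([1], [1, 2]))  # two lists
print(longer({1}, {1, 2}))  # two sets
print(longer([1], {1, 2}))  # a list and a set: accepted under a bound
print(longer("ab", "abc"))  # str implements __len__ as well
```

The static types mentioned above (List[int], Set[int], Collection[int]) are what a type checker infers; at runtime the function simply returns one of its arguments unchanged.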
Bounding and Semantic Clarity#
Bounding also offers more clarity and semantic meaning than, say, a Union type.
class Animal:
    ...

class Dog(Animal):
    ...

class Cat(Animal):
    ...

class Car:
    ...

AnimalType = TypeVar("AnimalType", bound=Animal)

def function_with_bound(arg: AnimalType) -> AnimalType:
    return arg

def function_with_union(arg: Union[Dog, Cat, Car]) -> Union[Dog, Cat, Car]:
    return arg
In function_with_bound, the argument arg must be an instance of Animal or a subclass of Animal. This means you could pass in an instance of Dog or Cat, but not Car, because Car is not a subclass of Animal.
In function_with_union, the argument arg can be an instance of Dog, Cat, or Car. There’s no requirement that these types are related in any way.
Bound versus Constraints#
Bounds are used to specify that a type variable must be a subtype of a particular type. This is akin to setting an upper limit on what the type variable can be. The purpose of bounds is to ensure that the type variable adheres to a hierarchical type constraint, typically ensuring that it inherits certain methods or properties.
Constraints, on the other hand, specify a list of explicit types that a type variable can represent, without implying any hierarchical relationship between them. The purpose of constraints is to allow a type variable to be more flexible by being one of several types, rather than restricting it to a subtype of a specific class or interface.
The comparison is pretty superficial, but something to remember is that you can mix types across arguments if you use a bound, which behaves a little like Union, whereas with constraints all arguments must be of the exact same type.
Let’s see an example:
AnimalType = TypeVar("AnimalType", bound=Animal)

def function_with_bound(arg1: AnimalType, arg2: AnimalType) -> Tuple[AnimalType, AnimalType]:
    return arg1, arg2

cat = Cat()
tabby = Cat()
dog = Dog()
_, _ = function_with_bound(cat, dog)
The code above does not raise any issue, whereas the code below does:
AnimalType = TypeVar("AnimalType", Cat, Dog)

def function_with_bound(arg1: AnimalType, arg2: AnimalType) -> Tuple[AnimalType, AnimalType]:
    return arg1, arg2

cat = Cat()
tabby = Cat()
dog = Dog()
_, _ = function_with_bound(cat, dog)
This is because with constraints, our contract is that within the scope of the function, every occurrence of AnimalType must resolve to the same one of the listed types (Cat or Dog), whereas with a bound, arg1 and arg2 can be of different types, as long as both are upper bounded by Animal.
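A related difference, which the references below discuss, concerns subclasses. The following is a minimal sketch in which MyStr, Constrained, Bounded and the two echo functions are hypothetical names introduced only for this illustration. With constraints, a type checker treats a subclass argument as its listed base type; with a bound, the precise subclass is preserved.

```python
from typing import TypeVar


class MyStr(str):
    """A hypothetical str subclass used only for this illustration."""


Constrained = TypeVar("Constrained", str, bytes)
Bounded = TypeVar("Bounded", bound=str)


def echo_constrained(value: Constrained) -> Constrained:
    return value


def echo_bounded(value: Bounded) -> Bounded:
    return value


a = echo_constrained(MyStr("hi"))  # a checker infers str for the result
b = echo_bounded(MyStr("hi"))      # a checker infers MyStr for the result

# At runtime both calls return the very same object they were given; the
# difference exists only in what the static checker infers.
print(type(a).__name__, type(b).__name__)
```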
References and Further Readings#
What’s the difference between a constrained TypeVar and a Union?
Difference between TypeVar(‘T’, A, B) and TypeVar(‘T’, bound=Union[A, B])