Sentinel Types#

Motivation#

Inspired by OpenAI’s approaches to software development and type handling, this discussion explores sentinel types, a technique in Python for representing unique default values or states. Sentinel types are particularly useful when None is itself a valid input value, so a distinct marker is needed for the “no value given” case. In fact, None is itself a sentinel value: a singleton object that represents the absence of a value.

This exploration introduces the core idea formalized in PEP 661 – Sentinel Values, which proposes a standardized approach to sentinel values in Python, but steers clear of the formal specifics. Readers interested in those details are encouraged to consult PEP 661 directly.

NotGiven#

OpenAI’s own implementation of a sentinel type, NotGiven, was introduced in OpenAI’s GitHub repository. The version below follows the same idea:

from __future__ import annotations  # lets annotations refer to _NotGiven inside its own body

from typing import Any, Literal, Type

from typing_extensions import override  # also available from `typing` on Python 3.12+


class _NotGiven:
    # The single shared instance, created lazily on first construction.
    _instance: _NotGiven | None = None

    def __new__(cls: Type[_NotGiven]) -> _NotGiven:  # noqa: PYI034
        if cls._instance is None:
            cls._instance = super(_NotGiven, cls).__new__(cls)  # noqa: UP008
        return cls._instance

    def __bool__(self) -> Literal[False]:
        """
        This method is used to define the boolean value of an instance of `_NotGiven`.
        By returning `False`, it allows `_NotGiven` to be used in boolean contexts (like
        `if` statements) to signify the absence of a value. This is especially useful
        for checking if an argument was provided or not in a function.
        """
        return False

    @override
    def __repr__(self) -> Literal["NOT_GIVEN"]:
        return "NOT_GIVEN"

    # Block attribute assignment and deletion so the instance stays immutable.
    def __setattr__(self, key: str, value: Any) -> None:
        raise AttributeError(f"{self.__class__.__name__} instances are immutable")

    def __delattr__(self, key: str) -> None:
        raise AttributeError(f"{self.__class__.__name__} instances are immutable")


NOT_GIVEN = _NotGiven()

Purpose and Behavior#

A sentinel such as NOT_GIVEN indicates that a parameter was not provided at all, which distinguishes that case from a parameter explicitly set to None. Structurally, both None and NOT_GIVEN are singletons and are “falsy” in boolean contexts, so either can appear in a conditional statement. Because both are falsy, however, a truthiness test cannot tell them apart; an identity check (is NOT_GIVEN) is what answers whether a value was provided. The singleton property guarantees that every call to _NotGiven() returns the same object, so identity checks are reliable.
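A minimal sketch of these two properties, using a stripped-down stand-in class (`_Sentinel`, `UNSET`, and `describe` are hypothetical names for illustration, not part of OpenAI's code). It also shows why identity checks are preferable to truthiness checks: `None`, `0`, and the sentinel are all falsy.

```python
class _Sentinel:
    """Stripped-down stand-in for _NotGiven (hypothetical, for illustration)."""

    _instance = None

    def __new__(cls):
        # Always hand back the same object, so `is` comparisons are reliable.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __bool__(self):
        return False


UNSET = _Sentinel()

# Constructing the class again returns the very same object.
same_object = _Sentinel() is UNSET

# Truthiness alone cannot tell the sentinel apart from None or 0.
all_falsy = not UNSET and not None and not 0


def describe(value=UNSET):
    # Identity checks keep the three cases separate.
    if value is UNSET:
        return "not given"
    if value is None:
        return "explicit None"
    return f"value: {value!r}"


print(same_object, all_falsy)
print(describe(), describe(None), describe(0), sep=" | ")
```

A check such as `if not value:` would treat all three calls to `describe` the same way, which is why the function compares with `is`.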

Use Case 1. Timeouts in HTTP Requests#

Sentinels like NotGiven are common in APIs where omitting a parameter triggers default behavior, but None is itself a valid, meaningful input. For example, None might mean “disable timeout”, while NotGiven means “use a default timeout”.

Suppose you implement a function get (not to be confused with the get function from the requests library) that makes HTTP requests. Its timeout parameter specifies the maximum number of seconds to wait for a response before raising a TimeoutError. If timeout is None, there is no time limit: the request waits indefinitely until the server responds or the connection is closed.

The simplified version below illustrates the idea. It does not make a real request; it only returns the timeout that would be used.

from __future__ import annotations  # allows `int | None` hints on older Python versions


def get(timeout: int | None = 2) -> int | float:
    # None means "no timeout": wait indefinitely.
    if timeout is None:
        actual_timeout = float("inf")
    else:
        actual_timeout = timeout
    return actual_timeout


print(f"Use default timeout: {get()}")
print(f"Use 2 seconds timeout: {get(timeout=2)}")
print(f"Use 3 seconds timeout: {get(timeout=3)}")
print(f"Use no timeout: {get(timeout=None)}")

Use default timeout: 2
Use 2 seconds timeout: 2
Use 3 seconds timeout: 3
Use no timeout: inf

What is the issue here? Not much. But one quirk is that the function has no elegant way to tell whether the caller explicitly passed the default value or omitted the argument altogether.

print(f"Use default timeout: {get()}")
print(f"Use 2 seconds timeout: {get(timeout=2)}")

Use default timeout: 2
Use 2 seconds timeout: 2

The two calls above yield the same result because timeout has a default value of 2: when the function is called without timeout, the parameter automatically takes the value 2, which is the standard behaviour for default values.

This approach does not distinguish between a user not providing the argument at all and a user explicitly setting the argument to its default value.

Why does it matter? Beyond expressing user intent explicitly, we may want finer-grained control over the program’s behaviour. If users pass in their own values, we may want to check that each value is within bounds, or in other words, legitimate.

The key motivations for using a singleton sentinel class are primarily centered around distinguishing between different states of function arguments, especially in the context of default values and optional arguments.

  1. Differentiating Between ‘None’, ‘Default Values’ and ‘Not Provided’: In Python, None is often used as a default value for function arguments. However, there are situations where None is a meaningful value distinct from the absence of a value. The NotGiven singleton allows you to differentiate between a user explicitly passing None (which might have a specific intended behavior) and not passing any value at all.

  2. Default Behavior Control: By using a sentinel like NotGiven, we can implement a default behavior that is only triggered when an argument is not provided. This is different from setting a default value in the function definition, as it allows the function to check if the user has explicitly set the argument, even if it’s set to None.

  3. Semantic Clarity: In complex APIs or libraries, using a sentinel value can provide clearer semantics. It makes the intention of the code more explicit, both for the developer and for users of the API. It indicates that thought has been given to the different states an argument can be in, and different behaviors are intentionally designed for each state.

# Relies on _NotGiven and NOT_GIVEN as defined above.
def get_with_not_given(timeout: int | _NotGiven | None = NOT_GIVEN) -> int | float:
    actual_timeout: int | float
    if timeout is NOT_GIVEN:
        # Argument omitted: fall back to the default timeout.
        actual_timeout = 2
    elif timeout is None:
        # Explicit None: no timeout at all.
        actual_timeout = float("inf")
    else:
        assert isinstance(timeout, int)
        actual_timeout = timeout
    return actual_timeout


print(f"Use default timeout: {get_with_not_given()}")
print(f"Use 2 seconds timeout: {get_with_not_given(timeout=2)}")
print(f"Use 3 seconds timeout: {get_with_not_given(timeout=3)}")
print(f"Use no timeout: {get_with_not_given(timeout=None)}")

Use default timeout: 2
Use 2 seconds timeout: 2
Use 3 seconds timeout: 3
Use no timeout: inf
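Returning to the earlier point about validating caller-supplied values, the sketch below range-checks only a timeout the caller actually passed. The bound `MAX_TIMEOUT`, the constant `DEFAULT_TIMEOUT`, the function name `resolve_timeout`, and the simplified `object()` sentinel are all assumptions made for illustration.

```python
from __future__ import annotations

# Simplified stand-in for the NOT_GIVEN singleton defined earlier.
NOT_GIVEN = object()

DEFAULT_TIMEOUT = 2  # assumed library default, in seconds
MAX_TIMEOUT = 60     # assumed upper bound for caller-supplied values


def resolve_timeout(timeout: int | None | object = NOT_GIVEN) -> int | float:
    if timeout is NOT_GIVEN:
        # Argument omitted: trust the library default, no validation needed.
        return DEFAULT_TIMEOUT
    if timeout is None:
        # Explicit None: the caller asked for no timeout at all.
        return float("inf")
    # Explicit value: check it before using it.
    if not isinstance(timeout, int) or not 0 < timeout <= MAX_TIMEOUT:
        raise ValueError(
            f"timeout must be an int in (0, {MAX_TIMEOUT}], got {timeout!r}"
        )
    return timeout


print(resolve_timeout(), resolve_timeout(None), resolve_timeout(5))
```

With a plain default of 2 in the signature, the function could not skip validation for the omitted case without also skipping it for an explicit `timeout=2`.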

Missing#

Another common sentinel type is MISSING, which represents a missing value in data structures or configurations (for example, in the standard library’s dataclasses module).

from __future__ import annotations  # lets annotations refer to _Missing inside its own body

from typing import Any, Literal, Type


class _Missing:
    """
    -   **Primary Use:** `MISSING` is more common in data structures,
        configurations, or APIs where you need to signify that a value hasn't been
        set or provided, and it's expected to be present or filled in later.
    -   **Semantics:** It indicates the absence of a value in a more passive sense,
        as in "not yet provided" or "awaiting assignment."
    -   **Example:** In a configuration object, `None` might be used to disable an
        option, whereas `MISSING` would indicate that the user has not yet made a
        decision about that option.
    """

    # The single shared instance, created lazily on first construction.
    _instance: _Missing | None = None

    def __new__(cls: Type[_Missing]) -> _Missing:  # noqa: PYI034
        if cls._instance is None:
            cls._instance = super(_Missing, cls).__new__(cls)  # noqa: UP008
        return cls._instance

    def __bool__(self) -> Literal[False]:
        return False

    def __repr__(self) -> Literal["MISSING"]:
        return "MISSING"

    # Block attribute assignment and deletion so the instance stays immutable.
    def __setattr__(self, key: str, value: Any) -> None:
        raise AttributeError(f"{self.__class__.__name__} instances are immutable")

    def __delattr__(self, key: str) -> None:
        raise AttributeError(f"{self.__class__.__name__} instances are immutable")


MISSING = _Missing()
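The standard library's `dataclasses` module exposes a sentinel with the same name, which it uses to mark fields that have no default value. A minimal sketch of inspecting it (the `Settings` class is a hypothetical example, and fields declared with `default_factory` are left out for brevity):

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class Settings:
    host: str                      # no default: required
    timeout: Optional[int] = None  # default is None, which is a real value


for field in dataclasses.fields(Settings):
    # `field.default` is dataclasses.MISSING when no default value was declared,
    # so a default of None is not mistaken for "no default".
    has_default = field.default is not dataclasses.MISSING
    print(field.name, "has a default" if has_default else "is required")
```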

Purpose and Behavior#

MISSING is typically used in data structures or configurations to indicate that a value is missing or has not been set. It’s particularly useful in contexts like dictionaries, APIs, or data processing where you need to differentiate between a value that is intentionally set to None and a value that is not provided at all.

For example, in a configuration dictionary where each key is supposed to map to a specific value, MISSING could be used to represent keys that have not been assigned a value yet. It signals that the value is expected but not available, which is different from being intentionally set to None.

# Relies on MISSING as defined above.
config = {
    "timeout": 30,
    "mode": MISSING,  # Indicates that the mode setting is yet to be configured
}
if config["mode"] is MISSING:
    print("Mode is not yet configured")

Mode is not yet configured
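The same identity check also covers keys that are absent altogether: passing the sentinel as the fallback to `dict.get` separates "key not present" from "key present but set to None". A minimal sketch, using a simplified `object()` stand-in for the `MISSING` singleton above; the `lookup` helper and the `proxy` key are hypothetical.

```python
# Simplified stand-in for the MISSING singleton defined above.
MISSING = object()

config = {"timeout": 30, "proxy": None}


def lookup(settings: dict, key: str) -> str:
    # dict.get returns the sentinel only when the key is absent.
    value = settings.get(key, MISSING)
    if value is MISSING:
        return f"{key}: not configured"
    if value is None:
        return f"{key}: explicitly disabled"
    return f"{key}: {value}"


for key in ("timeout", "proxy", "mode"):
    print(lookup(config, key))
```

Calling `settings.get(key)` without a fallback would return None for both the `proxy` and `mode` keys, hiding the difference.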

NotGiven vs. MISSING#

  • Use NOT_GIVEN to explicitly indicate that no value has been provided for a parameter, especially when None is a valid input with a specific meaning.

  • Use MISSING to represent an absent or unassigned value in data structures or configurations, where you need to differentiate between an unassigned state and a value explicitly set to None.
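One way to picture the two roles side by side is a setter whose argument defaults to NOT_GIVEN and which operates on a config whose stored value starts out as MISSING. This is only a sketch under simplifying assumptions: `_Sentinel` is a small non-singleton stand-in for the classes above, and `set_mode` is a hypothetical helper.

```python
from __future__ import annotations


class _Sentinel:
    """Tiny named sentinel, a simplified stand-in for the classes above."""

    def __init__(self, name: str) -> None:
        self._name = name

    def __bool__(self) -> bool:
        return False

    def __repr__(self) -> str:
        return self._name


NOT_GIVEN = _Sentinel("NOT_GIVEN")  # marks an omitted function argument
MISSING = _Sentinel("MISSING")      # marks a stored value that is not yet set


def set_mode(config: dict, mode: str | None | _Sentinel = NOT_GIVEN) -> dict:
    # NOT_GIVEN: the caller did not pass `mode`, so leave the config untouched.
    if mode is NOT_GIVEN:
        return config
    # Anything else, including None, is an explicit choice and is stored.
    return {**config, "mode": mode}


config = {"timeout": 30, "mode": MISSING}
untouched = set_mode(config)
disabled = set_mode(config, None)
print(untouched["mode"], disabled["mode"])
```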

References and Further Readings#