Two Sum#

Problem#

Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target.

You may assume that each input would have exactly one solution, and you may not use the same element twice.

You can return the answer in any order.

Intuition#

This problem can be solved with a hash map that stores the numbers in the nums array as keys and their indices as values. Doing so gives us an average \(\mathcal{O}(1)\) look-up time for checking whether a number has already been seen. For each number in the nums array, we calculate the difference between the target and that number, and then check whether this difference (the complement) already exists in our hash map. If it does, we have found the two numbers that add up to the target.
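
As a rough sketch of this idea (the full, documented implementation appears in the Solution section below; the function name here is illustrative and, unlike the final solution, it uses enumerate for brevity):

def two_sum_sketch(nums, target):
    seen = {}                      # maps number -> index where it was first seen
    for index, num in enumerate(nums):
        complement = target - num  # the value we still need
        if complement in seen:     # average O(1) dictionary lookup
            return [seen[complement], index]
        seen[num] = index          # remember this number for later elements
    return []                      # defensive: the problem guarantees a solution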

Assumptions#

  • Each given input will have exactly one solution.

  • The same element cannot be used twice in the pair.

  • The order of indices in the output does not matter.

  • All the elements in the nums array are integers.

  • The target is also an integer.

  • The nums array will have at least two elements (since we are looking for a pair).

Constraints#

  • It’s assumed that the nums array size is in the range of:

    \[ 2 \leq \text{nums.length} \leq 10^4. \]
  • Each element in nums array is an integer in the range of:

    \[ -10^9 \leq \text{nums[i]} \leq 10^9. \]
  • The target is an integer in the range of:

    \[ -10^9 \leq \text{target} \leq 10^9. \]
  • There is only one valid answer.

What are Constraints for?#

In programming problems, constraints are given to define the scope and limits of the problem. They help to determine the feasible approaches and solutions for the problem by providing information about the range and characteristics of the input data. They help us to anticipate the worst-case scenarios that our algorithm should be able to handle without leading to inefficiencies or failures, like time limit exceeded or memory limit exceeded errors.

In the context of the problem at hand:

  • The constraint on nums.length (i.e., \(2 \leq \text{nums.length} \leq 10^4\)) specifies the minimum and maximum size of the nums array that our algorithm needs to handle. Knowing this, we can evaluate whether our solution would scale efficiently for the largest possible input size. In this case, a solution with a time complexity of \(\mathcal{O}(n)\) or \(\mathcal{O}(n \log n)\) would likely be acceptable, but a solution with a time complexity of \(\mathcal{O}(n^2)\) may not be efficient enough.

  • The constraints on the individual elements of nums and on target (i.e., \(-10^9 \leq \text{nums[i]} \leq 10^9\) and \(-10^9 \leq \text{target} \leq 10^9\)) tell us the range of values these variables can take. These constraints matter when considering possible edge cases and overflow issues (in Python, integers are arbitrary-precision, so overflow is not a concern here, but it can be in languages with fixed-width integers).

  • The specification that there is only one valid answer helps to simplify the problem by indicating that we don’t need to consider scenarios where there might be multiple pairs of numbers in nums that add up to target.

Test Cases#

  • Test Case 1: nums = [2, 7, 11, 15], target = 9, Return: [0, 1]

  • Test Case 2: nums = [3, 2, 4], target = 6, Return: [1, 2]

  • Test Case 3: nums = [3, 3], target = 6, Return: [0, 1]

  • Test Case 4: nums = [-1, -2, -3, -4, -5], target = -8, Return: [2, 4] (Negative numbers)

  • Test Case 5: nums = [0, 1, 2, 0], target = 0, Return: [0, 3] (Zero as target)

  • Test Case 6: nums = [1, 3, 4, 2], target = 6, Return: [2, 3] (Elements are not in order)

Edge Cases#

  • Edge Case 1: nums = [1], target = 1, Return: [] (No pair; this violates the stated constraints, but the implementation below handles it defensively by returning an empty list)

  • Edge Case 2: nums = [0, 0], target = 0, Return: [0, 1] (Zeroes)

  • Edge Case 3: nums = [0, 1, 2, 0], target = 2, Return: [0, 2] or [2, 3] (Multiple valid pairs, return any)

  • Edge Case 4: nums = [-1, 2, 1, -4], target = 1, Return: [0, 1] (Negative and positive numbers)

Walkthrough / Whiteboarding#

Consider nums = [2, 7, 11, 15], target = 9.

  1. Initialize an empty hash map.

  2. Iterate over the array:

    • For the first element 2, calculate 9 - 2 = 7.

    • Check if 7 is in the hash map. It’s not, so add {2:0} to the hash map.

    • For the second element 7, calculate 9 - 7 = 2.

    • Check if 2 is in the hash map. It is, so return the current index 1 and the stored index 0.
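
If it helps to see the intermediate state, the following small instrumented sketch (illustrative only; the function name and print statements are not part of the final solution) reproduces this walkthrough and shows the hash map at each step:

def two_sum_trace(nums, target):
    seen = {}
    for index, num in enumerate(nums):
        complement = target - num
        print(f"index={index}, num={num}, complement={complement}, seen={seen}")
        if complement in seen:
            return [seen[complement], index]
        seen[num] = index
    return []

two_sum_trace([2, 7, 11, 15], 9)
# index=0, num=2, complement=7, seen={}
# index=1, num=7, complement=2, seen={2: 0}
# -> returns [0, 1]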

Theoretical Best Time Complexity#

The theoretical best time complexity for this problem is \(\mathcal{O}(n)\), where \(n\) is the length of the nums array. Any correct algorithm must examine every element at least once in the worst case (an element it never reads could be part of the unique answer), so we cannot do better than linear time; the hash-map solution achieves this bound by traversing the array a single time.

Theoretical Best Space Complexity#

For the optimal \(\mathcal{O}(n)\)-time approach, the best (auxiliary) space complexity is \(\mathcal{O}(n)\), where \(n\) is the number of elements in the nums array. This is due to the space required by the seen hash map, which stores the elements of the array along with their indices. (If time is no concern, the brute-force approach uses only \(\mathcal{O}(1)\) auxiliary space; this tradeoff is discussed next.)

Space-Time Tradeoff#

This represents a space-time tradeoff if we compare this algorithm to the brute force method. The brute force method uses \(\mathcal{O}(n^2)\) time and \(\mathcal{O}(1)\) space. Here, we are using additional space (the seen hash map) to reduce the time complexity of our solution from \(\mathcal{O}(n^2)\) (if we were to use a brute-force approach with two nested loops) to \(\mathcal{O}(n)\).
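
For comparison, here is a minimal brute-force sketch (the function name is illustrative and this is not part of the main solution). It uses only \(\mathcal{O}(1)\) auxiliary space but checks up to \(n(n-1)/2\) pairs, which is where the \(\mathcal{O}(n^2)\) time comes from:

from typing import List

def two_sum_brute_force(nums: List[int], target: int) -> List[int]:
    n = len(nums)
    for i in range(n):              # pick the first element
        for j in range(i + 1, n):   # pair it with every later element
            if nums[i] + nums[j] == target:
                return [i, j]
    return []                       # no pair found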

Solution#

Note that using Python’s built-in enumerate would be cleaner, but we refrain from using it here to keep the index-tracking logic explicit.

Implementation#

from typing import List, Dict

class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        """
        Given a list of integers and a target integer, this function finds two
        integers in the list that sum to the target. Returns a list of two indices
        of the numbers that add up to the target.

        Parameters
        ----------
        nums : List[int]
            The list of numbers.
        target : int
            The target number.

        Returns
        -------
        List[int]
            The indices of the two numbers that sum to the target.

        Examples
        --------
        >>> nums = [15, 2, 3, 7]
        >>> target = 9
        >>> Solution().twoSum(nums, target)
        [1, 3]
        """
        seen: Dict[int, int] = {}  # maps each number to the index where it was seen
        index: int = 0

        for num in nums:
            complement: int = target - num  # the value needed to reach the target
            if complement not in seen:
                seen[num] = index  # remember this number for later elements
            else:
                return [seen[complement], index]
            index += 1

        return []  # no pair found; defensive, since the problem guarantees a solution exists

Calling the function on a sample input:

nums = [15, 2, 3, 7]
target = 9
Solution().twoSum(nums, target)

Output: [1, 3]
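
As a quick sanity check, the test cases and edge cases listed earlier can be run as assertions against the Solution class defined above (this particular implementation happens to return the smaller index first, although the problem accepts any order):

solution = Solution()

assert solution.twoSum([2, 7, 11, 15], 9) == [0, 1]
assert solution.twoSum([3, 2, 4], 6) == [1, 2]
assert solution.twoSum([3, 3], 6) == [0, 1]
assert solution.twoSum([-1, -2, -3, -4, -5], -8) == [2, 4]
assert solution.twoSum([0, 1, 2, 0], 0) == [0, 3]
assert solution.twoSum([1, 3, 4, 2], 6) == [2, 3]
assert solution.twoSum([0, 0], 0) == [0, 1]
assert solution.twoSum([0, 1, 2, 0], 2) == [0, 2]  # multiple valid pairs; this implementation returns [0, 2]
assert solution.twoSum([-1, 2, 1, -4], 1) == [0, 1]
assert solution.twoSum([1], 1) == []               # no pair; relies on the defensive return [] at the end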

Time Complexity#

If we denote \(\mathcal{T}(n)\) as the time complexity of your function, we can break down the time complexity of each major step.

The time complexity of the algorithm is \(\mathcal{O}(n)\). This is because we traverse the list of length \(n\) a single time. Every operation inside the loop, including the table lookup, has a time complexity of \(\mathcal{O}(1)\).

To understand this in a more practical sense, consider the for loop in the implementation above. Each line inside this loop has a time complexity of \(\mathcal{O}(1)\), meaning it takes a constant amount of time to execute, regardless of the size of the input. This suggests that a typical iteration of the loop requires approximately \(\mathcal{O}(3)\) time, given that there are roughly three constant-time operations within the loop (computing the complement, the hash-map lookup, and the insertion or return).

However, since the loop runs \(n\) times (where \(n\) is the length of the input list), the overall time required would be \(n\) times the time required for a single iteration. This gives us:

\[\begin{split} \begin{aligned} \mathcal{T}(n) &= n \cdot \mathcal{O}(3) + \mathcal{O}(1) \\ &\approx \mathcal{O}(3n) \end{aligned} \end{split}\]

Note that the \(\mathcal{O}(1)\) term comes from the time required to initialize the hashmap. This is a constant time operation that is independent of the size of the input list.

In big-O notation, we typically omit constant multipliers since we’re primarily interested in how the runtime scales with the size of the input. Hence, we simplify \(\mathcal{O}(3n)\) to \(\mathcal{O}(n)\). This signifies that the runtime of the algorithm grows linearly with the size of the input list.
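
A rough, illustrative timing sketch can make this linear growth tangible. Timings will vary by machine, the sketch assumes the Solution class defined above is in scope, and the input sizes deliberately go beyond the stated constraint of \(10^4\) just to make the growth visible; the inputs are constructed so that the only valid pair is the last two elements, forcing a full scan:

import time

for n in [10_000, 100_000, 1_000_000]:
    nums = list(range(n))   # 0, 1, ..., n-1
    target = 2 * n - 3      # only (n-2) + (n-1) sums to this, so the whole array is scanned
    start = time.perf_counter()
    Solution().twoSum(nums, target)
    elapsed = time.perf_counter() - start
    print(f"n = {n:>9,}   elapsed = {elapsed:.4f} s")  # elapsed grows roughly 10x as n grows 10x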

Table 50 Time Complexity of Two-Sum Function Using Hash Map#

| Case | Worst Case | Average Case | Best Case |
| --- | --- | --- | --- |
| Element is in the list (two numbers add up to target) | \(\mathcal{O}(n)\) | \(\mathcal{O}\left(\frac{n}{2}\right)\) | \(\mathcal{O}(1)\) |
| Element is not in the list (no two numbers add up to target) | \(\mathcal{O}(n)\) | \(\mathcal{O}(n)\) | \(\mathcal{O}(n)\) |

Remark 67 (The Average Case)

If the inputs are uniformly distributed, then each possible pair of indices \((i, j)\) that sums to the target is equally likely.

Assume nums is an array of \(n\) elements and that \((i, j)\), with \(i < j\), is the unique solution (i.e., nums[i] + nums[j] == target).

If every valid pair is equally likely to be the solution, then the second index \(j\) is roughly equally likely to fall anywhere in the array.

The hash-map algorithm returns as soon as the loop reaches that second index \(j\). So, on average, you would inspect about \(n/2\) elements of nums before finding the pair that sums to target.

So technically, we can think of the average time complexity to be \(\mathcal{O}\left(\frac{n}{2}\right)\). However, as we simplify this to its highest order term and ignore the coefficients in Big O notation, the average case complexity becomes \(\mathcal{O}(n)\).
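
To make this slightly more concrete, suppose (as an illustrative simplification) that the second index \(j\) of the unique solution is uniformly distributed over \(\{2, \dots, n\}\) (using 1-based indexing). The loop stops as soon as it reaches index \(j\), so the expected number of iterations is

\[ \mathbb{E}[\text{iterations}] = \frac{1}{n-1} \sum_{j=2}^{n} j = \frac{1}{n-1} \cdot \frac{(n+2)(n-1)}{2} = \frac{n+2}{2} \approx \frac{n}{2}, \]

which is still \(\mathcal{O}(n)\) once the constant factor is dropped.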

Space Complexity#

If we denote \(\mathcal{S}(n)\) as the space complexity of your function, we can break down the space complexity of each major step as well.

Table 51 Space Complexity#

| Type | Complexity | Description |
| --- | --- | --- |
| Input Space | \(\mathcal{O}(n)\) | Depends on the size of the input list nums. |
| Auxiliary Space | \(\mathcal{O}(n)\) | Depends on the size of the seen dictionary, which in the worst case contains all elements of nums. |
| Total Space | \(\mathcal{O}(n)\) | The sum of the input and auxiliary space, which is also \(\mathcal{O}(n)\). |

The auxiliary (extra) space required depends on the number of items stored in the dictionary seen, which stores at most \(n\) elements.

The space complexity of an algorithm quantifies the amount of space or memory the algorithm requires in relation to the size of the input. Let’s analyze the space complexity of the twoSum function.

Input Space#

The input space for this function is the space needed to store the input data, which in this case is the list nums. Since nums is a list of n integers, the space required to store this list is proportional to n. Therefore, the input space complexity is \(\mathcal{O}(n)\).

Auxiliary Space#

The auxiliary space is the extra space or the temporary space used by the algorithm. In the twoSum function, we use a dictionary seen to keep track of the numbers we’ve already seen and their indices. In the worst-case scenario, we might end up storing all n elements from nums in the seen dictionary. Therefore, the auxiliary space complexity is also \(\mathcal{O}(n)\).

Total Space#

The total space required by an algorithm is the sum of the input space and the auxiliary space. In this case, the total space is \(\mathcal{O}(n) + \mathcal{O}(n)\), which simplifies to \(\mathcal{O}(n)\). Therefore, the total space complexity of the twoSum function is \(\mathcal{O}(n)\).

\[\begin{split} \begin{aligned} \mathcal{S}(n) &= \mathcal{O}(n) + \mathcal{O}(n) \\ &= \mathcal{O}(2n) \\ &\approx \mathcal{O}(n) \end{aligned} \end{split}\]

In summary, the twoSum function has a space complexity of \(\mathcal{O}(n)\), where n is the number of elements in the input list nums. This is because it requires storing the input data (nums) and a dictionary (seen) that may contain up to n elements in the worst case.