Two Sum#
Problem#
Given an array of integers nums
and an integer target
, return indices of
the two numbers such that they add up to target
.
You may assume that each input would have exactly one solution, and you may not use the same element twice.
You can return the answer in any order.
Intuition#
This problem can be solved by using a
hash map to store the numbers in
the nums
array as keys and their indices as values. By doing so, we can
achieve a \(\mathcal{O}(1)\) look-up time for checking if a number exists in our
array. For each number in the nums
array, we calculate the difference between
the target
and that number, and then check if this difference exists in our
hash map. If it does, we have found our two numbers that add up to the target.
Assumptions#
Each given input will have exactly one solution.
The same element cannot be used twice in the pair.
The order of indices in the output does not matter.
All the elements in the
nums
array are integers.The
target
is also an integer.The
nums
array will at least have two elements (since we are looking for a pair).
Constraints#
It’s assumed that the
nums
array size is in the range of:\[ 2 \leq \text{nums.length} \leq 10^4. \]Each element in
nums
array is an integer in the range of:\[ -10^9 \leq \text{nums[i]} \leq 10^9. \]The
target
is an integer in the range of:\[ -10^9 \leq \text{target} \leq 10^9. \]There is only one valid answer.
What are Constraints for?#
In programming problems, constraints are given to define the scope and limits of the problem. They help to determine the feasible approaches and solutions for the problem by providing information about the range and characteristics of the input data. They help us to anticipate the worst-case scenarios that our algorithm should be able to handle without leading to inefficiencies or failures, like time limit exceeded or memory limit exceeded errors.
In the context of the problem at hand:
The constraint on
nums.length
(i.e., \(2 \leq \text{nums.length} \leq 10^4\)) specifies the minimum and maximum size of thenums
array that our algorithm needs to handle. Knowing this, we can evaluate whether our solution would scale efficiently for the largest possible input size. In this case, a solution with a time complexity of \(\mathcal{O}(n)\) or \(\mathcal{O}(n \log n)\) would likely be acceptable, but a solution with a time complexity of \(\mathcal{O}(n^2)\) may not be efficient enough.The constraints on the individual elements of
nums
and ontarget
(i.e., \(-10^9 \leq \text{nums[i]} \leq 10^9\) and \(-10^9 \leq \text{target} \leq 10^9\)) let us know the range of values that these variables can take. These constraints are important to take into account when considering possible edge cases and overflow issues.The specification that there is only one valid answer helps to simplify the problem by indicating that we don’t need to consider scenarios where there might be multiple pairs of numbers in
nums
that add up totarget
.
Test Cases#
Test Case 1:
nums = [2, 7, 11, 15], target = 9
, Return:[0, 1]
Test Case 2:
nums = [3, 2, 4], target = 6
, Return:[1, 2]
Test Case 3:
nums = [3, 3], target = 6
, Return:[0, 1]
Test Case 4:
nums = [-1, -2, -3, -4, -5], target = -8
, Return:[2, 3]
(Negative numbers)Test Case 5:
nums = [0, 1, 2, 0], target = 0
, Return:[0, 3]
(Zero as target)Test Case 6:
nums = [1, 3, 4, 2], target = 6
, Return:[2, 3]
(Elements are not in order)
Edge Cases#
Edge Case 1:
nums = [1], target = 1
, Return:[]
(No pair)Edge Case 2:
nums = [0, 0], target = 0
, Return:[0, 1]
(Zeroes)Edge Case 3:
nums = [0, 1, 2, 0], target = 2
, Return:[1, 3]
(Multiple valid pairs, return any)Edge Case 4:
nums = [-1, 2, 1, -4], target = 1
, Return:[0, 1]
(Negative and positive numbers)
Walkthrough / Whiteboarding#
Consider nums = [2, 7, 11, 15], target = 9
.
Initialize an empty hash map.
Iterate over the array:
For the first element
2
, calculate9 - 2 = 7
.Check if
7
is in the hash map. It’s not, so add{2:0}
to the hash map.For the second element
7
, calculate9 - 7 = 2
.Check if
2
is in the hash map. It is, so return the current index1
and the stored index0
.
Theoretical Best Time Complexity#
The theoretical best time complexity for this problem is \(\mathcal{O}(n)\), where
\(n\) is the length of the nums
array. This is because the most efficient
solution to this problem involves traversing the array once, which requires
\(\mathcal{O}(n)\) time.
Theoretical Best Space Complexity#
The theoretical best (auxiliary) space complexity for this problem is
\(\mathcal{O}(n)\), where \(n\) is the number of elements in the nums
array. This
is due to the space required by the seen
hash map, which is used to store the
elements of the array along with their indices.
Space-Time Tradeoff#
This represents a space-time tradeoff if we compare this algorithm to the
brute force method. The brute force method uses \(\mathcal{O}(n^2)\) time and
\(\mathcal{O}(1)\) space. Here, we are using additional space (the
seen
hash map) to reduce the time complexity of our solution from
\(\mathcal{O}(n^2)\) (if we were to use a brute-force approach with two nested
loops) to \(\mathcal{O}(n)\).
Solution#
Note that using python’s built in enumerate
would be cleaner, but we will
refrain from using it here to demonstrate the logic behind the solution.
Implementation#
from typing import List, Dict
class Solution:
def twoSum(self, nums: List[int], target: int) -> List[int]:
"""
Given a list of integers and a target integer, this function finds two
integers in the list that sum to the target. Returns a list of two indices
of the numbers that add up to the target.
Parameters
----------
nums : List[int]
The list of numbers.
target : int
The target number.
Returns
-------
List[int]
The indices of the two numbers that sum to the target.
Examples
--------
>>> nums = [15, 2, 3, 7]
>>> target = 9
>>> Solution().twoSum(nums, target)
[1, 3]
"""
seen: Dict[int, int] = {}
index: int = 0
for num in nums:
complement: int = target - num
if complement not in seen:
seen[num] = index
else:
return [seen[complement], index]
index += 1
nums = [15, 2, 3, 7]
target = 9
Solution().twoSum(nums, target)
[1, 3]
Time Complexity#
If we denote \(\mathcal{T}(n)\) as the time complexity of your function, we can break down the time complexity of each major step.
The time complexity of the algorithm is \(\mathcal{O}(n)\). This is because we traverse the list of length \(n\) a single time. Every operation inside the loop, including the table lookup, has a time complexity of \(\mathcal{O}(1)\).
To understand this in a more practical sense, consider the for loop from lines
32
to 38
. Each line inside this loop has a time complexity of
\(\mathcal{O}(1)\), meaning it takes a constant amount of time to execute,
regardless of the size of the input. This suggests that a typical iteration
inside the loop would require approximately \(\mathcal{O}(3)\) time, given that
there are three operations within the loop.
However, since the loop runs \(n\) times (where \(n\) is the length of the input list), the overall time required would be \(n\) times the time required for a single iteration. This gives us:
Note that the \(\mathcal{O}(1)\) term comes from the time required to initialize the hashmap. This is a constant time operation that is independent of the size of the input list.
In big-O notation, we typically omit constant multipliers since we’re primarily interested in how the runtime scales with the size of the input. Hence, we simplify \(\mathcal{O}(3n)\) to \(\mathcal{O}(n)\). This signifies that the runtime of the algorithm grows linearly with the size of the input list.
Case |
Worst Case |
Average Case |
Best Case |
---|---|---|---|
Element is in the list (Two numbers add up to target) |
\(\mathcal{O}(n)\) |
\(\mathcal{O}\left(\frac{n}{2}\right)\) |
\(\mathcal{O}(1)\) |
Element is not in the list (No two numbers add up to target) |
\(\mathcal{O}(n)\) |
\(\mathcal{O}(n)\) |
\(\mathcal{O}(n)\) |
(The Average Case)
If the inputs are uniformly distributed, then each possible pair of indices \((i, j)\) that sums to the target is equally likely.
Assuming nums
is an array of \(n\) elements, and \((i, j)\) is a solution (i.e.,
nums[i] + nums[j] == target
), where \(i\) and \(j\) are any indices in the array.
If the distribution of inputs is uniform, then the probability that \((i, j)\) is the solution is the same for all valid pairs of indices \((i, j)\).
So if you have to find one pair \((i, j)\) that sums to target
, on average, you
would have to search through half of all possible pairs before finding the
correct one. Therefore, on average, you’d inspect \(n/2\) elements in nums
before finding the pair that sums to target
.
So technically, we can think of the average time complexity to be \(\mathcal{O}\left(\frac{n}{2}\right)\). However, as we simplify this to its highest order term and ignore the coefficients in Big O notation, the average case complexity becomes \(\mathcal{O}(n)\).
Space Complexity#
If we denote \(\mathcal{S}(n)\) as the space complexity of your function, we can break down the space complexity of each major step as well.
Type |
Complexity |
Description |
---|---|---|
Input Space |
\(\mathcal{O}(n)\) |
The input space depends on the size of the input list |
Auxiliary Space |
\(\mathcal{O}(n)\) |
The auxiliary space depends on the size of the |
Total Space |
\(\mathcal{O}(n)\) |
The total space is the sum of the input and auxiliary space, hence it is also \(\mathcal{O}(n)\). |
The auxiliary (extra) space required depends on the number of items stored in
the dictionary seen
, which stores at most \(n\) elements.
The space complexity of an algorithm quantifies the amount of space or memory
the algorithm requires in relation to the size of the input. Let’s analyze the
space complexity of the twoSum
function.
Input Space#
The input space for this function is the space needed to store the input data,
which in this case is the list nums
. Since nums
is a list of n
integers,
the space required to store this list is proportional to n
. Therefore, the
input space complexity is \(\mathcal{O}(n)\).
Auxiliary Space#
The auxiliary space is the extra space or the temporary space used by the
algorithm. In the twoSum
function, we use a dictionary seen
to keep track of
the numbers we’ve already seen and their indices. In the worst-case scenario, we
might end up storing all n
elements from nums
in the seen
dictionary.
Therefore, the auxiliary space complexity is also \(\mathcal{O}(n)\).
Total Space#
The total space required by an algorithm is the sum of the input space and the
auxiliary space. In this case, the total space is
\(\mathcal{O}(n) + \mathcal{O}(n)\), which simplifies to \(\mathcal{O}(n)\).
Therefore, the total space complexity of the twoSum
function is
\(\mathcal{O}(n)\).
In summary, the twoSum
function has a space complexity of \(\mathcal{O}(n)\),
where n
is the number of elements in the input list nums
. This is because it
requires storing the input data (nums
) and a dictionary (seen
) that may
contain up to n
elements in the worst case.