Arrays and lists
Reviewing the basics of arrays and tips for solving array-style code challenges.
After I didn't pass my last coding interview, I decided to spend some time learning about data structures and algorithms, with the goal of improving my performance in technical interviews.
I've been doing daily challenges on AlgoExpert and watching their "Crash Course" series. I'm grateful for this platform as it simplifies abstract, complex topics and facilitates learning.
When I first began studying, even the "Easy" challenges were difficult for me, taking over an hour to solve even with a "brute force" solution.
Embracing a "growth mindset" to learn from mistakes, today I revisited some of the beginner challenges and was encouraged to discover that I could find a working solution faster, discuss trade-offs and find an optimized/optimal solution. The challenges are now taking me between 15 and 30 minutes to solve.
I decided to document my learning journey here on the blog, in the hope of consolidating knowledge and keeping track of progress.
Today I'd like to focus on Arrays/Lists, simple yet powerful data structures that are built into most popular programming languages.
Arrays: basic concepts
Arrays store ordered collections of data with zero-based indexing. In Python and JavaScript, they are written with square brackets and comma-separated values. Under the hood, arrays are stored in contiguous memory slots.
Accessing their data is straightforward: in the array [a, b, c], the elements sit at indices [0], [1] and [2]. Because the slots are contiguous, the address of any element can be computed directly from its index, so accessing data in an array is a constant-time operation, meaning O(1) time complexity.
Inserting or deleting elements from an array can be costly, however, as these operations may require shifting all the elements (O(n) time complexity).
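To make the difference concrete, here is a minimal Python sketch (the list name and values are made up for illustration). It only shows which operations shift elements; it doesn't measure anything.

```python
letters = ["a", "b", "c", "d"]

# Index access: a single step regardless of the list's length (O(1)).
first = letters[0]
last = letters[-1]

# Inserting at the front shifts every existing element one slot right (O(n)).
letters.insert(0, "z")

# Deleting from the front shifts every remaining element one slot left (O(n)).
removed = letters.pop(0)

# Appending at the end doesn't shift anything.
letters.append("e")
```

The closer to the front an insertion or deletion happens, the more elements have to move.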
In some programming languages, arrays are static, while other languages support dynamic arrays. Static arrays have their length defined (memory allocated) when they are created. Increasing the size of a static array (for example, appending a new item to the end of the array) can be a costly operation because the entire array may need to be copied to a new (larger) set of contiguous memory slots.
In contrast, languages like JavaScript and Python make use of dynamic arrays (such as lists in Python and arrays in JavaScript), which abstract away the effort of allocating memory slots. Dynamic arrays are often implemented with extra memory slots (commonly doubling the size of the array when it fills up), so most appends simply fill a spare slot. The occasional copy to a larger segment of memory is spread (amortized) across many appends, which gives O(1) amortized time per append.
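The doubling idea can be sketched with a toy class. ToyDynamicArray and its copies counter are hypothetical names invented for this illustration, and real implementations (including Python's own list) use their own growth policies, so treat this as a model of the idea rather than how any language does it.

```python
class ToyDynamicArray:
    """Hypothetical fixed-capacity buffer that doubles when it is full."""

    def __init__(self):
        self.capacity = 1
        self.length = 0
        self.slots = [None] * self.capacity
        self.copies = 0  # total number of elements moved during resizes

    def append(self, value):
        if self.length == self.capacity:
            self._grow()
        self.slots[self.length] = value
        self.length += 1

    def _grow(self):
        # Allocate a buffer twice as large and copy every element over.
        self.capacity *= 2
        new_slots = [None] * self.capacity
        for i in range(self.length):
            new_slots[i] = self.slots[i]
            self.copies += 1
        self.slots = new_slots


arr = ToyDynamicArray()
for n in range(1000):
    arr.append(n)
```

After 1000 appends, this toy array has moved 1023 elements in total (1 + 2 + 4 + ... + 512), roughly one copy per append on average, even though a few individual appends were expensive.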
Array methods
Programming languages offer built-in methods to allow for data manipulation of arrays. Some of the most common operations include:
Access: Retrieving an element using its index.
Insertion: Adding an element at a specific position (can involve shifting elements).
Deletion: Removing an element (can involve shifting elements).
Traversal: Visiting all elements of the array, typically using loops.
Searching: Finding an element (e.g., linear search, binary search).
Sorting: Arranging elements in a specific order (e.g., bubble sort, quicksort).
Mastering these operations means not only having the methods in your toolbox, but also understanding how they work.
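As a rough illustration, here is how the operations above might look in Python. The list contents are made up, and bisect_left from the standard library stands in for binary search.

```python
from bisect import bisect_left

nums = [7, 3, 9, 1]

# Access by index.
second = nums[1]

# Insertion at a position (later elements shift right).
nums.insert(1, 5)          # nums is now [7, 5, 3, 9, 1]

# Deletion by index (later elements shift left).
del nums[0]                # nums is now [5, 3, 9, 1]

# Traversal with a loop.
total = 0
for n in nums:
    total += n

# Linear search: scans from the left until it finds a match.
position = nums.index(9)

# Sorting into a new list, then binary search on the sorted result.
ordered = sorted(nums)
i = bisect_left(ordered, 5)
found = i < len(ordered) and ordered[i] == 5
```

Binary search only works on sorted data, which is why it comes after the sorting step here.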
Code challenge: Two Number Sum
One of the mistakes I made as a beginner was using built-in methods to perform operations without fully understanding how they work "under the hood".
When I first encountered the "Two Number Sum" problem, I came up with the following solution (in Python):
def twoNumberSum(array, targetSum):
    result = []
    for idx in range(len(array)):
        num = array[idx]
        diff = targetSum - num
        filteredArr = array.copy()
        filteredArr.pop(idx)
        if diff in filteredArr:
            return [num, diff]
    return result
The code above works (it passes all the tests on AlgoExpert), but looking back, I can now see that it was far from the most efficient solution. Namely, calling built-in methods like copy() and pop() makes the algorithm unnecessarily costly from the standpoint of both time and space complexities.
Time complexity:
- Outer loop: the for loop is an O(n) time operation.
- Copy operation: the copy() method is an O(n) time operation, because it traverses the full input array in order to make a copy of it.
- Pop operation: the .pop() method with the index argument is also an O(n) time operation, because in the worst case (idx == 0) it will shift the indices of all the remaining array elements.
- Membership test: finally, the line with if diff in filteredArr: also has an O(n) time complexity, because it may need to traverse the full array in case the item doesn't exist or is the last element.
The total time complexity is O(n^2), because the outer loop runs O(n) times and each iteration performs O(n) work inside it.
Space complexity:
- Copying the array creates a new list of the same size, which takes O(n) additional space.
In this challenge, I learned about two common tricks that can be used to solve many array challenges: 1) sorting the input array and 2) using pointers to keep track of additional datapoints.
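Combining these tricks, I coded a more efficient solution that runs in O(n log(n)) time. Because it sorts the array in place and creates no additional data structures, it uses O(1) space (not counting whatever temporary memory the sort itself needs):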
def twoNumberSum(array, targetSum):
    array.sort()
    result = []
    leftPointer = 0
    rightPointer = len(array) - 1
    # Stop once the pointers meet: a number cannot be paired with itself.
    while leftPointer < rightPointer:
        twoSum = array[leftPointer] + array[rightPointer]
        if twoSum == targetSum:
            return [array[leftPointer], array[rightPointer]]
        elif twoSum < targetSum:
            leftPointer += 1
        elif twoSum > targetSum:
            rightPointer -= 1
    return result
- Sorting the input array is usually a good trick when the problem doesn't depend on the order of elements in the original array. It can help achieve an optimized algorithm because, once the array is sorted, moving right always leads to a larger or equal value and moving left always leads to a smaller or equal one. The .sort() method has a quasilinear time complexity, or O(n log(n)).
- Using two pointers to keep track of data allows us to make use of the properties of a sorted array.
If the sum of the numbers at the left and right positions is higher than the targetSum, that means we should try again with a smaller number on the right, so we move (decrement) the right pointer by 1 and try again.
Conversely, if the sum of the numbers at the left and right positions is less than the targetSum, we should try again with a bigger number on the left, so we move (increment) the left pointer by 1.
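To see the pointers move, here is a small tracing variant. The function name two_number_sum_trace and the sample input are made up for this illustration, and it sorts a copy (so it uses O(n) extra space, unlike the in-place version) to leave the caller's list untouched.

```python
def two_number_sum_trace(array, target_sum):
    """Return (pair, steps); steps records each (left value, right value, sum) tried."""
    ordered = sorted(array)  # sorted copy, so the input list is not modified
    left, right = 0, len(ordered) - 1
    steps = []
    while left < right:
        current = ordered[left] + ordered[right]
        steps.append((ordered[left], ordered[right], current))
        if current == target_sum:
            return [ordered[left], ordered[right]], steps
        if current < target_sum:
            left += 1   # sum too small: try a bigger number on the left
        else:
            right -= 1  # sum too big: try a smaller number on the right
    return [], steps


pair, steps = two_number_sum_trace([3, 5, -4, 8, 11, 1, -1, 6], 10)
```

For this input the sorted list is [-4, -1, 1, 3, 5, 6, 8, 11]. The first attempt (-4 + 11 = 7) is too small, so the left pointer moves, and the second attempt (-1 + 11 = 10) hits the target.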
Code challenge: Validate Subsequence
Another code challenge I revisited is the Validate Subsequence problem: given two non-empty arrays of integers, write a function that determines whether the second array is a subsequence of the first array. A subsequence is a set of numbers that appear in the same order, but not necessarily adjacent. Example: [2, 4] is a subsequence of [1, 2, 3, 4] because the numbers 2 and 4 appear in the same order.
A month ago, I couldn't come up with a good solution that passed all the tests. I spent probably 2 hours and came close with the following code, which passes most of the tests:
def isValidSubsequence(array, sequence):
    if len(sequence) > len(array):
        return False
    if len(sequence) == len(array) and sequence != array:
        return False
    if sequence == array:
        return True
    newArr = array.copy()
    for a in sequence:
        i = newArr.index(a)
        newArr[i::]
        if a not in newArr:
            return False
        return True
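As intended, this algorithm has a time complexity of O(m * n), where m is the length of the sequence and n is the length of the array. However, the check if a not in newArr is inside the loop and is immediately followed by return True, so the loop always exits after the first iteration, which makes the algorithm incorrect for sequences with more than one element. In addition, the bare expression newArr[i::] discards its result, so newArr is never actually shortened, and .index(a) raises a ValueError (instead of reaching the membership test) when a is missing from the array.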
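Two Python behaviors are easy to miss in an attempt like this one, and the short sketch below (with made-up values) shows both: a bare slice expression doesn't modify the list, and .index() raises an exception when the value is absent.

```python
values = [5, 1, 22, 25]

# A bare slice expression builds a new list and immediately throws it away.
values[2:]
unchanged = values == [5, 1, 22, 25]

# The slice only has an effect when its result is assigned to a name.
tail = values[2:]

# .index() raises ValueError when the value is absent.
try:
    values.index(99)
    missing_raises = False
except ValueError:
    missing_raises = True
```

Here unchanged and missing_raises both end up True, and tail holds [22, 25].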
Even setting that mistake aside, in retrospect this was not a great solution, for a few reasons:
- Code is hard to read: different cases are handled individually with if statements at the top of the function.
- Copying the input array with .copy() is a costly O(n) time and O(n) space operation.
- Nested loops: the outer for loop for a in sequence: comprises a few inner loops:
  - Finding the index: .index(a) is an O(n) operation because it may need to traverse the entire array to find a matching element.
  - Slicing the copied array: newArr[i::] is an O(n) operation.
  - Membership test: determining if a not in newArr also requires traversal through the copied, sliced array.
This problem relies on the order of the elements in the input array, so sorting the input arrays was not an option. That said, because the elements must follow the same order, we can still use the two-pointer approach to write an optimized solution:
def isValidSubsequence(array, sequence):
    sequencePointer = 0
    for idx in range(len(array)):
        # 0, 1, 2, 3, 4, 5, 6, 7 | 8
        print("Checking: ", array[idx], sequence[sequencePointer])
        if array[idx] == sequence[sequencePointer]:
            if sequencePointer == len(sequence) - 1:
                return True
            sequencePointer += 1
    return False
The worst-case time complexity of the above algorithm is O(n), where n is the length of the array, since it loops through all the elements in the array at most once. The operations inside the for loop are all constant time operations (comparing the elements at the two pointers and incrementing the sequencePointer).
In terms of space, this algorithm has a constant O(1) space complexity. It uses a couple of variables in constant space, but doesn't create any data structures that might increase based on the input.
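For comparison, the same idea can be written with both pointers explicit. This is only a sketch: the name is_valid_subsequence_sketch is made up, the debug print is left out, and unlike the version above it would also accept an empty sequence (returning True), which the problem statement rules out anyway.

```python
def is_valid_subsequence_sketch(array, sequence):
    array_idx = 0
    sequence_idx = 0
    # Walk the array once; advance the sequence pointer only on a match.
    while array_idx < len(array) and sequence_idx < len(sequence):
        if array[array_idx] == sequence[sequence_idx]:
            sequence_idx += 1
        array_idx += 1
    # Every element of the sequence was matched, in order.
    return sequence_idx == len(sequence)


in_order = is_valid_subsequence_sketch([1, 2, 3, 4], [2, 4])
out_of_order = is_valid_subsequence_sketch([1, 2, 3, 4], [4, 2])
```

With the example from the problem statement, in_order is True, while the reversed sequence [4, 2] gives False because 2 never appears after 4.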
On top of using the pointers trick, I also used two visual tricks, since I am a visual learner:
- Comments: I sometimes find that my code runs out of bounds while looping through an array. To help mitigate this, whenever I write a for loop, I add an example of the loop iterations as a comment, which helps me visualize the value of the variable in each step:
  # 0, 1, 2, 3, 4, 5, 6, 7 | 8
  (For an array of length 8, the loop runs through indices 0 to 7, stopping before the array length 8.)
- Printing: I find it helpful to print the values of variables, almost as if the code were talking to me while it's running. This is particularly helpful for debugging.
Conclusion
By revisiting code challenges and studying up on data structures like arrays, I hope to continuously improve my coding skills. I also hope to make a habit of documenting my journey on the blog, to share tips and tricks and also to consolidate my knowledge.
There is still a lot for me to learn. If you finished reading this, I would enjoy hearing from you: did I miss any key concepts about arrays? What other tricks have you found helpful when solving array-type code problems? Write a comment below or send me a message on LinkedIn: https://www.linkedin.com/in/brossi1/
Thanks for reading,
Bruno