Two Pointer Algorithm: Li Yin January 19, 2019

Download as pdf or txt
Download as pdf or txt
You are on page 1of 15

Two Pointer Algorithm

Li Yin1

January 19, 2019

1
www.liyinscience.com
ii
Linear Search

0.1 Two-pointer Search


refers https://tp-iiita.quora.com/The-Two-Pointer-Algorithm
There are actually 50/900 problems on LeetCode are tagged as two point-
ers. Two pointer search algorithm are normally used to refer to searching
that use two pointer in one for/while loop over the given data structure.
Therefore, this part of algorithm gives linear performance as of O(n). While,
it does not refer to situation such as searching a pair of items in an array
that sums up to a given target value, then two nested for loops are needed
to search all the possible pairs. There are different ways to put these two
pointers:

1. Equi-directional: Both start from the beginning: we have slow-faster


pointer, sliding window algorithm.

2. Opposite-directional: One at the start and the other at the end, they
move close to each other and meet in the middle, (-> <-).

In order to use two pointers, most times the data structure needs to be
ordered in some way, and decrease the time complexity from O(n2 ) or O(n3 )
of two/three nested for/while loops to O(n) of just one loop with two pointers
and search each item just one time. In some cases, the time complexity is
highly dependable on the data and the criteria we set.
As shown in Fig. 1, the pointer i and j can decide: a pair or a subarray
(with all elements starts from i and end at j). We can either do search
related with a pair or a subarray. For the case of subarray, the algorithm
is called sliding window algorithm. As we can see, two pointers and sliding
window algorithm can be used to solve K sum (Section ??), most of the
subarray (Section ??), and string pattern match problems (Section ??).
Two pointer algorithm is less of a talk and more of problem attached. We
will explain this type of algorithm in virtue of both the leetcode problems
and definition of algorihtms. To understand two pointers techniques, better
to use examples, here we use two examples: use slow-faster pointer to find

iii
iv LINEAR SEARCH

Figure 1: Two pointer Example

node of a given linked list.png

Figure 2: Slow-fast pointer to find middle

the median and Floyd’s fast-slow pointer algorithm for loop detection in an
array/linked list and two pointers to get two sum.

0.1.1 Slow-fast Pointer


Find middle node of linked list The simpest example of slow-fast
pointer application is to get the middle node of a given linked list. (LeetCode
problem: 876. Middle of the Linked List)
Example 1 ( odd l e n g t h ) :

Input : [ 1 , 2 , 3 , 4 , 5 ]
Output : Node 3 from t h i s l i s t ( S e r i a l i z a t i o n : [ 3 , 4 , 5 ] )

Example 2 ( even l e n g t h ) :

Input : [ 1 , 2 , 3 , 4 , 5 , 6 ]
Output : Node 4 from t h i s l i s t ( S e r i a l i z a t i o n : [ 4 , 5 , 6 ] )

[h] We place two pointers simultaneously at the head node, each one moves
at different paces, the slow pointer moves one step and the fast moves two
steps instead. When the fast pointer reached the end, the slow pointer will
stop at the middle. For the loop, we only need to check on the faster pointer,
0.1. TWO-POINTER SEARCH v

make sure fast pointer and fast.next is not None, so that we can successfuly
visit the fast.next.next. When the length is odd, fast pointer will point at
the end node, because fast.next is None, when its even, fast pointer will
point at None node, it terimates because fast is None.
1 d e f middleNode ( s e l f , head ) :
2 slow , f a s t = head , head
3 w h i l e f a s t and f a s t . next :
4 f a s t = f a s t . next . next
5 s l o w = s l o w . next
6 return slow

Floyd’s Cycle Detection (Floyd’s Tortoise and Hare) Given a linked


list which has a cycle, as shown in Fig. 3. To check the existence of the cycle
is quite simple. We do exactly the same as traveling by the slow and fast
pointer above, each at one and two steps. (LeetCode Problem: 141. Linked
List Cycle). The code is pretty much the same with the only difference been
that after we change the fast and slow pointer, we check if they are the same
node. If true, a cycle is detected, else not.
1 d e f h a s C y c l e ( s e l f , head ) :
2 s l o w = f a s t = head
3 w h i l e f a s t and f a s t . next :
4 s l o w = s l o w . next
5 f a s t = f a s t . next . next
6 i f s l o w == f a s t :
7 r e t u r n True
8 return False

In order to know the starting node of the cycle. Here, we set the distance
of the starting node of the cycle from the head is x, and y is the distance from
the start node to the slow and fast pointer’s node, and z is the remaining
distance from the meeting point to the start node.
Now, let’s try to device the algorithm. Both slow and fast pointer starts
at position 0, the node index they travel each step is: [0,1,2,3,...,k] and
[0,2,4,6,...,2k] for slow and fast pointer respectively. Therefore, the total
distance traveled by the slow pointer is half of the distance travelled by the
fat pointer. From the above figure, we have the distance travelled by slow
pointer to be ds = x+y, and for the fast pointer df = x+y+z+y = x+2y+z.
With the relation 2 ∗ ds = df . We will eventually get x = z. Therefore, by
moving slow pointer to the start of the linked list after the meeting point,
and making both slow and fast pointer to move one node at a time, they
will meet at the starting node of the cycle. (LeetCode problem: 142. Linked
List Cycle II (medium)).
1 d e f d e t e c t C y c l e ( s e l f , head ) :
2 s l o w = f a s t = head
3 bCycle = F a l s e
4 w h i l e f a s t and f a s t . next :
vi LINEAR SEARCH

Figure 3: Floyd’s Cycle finding Algorithm

5 s l o w = s l o w . next
6 f a s t = f a s t . next . next
7 i f s l o w == f a s t : # a c y c l e i s found
8 bCycle = True
9 break
10
11 i f not bCycle :
12 r e t u r n None
13 # r e s e t t h e s l o w p o i n t e r t o f i n d t h e s t a r t i n g node
14 s l o w = head
15 w h i l e f a s t and s l o w != f a s t :
16 s l o w = s l o w . next
17 f a s t = f a s t . next
18 return slow

Figure 4: One example to remove cycle

In order to remove the cycle as shown in Fig. 4, the starting node is when
slow and fast intersect, the last fast node before they meet. For the example,
we need to set -4 node’s next node to None. Therefore, we modify the above
code to stop at the last fast node instead:
1 # r e s e t t h e s l o w p o i n t e r t o f i n d t h e s t a r t i n g node
2 s l o w = head
3 w h i l e f a s t and s l o w . next != f a s t . next :
4 s l o w = s l o w . next
0.1. TWO-POINTER SEARCH vii

5 f a s t = f a s t . next
6 f a s t . next = None

0.1.2 Opposite-directional Two pointer


Two pointer is usually used for searching a pair in the array. There are
cases the data is organized in a way that we can search all the result space
by placing two pointers each at the start and rear of the array and move
them to each other and eventually meet and terminate the search process.
The search target should help us decide which pointer to move at that step.
This way, each item in the array is guaranteed to be visited at most one
time by one of the two pointers, thus making the time complexity to be
O(n). Binary search used the technique of two pointers too, the left and
right pointer together decides the current searching space, but it erase of
half searching space at each step instead.

Two Sum - Input array is sorted Given an array of integers that is


already sorted in ascending order, find two numbers such that they add up
to a specific target number. The function twoSum should return indices of
the two numbers such that they add up to the target, where index1 must
be less than index2. (LeetCode problem: 167. Two Sum II - Input array is
sorted (easy).)
I n p u t : numbers = [ 2 , 7 , 1 1 , 1 5 ] , t a r g e t = 9
Output : [ 1 , 2 ]
E x p l a n a t i o n : The sum o f 2 and 7 i s 9 . T h e r e f o r e i n d e x 1 = 1 ,
index2 = 2 .

Due to the fact that the array is sorted which means in the array [s,s1
..., e1, e], the sum of any two integer is in range of [s+s1, e1+e]. By placing
two pointers each start from s and e, we started the search space from the
middle of the possible range. [s+s1, s+e, e1+e]. Compare the target t with
the sum of the two pointers v1 and v2 :

1. t == v1 + v2 : found

2. v1 + v2 < t: we need to move to the right side of the space, then we


increase v1 to get larger value.

3. v1 + v2 > t: we need to move to the left side of the space, then we


decrease v2 to get smaller value.

1 d e f twoSum ( s e l f , numbers , t a r g e t ) :
2 #u s e two p o i n t e r s
3 n = l e n ( numbers )
4 i , j = 0 , n−1
5 while i < j :
6 temp = numbers [ i ] + numbers [ j ]
viii LINEAR SEARCH

7 i f temp == t a r g e t :
8 r e t u r n [ i +1 , j +1]
9 e l i f temp < t a r g e t :
10 i += 1
11 else :
12 j −= 1
13 return [ ]

0.1.3 Sliding Window Algorithm

Figure 5: Sliding Window Algorithm

Given an array, imagine that we have a fixed size window as shown in


Fig. 5, and we can slide it forward each time. If we are asked to compute
the sum of each window, the bruteforce solution would be O(kn) where k is
the window size and n is the array size by using two nested for loops, one to
set the starting point, and the other to compute the sum. A sliding window
algorithm applied here used the property that the sum of the current window
(Sc ) can be computed from the last winodow (Sl ) knowling the items that
just slided out and moved in as ao and ai . Then Sc = Sl − ao + ai . Not
necessarily using sum, we generalize it as state, if we can compute Sc from Sl ,
ao and ai in O(1), a function Sc = f (Sl , ao , ai ) then we name this sliding
window property. Therefore the time complexity will be decreased to
O(n).
1 d e f f i x e d S l i d e W i n d o w (A, k ) :
2 n = l e n (A)
3 i f k >= n :
4 r e t u r n sum (A)
5 # compute t h e f i r s t window
6 a c c = sum (A [ : k ] )
7 ans = a c c
8 # s l i d e t h e window
9 f o r i i n r a n g e ( n−k ) : # i i s t h e s t a r t p o i n t o f t h e window
10 j = i + k # j i s t h e end p o i n t o f t h e window
11 a c c = a c c − A[ i ] + A[ j ]
12 ans = max( ans , a c c )
0.1. TWO-POINTER SEARCH ix

13 r e t u r n ans

When to use sliding window It is important to know when we can use


sliding window algorithm, we summarize three important standards:

1. It is a subarray/substring problem.

2. sliding window property:T he requirement of the sliding window


satisfy the sliding window property.

3. Completeness: by moving the left and right pointer of the sliding


window in a way that we can cover all the search space. Sliding window
algorithm is about optimization problem, and by moving the left and
right pointer we can search the whole searching space. Therefore,
to testify that if applying the sliding window can cover the
whole search space and guarentee the completeness decide if
the method works.

For example, 644. Maximum Average Subarray II (hard) does not satisfy
the completeness. Because the average of subarray does not follow a certain
order that we can decided how to move the window.

Flexible Sliding Window Algorithm Another form of sliding window


where the window size is flexble, and it can be used to solve a lot of real
problems related to subarray or substring that is conditioned on some pat-
tern. Compared with the fixed size window, we can first fix the left pointer,
and push the right pointer to enlarge the window in order to find a subarray
satisfy a condition. Once the condition is met, we save the optimal result
and shrink the window by moving the left pointer in a way that we can set
up a new starting pointer to the window (shrink the window). At any point
in time only one of these pointers move and the other one remains fixed.

Sliding Window Algorithm with Sum In this part, we list two exam-
ples that we use flexible sliding window algorithms to solve subarray problem
with sum condition.
Given an array of n positive integers and a positive integer s, find the
minimal length of a contiguous subarray of which the sum >= s. If there isn’t
one, return 0 instead. (LeetCode Problem: 209. Minimum Size Subarray
Sum (medium)).
Example :

I n p u t : s = 7 , nums = [ 2 , 3 , 1 , 2 , 4 , 3 ]
Output : 2
E x p l a n a t i o n : t h e s u b a r r a y [ 4 , 3 ] has t h e minimal l e n g t h under t h e
problem c o n s t r a i n t .
x LINEAR SEARCH

Figure 6: The array and the prefix sum

As we have shown in Fig. 6, the prefix sum is the subarray starts with the
first item in the array, we know that the sum of the subarray is monotoni-
cally increasing as the size of the subarray increase. Therefore, we place a
’window’ with left and right as i and j at the first item first. The steps are
as follows:

1. Get the optimal subarray starts from current i, 0: Then we first move
the j pointer to include enough items that sum[0:j+1]>=s, this is the
process of getting the optimial subarray that starts with 0. And as-
sume j stops at e0

2. Get the optimal subarray ends with current j, e0 : we shrink the window
size by moving the i pointer forward so that we can get the optimal
subarray that ends with current j and the optimal subarray starts from
s0 .

3. Now, we find the optimal solution for subproblem [0:i,0:j](the start


point in range [0, i] and end point in range [0,j]. Starts from next i
and j, and repeat step 1 and 2.

The above process is a standard flexible window size algorithm, and it is a


complete search which searched all the possible result space. Both j and i
pointer moves at most n, it makes the total operations to be at most 2n,
which we get time complexity as O(n).
1 d e f minSubArrayLen ( s e l f , s , nums ) :
2 ans = f l o a t ( ’ i n f ’ )
0.1. TWO-POINTER SEARCH xi

3 n = l e n ( nums )
4 i = j = 0
5 acc = 0 # acc i s the s t a t e
6 while j < n :
7 a c c += nums [ j ]# i n c r e a s e t h e window s i z e
8 w h i l e a c c >= s :# s h r i n k t h e window t o g e t t h e o p t i m a l
result
9 ans = min ( ans , j −i +1)
10 a c c −= nums [ i ]
11 i += 1
12 j +=1
13 r e t u r n ans i f ans != f l o a t ( ’ i n f ’ ) e l s e 0

What happens if there exists negative number in the


array?
Sliding window algorithm will not work any more, because the sum of
the subarray is no longer monotonically increase as the size increase.
Instead (1) we can use prefix sum and organize them in order, and use
binary search to find all posible start index. (2) use monotone stack
(see LeetCode probelm: 325. Maximum Size Subarray Sum Equals k,
325. Maximum Size Subarray Sum Equals k (hard)))

More similar problems:

1. 674. Longest Continuous Increasing Subsequence (easy)

Sliding Window Algorithm with Substring For substring problems,


to be able to use sldiing window, s[i,j] should be gained from s[i,j-1] and
s[i-1,j-1] should be gained from s[i,j-1]. Given a string, find the length of
the longest substring without repeating characters. (LeetCode Problem: 3.
Longest Substring Without Repeating Characters (medium))
Example 1 :

I n p u t : " abcabcbb "


Output : 3
E x p l a n a t i o n : The answer i s " abc " , with t h e l e n g t h o f 3 .

Example 2 :

I n p u t : " bbbbb "


Output : 1
E x p l a n a t i o n : The answer i s " b " , with t h e l e n g t h o f 1 .

First, we know it is a substring problem. Second, it askes to find substring


that only has unique chars, we can use hashmap to record the chars in
current window, and this satisfy the sliding window property. When the
current window violates the condition ( a repeating char), we shrink the
xii LINEAR SEARCH

window in a way to get rid of this char in the current window by moving
the i pointer one step after this char.
1 def lengthOfLongestSubstring ( s e l f , s ) :
2 i f not s :
3 return 0
4 n = len ( s )
5 state = set ()
6 i = j = 0
7 ans = − f l o a t ( ’ i n f ’ )
8 while j < n :
9 i f s [ j ] not i n s t a t e :
10 s t a t e . add ( s [ j ] )
11 ans = max( ans , j −i )
12 else :
13 # s h r i n k t h e window : g e t t h i s c h a r out o f t h e window
14 w h i l e s [ i ] != s [ j ] : # f i n d t h e c h a r
15 s t a t e . remove ( s [ i ] )
16 i += 1
17 # skip t h i s char
18 i += 1
19 j += 1
20 r e t u r n ans i f ans != − f l o a t ( ’ i n f ’ ) e l s e 0

Now, let us see another example with string ang given a pattern to match.
Given a string S and a string T, find the minimum window in S which will
contain all the characters in T in complexity O(n). (LeetCode Problem: 76.
Minimum Window Substring (hard))
Example :

Input : S = "ADOBECODEBANC" , T = "ABC"


Output : "BANC"

In this problem, the desirable window is one that has all characters from
T. The solution is pretty intuitive. We keep expanding the window by
moving the right pointer. When the window has all the desired characters,
we contract (if possible) and save the smallest window till now. The only
difference compared with the above problem is the definition of desirable:
we need to compare the state of current window with the required state in
T. They can be handled as a hashmap with character as key and frequency
of characters as value.
1 d e f minWindow ( s e l f , s , t ) :
2 d i c t _ t = Counter ( t )
3 s t a t e = Counter ( )
4 required = len ( dict_t )
5
6 # l e f t and r i g h t p o i n t e r
7 i , j = 0, 0
8
9 formed = 0
10 ans = f l o a t ( " i n f " ) , None # min l e n , and s t a r t pos
0.1. TWO-POINTER SEARCH xiii

11
12 while j < len ( s ) :
13 char = s [ j ]
14 # record current state
15 i f char in dict_t :
16 s t a t e [ c h a r ] += 1
17 i f s t a t e [ c h a r ] == d i c t _ t [ c h a r ] :
18 formed += 1
19
20 # Try and c o n t r a c t t h e window t i l l t h e p o i n t where i t
c e a s e s t o be ’ d e s i r a b l e ’ .
21 # bPrint = False
22 w h i l e i<=j and formed == r e q u i r e d :
23 # i f not b P r i n t :
24 # p r i n t ( ’ found : ’ , s [ i : j +1] , i , j )
25 # b P r i n t = True
26 char = s [ i ]
27 i f j −i +1 < ans [ 0 ] :
28 ans = j − i + 1 , i
29 # change t h e s t a t e
30 i f char in dict_t :
31 s t a t e [ c h a r ] −= 1
32 i f s t a t e [ c h a r ] == d i c t _ t [ c h a r ] −1:
33 formed −= 1
34
35 # Move t h e l e f t p o i n t e r ahead ,
36 i += 1
37
38 # Keep expanding t h e window
39 j += 1
40 # i f bPrint :
41 # p r i n t ( ’ move t o : ’ , s [ i : j +1] , i , j )
42 r e t u r n " " i f ans [ 0 ] == f l o a t ( " i n f " ) e l s e s [ ans [ 1 ] : ans [ 1 ] +
ans [ 0 ] ]

The process would be:


found : ADOBEC 0 5
move t o : DOBECO 1 6
found : DOBECODEBA 1 10
move t o : ODEBAN 6 11
found : ODEBANC 6 12
move t o : ANC 10 13

Three Pointers and Sliding Window Algorithm Sometimes, by ma-


nipulating two pointers are not enough for us to get the final solution.
0.1 930. Binary Subarrays With Sum. In an array A of 0s and 1s,
how many non-empty subarrays have sum S?
Example 1 :

I nput : A = [ 1 , 0 , 1 , 0 , 1 ] , S = 2
Output : 4
xiv LINEAR SEARCH

Explanation :
The 4 s u b a r r a y s a r e b o l d e d below :
[1 ,0 ,1 ,0 ,1]
[1 ,0 ,1 ,0 ,1]
[1 ,0 ,1 ,0 ,1]
[1 ,0 ,1 ,0 ,1]

Note: A.length <= 30000, 0 <= S <= A.length, A[i] is either 0 or 1.


For example in the following problem, if we want to use two pointers to
solve the problem, we would find we miss the case; like in the example
1, 0, 1, 0, 1, when j = 5, i = 1, the sum is 2, but the algorithm would
miss the case of i = 2, which has the same sum value.
To solve this problem, we keep another index ih i, in addition to the
moving rule of i, it also moves if the sum is satisfied and that value
is 0. This is actually a Three pointer algorithm, it is also a mutant
sliding window algorithm.
1 class Solution :
2 d e f numSubarraysWithSum ( s e l f , A, S ) :
3 i _ l o , i_hi , j = 0 , 0 , 0 #i _ l o <= j
4 sum_window = 0
5 ans = 0
6 w h i l e j < l e n (A) :
7
8 sum_window += A[ j ]
9
10 w h i l e i _ l o < j and sum_window > S :
11 sum_window −= A[ i _ l o ]
12 i _ l o += 1
13 # up t i l l here , i t i s s t a n d a r d s l i d i n g window
14
15 # now s e t t h e e x t r a p o i n t e r a t t h e same
l o c a t i o n of the i_lo
16 i_hi = i_lo
17 w h i l e i _ h i < j and sum_window == S and not A[
i_hi ] :
18 i _ h i += 1
19 i f sum_window == S :
20 ans += i _ h i − i _ l o + 1
21
22 j += 1 #i n c r e a s e t h e p o i n t e r a t l a s t s o t h a t we
do not need t o check i f j <l e n a g a i n
23
24 r e t u r n ans

Summary Sliding Window is a powerful tool for solving certain subar-


ray/substring related problems. The normal situations where we use sliding
window is summarized:
• Subarray: for an array with numerical value, it requires all posi-
tive/negative values so that the prefix sum/product has monotonicity.
0.1. TWO-POINTER SEARCH xv

• Substring: for an array with char as value, it requires the state of each
subarray does not related to the order of the characters (anagram-like
state) so that we can have the sliding window property.

The steps of using sliding windows:

1. Initialize the left and right pointer;

2. Handle the right pointer and record the state of the current window;

3. While the window is in the state of desirable: record the optimal


solution, move the left pointer and record the state (change or stay
unchanged).

4. Up till here, the state is not desirable. Move the right pointer in order
to find a desirable window;

0.1.4 LeettersCode Problems


Sliding Window

0.1 76. Minimum Window Substring

0.2 438. Find All Anagrams in a String

0.3 30. Substring with Concatenation of All Words

0.4 159. Longest Substring with At Most Two Distinct Characters

0.5 567. Permutation in String

0.6 340. Longest Substring with At Most K Distinct Characters

0.7 424. Longest Repeating Character Replacement

You might also like