Chapter 11. Dynamic Programming III

Chapter 11. Dynamic Programming III#

Game Theory#

You are given an integer array nums. Two players are playing a game with this array: player 1 and player 2.

Player 1 and player 2 take turns, with player 1 starting first. Both players start the game with a score of 0. At each turn, the player takes one of the numbers from either end of the array (i.e., nums[0] or nums[nums.length - 1]) which reduces the size of the array by 1. The player adds the chosen number to their score. The game ends when there are no more elements in the array.

Return true if Player 1 can win the game. If the scores of both players are equal, then player 1 is still the winner, and you should also return true. You may assume that both players are playing optimally.

Leetcode

0486
0877

Analysis#

How to formulate a DP problem?

dp table[i,j]: stores a tuple (first, second) as the maximum score for the first and second player using nums[i,j]
state:
- i, j, player
choices: at each state, the player can choose the left or right of nums[i,j]
- left = nums[i]
- right = nums[j]
state transition:
- player 1 chooses left or right, whichever makes his score highest from nums[i,j]
  - dp[i,j][0] = max(left + dp[i+1][j][1], right + dp[i][j-1][1])
  - max(choosing left, choosing right)
- if player 1 chooses left, then player 2 makes choice from dp[i+1][j] as the first player:
  - dp[i][j][1] = dp[i+1][j][0]
- if player 1 chooses right, then player 2 makes choice from dp[i][j-1] as the first player:
  - dp[i][j][1] = dp[i][j-1][0]
base case:
- apparently, j>=i is required for traversing the dp table
- we can then set the base as i=j, which are the states defined along the diag.

DP Function#

def player1Wins(nums):

    # dp function that returns the maximum score from nums[i,j] for the first and second player
    def dp(i, j):
        # base case:
        if j == i:
            return (nums[i],0)
        
        # state transition
        # player 1 goes first by choosing left or right
        left_max = nums[i] + dp(i+1, j)[1]
        right_max = nums[j] + dp(i, j-1)[1]
        
        if left_max >= right_max:
            left = True 
            first = left_max
        else:
            left = False
            first = right_max 
        # player 2
        if left:
            second = dp(i+1, j)[0]
        else:
            second = dp(i, j-1)[0]
        
        return (first, second)
    
    scores = dp(0,len(nums)-1)
    return scores[0] >= scores[1]

# test
nums = [1,5,2]
print(player1Wins(nums))

nums = [1,5,233,7]
print(player1Wins(nums))

nums = [1,567,1,1]
print(player1Wins(nums))

False
True
True

DP Function with Memo#

def player1Wins(nums):
    # initialize a memo
    memo={}

    # dp function
    def dp(i, j):
        # base case:
        if j == i:
            return (nums[i], 0)
        
        # if in memo
        if (i,j) in memo:
            return memo[(i,j)]
        
        # else; fill in memo
        # state transition
        # player 1 goes first by choosing left or right
        left_max = nums[i] + dp(i+1, j)[1]
        right_max = nums[j] + dp(i, j-1)[1]
        
        if left_max >= right_max:
            left = True 
            first = left_max
        else:
            left = False
            first = right_max 
        # player 2
        if left:
            second = dp(i+1, j)[0]
        else:
            second = dp(i, j-1)[0]
        memo[(i,j)] = (first, second)

        return memo[(i, j)]
    
    (first, second) = dp(0, len(nums)-1)
    return first >= second

# test
nums = [1,5,2]
print(player1Wins(nums))

nums = [1,5,233,7]
print(player1Wins(nums))

nums = [1,567,1,1]
print(player1Wins(nums))

False
True
True

DP table#

def player1Wins(nums):
    n = len(nums)
    dp = [[[0,0] for j in range(n)] for i in range(n)]
    
    # base case
    for i in range(n):
        dp[i][i] = (nums[i],0)
    
    # traverse order? for dp[i][j], we need dp[i+1][j] and dp[i][j-1]
    # therefore, traverse i in a reverse order, and j in normal order
    for i in range(n-2, -1, -1):
        for j in range(i+1, n):
            # player 1 choose first: max(left, right)
            left_max = nums[i] + dp[i+1][j][1]
            right_max = nums[j] + dp[i][j-1][1]
            
            if left_max >= right_max:
                # player 1 choose left
                dp[i][j][0] = left_max
                # player 2:
                dp[i][j][1] = dp[i+1][j][0]
            else:
                # player 1 choose right
                dp[i][j][0] = right_max
                # player 2:
                dp[i][j][1] = dp[i][j-1][0]
    
    return dp[0][n-1][0] >= dp[0][n-1][1]

# test
nums = [1,5,2]
print(player1Wins(nums))

nums = [1,5,233,7]
print(player1Wins(nums))

nums = [1,567,1,1]
print(player1Wins(nums))

False
True
True

def player1Wins(nums):
    n = len(nums)
    dp = [[0] * n for _ in range(n)]
    for i in range(n):
        dp[i][i] = nums[i]
    for i in range(n - 2, -1, -1):
        for j in range(i + 1, n):
            dp[i][j] = max(nums[i] - dp[i + 1][j], nums[j] - dp[i][j - 1])
    return dp[0][n - 1] >= 0

# test
nums = [1,5,2]
print(player1Wins(nums))

nums = [1,5,233,7]
print(player1Wins(nums))

nums = [1,567,1,1]
print(player1Wins(nums))

False
True
True

House Robber#

You are a professional robber planning to rob houses along a street. Each house has a certain amount of money stashed, the only constraint stopping you from robbing each of them is that adjacent houses have security systems connected and it will automatically contact the police if two adjacent houses were broken into on the same night.

Given an integer array nums representing the amount of money of each house, return the maximum amount of money you can rob tonight without alerting the police.

Leetcode

0198
0213
0337

Analysis#

How to define a DP problem?

dp[i] stores the maximum amount of money when robing nums[:i] with house i-1 being robbed.
state: maximum money robbed from nums[:i], with i-1 th house being robbed
choice: rob i-3 or not, the second house before it
state transition:
- dp[i] = max(dp[:i-1]) + nums[i-1]
output: return the maximum of dp
base case:
- i=0, no house for robbibg, which is 0

DP Function#

def rob(nums):
    # maximum so far
    res = 0
    n = len(nums)
    # dp function: from top to down
    def dp(i):
        nonlocal res

        # base case
        if i > n-1:
            return res

        # fill the dp
        res = max(res, dp(i+2) + nums[i])
        return res

    return dp(0)

# test
nums = [1,2,3,1]
print(rob(nums))

nums=[2,7,9,3,1]
print(rob(nums))

4
12

DP Table#

def rob(nums):
    n = len(nums)
    # initialize
    dp = [0]*(n+1) 
    dp[1] = nums[0]

    for i in range(2,n+1):
        dp[i] = max(dp[:i-1]) + nums[i-1]

    return max(dp)

# test
nums = [1,2,3,1]
print(rob(nums))

nums=[2,7,9,3,1]
print(rob(nums))

4
12

Chapter 11. Dynamic Programming III

Contents

Chapter 11. Dynamic Programming III#

Game Theory#

Analysis#

DP Function#

DP Function with Memo#

DP table#

House Robber#

Analysis#

DP Function#

DP Table#

Sell Stocks#