CodingTour
ARTS #45

Algorithm

本周选择的算法题是:Edit Distance

规则如下:

Given two words word1 and word2, find the minimum number of operations required to convert word1 to word2.

You have the following 3 operations permitted on a word:

  1. Insert a character
  2. Delete a character
  3. Replace a character

Example 1:

Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation: 
horse -> rorse (replace 'h' with 'r')
rorse -> rose (remove 'r')
rose -> ros (remove 'e')

Example 2:

Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation: 
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')

Solution

这是一道经典的题目。

Time Limit Exceeded

原始解法:

class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        if word1 == word2: return 0
        if not word1: return len(word2)
        if not word2: return len(word1)

        if word1[0] == word2[0]:
            return self.minDistance(word1[1:], word2[1:])
        else:
            return 1 + min(
                self.minDistance(word1[1:], word2[1:]), 
                self.minDistance(word1[1:], word2),
                self.minDistance(word1, word2[1:]),
            )

优化一

加上缓存:

class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        cache = {}

        def dp(word1: str, word2: str) -> int:
            if word1 == word2: return 0
            if not word1: return len(word2)
            if not word2: return len(word1)

            if (word1, word2) not in cache:
                if word1[0] == word2[0]:
                    cache[(word1, word2)] = dp(word1[1:], word2[1:])
                else:
                    cache[(word1, word2)] = 1 + min(
                        dp(word1[1:], word2[1:]), 
                        dp(word1[1:], word2),
                        dp(word1, word2[1:]),
                    )
            return cache[(word1, word2)]
        return dp(word1, word2)

Runtime:120 ms,快过 91.90%。

Memory:19.2 MB,低于 11.54%。

优化二

因为在 word 的比较过程中总是只关心第一个字符,所以可以用索引减少字符串创建的开销:

class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        cache = {}
        def dp(i: int, j: int) -> int:
            if (i, j) not in cache:
                if i >= len(word1): return len(word2) - j
                if j >= len(word2): return len(word1) - i

                if word1[i] == word2[j]:
                    cache[(i, j)] = dp(i + 1, j + 1)
                else:
                    cache[(i, j)] = 1 + min(
                        dp(i + 1, j + 1), 
                        dp(i + 1, j),
                        dp(i, j + 1),
                    )
            return cache[(i, j)]
        return dp(0, 0)

Runtime:108 ms,快过 93.84%。

Memory:16 MB,低于 84.62%。

使用循环代替递归的解法:

class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        dp = [[0] * (len(word2) + 1) for _ in range(len(word1) + 1)]
        for i in range(len(word1) + 1):
            dp[i][0] = i
        for j in range(len(word2) + 1):
            dp[0][j] = j

        for i in range(1, len(word1) + 1):
            for j in range(1, len(word2) + 1):
                if word1[i - 1] == word2[j - 1]:
                    dp[i][j] = dp[i - 1][j - 1]
                else:
                    dp[i][j] = 1 + min(dp[i - 1][j - 1], dp[i - 1][j], dp[i][j - 1])
        return dp[-1][-1]

基于循环的 dp 表优化

减少内存的使用:

class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        pre, cur = [j for j in range(len(word2) + 1)], [0] * (len(word2) + 1)

        for i in range(1, len(word1) + 1):
            cur[0] = i
            for j in range(1, len(word2) + 1):
                if word1[i - 1] == word2[j - 1]:
                    cur[j] = pre[j - 1]
                else:
                    cur[j] = 1 + min(pre[j - 1], pre[j], cur[j - 1])
            pre, cur = cur, pre
        return pre[-1]

进一步减少内存的使用:

class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        pre, cur = 0, [j for j in range(len(word2) + 1)]

        for i in range(1, len(word1) + 1):
            pre = cur[0]
            cur[0] = i
            for j in range(1, len(word2) + 1):
                temp = cur[j]
                if word1[i - 1] == word2[j - 1]:
                    cur[j] = pre
                else:
                    cur[j] = 1 + min(pre, cur[j], cur[j - 1])
                pre = temp
        return cur[-1]

Review

Xcode Build Time Optimization - Part 1 这是一个系列文章,本篇着重介绍了如何利用工具测量编译时间的开销:

  • 测量对象

    • Clean & Build - 常见于 CI 环境

    • Incremental build - 常见于开发环境

      tips: Xcode 使用时间戳检测文件的变动,从而确定要重新编译哪些文件;可以使用 touch 指令模拟变动。

  • 测量方法

    • Xcode 自带的报告 - Report Navigator 和 Xcode activity viewer
    • Xcode build timing summary - 通过 Product->Perform Action->Build With Timing Summary 路径开起,可以看到 Xcode 将任务进行了分组统计;在命令行下通过 -showBuildTimingSummary 选项开起
    • Swift 编译器的诊断选项
      • -driver-time-compilation
      • -Xfrontend -debug-time-compilation
      • -Xfrontend -debug-time-function-bodies
      • -Xfrontend -debug-time-expression-type-checking
    • Target 的编译时间统计
    • 聚合报告
      • 通过第三方的 XCLogParser 工具生成丰富的报告

编译时间的改进值得持续不断地完善,减少时间就是提高开发效率,磨刀不误砍柴工,这篇文章提到的工具都很不错,且都支持自动化,期待作者的后续文章。

Tip

Your App Might Be Too Fast, Here’s Why(And What To Do About It).

很有用的产品设计指南,订阅不亏(一周一封)

Share

一篇关于实现 Pub/Sub AppDelegate 的方案概述:提供 Pub/Sub 服务的 AppDelegate