Loading... ## 有穷自动机 ### 定义 - 一类处理系统建立的数学模型 - 这类系统具有一系列离散的输入输出信息和有穷数目的内部状态(状态:概括了对过去输入信息处理的状况) - 系统只需要根据当前所处的状态和当前面临的输入信息就可以决定系统的后继行为。每当系统处理了当前的输入后,系统的内部状态也将发生改变 ### FiniteAutomata的表示 #### 转换图 - 结点:FA的状态 - 初始状态(开始状态):只有一个,由start箭头指向 - 终止状态(接收状态):可以有多个,用双圈表示 - 带标记的有向边:如果对于输入a,存在一个从状态p到状态q的转换,就在p、q之间画一条有向边,并标记上a ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; 3; node [shape = circle]; start -> 0 [label = "a"]; 0 -> 0 [label = "a"]; 0:s -> 0 [label = "b"]; 0 -> 1 [label = "a"]; 1 -> 2 [label = "b"]; 2 -> 3 [label = "b"]; } ``` ### Finite Automata定义(接收)的语言 - 给定输入串x,如果存在一个对应于串x的从初始状态到某个终止状态的转换序列,则称串x被该FA接收 - 由一个有穷自动机M接收的所有串构成的集合称为是该FA定义(或接收)的语言,记为L(M(machine)) 对于上面的转换图,L(M) = 所有以abb结尾的字母表{a, b}上的串的集合 ### 最长子串匹配原则(Longest String Matching Principle) - 当输入串的多个前缀与一个或多个模式匹配时,总是选择最长的前缀进行匹配 - 在到达某个终态之后,只要输入带上还有符号, DFA就继续前进,以便寻找尽可能长的匹配 ## 有穷自动机的分类 - 确定的FA (Deterministic finite automata, DFA) - 非确定的FA (Nondeterministic finite automata, NFA) ### 确定的有穷自动机DFA 从一个状态出发,沿着标记为a的边所能到达的状态只有一个 ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; 3; node [shape = circle]; start -> 0 [label = ""]; 0:e -> 1 [label = "a"]; 1 -> 2 [label = "b"]; 2 -> 3 [label = "b"]; 0 -> 0 [label = "b"]; 2 -> 1 [label = "a"]; 1 -> 1 [label = "a"]; 3 -> 0 [label = "b"]; 3 -> 1 [label = "a"]; } ``` ### 非确定的有穷自动机NFA 从一个状态出发,沿着标记为a的边所能到达的状态可能有多个 ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; 3; node [shape = circle]; start -> 0 [label = "a"]; 0 -> 0 [label = "a"]; 0:s -> 0 [label = "b"]; 0 -> 1 [label = "a"]; 1 -> 2 [label = "b"]; 2 -> 3 [label = "b"]; } ``` ### DFA和NFA的等价性 - 对任何非确定的有穷自动机N,存在定义同一语言的确定的有穷自动机D - 对任何确定的有穷自动机D,存在定义同一语言的非确定的有穷自动机N - DFA和NFA可以识别相同的语言 上述状态图代表的正则表达式都为$(a|b)^*abb$ ### 带有“ɛ-边”的NFA 带有“ɛ-边”代表正则中的*(匹配前面的子表达式零次或多次) ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; C; node [shape = circle]; start -> A [label = ""]; A -> A [label = "0"]; A -> B [label = "ɛ"]; B -> B [label = "1"]; B -> C [label = "ɛ"]; C -> C [label = "2"]; } ``` ### 带有和不带有“ε-边”的NFA 的等价性 ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; C; node [shape = circle]; start -> A [label = ""]; A -> A [label = "0"]; A -> B [label = "ɛ"]; B -> B [label = "1"]; B -> C [label = "ɛ"]; C -> C [label = "2"]; } ``` ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; A B C; node [shape = circle]; start -> A [label = ""]; A -> A [label = "0"]; A -> B [label = "0,1"]; A -> C [label = "0,1,2"] B -> B [label = "1"]; B -> C [label = "1,2"]; C -> C [label = "2"]; } ``` ## 从正则表达式到有穷自动机 ### ε对应的NFA ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; B; node [shape = circle]; start -> A [label = ""]; A -> B [label = "ɛ"]; } ``` ### 字母表Σ中符号a对应的NFA ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; B; node [shape = circle]; start -> A [label = ""]; A -> B [label = "a"]; } ``` ### r = ab对应的NFA ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; C; node [shape = circle]; start -> A [label = ""]; A -> B [label = "a"]; B -> C [label = "b"] } ``` ### r = a|b对应的NFA ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; B; node [shape = circle]; start -> A [label = ""]; A -> B [label = "a"]; A -> B [label = "b"]; } ``` ### $r=a^*$对应的NFA ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; start -> A [label = ""]; A -> A [label = "a"]; } ``` ## NFA到DFA的转化 ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; 9; node [shape = circle]; start -> 0 [label = ""]; 0 -> 1 [label = "ɛ"]; 0 -> 7 [label = "ɛ"]; 1 -> 2 [label = "ɛ"]; 1 -> 4 [label = "ɛ"]; 2 -> 3 [label = "a"]; 3 -> 6 [label = "ɛ"]; 4 -> 5 [label = "b"]; 5 -> 6 [label = "ɛ"]; 6 -> 1 [label = "ɛ"]; 6 -> 7 [label = "ɛ"]; 7 -> 8 [label = "a"]; 8 -> 9 [label = "b"]; } ``` | 操作 | 描述 | | ------------ | ----------------------------------------------------------- | | ɛ-closure | 能够从NFA的状态s开始只经由条件ε可以到达的所有状态的集合 | | move(T, a) | 能够从T中的某个状态s开始经由条件a可以到达的所有状态的集合 | 首先将初始态的转换闭包ε-closure(0)设为状态A,即A={0,1,2,4,7} 接着写出状态A经过上面转换图中所有转换条件得到的状态 即ε-closure(move(A,a))和ε-closure(move(A,b)) 每次得到新的集合命名为新的DFA状态,继续对新的集合进行上述操作 - ε-closure(0) = A - ε-closure(move(A,a)) = ε-closure({3,8}) = {1,2,3,4,6,7,8} = B - ε-closure(move(A,b)) = ε-closure(5) = {1,2,4,5,6,7} = C - ε-closure(move(B,a)) = ε-closure({3,8}) = {1,2,3,4,6,7,8} = B - ε-closure(move(B,b)) = ε-closure({5,9}) = {1,2,4,5,6,7,9} = D - ε-closure(move(C,a)) = ε-closure() - ε-closure(move(C,b)) - ε-closure(move(D,a)) = ε-closure({3,8}) = {1,2,3,4,6,7,8} = B - ε-closure(move(D,b)) = ε-closure(5) = {1,2,4,5,6,7} = C 得到DFA状态转换表如下 | | a | b | | --- | --- | --- | | A | B | C | | B | B | D | | C | B | C | | D | B | C | ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; D; node [shape = circle]; start -> A [label = ""]; A -> B [label = "a"]; A -> C [label = "b"]; B -> B [label = "a"]; B -> D [label = "b"]; C -> B [label = "a"]; C -> C [label = "b"]; D -> B [label = "a"]; D -> C [label = "b"]; } ``` ## DFA的最小化 ### 多余状态 - 从这个状态出发没有通路到达终态 - 从开始状态出发,任何输入串也不能到达的那个状态 ### 等价状态 - **一致性条件**:状态s和t必须同时为终态或非终态 - **蔓延性条件**:对于所有输入符号,状态s和状态t必须转化到等价的状态里 ### 方法 以上面DFA为例 根据等价状态将所有状态划分 1. 开始按一致性条件划分为{A,B,C},{D} 2. move({A,B,C},a) = {B} move({A,B,C},b) = {C,D} A,B,C接收b时,AC接收b到达C,B接收b到达D 根据蔓延性条件划分为{A,C},{B},{D} 3. move({A,C},a) = {B} move({A,C},b) = {C} 所以A,C可以合并,结束 ```graphviz digraph finite_state_machine { rankdir = LR; size = "8,5" node [shape = plaintext]; start; node [shape = doublecircle]; D; node [shape = circle]; start -> A [label = ""]; A -> B [label = "a"]; B -> B [label = "a"]; B -> D [label = "b"]; A -> A [label = "b"]; D -> B [label = "a"]; D -> A [label = "b"]; } ``` 最后修改:2023 年 06 月 05 日 © 允许规范转载 打赏 赞赏作者 支付宝微信 赞 1 如果觉得我的文章对你有用,请随意赞赏