NFA (nondeterministic finite automaton) made out of the Bible?

Question

Let B be the language over alphabet {a, ... z} consisting of those words occuring in the Bible. Thus, B = {in,the,beginning,god,created,...}.

Describe an NFA whose language is B. Describe what happens when you apply subset construction to this NFA to get a DFA, ignoring unreachable states. Roughly, how do the number of states in each compare? How does each number compare to the # of distinct words in the Bible?

Of course, this is homework so I'm not asking for answers so much as a bit of a shove in the right direction. I'm pretty thoroughly stumped here.

I'm picturing the NFA as a series of letters that go down an ordered path in the same order that the letters show up in the Bible. Which not only seems really impractical but a little too stupid of an answer for it to be correct. What would an NFA of this even look like? Would it really be huge?

Thanks for any help.

To make an automaton that recognize one word is easy. If you merge the initial state of two of this automaton you get one that recognize 2 words. sot to get your automaton you can just merge all the initial state of all the automaton for all word. It will give you a big star shaped automaton. Not that big if you have $n$ word of size $l$ you will have $n.l+1$ states — wece, Apr 22 '15 at 19:43

score 2 · Answer 1 · answered Apr 23 '15 at 09:14

2

According to this page, there are 788,258 words in the Bible but only 14,565 distinct words. Now, it is shown in [1] that a full Scrabble lexicon with 94,240 entries can be represented be a DFA with 19,853 states. So you can expect the DFA of the Bible to have roughly $19,853 \times \frac{14,565}{94,240} \approx 3068$ states.

[1] A. W. Appel et G. J. Jacobson, The world fastest scrabble program, Commun. ACM 31 (1988), 572–585.

answered Apr 23 '15 at 09:14

J.-E. Pin

40,163

Wow way smaller than what I was expecting ... – wece Apr 23 '15 at 12:18

NFA (nondeterministic finite automaton) made out of the Bible?

1 Answers1