Think about it in terms of finite state automata. Since $\sim_L$ has $4$ equivalence classes, a minimal FSA $M$ for $L$ has $4$ states, say $q_1,q_2,q_3$, and $q_4$, such that $A_k$ is the set of words in $\Sigma^*$ that cause $M$ to end in state $q_k$ for each $k=1,2,3,4$. $S$ is then the set of words that cause $M$ to end in state $q_1$ or state $q_2$.
- $S$ is clearly regular; why?
If you’ve answered that first question correctly and understand the relationship between Myhill-Nerode equivalence classes and states of DFAs, it should be clear why $\sim_S$ has at most $4$ classes. But the DFA that you have for $S$ may not be minimal, so you may be able to reduce the number of classes. Here’s a concrete example.
Example: Let $\Sigma=\{a\}$, and let $L=\{a^{4n}:n\ge 0\}$. The words $\lambda,a,aa$, and $aaa$ are representatives of the four $\sim_L$ equivalence classes. A minimal DFA for $L$ is a $4$-cycle of states $q_0,q_1,q_2,q_3$, where $q_0$ is the initial state and the only acceptor state. Let $S$ be the union of the classes containing $\lambda$ (the empty word: you may use $\epsilon$) and $aa$. Then $S=\{a^{2n}:n\ge 0\}$ has a $2$-state minimal DFA, a simple $2$-cycle with states $q_0',q_1'$ with $q_0'$ as the initial state and only acceptor state. The two $\sim_S$ equivalence classes contain the words of even length (which are in $S$) and the words of odd length (which are not in $S$), respectively.
One question still remains:
- Why must $\sim_S$ have at least two classes? Equivalently, why must any DFA for $S$ have at least two states? (HINT: What languages over $\Sigma$ have one-state DFAs?)