3

For example, when it comes to strings and languages, we say that an alphabet is any finite set and a string over alphabet is a finite sequence of symbols from the alphabet.

What does, "string over alphabet," mean?

Does it mean that just that, "a finite sequence of symbols from the alphabet"?

Maybe I am confused by subsets, which we know to be unordered. Is this concept similar, but a sequence of values from a set, therefore ordered?

Asaf Karagila
  • 393,674
Oliver
  • 183
  • 1
    Yes, though it does not need to be finite. If $A$ is the underlying alphabet then a string over $A$ is just a sequence $s_1s_2\cdots$ where $s_n\in A$ for all $n$. It is usually written in concatenated form, though you could just consider the sequence ${s_n}$ – lulu Aug 25 '21 at 23:00
  • @lulu Thank you. Then why is a language over alphabet a set of strings over the alphabet, and not a sequence of strings over the alphabet? Also, would a language also be finite? – Oliver Aug 25 '21 at 23:02
  • There are contexts in which we want to take strings over an alphabet in a particular sequence. For example, to formalize the notion of proofs, we would require a finite sequence of strings so that it can said each step of the proof is either justified as an axiom or by logical inference from previous steps. The language is just the set of (well-formed) formulas (strings), so no sequence on those formulas is needed. – hardmath Aug 25 '21 at 23:06
  • Your question is extremely unclear. The text you are quoting is defining the term "string over [an] alphabet" using the notions of sequence and set that you are presumed to understand. So a sequence like "notion" of letters of the alphabet is a string over the set of letters ${i, n, o, t}$. – Rob Arthan Aug 25 '21 at 23:15
  • A language is defined by which strings it contains. Defining it as a sequence would just add unnecessary noise to the definition: each element would have a position in the sequence which is never used, and we would need to specify what it means if the same string appears multiple times. I think you are getting confused with the order of the letters within a string being important. But not the order of the strings within a language. – Erick Wong Aug 25 '21 at 23:58

1 Answers1

4

A string is a sequence of symbols. Depending on the context, this might mean exclusively finite sequences or infinite sequences might be permitted.

An $X$ over a set is an $X$ whose "elements" are taken from that set ... what exactly the elements are depends on the context.

Let $\mathbb{N}$ be the numbers $1, 2, 3 \cdots$.

If $\Sigma$ is an alphabet and $\Sigma^*$ is the set of strings over $\Sigma$, then each element of $\Sigma^*$ is a total function from $\{x : x \in \mathbb{N} \;\text{and}\; 1 \le x \le n \}$ to $\Sigma$, where $n \in \mathbb{N} \cup \{\infty\}$ and the string in question has length $n$.

Your next question raises a subtle point.

A language $L$ is typically defined as a subset of $\Sigma^*$. The reason why $L$ is a subset in general is that it lets you rule out ill-formed strings. For example $((((x+$ is ill-formed in the language of arithmetic, but $(x+1)$ is not.

$L$ sometimes has a fairly natural ordering called the lexicographic ordering. If $\Sigma$ itself is ordered, then we can order any two strings relative to each other by comparing their elements one by one and arbitrarily decreeing that the absence of an element is less than any given element, so $ab$ would come before $aba$ in lexicographic order. However, the lexicographic ordering does not give you a sequence in general, for example these strings are in lexicographic order, but are not order-isomorphic to $\mathbb{N}$.

$$ a, aa, aaa, a^4, a^5, \cdots, b $$

Because of issues like this, it's convenient not to insist up front that all the strings of a given language be possible to arrange into a sequence.

Greg Nisbet
  • 11,657