1

If I have a word that consists of the letters I, V, X, L, C, D, M, how can I tell whether it is a valid Roman numeral? For example, how do I tell that IXXL is not valid?

spiderface
  • You can draw a syntax diagram. It should break down neatly into four successive phases, for thousands, hundreds, tens and units, the last three being similar in structure. – Joffan Jan 16 '17 at 22:31

1 Answer

2

To simplify the description, you can treat the pairs "IV", "IX", "XL", "XC", "CD" and "CM" as single symbols.

That is, Roman numerals are sequences made of these symbols: I, IV, V, IX, X, XL, L, XC, C, CD, D, CM, M.

The sequences must obey these rules:

  • The symbols I, X, C, M can be repeated up to three consecutive times. Other symbols must not be repeated.
  • In each sequence, the symbols appear in non-increasing order of value.
  • If a two-letter symbol occurs, neither of its two letters may occur after it, with the following exceptions: after XL and XC there can be IX; and after CD and CM there can be XC.

Your example IXXL is therefore not valid, either because IX < XL (so the symbols are not in decreasing order) or because no letter X may occur after IX.
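
If you want to check this mechanically, here is a minimal sketch in Python. It does not transcribe the three rules above literally; instead it uses a regular expression built from the four successive phases (thousands, hundreds, tens, units) suggested in the comment under the question, assuming the modern convention with at most three M's. The names `ROMAN_RE` and `is_valid_roman` are my own, chosen for illustration.

```python
import re

# Four successive phases: thousands, hundreds, tens, units.
# Each phase may be empty; the last three share the same structure.
ROMAN_RE = re.compile(
    r"M{0,3}"              # thousands: up to three M
    r"(CM|CD|D?C{0,3})"    # hundreds
    r"(XC|XL|L?X{0,3})"    # tens
    r"(IX|IV|V?I{0,3})"    # units
)

def is_valid_roman(word: str) -> bool:
    """Return True if word is a valid numeral under the modern convention."""
    # The pattern also matches the empty string, so reject it explicitly.
    return bool(word) and ROMAN_RE.fullmatch(word) is not None

for w in ["XCIX", "CMXC", "IXXL", "XM", "IC"]:
    print(w, is_valid_roman(w))
```

Under this sketch XCIX and CMXC are accepted, while IXXL, XM and IC are rejected. Historical forms such as IIXX are rejected as well, since only the modern convention is encoded.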

ajotatxe
  • 1
    "If a symbol with two letters occurs, none of these two letters occur after it." is not right. For example 99 - XCIX. After 'XC' there occurs 'X'. – Jaroslaw Matlak Jan 16 '17 at 22:47
  • 1
    These are “modern” rules: in Roman inscriptions one can find “IIXX” for 18 (Latin duodeviginti), possibly because “IIXX” takes a bit less space than “XVIII”. Repetition of more than three of the main symbols I C M was also allowed. Sticking to a consistent set of rules was not a concern for the ancient Romans, who were not very proficient in arithmetic. – egreg Jan 16 '17 at 22:53
  • @JaroslawMatlak The pair “IX” should be considered as a single symbol. Probably the third rule should be better phrased. – egreg Jan 16 '17 at 22:54
  • What about pairs like XM or IC? – spiderface Jan 17 '17 at 05:45
  • @JaroslawMatlak Thank you for pointing that out. I have changed the third rule. – ajotatxe Jan 17 '17 at 13:32
  • @spiderface These pairs are not allowed, at least in the modern standard. To write 990, the correct expression is CMXC. – ajotatxe Jan 17 '17 at 13:33