13

Is the following true? Where [a] is a 1x1 matrix containing the object a.

$$ \begin{bmatrix} 2 \end{bmatrix} =\\ \begin{bmatrix} \begin{bmatrix} 2 \end{bmatrix} \end{bmatrix} =\\ \begin{bmatrix} \begin{bmatrix} \begin{bmatrix} 2 \end{bmatrix} \end{bmatrix} \end{bmatrix}\\ \vdots $$

I am curious because I am writing a function for adding matrices, but there is no rule forbidding elements of matrices to be matrices; so I want to know if

$$ \begin{bmatrix} \begin{bmatrix} \begin{bmatrix} 2 \end{bmatrix} \end{bmatrix} \end{bmatrix} + \begin{bmatrix} \begin{bmatrix} 2 \end{bmatrix} \end{bmatrix} = 4 $$

after canonicalization or not. I also want to know if this holds for direct sum:

$$ \begin{bmatrix} \begin{bmatrix} \begin{bmatrix} 2\\ 4 \end{bmatrix} \end{bmatrix} \end{bmatrix} + \begin{bmatrix} \begin{bmatrix} 2\\ 4 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} 4\\ 8 \end{bmatrix} $$

I am aware that this messes up the convenient "indexing" property, as A[0] will no longer make sense; but matrices don't require to support this property anyway, we just have it for free for implementing them as arrays, so it's a nonissue.

I am curious if there is any particular harm to the algebra if 1 = [1] = [[1]] = [[[1]]] that makes it unusable, or if it just causes minor nuisance?

Breakages so far/severity:

  1. Indexing of a matrix(minor)

[3] at 0 is 3, but 3 at 0 is undefined(since indexing is not defined for scalars), and it is not clear whether [[3; 4]] at 0 is [3; 4] which is what it would be in current algebra, whereas it would be 3 in proposed algebra.

This is not very severe since matrices are not really arrays, but bilinear maps, so indexing them is fairly naive to begin with. This property won't work unless the given matrix is canonicalized before being indexed.

  1. Multiplication of matrix by scalar(minor/real damage).

There are some concerns over whether or not multiplication by scalar is invalid, but my previous argument is invalid. so I removed it.

Dmytro
  • 1,155
  • 5
    No, they are not the same, and you can't perform the mixed additions in your example. – TonyK May 01 '17 at 00:27
  • Why are they not the same? – Dmytro May 01 '17 at 00:28
  • 6
    A matrix of integers is not the same as a matrix of matrices (is not the same as a matrix of matrices of matrices etc.). Why would you think it was? – TonyK May 01 '17 at 00:30
  • 2 + [2] is not 4? – Dmytro May 01 '17 at 00:32
  • I mean typewise, you're right, addition works in a monoid where a : A + b : A = c : A. but in a weaker type system, what are the consequences of automatic coersion of higher order objects to a lower order in this kind of cases, other than losing the indexing operation. If [2] was automatically coersed to 2,and [[2; 4]] to [2; 4], what would break in our algebra other than the indexing operator and notion of order-of matrices? – Dmytro May 01 '17 at 00:37
  • 3
    It is notable that block matrices are often treated as if they were "matrices of matrices". – Ben Grossmann May 01 '17 at 03:53
  • 3
    "1 in current algebra is equal to [1 0; 0 1], [1 0 0; 0 1 0; 0 0 1]": say what now? I don't get this at all. – Ben Grossmann May 01 '17 at 03:55
  • @Omnomnomnom. in multiplication by scalar, 5[1 2; 3 4] treats 5 as [5 0; 0 5]. but you're right I can't say that the two are equivalent, as we can't replace scalars with identifies in higher spaces with scalar mapped over them. I'll edit it out later. – Dmytro May 01 '17 at 10:36

5 Answers5

30

The question has evolved since my initial answer, so now I offer three:


Answer #1, to the question "Is there a difference between $3$ and $[3]$?":

The answer is yes.

$3$ is the number three.

$[3]$ is a nice wooden box, lined with velvet, holding the number three.

$[[3]]$ is a nice wooden box, lined with velvet, containing a smaller wooden box.

We have $2+3=5$, because we know how to add two numbers. We cannot calculate $[2]+3$, as we do not know how to add a number to a nice wooden box. It turns out that $[2]+[3]=[5]$, because we have special rules for adding two wooden boxes.


Answer #2, to the question "Can we treat $3$ and $[3]$ as the same?"

We can, except that this will not be standard mathematics anymore. Other answers (and comments) have pointed out things that break. Not everything breaks, true.

We could, similarly, treat $3$ and $7$ as the same. This doesn't break all of mathematics, e.g. $2+2=4$ remains true; however, now $7-3=0$, and all sorts of craziness flows out. Following this to its natural conclusion ends up breaking a lot. To avoid this, we could just throw out the parts that break, leaving a new type of mathematics (considerably smaller now) in which $3=7$. This brings us to the third question and answer.


Answer #3, to the question "Should we treat $3$ and $[3]$ as the same?"

The answer is no -- we should not do this -- unless we have a clear benefit from doing so. Even if there is such a benefit, it would have to be weighed against the cost of what we give up with this new mathematics.

The only benefits I personally see are (a) a certain clarity of understanding, and (b) a certain simplicity in writing software implementations. However, (a) is a very small benefit for the price of setting 3=7, and in my view the clarity is actually categorical confusion. As for (b), it is similarly only an illusion; all that is gained is the ability to generate crashed programs very easily.

vadim123
  • 82,796
  • That's not a very algebraic explanation. – Dmytro May 01 '17 at 00:33
  • 24
    @Dmitry: the algebra is clear on this. Vadim is just trying to help you believe it. – TonyK May 01 '17 at 00:34
  • 2
    That's about it though. $3\neq {3}$ because one is a number and the other a set. Philosophically, you can write a number with any symbol of course, but the standard in math is that $3\neq{3}$. – pancini May 01 '17 at 00:35
  • @ElliotG sets have nothing to do with this, each column may be represented as either an object in the set of nary products of any object(aka struct), which is ridgid, or as a nested the set of nested binary product of any objects, with the deepest binary product's second or first being undefined(aka list).

    So [[[4, 5]]] can be represented as either Scalar Scalar (Vector2 4 5) or Vector2 (Vector2(Vector2 4 5) undefined) undefined.

    In the first definition, if a=[4;5] and b=[4;5;6], a[0] would be a different operation because of different types. In second, we have uniform indexing.

    – Dmytro May 01 '17 at 00:53
  • that said; so far most courses I took use first definition; being each column vector of n items is its own type, and addition of differently sized vectors is undefined(whereas it would be defined in second definition). – Dmytro May 01 '17 at 00:55
  • 1
    @ElliotG: But for set theorists, $3={0,1,2}$. – celtschk May 01 '17 at 06:13
  • Vadim: Do you feel the same about polynomials? If $a=3 \in \mathbb{Z}$ and $b=x^2+1 \in \mathbb{Q}[x]$, we routinely write $a+b$ without worrying about it, identifying the number 3 and the polynomial 3, but that's more or less the same notation abuse in my view. – Federico Poloni May 01 '17 at 21:31
  • @FedericoPoloni, this is a different situation. $\mathbb{Z}$ is a subset of $\mathbb{Q}$, and hence of $\mathbb{Q}[x]$. $a+b$ here is not abusing notation at all. – vadim123 May 02 '17 at 03:19
  • 1
    @vadim123 Well, $\mathbb{Q}$ is a subset of $\mathbb{Q}[x]$ only because we are used to thinking about it identifying some things. If you were to define it formally (in a computer language or to a set theorist, for instance), $\mathbb{Q}[x]$ would be something like the set of tuples $d,c$ with $d\in\mathbb{N}$ and $c\in\mathbb{Q}^{d+1}$. – Federico Poloni May 02 '17 at 06:18
15

The ring of $1 \times 1$ real matrices is indeed isomorphic to the real numbers, but is not "the same thing". That said, it's often convenient and rarely confusing to identify them. For example, you can think of the dot product of two vectors (which is a real number) as the matrix product of a $1 \times n$ matrix with an $n \times 1$ matrix, which is a $1 \times 1$ matrix.

Ethan Bolker
  • 95,224
  • 7
  • 108
  • 199
  • Still; what would break in out algebra if we allow automatic coersion from the higher order to the lower order in such situations? Would any axiom break, would we lose any operations? Surely we can define such an algebraic structure("weaker" version of the [] constructor) easily, I'm curious what the consequences of such a structure would be. – Dmytro May 01 '17 at 00:42
  • 6
    Well you can multiply an $m \times n$ matrix by a scalar, but you can't (usually) multiply it by a $1 \times 1$ matrix. So in some situations you could coerce, in some not. It's always clear in written mathematics, but I suspect would be pretty tricky to implement properly in a computer program. – Ethan Bolker May 01 '17 at 00:50
  • Oh so you are saying the issue is that 2 is not equal to [2] because 2 would then be equal to [2 0; 0 2] and [2], breaking multiplication by scalar(which isn't actually a big issue, but it does break things that depend on scalar multiplication of matrices whereas a more honest representation is to spell out the scalar in the proper matrix). In my system [2] * [3 4; 5 6] would be undefined because both are in canonical form and 1x1 matrix cannot be multiplied by 2x2 matrix. In particular, multiplication by scalar would be broken because 1 would be [1 0; 0 1] and this rule never terminates – Dmytro May 01 '17 at 01:00
  • In addition to the "written mathematics exceptions" noted in this answer, it is common to write (and multiply) block matrices as if they were "matrices of matrices". – Ben Grossmann May 01 '17 at 03:51
  • 2
    @EthanBolker: Note however that the Kronecker product of an $1\times 1$ matrix with a general matrix acts exactly the same as multiplication with a scalar. So one might argue the problem is not the identification of $1\times 1$ matrices with scalars, but using the wrong multiplication. – celtschk May 01 '17 at 06:23
  • @EthanBolker No, it's not tricky to implement properly in a computer program. You just need three cases: the product of a $m\times n$ array and a $p\times q$ array is defined if $n=p$, $m=n=1$ or $p=q=1$. – Federico Poloni May 01 '17 at 21:26
  • @FedericoPoloni That takes care of my example. But there may well be other places where you have to decide whether or not to coerce. For an example, see cellsck's comment. Best to keep the typing strict and say what you mean each time. – Ethan Bolker May 01 '17 at 21:33
5

That's ultimately a matter of convention. Many people who work professionally in numerical linear algebra think in those terms, and the world still survives. It's just a slight abuse of notation that is typically harmless. As you correctly identify, you have to be careful with scalar * matrix, and it's better to "flatten" block matrices: $$ \begin{bmatrix} \begin{bmatrix} 1\\2 \end{bmatrix}\\ \begin{bmatrix} 3\\4 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} 1\\2\\3\\4 \end{bmatrix}. $$

For instance, note that in Matlab (probably the most popular language for matrix computations) scalars are automatically promoted to matrices: the lines starting with >> are the ones I typed in, the rest is output

>> [2] == 2
ans =
  logical
   1
>> [[[2;4]]] + [[2;4]]
ans =
     4
     8
>> A=3
A =
     3
>> A(1)
ans =
     3
>> B = [[3;4]]
B =
     3
     4
>> B(1)
ans =
     3

>> C = [1 2; 3 4]
C =
     1     2
     3     4
>> 2*C
ans =
     2     4
     6     8
>> C+2
ans =
     3     4
     5     6
>> B + 2
ans =
     5
     6
>> A = ones(0,2)
A =
  0×2 empty double matrix
>> B = ones(2, 1)
B =
     1
     1
>> A*B
ans =
  0×1 empty double column vector
4

Lets look at this in a different way. Suppose $$A:= \begin{bmatrix} C&D\\ E&F \end{bmatrix}, \:B:= \begin{bmatrix} E&C\\ D&F \end{bmatrix}$$ Would it be possible for us to say have $A(B)=X$ or $A+B=Y$ for any arbitrary matricies $C$, $D$, $E$, and $F$? How about both $A(B)=X$ and $A+B=Y$ is this possible? If there are issues, when would they come up?

  • are you referring to the problem that multiplying [C D; E F] by [[E E] [C C];[D D] [F F]] would make no sense? This would distribute properly, but we would end up having expressions like C * [E E] which is undefined in the new algebra, whereas it would be [(CE) (C E)] in the old algebra, which dictates that axm * mxb = axb, except for 1x1, which can be coersed into an identity in the other space. – Dmytro May 01 '17 at 01:06
  • In my proposal, all the objects in matrices can be any object(any algebraic structure with + defined), including strings, functions; but they can only be multiplied if the additions performed after the distribution is done is legal. [1] * ["hi"] would be undefined, [1] * [2] would be [2], and [1 0; 0 1] * ["hello"; 2] would be [undefined undefined; 0 2]. So in a way each addition after distribution is independent of whether the other additions are defined or not.

    A consequence of this would be that + can be overloaded to be any associative operation, not just real/complex addition.

    – Dmytro May 01 '17 at 01:28
  • Yes I'm referring to the issue of the sizes of $C$, $D$, $E$,and $F$. there is no exceptions even with $1\times 1$ matrices. Also with your "New Algebra" construct vectors wouldn't be closed under either addition or scalar multiplication. While I'm not saying that you can't have nested matrices, By all means go for it. But it still has to follow the same axioms that formed matrices to begin with. Otherwise it just makes nonsense, just like how $\begin {bmatrix}1\ 2\end{bmatrix}+2$ is nonsensical. – Sentinel135 May 01 '17 at 01:48
  • they would be closed under addition, it's not closed under multiplication but current algebra has the same issue for nonsquare matrices; square matrices(including vector with 1 element) would be closed under multiplication. multiplication by scalar would be broken for vectors, but it could be fixed by multiplying the identity of (length of v)x(length of v) matrix and mapping constant over each element(0 stay 0, 1 become new value), then 2 <1, 2> would be undefined, but (id([1 2]x[1;2]).map(*2)) would be [2 0; 0 2] which can then be used to multiply by [1; 2] to produce [2 4]. – Dmytro May 01 '17 at 01:54
  • but yeah not being able to scale vectors inuitively is a nuisance. – Dmytro May 01 '17 at 01:54
  • yeah [1; 2] + 2 is not nonsense, it's an object : RxR being added to an object in R, since neither can be coersed into a common form, + is undefined since + is a Set morphism AxA -> A where A is the same type; the most general common type of [1; 2] and 2 is "Any" and + is not generally defined for any type(we generally don't define Apple + Orange, we can define it to be an IncompleteAddition(new algebraic structure to being used in our algebra) [Apple; Orange] but it won't be universal). – Dmytro May 01 '17 at 02:02
  • 2
    And now we get to the crux of the issue. Your algebra Looses the identity property when you include undefined objects. This can be shown as we include $\begin {bmatrix} 1\ 2\end {bmatrix}$ and $2$ into the set. Note that the identity $e$ must also be in the set. Note that $\begin {bmatrix} 1\ 2\end {bmatrix}+e=\begin {bmatrix} 1\ 2\end {bmatrix}$ implies $e$ is a vector. However $2+e\neq 2$, and thus $e$ doesn't exist for all elements. So your "new algebra" can't even be a group. – Sentinel135 May 01 '17 at 04:30
  • What you have created could be considered a semicatagory, or a semigroup. Considering you're including any object (including undefinable ones) I'd say the former rather than the later. – Sentinel135 May 01 '17 at 04:31
  • can you rephrase it? How come 2+e not 2, identity for this is still 0. – Dmytro May 01 '17 at 10:45
  • Certainly! What I am saying is that the identity $e_1+2=2$ is not the same as the identity $e_2+\begin{bmatrix} 1 \ 2 \end{bmatrix}=\begin{bmatrix} 1 \ 2 \end{bmatrix}$. While mechanically they are similar. it has the issue of adding apples to oranges. So if we were to try $e_2+2$, we'd get an undefined answer, and likewise with $e_1+ \begin{bmatrix} 1 \ 2 \end{bmatrix}$. – Sentinel135 May 01 '17 at 11:41
  • what is e_1 and e_2 is it identity of the object "1" and identity of the object "2"? identity being the function that given any object a in a set of A that has id(a : A) defined, gives an object b such that a*b = a? Because identity of all integers under addition is 0, 1 under multiplication. I'm not sure where anything breaks. As long as we are using reals, integers, complex, so on, we stay as a ring; identity, associativity, inverse, all remain valid. – Dmytro May 01 '17 at 11:46
  • Okay $e_1= 0$ and $e_2= \begin {bmatrix} 0\ 0\end{bmatrix}$. As such $e_2+2= \begin {bmatrix} 0\ 0\end{bmatrix} + 2=$ undefined for the same reason $\begin {bmatrix} 1\ 2\end{bmatrix}+2=$ undefined. Similarly $e_1+\begin {bmatrix} 1\ 2\end{bmatrix}= 0+\begin {bmatrix} 1\ 2\end{bmatrix}=$ undefined. – Sentinel135 May 01 '17 at 11:56
  • Yeah, but id([1; 2]) is [0, 0]. both 2 + [1; 2] and 0 + [1; 2] are undefined in the normal algebra. 0 is not [0; 0] is not [0; 0; 0] if that's what you're asking. Only [0] = [[0]] = [[[0]]]... – Dmytro May 01 '17 at 12:03
1

Is the following true? Where [a] is a 1x1 matrix containing the object a.

As has been implied by @Federico_Poloni and others, the answer somewhat depends upon if you are a mathematical platonist or a mathematical formalist. As I see it, mathematics is a toolchest from which you can pull out anything you need for the job. So, a typical mathematical activity would be to define a basic axiom, such as:

Axiom: 1 = [1]

and then look at the consequences of the assertion. In many ways, programming languages are subsets of mathematics, and from within those one can find that most languages state that the axiom is false, because the types are different, even if the values are the same. Some of the answers above make the same assertion.

So, truth value is merely a convention among others, used to explore aspects of formal problem solving languages - otherwise known as mathematics.

I think that what you are asking though, is whether or not it is safe to make the assumption within your particular environment, and then the answer would have to be 'it depends upon the context', and likewise, 'it depends upon how your rule is going to be used, defined, and understood'. If you are working within the domain of typed languages, then it's most probably bad idea to make the rule, whereas if you are in the domain of untyped languages, then it may still be a bad idea, depending upon how you expect users (biological and artifical) to interact with your function.

Konchog
  • 193
  • 1
    to be clear, I'm asking whether allowing the given to be equal breaks any particular property necessary for matrices. it is fundamentally an algebra question rather than interface question. You're right that 1 and [1] are distinct objects, but we can allow canonical(1) = canonical([1]) and assert that in our algebra canonicalization is implicit. This in typed languages requires union type and eval since we can't tell what the final type is, but it is representable. typed and untyped algebras are isomorphic. – Dmytro May 01 '17 at 11:32
  • 1
    @Dmitry, thanks, yes - but I believe my answer still stands: it depends upon the context. Just ensure that you define your function clearly. – Konchog May 01 '17 at 11:36