
The author of this question was close to determining the derivative of a function of a dual variable, where we consider matrices isomorphic (algebraically and topologically) to the dual numbers: $$(a+\epsilon b) \sim \begin{bmatrix} a & 0 \\ b & a \\ \end{bmatrix}.$$

So, using this fact, we can define the derivative (in the Fréchet sense) for functions $F$ whose argument and value are both matrices of this form: $$F\big(\begin{bmatrix} x+s & 0 \\ y+t & x+s \\ \end{bmatrix}\big)-F\big(\begin{bmatrix} x & 0 \\ y & x \\ \end{bmatrix}\big)=\begin{bmatrix} u' & 0 \\ v' & u' \\ \end{bmatrix}\begin{bmatrix} s & 0 \\ t & s \\ \end{bmatrix}+o\bigg(\bigg|\bigg|\begin{bmatrix} s & 0 \\ t & s \\ \end{bmatrix}\bigg|\bigg|\bigg),$$ where $\bigg|\bigg|\begin{bmatrix} s & 0 \\ t & s \\ \end{bmatrix}\bigg|\bigg|=\max\{|s|,|t|\}$ and all entries of all matrices are real.

Therefore, the existence of such a matrix $\begin{bmatrix} u' & 0 \\ v' & u' \\ \end{bmatrix}$ (which we will call the derivative at $\begin{bmatrix} x & 0 \\ y & x \\ \end{bmatrix}$) means that $F$ is differentiable at $\begin{bmatrix} x & 0 \\ y & x \\ \end{bmatrix}$.
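To make the definition concrete, here is a minimal numerical sketch in Python with NumPy. It assumes the example function $F(X)=X^2$ (my choice, not taken from the linked question) and the candidate derivative $2X$; the helper names `dual`, `max_norm` and `residual_ratio` are hypothetical. It checks that the remainder divided by the norm of the increment shrinks as the increment shrinks.

```python
import numpy as np

def dual(a, b):
    """Matrix form of the dual number a + eps*b, as in the isomorphism above."""
    return np.array([[a, 0.0], [b, a]])

def max_norm(H):
    """The norm max{|s|, |t|} of H = dual(s, t)."""
    return max(abs(H[0, 0]), abs(H[1, 0]))

def F(X):
    # Illustrative choice of function (an assumption): F(X) = X^2.
    return X @ X

def residual_ratio(x, y, s, t):
    """||F(X+H) - F(X) - D H|| / ||H|| with the candidate derivative D = 2X."""
    X, H = dual(x, y), dual(s, t)
    D = 2.0 * X
    R = F(X + H) - F(X) - D @ H
    return max_norm(R) / max_norm(H)

# Shrink the increment H = dual(h, 2h) and watch the ratio go to zero.
steps = (1e-1, 1e-2, 1e-3)
ratios = [residual_ratio(1.5, -0.7, h, 2 * h) for h in steps]
print(ratios)
```

Because matrices of this form commute, the remainder here is exactly $H^2$, so the ratio decreases linearly with the size of the increment, which is what the $o(\|H\|)$ condition requires.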

I'm interested in the extent to which this approach can be generalized to define the derivative of a matrix-valued function of a matrix argument. I mean the case where the derivative is an object of the same nature as the variables (as opposed to the definition of the derivative of a function $f:\mathbb{R}^{n}\rightarrow\mathbb{R}^{m}$, which is a (Jacobian) matrix).

Can anyone share links to material on this kind of derivative?

1 Answer


That will quickly go wrong. One step more complex than your example is the case of complex-valued functions of complex numbers. The matrix equivalent of those numbers is then: $$(a+i b) \sim \begin{bmatrix} a & b \\ -b & a \\ \end{bmatrix}.$$

For such a function we know that, in general, it needs not one but two derivatives: $$F\big(\begin{bmatrix} x+s & y+t \\ -y-t & x+s \\ \end{bmatrix}\big)-F\big(\begin{bmatrix} x & y \\ -y & x \\ \end{bmatrix}\big)=\qquad\qquad\qquad \\[20pt] \begin{bmatrix} c & d \\ -d & c \\ \end{bmatrix}\begin{bmatrix} s & t \\ -t & s \\ \end{bmatrix} +\begin{bmatrix} u & v \\ -v & u \\ \end{bmatrix}\begin{bmatrix} s & -t \\ t & s \\ \end{bmatrix}+o\bigg(\bigg|\bigg|\begin{bmatrix} s & t \\ -t & s \\ \end{bmatrix}\bigg|\bigg|\bigg),$$ which we usually write as: $$ f(z+\Delta) - f(z) = c \ \Delta + u\ \Delta^* + o(|\Delta|)\\[15pt] \text{or:}\quad f(z+\Delta) - f(z) = \frac{df}{dz} \ \Delta + \frac{df}{dz^*}\ \Delta^* + o(|\Delta|). $$

As an example, take the following functions and their pairs of derivatives: $$ \begin{matrix} f(z) & & df/dz & & df/dz^* \\[10pt] {\rm Re}(z) && \frac12 && \frac12 \\ {\rm Im}(z) && -\frac12\ i && \frac12\ i \\ z && 1 && 0 \\ z^2 && 2\,z && 0 \\ z^* && 0 && 1 \\ |z|^2 && z^* && z \\ |z| && \frac12 \frac{\Large z^*}{\Large |z|} && \frac12 \frac{\Large z}{\Large |z|} \\ |z|^3 && \frac32 |z| z^* && \frac32\, |z|\, z \\ && {\rm etc.} && \end{matrix}.$$

As can be seen, only functions that are analytic, like $z$ or $z^2$, have $df/dz^*=0$, so they need only $df/dz$ to describe their derivative (which we then call "the complex derivative"). Likewise, purely anti-analytic functions, like $z^*$ or $(z^*)^2$, need only $df/dz^*$. In general, however, two complex numbers are needed, or in matrix language: two matrices are needed to describe the first-order variation of these matrix-valued functions of a matrix. (See also question 2126598.)
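A few rows of the table can be checked numerically. The sketch below is only an illustration: it assumes the standard relations $df/dz=\frac12(\partial_x f - i\,\partial_y f)$ and $df/dz^*=\frac12(\partial_x f + i\,\partial_y f)$ with $z=x+iy$, approximates the partial derivatives by central differences, and uses a hypothetical helper name `wirtinger`.

```python
def wirtinger(f, z, h=1e-6):
    """Approximate (df/dz, df/dz*) at z using central differences in x and y."""
    fx = (f(z + h) - f(z - h)) / (2 * h)            # partial derivative in x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # partial derivative in y
    return 0.5 * (fx - 1j * fy), 0.5 * (fx + 1j * fy)

z0 = 1 + 2j  # arbitrary test point (an assumption)

# Analytic: z^2 should give (2 z0, 0).
d_sq = wirtinger(lambda z: z * z, z0)
# Anti-analytic: z^* should give (0, 1).
d_conj = wirtinger(lambda z: z.conjugate(), z0)
# Neither: |z|^2 should give (z0^*, z0), so both derivatives are needed.
d_abs2 = wirtinger(lambda z: abs(z) ** 2, z0)

print(d_sq, d_conj, d_abs2)
```

The third case is the one that matters for the argument: neither derivative vanishes, so a single complex number (a single matrix) cannot describe the first-order variation.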

For more complex (larger) matrices the number increases further: describing the derivative in those cases simply requires much more information than one matrix can contain.

  • @Jos Bergervoet, good example. But if I define the derivative of such matrix functions as I described above for the case of dual numbers, it just means the function $F$ satisfies the Cauchy–Riemann conditions (that is, the function is holomorphic, which is equivalent to analyticity in the case of complex numbers).

    Therefore, it is clear that only limited classes of functions admit such derivatives. And so my question is: is there a well-described theory regarding such classes of matrices and functions of them?

    – Иван Петров Mar 30 '24 at 09:15
  • Indeed @Иван (perhaps we should look at the quaternions as next example and follow Cayley–Dickson from there, hopefully it will all be clear once we reach the Sedenions!) But what I was thinking of is actually the opposite: allowing all functions, and then accepting that the derivative must be split into a sum with a minimum required number of terms, as the "rank" in an entangled state of two systems in QM. (And we know that the well-described theory of entanglement is very interesting, maybe this could be related.) – Jos Bergervoet Mar 30 '24 at 09:29
  • I look at it from a different perspective, @Jos. For functions $f:\mathbb{R}^n \to \mathbb{R}^m$ the derivative at a point $a \in \mathbb{R}^n$ is the linear transformation $D_a \in L(\mathbb R^n,\mathbb R^m)$. In turn, we have $L(\mathbb R^n,\mathbb R^m) \approx M_{m,n}(\mathbb R)$, generally speaking.

    So, I am interested in cases where for a certain class of matrices $M$ and functions $F:M\rightarrow M$ we would have an isomorphism $L(M,M)\approx M.$ Simply put, I'm interested in a general view of this question.

    – Иван Петров Mar 30 '24 at 10:23
  • So at least things like zero-divisors have to be excluded. If you look at nice examples like O(n) or SU(n) where the matrices form a group, it might work, but even then we can define non-differentiable functions (we can even do that for the reals). So your question is an existence question, for functions that have (at least in some region of their domain) a single first-derivative description... (and still are non-constant of course). – Jos Bergervoet Mar 30 '24 at 10:42
  • Dual numbers have zero divisors, but this does not prevent us from defining the derivative of functions of these numbers. – Иван Петров Mar 30 '24 at 11:02
  • But I mean you would have to somehow exclude them from the class of functions, e.g. the function $$F\big(\begin{bmatrix} x & 0 \\ y & x \\ \end{bmatrix}\big)=\begin{bmatrix} 0 & 0 \\ y & 0 \\ \end{bmatrix} $$ would not allow you to describe the linear variation with one dual number as proportionality constant (similar to $f(z)={\rm Im}z$ for complex numbers). But I agree this does not mean they have to be excluded from the class of matrices $M$. – Jos Bergervoet Mar 30 '24 at 11:47
  • You're right, the function is not differentiable, but this is a normal situation: some functions are differentiable, some are not. I don't quite understand the point of this comment. – Иван Петров Mar 30 '24 at 12:17
  • It is about what the formulation of the existence question should be. There always exist non-constant functions that are differentiable (like $f(x)=x$), and functions that are not. So what more do we want to prove? Your definition "the existence of such a matrix ... means differentiability of $F$" seems to answer the question whether a function qualifies. Isn't that enough? In what sense would that approach have to be generalized? – Jos Bergervoet Mar 30 '24 at 13:55
  • I apologize, perhaps I did not describe my question well enough at the beginning of the topic.

    In these examples that you and I gave, everything is quite obvious, there are isomorphisms to complex or dual numbers.

    But what if we consider matrices of arbitrary size, even commutative ones, but NOT isomorphic to any generalizations of complex numbers (quaternions, double numbers, and so on)?

    – Иван Петров Mar 30 '24 at 14:37
  • Are such objects described in any articles or educational materials? That was my original question.

    Everything that I found online regarding matrix derivatives comes down, one way or another, to vectorization of matrices and, again, consideration of functions of the form $f:\mathbb{R}^n \to \mathbb{R}^m$, the derivative of which is again an object of a different nature.

    – Иван Петров Mar 30 '24 at 14:38
  • If in the general case no one has considered this, when the derivative of a matrix function is again a matrix, and of the same size, then my question is closed. – Иван Петров Mar 30 '24 at 14:41