I'm trying to understand the method of separation of variables. I'm probably overlooking something simple regarding the justification for the term-by-term differentiation that comes up when an initial condition is given (in particular when the solution is generalized rather than classical).
So the setting is an initial boundary value problem for a hyperbolic PDE of the form $\rho(x)m(t)u_{tt}-L_x[u]=0$, where $L_x[u]=\operatorname{div}(p(x)\nabla u) + q(x)u$ (well, since here the spatial dimension is 1, it's just a Sturm–Liouville operator).
As I understand it, when looking for a generalized solution, the strategy is to find a sequence $\{u_n(x,t)\}_{n=0}^{\infty}$ of classical solutions of the PDE, each satisfying the given boundary conditions, such that the pointwise limit $u(x,t):=\lim_{n\to\infty }u_n(x,t)$ satisfies the initial condition ($u_t(x,0)=g(x)$, where $g(x)$ is well-behaved, i.e. a continuous and piecewise differentiable function that satisfies the boundary conditions).
This is done by the eigenfunction expansions of $u(x,t)$ and $g(x)$ with respect to the orthonormal system $\{X_n(x)\}_{n=0}^{\infty}$ of all eigenfunctions of the operator $L_x$ with the boundary conditions. Under those circumstances, we always have for every $t_0$ a sequence $\{A_n(t_0)\}_{n=0}^{\infty}$ and a sequence $\{B_n\}_{n=0}^{\infty}$ of real numbers such that $u(x,t_0)=\sum A_n(t_0)X_n(x)$ and $g(x)=\sum B_nX_n(x)$, and the two series converge uniformly (with respect to $x$) for all $x$ within the boundaries.
The elements of $\{A_n(t_0)\}_{n=0}^{\infty}$ are determined for all $t_0$ (usually up to some constants) by solving the sequence of ordinary differential equations (one for each eigenvalue of the operator $L_x$) that are obtained by substituting a general separated solution $X(x)T(t)$ into the PDE. This means that the functions $A_n(t)$ are differentiable (even twice).
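To spell out that step as I understand it (a sketch under my own assumptions: I write $\lambda_n$ for the eigenvalue belonging to $X_n$, and I assume the eigenvalue problem is normalized as $L_x[X_n]=-\lambda_n\rho(x)X_n$, which may differ from other conventions in sign or weight): substituting $u=X_n(x)T(t)$ into the PDE gives

$$\rho(x)m(t)T''(t)X_n(x)-T(t)L_x[X_n](x)=\rho(x)X_n(x)\left[m(t)T''(t)+\lambda_nT(t)\right]=0,$$

so each $A_n$ would solve the second-order linear ODE

$$m(t)A_n''(t)+\lambda_nA_n(t)=0,$$

whose general solution contains two free constants per $n$ (assuming $m(t)\neq 0$).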
It remains to find those constants in order to obtain the unique solution for the entire original problem, and they are determined by the initial conditions (which were ignored until now). So sooner or later, it seems an argument along the following lines is used:
Since $u(x,t)=\sum X_n(x)A_n(t)$, and since $u_t(x,0)=g(x)=\sum X_n(x)B_n$, let's write $u_t(x,t)=\sum X_n(x)\frac{dA_n(t)}{dt}$, and from the uniqueness of the expansion coefficients it follows that $\frac{dA_n(t)}{dt}=B_n$ at $t=0$.
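To make the concern concrete, here is the simplest special case I can think of (my own illustrative choice, not the general setting above): $\rho=m=p=1$, $q=0$ on $0<x<\pi$ with Dirichlet conditions, i.e. $u_{tt}=u_{xx}$ and $u(0,t)=u(\pi,t)=0$. Then, indexing from $n=1$ and using that here $A_n''+n^2A_n=0$,

$$X_n(x)=\sqrt{\tfrac{2}{\pi}}\sin(nx),\qquad A_n(t)=a_n\cos(nt)+b_n\sin(nt),$$

and the formal term-by-term derivative at $t=0$ reads

$$u_t(x,0)=\sum_{n=1}^{\infty}n\,b_n\,X_n(x)=\sum_{n=1}^{\infty}B_nX_n(x)\quad\Longrightarrow\quad b_n=\frac{B_n}{n}.$$

The differentiated series carries coefficients $n\,b_n$ instead of $b_n$, so as far as I can see its convergence does not follow automatically from that of the original series.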
It has been a long time since I last dealt with function series. What is the justification for this term-by-term differentiation? Indeed, both series converge uniformly, but isn't it necessary to show explicitly that $\sum X_n(x)\frac{dA_n(t)}{dt}$ also converges uniformly in order to make the above deduction?
I'd appreciate any clarification,
Thanks!