Since in Motivation behind weak form and the test function?
you are given a really good "physicist" point of view, I will try to give a "mathematician" point of view.
Let's say you are given a problem
$$\tag{1} -\Delta f = g \, ,$$
with $g \in C^{\infty}(\Omega)$ and $\Omega \subset \mathbb{R}^n$ is a smooth bounded domain. We want to find if a (or maybe more than one) solution $f$ exists such that $f|_{\partial \Omega} = 0$. I will show now two attempts to "solve" this problem and you will see where weak formulation and test functions naturally arises.
Attempt 1: we search for $f \in C^2(\overline{\Omega})$ and work with the PDE (1) in the point wise sense.
This may seem the most natural approach, but it turns out that this really is not the case. Since we are asking $f$ to be $C^2$, even only proving that one solution exists basically reduces in finding one. You have few weapons of attacking the problem different from building an explicit $f$ that solves the problem point wise and is twice differentiable. This turns out to be really hard in practice, even for simple domains, for example for $\Omega = B_{R}(0) \setminus B_{r}(0)$, constructing an explicit solution relies on finding the so called Green's function of that domain, and it's far from being easy even for very simple domains. Here we have come to the clue of the problem: building an explicit solution is really domain-dependant. This is really bad, mathematically speaking, because it means that even if you find an explicit formula for an annulus let's say, just bending the annulus a little bit makes you restart from scratch on the whole problem, since different domain means different Green's function, and different Green's function means different solution construction. I don't want to go into the details of Green's functions but believe me that it's hard (sometimes close to impossible) to find for most of domains. But let's say you have a magic box that computes Green's functions for all smooth domains, then you are done right? Speaking about existence theory of PDEs, it turns out you are farm from being done! Just let me change the problem a little bit. We work in some smooth $\Omega \subset \mathbb{R}^2$ and try to solve the problem
$$ \tag{2} (1+\sin(x+y)^2)f_{xx}+ (1+y^2x^4)f_{yy}=g$$
for some $f$ satisfying $f|_{\partial \Omega} = 0$. Guess what? For this problem your magic box that computes Green's functions is completely useless! You haven't build a general and strong theory of existence, every problem is drastically different if you look at it point wise, and solving two similar problems could mean using completely different techniques. As we will see, this problem is completely equivalent to the first one for the second attempt, here we go:
Attempt 2: we search for $f$ in some space of functions much bigger than $C^2(\Omega)$ and at first we don't ask $f$ to be a solution point wise, but just that it is solution in some "weak sense". Then, in a second moment we will build a theory that discovers if and when weak solutions are strong solutions.
This approach may let you think that us (mathematicians) always try to find the most difficult solutions to simple problems, but the reality is exactly the opposite. We start stating the problem weakly, that is defining $f$ to be a weak solution of (1) if: for every $\phi \in C^{\infty}_c (\Omega)$ we have
$$ \int_{\Omega} \nabla f \cdot \nabla \phi \, dx = \int_{\Omega} g\phi \, dx \, .$$
This formulation has already endowed the equation we want to solve (the PDE) with it's boundary data, in some sense now the boundary data and the PDE are coupled in the same formulation. Already at this point, one could guess that solving this problem probably will not depend on the domain much, at least not drastically. From a mathematical point of view, this problem is super well posed, on the LHS we have a scalar product in some Hilbert space $(H, \langle \cdot, \cdot \rangle)$ and on the right hand side we have a linear functional $l : H \to \mathbb{R}$. So this problem is really equivalent in finding a vector $f$ (let me call it a vector, as we imagine it as living in the vector space $H$ that I won't specify much, but it's some Sobolev space) such that
$$ \langle f, \phi \rangle = l(\phi) \,,$$
for all $\phi \in H$ (there's some density result I have used since now I allow the test functions in the whole Hilbert space $H$ and not in the subspace $C^{\infty}_c(\Omega)$, but let me skip these details). The existence of such an $f$ is guaranteed by simple and really general theorems as Rietsz's Representation Theorem, even for rough data $g$, for example in $L^2(\Omega)$. This procedure is, in contrast with the previous one, highly domain independant. It turns out that, by exactly the same procedure, one can solve problem (2) and similar problems in arbitrary dimension. This is because shifting one derivative to the text function $\phi$ completely changed the symmetry of the problem, making the LHS somewhat symmetric in $(f, \phi)$, and in mathematics when you find symmetries you are on the right track.
Ok but if you arrived at this point maybe you see that there's something that I have swept under the carpet carefully: is this "weak solution" a solution in the point wise sense? If the answer is "yes" then we are all happy and we have solved a large class of problems by a very general theory, if the answer is "no" then I am an impostor and this post makes little to no sense. Fortunately, it turns out that for regular domains and some regularity on the data the answer is yes! This gives rise to the beautiful Schauder theory, but I don't want to go off topic.
To summarize, weak formulation of a PDE is just a "trick" to enlighten symmetries where they are hidden. If you try to solve a problem in the point wise sense you realize that you are really asking too much, and you have little room to change the problem in order to fit some general result in mathematics. Thus the track becomes: first you relax your definition of solution so that the problem may be "symmetrized" to fit some strong/abstract theorem, and when you have existence of a weak solution by this method, then you have something to work with to prove that it is indeed smooth. This track turns out to be very effective in a lot of areas in mathematics.
EDIT: let me make one last comment. The whole point of the story is that derivatives are not point operators, they are local operators, meaning that the derivative of $f$ at $x_0 \in \Omega$ does not depend only on the value $f(x_0)$, it does depends on the values of $f$ around $x_0$. So if you think about it, taking a PDE and treating it point wise does not make much sense, since $-\Delta f = g $ is not a point wise equation, is a local one. If I say $-\Delta f(x_0)=g(x_0)$ to someone, I am implicitly prescribing the behavior of $f$ also in a neighborhood of $x_0$, I'n not only saying what happens at $x_0$. Then it turns out to be really natural to make the formulation of the PDE not point wise but local, multiplying by some arbitrary weight and then integrating, because in this way you capture the entire information the PDE can give you. You are not pretending anymore to be dealing with point operators.