MY ATTEMPT AT A PROOF:
We prove the result for a constant sectional curvature $\Delta$.
Let $(x,\,y) \in \mathcal{M}^2$. Define the geodesic $\gamma (-\epsilon, \epsilon) \to \mathcal{M}^2$ with $\gamma(0) = x$ and $\gamma(1)=y $. The second order Taylor expansion of $\varphi$ gives:
\begin{multline*}
\varphi(y) = \varphi(x) + \langle \text{grad} \varphi(x), \text{Exp}^{-1}_x(y) \rangle + \frac{1}{2} \langle \text{Hess} \varphi(\gamma(t))[ \dot{\gamma}_t],\dot{ \gamma}(t) \rangle \nonumber \\+ \frac{1}{2} \langle \text{grad} (\varphi(\gamma(t))), \ddot{\gamma}(t)\rangle
\end{multline*}
The last term can be ignored since the acceleration of a geodesic is zero. We show that the third term is positive for any point $ \gamma(t) =(\tilde{x},\tilde{y}$) and $\dot{ \gamma}(t) = (v_1,v_2)$ with $t \in (0,1)$.
Accordingly, consider any point $(\tilde{x},\tilde{y}) \in \mathcal{M}^2$ with $d(\tilde{x},\tilde{y})\leq r <r^*$. Define the geodesics $\gamma_i :(-\epsilon,\epsilon) \to \mathcal{M}$, with $\gamma_1(0)=\tilde{x},\,\dot{\gamma}_1(0) = v_1$ and $\gamma_2(0)=\tilde{y},\,\dot{\gamma}_2(0) = v_2$. Let $\gamma_{12,t}(s)$ denote the geodesic joining the point $\gamma_1(t)$ to $\gamma_2(t)$. We note that since we restrict ourselves to radii less than the convexity radius, such a construction is possible without encountering any conjugate point pairs on $\gamma_{12}$. Also, let $\phi(t):=L(\gamma_{12,t}) $, where $L$ denotes the length of the geodesic segment, so that $ \phi(t) :=d (\gamma_1(t),\gamma_2(t))$ and $\phi(0)=r$. Thus, we need to calculate the second order derivative of $\phi^2(t)$ at $t_0=0$.
To proceed, we define the geodesic variation as $\alpha : [0,1]\times [a,b] \to \mathcal{M} $, so that the map $s \to \alpha(t_0,s)$ traces the (normalized) geodesic $\gamma_{1,2,t_0}(s)$ with $\| \dot{\gamma}_{1,2,t}\|=1$ . Define the vector field $X(t):= \frac{\partial \alpha}{\partial t}:=D\alpha \frac{\partial}{\partial t}|_{t=t_0}$. We note that $X(t)$ is a Jacobi field along $\gamma_{12,t}$. The double derivative of $\phi^2(\cdot)$ can be calculated by a straightforward application of the second variation of arc length formula and is given by :
\begin{align}\label{dder}
\frac{d}{dt^2} \phi^2(t)\Big|_{t = t_0} &= r \langle X^\perp (s), X^{\perp}(s) \rangle\big|_{a}^b
\nonumber \\&\qquad + \Big( \Big\langle X(s), \frac{\gamma_{12,t_0}(s)}{\| \gamma_{12,t_0}(s)\|} \Big\rangle \Big|_{a}^b \Big)^2,
\end{align}
where $X^{\perp}(t):= X(t) - \langle X(t), \dot{\gamma}_{12,t} \rangle \dot{\gamma}_{12,t}$ is the part of $X$ orthogonal to $\dot{\gamma}_{12,t}$. We consider the first term and show that it is non-negative to establish the result. Since the initial condition $X(0)$ is not zero, to use the standard Jacobi field theory we split $X(t)$ as $X(t) =X_1(t) + X_2(t)$, where
\begin{equation}\label{i.c.}
X_1(0)=0 \,; X_2(0) =X^{\perp}(0); \,X_1(r)=X^{\perp}(r); \,X_2(r) =0.
\end{equation}
Such a (unique) decomposition is possible. Then, the solution to the Jacobi equation with the above initial conditions, for a manifold with constant curvature, is given as :
\begin{equation}\label{h.def}
X_i(t) = S_{\Delta}(t) E_i(t) \text{ and } \nabla X_i(t) = C_{\Delta}(t) E_i(t) , \,\,i=1,\,2
\end{equation}
where $E_i(t)$ is some field parallel to the $\gamma_{12,t}$ with $\| E_i(t)\| =1 $ and
\begin{multline*}
C_{\Delta}(t):= \cos (\sqrt{\Delta}t),\,\, S_{\Delta}(t) := \frac{1}{\sqrt{\Delta}}\sin (\sqrt{\Delta}t),\text{ if } \Delta >0; \qquad \\
C_{\Delta}:= t,\,\, S_{\Delta}= 1,\text{ if } \Delta =0;\\
C_{\Delta}(t):= \cosh (\sqrt{|\Delta|}t),\,\,\, S_{\Delta}(t) := \frac{1}{\sqrt{|\Delta|}}\sinh (\sqrt{|\Delta|}t),\text{ if } \Delta < 0.
\end{multline*}
We also have the easily verifiable property ,
\begin{equation}\label{h.1}
\frac{\langle \nabla X_i(t),X_i(t)\rangle }{\langle X_i(t) ,X_i(t)\rangle } = \frac{C_{\Delta} (t)}{S_{\Delta}(t)} \text{ and } \| \nabla X_i (0)\| \leq \frac{\|X_i(t)\|}{S_\Delta(t)}, \,\,i=1,2.
\end{equation}
We have:
\begin{align*}
\frac{1}{r}\frac{d}{dt^2} \frac{\phi^2(t)}{2} &\geq \langle \nabla X_1(r), X_1(r) \rangle + \langle \nabla X_2(r), X_1(r) \rangle \\&\qquad - \langle \nabla X_1(0), X_2(0) \rangle -\langle \nabla X_2(0), X_2(0) \rangle
\end{align*}
We remark that since the initial conditions for $X_2(t)$ are reversed (i.e. $X_2(r)=0$ and $X_2(0)=X^{\perp}(0)$), when considering $\nabla X_2(0)$, we have to parametrize $\gamma_{12,t_0}(s)$ as $s' =r-s$. This gives $\nabla X_2(r)=-\nabla X_2(0)$. We first consider the case $\Delta \leq 0$. We have from using :
\begin{align}\label{p-1}
\frac{1}{r}\frac{d}{dt^2} \frac{\phi^2(t)}{2} & \geq \frac{C_\Delta(r)}{S_\Delta(r)} \|X_1(r) \|^2 - \frac{1}{S_\Delta(r)}\|X_2(0)\| \|X_1(r)\| \nonumber \\& \qquad- \frac{1}{S_\Delta(r)} \|X_1(r)\| \| X_2(0)\| + \frac{C_\Delta(r)}{S_\Delta(r)} \|X_2(0) \|^2 \nonumber \\
& \geq \frac{C_\Delta(r)}{S_\Delta(r)} \|X_1(r) \|^2 + \frac{C_\Delta(r)}{S_\Delta(r)} \|X_2(0) \|^2 \nonumber \\& \qquad - \frac{1}{S_\Delta(r)} \Big( \|X_1(r) \|^2 + \|X_2(0) \|^2 \Big) \nonumber \\
& = \frac{1}{S_\Delta(r)}\Bigg( C_\Delta(r)- 1 \Bigg) \Big( \|X^{\perp}(0)\|^2 + \|X^{\perp}(r)\|^2 \Big),
\end{align}
The right hand side can be seen to be positive for all $r>0$ using the definition of $C_\Delta$.
We next consider $\Delta > 0$. Set $a:= \|X^{\perp}(0)\|$ and $b :=\|X^{\perp}(r)\|$. We have:
\begin{align}\label{p-2}
\frac{1}{r}\frac{d}{dt^2} \frac{\phi^2(t)}{2} &\geq \frac{C_\Delta(r)}{S_\Delta(r)} \|X_1(r) \|^2 - C_\Delta(0)\|E_2(0)\| \|X_1(r)\| \nonumber \\ &-C_\Delta(0)\|E_1(0)\| \| X_2(0)\| + \frac{C_\Delta(r)}{S_\Delta(r)} \|X_2(0) \|^2 \nonumber \\
& \geq \frac{C_\Delta(r)}{S_\Delta(r)} \|X^{\perp}(r) \|^2 - \|X^{\perp}(r)\| -\| X^{\perp}(0)\| \nonumber \\ &\qquad + \frac{C_\Delta(r)}{S_\Delta(r)} \|X^{\perp}(0) \|^2 \nonumber \\
&= (a^2+b^2) \frac{C_\Delta(r)}{S_\Delta(r)} \Bigg( 1- \frac{ \tan (\sqrt{\Delta}r) (a+b)}{\sqrt{\Delta}(a^2+b^2) } \Bigg)
\end{align}
The second inequality uses the fact that $\|E_i(t)\|=1$. Consider $r^* = \min ( \frac{\pi}{2\sqrt{\Delta}} , \frac{1}{\sqrt{\Delta}} \tan^{-1} (\frac{\sqrt{\Delta}(a^2+b^2)}{a+b})) $. One can easily verify that for $r<r^*$, the RHS is positive in the above inequality.