An equivalent formulation of the sum rule is: given $p \in \partial (f+g)(x)$,
``find $s \in \partial f(x)$ such that $p - s \in \partial g(x)$''.
Now, we can reformulate the second relation to isolate $s$ on the left-hand side.
Define the concave function $h_p(y) = p^\top \, (y-x) - g(y) + g(x) + f(x)$. Then, the second relation becomes $s \in \hat\partial h_p(x)$, where $\hat\partial$ denotes the superdifferential of a concave function.
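To spell out this reformulation (a short sketch, assuming the usual definition of a supergradient of a concave function): $p - s \in \partial g(x)$ means
\begin{equation*}
g(y) \ge g(x) + (p - s)^\top (y - x)
\end{equation*}
for all $y$. Moving the terms involving $p$ and $g$ to one side and adding $f(x)$ to both sides gives
\begin{equation*}
h_p(y) = p^\top (y-x) - g(y) + g(x) + f(x) \le f(x) + s^\top (y - x) = h_p(x) + s^\top (y-x),
\end{equation*}
where the last equality uses $h_p(x) = f(x)$. This is the supergradient inequality for $h_p$ at $x$, and every step can be reversed.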
Let us reconsider this sub/supergradient relation in terms of the epigraph of $f$ and the hypograph of $h_p$.
By $s \in \partial f(x)$ we obtain
\begin{equation*}
f(x) - s^\top x \le f(y) - s^\top y
\end{equation*}
for all $y$ and this implies
\begin{equation*}
f(x) - s^\top x \le (1, -s)^\top k
\end{equation*}
for all $k$ in the epigraph of $f$.
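In more detail, assuming the epigraph is written as $\{(a,b) \in \mathbb{R} \times \mathbb{R}^n : a \ge f(b)\}$ (the mirror image of the hypograph convention used below): for $k = (a, y)$ in the epigraph of $f$ we have
\begin{equation*}
(1,-s)^\top k = a - s^\top y \ge f(y) - s^\top y \ge f(x) - s^\top x .
\end{equation*}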
Similarly, from $s \in \hat\partial h_p(x)$ we get
\begin{equation*}
h_p(x) - s^\top x
=
f(x) - s^\top x
\ge
h_p(y) - s^\top y
\end{equation*}
for all $y$ and this implies
\begin{equation*}
f(x) - s^\top x
\ge
(1,-s)^\top k
\end{equation*}
for all $k$ in the hypograph $\{(a,b) \in \mathbb{R} \times \mathbb{R}^n : a \le h_p(b)\}$ of $h_p$.
Hence, the validity of the sum rule implies that we can separate the epigraph of $f$
and the hypograph of $h_p$ (which is just the mirrored, shifted and tilted epigraph of $g$).
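As a one-dimensional illustration (an example chosen here, not part of the argument itself), take $f(y) = g(y) = |y|$, $x = 0$ and $p = 1$, so that $p \in \partial (f+g)(0) = [-2,2]$. Then
\begin{equation*}
h_p(y) = y - |y| =
\begin{cases}
0, & y \ge 0, \\
2y, & y < 0,
\end{cases}
\end{equation*}
and one checks $\partial f(0) = [-1,1]$ and $\hat\partial h_p(0) = [0,2]$. Any $s \in [0,1]$ lies in both sets, and indeed $p - s \in [0,1] \subset \partial g(0)$. For instance, $s = 1/2$ gives
\begin{equation*}
|y| \ge \tfrac12 \, y \ge y - |y|
\end{equation*}
for all $y$, so the line with slope $1/2$ through the origin separates the epigraph of $f$ from the hypograph of $h_p$.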
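Conversely,
let $p \in \partial (f+g)(x)$ be given.
This implies $f \ge h_p$ everywhere, so the epigraph of $f$ and the hypograph of $h_p$ can only touch along their boundaries; in particular, the epigraph of $f$ and the strict hypograph of $h_p$ are disjoint.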
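To see how the two sets are positioned, write out the definition: $p \in \partial (f+g)(x)$ means
\begin{equation*}
f(y) + g(y) \ge f(x) + g(x) + p^\top (y-x)
\end{equation*}
for all $y$, and rearranging gives
\begin{equation*}
f(y) \ge p^\top (y-x) - g(y) + g(x) + f(x) = h_p(y),
\end{equation*}
with equality at $y = x$. Consequently, any point $(a,b)$ lying in both the epigraph of $f$ and the hypograph of $h_p$ must satisfy $a = f(b) = h_p(b)$, and $(f(x), x)$ is such a point.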
If you can separate the epigraph of $f$ and the hypograph of $h_p$,
then you find $a \in \mathbb{R}$ and $(\lambda,-s) \in (\mathbb{R}\times\mathbb{R}^n) \setminus \{0\}$ such that
\begin{equation*}
(\lambda,- s)^\top k \ge a \ge (\lambda,-s)^\top l
\end{equation*}
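for all $k$ in the epigraph of $f$ and all $l$ in the hypograph of $h_p$.
Since you can make the first component of $k$ arbitrarily large, you obtain $\lambda \ge 0$,
and $\lambda = 0$ cannot happen: if $f$ and $g$ are finite everywhere, the second components of $k$ and $l$ are arbitrary, so $\lambda = 0$ would imply $s = 0$, contradicting $(\lambda,-s) \ne 0$.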
Hence, we can rescale $\lambda$ to be $1$.
Choosing $k = (f(y), y)$ and $l = (h_p(y), y)$, this implies
\begin{equation*}
f(y) - s^\top y \ge a \ge (p^\top \, (y-x) - g(y) + g(x) + f(x)) - s^\top y
\end{equation*}
for all $y$.
Plugging in $y = x$ yields $a = f(x) - s^\top x$.
With this value of $a$, the left inequality is exactly $s \in \partial f(x)$ and the right inequality is exactly $s \in \hat\partial h_p(x)$, i.e., $p - s \in \partial g(x)$.
I hope this answer clarifies that a sum rule is essentially a separation theorem.