First, some notation: I use upper-case bold letters for homogeneous coordinate vectors of points in $\mathbb{RP}^3$ and lower-case bold for points in $\mathbb{RP}^2$; a tilde over the symbol indicates the corresponding inhomogeneous Cartesian coordinate vector in $\mathbb R^3$ or $\mathbb R^2$, respectively. We have the projection $\mathbf x = \mathtt P\mathbf X$ from the world to the image. I’m assuming a finite camera, so that $\mathtt P$ is a full-rank $3\times4$ matrix whose left $3\times3$ block is invertible. The columns of this matrix are designated $\mathbf p_1$ through $\mathbf p_4$.
The back-projection of an image point $\mathbf x$ is a world ray that emanates from the camera center $\mathbf C$. (If you don’t have the center handy, you can compute it from $\mathtt P$ using the fact that $\mathtt P\mathbf C=0$.) By decomposing $\mathtt P$ into $[\mathtt M\mid\mathbf p_4]$, where $\mathtt M$ is the left $3\times3$ block, we find that $\tilde{\mathbf C}=-\mathtt M^{-1}\mathbf p_4$ and that $[(\mathtt M^{-1}\mathbf x)^T; 0]^T$ is the point at infinity that projects to $\mathbf x$. The back-projected ray is then the join of this point and the camera center, $\tilde{\mathbf C}+\lambda\mathtt M^{-1}\mathbf x = \mathtt M^{-1}(\lambda \mathbf x-\mathbf p_4)$ in inhomogeneous Cartesian coordinates, with $\lambda$ a free parameter. This back-mapping can’t be represented by a $3\times3$ matrix, but if you assume that $\tilde{\mathbf C}$ is the origin, the inhomogeneous direction vector of the ray is enough to describe it, and that’s just $\mathtt M^{-1}\mathbf x$.
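As a quick numerical check of the ray formula, here is a minimal NumPy sketch. The camera $\mathtt P$ below is made up for illustration (a hypothetical intrinsic matrix `K`, identity rotation, and translation `t`), and the helper names `camera_center` and `back_project` are mine; only the formulas $\tilde{\mathbf C}=-\mathtt M^{-1}\mathbf p_4$ and $\mathtt M^{-1}\mathbf x$ come from the text above.

```python
import numpy as np

def camera_center(P):
    """Inhomogeneous center C~ = -M^{-1} p4 of a finite camera P = [M | p4]."""
    M, p4 = P[:, :3], P[:, 3]
    return -np.linalg.solve(M, p4)

def back_project(P, x):
    """Return (origin, direction) of the world ray through homogeneous image point x."""
    M = P[:, :3]
    direction = np.linalg.solve(M, x)  # M^{-1} x, direction of the point at infinity
    return camera_center(P), direction

# Hypothetical finite camera P = K [R | t]; all values are illustrative only.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.1, -0.2, 0.5])
P = K @ np.hstack([R, t[:, None]])

X = np.array([1.0, 2.0, 10.0, 1.0])  # a world point, homogeneous
x = P @ X                            # its image

origin, direction = back_project(P, x)

# The world point must lie on the ray, so (X~ - origin) is parallel to direction.
residual = np.cross(X[:3] - origin, direction)
```

If everything is consistent, `residual` is numerically zero and `P @ [origin; 1]` vanishes, which is the $\mathtt P\mathbf C=0$ condition.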
There is a different decomposition of $\mathtt P$ that connects it more transparently to the image plane, although it’s not nearly as convenient as the one above. In case you didn’t know, a plane with implicit Cartesian equation $ax+by+cz+d=0$ can be represented by the homogeneous vector $\mathbf\Pi=[a,b,c,d]^T$ in $\mathbb{RP}^3$: just write the equation as $\mathbf\Pi^T\mathbf X=0$. Central projection onto $\mathbf\Pi$ relative to the viewpoint $\mathbf C$ is given by the $4\times4$ matrix $$\mathtt H=\mathbf C\mathbf\Pi^T-(\mathbf C^T\mathbf\Pi)\mathtt I_4.$$ (When $\tilde{\mathbf C}=0$ this matrix has a particularly simple form.) The camera projection transformation can then be viewed as central projection onto the image plane $\mathbf\Pi$, followed by an affine transformation $\mathtt A$ that maps the image plane onto the $x$-$y$ plane, and finally deletion of the $z$-coordinate, i.e., $$\mathtt P = \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&0&1\end{bmatrix} \mathtt A \mathtt H.$$
To back-project an image point $\mathbf x$, we can reverse the last two steps, producing a point on the image plane, and then, assuming again that the camera is at the world origin, delete the last coordinate of the result to get the ray’s direction vector in $\mathbb R^3$. (Technically, we should project the point on the image plane onto the plane at infinity first, but that projection is just a matter of setting the last coordinate to zero.) This transformation cascade is accomplished by the $3\times3$ matrix $$\begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\end{bmatrix} \mathtt A^{-1} \begin{bmatrix}1&0&0\\0&1&0\\0&0&0\\0&0&1\end{bmatrix},$$ which is just $\mathtt A^{-1}$ with its last row and third column deleted. $\mathtt A$ can be derived from the world-to-camera transformation and the camera’s intrinsic matrix, but I won’t go into the details here.
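To make this second decomposition concrete, here is a small NumPy sketch on a toy setup that I’ve chosen so the numbers are easy to follow: the camera sits at the world origin, the image plane is assumed to be $z=1$, and $\mathtt A$ is assumed to be the translation that moves that plane onto $z=0$. None of these values come from a real camera. The code calls the central projection matrix $\mathbf C\mathbf\Pi^T-(\mathbf C^T\mathbf\Pi)\mathtt I_4$ `H` to keep it apart from the $3\times3$ block used earlier.

```python
import numpy as np

def central_projection(C, Pi):
    """4x4 matrix C Pi^T - (C^T Pi) I_4: projection from C onto the plane Pi."""
    return np.outer(C, Pi) - (C @ Pi) * np.eye(4)

# Toy setup (assumptions, not from a real camera):
C = np.array([0.0, 0.0, 0.0, 1.0])    # camera center at the world origin
Pi = np.array([0.0, 0.0, 1.0, -1.0])  # image plane z = 1
H = central_projection(C, Pi)

A = np.eye(4)
A[2, 3] = -1.0                        # translate z by -1: plane z = 1 -> plane z = 0

drop_z = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0]])
P = drop_z @ A @ H                    # the full camera matrix

# Back-projection matrix: A^{-1} with its last row and third column deleted.
A_inv = np.linalg.inv(A)
B = np.delete(np.delete(A_inv, 3, axis=0), 2, axis=1)

X = np.array([2.0, 4.0, 8.0, 1.0])    # a world point, homogeneous
on_plane = H @ X                      # its central projection onto the image plane
x = P @ X                             # its image
direction = B @ x                     # direction of the back-projected ray
```

In this toy case `P` comes out as $[\mathtt I_3\mid\mathbf 0]$ and `B` as the identity, so `direction` is parallel to $\tilde{\mathbf X}$, as it should be for a camera at the origin. With a realistic $\mathtt A$ the same steps apply, but the matrices are no longer this tidy.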