Note that in your case, the matrix $E$ is $L^{-1}$. So one might just go even further and store $U^{-1}$ as well or even $A^{-1}=U^{-1}L^{-1}$.
So why not just store $A^{-1}$? In a word, accuracy. A matrix $A$ has a so-called condition number $\kappa(A)$. The error $\| x_{\rm computed} - x\|$ in solving $Ax = b$ is roughly $\kappa(A)\times 10^{-16}$ in double precision arithmetic (more generally, $10^{-16}$ is roughly the unit roundoff for your floating point number system). However, even when $\kappa(A)$ is large, conventional $LU$ factorization with back- and forward-substitution tends to be backwards stable in the sense that the residual norm is small: $\|Ax_{\rm computed} - b\| \approx 10^{-16}$. In some applications, a small residual is more important than a small error, so this backwards stability is a desirable property.
Unfortunately, if you compute and store $A^{-1} = U^{-1}L^{-1}$, then the computed solution is (usually) no longer backwards stable and the residual is much larger $\|Ax_{\rm computed} - b\| \approx \kappa(A)\times 10^{-16}$. The error $\| x_{\rm computed} - x\|$ is usually of the same magnitude, but is usually a bit larger than if we just did back substitution.
So if we store $E=L^{-1}$, our residual norm will usually be higher. But not higher like $\|Ax_{\rm computed} - b\| \approx \kappa(A)\times 10^{-16}$, but higher like $\|Ax_{\rm computed} - b\| \approx \kappa(L)\times 10^{-16}$. We pay for the condition number of $L$ not $A$. In practice, $L$ is usually fairly well-conditioned even if $A$ is not. For instance, in some tests I did with a randsvd random matrix $A$ with condition number $10^{14}$, the condition number of $L$ computed with partial pivoting is only like $10^2$. So the residual norm is only like $100\times $ larger than with back- and forward- substitution.
There are other reasons to prefer storing $L$ over $E$. For example, if $L$ is a sparse matrix, $E$ may be less sparse and thus more computational costly to multiply by.
You're usually not losing a lot of accuracy or speed by the approach you advocate, but it's not clear exactly what you're gaining by your approach since you are losing something in accuracy by storing $E$ instead of $L$. Unless you're in a setting where you have a good reason to use $E$ instead of $L$ (which do exist), I would suggest sticking with $L$.