A matrix is a rectangular array of numbers, expressions or symbols arranged in rows and columns, as you may have read on Wikipedia. Matrices can be added, subtracted and multiplied, provided their sizes are compatible. Division is a special case: instead of "dividing" by a matrix $B$, we find its inverse (which exists only when $B$ is square and invertible) and multiply by that matrix. $(A/B=A\times B^{-1})$
$$A=
\begin{pmatrix}
a_{11} & a_{12} & \cdots &a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots&\vdots\\
a_{m1} & a_{m2} & \cdots&a_{mn} \\
\end{pmatrix}
$$
The matrix $A=(a_{ij})_{m\times n}$, where $a_{ij}\in \mathbb{R}$ $(i=1,2,\dots,m, \space j=1,2,\dots,n)$, has $m$ rows and $n$ columns; $m\times n$ is called the size of the matrix. A matrix with a single row is called a row vector and has size $1\times n$; a matrix with a single column is called a column vector and has size $m\times 1$. A matrix of size $n\times n$ is called a square matrix. Below are a row vector, a column vector and a square matrix, respectively.
$$\left(3, \space 2,\space 5\right),\space \begin{pmatrix}
3\\
7\\
2\\
\end{pmatrix}, \space \begin{pmatrix}
3&5&5\\
3&3&9\\
1&2&2\\
\end{pmatrix}$$
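
To make the earlier remark about "division" concrete, here is a minimal Python sketch that computes $A\times B^{-1}$ for two $2\times 2$ matrices. The helper names `inverse_2x2` and `matmul`, and the example matrices `A` and `B`, are my own choices for illustration; the inverse uses the standard $2\times 2$ formula and only works when the determinant is nonzero.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Return the inverse of a 2x2 matrix given as a list of two rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    det = Fraction(det)  # exact arithmetic instead of floating point
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(p, q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))]
            for i in range(len(p))]

# Hypothetical example matrices, chosen so that det(B) = 1.
A = [[1, 2], [3, 4]]
B = [[2, 1], [1, 1]]

B_inv = inverse_2x2(B)
A_over_B = matmul(A, B_inv)  # plays the role of "A / B"

# Multiplying back by B should recover A.
print(matmul(A_over_B, B) == A)
```

Note that matrix multiplication is not commutative, so $A\times B^{-1}$ and $B^{-1}\times A$ are in general different; this is one reason "division" is not defined as a single operation.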
Matrices have many uses across applied fields. A major application of matrices is to represent linear transformations. A linear transformation (or linear mapping) is a mapping between two modules $(L:V\to W)$ that preserves the operations of addition and scalar multiplication; we can consider this to be simply a generalization of linear functions. I will elaborate on this shortly.
To clear up your misconception: matrices can be written in a number of ways and, furthermore, systems of equations can be expressed with matrices. For example, let us look at the following system of equations and the matrix built from it:
$$2x + 3y - z = 6\\
-x - y - z = 9\\
x + y + 6z = 0\\$$
$$\left[
\begin{array}{ccc|c}
2&3&-1&6\\
-1&-1&-1&9\\
1&1&6&0
\end{array}
\right]$$
Writing down the coefficients of $x$, $y$ and $z$ together with the constants on the right-hand side, we have represented the system as a matrix. This matrix is called the augmented matrix (shown above). With the system of equations expressed as an augmented matrix, we can now solve it and determine whether the system is consistent (which means that there is one unique solution or infinitely many solutions) or inconsistent (which means that there are no solutions). Solving systems like these has a variety of useful applications, in business and many other fields: generally, any problem with several unknowns. Matrices thus make it practical to solve systems of $3$ or more unknowns that would otherwise be time-consuming or difficult; imagine $6$ or even $10$ unknowns.
Note that the above system of equations can also be represented as a matrix equation:
$$
\begin{pmatrix}
2 & 3 & -1 \\
-1 & -1 & -1 \\
1 & 1 & 6 \\
\end{pmatrix}
\begin{pmatrix}
x\\
y\\
z\\
\end{pmatrix}
=\begin{pmatrix}
6\\
9\\
0\\
\end{pmatrix}
$$
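
As a rough illustration of how the augmented matrix for this system can be solved mechanically, here is a small Gauss-Jordan elimination sketch in Python. The function name `solve_augmented` is hypothetical, and the sketch assumes a square system with a unique solution; it simply raises an error otherwise rather than distinguishing "no solution" from "infinitely many".

```python
from fractions import Fraction

def solve_augmented(aug):
    """Solve a square linear system given as an augmented matrix [A | b].

    Returns the unique solution as a list, or raises ValueError when the
    coefficient matrix is singular (no solution or infinitely many).
    """
    rows = [[Fraction(v) for v in row] for row in aug]
    n = len(rows)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        p = rows[col][col]
        rows[col] = [v / p for v in rows[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    # After elimination the left block is the identity, so the last
    # column holds the solution.
    return [row[-1] for row in rows]

# Augmented matrix of the system above.
augmented = [
    [2, 3, -1, 6],
    [-1, -1, -1, 9],
    [1, 1, 6, 0],
]
x, y, z = solve_augmented(augmented)
print(x, y, z)  # -201/5 147/5 9/5
```

Substituting $x=-\frac{201}{5}$, $y=\frac{147}{5}$, $z=\frac{9}{5}$ back into the three equations confirms that this system is consistent with one unique solution.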
A rotation matrix is a matrix that is used to perform a rotation in Euclidean space. Taking the two equations you mentioned, we can represent them as matrices, just as we did with the system of equations above.
$$x' = x\cos\theta - y\sin\theta\\
y' = x\sin\theta + y\cos\theta\\$$
$$\begin{pmatrix}
x'\\
y'\\
\end{pmatrix}
=\begin{pmatrix}
\cos\theta &-\sin\theta\\
\sin\theta &\cos\theta\\
\end{pmatrix}
\begin{pmatrix}
x\\
y\\
\end{pmatrix}$$
When you consider the above representation you can clearly see that those two equations amount to multiplying the coordinate vector by a matrix, and thus the name "Rotation Matrix". Plugging in any $(x,\space y)$ coordinates and using matrix multiplication, you can find the corresponding $(x',\space y')$ coordinates. This yields the same answer as substituting $x$ and $y$ into the equations.
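
A quick way to convince yourself of this is to compute the rotated point both ways. The following is only a sketch: the function names, the angle and the sample point are arbitrary choices of mine.

```python
import math

def rotate_with_equations(x, y, theta):
    """Apply the two rotation equations directly."""
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

def rotation_matrix(theta):
    """Build the 2x2 rotation matrix as a list of rows."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def apply(matrix, vector):
    """Multiply a matrix by a column vector (given as a tuple)."""
    return tuple(sum(a * v for a, v in zip(row, vector)) for row in matrix)

theta = math.pi / 6      # arbitrary angle for the demonstration
point = (2.0, 1.0)       # arbitrary (x, y) coordinates

via_equations = rotate_with_equations(*point, theta)
via_matrix = apply(rotation_matrix(theta), point)

# Both routes should agree up to floating-point rounding.
print(all(math.isclose(a, b) for a, b in zip(via_equations, via_matrix)))
```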
Now back to Linear Transformations...
A linear transformation $L$ is a function $L:V\to W$, where $V$ and $W$ are vector spaces, satisfying $L(k_1\mathbf x_1+k_2\mathbf x_2)=k_1L(\mathbf x_1)+k_2L(\mathbf x_2)$ for all $\mathbf x_1, \mathbf x_2 \in V$ and $k_1,\space k_2\in \mathbb{R}$. Matrices allow linear transformations to be represented in a consistent format suitable for computation. If $L$ is a linear transformation mapping $V=\mathbb{R}^n$ to $W=\mathbb{R}^m$ and $\mathbf x$ is a column vector with $n$ entries, then $L(\mathbf x)=A\mathbf x$ for some $m\times n$ matrix $A$. $A$ is called the transformation matrix of $L$. Using the rotation matrix from above:
$$\begin{align}
L(\mathbf x) & = A\mathbf x \\
& =\begin{pmatrix}
\cos\theta &-\sin\theta\\
\sin\theta &\cos\theta\\
\end{pmatrix}\begin{pmatrix}
x\\
y\\
\end{pmatrix}\\
& = \begin{pmatrix}
x\cos\theta -y\sin\theta\\
x\sin\theta +y\cos\theta\\
\end{pmatrix}
\end{align}$$
$$\text{When}\space \theta = \frac{\pi}{2}$$
$$L\begin{pmatrix}
x\\
y\\
\end{pmatrix}=\begin{pmatrix}
x\cos\frac{\pi}{2} -y\sin\frac{\pi}{2}\\
x\sin\frac{\pi}{2} +y\cos\frac{\pi}{2}\\
\end{pmatrix}=\begin{pmatrix}
x(0) -y(1)\\
x(1) +y(0)\\
\end{pmatrix}=\begin{pmatrix}
-y\\
x\\
\end{pmatrix}$$
$$\text{When}\space \theta = \pi$$
$$L\begin{pmatrix}
x\\
y\\
\end{pmatrix}=\begin{pmatrix}
x\cos\pi -y\sin\pi\\
x\sin\pi +y\cos\pi\\
\end{pmatrix}=\begin{pmatrix}
x(-1) -y(0)\\
x(0) +y(-1)\\
\end{pmatrix}=\begin{pmatrix}
-x\\
-y\\
\end{pmatrix}$$
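
The two special cases, and the linearity property $L(k_1\mathbf x_1+k_2\mathbf x_2)=k_1L(\mathbf x_1)+k_2L(\mathbf x_2)$ itself, can be checked numerically. This is only a sketch with sample vectors, scalars and an angle that I picked arbitrarily; a numerical check on a few values illustrates the property but does not prove it.

```python
import math

def rotate(vector, theta):
    """Rotate a 2D vector counterclockwise by theta radians."""
    x, y = vector
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

def close(u, v, tol=1e-12):
    """Compare two vectors entry by entry, allowing for rounding error."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))

# Special cases: theta = pi/2 gives (-y, x), theta = pi gives (-x, -y).
v = (3.0, 4.0)
quarter_turn = rotate(v, math.pi / 2)
half_turn = rotate(v, math.pi)
print(close(quarter_turn, (-4.0, 3.0)), close(half_turn, (-3.0, -4.0)))

# Linearity: rotating a linear combination equals combining the rotations.
x1, x2 = (1.0, 2.0), (-3.0, 0.5)   # arbitrary sample vectors
k1, k2 = 2.0, -1.5                 # arbitrary sample scalars
theta = 0.7                        # arbitrary angle in radians

combined = tuple(k1 * a + k2 * b for a, b in zip(x1, x2))
lhs = rotate(combined, theta)
rhs = tuple(k1 * a + k2 * b
            for a, b in zip(rotate(x1, theta), rotate(x2, theta)))
print(close(lhs, rhs))
```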