The method I used on Combo, which I recently put up on GitHub, is the following:
$$
λx.x = I,\quad λx.a = Ka,\quad λx.xx = D,\quad λx.xb = Tb,\quad λx.ax = a,\\
λx.xv = Ub,\quad λx.ux = Wa,\quad λx.av = Bab,\quad λx.ub = Cab,\quad λx.uv = Sab,
$$
where $a$ and $b$ are terms containing no free occurrences of $x$ in them, and where $u$ and $v$ are terms containing free occurrences of $x$ in them in which $a = λx.u$ and $b = λx.v$ (thus $u = ax$ and $v = bx$ under the β-rule). This is equivalent to taking the simplified abstraction rule
$$λx.x = SKK,\quad λx.y = Ky,\quad λx.uv = Sab,$$
where, this time, $u$ and $v$ can be any terms, but $y$ is restricted to being just a variable other than $x$ ... and applying the optimization rules:
$$
I = SKK,\quad K(ab) = S(Ka)(Kb),\quad SII = D,\quad SI(Kb) = Tb,\quad S(Ka)I = a,\\
SIb = Ub,\quad SaI = Wa,\quad S(Ka)b = Bab,\quad Sa(Kb) = Cab.
$$
If you run Combo on $S\ (K\ a)\ (K\ b)$ it will say it's in normal form already. However, if you run Combo on it, with the extensionality axiom (i.e. the η-rule: $λx.ax = a$ where $a$ has no free occurrences of $x$ in it), then it will reduce it to $K\ (a\ b)$.
The weak normal forms are obtained by applying reductions with only the β-rule, while strong normal forms also use the η-rule. To reduce a term $N$ in weak normal form to strong normal form, tack on an extra variable $x$ that does not occur freely in $N$, reduce $Nx$ to (strong) normal form - which I'll denote $[Nx]$ - and then abstract out the $x$: $λx.[Nx]$.
A full set of axioms to embody the η-rule could be obtained by taking the above-mentioned optimization rules, along with the results obtained by abstracting them with respect to $a$ and $b$. For instance, from $SaI = Wa$, one also obtains $λa.(SaI) = λa.Wa$, i.e. $CSI = W$. For $T$, one would get $λb.SI(Kb) = Tb$, i.e. $B(SI)K = T$. For $U$, one gets $SI = U$. Similarly, for $B$, one gets $S(Ka) = Ba$, and then $BSK = B$ ... or $S(KS)K = B$ if applying the simplified abstraction rule. Likewise, for $C$, one gets $B(Sa)K = Ca$ and $C(BBS)K = C$, or $S(S(KB)S)(KK) = C$ if applying the simplified abstraction rule, which yields the equation $C = S(BBS)(KK)$, if also applying the rule $S(Ka) = Ba$ to $S(KB)$ to get $BB$.
Abstracting out the optimization rule for $K$ - which is what your question was about - would get $BKa = B(S(Ka))K$ and then $BK = C(BB(BSK))K$.
I think these all suffice, together, to capture the η-rule.
You can't apply the abstraction rule for $K$ consistently without the qualifiers I listed, because of a conflict with the $S$ rule on the term $a b$ - unless you adopt the rule $K(ab) = S(Ka)(Kb)$ as an extra axiom - i.e. apply an instance of the η-rule. So, the η-rule is needed to get the different weak normal forms to match up and yield the same strong normal form.
λx.c = Kcprovided thatcis free ofx"? Or "λx.c = Kcprovided thatxdoes not occur free inc? The latter would allowcto rebindxas inc = λx.xs.t.λx.λx.xcould be written asK (λx.x)via rule 2.? – joseville Nov 12 '21 at 17:19