Many computational libraries provide a routine called sincos that computes the sine and cosine of the same argument in a single call. Its existence suggests that one call to sincos is more efficient than a call to sin followed by a call to cos.
I am interested in knowing how sincos is (or can be) implemented to make it more efficient than making two calls.
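To make the question concrete, here is my guess at where the savings could come from. This is a minimal sketch, not any real library's implementation. The name `sincos_sketch` is hypothetical, the polynomials are plain truncated Taylor series, and the naive range reduction is only reasonable for moderate |x|. The assumption is that the argument reduction and the computation of r*r are shared, so they are done once instead of twice.

```c
/* Hypothetical illustration only: computes sin(x) and cos(x) together.
   Accuracy is roughly 1e-7 for moderate |x|; production libraries use
   much more careful range reduction and tuned polynomials. */
static void sincos_sketch(double x, double *s, double *c)
{
    const double two_over_pi = 0.63661977236758134308;
    const double pi_over_2   = 1.57079632679489661923;

    /* Shared step 1: range reduction, done once for both results.
       n is the nearest integer to x / (pi/2), so |r| <= pi/4. */
    double t = x * two_over_pi;
    long n = (long)(t >= 0.0 ? t + 0.5 : t - 0.5);
    double r = x - (double)n * pi_over_2;

    /* Shared step 2: r^2 feeds both polynomials. */
    double r2 = r * r;

    /* Truncated Taylor series for sin(r) and cos(r) on [-pi/4, pi/4]. */
    double sr = r * (1.0 + r2 * (-1.0 / 6 + r2 * (1.0 / 120
                + r2 * (-1.0 / 5040 + r2 * (1.0 / 362880)))));
    double cr = 1.0 + r2 * (-1.0 / 2 + r2 * (1.0 / 24
                + r2 * (-1.0 / 720 + r2 * (1.0 / 40320))));

    /* Quadrant fix-up: x = r + n * pi/2, so rotate (sr, cr) by n quarter turns. */
    switch ((int)(((n % 4) + 4) % 4)) {
    case 0:  *s =  sr; *c =  cr; break;
    case 1:  *s =  cr; *c = -sr; break;
    case 2:  *s = -sr; *c = -cr; break;
    default: *s = -cr; *c =  sr; break;
    }
}
```

A caller would write `double s, c; sincos_sketch(x, &s, &c);`. Note that after reduction both sin(r) and cos(r) are needed anyway whenever the quadrant is not known in advance, so a separate sin and a separate cos would each repeat the reduction. Is this sharing the whole story, or do real implementations exploit something more?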