0

Say I have a sentence consisting of n words.

That word can be

I like horse and I also like cats and kitten.

I want to compute the number of sequential words.

For example,

I, Like, Horse, are phrases with one word.

I Like, Like Horse, Horse and, are phrases with two words.

The total number of such phrases should be

n + (n-1) + (n-2) + ... +1

There are n phrases with 1 word, (n-1) phrases with 2 words, and 1 phrase with n words. So there are n sets and each set contain on average (n+1)/2 words.

So total there is n(n+1)/2 such phrases.

Say I want to limit the number of such sequences.

For example, do not compute every such sequences but only sequences less or equal to k.

For example, k =2 we got

I, Like, Horse, are phrases with one word.

I Like, Like Horse, Horse and, are phrases with one word.

I like Horse, like horse and, and everything else doesn't count.

So we got k sets and the number of element in each sets are n, n-1, n-2, ... n-k+1

When k=2, for example, we stop at n-1.

Hence the total number of such phrases are

k*(n+n-k+1)/2 = k*(2n-k+1)/2

I think my math so far is correct. When n is equal to k, it reduces to our previous formula of (n)(n+1)/2, when k is equal to 1, it reduces to simply n.

Now works on the formula we got

k*(2n-k+1)/2 = (2nk - k^2 +k)/2 = nk - .5*(k^2 +k)

Now this gets a little unintuitive.

Naturally we will think that the higher the k, the more phrases we have.

Yet the coefficient of k^2 is negative. It seems to me at some point, the higher the k, the less phrases we get.

How do we reconcile this problem?

user4951
  • 1,714
  • 1
    First, in your last formula, the $k$ should be $-k$. Second, in context, $k$ can't be any bigger than $n$, and it's only when $k$ exceeds $n$ that your (corrected) formula starts giving smaller answers. – Gerry Myerson Feb 11 '13 at 04:57

1 Answers1

1

Notice that $nk > k^2$ since $n>k$, this is the principal term and it is increasing in $k$.