Sum over all contiguous partitions of an array

Question

Consider an array $A$ of length $n$. We can split $A$ into contiguous segments called pieces and store them as another array $B$ . For example , if $A = [1,2,3]$, we have the following arrays of pieces :

$B = [(1),(2),(3)]$ contains three 1-element pieces .
$B = [(1,2),(3)]$ contains two pieces,one having $2$ elements and the other having $1$ element .
$B = [(1),(2,3)]$ contains two pieces, one having $1$ element and the other having $2$ elements.
$B = [(1,2,3)]$ contains one $3$-element pieces .

We consider the value of a piece in some array $B$ to be $(sum$ $of$ $all$ $numbers$ $in$ $the$ $piece)$ x ($length$ $of$ $piece)$ , and we consider the total value of some array $B$ to be the sum of the values for all pieces in that $B$. For example, the total value of $B = [(1,2,4),(5,1),(2)]$ is $(1+2+4)$ x $3$ + $(5+1)$ x $2$ + $(2)$ x $1$ = $35 $ .

Given $A$ , find the total values for all possible B's,sum them together and find this sum modulo $(10^9 + 7)$ .

How to find the closed form for the contribution of every element in the final answer ? Can't seem to find any pattern !

This was asked in my college contest today , these are conducted every week . I would've posted the link directly but the site is accessible to students only . — Sarvagya, Sep 25 '16 at 01:26
Yes , number of elements can be upto 10^6 , so a O(N) algorithm is needed i think . And that can be achieved if we find the contribution of every element in the final answer , like i said . — Sarvagya, Sep 25 '16 at 01:29
I now have a proof, but it's rather long. Given the simplicity of the coefficient ($2^n+2^{n-1}-2^{n-k}-2^{k-1}$), it would be great to have a simpler, maybe combinatorial, proof. — Jean-Claude Arbaut, Sep 26 '16 at 09:25
I answered twice to a similar question recently ; a natural and a tricky. But without the modulo ... — , Sep 27 '16 at 01:16

Jean-Claude Arbaut · Answer 1 · 2016-09-26T09:32:57.697

Updated with a proof of the formula $u(n,k)=2^n+2^{n-1}-2^{n-k}-2^{k-1}$.

Instead of computing the sum for each $B$, compute how many times each number $a_i$ in $A$ may appear in the final sum.

Such an $a_i$ will be in a subarray of length $L$, from $a_j$ to $a_{j+L-1}$, with $j\leq i\leq j+L-1$. There are not many possible subarrays for each $(i,L)$ pair. For each one, you know you have to add $a_i\times L$. And this subarray will appear in all choices of $B$ which split $a_1\dots a_{j-1}$ and $a_{j+L}\dots a_n$ into different sets of subarrays. These choices are easy to count (smells like combinations, where you choose where you put the "separations" between subarrays).

I hope this is clear enough.

Regarding the number of ways an array $A$ can be split in arrays $B$, consider the possible separations. There are $n-1$ possible separations (between $a_i$ and $a_{i+1}$ for all $i$). And any subset of these separations is valid. Hence there are $2^{n-1}$ ways to split an array $A$ into arrays $B$.

It's also the number of possibilities for each remaining part in the algorithm above. That is, $a_1\dots a_{j-1}$ of length $j-1$ (thus $2^{j-2}$ cases, but take care of the limit case $j=1$), and $a_{j+L}\dots a_n$ of length $n-j-L+1$ (hence $2^{n-j-L}$ cases). The total would be the product, $2^{j-2+n-j-L}=2^{N-L-2}$.

This does not even depend on $j$, but it could be guessed: when you remove a central array of length $L$ from an array of length $n$, there remains $n-L$ cells, and there is already a separation (the whole subarray that was removed necessarily cuts the remaining, as they have to be contiguous). So, instead of $2^{n-L-1}$, it's $2^{n-L-2}$.

There is however an adjustment to do for the limit cases where the subarray of length $L$ is at the beginning or the end of $A$ (or when $n=L$), but it's the idea.

Closed-form

Let $u(n,k)$ be the coefficient of the $k$-th element of an array $A$ of length $n$ (numbering starting at $1$, so $1\leq k\leq n$. This is the coefficient in the last sum, adding all sums for each $B$.

Given the discussion above, and taking the "adjustment" into account, one gets

$$u(n,k)=\sum_{i=1}^k\sum_{j=k}^n (j-i+1) 2^{\max(0,i-2)+\max(0,n-j-1)}$$

Notice that in this formula, $i$ is the index of the beginning of the subarray containing $a_k$, $j$ is the index of its end, and $j-i+1$ is its length. The remaining part (subarray of $A$) on the left accounts for the factor $2^{i-2}$, except when there is nothing left (hence the $\max(0,i-2)$. Same on the right.

It simplifies to the following formula. You will find below in this answer a direct proof of the last formula $2^n+2^{n-1}-2^{n-k}-2^{k-1}$

$$u(n,k)=2^n-1+\sum_{i=0}^{k-2} (2^{n-i-2}-2^i)$$

Notice that for $k=1$, the sum is zero and there is only the term $2^n-1$.

Then it's easy:

$$u(n,k)=2^n+2^{n-1}-2^{n-k}-2^{k-1}$$

You can check that $u(n,k)=u(n,n+1-k)$, as expected (the coefficient are "symmetrical").

The final sum requested by the task is then simply

$$\sum_{k=1}^n u(n,k) a_k$$

Implementation: you are computing everything modulo $M=10^9+7$, and $2^k \bmod M$ is easy to compute by the usual binary algorithm for modular exponentiation. But you can do even better by keeping track of some values in the loop, and only updating the coefficient (so it's $O(1)$ per loop). It may help to remark that $500000004\times 2 \mod M=1$. And of course, use the symmetry.

You can also compute the sum only on the last two terms of the formula for $u(n,k)$, since $2^{n}+2^{n-1}=3\cdot2^{n-1}$ is a constant. Then compute a sum of the $a_k$ and multiply by $3\cdot2^{n-1}$ only in the end. For $2^{n-k}+2^{k-1}$, you will still have some work though.

It's tempting to compute $u(n,k)a_k$ using arithmetic shifts, as it's very fast, but this won't work: don't forget you are computing modulo $10^9+7$, and not modulo $2^{32}$.

As a side note, the sum of coefficients is $\sum_{k=1}^n u(n,k)=2^{n-1}(3n-4)+2$. This is A027992 on OEIS.

Example for $n=4$ and $A=(1234)$, let's compute by hand all $u(n,k)$ (actually $k=1,2$ are enough).

$k=1$: then the piece containing $1$ may be $(1), (12), (123)$ or $(1234)$.
- If it's $(1)$, its length is $1$ and the other pieces may be $(234), (23)(4), (2)(34), (2)(3)(4)$. Four cases of length $1$: contribution $1\times4=4$ to the coefficient.
- If it's $(12)$, its length is $2$ and the other pieces may be $(34), (3)(4)$. Two cases of length $2$: contribution $2\times2=4$ to the coefficient.
- If it's $(123)$, its length is $3$, and the other piece is necessarily $(4)$. Contribution $3$, one time.
- If it's $(1234)$, there is nothing left, but still a contribution $4$ (the length of the piece containing $1$), one time.
- Total, for $k=1$: $u(n,k)=4+4+3+4=15$
$k=2$: then the piece containing $1$ may be $(12),(123),(1234),(2),(23),(234)$.
- If it's $(12)$, the other pieces may be $(34), (3)(4)$, contribution $2\times2=4$
- If it's $(123)$, the other piece is $(4)$, contribution $3$
- If it's $(1234)$, this is the only piece, the contribution is $4$
- If it's $(2)$, the other pieces may be $(1)(34), (1)(3)(4)$, contribution $1\times2=2$
- If it's $(23)$, then the other pieces are $(1)(4)$, contribution $2$
- If it's $(234)$, the other piece is $(1)$, contribution $3$
- Total, for $k=2$: $u(n,k)=4+3+4+2+2+3=18$

So, for $n=4$, the coefficients are $15,18,18,15$. These are also the coefficients given by the formula above.

Proof starting from the double sum. It's a bit tricky, and I would really be interested by a direct (combinatorial?) proof of the closed form for $u(n,k)$.

Define, for $1\leq k\leq n$:

$$u(n,k)=\sum_{i=1}^k\sum_{j=k}^n (j-i+1) 2^{\max(0,i-2)+\max(0,n-j-1)}$$ $$v(n,k)=2^n+2^{n-1}-2^{n-k}-2^{k-1}$$

We want to show that $u(n,k)=v(n,k)$.

We will need the sums, for $a\neq1$:

$$\sum_{k=0}^n a^k=\frac{1-a^{n+1}}{1-a}$$

$$\sum_{k=1}^n ka^k= a \frac{\mathrm d}{\mathrm da}\left(1+a+\dots+a^n\right)\\ =a \frac{\mathrm d}{\mathrm da}\left(\frac{1-a^{n+1}}{1-a}\right)\\ =a\frac{1-a^{n+1}}{(1-a)^2}-\frac{(n+1)a^{n+1}}{1-a}$$

First, the case $k=1$. Then the sum on $i$ reduces to only one case, $i=1$:

$$u(n,1)=\sum_{j=1}^n j2^{\max(0,n-j-1)}=n+\sum_{j=1}^{n-1} j2^{n-j-1}\\ =n+2^{n-1}\sum_{j=1}^{n-1} j2^{-j}\\ =n+2^{n-1}\left(2(1-2^{-n})-2n\cdot2^{-n}\right)=2^n-1$$

This agrees with $v(n,1)$.

By a symmetry argument, likewise $u(n,n)=v(n,n)=2^n-1$, but the direct proof would be as easy.

Now, let $1<k<n$. Let's get the corner cases out of the sum to get rid of those "$\max$".

$$u(n,k)=\sum_{j=k}^nj2^{\max(0,n-j-1)}+\sum_{i=2}^k\sum_{j=k}^n(j-i+1) 2^{i-2}2^{\max(0,n-j-1)}\\ =n+\sum_{j=k}^{n-1}j2^{n-j-1}+\sum_{i=2}^k2^{i-2}\left(n-i+1+\sum_{j=k}^{n-1}(j-i+1) 2^{n-j-1}\right)\\ =n+\sum_{j=k}^{n-1}j2^{n-j-1}+(n+1)\sum_{i=2}^k2^{i-2}-\sum_{i=2}^k i2^{i-2} +\sum_{i=2}^k2^{i-2}\sum_{j=k}^{n-1}(j-i+1) 2^{n-j-1}\\ =n+\sum_{j=k}^{n-1}j2^{n-j-1}+(n+1)\sum_{i=2}^k2^{i-2}-\sum_{i=2}^k i2^{i-2} - \left(\sum_{i=2}^k(i-1)2^{i-2}\right)\left(\sum_{j=k}^{n-1}2^{n-j-1}\right) +\left(\sum_{i=2}^k2^{i-2}\right)\left(\sum_{j=k}^{n-1}j 2^{n-j-1}\right) $$

Now simplify, using

$$\sum_{i=2}^k 2^{i-2}=\sum_{i=0}^{k-2} 2^i=2^{k-1}-1$$ $$\sum_{i=2}^k i2^{i-2}=\sum_{i=0}^{k-2} (i+2)2^i=2^{k}-2+\sum_{i=0}^{k-2} i2^i=2^k-2+2(1-2^{k-1})+(k-1)2^{k-1}=(k-1)2^{k-1}$$

Hence also $$\sum_{i=2}^k (i-1)2^{i-2}=(k-1)2^{k-1}-2^{k-1}+1=(k-2)2^{k-1}+1$$

And the sums over $j$: $$\sum_{j=k}^{n-1}2^{n-j-1}=\sum_{j=0}^{n-k-1}2^{n-j-1-k}=2^{n-k-1}\sum_{j=0}^{n-k-1}2^{-j}=2\cdot2^{n-k-1}(1-2^{-n+k})=2^{n-k}-1$$ $$\sum_{j=k}^{n-1}j2^{n-j-1}=\sum_{j=0}^{n-k-1}(j+k)2^{n-j-1-k} =k\sum_{j=0}^{n-k-1}2^{n-j-1-k}+\sum_{j=0}^{n-k-1}j2^{n-j-1-k}\\ =k2^{n-k}-k+2^{n-k-1}\sum_{j=0}^{n-k-1}j2^{-j}\\ =k2^{n-k}-k+2^{n-k-1}(2(1-2^{-n+k})-2(n-k)2^{-n+k})\\ =k2^{n-k}-k+2^{n-k}(1-2^{-n+k}-(n-k)2^{-n+k})\\ =(k+1)2^{n-k}-n-1$$

Finally

$$u(n,k)=n+[(k+1)2^{n-k}-n-1]+(n+1)[2^{k-1}-1]-[(k-1)2^{k-1}]\\-[(k-2)2^{k-1}+1]\cdot[2^{n-k}-1]+[2^{k-1}-1]\cdot[(k+1)2^{n-k}-n-1]$$

$$u(n,k)=(n-n-1-n-1+1+n+1)\\+(-k+2+k+1)2^{n-1}\\+(n+1-k+1+k-2-n-1)2^{k-1}\\+(k+1-1-k-1)2^{n-k}$$

$$u(n,k)=3\cdot2^{n-1}-2^{k-1}-2^{n-k}=2^n+2^{n-1}-2^{n-k}-2^{k-1}=v(n,k)$$

And we are done!

That's exactly what I did . I analyzed these values for some small cases and tried to find the relation/pattern/formula but failed . n = 3 -> (7,8,7) n = 4-> (15,18,18,15) n=5 -> (31,38,40,38,31) The ith value in the brackets denote the number of times Ai would be added to the answer. — Sarvagya, Sep 25 '16 at 01:46
Say N = 5 and L = 2 and the index is 3 . The formula says total would be 2^(5-2-2) = 2 but the total should be 4 . (This is not a corner case ) — Sarvagya, Sep 25 '16 at 03:13
This doesnt work unless there is something wrong with my implementation. — Sarvagya, Sep 25 '16 at 22:37
@Alex I added the detailed computation for $n=4$. I don't quite understand why you write this does not work since I find the same coefficients you wrote above yesterday, for $n=3,4,5$. For which value of $n$ do you differ? It may be something else, but notice my array is indexed from $1$ to $n$, and not from $0$ to $n-1$ as in $C$-like programming languages. — Jean-Claude Arbaut, Sep 25 '16 at 23:24
@prakharsingh95 Thanks for the notification. I can't check (apparently one must participate to see the questions?), but I have contacted the HackerRank team to report this. I'll see if they request the removal of the answer, but I'll take no action before they react, since somebody already got undeserved help in this contest. I don't like this. — Jean-Claude Arbaut, Sep 26 '16 at 10:18
@Jean-ClaudeArbaut : i did the work for a similar question on another MSE page. I don't like this too ... — , Sep 27 '16 at 01:26

Sum over all contiguous partitions of an array

1 Answers1

Linked