0

This is probably a pretty standard statistical/graph analysis requirement but I can't figure out the terminology I need to find what I'm looking for on Google.

I am making a very large number of function calls, and let's say each of those function calls is an item in a set called A. These function calls are expensive and ideally I would like to make as few of them as possible.

The function calls each return one or more results, and let's say each of those results is an item in a set called B. There are many more results than function calls, and each result could be returned by many of the function calls.

Once I have finished making all of the possible function calls for the first time, I would like to determine the least number of function calls in A I could have made and still got all of the results in B, so that next time I run the program I can do so more efficiently.

What algorithm, or type of algorithm, am I looking for here?

1 Answers1

0

Note: Funny, I can nearly recycle my last answer, to a seemingly unrelated problem.

You can model the decision which functions to call as vector $$ x = \{ 0, 1 \}^n $$ where $n = \lvert A \rvert$ is the number of functions in $A$ and $x_i$ models if we call the $i$-th function in $A$ or not.

Each function will provide a subset of results from $B$, which we can model as vector $$ f_i \in \{ 0, 1 \}^m $$ where $m = \lvert B \rvert$.

The number of called functions is $$ c^\top x $$ where $c = (1, \dotsc, 1)^\top \in \{0, 1 \}^n$.

For $A = (f_1, \dotsc, f_n) \in \{0, 1 \}^{m \times n}$ the expression $$ A x $$ will yield a vector from $\mathbb{Z}^m$, where the $k$-th component contains how many times the $k$-th result from $B$ was calculated. We need at least one result each, so our constraint is $$ A x \ge b $$ with $b = (1, \dotsc, 1)^\top \in \mathbb{Z}^m$. (Yes, $b = c$ for this problem, but I keep them separate to stick to the usual form)

This gives the integer linear program $$ \min_{x \in \{0,1\}^n} \left\{ c^\top x \mid A x \ge b \right\} $$ which can be solved with standard software like lpsolve.

mvw
  • 34,562
  • Thank you very much for this very interesting answer. I am writing a program in Python and I've seen that there is a Python module available for lpsolve, so I'm looking forward to experimenting with that, and hopefully finding some sort of pattern to my problem so that I might be able to solve it without brute-forcing it in future. Thanks again! – utterly confuzzled Oct 18 '16 at 07:39
  • I usually try out lpsolve from R (http://math.stackexchange.com/a/1727713/86776). Your Python bindings will work in a similar way. You should try some small problem instances. No idea when the dimension is too large to be solved in reasonable time. – mvw Oct 18 '16 at 08:45