4

ladies and gentlemen.

I am not a mathematician, so go easy on me. I'm a programmer.

I have a database that I got from the Internet (USDA National Nutrient Database for Standard Reference), detailing the amount of each nutrient in each of a few thousand foodstuffs. I wanted to write a program that would be able to create a maximally nutritious meal based on this data.

For each nutrient, I have a target and two penalties - one for going over and one for going under the target (since, for example, it's a lot worse to get too much saturated fat than not enough). The goal is to minimize the sum of the penalties.

The meal can select from all the thousands of foodstuffs, but can only contain five or six.

I wrote the program in Java, implemented a genetic algorithm, specified my requirements, and let it run. It produced recommendations that were pure poison, and didn't seem to improve with time.

Maybe I just don't get genetic algorithms? Let's see what I did...

1) Create a population of randomly generated meals.
2) Normalize each one so it has 2000 calories, by multiplying the amount of each foodstuff proportionally.
3) Select the best 10% of meals to be parents.
4) Create a new generation - a few random to avoid local minima, the rest created by combining the numbers and amounts from the parents.
5) GOTO 2.

What other algorithm can I try? Someone advised me to use simplex algorithm, but I can't seem to explain to it (the implementation in Apache Commons Math) what my fitness function is. But he claimed it would be a natural fit, and I have even heard of someone who used simplex for exactly this.

Tiiba
  • 49
  • As a fellow programmer, I laughed at the following: "I'm a programmer...(snip)...I wanted to write a program that would be able to create a maximally nutritious meal based on this data." Sounds about right ;) – Jesse Smith Jul 11 '13 at 21:26
  • Vote to close. I cannot see a clear question with a clear answer. Even if a question was explicitly stated, insufficient detail is given to enable a concrete answer. – Fly by Night Jul 11 '13 at 21:31
  • Sounds like you're optimizing an underconstrained problem on a trillion-dimensional space. I'm not exaggerating: there are over eight trillion ways to choose five foodstuffs out of a thousand. –  Jul 11 '13 at 21:32
  • @FlybyNight has nothing to do with pure programming - typical linear programming problem, exactly for this forum... – gt6989b Jul 11 '13 at 21:34
  • @gt6989b Point taken. Comment edited appropriately. The other major problems remain. – Fly by Night Jul 11 '13 at 21:36
  • I have thought of doing something similar, but I haven't found the data. Can you give a link? – Peter Taylor Jul 11 '13 at 21:39
  • @Tiiba Share the results? – Git Gud Jul 11 '13 at 21:40
  • Maybe try Simulated Annealing as an alternative optimization method, if GA does not work for you. – Lucozade Jul 11 '13 at 21:43
  • 1
    But will it taste? – Hagen von Eitzen Jul 11 '13 at 21:49
  • Historical note: optimizing meals based on nutrients was one of the first examples of a problem solved by linear programming. – Gerry Myerson Jul 12 '13 at 10:54

1 Answers1

3

Simplex is indeed a natural fit, so would be any other algorithm for solving linear programming problems. I will try to detail how you would formulate the problem.

You have foods $f_i$ and nutrients $n_j$ and a matrix $a_{ij}$ of how much of nutrient $n_j$ is in one serving of the $f_j$. Let your variables be $x_i \geq 0$ (how much for each foodstuff) and requirements be $r_i$ (also for each foodstuff).

Now you require $A\vec{x} = \vec{r}$ if you want the exact solution and to minimize the number of ingredients perhaps minimize $\sum x_i$.

Another approach is to define $\vec{y} = A\vec{x}$ and optimizing $\sum |y_i-r_i|$. To use the penalties you are suggesting is a little tricky (but possible), then optimizing $\sum_i P_i (y_i-r_i) + \sum_i p_i(r_i-y_i)$ where $\vec{P}, \vec{p}$ are penalties for overhitting/underhitting the target $r_i$ and the first sum is over those $i$ which overhit, and the second over those that underhit.

EDIT To make the last step slightly formal, introduce variables $o_i$ (1 if index $i$ overhit and $y_i>r_i$ and 0 otherwise) and $u_i$ (1 if $y_i < r_i$ and 0 otherwise). Then, your optimization function will be $$ \sum_{i=1}^N o_i P_i (y_i-r_i) + \sum_{i=1}^N u_i p_i(r_i-y_i) $$ and the problem reduces to how to enforce the variables $o_i$ and $u_i$...

gt6989b
  • 54,422
  • 1
    I was going to recommend the same thing... Just purchased a Linear Programming book, and diet planning was the first applied example they gave. – A.E Jul 11 '13 at 21:37
  • Yes, it's a classic. But the penalties make it a little tricky... – gt6989b Jul 11 '13 at 21:39
  • 1
    @gt6989b, the programmer was asking to go easy and this is what you have cooked for him :) – imranfat Jul 11 '13 at 22:47