Knapsack Problem – Dynamic Programming

Dynamic Programming

Dynamic Programming is one very useful technique for solving problems like the Knapsack problem.

The basic idea of Dynamic programming is to break down a problem in recursive subproblems. Each subproblem must be easy to solve, and its solution stored in some data structure to be used again.

In a schematic drawing, it’s something like this:

 

DynamicScheme.jpg

Dynamic programming is not a general procedure to be applied in any problem. The problem has to allow it. Richard Bellman, the creator of Dynamic Programming, called it  “Principle of optimallity”. It is like saying “this method works in the problems it works, and doesn’t works in the problems it doesn’t works”. But mathematicians love this kind of concept, because they can ask: which problems has the principle of optimality? Is the Knapsack one of them?

Since this method is not general, the programmer has to design a new method for every different problem and also figure out if the problem has the principle of optimality.

But, despite all these observations, it is a useful way of thinking.


 

The Algorithm

The detailed algorithm can be found in Wikipedia  (https://en.wikipedia.org/wiki/Dynamic_programming). I’ll sketch here the main ideias.

Suppose the problem is the same of previous post.
Max Weight = 14.
List of itens:
ValueWeight

1) It is easy to compute the solution with 1 item for a range of weights.

Array of values
Dyn01.JPG
Item 1, with weight 8 and value 10,000. If we vary the max weight of the knapsack from 0 to MaxW, the value array will be Zero before we have max weight = 8, and 10,000 after it.

 Array of picked itens:

Dyn02.JPG
“i1” refers to item 1

 


2) We increase the number of itens, and compare current solution versus solution with new item.
Step2_details
With items 1 and 2: if the max weight is 6, we pick item 2. When max weight is 8, we compare solution with weight 8 from item 1 (10,000) against solution with item 1 and weight 2 plus item 2 (0) and weight 6 (3,000). It is better to keep the 10,000 value solution.
Step2_details2.JPG
When maxWeight is 14, we compare current solution with item 1 (10,000) versus solution with item 1 and weight 8 (10,000) plus item 2: 13,000
Complete array with 2 items.
Step2.JPG
Evaluation arrays with itens 1 and 2
Proceeding this way, we get the full evaluation matrix.
Step5.JPG
Evaluation array with 5 items.
The optimum is to pick itens 1, 3 and 4, and the value of those items is 14,000.
Note that, in order to get the optimum, I had to calculate every solution for 1 to number of itens and 1 to number weigths. Thus, the computational effort to calculate it is proportional to N*maxW.
The problem happens when N and W are large. It needs a lot of computation processing.
Knapsack is a pseudo polynomial time algorithm. Polynomial because it is proportional to N. But pseudo since it is not quite true it is proportional to N – it is also proportional to W, and W can be very very large. One day I’ll post a more detailed way of thinking of this.

Divide and conquer
Other question I had when studying dynamic programming for the first time was: what is the difference of this method compared to divide and conquer methods?
I think this discussion deserves a post apart, but just to give a simple explanation, divide and conquer has a “horizontal” structure, with subproblems independent of each other, while dynamic programming has a “nested” structure, with subproblems dependent of other subproblems.

Computation implementation
An Excel- VBA implementation of dynamic programming applied to the knapsack problem can be found at https://github.com/asgunzi/KnapsackDynamicProgrammingVBA.
It uses the Random case generator of previous post and generates a solution for this case.
ScreenDynProg.JPG

Other Applications
Dynamic programming can be applied in several other cases. Cutting stock problems are also a classic application.
One less known application is in the optimization of forest multiproducts. We have a forest of eucalyptus. Each tree has to be cut in several pieces. Each piece has a different diameter (since the tree is larger in the base). A product is a combination of diameter and lenght. Given a demand, how can I minimize the waste and maximize the use of the forest?

 

In the next posts, I’ll show alternative methods for solving the Knapsack Problem.

 

Arnaldo Gunzi.

 

 

Human centered analytics

Engineers, computer scientists and technicians alike have affinity with mathematics, numbers, formulas. They are the kind of people who prefer to know How it works than What it delivers. They are the ones who disassemble an old watch and try to fix it.
5920883-disassembled-wrist-watch-lies-on-table-Stock-Photo.jpg
We try to understand why it works. Normal people just wear it.
One of my common mistakes in my first years in the corporate world was that I faced it like it was a college examination: I tried to explain to the executives the hypothesis, the developments, the formulas, and expected then to ask about some technical stuff, like why I used a Chi-Squared distribution instead of a Normal one, as if they were academic teachers.

Surely it was not the case. In the corporate world, and in the real world in general, people want answers. Just it, answers. And they want you to explain the solution in simple, easy-to-follow and reasonable steps. They are not stupid and of course they don’t want to be treated like that. They will check the big numbers and look for contradictions. And they want to know, in human words, why is your solution the correct one. You have to tell a history. It is your task to explain what you mean in simple ways, and to defend your position using the same domain of words (not formulas).

shakespeare-words.jpg
We need to tell histories with words, like Shakespeare. Not formulas.

 


The optimal solution for the wrong problem
In the same line of thinking, I spent a lot of years in my quest to always find the optimal solution with the best up-to-date method for analytical problems.
Wrong. Completely wrong.
An optimal solution for the wrong problem solves nothing.
FindX.png
People in the real world don’t need the optimal solution. They need a solution that works effectively. And the only way to find an effective solution is in the Questions, not in the method of resolution of the model. The problem is in the formulation of the problem.
If you want to solve a real world problem, it will not come with a formulation “Use integer programming to solve this set of equations”. It will come in diffuse information pieces, with dozens of people telling different points of will, with spurious sparse databases. And you will have to formulate the questions.
The model can be as simple as a single linear equation, as long as it provides the effective solution.

Challenges of Analytics
I see a lot of excitement and buzz words: Big Data, Artificial intelligence, Watson, R-Phyton or whatever.
bigdata.jpg

These are only tools. They don’t solve problems alone. They will not solve all the problems of the world.

In a couple of years, these buzz words will fade, some will be forgotten, and new buzz names will appear, promising the same things.

 

For me, better than trying to provide the magical solution, is to make the basics. Simple and effective solutions. And these solutions must:

  • Provide information, guidance, clarity
  • Support decision
  • Empower analysis
That is, provide ways to make questions.
Arnaldo Gunzi
Feb 2016
“The release of atom power has changed everything except our way of thinking. The solution to this problem lies in the heart of mankind. If only I had known, I would have become a watchmaker”
Albert Einstein