This is a very easy example, inspired on Forest Planning.

Suppose we have two paper mills, ABC01 and ABC02. Each has a demand of wood in a given period.

There are stands (areas of forest) that will be harvested. Each has a volume of wood. Let the index of stands be i.

The distance between each stand (i) and paper mill (j) is given.

How to supply the paper mills (supply >= demand), with the objective function of minimization of the Distance?

General rules

Because it is a linear problem, we can use only linear functions like “sumproduct” and “sum”. “Sumif” or “vlookup” are non linear.

There are a lot of advantages in modelling as a linear problem. We can use all the power of linear solvers like CBC. We have the guarantee of an optimal solution. Besides, the solution time will be faster than in the non linear case.

The model is very similar to solver.

Solution

We define a binary variable, aloc(i,j).

The volume for each destination is aloc(i,j)*volume(i)

It must be greater than demand(j)
each j, sum(i, aloc(i,j)*volume(i)) >= demand(j)

There is a need to impose a structural constraint: each stand must be destined to only one paper mill.
each i, sum(j, aloc(i,j)) <=1 (note isn’t an obligation to harvest the stand)

The objective function is minimize aloc(i,j)*distance(i,j)

Once the Excel file is ready, just set the Model in opensolver.

Set the objective function, each constraint and save the model.

Press solve, and it will give the solution.

OpenSolver shows the formulation, with the colored boxes. The “Show/Hide Model” makes it easy to hide the formulation.

It’s very easy, and similar to Solver. And also powerful, as we saw in Introduction.

Perhaps the easiest and most famous tool for linear programming is Microsoft Excel Solver. It is the tool people know from college, or some friend talked about, along with the infamous and magical “goal seek”.

Excel solver is quite intuitive, easy and ubiquitous (since Excel is ubiquitous). Solver is developed by a company named Frontline (http://www.solver.com/), not by Microsoft.

I am a great fan of Excel, but I hate Solver. Mainly because Frontline solver has a limitation of 200 variables.

Try to solve more than 200 variables, and the following message will appear:

It is very very frustrating to discover this limitation after you spent some hours modelling the problem.

Frontline provides more powerful versions of solver, but it is not free. If you are a company, that’s ok, just buy it. But if you are a student, it is not a trivial value. Luckly, there are alternatives.

OpenSolver

The OpenSolver tool can help a lot in these cases. It is a Great (upper case Great) initiative by Andrew Mason, University of Auckland, New Zealand. It has a clean, easy interface, and calls a free source solver like CBC (another Great initiative by Coin-OR – http://www.coin-or.org/).

Because the solver is CBC, we don’t have limitation on number of variables, constraints or whatever.

And because OpenSolver is open source, it is easy to learn a lot from the VBA code and also help the development of it. The limit is our imagination.

Dynamic Programming is one very useful technique for solving problems like the Knapsack problem.

The basic idea of Dynamic programming is to break down a problem in recursive subproblems. Each subproblem must be easy to solve, and its solution stored in some data structure to be used again.

In a schematic drawing, it’s something like this:

Dynamic programming is not a general procedure to be applied in any problem. The problem has to allow it. Richard Bellman, the creator of Dynamic Programming, called it “Principle of optimallity”. It is like saying “this method works in the problems it works, and doesn’t works in the problems it doesn’t works”. But mathematicians love this kind of concept, because they can ask: which problems has the principle of optimality? Is the Knapsack one of them?

Since this method is not general, the programmer has to design a new method for every different problem and also figure out if the problem has the principle of optimality.

But, despite all these observations, it is a useful way of thinking.

1) It is easy to compute the solution with 1 item for a range of weights.

Array of values

Array of picked itens:

2) We increase the number of itens, and compare current solution versus solution with new item.

Complete array with 2 items.

Proceeding this way, we get the full evaluation matrix.

The optimum is to pick itens 1, 3 and 4, and the value of those items is 14,000.

Note that, in order to get the optimum, I had to calculate every solution for 1 to number of itens and 1 to number weigths. Thus, the computational effort to calculate it is proportional to N*maxW.

The problem happens when N and W are large. It needs a lot of computation processing.

Knapsack is a pseudo polynomial time algorithm. Polynomial because it is proportional to N. But pseudo since it is not quite true it is proportional to N – it is also proportional to W, and W can be very very large. One day I’ll post a more detailed way of thinking of this.

Divide and conquer

Other question I had when studying dynamic programming for the first time was: what is the difference of this method compared to divide and conquer methods?

I think this discussion deserves a post apart, but just to give a simple explanation, divide and conquer has a “horizontal” structure, with subproblems independent of each other, while dynamic programming has a “nested” structure, with subproblems dependent of other subproblems.

It uses the Random case generator of previous post and generates a solution for this case.

Other Applications

Dynamic programming can be applied in several other cases. Cutting stock problems are also a classic application.

One less known application is in the optimization of forest multiproducts. We have a forest of eucalyptus. Each tree has to be cut in several pieces. Each piece has a different diameter (since the tree is larger in the base). A product is a combination of diameter and lenght. Given a demand, how can I minimize the waste and maximize the use of the forest?

In the next posts, I’ll show alternative methods for solving the Knapsack Problem.

Imagine that you are going to a trip, and everything you have just a knapsack to carry on everything.

The maximum weight you can carry is limited, say 14 kg.

There are a list of itens you’re considering to choose from. Each item has a weight and a value.

Which of the itens do I choose, to maximize the value I can carry on my backpack?

The “Human method”

The first method described is the “human method”. How you and me would choose the itens.

I would choose the itens with greatest value for me, one by one, till the maximum weight.

In this example, I would choose itens 1 and 2, whose sum of values is 13.000 and sum of weights is 14. It is the “greedy” method, heuristically simple but can be suboptimal.

In the end, I would take a look on my selection, and try to find if I can replace one of the chosen itens for two itens not choosed with greater value.

In this example, a simple inspection shows that the item 2 is too heavy for its benefits. It is better to replace it for itens 3 and 4.

The final selection contains itens {1, 3, 4}, with total value 14.000 (10.000 + 2000 + 2000) and weight 13 (8 +2+3).

The problem is when the list is very large, with tens or hundreds of itens. It is several timer harder to figure out a good solution.

The human method is very important. It is suboptimal, but it is fast. Humans are good problem solvers. We use intuition and millions of years of understanding of nature, to solve this kind of problem. Human reasoning can be a source for fast approximation algorithms.

Even when the problem is large, humans can solve in a heuristically interesting way. The world is also Pareto, what means that 20% of itens will have 80% of the value. In a intuitive way, I would focus my attention in those 20% and apply the above described method.

In real world application problems, I always respect a lot the people who make the analysis and try to someway incorporate some of theirs thinking in my models – a human centered analytics.

Mathematical formulation

The formulation of the knapsack problem via integer programming is straightforward.

If xi is the decision variable, binary (1 is to carry the item, 0 is to not carry) then

In the next posts I’ll show how to solve it with computational methods, as dynamic programming, integer programming using solver and evolutive metaheuristics.