With this change, the solver considers only the part of the tree required
to calculate the solution during install; it no longer uses World() as
the search space.
The search space is now narrowed down to the packages related to the one
being considered.
This set of changes also optimizes the Parallel solver by removing a
useless loop.
This change boosts overall performance on large datasets whose relations
do not necessarily touch the whole tree.
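
The narrowing described above can be sketched roughly as a reachability walk over package relations. This is only an illustration, not the solver's actual code: the `Package` type, its `Requires`/`Conflicts` fields, and the `relatedSubset` helper are hypothetical names, and the sketch follows only outgoing relations from the target package.

```go
package main

import (
	"fmt"
	"sort"
)

// Package is a hypothetical, simplified package record: a name plus the
// names of the packages it requires or conflicts with.
type Package struct {
	Name      string
	Requires  []string
	Conflicts []string
}

// relatedSubset returns the sorted names of all packages reachable from
// target through requires/conflicts relations, including target itself.
// Everything else in world is left out of the search space.
func relatedSubset(world map[string]Package, target string) []string {
	seen := map[string]bool{target: true}
	queue := []string{target}
	for len(queue) > 0 {
		name := queue[0]
		queue = queue[1:]
		pkg, ok := world[name]
		if !ok {
			continue // referenced by a relation but not present in world
		}
		relations := append(append([]string{}, pkg.Requires...), pkg.Conflicts...)
		for _, rel := range relations {
			if !seen[rel] {
				seen[rel] = true
				queue = append(queue, rel)
			}
		}
	}
	out := make([]string, 0, len(seen))
	for name := range seen {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	world := map[string]Package{
		"A": {Name: "A", Requires: []string{"B"}},
		"B": {Name: "B", Conflicts: []string{"C"}},
		"C": {Name: "C"},
		"D": {Name: "D", Requires: []string{"E"}},
		"E": {Name: "E"},
	}
	// D and E have no relation to A, so they are not part of the subset.
	fmt.Println(relatedSubset(world, "A")) // prints [A B C]
}
```

In a dataset where most packages are unrelated to the one being installed, a subset like this stays small no matter how large the whole tree is.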
Keep a record of the observed delta and maximize the reward for it.
Also add Noop actions, which are turned off by default.
Let the execution finish even when no solution is found, as we will take the
minimum observed delta as the result.
This is done on purpose to avoid guessing "when" is a good time to stop the agent,
as it could be in the middle of picking a new action which is not the final one
(but we still need limits: we can't let it run forever).
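
A minimal sketch of the "run to the limit, keep the minimum observed delta" idea follows. It is not the agent's real implementation: the `observation` type, the `bestWithinLimit` helper, and the `maxAttempts` parameter are hypothetical, and the delta is treated as an opaque cost where lower is better.

```go
package main

import "fmt"

// observation is a hypothetical record of one agent step: the action
// taken and the delta observed after applying it.
type observation struct {
	Action string
	Delta  int
}

// bestWithinLimit calls step exactly maxAttempts times and returns the
// observation with the smallest delta. It never tries to guess a good
// moment to stop early; maxAttempts is the only bound on the run.
// The boolean is false when no step was executed at all.
func bestWithinLimit(step func(i int) observation, maxAttempts int) (observation, bool) {
	var best observation
	found := false
	for i := 0; i < maxAttempts; i++ {
		obs := step(i)
		if !found || obs.Delta < best.Delta {
			best, found = obs, true
		}
	}
	return best, found
}

func main() {
	// Made-up deltas standing in for what an agent might observe.
	deltas := []int{5, 3, 4, 2, 6}
	step := func(i int) observation {
		return observation{Action: fmt.Sprintf("action-%d", i), Delta: deltas[i]}
	}
	best, ok := bestWithinLimit(step, len(deltas))
	fmt.Println(best.Action, best.Delta, ok) // prints action-3 2 true
}
```

Even though the last step here is worse than an earlier one, the recorded minimum is what gets returned, so the run still yields a usable result when no exact solution appears.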