It creates a cycle, and we don't want to output anything from the computation process.
Output should be handled in separate stages.
Also create a constructor for the solver so that it can consume resolvers.
Keep a record of the observed delta and maximize reward for it.
Also add Noop actions, which are turned off by default.
Let the execution finish even when no solution is found, as we will take the
minimum observed delta as the result.
This is done on purpose to avoid guessing when it is a good time to stop the agent,
as it could be in the middle of picking up a new action which is not the final one
(but we need limits; we can't let it run forever).
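
A minimal sketch of the idea above, not the actual implementation: run the agent for
a bounded number of attempts, record the delta observed after each one, and return the
attempt with the smallest delta even if none of them is a full solution. The names
attempt, bestAttempt, and maxAttempts are hypothetical, and treating a delta of zero
as "solved" is an assumption made for illustration.

```go
package main

import "fmt"

// attempt is a hypothetical record of one agent episode: the actions it
// tried and the delta observed once the episode ended.
type attempt struct {
	actions []string
	delta   int
}

// bestAttempt calls step at most maxAttempts times and returns the attempt
// with the smallest observed delta. The second return value is false only
// when no attempt was made at all (maxAttempts <= 0).
func bestAttempt(maxAttempts int, step func(i int) attempt) (attempt, bool) {
	var best attempt
	found := false
	for i := 0; i < maxAttempts; i++ {
		a := step(i)
		if !found || a.delta < best.delta {
			best, found = a, true
		}
		// Assumption: a delta of zero means a complete solution, so
		// there is nothing left to improve.
		if best.delta == 0 {
			break
		}
	}
	return best, found
}

func main() {
	// Fixed deltas stand in for what a real agent would observe.
	deltas := []int{5, 3, 4, 2, 6}
	best, ok := bestAttempt(len(deltas), func(i int) attempt {
		return attempt{
			actions: []string{fmt.Sprintf("action-%d", i)},
			delta:   deltas[i],
		}
	})
	fmt.Println(ok, best.delta, best.actions) // prints: true 2 [action-3]
}
```

The hard cap on attempts is the "limit" mentioned above: the loop never tries to guess
whether the agent is about to improve, it only stops on the cap or on a full solution.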
Having the same variable in the AND block seems to make gophersat crash. Even if it might be suboptimal,
we need this to tighten the conditions between packages.
Switch to gophersat fork until this fix is merged upstream:
https://github.com/crillab/gophersat/pull/17