Reward by observedDelta

Keep a record of the observed delta and maximize the reward for it. Also add Noop actions, which are turned off by default. Let the execution finish even when no solution is found, as we will take the minimum observed delta as the result. This is done on purpose to avoid guessing when it is a good time to stop the agent, as it could be in the middle of picking a new action that is not the final one (we still need limits, though: we can't let it run forever).
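
The selection rule described above can be sketched roughly as follows. This is only an illustration, not the code from this commit: the `observation` type, `reward`, and `bestObserved` are hypothetical names, and the delta is assumed here to be the number of packages dropped from the request.

```go
package main

import "fmt"

// observation is a hypothetical record of one agent step.
type observation struct {
	Dropped []string // packages removed from the request (assumption)
	Solved  bool     // whether the remaining request was satisfiable
}

// delta is the assumed cost of an observation: how many packages were dropped.
func (o observation) delta() int { return len(o.Dropped) }

// reward grows as the delta shrinks; unsolved states earn nothing.
func reward(o observation, requested int) int {
	if !o.Solved {
		return 0
	}
	return requested - o.delta()
}

// bestObserved walks at most maxAttempts observations without stopping at the
// first solved one, and returns the solved observation with the smallest delta.
func bestObserved(obs []observation, maxAttempts int) (observation, bool) {
	var best observation
	found := false
	for i, o := range obs {
		if i >= maxAttempts {
			break // hard limit: the agent cannot run forever
		}
		if !o.Solved {
			continue
		}
		if !found || o.delta() < best.delta() {
			best, found = o, true
		}
	}
	return best, found
}

func main() {
	obs := []observation{
		{Dropped: []string{"A", "E", "D"}, Solved: true},
		{Dropped: []string{"A"}, Solved: false},
		{Dropped: []string{"A", "E"}, Solved: true},
	}
	best, ok := bestObserved(obs, 10)
	fmt.Println(best.Dropped, ok, reward(best, 4))
}
```

Because the loop never returns early, a later and better observation can still replace an earlier one; the attempt limit is the only stopping condition.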
2025-09-02 07:45:02 +00:00 · 2020-02-11 14:52:24 +01:00
parent 7ce522110e
commit 711c039296
2 changed files with 121 additions and 67 deletions
--- a/pkg/solver/resolver_test.go
+++ b/pkg/solver/resolver_test.go
@@ -112,12 +112,12 @@ var _ = Describe("Resolver", func() {
 				solution, err := s.Install([]pkg.Package{A, D})
 				Expect(err).ToNot(HaveOccurred())

-				Expect(len(solution)).To(Equal(4))
-
 				Expect(solution).To(ContainElement(PackageAssert{Package: A, Value: false}))
 				Expect(solution).To(ContainElement(PackageAssert{Package: B, Value: false}))
 				Expect(solution).To(ContainElement(PackageAssert{Package: C, Value: true}))
 				Expect(solution).To(ContainElement(PackageAssert{Package: D, Value: true}))
+
+				Expect(len(solution)).To(Equal(4))
 			})

 			It("will find out that we can install D and F by ignoring E and A", func() {
@@ -142,14 +142,14 @@ var _ = Describe("Resolver", func() {
 				solution, err := s.Install([]pkg.Package{A, D, E, F}) // D and F should go as they have no deps. A/E should be filtered by QLearn
 				Expect(err).ToNot(HaveOccurred())

-				Expect(len(solution)).To(Equal(6))
-
 				Expect(solution).To(ContainElement(PackageAssert{Package: A, Value: false}))
 				Expect(solution).To(ContainElement(PackageAssert{Package: B, Value: false}))
 				Expect(solution).To(ContainElement(PackageAssert{Package: C, Value: true})) // Was already installed
 				Expect(solution).To(ContainElement(PackageAssert{Package: D, Value: true}))
 				Expect(solution).To(ContainElement(PackageAssert{Package: E, Value: false}))
 				Expect(solution).To(ContainElement(PackageAssert{Package: F, Value: true}))
+				Expect(len(solution)).To(Equal(6))
+
 			})
 		})