Abstract: Policy iteration (PI), an iterative method in reinforcement learning, has the merit of learning a decision law by interacting with a little-known environment through policy evaluation and ...
Abstract: Iterative learning control (ILC) applies to systems that perform the same finite-duration task repeatedly, where each repetition is termed a trial and the finite duration is termed the trial ...
We introduce iterative retrieval, a novel framework that empowers retrievers to make iterative decisions through policy optimization. Finding an optimal portfolio of retrieved items is a combinatorial ...