We prove that the combination of a target network and over-parameterized linear function
approximation establishes a weaker convergence condition for bootstrapped value …