Value Smoothing via Latent Embedding Similarity
Designed an experiment that modifies the reward structure of reinforcement learning algorithms to improve learning in environments with sparse rewards.
The approach uses similarity in embedding space to teach a model that an output receiving a negative reward can still be "almost correct" — smoothing the value landscape around near-correct states.
Experiments showed the algorithm outperforming standard approaches in sparse-reward environments.
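The original implementation isn't shown, but the core idea can be sketched as a reward-shaping function: when the environment gives no positive signal, add a bonus proportional to how close the agent's output lies to a target in embedding space. This is a minimal illustration assuming cosine similarity as the distance metric and a hypothetical `beta` scaling coefficient — both are assumptions, not details from the writeup.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def shaped_reward(env_reward: float,
                  output_embedding: np.ndarray,
                  target_embedding: np.ndarray,
                  beta: float = 0.5) -> float:
    """Smooth a sparse reward signal.

    If the environment already rewarded the agent, pass that through
    unchanged; otherwise add a partial reward proportional to the
    embedding-space similarity between the output and the target,
    so near-correct outputs score above clearly wrong ones.
    """
    if env_reward > 0:
        return env_reward  # keep true successes untouched
    similarity = cosine_similarity(output_embedding, target_embedding)
    return env_reward + beta * similarity

# Toy usage: a near-miss output earns a larger shaped reward than a far miss.
target = np.array([1.0, 0.0, 0.0])
near_miss = np.array([0.9, 0.1, 0.0])
far_miss = np.array([0.0, 0.0, 1.0])
print(shaped_reward(0.0, near_miss, target) > shaped_reward(0.0, far_miss, target))
```

In this sketch the bonus only fires on non-positive rewards, so the shaping fills in the flat regions of a sparse reward landscape without altering genuine successes.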
