Unsupervised state embedding and aggregation towards scalable reinforcement learning