Let's run CartPole right away.
CartPole.py
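import gym

# Create the CartPole environment
env = gym.make('CartPole-v0')
for i_episode in range(20):
    # Start a new episode and get the initial observation
    observation = env.reset()
    for t in range(100):
        env.render()
        # Pick a random action from the action space
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
# Close the environment and its render window
env.close()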
python ./CartPole.py
If it runs without errors, the setup is fine. CartPole is a standard introductory environment and is already explained in many places, so I will skip the explanation here.
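One thing to watch for: the script above uses the older gym interface, where env.reset() returns only the observation and env.step() returns four values. Newer gym releases (0.26 and later) return (observation, info) from reset() and five values from step(), so the unpacking in the script can fail with a "too many values to unpack" error. The following is a minimal sketch of two helpers that accept either form. The names unpack_reset and unpack_step are my own and are not part of gym.

```python
def unpack_reset(result):
    """Return only the observation from env.reset(), for either API style.

    Hypothetical helper, not part of gym.
    """
    # Newer API: reset() returns (observation, info) where info is a dict.
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0]
    # Older API: reset() returns the observation directly.
    return result


def unpack_step(result):
    """Normalize env.step() output to (observation, reward, done, info).

    Hypothetical helper, not part of gym.
    """
    if len(result) == 5:
        # Newer API: (observation, reward, terminated, truncated, info).
        observation, reward, terminated, truncated, info = result
        # The episode is over if it either terminated or was truncated.
        return observation, reward, bool(terminated or truncated), info
    # Older API: (observation, reward, done, info).
    observation, reward, done, info = result
    return observation, reward, bool(done), info
```

In the script, these would be used as observation = unpack_reset(env.reset()) and observation, reward, done, info = unpack_step(env.step(action)). The rest of the loop stays the same.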