Training a Tetris TAMER agent

Before training (random behavior):

 

During training:

A green flash indicates positive reward from the human trainer. A red flash indicates negative reward.
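As a rough illustration of how such feedback can drive learning, here is a minimal sketch of the general TAMER idea: the agent fits a model of the trainer's reward and acts greedily on that model. It is not the code behind these videos. The class name `TamerSketch`, the linear model, the feature vectors, and the step size are all assumptions made for illustration, and the sketch leaves out the handling of delays in human feedback.

```python
import random


class TamerSketch:
    """Hypothetical linear model of human reward: H(s, a) ~ weights[a] . features(s)."""

    def __init__(self, n_features, n_actions, step_size=0.1):
        # One weight vector per action; all zeros means no preference yet.
        self.weights = [[0.0] * n_features for _ in range(n_actions)]
        self.step_size = step_size  # assumed value, not taken from the source

    def predict(self, features, action):
        # Predicted human reward for taking `action` in the state described by `features`.
        return sum(w * x for w, x in zip(self.weights[action], features))

    def choose_action(self, features):
        # Act greedily on predicted human reward. Ties are broken at random,
        # so an untrained agent (all predictions equal) behaves randomly.
        scores = [self.predict(features, a) for a in range(len(self.weights))]
        best = max(scores)
        return random.choice([a for a, s in enumerate(scores) if s == best])

    def update(self, features, action, human_reward):
        # One gradient step moving the prediction toward the trainer's signal
        # (for example +1 for a green flash, -1 for a red flash).
        error = human_reward - self.predict(features, action)
        self.weights[action] = [
            w + self.step_size * error * x
            for w, x in zip(self.weights[action], features)
        ]


agent = TamerSketch(n_features=2, n_actions=3)
state = [1.0, 0.5]                               # made-up feature vector
agent.update(state, action=1, human_reward=1.0)  # trainer approves (green)
agent.update(state, action=0, human_reward=-1.0) # trainer disapproves (red)
chosen = agent.choose_action(state)              # action 1 now has the highest prediction
```

In this toy run the agent prefers the action that was rewarded and avoids the one that was punished after a single piece of feedback each, which is the kind of fast shaping the training videos are meant to show.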

 

After 2 games of training: