Analysis of the run.py module in the baselines algorithm library
Repository of the baselines library:
https://gitee.com/devilmaycry812839668/baselines
========================================
Code analysis of the run.py module in the baselines library:
Recording all of gym's game environments:
_game_envs = defaultdict(set)
for env in gym.envs.registry.all():
    # TODO: solve this with regexes
    env_type = env.entry_point.split(':')[0].split('.')[-1]
    _game_envs[env_type].add(env.id)
Test:
Printed output:
algorithmic {'ReversedAddition-v0', 'RepeatCopy-v0', 'DuplicatedInput-v0', 'ReversedAddition3-v0', 'Copy-v0', 'Reverse-v0'}
classic_control {'CartPole-v0', 'CartPole-v1', 'Acrobot-v1', 'Pendulum-v0', 'MountainCarContinuous-v0', 'MountainCar-v0'}
box2d {'BipedalWalker-v3', 'LunarLander-v2', 'LunarLanderContinuous-v2', 'CarRacing-v0', 'BipedalWalkerHardcore-v3'}
toy_text {'KellyCoinflipGeneralized-v0', 'KellyCoinflip-v0', 'GuessingGame-v0', 'Taxi-v3', 'CliffWalking-v0', 'NChain-v0', 'FrozenLake-v0', 'FrozenLake8x8-v0', 'HotterColder-v0', 'Roulette-v0', 'Blackjack-v0'}
mujoco {'Ant-v2', 'Hopper-v2', 'Reacher-v2', 'InvertedDoublePendulum-v2', 'Humanoid-v2', 'HumanoidStandup-v2', 'Walker2d-v2', 'Thrower-v2', 'InvertedPendulum-v2', 'HalfCheetah-v2', 'Swimmer-v2', 'Striker-v2', 'Pusher-v2'}
half_cheetah_v3 {'HalfCheetah-v3'}
hopper_v3 {'Hopper-v3'}
swimmer_v3 {'Swimmer-v3'}
walker2d_v3 {'Walker2d-v3'}
ant_v3 {'Ant-v3'}
humanoid_v3 {'Humanoid-v3'}
robotics {'HandManipulateBlockRotateZTouchSensorsDense-v1', 'FetchReach-v1', 'FetchPush-v1', 'HandManipulateBlockTouchSensors-v0', 'HandManipulatePenFull-v0', 'FetchPickAndPlace-v1', 'HandManipulateBlockTouchSensors-v1', 'HandManipulateBlockRotateParallelTouchSensorsDense-v1', 'HandManipulateBlockRotateParallel-v0', 'HandManipulateBlockRotateXYZ-v0', 'FetchSlideDense-v1', 'HandManipulatePenRotateTouchSensors-v1', 'HandManipulateBlockFullDense-v0', 'HandManipulateEggFullDense-v0', 'HandManipulateBlockRotateZTouchSensors-v0', 'HandManipulatePenDense-v0', 'HandManipulateEggDense-v0', 'HandManipulatePenRotateTouchSensors-v0', 'HandReachDense-v0', 'HandManipulateBlockRotateZTouchSensorsDense-v0', 'HandManipulateEggTouchSensors-v1', 'FetchReachDense-v1', 'HandManipulateEggRotateTouchSensorsDense-v1', 'HandManipulatePenRotateTouchSensorsDense-v1', 'HandManipulateEggRotate-v0', 'HandManipulateEggRotateDense-v0', 'HandManipulateBlockRotateXYZTouchSensors-v0', 'HandManipulateBlockTouchSensorsDense-v1', 'HandManipulateEggTouchSensorsDense-v0', 'HandManipulateBlockRotateXYZDense-v0', 'HandManipulateBlock-v0', 'HandReach-v0', 'HandManipulateBlockRotateParallelDense-v0', 'HandManipulateBlockRotateParallelTouchSensors-v0', 'HandManipulateBlockRotateZTouchSensors-v1', 'HandManipulateEggFull-v0', 'HandManipulatePen-v0', 'HandManipulateEggRotateTouchSensors-v1', 'HandManipulateBlockRotateZDense-v0', 'HandManipulateBlockRotateXYZTouchSensorsDense-v1', 'HandManipulatePenRotateDense-v0', 'HandManipulatePenRotate-v0', 'HandManipulateBlockRotateParallelTouchSensorsDense-v0', 'HandManipulatePenTouchSensorsDense-v0', 'HandManipulatePenTouchSensors-v1', 'HandManipulateEggTouchSensorsDense-v1', 'HandManipulatePenTouchSensorsDense-v1', 'HandManipulateBlockRotateXYZTouchSensors-v1', 'HandManipulateBlockRotateParallelTouchSensors-v1', 'HandManipulatePenFullDense-v0', 'HandManipulateEggTouchSensors-v0', 'HandManipulateEgg-v0', 'HandManipulateBlockRotateZ-v0', 
'HandManipulateBlockTouchSensorsDense-v0', 'FetchSlide-v1', 'HandManipulatePenRotateTouchSensorsDense-v0', 'FetchPushDense-v1', 'HandManipulateBlockFull-v0', 'HandManipulateBlockDense-v0', 'HandManipulateEggRotateTouchSensors-v0', 'HandManipulateEggRotateTouchSensorsDense-v0', 'FetchPickAndPlaceDense-v1', 'HandManipulatePenTouchSensors-v0', 'HandManipulateBlockRotateXYZTouchSensorsDense-v0'}
atari {'ElevatorAction-ramDeterministic-v0', 'RoadRunner-ramDeterministic-v0', 'HeroDeterministic-v4', 'Bowling-ramNoFrameskip-v4', 'Bowling-ramDeterministic-v0', 'Assault-ramDeterministic-v0', 'MsPacman-v4', 'MsPacmanDeterministic-v0', 'BowlingNoFrameskip-v4', 'Atlantis-ram-v4', 'Boxing-v4', 'ChopperCommand-ramDeterministic-v4', 'NameThisGame-ramDeterministic-v4', 'JourneyEscapeNoFrameskip-v4', 'Solaris-ramDeterministic-v4', 'CrazyClimber-ramNoFrameskip-v4', 'NameThisGame-ramNoFrameskip-v4', 'QbertNoFrameskip-v4', 'Gravitar-v4', 'ZaxxonNoFrameskip-v4', 'DoubleDunkDeterministic-v0', 'KangarooDeterministic-v4', 'Pitfall-ram-v0', 'BattleZoneDeterministic-v4', 'Amidar-v4', 'Enduro-ramDeterministic-v0', 'StarGunner-ram-v0', 'BankHeistDeterministic-v4', 'BattleZone-ramDeterministic-v0', 'ChopperCommand-ram-v0', 'Jamesbond-ramNoFrameskip-v0', 'DoubleDunk-v4', 'Pong-v0', 'Qbert-v0', 'Tutankham-ramNoFrameskip-v0', 'BattleZone-ramNoFrameskip-v0', 'Defender-ramDeterministic-v4', 'Kangaroo-ram-v0', 'MontezumaRevengeNoFrameskip-v4', 'Frostbite-ramNoFrameskip-v4', 'Hero-ramDeterministic-v0', 'QbertDeterministic-v4', 'EnduroDeterministic-v0', 'RobotankDeterministic-v4', 'SolarisDeterministic-v4', 'Enduro-ramDeterministic-v4', 'Asterix-ram-v4', 'Asterix-ramDeterministic-v4', 'CrazyClimberDeterministic-v4', 'KrullNoFrameskip-v4', 'Freeway-ram-v4', 'JourneyEscape-v4', 'Asterix-ramNoFrameskip-v0', 'DemonAttack-v0', 'Boxing-ram-v4', 'SpaceInvaders-ram-v4', 'Kangaroo-v0', 'Solaris-v4', 'Breakout-ram-v4', 'KungFuMasterDeterministic-v4', 'CrazyClimber-ramNoFrameskip-v0', 'IceHockey-ramDeterministic-v0', 'Adventure-ramNoFrameskip-v4', 'HeroNoFrameskip-v0', 'PrivateEyeNoFrameskip-v0', 'SpaceInvaders-ramDeterministic-v4', 'Freeway-ramNoFrameskip-v0', 'AirRaid-ram-v4', 'Zaxxon-v0', 'Riverraid-v0', 'Robotank-ramNoFrameskip-v4', 'AmidarDeterministic-v0', 'WizardOfWor-ram-v4', 'VideoPinball-v4', 'BeamRider-ramNoFrameskip-v0', 'DefenderDeterministic-v0', 'Centipede-ram-v0', 
'AirRaidNoFrameskip-v4', 'BankHeist-ramDeterministic-v4', 'JourneyEscape-ramDeterministic-v4', 'TimePilot-ramNoFrameskip-v4', 'MsPacman-ramDeterministic-v4', 'Skiing-v0', 'UpNDownDeterministic-v4', 'Pong-ramNoFrameskip-v0', 'Centipede-ramNoFrameskip-v0', 'AirRaidDeterministic-v0', 'KungFuMaster-ramNoFrameskip-v0', 'PrivateEye-ramNoFrameskip-v4', 'AssaultNoFrameskip-v4', 'WizardOfWor-ramNoFrameskip-v0', 'BeamRiderDeterministic-v4', 'PhoenixDeterministic-v0', 'PongDeterministic-v4', 'Jamesbond-v0', 'VideoPinball-ram-v0', 'RoadRunner-ram-v0', 'GopherNoFrameskip-v0', 'Assault-v4', 'Defender-ram-v4', 'UpNDown-ram-v0', 'MontezumaRevenge-ram-v0', 'Defender-ram-v0', 'BankHeist-ramNoFrameskip-v4', 'DoubleDunk-ramNoFrameskip-v0', 'VentureDeterministic-v0', 'MontezumaRevenge-ramDeterministic-v4', 'Adventure-ramDeterministic-v0', 'MsPacman-ramNoFrameskip-v0', 'Krull-ram-v4', 'EnduroNoFrameskip-v4', 'JamesbondNoFrameskip-v0', 'Atlantis-ramNoFrameskip-v4', 'ElevatorAction-ramNoFrameskip-v4', 'DemonAttackNoFrameskip-v4', 'AdventureNoFrameskip-v4', 'Carnival-ramNoFrameskip-v4', 'PitfallDeterministic-v4', 'TimePilot-v4', 'Solaris-ramNoFrameskip-v0', 'PitfallDeterministic-v0', 'DoubleDunkNoFrameskip-v0', 'UpNDown-ramDeterministic-v0', 'JourneyEscapeNoFrameskip-v0', 'Asteroids-v0', 'NameThisGame-v0', 'BattleZone-v0', 'Tennis-v0', 'ChopperCommandNoFrameskip-v4', 'YarsRevenge-ram-v4', 'Hero-v4', 'PongNoFrameskip-v4', 'NameThisGame-ram-v0', 'Assault-ramNoFrameskip-v4', 'RiverraidDeterministic-v4', 'ZaxxonDeterministic-v4', 'VentureNoFrameskip-v0', 'SpaceInvadersNoFrameskip-v4', 'Tutankham-ram-v0', 'BreakoutNoFrameskip-v0', 'AirRaid-ramDeterministic-v4', 'AsteroidsDeterministic-v0', 'BankHeistNoFrameskip-v4', 'Bowling-ramNoFrameskip-v0', 'Alien-ramDeterministic-v4', 'Alien-ramNoFrameskip-v0', 'IceHockey-v0', 'AirRaid-v0', 'SeaquestNoFrameskip-v4', 'Atlantis-ramDeterministic-v0', 'SpaceInvaders-ram-v0', 'Jamesbond-v4', 'Kangaroo-ramNoFrameskip-v0', 'AlienNoFrameskip-v4', 'Phoenix-ram-v0', 
'GravitarNoFrameskip-v0', 'PitfallNoFrameskip-v4', 'DoubleDunk-ram-v4', 'StarGunner-v0', 'CrazyClimber-ramDeterministic-v0', 'Robotank-ram-v0', 'Breakout-ramNoFrameskip-v0', 'AsterixNoFrameskip-v0', 'AtlantisNoFrameskip-v4', 'Boxing-ramNoFrameskip-v0', 'GopherDeterministic-v4', 'Gravitar-ram-v0', 'YarsRevenge-ramDeterministic-v0', 'KungFuMaster-ramDeterministic-v0', 'VideoPinballDeterministic-v4', 'Asteroids-ramNoFrameskip-v0', 'PhoenixNoFrameskip-v0', 'Jamesbond-ramNoFrameskip-v4', 'RiverraidNoFrameskip-v0', 'Pong-ramDeterministic-v4', 'TimePilot-ramDeterministic-v0', 'IceHockeyDeterministic-v4', 'Freeway-ram-v0', 'Adventure-ram-v4', 'FishingDerby-v4', 'Hero-ramNoFrameskip-v0', 'Seaquest-ramNoFrameskip-v4', 'SpaceInvadersDeterministic-v0', 'WizardOfWor-ram-v0', 'SpaceInvadersNoFrameskip-v0', 'Zaxxon-ram-v4', 'SpaceInvaders-v4', 'FreewayNoFrameskip-v0', 'BoxingNoFrameskip-v4', 'TimePilot-ram-v0', 'MontezumaRevengeNoFrameskip-v0', 'BreakoutDeterministic-v4', 'Defender-ramNoFrameskip-v4', 'TutankhamDeterministic-v0', 'Venture-ram-v0', 'Pong-v4', 'Robotank-v4', 'Atlantis-v4', 'Skiing-ram-v4', 'MsPacman-ramNoFrameskip-v4', 'AssaultNoFrameskip-v0', 'NameThisGame-ramDeterministic-v0', 'BattleZone-ramNoFrameskip-v4', 'Krull-ramDeterministic-v4', 'Breakout-ram-v0', 'MontezumaRevenge-ramNoFrameskip-v0', 'AssaultDeterministic-v0', 'Qbert-ram-v0', 'Seaquest-ram-v4', 'Seaquest-v4', 'Solaris-ram-v4', 'Solaris-ram-v0', 'RoadRunnerDeterministic-v4', 'CarnivalDeterministic-v4', 'Hero-ramDeterministic-v4', 'Amidar-ramNoFrameskip-v0', 'YarsRevenge-ramNoFrameskip-v4', 'GopherNoFrameskip-v4', 'FishingDerbyNoFrameskip-v0', 'AirRaidDeterministic-v4', 'Asterix-v4', 'MsPacmanNoFrameskip-v4', 'Bowling-v0', 'KungFuMaster-ram-v4', 'GopherDeterministic-v0', 'AirRaidNoFrameskip-v0', 'BankHeist-ram-v0', 'SkiingDeterministic-v0', 'StarGunnerDeterministic-v0', 'FishingDerby-v0', 'StarGunner-ramDeterministic-v0', 'RobotankNoFrameskip-v4', 'KangarooNoFrameskip-v0', 'TutankhamNoFrameskip-v0', 
'Robotank-ramNoFrameskip-v0', 'AirRaid-ram-v0', 'Assault-ramDeterministic-v4', 'FishingDerby-ramNoFrameskip-v4', 'RoadRunner-v0', 'Krull-ram-v0', 'Pooyan-ramDeterministic-v0', 'Pitfall-ramNoFrameskip-v0', 'UpNDownDeterministic-v0', 'AsterixDeterministic-v4', 'Kangaroo-v4', 'SkiingNoFrameskip-v0', 'Gopher-v4', 'CarnivalDeterministic-v0', 'Solaris-ramNoFrameskip-v4', 'VideoPinball-ramNoFrameskip-v0', 'FrostbiteNoFrameskip-v4', 'RoadRunnerNoFrameskip-v4', 'ElevatorAction-ram-v0', 'Defender-ramNoFrameskip-v0', 'Frostbite-ramDeterministic-v4', 'EnduroNoFrameskip-v0', 'Bowling-ram-v0', 'IceHockey-ramNoFrameskip-v0', 'Defender-v0', 'CarnivalNoFrameskip-v4', 'Zaxxon-ramNoFrameskip-v0', 'Frostbite-ramDeterministic-v0', 'ZaxxonDeterministic-v0', 'IceHockey-ramDeterministic-v4', 'AsteroidsNoFrameskip-v0', 'Phoenix-v0', 'DoubleDunk-v0', 'Gravitar-ramNoFrameskip-v4', 'NameThisGameDeterministic-v0', 'BeamRiderNoFrameskip-v0', 'JamesbondDeterministic-v0', 'BattleZoneNoFrameskip-v4', 'YarsRevenge-ramDeterministic-v4', 'Breakout-v4', 'Adventure-ramDeterministic-v4', 'FrostbiteNoFrameskip-v0', 'Freeway-v4', 'ElevatorActionDeterministic-v0', 'BankHeist-v4', 'Frostbite-ram-v0', 'PongDeterministic-v0', 'AirRaid-v4', 'GravitarDeterministic-v4', 'UpNDown-ramNoFrameskip-v4', 'DefenderNoFrameskip-v4', 'Pitfall-ramDeterministic-v4', 'IceHockeyNoFrameskip-v0', 'JourneyEscape-ramNoFrameskip-v0', 'Pitfall-v4', 'Qbert-ram-v4', 'YarsRevenge-ram-v0', 'BeamRider-ramNoFrameskip-v4', 'BeamRider-ram-v0', 'PhoenixNoFrameskip-v4', 'KrullNoFrameskip-v0', 'Seaquest-v0', 'AmidarDeterministic-v4', 'SpaceInvaders-v0', 'SolarisNoFrameskip-v0', 'Venture-v4', 'Pong-ramNoFrameskip-v4', 'Enduro-v4', 'Pooyan-ram-v4', 'AlienDeterministic-v4', 'MsPacman-ram-v0', 'Phoenix-ram-v4', 'Seaquest-ramDeterministic-v0', 'Amidar-ramDeterministic-v0', 'Bowling-v4', 'SpaceInvadersDeterministic-v4', 'Gopher-ram-v0', 'JourneyEscape-ramNoFrameskip-v4', 'Phoenix-ramNoFrameskip-v4', 'RoadRunner-ramNoFrameskip-v0', 'Boxing-ram-v0', 
'DoubleDunk-ramDeterministic-v4', 'JamesbondNoFrameskip-v4', 'Breakout-ramDeterministic-v4', 'Phoenix-v4', 'SeaquestDeterministic-v0', 'QbertDeterministic-v0', 'Krull-v0', 'Jamesbond-ramDeterministic-v0', 'Riverraid-ramNoFrameskip-v0', 'IceHockeyDeterministic-v0', 'Phoenix-ramNoFrameskip-v0', 'PrivateEye-v4', 'Centipede-v0', 'CrazyClimberNoFrameskip-v0', 'Jamesbond-ram-v4', 'SkiingDeterministic-v4', 'Freeway-ramNoFrameskip-v4', 'BattleZone-v4', 'Carnival-ramDeterministic-v4', 'IceHockeyNoFrameskip-v4', 'KrullDeterministic-v0', 'Venture-ramDeterministic-v4', 'Kangaroo-ramDeterministic-v4', 'YarsRevengeNoFrameskip-v4', 'IceHockey-ram-v4', 'DemonAttack-ramNoFrameskip-v0', 'PrivateEye-ram-v4', 'Alien-v0', 'AirRaid-ramNoFrameskip-v4', 'UpNDown-v0', 'SpaceInvaders-ramNoFrameskip-v4', 'VideoPinball-ramDeterministic-v0', 'Frostbite-v4', 'Tutankham-v0', 'SeaquestDeterministic-v4', 'Atlantis-ramDeterministic-v4', 'Gopher-ramDeterministic-v0', 'Hero-ram-v4', 'UpNDown-ramDeterministic-v4', 'NameThisGameNoFrameskip-v4', 'TimePilot-v0', 'FrostbiteDeterministic-v0', 'Amidar-ram-v4', 'ChopperCommandNoFrameskip-v0', 'PitfallNoFrameskip-v0', 'CrazyClimber-v4', 'RoadRunner-ramNoFrameskip-v4', 'BankHeistNoFrameskip-v0', 'Tutankham-v4', 'KungFuMasterNoFrameskip-v4', 'YarsRevengeDeterministic-v0', 'HeroDeterministic-v0', 'AdventureDeterministic-v0', 'Berzerk-ram-v4', 'Centipede-ram-v4', 'Skiing-ramDeterministic-v4', 'VideoPinball-ramNoFrameskip-v4', 'AdventureNoFrameskip-v0', 'DefenderDeterministic-v4', 'Alien-ramNoFrameskip-v4', 'ElevatorAction-v0', 'PrivateEyeDeterministic-v4', 'MontezumaRevenge-ram-v4', 'Asterix-ramNoFrameskip-v4', 'Alien-v4', 'TennisNoFrameskip-v0', 'Atlantis-v0', 'Krull-ramNoFrameskip-v4', 'BerzerkDeterministic-v4', 'Tennis-v4', 'VideoPinballNoFrameskip-v0', 'Zaxxon-ramDeterministic-v0', 'NameThisGameNoFrameskip-v0', 'JourneyEscape-ramDeterministic-v0', 'Pooyan-ramNoFrameskip-v4', 'Pitfall-v0', 'Riverraid-ramDeterministic-v0', 'VideoPinballDeterministic-v0', 
'Tutankham-ramDeterministic-v4', 'Phoenix-ramDeterministic-v0', 'Tennis-ramNoFrameskip-v4', 'Hero-ram-v0', 'Frostbite-ram-v4', 'TennisNoFrameskip-v4', 'Carnival-ramDeterministic-v0', 'DemonAttack-ram-v4', 'JourneyEscape-ram-v4', 'UpNDown-v4', 'PooyanNoFrameskip-v4', 'Centipede-ramDeterministic-v4', 'Seaquest-ramDeterministic-v4', 'Pooyan-v4', 'AlienNoFrameskip-v0', 'Pong-ram-v4', 'UpNDownNoFrameskip-v0', 'AsterixNoFrameskip-v4', 'Breakout-ramDeterministic-v0', 'Venture-ramNoFrameskip-v4', 'Pong-ram-v0', 'Adventure-ramNoFrameskip-v0', 'Asteroids-ram-v4', 'Carnival-v0', 'PooyanNoFrameskip-v0', 'PrivateEye-ramNoFrameskip-v0', 'Qbert-v4', 'TutankhamNoFrameskip-v4', 'Skiing-ramNoFrameskip-v0', 'Riverraid-v4', 'VentureNoFrameskip-v4', 'CentipedeNoFrameskip-v4', 'Adventure-v4', 'MsPacmanNoFrameskip-v0', 'TimePilotDeterministic-v0', 'BattleZoneDeterministic-v0', 'BankHeist-ramNoFrameskip-v0', 'FishingDerbyDeterministic-v4', 'Boxing-ramDeterministic-v4', 'Asteroids-ramDeterministic-v0', 'PooyanDeterministic-v4', 'CrazyClimber-ram-v4', 'RoadRunner-v4', 'Amidar-v0', 'DemonAttackDeterministic-v0', 'Riverraid-ram-v0', 'PrivateEye-ramDeterministic-v0', 'RiverraidDeterministic-v0', 'RobotankNoFrameskip-v0', 'Seaquest-ram-v0', 'CarnivalNoFrameskip-v0', 'FishingDerby-ramDeterministic-v0', 'KangarooNoFrameskip-v4', 'FishingDerbyNoFrameskip-v4', 'ElevatorActionDeterministic-v4', 'Enduro-v0', 'AtlantisDeterministic-v0', 'KungFuMasterDeterministic-v0', 'Breakout-ramNoFrameskip-v4', 'PrivateEyeDeterministic-v0', 'BowlingNoFrameskip-v0', 'DoubleDunkNoFrameskip-v4', 'AsteroidsNoFrameskip-v4', 'CrazyClimberDeterministic-v0', 'Assault-ram-v0', 'Venture-ramNoFrameskip-v0', 'Gravitar-v0', 'GravitarNoFrameskip-v4', 'StarGunner-v4', 'KungFuMaster-ramDeterministic-v4', 'Tennis-ram-v4', 'Asterix-ramDeterministic-v0', 'Venture-ramDeterministic-v0', 'Tennis-ramDeterministic-v0', 'SolarisDeterministic-v0', 'VentureDeterministic-v4', 'VideoPinball-ramDeterministic-v4', 'Gravitar-ramDeterministic-v0', 
'IceHockey-ramNoFrameskip-v4', 'ChopperCommand-ramDeterministic-v0', 'Asteroids-v4', 'ElevatorActionNoFrameskip-v0', 'BeamRider-ram-v4', 'Qbert-ramDeterministic-v0', 'Robotank-ramDeterministic-v4', 'Hero-ramNoFrameskip-v4', 'Skiing-ramNoFrameskip-v4', 'SkiingNoFrameskip-v4', 'Berzerk-v4', 'ChopperCommand-ramNoFrameskip-v0', 'BoxingDeterministic-v4', 'Venture-ram-v4', 'KungFuMaster-v0', 'PrivateEyeNoFrameskip-v4', 'HeroNoFrameskip-v4', 'Asteroids-ramDeterministic-v4', 'Krull-ramDeterministic-v0', 'GravitarDeterministic-v0', 'Kangaroo-ramDeterministic-v0', 'Gopher-ramNoFrameskip-v0', 'Breakout-v0', 'TimePilotDeterministic-v4', 'Jamesbond-ram-v0', 'Freeway-ramDeterministic-v4', 'JamesbondDeterministic-v4', 'JourneyEscapeDeterministic-v4', 'Gopher-v0', 'Tutankham-ramNoFrameskip-v4', 'KungFuMasterNoFrameskip-v0', 'SpaceInvaders-ramNoFrameskip-v0', 'Enduro-ram-v0', 'YarsRevengeNoFrameskip-v0', 'AirRaid-ramNoFrameskip-v0', 'Tutankham-ram-v4', 'Kangaroo-ram-v4', 'FreewayDeterministic-v4', 'RoadRunnerNoFrameskip-v0', 'RoadRunnerDeterministic-v0', 'CentipedeNoFrameskip-v0', 'AssaultDeterministic-v4', 'WizardOfWorDeterministic-v0', 'AsterixDeterministic-v0', 'SolarisNoFrameskip-v4', 'FishingDerbyDeterministic-v0', 'ChopperCommand-ramNoFrameskip-v4', 'Enduro-ramNoFrameskip-v4', 'Bowling-ramDeterministic-v4', 'YarsRevenge-ramNoFrameskip-v0', 'UpNDownNoFrameskip-v4', 'BankHeistDeterministic-v0', 'Tennis-ramNoFrameskip-v0', 'CrazyClimber-ram-v0', 'EnduroDeterministic-v4', 'TennisDeterministic-v0', 'Riverraid-ram-v4', 'Amidar-ram-v0', 'MsPacman-ram-v4', 'AirRaid-ramDeterministic-v0', 'Adventure-v0', 'Enduro-ramNoFrameskip-v0', 'KangarooDeterministic-v0', 'BreakoutNoFrameskip-v4', 'AtlantisNoFrameskip-v0', 'TimePilot-ramDeterministic-v4', 'StarGunner-ramDeterministic-v4', 'UpNDown-ram-v4', 'Carnival-v4', 'Frostbite-ramNoFrameskip-v0', 'ZaxxonNoFrameskip-v0', 'MontezumaRevengeDeterministic-v4', 'Gopher-ramDeterministic-v4', 'Asterix-v0', 'Riverraid-ramNoFrameskip-v4', 'Hero-v0', 
'QbertNoFrameskip-v0', 'Jamesbond-ramDeterministic-v4', 'KungFuMaster-v4', 'Seaquest-ramNoFrameskip-v0', 'ChopperCommandDeterministic-v4', 'StarGunnerNoFrameskip-v4', 'BeamRider-v0', 'BattleZone-ram-v0', 'AsteroidsDeterministic-v4', 'FishingDerby-ram-v4', 'WizardOfWorNoFrameskip-v0', 'Alien-ram-v0', 'AtlantisDeterministic-v4', 'Qbert-ramNoFrameskip-v0', 'AlienDeterministic-v0', 'Kangaroo-ramNoFrameskip-v4', 'DoubleDunkDeterministic-v4', 'WizardOfWor-v4', 'MontezumaRevenge-ramNoFrameskip-v4', 'BerzerkNoFrameskip-v4', 'YarsRevenge-v4', 'DemonAttack-v4', 'NameThisGameDeterministic-v4', 'DoubleDunk-ram-v0', 'Zaxxon-ramDeterministic-v4', 'Robotank-v0', 'Pooyan-ramDeterministic-v4', 'DoubleDunk-ramDeterministic-v0', 'Krull-ramNoFrameskip-v0', 'TimePilot-ram-v4', 'DemonAttackDeterministic-v4', 'Pitfall-ramDeterministic-v0', 'PooyanDeterministic-v0', 'Carnival-ram-v0', 'Pooyan-v0', 'ChopperCommand-ram-v4', 'AmidarNoFrameskip-v0', 'BerzerkNoFrameskip-v0', 'ElevatorAction-ramDeterministic-v4', 'KrullDeterministic-v4', 'TimePilotNoFrameskip-v4', 'Enduro-ram-v4', 'Tennis-ram-v0', 'Carnival-ram-v4', 'ChopperCommand-v4', 'MontezumaRevenge-v4', 'TimePilotNoFrameskip-v0', 'Skiing-ram-v0', 'BankHeist-ram-v4', 'BankHeist-ramDeterministic-v0', 'JourneyEscape-v0', 'Krull-v4', 'Tutankham-ramDeterministic-v0', 'MsPacmanDeterministic-v4', 'Alien-ramDeterministic-v0', 'ChopperCommandDeterministic-v0', 'Pooyan-ramNoFrameskip-v0', 'Bowling-ram-v4', 'Riverraid-ramDeterministic-v4', 'MontezumaRevenge-ramDeterministic-v0', 'PrivateEye-ram-v0', 'ElevatorAction-ram-v4', 'SpaceInvaders-ramDeterministic-v0', 'Pooyan-ram-v0', 'Berzerk-ramNoFrameskip-v4', 'CentipedeDeterministic-v0', 'AdventureDeterministic-v4', 'PhoenixDeterministic-v4', 'YarsRevengeDeterministic-v4', 'ElevatorActionNoFrameskip-v4', 'PrivateEye-v0', 'FreewayNoFrameskip-v4', 'Asterix-ram-v0', 'NameThisGame-ram-v4', 'ElevatorAction-ramNoFrameskip-v0', 'BeamRiderDeterministic-v0', 'ChopperCommand-v0', 'DemonAttack-ram-v0', 
'BowlingDeterministic-v4', 'StarGunnerDeterministic-v4', 'JourneyEscapeDeterministic-v0', 'StarGunnerNoFrameskip-v0', 'Zaxxon-ram-v0', 'Freeway-v0', 'Defender-v4', 'Centipede-ramNoFrameskip-v4', 'Robotank-ramDeterministic-v0', 'IceHockey-ram-v0', 'NameThisGame-ramNoFrameskip-v0', 'StarGunner-ramNoFrameskip-v0', 'Pitfall-ramNoFrameskip-v4', 'WizardOfWor-ramDeterministic-v0', 'BeamRider-ramDeterministic-v0', 'MontezumaRevengeDeterministic-v0', 'RoadRunner-ramDeterministic-v4', 'Solaris-v0', 'Venture-v0', 'Amidar-ramNoFrameskip-v4', 'Assault-ram-v4', 'Alien-ram-v4', 'Atlantis-ramNoFrameskip-v0', 'RoadRunner-ram-v4', 'Assault-ramNoFrameskip-v0', 'Qbert-ramNoFrameskip-v4', 'BattleZone-ramDeterministic-v4', 'Pong-ramDeterministic-v0', 'MsPacman-v0', 'FishingDerby-ramNoFrameskip-v0', 'BattleZone-ram-v4', 'YarsRevenge-v0', 'Gopher-ramNoFrameskip-v4', 'FishingDerby-ram-v0', 'DoubleDunk-ramNoFrameskip-v4', 'BeamRider-v4', 'DefenderNoFrameskip-v0', 'Gopher-ram-v4', 'Qbert-ramDeterministic-v4', 'Gravitar-ram-v4', 'VideoPinballNoFrameskip-v4', 'TennisDeterministic-v4', 'BeamRiderNoFrameskip-v4', 'WizardOfWorDeterministic-v4', 'Frostbite-v0', 'TutankhamDeterministic-v4', 'ElevatorAction-v4', 'MontezumaRevenge-v0', 'Amidar-ramDeterministic-v4', 'Skiing-ramDeterministic-v0', 'Zaxxon-v4', 'Skiing-v4', 'Assault-v0', 'Berzerk-v0', 'Asteroids-ramNoFrameskip-v4', 'CrazyClimber-v0', 'DemonAttack-ramDeterministic-v4', 'WizardOfWor-ramNoFrameskip-v4', 'FishingDerby-ramDeterministic-v4', 'Boxing-v0', 'KungFuMaster-ram-v0', 'VideoPinball-ram-v4', 'BattleZoneNoFrameskip-v0', 'BankHeist-v0', 'Berzerk-ram-v0', 'Carnival-ramNoFrameskip-v0', 'MsPacman-ramDeterministic-v0', 'Asteroids-ram-v0', 'PongNoFrameskip-v0', 'DemonAttack-ramNoFrameskip-v4', 'WizardOfWorNoFrameskip-v4', 'Atlantis-ram-v0', 'Freeway-ramDeterministic-v0', 'BoxingNoFrameskip-v0', 'WizardOfWor-ramDeterministic-v4', 'KungFuMaster-ramNoFrameskip-v4', 'Zaxxon-ramNoFrameskip-v4', 'WizardOfWor-v0', 'Berzerk-ramNoFrameskip-v0', 
'Phoenix-ramDeterministic-v4', 'Gravitar-ramNoFrameskip-v0', 'CentipedeDeterministic-v4', 'DemonAttackNoFrameskip-v0', 'BreakoutDeterministic-v0', 'NameThisGame-v4', 'AmidarNoFrameskip-v4', 'RiverraidNoFrameskip-v4', 'Tennis-ramDeterministic-v4', 'VideoPinball-v0', 'BerzerkDeterministic-v0', 'Boxing-ramDeterministic-v0', 'Berzerk-ramDeterministic-v0', 'UpNDown-ramNoFrameskip-v0', 'DemonAttack-ramDeterministic-v0', 'Gravitar-ramDeterministic-v4', 'TimePilot-ramNoFrameskip-v0', 'FrostbiteDeterministic-v4', 'BeamRider-ramDeterministic-v4', 'Robotank-ram-v4', 'Defender-ramDeterministic-v0', 'Centipede-ramDeterministic-v0', 'FreewayDeterministic-v0', 'IceHockey-v4', 'StarGunner-ram-v4', 'Berzerk-ramDeterministic-v4', 'RobotankDeterministic-v0', 'Centipede-v4', 'PrivateEye-ramDeterministic-v4', 'Adventure-ram-v0', 'CrazyClimber-ramDeterministic-v4', 'Solaris-ramDeterministic-v0', 'BowlingDeterministic-v0', 'SeaquestNoFrameskip-v0', 'Pitfall-ram-v4', 'JourneyEscape-ram-v0', 'Boxing-ramNoFrameskip-v4', 'BoxingDeterministic-v0', 'StarGunner-ramNoFrameskip-v4', 'CrazyClimberNoFrameskip-v4'}
unittest {'CubeCrashSparse-v0', 'CubeCrash-v0', 'CubeCrashScreenBecomesBlack-v0', 'MemorizeDigits-v0'}
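The keys in the output above (including one-off keys such as half_cheetah_v3) follow from how the entry point string is split. The sketch below reproduces that grouping without importing gym. The entry point strings in SAMPLE_SPECS are assumptions written in the style of the gym registry, and env_type_of is a hypothetical helper name, not part of run.py.

```python
from collections import defaultdict

# Assumed (entry_point, env_id) pairs in the style of the gym registry.
SAMPLE_SPECS = [
    ('gym.envs.classic_control:CartPoleEnv', 'CartPole-v1'),
    ('gym.envs.mujoco:HalfCheetahEnv', 'HalfCheetah-v2'),
    ('gym.envs.mujoco.half_cheetah_v3:HalfCheetahEnv', 'HalfCheetah-v3'),
]

def env_type_of(entry_point):
    """Hypothetical helper: same expression as in run.py, split into steps."""
    module_path = entry_point.split(':')[0]   # drop the ':ClassName' suffix
    return module_path.split('.')[-1]         # keep the last module component

game_envs = defaultdict(set)
for entry_point, env_id in SAMPLE_SPECS:
    game_envs[env_type_of(entry_point)].add(env_id)

for env_type, ids in game_envs.items():
    print(env_type, sorted(ids))
```

Because only the last module component is kept, an environment defined in its own submodule ends up with its own environment type instead of being grouped under mujoco.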
================================
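Function:
def get_default_network(env_type):
    if env_type in {'atari', 'retro'}:
        return 'cnn'
    else:
        return 'mlp'
Based on the input argument env_type, this returns which neural network architecture to use: 'cnn' for atari and retro environments, 'mlp' otherwise.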
===========================================
Function:
def get_alg_module(alg, submodule=None):
    submodule = submodule or alg
    try:
        # first try to import the alg module from baselines
        alg_module = import_module('.'.join(['baselines', alg, submodule]))
    except ImportError:
        # then from rl_algs
        alg_module = import_module('.'.join(['rl_' + 'algs', alg, submodule]))

    return alg_module
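get_alg_module takes alg as a string and imports the module of that name. By analogy, if alg were 'os', the effect would be like import os; return os. Concretely, it imports baselines.<alg>.<submodule> (submodule defaults to alg) and, if that import fails, falls back to rl_algs.<alg>.<submodule>.
In other words, at program startup this function identifies the specific algorithm to launch and loads it from the algorithm library by name.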
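The try/except import pattern can be tried in isolation with importlib. This is a minimal sketch: import_with_fallback is a hypothetical helper, no_such_pkg_xyz is a package name assumed not to exist, and json.decoder is a standard-library module that stands in for the fallback location.

```python
from importlib import import_module

def import_with_fallback(primary, fallback):
    """Hypothetical helper: try one dotted module path, then a second one."""
    try:
        return import_module(primary)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package lands here and the fallback path is tried instead.
        return import_module(fallback)

# Build the dotted path the same way run.py does, with '.'.join([...]).
primary = '.'.join(['no_such_pkg_xyz', 'json', 'decoder'])
fallback = '.'.join(['json', 'decoder'])

mod = import_with_fallback(primary, fallback)
print(mod.__name__)
```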
Function:
def get_learn_function(alg):
    return get_alg_module(alg).learn
Returns the learn function of the specified reinforcement learning algorithm module.
Function:
def get_learn_function_defaults(alg, env_type):
    try:
        alg_defaults = get_alg_module(alg, 'defaults')
        kwargs = getattr(alg_defaults, env_type)()
    except (ImportError, AttributeError):
        kwargs = {}
    return kwargs
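alg is the algorithm to invoke, passed as a string such as deepq, a2c, or acer.
Every algorithm in baselines has its own folder named after it.
env_type is the type of game environment, that is, one of the keys of the _game_envs dictionary built at the beginning.
The environment types we use most often are atari and mujoco.
get_learn_function_defaults looks in the defaults submodule of the chosen algorithm module for a function named after env_type and calls it to obtain a dictionary of run parameters for the algorithm; the defaults submodule under the deepq algorithm module is one example. If the submodule or the function does not exist, an empty dictionary is returned.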
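The getattr-based lookup can be sketched with a stand-in module object. Everything here is hypothetical: fake_defaults is not the real deepq defaults submodule, lookup_defaults is an illustrative helper, and the parameter values are made up.

```python
import types

# Hypothetical stand-in for an algorithm's `defaults` submodule.
fake_defaults = types.ModuleType('fake_defaults')

def atari():
    # Illustrative values only, not taken from baselines.
    return dict(network='cnn', lr=1e-4)

fake_defaults.atari = atari

def lookup_defaults(defaults_module, env_type):
    """Call the function named after env_type, or return {} if it is missing."""
    try:
        kwargs = getattr(defaults_module, env_type)()
    except AttributeError:
        kwargs = {}
    return kwargs

print(lookup_defaults(fake_defaults, 'atari'))   # function exists
print(lookup_defaults(fake_defaults, 'mujoco'))  # no such function
```

An environment type without a matching function silently yields an empty dictionary, so the algorithm then runs with the default arguments of its learn function.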
Function:
def get_env_type(args):
    env_id = args.env

    if args.env_type is not None:
        return args.env_type, env_id

    # Re-parse the gym registry, since we could have new envs since last time.
    for env in gym.envs.registry.all():
        env_type = env.entry_point.split(':')[0].split('.')[-1]
        _game_envs[env_type].add(env.id)  # This is a set so add is idempotent

    if env_id in _game_envs.keys():
        env_type = env_id
        env_id = [g for g in _game_envs[env_type]][0]
    else:
        env_type = None
        for g, e in _game_envs.items():
            if env_id in e:
                env_type = g
                break
        if ':' in env_id:
            env_type = re.sub(r':.*', '', env_id)
        assert env_type is not None, 'env_id {} is not recognized in env types'.format(env_id, _game_envs.keys())

    return env_type, env_id
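Determines the environment type env_type to which the given environment name env_id belongs. If --env_type is passed explicitly, it is used as-is. If env_id is itself a type name, one environment of that type is picked. Otherwise the registry is searched for the type whose set contains env_id.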
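The lookup branches can be exercised on a small hand-written dictionary. This is a sketch rather than the library function: lookup_env_type and SAMPLE_ENVS are hypothetical, sorted() replaces the arbitrary set element so the result is deterministic, and a ValueError replaces the assert.

```python
import re

# Hypothetical miniature version of _game_envs.
SAMPLE_ENVS = {
    'classic_control': {'CartPole-v0', 'Pendulum-v0'},
    'atari': {'PongNoFrameskip-v4'},
}

def lookup_env_type(env_id, game_envs):
    if env_id in game_envs:
        # env_id is actually a type name: pick one environment of that type.
        env_type = env_id
        env_id = sorted(game_envs[env_type])[0]
    else:
        env_type = None
        for g, e in game_envs.items():
            if env_id in e:
                env_type = g
                break
        if ':' in env_id:
            # 'package:EnvName' form: everything before ':' becomes the type.
            env_type = re.sub(r':.*', '', env_id)
        if env_type is None:
            raise ValueError('env_id {} is not recognized'.format(env_id))
    return env_type, env_id

print(lookup_env_type('CartPole-v0', SAMPLE_ENVS))
print(lookup_env_type('atari', SAMPLE_ENVS))
print(lookup_env_type('my_pkg:MyEnv-v0', SAMPLE_ENVS))
```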
Parsing the input arguments:
def arg_parser():
    """
    Create an empty argparse.ArgumentParser.
    """
    import argparse
    return argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)

def common_arg_parser():
    """
    Create an argparse.ArgumentParser for run_mujoco.py.
    """
    parser = arg_parser()
    parser.add_argument('--env', help='environment ID', type=str, default='Reacher-v2')
    parser.add_argument('--env_type', help='type of environment, used when the environment type cannot be automatically determined', type=str)
    parser.add_argument('--seed', help='RNG seed', type=int, default=None)
    parser.add_argument('--alg', help='Algorithm', type=str, default='ppo2')
    parser.add_argument('--num_timesteps', type=float, default=1e6),
    parser.add_argument('--network', help='network type (mlp, cnn, lstm, cnn_lstm, conv_only)', default=None)
    parser.add_argument('--gamestate', help='game state to load (so far only used in retro games)', default=None)
    parser.add_argument('--num_env', help='Number of environment copies being run in parallel. When not specified, set to number of cpus for Atari, and to 1 for Mujoco', default=None, type=int)
    parser.add_argument('--reward_scale', help='Reward scale factor. Default: 1.0', default=1.0, type=float)
    parser.add_argument('--save_path', help='Path to save trained model to', default=None, type=str)
    parser.add_argument('--save_video_interval', help='Save video every x steps (0 = disabled)', default=0, type=int)
    parser.add_argument('--save_video_length', help='Length of recorded video. Default: 200', default=200, type=int)
    parser.add_argument('--log_path', help='Directory to save learning curve data.', default=None, type=str)
    parser.add_argument('--play', default=False, action='store_true')
    return parser

def parse_unknown_args(args):
    """
    Parse arguments not consumed by arg parser into a dictionary
    """
    retval = {}
    preceded_by_key = False
    for arg in args:
        if arg.startswith('--'):
            if '=' in arg:
                key = arg.split('=')[0][2:]
                value = arg.split('=')[1]
                retval[key] = value
            else:
                key = arg[2:]
                preceded_by_key = True
        elif preceded_by_key:
            retval[key] = arg
            preceded_by_key = False

    return retval

def parse_cmdline_kwargs(args):
    '''
    convert a list of '='-spaced command-line arguments to a dictionary, evaluating python objects when possible
    '''
    def parse(v):
        assert isinstance(v, str)
        try:
            return eval(v)
        except (NameError, SyntaxError):
            return v

    return {k: parse(v) for k,v in parse_unknown_args(args).items()}

arg_parser = common_arg_parser()
args, unknown_args = arg_parser.parse_known_args()
extra_args = parse_cmdline_kwargs(unknown_args)

print(args)
print(unknown_args)
print(extra_args)
Run:
python test.py --aaa=me --xxx=11.11 --abc=True --cde=1+99
Result:
Namespace(env='Reacher-v2', env_type=None, seed=None, alg='ppo2', num_timesteps=1000000.0, network=None, gamestate=None, num_env=None, reward_scale=1.0, save_path=None, save_video_interval=0, save_video_length=200, log_path=None, play=False)
['--aaa=me', '--xxx=11.11', '--abc=True', '--cde=1+99']
{'aaa': 'me', 'xxx': 11.11, 'abc': True, 'cde': 100}
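Here, parse_unknown_args converts the list of strings that the parser did not consume into a dict. In this case it turns the list:
['--aaa=me', '--xxx=11.11', '--abc=True', '--cde=1+99']
into the dictionary:
{'aaa': 'me', 'xxx': '11.11', 'abc': 'True', 'cde': '1+99'}
The expression {k: parse(v) for k,v in parse_unknown_args(args).items()}
then passes each string value through eval to turn it into the corresponding Python object, and leaves it as a string when eval raises NameError or SyntaxError. This gives the final parsed dictionary:
{'aaa': 'me', 'xxx': 11.11, 'abc': True, 'cde': 100}
That said, making this string parsing so elaborate has little value here. Parsing arguments that were never declared is a rather niche operation, and I have to say the baselines library is not written as concisely or efficiently as it could be.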
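Besides the --key=value form shown in the run above, the same logic also accepts the space-separated form --key value. The sketch below repeats the two parsing steps in a self-contained way to show both forms together. The flag names lr, network, and layers are hypothetical examples, not options defined by run.py.

```python
def parse_unknown_args(args):
    """Same logic as in run.py: collect '--key=value' and '--key value' pairs."""
    retval = {}
    preceded_by_key = False
    for arg in args:
        if arg.startswith('--'):
            if '=' in arg:
                key = arg.split('=')[0][2:]
                value = arg.split('=')[1]
                retval[key] = value
            else:
                key = arg[2:]
                preceded_by_key = True
        elif preceded_by_key:
            retval[key] = arg
            preceded_by_key = False
    return retval

def parse(v):
    # eval turns '3e-4' into a float and '[64, 64]' into a list; a bare word
    # such as 'mlp' raises NameError and is therefore kept as a string.
    try:
        return eval(v)
    except (NameError, SyntaxError):
        return v

# Hypothetical extra flags mixing both spellings.
raw = parse_unknown_args(['--lr', '3e-4', '--network=mlp', '--layers=[64, 64]'])
parsed = {k: parse(v) for k, v in raw.items()}

print(raw)
print(parsed)
```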
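Because baselines is compatible with the retro module, which makes it possible to install console games such as some Nintendo titles, the names of the retro games have to be added to the dictionary of supported environments:
_game_envs['retro'] = {
    'BubbleBobble-Nes',
    'SuperMarioBros-Nes',
    'TwinBee3PokoPokoDaimaou-Nes',
    'SpaceHarrier-Nes',
    'SonicTheHedgehog-Genesis',
    'Vectorman-Genesis',
    'FinalFight-Snes',
    'SpaceInvaders-Snes',
}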
Printing the complete set of environment types and environment names:
import gym
from collections import defaultdict

_game_envs = defaultdict(set)
for env in gym.envs.registry.all():
    # TODO: solve this with regexes
    env_type = env.entry_point.split(':')[0].split('.')[-1]
    _game_envs[env_type].add(env.id)

_game_envs['retro'] = {
    'BubbleBobble-Nes',
    'SuperMarioBros-Nes',
    'TwinBee3PokoPokoDaimaou-Nes',
    'SpaceHarrier-Nes',
    'SonicTheHedgehog-Genesis',
    'Vectorman-Genesis',
    'FinalFight-Snes',
    'SpaceInvaders-Snes',
}

for env_type in _game_envs:
    print(env_type, '\n', list(_game_envs[env_type]))
    print('......')
Printed output:
algorithmic
['DuplicatedInput-v0', 'ReversedAddition-v0', 'Copy-v0', 'RepeatCopy-v0', 'Reverse-v0', 'ReversedAddition3-v0']
......
classic_control
['CartPole-v1', 'CartPole-v0', 'MountainCar-v0', 'Acrobot-v1', 'MountainCarContinuous-v0', 'Pendulum-v0']
......
box2d
['LunarLander-v2', 'LunarLanderContinuous-v2', 'CarRacing-v0', 'BipedalWalkerHardcore-v3', 'BipedalWalker-v3']
......
toy_text
['FrozenLake-v0', 'KellyCoinflip-v0', 'Blackjack-v0', 'NChain-v0', 'GuessingGame-v0', 'FrozenLake8x8-v0', 'Taxi-v3', 'HotterColder-v0', 'CliffWalking-v0', 'Roulette-v0', 'KellyCoinflipGeneralized-v0']
......
mujoco
['Walker2d-v2', 'Swimmer-v2', 'Humanoid-v2', 'InvertedDoublePendulum-v2', 'Reacher-v2', 'Hopper-v2', 'HalfCheetah-v2', 'Striker-v2', 'InvertedPendulum-v2', 'Thrower-v2', 'HumanoidStandup-v2', 'Ant-v2', 'Pusher-v2']
......
half_cheetah_v3
['HalfCheetah-v3']
......
hopper_v3
['Hopper-v3']
......
swimmer_v3
['Swimmer-v3']
......
walker2d_v3
['Walker2d-v3']
......
ant_v3
['Ant-v3']
......
humanoid_v3
['Humanoid-v3']
......
robotics
['FetchPush-v1', 'HandManipulateBlockRotateXYZTouchSensorsDense-v1', 'HandManipulateBlockRotateParallelTouchSensorsDense-v0', 'HandManipulateEggRotateTouchSensors-v1', 'HandManipulateBlockTouchSensors-v0', 'HandManipulateEggTouchSensorsDense-v0', 'HandManipulateEggFullDense-v0', 'HandManipulatePenDense-v0', 'HandManipulateBlockTouchSensorsDense-v1', 'HandManipulateBlockRotateParallelTouchSensors-v1', 'HandManipulateEggFull-v0', 'FetchPickAndPlaceDense-v1', 'HandManipulateEggRotateTouchSensors-v0', 'HandManipulateBlockRotateXYZTouchSensors-v0', 'HandManipulateBlockRotateParallelTouchSensors-v0', 'HandManipulateEggTouchSensors-v1', 'FetchPushDense-v1', 'HandManipulateBlockRotateZTouchSensorsDense-v1', 'HandReach-v0', 'HandManipulatePenTouchSensors-v1', 'HandManipulateBlockRotateXYZDense-v0', 'HandManipulateEggRotateTouchSensorsDense-v1', 'HandManipulateBlockRotateXYZTouchSensorsDense-v0', 'HandManipulatePenFullDense-v0', 'FetchPickAndPlace-v1', 'HandManipulatePenRotateTouchSensors-v0', 'HandManipulateBlockRotateZDense-v0', 'HandManipulateEggRotateTouchSensorsDense-v0', 'HandManipulateBlockRotateParallelDense-v0', 'HandManipulateBlock-v0', 'HandManipulateEggTouchSensorsDense-v1', 'HandManipulateBlockRotateXYZTouchSensors-v1', 'FetchReachDense-v1', 'HandManipulatePenTouchSensorsDense-v0', 'HandManipulatePenTouchSensorsDense-v1', 'HandManipulateEgg-v0', 'HandManipulateEggDense-v0', 'HandManipulateBlockRotateZTouchSensors-v1', 'FetchReach-v1', 'HandManipulatePen-v0', 'HandManipulateBlockFullDense-v0', 'HandManipulateBlockFull-v0', 'HandManipulateBlockRotateZ-v0', 'HandManipulateBlockRotateZTouchSensors-v0', 'HandManipulatePenTouchSensors-v0', 'HandManipulateEggRotate-v0', 'HandReachDense-v0', 'HandManipulateBlockRotateXYZ-v0', 'HandManipulateBlockDense-v0', 'FetchSlide-v1', 'HandManipulateBlockRotateParallel-v0', 'FetchSlideDense-v1', 'HandManipulatePenRotate-v0', 'HandManipulatePenFull-v0', 'HandManipulateBlockRotateParallelTouchSensorsDense-v1', 
'HandManipulatePenRotateDense-v0', 'HandManipulateEggRotateDense-v0', 'HandManipulateBlockTouchSensors-v1', 'HandManipulatePenRotateTouchSensorsDense-v0', 'HandManipulateBlockTouchSensorsDense-v0', 'HandManipulatePenRotateTouchSensors-v1', 'HandManipulateEggTouchSensors-v0', 'HandManipulatePenRotateTouchSensorsDense-v1', 'HandManipulateBlockRotateZTouchSensorsDense-v0']
......
atari
['RoadRunner-ramDeterministic-v4', 'YarsRevenge-v4', 'TimePilot-ram-v0', 'Zaxxon-ramDeterministic-v0', 'AsteroidsNoFrameskip-v0', 'MsPacmanNoFrameskip-v0', 'SpaceInvaders-ramDeterministic-v4', 'Gopher-ramNoFrameskip-v0', 'Venture-v4', 'TutankhamNoFrameskip-v0', 'Skiing-ramDeterministic-v0', 'Phoenix-ram-v4', 'Asterix-v4', 'Pong-ram-v4', 'TimePilotNoFrameskip-v0', 'Assault-ramDeterministic-v4', 'Riverraid-ramDeterministic-v0', 'BerzerkDeterministic-v0', 'DefenderDeterministic-v4', 'Gopher-ramDeterministic-v0', 'Gravitar-ramNoFrameskip-v0', 'Skiing-ramDeterministic-v4', 'ChopperCommand-ramDeterministic-v0', 'Atlantis-ramDeterministic-v0', 'DefenderDeterministic-v0', 'BankHeistNoFrameskip-v0', 'ZaxxonDeterministic-v4', 'Solaris-ramNoFrameskip-v4', 'AmidarNoFrameskip-v0', 'HeroDeterministic-v0', 'AsterixDeterministic-v4', 'TimePilot-ramDeterministic-v4', 'IceHockeyDeterministic-v0', 'Centipede-ram-v0', 'BeamRiderNoFrameskip-v0', 'BeamRider-ramDeterministic-v4', 'AtlantisDeterministic-v0', 'MsPacmanDeterministic-v0', 'DefenderNoFrameskip-v4', 'Jamesbond-ramDeterministic-v4', 'Venture-ram-v4', 'QbertDeterministic-v0', 'Solaris-ramDeterministic-v0', 'Solaris-ramNoFrameskip-v0', 'Jamesbond-ramNoFrameskip-v0', 'Krull-v4', 'Tutankham-v4', 'WizardOfWor-ramDeterministic-v0', 'BattleZone-ramNoFrameskip-v4', 'FishingDerbyDeterministic-v4', 'RoadRunner-v4', 'TimePilot-ramDeterministic-v0', 'YarsRevengeDeterministic-v0', 'AsterixNoFrameskip-v4', 'BeamRiderDeterministic-v0', 'Bowling-ramNoFrameskip-v0', 'JourneyEscape-v0', 'TimePilot-v0', 'GopherDeterministic-v4', 'PhoenixDeterministic-v0', 'StarGunner-v4', 'Alien-v0', 'Pong-ramNoFrameskip-v4', 'Frostbite-ram-v4', 'Pooyan-ram-v4', 'Bowling-ramDeterministic-v0', 'RoadRunner-ram-v4', 'TimePilotNoFrameskip-v4', 'Phoenix-ram-v0', 'Assault-ramDeterministic-v0', 'Seaquest-ramNoFrameskip-v4', 'AlienNoFrameskip-v0', 'Defender-ram-v0', 'SolarisDeterministic-v4', 'Tutankham-v0', 'Kangaroo-ramDeterministic-v0', 'PhoenixDeterministic-v4', 
'VideoPinballDeterministic-v0', 'Freeway-ramDeterministic-v4', 'Asteroids-v0', 'JourneyEscape-ram-v0', 'StarGunner-ramNoFrameskip-v4', 'VideoPinball-v4', 'Carnival-ramNoFrameskip-v0', 'Assault-ram-v4', 'PrivateEye-ramDeterministic-v0', 'Venture-v0', 'PrivateEyeDeterministic-v0', 'Alien-ram-v4', 'ElevatorActionNoFrameskip-v0', 'MsPacman-ram-v0', 'Asterix-ram-v0', 'Bowling-ram-v4', 'MontezumaRevenge-ramNoFrameskip-v0', 'VentureDeterministic-v0', 'AmidarDeterministic-v0', 'ElevatorActionNoFrameskip-v4', 'SpaceInvaders-ram-v0', 'ChopperCommandNoFrameskip-v0', 'SeaquestDeterministic-v4', 'StarGunner-ramDeterministic-v0', 'SpaceInvadersDeterministic-v4', 'VideoPinballNoFrameskip-v0', 'MontezumaRevenge-ram-v4', 'SpaceInvaders-ram-v4', 'Frostbite-v4', 'Pong-ramDeterministic-v4', 'Adventure-ramNoFrameskip-v4', 'AdventureDeterministic-v0', 'KrullNoFrameskip-v4', 'NameThisGame-ramNoFrameskip-v0', 'Centipede-v0', 'WizardOfWor-ram-v0', 'AtlantisNoFrameskip-v4', 'AsteroidsDeterministic-v0', 'BeamRiderDeterministic-v4', 'Centipede-ramDeterministic-v4', 'YarsRevenge-ramDeterministic-v0', 'PooyanDeterministic-v0', 'DemonAttack-ramDeterministic-v4', 'RoadRunnerDeterministic-v4', 'PongNoFrameskip-v4', 'CrazyClimber-ram-v4', 'IceHockey-v4', 'Adventure-v0', 'Krull-v0', 'BerzerkNoFrameskip-v0', 'Freeway-v0', 'Amidar-ramNoFrameskip-v0', 'MsPacmanDeterministic-v4', 'GopherDeterministic-v0', 'YarsRevenge-v0', 'Pong-ramDeterministic-v0', 'Robotank-ramDeterministic-v0', 'AtlantisNoFrameskip-v0', 'TennisNoFrameskip-v4', 'VentureNoFrameskip-v4', 'BankHeist-ramDeterministic-v4', 'Gravitar-ramDeterministic-v4', 'PrivateEye-v4', 'StarGunner-ramNoFrameskip-v0', 'VideoPinball-v0', 'BreakoutDeterministic-v4', 'StarGunnerDeterministic-v4', 'KungFuMaster-ramNoFrameskip-v0', 'RoadRunner-v0', 'Krull-ramNoFrameskip-v0', 'AirRaidNoFrameskip-v4', 'Gravitar-ram-v0', 'Tutankham-ramNoFrameskip-v0', 'Adventure-ram-v4', 'FishingDerby-ramDeterministic-v0', 'PrivateEye-ram-v0', 'Boxing-ramNoFrameskip-v4', 
'KungFuMaster-ram-v4', 'ZaxxonNoFrameskip-v4', 'YarsRevenge-ram-v4', 'JourneyEscapeDeterministic-v4', 'BoxingNoFrameskip-v0', 'PrivateEye-ramNoFrameskip-v0', 'Zaxxon-ram-v4', 'RobotankNoFrameskip-v4', 'Amidar-v0', 'SolarisDeterministic-v0', 'Gopher-ram-v0', 'WizardOfWorDeterministic-v0', 'DoubleDunkNoFrameskip-v0', 'CentipedeNoFrameskip-v4', 'Berzerk-ramDeterministic-v0', 'Skiing-ram-v4', 'Gopher-ramDeterministic-v4', 'Amidar-ram-v4', 'Breakout-ramDeterministic-v4', 'Freeway-ramNoFrameskip-v0', 'NameThisGame-ramDeterministic-v4', 'Hero-v4', 'PrivateEye-ram-v4', 'Riverraid-v4', 'MontezumaRevenge-ramNoFrameskip-v4', 'Seaquest-ram-v0', 'CentipedeDeterministic-v0', 'Phoenix-ramNoFrameskip-v4', 'JamesbondDeterministic-v0', 'IceHockey-ramNoFrameskip-v0', 'Atlantis-ram-v4', 'Seaquest-ramDeterministic-v4', 'Adventure-ramDeterministic-v0', 'AssaultNoFrameskip-v0', 'BeamRider-ramNoFrameskip-v4', 'BerzerkNoFrameskip-v4', 'NameThisGameDeterministic-v4', 'SkiingNoFrameskip-v4', 'CrazyClimberDeterministic-v0', 'ZaxxonDeterministic-v0', 'Phoenix-ramDeterministic-v0', 'Boxing-ramDeterministic-v4', 'Pong-v4', 'SpaceInvadersNoFrameskip-v4', 'DoubleDunk-v4', 'PooyanNoFrameskip-v4', 'MsPacman-ramDeterministic-v4', 'RoadRunner-ramNoFrameskip-v4', 'Enduro-ramNoFrameskip-v0', 'Freeway-ram-v4', 'Berzerk-v0', 'Zaxxon-ramNoFrameskip-v4', 'Enduro-v0', 'Freeway-ram-v0', 'FishingDerby-ramDeterministic-v4', 'MsPacman-ram-v4', 'TutankhamDeterministic-v0', 'UpNDown-ram-v4', 'Venture-ram-v0', 'Assault-v0', 'DemonAttack-v0', 'TimePilotDeterministic-v0', 'RoadRunner-ramNoFrameskip-v0', 'BattleZoneDeterministic-v0', 'DemonAttack-ramNoFrameskip-v4', 'Boxing-ramDeterministic-v0', 'MontezumaRevengeNoFrameskip-v4', 'DemonAttack-v4', 'PrivateEyeDeterministic-v4', 'MontezumaRevenge-v0', 'AssaultDeterministic-v4', 'SpaceInvadersNoFrameskip-v0', 'Robotank-ram-v0', 'WizardOfWorNoFrameskip-v0', 'Asteroids-ramDeterministic-v0', 'CrazyClimberNoFrameskip-v0', 'DoubleDunk-ramNoFrameskip-v4', 
'Zaxxon-ramNoFrameskip-v0', 'PooyanDeterministic-v4', 'RobotankDeterministic-v4', 'Krull-ram-v0', 'Carnival-ramDeterministic-v0', 'BankHeistDeterministic-v4', 'IceHockey-ramDeterministic-v0', 'Jamesbond-ram-v0', 'StarGunnerNoFrameskip-v0', 'BoxingNoFrameskip-v4', 'Gopher-v4', 'AlienDeterministic-v0', 'Zaxxon-ramDeterministic-v4', 'VideoPinball-ramNoFrameskip-v4', 'TimePilot-v4', 'NameThisGame-ram-v0', 'PitfallDeterministic-v0', 'GravitarDeterministic-v4', 'DoubleDunk-ramNoFrameskip-v0', 'YarsRevenge-ram-v0', 'PhoenixNoFrameskip-v4', 'Gravitar-v4', 'Pooyan-ramDeterministic-v4', 'Berzerk-ramDeterministic-v4', 'RiverraidDeterministic-v0', 'CrazyClimber-ramNoFrameskip-v0', 'KungFuMasterDeterministic-v0', 'Phoenix-ramNoFrameskip-v0', 'StarGunner-ram-v0', 'BeamRider-ram-v4', 'Defender-ramDeterministic-v0', 'MsPacman-ramNoFrameskip-v0', 'Pitfall-ramDeterministic-v0', 'BoxingDeterministic-v0', 'Enduro-ram-v4', 'Adventure-ramDeterministic-v4', 'Riverraid-ramNoFrameskip-v0', 'Enduro-ramDeterministic-v0', 'Boxing-v4', 'Skiing-v0', 'BattleZoneNoFrameskip-v0', 'Pitfall-ramNoFrameskip-v0', 'BowlingNoFrameskip-v0', 'DoubleDunkDeterministic-v0', 'ElevatorAction-ramDeterministic-v0', 'Boxing-ram-v0', 'WizardOfWor-ramNoFrameskip-v4', 'MsPacman-ramNoFrameskip-v4', 'Defender-v4', 'Bowling-v4', 'Berzerk-ramNoFrameskip-v4', 'MontezumaRevenge-ram-v0', 'AtlantisDeterministic-v4', 'GravitarDeterministic-v0', 'FreewayDeterministic-v0', 'Breakout-ram-v4', 'CrazyClimberDeterministic-v4', 'Zaxxon-v4', 'EnduroDeterministic-v0', 'KungFuMaster-ramNoFrameskip-v4', 'NameThisGame-ram-v4', 'Breakout-ramNoFrameskip-v0', 'ElevatorAction-ramNoFrameskip-v4', 'Carnival-v0', 'Pooyan-ram-v0', 'BankHeist-v0', 'NameThisGameDeterministic-v0', 'MontezumaRevenge-ramDeterministic-v4', 'DoubleDunk-ram-v4', 'Asteroids-v4', 'Alien-ramNoFrameskip-v0', 'Breakout-ramNoFrameskip-v4', 'AmidarDeterministic-v4', 'SpaceInvaders-v4', 'Boxing-v0', 'Riverraid-ram-v0', 'UpNDownDeterministic-v0', 'Defender-ramNoFrameskip-v0', 
'StarGunner-ram-v4', 'MsPacmanNoFrameskip-v4', 'Skiing-ramNoFrameskip-v4', 'Tennis-v4', 'SolarisNoFrameskip-v4', 'TutankhamNoFrameskip-v4', 'ChopperCommand-ramNoFrameskip-v0', 'Gravitar-ramNoFrameskip-v4', 'VideoPinball-ram-v4', 'Qbert-ramDeterministic-v4', 'Pitfall-ramNoFrameskip-v4', 'Pooyan-ramDeterministic-v0', 'MsPacman-ramDeterministic-v0', 'Enduro-v4', 'Freeway-ramNoFrameskip-v4', 'JourneyEscape-ramDeterministic-v0', 'Alien-ram-v0', 'FreewayDeterministic-v4', 'Qbert-v0', 'Kangaroo-ramNoFrameskip-v4', 'Atlantis-ramNoFrameskip-v0', 'Frostbite-ramNoFrameskip-v0', 'Hero-ramNoFrameskip-v4', 'YarsRevengeNoFrameskip-v4', 'SkiingDeterministic-v0', 'Gopher-v0', 'PitfallDeterministic-v4', 'Carnival-v4', 'Riverraid-ram-v4', 'FishingDerby-ramNoFrameskip-v0', 'AirRaidDeterministic-v4', 'Pitfall-ram-v4', 'BattleZone-v4', 'Tennis-ram-v0', 'YarsRevengeNoFrameskip-v0', 'BattleZone-ram-v0', 'AirRaid-ramNoFrameskip-v0', 'Qbert-ramNoFrameskip-v0', 'TennisDeterministic-v4', 'Asteroids-ram-v4', 'AsteroidsDeterministic-v4', 'BowlingDeterministic-v4', 'Asterix-ramNoFrameskip-v0', 'Qbert-ramNoFrameskip-v4', 'ElevatorAction-v4', 'MontezumaRevenge-ramDeterministic-v0', 'BeamRider-ramNoFrameskip-v0', 'BankHeist-ramNoFrameskip-v4', 'ElevatorActionDeterministic-v0', 'FishingDerby-v0', 'FishingDerbyNoFrameskip-v0', 'Frostbite-ram-v0', 'JamesbondDeterministic-v4', 'Phoenix-v4', 'Pooyan-v0', 'NameThisGame-ramDeterministic-v0', 'Riverraid-ramDeterministic-v4', 'JamesbondNoFrameskip-v4', 'Kangaroo-ram-v0', 'Gopher-ramNoFrameskip-v4', 'Centipede-ramDeterministic-v0', 'Carnival-ram-v0', 'AlienNoFrameskip-v4', 'BowlingNoFrameskip-v4', 'Breakout-v4', 'KungFuMasterNoFrameskip-v4', 'Amidar-ramDeterministic-v4', 'Asterix-v0', 'PongDeterministic-v0', 'UpNDown-ramDeterministic-v0', 'FreewayNoFrameskip-v4', 'Gravitar-v0', 'BattleZone-ramDeterministic-v4', 'Bowling-ramDeterministic-v4', 'Frostbite-ramNoFrameskip-v4', 'Gopher-ram-v4', 'Asteroids-ramNoFrameskip-v4', 'BankHeist-ram-v4', 'Centipede-ram-v4', 
'Kangaroo-ramDeterministic-v4', 'Pong-v0', 'SeaquestNoFrameskip-v0', 'Jamesbond-v4', 'FishingDerby-v4', 'Krull-ram-v4', 'BeamRider-v0', 'TennisDeterministic-v0', 'ChopperCommandDeterministic-v4', 'QbertNoFrameskip-v4', 'BattleZone-ramDeterministic-v0', 'RoadRunnerNoFrameskip-v4', 'Atlantis-v0', 'BankHeist-v4', 'TennisNoFrameskip-v0', 'SpaceInvaders-v0', 'DemonAttackNoFrameskip-v4', 'BattleZoneDeterministic-v4', 'Tutankham-ramDeterministic-v4', 'KungFuMaster-v4', 'Boxing-ram-v4', 'SkiingNoFrameskip-v0', 'Pong-ram-v0', 'Asterix-ramDeterministic-v0', 'JamesbondNoFrameskip-v0', 'StarGunner-v0', 'WizardOfWor-ramDeterministic-v4', 'FrostbiteNoFrameskip-v0', 'FishingDerbyDeterministic-v0', 'QbertNoFrameskip-v0', 'KungFuMaster-ram-v0', 'Atlantis-v4', 'CrazyClimberNoFrameskip-v4', 'Defender-ram-v4', 'JourneyEscape-v4', 'Zaxxon-v0', 'DoubleDunk-ramDeterministic-v0', 'YarsRevenge-ramNoFrameskip-v0', 'BattleZone-ram-v4', 'Hero-ram-v4', 'KungFuMaster-ramDeterministic-v4', 'CentipedeDeterministic-v4', 'Seaquest-ram-v4', 'ChopperCommand-ram-v0', 'ElevatorAction-v0', 'Atlantis-ram-v0', 'Skiing-v4', 'Robotank-ramDeterministic-v4', 'Pitfall-v0', 'Solaris-ram-v4', 'Tennis-ramNoFrameskip-v4', 'Venture-ramDeterministic-v0', 'KrullNoFrameskip-v0', 'DemonAttackNoFrameskip-v0', 'MontezumaRevengeDeterministic-v0', 'EnduroNoFrameskip-v0', 'Freeway-v4', 'CrazyClimber-ram-v0', 'FishingDerby-ramNoFrameskip-v4', 'RiverraidNoFrameskip-v4', 'Berzerk-ramNoFrameskip-v0', 'Venture-ramNoFrameskip-v0', 'Adventure-v4', 'PrivateEye-ramNoFrameskip-v4', 'UpNDown-ramNoFrameskip-v0', 'BreakoutNoFrameskip-v4', 'ElevatorActionDeterministic-v4', 'BeamRiderNoFrameskip-v4', 'DemonAttackDeterministic-v4', 'Kangaroo-ram-v4', 'Phoenix-ramDeterministic-v4', 'AirRaidDeterministic-v0', 'Hero-v0', 'UpNDown-ramDeterministic-v4', 'Defender-ramNoFrameskip-v4', 'DoubleDunk-ramDeterministic-v4', 'Centipede-v4', 'ChopperCommand-v4', 'Frostbite-ramDeterministic-v0', 'SeaquestNoFrameskip-v4', 'SolarisNoFrameskip-v0', 
'AirRaid-v4', 'HeroNoFrameskip-v4', 'FishingDerby-ram-v0', 'Gravitar-ramDeterministic-v0', 'KungFuMasterDeterministic-v4', 'Pong-ramNoFrameskip-v0', 'IceHockey-ramDeterministic-v4', 'Kangaroo-ramNoFrameskip-v0', 'UpNDown-ram-v0', 'BeamRider-ram-v0', 'Hero-ramDeterministic-v0', 'Jamesbond-v0', 'Tennis-ramNoFrameskip-v0', 'Freeway-ramDeterministic-v0', 'WizardOfWor-v0', 'Tennis-ram-v4', 'FrostbiteNoFrameskip-v4', 'Skiing-ram-v0', 'NameThisGameNoFrameskip-v4', 'Alien-ramDeterministic-v0', 'Pitfall-ramDeterministic-v4', 'IceHockey-ramNoFrameskip-v4', 'Solaris-v0', 'Asterix-ram-v4', 'Adventure-ram-v0', 'HeroNoFrameskip-v0', 'Frostbite-ramDeterministic-v4', 'Bowling-ramNoFrameskip-v4', 'VideoPinballDeterministic-v4', 'VideoPinball-ramDeterministic-v0', 'DefenderNoFrameskip-v0', 'TimePilotDeterministic-v4', 'UpNDown-v0', 'Seaquest-v0', 'QbertDeterministic-v4', 'FreewayNoFrameskip-v0', 'VideoPinball-ramDeterministic-v4', 'Kangaroo-v4', 'Alien-ramNoFrameskip-v4', 'KrullDeterministic-v4', 'PongDeterministic-v4', 'RoadRunnerDeterministic-v0', 'GravitarNoFrameskip-v4', 'AlienDeterministic-v4', 'KungFuMasterNoFrameskip-v0', 'Hero-ramDeterministic-v4', 'Asterix-ramNoFrameskip-v4', 'KrullDeterministic-v0', 'AirRaid-ram-v4', 'MontezumaRevengeDeterministic-v4', 'Qbert-ramDeterministic-v0', 'Asteroids-ramNoFrameskip-v0', 'StarGunner-ramDeterministic-v4', 'GravitarNoFrameskip-v0', 'WizardOfWor-ram-v4', 'YarsRevenge-ramDeterministic-v4', 'DemonAttack-ramDeterministic-v0', 'NameThisGame-ramNoFrameskip-v4', 'PooyanNoFrameskip-v0', 'Enduro-ram-v0', 'Robotank-ramNoFrameskip-v0', 'Asteroids-ramDeterministic-v4', 'Carnival-ramDeterministic-v4', 'Berzerk-v4', 'JourneyEscape-ramDeterministic-v4', 'Kangaroo-v0', 'BattleZone-ramNoFrameskip-v0', 'NameThisGame-v0', 'Assault-ramNoFrameskip-v0', 'StarGunnerNoFrameskip-v4', 'CentipedeNoFrameskip-v0', 'Venture-ramDeterministic-v4', 'Frostbite-v0', 'JourneyEscape-ramNoFrameskip-v0', 'Krull-ramDeterministic-v0', 'VideoPinball-ramNoFrameskip-v0', 
'IceHockeyDeterministic-v4', 'AdventureDeterministic-v4', 'Carnival-ram-v4', 'YarsRevenge-ramNoFrameskip-v4', 'CarnivalDeterministic-v4', 'UpNDownNoFrameskip-v4', 'FishingDerbyNoFrameskip-v4', 'Riverraid-v0', 'Robotank-ramNoFrameskip-v4', 'CrazyClimber-v0', 'RiverraidNoFrameskip-v0', 'AsterixDeterministic-v0', 'WizardOfWor-v4', 'AdventureNoFrameskip-v0', 'Solaris-ram-v0', 'Seaquest-ramNoFrameskip-v0', 'Tennis-ramDeterministic-v0', 'DoubleDunkNoFrameskip-v4', 'VentureDeterministic-v4', 'AsterixNoFrameskip-v0', 'Qbert-ram-v4', 'MontezumaRevengeNoFrameskip-v0', 'RoadRunner-ramDeterministic-v0', 'Phoenix-v0', 'BreakoutDeterministic-v0', 'WizardOfWorDeterministic-v4', 'ZaxxonNoFrameskip-v0', 'KangarooDeterministic-v0', 'Alien-v4', 'MsPacman-v4', 'Assault-ramNoFrameskip-v4', 'RobotankNoFrameskip-v0', 'MontezumaRevenge-v4', 'Solaris-v4', 'Tennis-v0', 'Amidar-ram-v0', 'SpaceInvadersDeterministic-v0', 'TutankhamDeterministic-v4', 'DoubleDunk-ram-v0', 'NameThisGameNoFrameskip-v0', 'JourneyEscapeNoFrameskip-v0', 'PitfallNoFrameskip-v4', 'TimePilot-ram-v4', 'Tutankham-ram-v4', 'CrazyClimber-v4', 'CrazyClimber-ramNoFrameskip-v4', 'Enduro-ramNoFrameskip-v4', 'Breakout-ramDeterministic-v0', 'BankHeistDeterministic-v0', 'ChopperCommandNoFrameskip-v4', 'Defender-v0', 'Pitfall-v4', 'BeamRider-v4', 'AmidarNoFrameskip-v4', 'Solaris-ramDeterministic-v4', 'ElevatorAction-ram-v0', 'SpaceInvaders-ramDeterministic-v0', 'BankHeistNoFrameskip-v4', 'JourneyEscape-ramNoFrameskip-v4', 'EnduroNoFrameskip-v4', 'Tutankham-ramNoFrameskip-v4', 'KungFuMaster-ramDeterministic-v0', 'Berzerk-ram-v0', 'UpNDownNoFrameskip-v0', 'IceHockey-v0', 'SpaceInvaders-ramNoFrameskip-v4', 'Jamesbond-ramNoFrameskip-v4', 'Bowling-ram-v0', 'PrivateEyeNoFrameskip-v4', 'ElevatorAction-ram-v4', 'WizardOfWor-ramNoFrameskip-v0', 'Seaquest-ramDeterministic-v0', 'Pooyan-ramNoFrameskip-v4', 'Zaxxon-ram-v0', 'Atlantis-ramNoFrameskip-v4', 'UpNDownDeterministic-v4', 'CarnivalNoFrameskip-v4', 'ChopperCommand-v0', 
'JourneyEscapeDeterministic-v0', 'FrostbiteDeterministic-v4', 'TimePilot-ramNoFrameskip-v4', 'Assault-v4', 'Enduro-ramDeterministic-v4', 'ChopperCommand-ramNoFrameskip-v4', 'KungFuMaster-v0', 'DemonAttack-ram-v0', 'BattleZone-v0', 'Centipede-ramNoFrameskip-v4', 'Venture-ramNoFrameskip-v4', 'UpNDown-v4', 'VideoPinballNoFrameskip-v4', 'BreakoutNoFrameskip-v0', 'AirRaid-ramNoFrameskip-v4', 'Centipede-ramNoFrameskip-v0', 'ChopperCommand-ram-v4', 'Alien-ramDeterministic-v4', 'IceHockeyNoFrameskip-v4', 'AirRaidNoFrameskip-v0', 'Qbert-v4', 'SpaceInvaders-ramNoFrameskip-v0', 'Asteroids-ram-v0', 'Pooyan-ramNoFrameskip-v0', 'Qbert-ram-v0', 'CrazyClimber-ramDeterministic-v4', 'DemonAttackDeterministic-v0', 'AirRaid-v0', 'BerzerkDeterministic-v4', 'Tennis-ramDeterministic-v4', 'KangarooDeterministic-v4', 'FishingDerby-ram-v4', 'CarnivalDeterministic-v0', 'VideoPinball-ram-v0', 'HeroDeterministic-v4', 'ElevatorAction-ramNoFrameskip-v0', 'Krull-ramNoFrameskip-v4', 'UpNDown-ramNoFrameskip-v4', 'Pooyan-v4', 'ChopperCommandDeterministic-v0', 'IceHockey-ram-v0', 'StarGunnerDeterministic-v0', 'TimePilot-ramNoFrameskip-v0', 'ElevatorAction-ramDeterministic-v4', 'Jamesbond-ram-v4', 'Hero-ramNoFrameskip-v0', 'Bowling-v0', 'BowlingDeterministic-v0', 'EnduroDeterministic-v4', 'Riverraid-ramNoFrameskip-v4', 'RiverraidDeterministic-v4', 'Breakout-ram-v0', 'Robotank-v0', 'Atlantis-ramDeterministic-v4', 'Assault-ram-v0', 'Seaquest-v4', 'PhoenixNoFrameskip-v0', 'Defender-ramDeterministic-v4', 'Krull-ramDeterministic-v4', 'BankHeist-ramNoFrameskip-v0', 'NameThisGame-v4', 'AirRaid-ram-v0', 'Berzerk-ram-v4', 'Boxing-ramNoFrameskip-v0', 'AssaultDeterministic-v0', 'Asterix-ramDeterministic-v4', 'Jamesbond-ramDeterministic-v0', 'Breakout-v0', 'DemonAttack-ramNoFrameskip-v0', 'DoubleDunkDeterministic-v4', 'CrazyClimber-ramDeterministic-v0', 'CarnivalNoFrameskip-v0', 'PrivateEyeNoFrameskip-v0', 'BeamRider-ramDeterministic-v0', 'Robotank-v4', 'SkiingDeterministic-v4', 'AirRaid-ramDeterministic-v4', 
'PitfallNoFrameskip-v0', 'Skiing-ramNoFrameskip-v0', 'AsteroidsNoFrameskip-v4', 'Carnival-ramNoFrameskip-v4', 'RoadRunnerNoFrameskip-v0', 'PrivateEye-ramDeterministic-v4', 'KangarooNoFrameskip-v0', 'AirRaid-ramDeterministic-v0', 'JourneyEscapeNoFrameskip-v4', 'BattleZoneNoFrameskip-v4', 'DemonAttack-ram-v4', 'JourneyEscape-ram-v4', 'Tutankham-ram-v0', 'ChopperCommand-ramDeterministic-v4', 'GopherNoFrameskip-v4', 'IceHockey-ram-v4', 'RoadRunner-ram-v0', 'Robotank-ram-v4', 'WizardOfWorNoFrameskip-v4', 'Hero-ram-v0', 'SeaquestDeterministic-v0', 'PrivateEye-v0', 'Gravitar-ram-v4', 'Amidar-v4', 'AssaultNoFrameskip-v4', 'GopherNoFrameskip-v0', 'MsPacman-v0', 'VentureNoFrameskip-v0', 'KangarooNoFrameskip-v4', 'YarsRevengeDeterministic-v4', 'Tutankham-ramDeterministic-v0', 'Adventure-ramNoFrameskip-v0', 'BankHeist-ram-v0', 'DoubleDunk-v0', 'Pitfall-ram-v0', 'PongNoFrameskip-v0', 'AdventureNoFrameskip-v4', 'IceHockeyNoFrameskip-v0', 'Amidar-ramNoFrameskip-v4', 'FrostbiteDeterministic-v0', 'BoxingDeterministic-v4', 'RobotankDeterministic-v0', 'Amidar-ramDeterministic-v0', 'BankHeist-ramDeterministic-v0']
......
unittest
['MemorizeDigits-v0', 'CubeCrash-v0', 'CubeCrashScreenBecomesBlack-v0', 'CubeCrashSparse-v0']
......
retro
['SuperMarioBros-Nes', 'Vectorman-Genesis', 'FinalFight-Snes', 'SonicTheHedgehog-Genesis', 'TwinBee3PokoPokoDaimaou-Nes', 'SpaceHarrier-Nes', 'SpaceInvaders-Snes', 'BubbleBobble-Nes']
......
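The main function mainly calls the train function.
Inside train, the call that actually runs the training is:
model = learn(
    env=env,
    seed=seed,
    total_timesteps=total_timesteps,
    **alg_kwargs
)
If the deepq algorithm is selected, this learn function is the learn function of the deepq module. In general, whichever algorithm module is specified, the learn that gets called is the one defined in that module.
learn trains for the specified number of timesteps and returns the trained model, which is the model variable here. model is itself a class instance; its step function takes an observation as input, and the action it produces is an np.array.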
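The per-algorithm lookup of learn can be pictured as importing a module named after the algorithm and reading its learn attribute. The sketch below is only an illustration under stated assumptions, not the baselines source: the package name fake_algs, the helpers register_fake_algorithm and resolve_learn, and the dictionary returned in place of a trained model are all hypothetical, and the fake modules are placed in sys.modules so the example runs without baselines installed.

```python
import sys
import types
from importlib import import_module


def register_fake_algorithm(package, name):
    """Register a throwaway module exposing learn() (hypothetical stand-in)."""
    if package not in sys.modules:
        parent = types.ModuleType(package)
        parent.__path__ = []  # mark the parent as a package
        sys.modules[package] = parent
    module = types.ModuleType(package + "." + name)

    def learn(env, seed=None, total_timesteps=0, **alg_kwargs):
        # A real learn() would train here; this one only reports its inputs.
        return {"alg": name, "env": env, "seed": seed,
                "total_timesteps": total_timesteps, "extra": alg_kwargs}

    module.learn = learn
    sys.modules[module.__name__] = module


def resolve_learn(alg, package="fake_algs"):
    """Return the learn() function of the module named after the algorithm."""
    return import_module(package + "." + alg).learn


register_fake_algorithm("fake_algs", "deepq")
register_fake_algorithm("fake_algs", "ppo2")

learn = resolve_learn("deepq")
model = learn(env="CartPole-v0", seed=0, total_timesteps=100, lr=1e-3)
print(model["alg"])  # deepq
```

The caller never names the algorithm's function directly; changing the alg string is enough to switch which learn is used.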
Recording video of the game screen in train:
if args.save_video_interval != 0:
    env = VecVideoRecorder(env, osp.join(logger.get_dir(), "videos"), record_video_trigger=lambda x: x % args.save_video_interval == 0, video_length=args.save_video_length)
gym's video recording is not very convenient to use, so saving frames with OpenCV is recommended instead.
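The record_video_trigger callback decides at which step counts a new recording starts. The sketch below isolates just that predicate, without gym or VecVideoRecorder; the helper name make_trigger and the interval of 1000 are assumptions chosen for illustration.

```python
def make_trigger(save_video_interval):
    """Return a predicate like the lambda passed as record_video_trigger."""
    return lambda step: step % save_video_interval == 0


trigger = make_trigger(1000)
recording_starts = [step for step in range(0, 3001, 500) if trigger(step)]
print(recording_starts)  # [0, 1000, 2000, 3000]
```

The `!= 0` guard in train matters because an interval of 0 would make the modulo raise ZeroDivisionError.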
========================================
The other key operation in this module is creating the game environment, which is done by the build_env function.
def build_env(args):
    ncpu = multiprocessing.cpu_count()
    if sys.platform == 'darwin': ncpu //= 2
    nenv = args.num_env or ncpu
    alg = args.alg
    seed = args.seed

    env_type, env_id = get_env_type(args)

    if env_type in {'atari', 'retro'}:
        if alg == 'deepq':
            env = make_env(env_id, env_type, seed=seed, wrapper_kwargs={'frame_stack': True})
        elif alg == 'trpo_mpi':
            env = make_env(env_id, env_type, seed=seed)
        else:
            frame_stack_size = 4
            env = make_vec_env(env_id, env_type, nenv, seed, gamestate=args.gamestate, reward_scale=args.reward_scale)
            env = VecFrameStack(env, frame_stack_size)

    else:
        config = tf.ConfigProto(allow_soft_placement=True,
                                intra_op_parallelism_threads=1,
                                inter_op_parallelism_threads=1)
        config.gpu_options.allow_growth = True
        get_session(config=config)

        flatten_dict_observations = alg not in {'her'}
        env = make_vec_env(env_id, env_type, args.num_env or 1, seed, reward_scale=args.reward_scale, flatten_dict_observations=flatten_dict_observations)

        if env_type == 'mujoco':
            env = VecNormalize(env, use_tf=True)

    return env
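The branching in build_env is easier to follow when reduced to a pure function that only names the constructors each branch would call. This is a simplified sketch: describe_build_env is a hypothetical helper, it builds nothing, and the returned strings are labels rather than real calls.

```python
def describe_build_env(env_type, alg, num_env, ncpu):
    """Name the constructor path build_env would take; nothing is created."""
    nenv = num_env or ncpu  # num_env of None or 0 falls back to the CPU count
    if env_type in {'atari', 'retro'}:
        if alg == 'deepq':
            return ['make_env(frame_stack=True)']
        if alg == 'trpo_mpi':
            return ['make_env']
        return ['make_vec_env(nenv=%d)' % nenv, 'VecFrameStack(4)']
    # Every other environment type: a vectorized env, one copy by default.
    flatten = alg not in {'her'}
    steps = ['make_vec_env(nenv=%d, flatten_dict_observations=%s)'
             % (num_env or 1, flatten)]
    if env_type == 'mujoco':
        steps.append('VecNormalize')
    return steps


print(describe_build_env('atari', 'ppo2', None, 8))
print(describe_build_env('mujoco', 'ppo2', None, 8))
print(describe_build_env('robotics', 'her', 2, 8))
```

Note the two different fallbacks: atari and retro environments default to one environment per CPU, while every other type defaults to a single environment.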
If the environment is not in {'atari', 'retro'} and the algorithm is not 'her' (and the environment is not a 'mujoco' one), the environment is created as:
flatten_dict_observations = True
env = make_vec_env(env_id, env_type, args.num_env or 1, seed, reward_scale=args.reward_scale, flatten_dict_observations=flatten_dict_observations)
If the environment is not in {'atari', 'retro'} and the algorithm is 'her' (and the environment is not a 'mujoco' one), the environment is created as:
flatten_dict_observations = False
env = make_vec_env(env_id, env_type, args.num_env or 1, seed, reward_scale=args.reward_scale, flatten_dict_observations=flatten_dict_observations)
A gym environment that is not atari, retro, or mujoco is treated here as a plain native gym environment, and all such environments are created with the same call:
env = make_vec_env(env_id, env_type, args.num_env or 1, seed, reward_scale=args.reward_scale, flatten_dict_observations=flatten_dict_observations)
The only difference is whether the algorithm is 'her': if it is, flatten_dict_observations is set to False, otherwise to True.
I do not fully understand why flatten_dict_observations is set this way, but its effect is that for a native gym environment used with any algorithm other than 'her', the observation is flattened. The flattening is done by gym's FlattenObservation wrapper.
The FlattenObservation wrapper is defined as:
import numpy as np
import gym.spaces as spaces
from gym import ObservationWrapper


class FlattenObservation(ObservationWrapper):
    r"""Observation wrapper that flattens the observation."""
    def __init__(self, env):
        super(FlattenObservation, self).__init__(env)

        flatdim = spaces.flatdim(env.observation_space)
        self.observation_space = spaces.Box(low=-float('inf'), high=float('inf'), shape=(flatdim,), dtype=np.float32)

    def observation(self, observation):
        return spaces.flatten(self.env.observation_space, observation)


# The two helpers below are defined in gym/spaces/utils.py and are what
# spaces.flatdim and spaces.flatten refer to above.
from gym.spaces import Box, Discrete, Tuple, Dict, MultiBinary, MultiDiscrete


def flatdim(space):
    if isinstance(space, Box):
        return int(np.prod(space.shape))
    elif isinstance(space, Discrete):
        return int(space.n)
    elif isinstance(space, Tuple):
        return int(sum([flatdim(s) for s in space.spaces]))
    elif isinstance(space, Dict):
        return int(sum([flatdim(s) for s in space.spaces.values()]))
    elif isinstance(space, MultiBinary):
        return int(space.n)
    elif isinstance(space, MultiDiscrete):
        return int(np.prod(space.shape))
    else:
        raise NotImplementedError


def flatten(space, x):
    if isinstance(space, Box):
        return np.asarray(x, dtype=np.float32).flatten()
    elif isinstance(space, Discrete):
        onehot = np.zeros(space.n, dtype=np.float32)
        onehot[x] = 1.0
        return onehot
    elif isinstance(space, Tuple):
        return np.concatenate([flatten(s, x_part) for x_part, s in zip(x, space.spaces)])
    elif isinstance(space, Dict):
        return np.concatenate([flatten(s, x[key]) for key, s in space.spaces.items()])
    elif isinstance(space, MultiBinary):
        return np.asarray(x).flatten()
    elif isinstance(space, MultiDiscrete):
        return np.asarray(x).flatten()
    else:
        raise NotImplementedError
This step flattens the observation into a one-dimensional vector, which matters most when the observation space is a Tuple or a Dict.
It can flatten observations whose space is any of Box, Discrete, Tuple, Dict, MultiBinary, or MultiDiscrete.
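To make the effect concrete, here is a simplified stand-in for flatten that uses plain Python lists instead of gym spaces and NumPy arrays. The space encoding used here (`('box', shape)`, `('discrete', n)`, `('dict', {...})`) is invented for this sketch and is not gym's API; it only mirrors the Box, Discrete, and Dict branches.

```python
def flatten(space, x):
    """Flatten an observation into one flat list of floats (toy version)."""
    kind, spec = space
    if kind == 'box':
        # Walk nested lists and emit their values in order.
        if isinstance(x, (list, tuple)):
            return [v for item in x for v in flatten(space, item)]
        return [float(x)]
    if kind == 'discrete':
        # A discrete value becomes a one-hot vector of length n.
        onehot = [0.0] * spec
        onehot[x] = 1.0
        return onehot
    if kind == 'dict':
        # Concatenate the flattened sub-observations in key order.
        return [v for key, sub in spec.items() for v in flatten(sub, x[key])]
    raise NotImplementedError(kind)


space = ('dict', {'position': ('box', (2, 2)), 'gear': ('discrete', 3)})
obs = {'position': [[0.5, -1.0], [2.0, 3.0]], 'gear': 1}
flat = flatten(space, obs)
print(flat)  # [0.5, -1.0, 2.0, 3.0, 0.0, 1.0, 0.0]
```

The result has 4 + 3 = 7 entries, which is what flatdim would report for this space: the product of the Box shape plus n for the Discrete part.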
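However, the code that actually applies the flattening is:
if flatten_dict_observations and isinstance(env.observation_space, gym.spaces.Dict):
    env = FlattenObservation(env)
So in practice only observations whose space is a Dict are flattened.
Note that both make_env and make_vec_env default to flatten_dict_observations=True, so flattening is skipped only when build_env explicitly passes False, which happens when the 'her' algorithm is used. Since I am not very familiar with 'her', I do not analyze it further here.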
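The guard can be exercised in isolation with stand-in classes. Everything in this sketch (DictSpace, TupleSpace, FakeEnv, Flattened, maybe_flatten) is hypothetical and replaces the gym types only so the example runs without gym installed.

```python
class DictSpace:
    """Stand-in for gym.spaces.Dict."""


class TupleSpace:
    """Stand-in for gym.spaces.Tuple."""


class FakeEnv:
    def __init__(self, observation_space):
        self.observation_space = observation_space


class Flattened:
    """Stand-in for FlattenObservation; it only records the wrapped env."""
    def __init__(self, env):
        self.env = env


def maybe_flatten(env, flatten_dict_observations=True):
    # Same condition as the guard: the flag must be set AND the space must be a Dict.
    if flatten_dict_observations and isinstance(env.observation_space, DictSpace):
        return Flattened(env)
    return env


dict_env = FakeEnv(DictSpace())
tuple_env = FakeEnv(TupleSpace())

print(type(maybe_flatten(dict_env)).__name__)         # Flattened
print(type(maybe_flatten(tuple_env)).__name__)        # FakeEnv
print(type(maybe_flatten(dict_env, False)).__name__)  # FakeEnv
```

The last call corresponds to the 'her' case: the Dict observation is passed through unchanged.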
=================================================
If the environment is not in {'atari', 'retro'} and is a 'mujoco' environment, it is created as:
flatten_dict_observations = alg not in {'her'}
env = make_vec_env(env_id, env_type, args.num_env or 1, seed, reward_scale=args.reward_scale, flatten_dict_observations=flatten_dict_observations)
if env_type == 'mujoco':
    env = VecNormalize(env, use_tf=True)
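The extra VecNormalize wrapper rescales values using statistics collected while the environment runs. The sketch below shows only the general idea of running normalization with clipping for scalar observations; the class RunningNorm, the clip value of 10.0, and the epsilon of 1e-8 are assumptions for illustration, and this is not the VecNormalize implementation.

```python
import math


class RunningNorm:
    """Running mean and variance of scalar values (Welford's algorithm)."""

    def __init__(self, clip=10.0, epsilon=1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.clip = clip
        self.epsilon = epsilon

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def normalize(self, x):
        var = self.m2 / self.count if self.count else 1.0
        z = (x - self.mean) / math.sqrt(var + self.epsilon)
        # Clip so that rare extreme values cannot dominate the input.
        return max(-self.clip, min(self.clip, z))


norm = RunningNorm()
for value in [1.0, 2.0, 3.0, 4.0]:
    norm.update(value)

print(norm.mean)               # 2.5
print(norm.normalize(1000.0))  # 10.0
```

Continuous-control observations can differ widely in scale from one dimension to the next, which is a common reason for normalizing them before they reach the policy network.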
===========================================