This article collects typical usage examples of the Game.Game.get_illegal_actions method in Python. If you are wondering how Game.get_illegal_actions is used in practice, the selected code examples below may help. You can also look into further usage examples of the enclosing class, Game.Game.
Two code examples of the Game.get_illegal_actions method are shown below, sorted by popularity by default.
Example 1: _get_e_greedy_action
# Required import: from Game import Game [as alias]
# Or: from Game.Game import get_illegal_actions [as alias]
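import random

import numpy as np

def _get_e_greedy_action(self, state, exploration=None):
    actions = self.get_action_values(state)
    # Exploit when no exploration rate is given, or with probability 1 - exploration.
    if exploration is None or random.uniform(0, 1) > exploration:
        max_val = max(actions)
        # Indices of all actions tied for the highest value.
        action = np.where(actions == max_val)[0]
        game = Game(game_board=translate_state_to_game_board(state), spawning=False)
        # Fall back to a random legal action only when the greedy set is exactly the illegal set.
        if set(action) == set(game.get_illegal_actions()):
            return [random.choice(game.get_legal_actions())]
        return [random.choice(action)]
    else:
        # Explore: pick uniformly from all actions, legal or not.
        return [random.choice(self.actions)]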
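Example 1 consults get_illegal_actions only when the set of greedy actions exactly matches the illegal set, so an illegal action can still be returned in other cases. As a point of comparison, the following is a minimal, self-contained sketch of epsilon-greedy selection that masks illegal actions up front. It does not use the Game class: the function name masked_epsilon_greedy and its parameters are hypothetical, and the illegal actions are assumed to be passed in as plain integer indices. Unlike Example 1, it returns a bare action index rather than a one-element list.

```python
import random

import numpy as np


def masked_epsilon_greedy(action_values, illegal_actions, epsilon, rng=random):
    """Pick an action epsilon-greedily without ever returning an illegal one.

    action_values: 1-D sequence of estimated values, one per action index.
    illegal_actions: iterable of action indices that must not be chosen.
    epsilon: probability of exploring (choosing a random legal action).
    rng: object providing random() and choice(); defaults to the random module.
    """
    values = np.asarray(action_values, dtype=float)
    illegal = set(illegal_actions)
    legal = [a for a in range(len(values)) if a not in illegal]
    if not legal:
        raise ValueError("no legal actions available")

    # Explore: uniform choice over legal actions only.
    if rng.random() < epsilon:
        return rng.choice(legal)

    # Exploit: best value among legal actions, ties broken at random.
    legal_values = values[legal]
    best = legal_values.max()
    best_actions = [a for a, v in zip(legal, legal_values) if v == best]
    return rng.choice(best_actions)
```

With epsilon set to 0 the function is purely greedy over the legal actions; with epsilon set to 1 it always samples uniformly from them. Filtering before the argmax, instead of checking afterwards, is a design choice that keeps both branches free of illegal actions.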
Example 2: reward
# Required import: from Game import Game [as alias]
# Or: from Game.Game import get_illegal_actions [as alias]
def reward(self, individual):
    # Simulate the action sequence on a copy of the board with tile spawning disabled.
    game = Game(game_board=self.env, spawning=False)
    intuitive_reward = 0
    cnt = 0
    memory_reward = 0
    for action in individual:
        # An illegal action ends the evaluation with a large penalty.
        if action in game.get_illegal_actions():
            return -50000, memory_reward
        action_values = self.value_function(map_state_to_inputs(game.get_state()[0]))
        predicted_reward = game.do_action(action)
        if game.game_over():
            return -50000, -50000
        # Accumulate both rewards, discounted by the step index.
        intuitive_reward += predicted_reward * (self.discounted ** cnt)
        memory_reward += action_values[action] * (self.discounted ** cnt)
        cnt += 1
    return intuitive_reward, memory_reward
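Both totals in Example 2 are discounted sums: the reward at step cnt is weighted by self.discounted ** cnt. The accumulation on its own can be sketched as below. The function name discounted_return is hypothetical, and the rewards are assumed to be already available as a plain sequence of numbers rather than produced by game.do_action.

```python
def discounted_return(rewards, discount):
    """Sum rewards, weighting the reward at step k by discount ** k."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += reward * discount ** step
    return total
```

For instance, three rewards of 1 with a discount of 0.5 give 1 + 0.5 + 0.25 = 1.75, and an empty sequence gives 0.0.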