Python Agent.step Method Code Examples

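This article collects typical usage examples of the agent.Agent.step method in Python. If you are wondering how Agent.step is used in practice, or are looking for working examples of it, the selected snippets here may help. You can also look further into usage examples of the class that contains the method, agent.Agent.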

Three code examples of the Agent.step method are shown below, sorted by popularity by default.

Example 1: main

# Required import: from agent import Agent [as alias]
# Or: from agent.Agent import step [as alias]
# Note: init_env is not defined in this excerpt; it comes from the source project.
def main():
  """ Main function, runs the experiment. """
  agent = Agent()
  env = init_env()
  for i in range(15):
    agent.start()
    state, reward = env.reset()
    while not env.terminal:
      action = agent.step(state, reward)
      state, reward = env.update(action)
    
    agent.end(reward)

Developer: KyriacosShiarli, project: helicopter, lines of code: 14, source: experiment.py
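
The loop in Example 1 relies on three agent methods (start, step, end) and on an environment exposing reset, update, and a terminal flag. The excerpt does not show how any of them are implemented, so the following is only a minimal sketch of that calling convention. RandomAgent and CountdownEnv are hypothetical stand-ins written for illustration; they are not the helicopter project's classes.

```python
import random


class RandomAgent:
    """Hypothetical stand-in for agent.Agent: picks actions at random."""

    def __init__(self, num_actions=2, seed=0):
        self.num_actions = num_actions
        self.rng = random.Random(seed)
        self.episode_return = 0.0

    def start(self):
        # Reset per-episode bookkeeping.
        self.episode_return = 0.0

    def step(self, state, reward):
        # Record the reward earned by the previous action, then choose the next one.
        self.episode_return += reward
        return self.rng.randrange(self.num_actions)

    def end(self, reward):
        # The final reward never reaches step(), so it is accounted for here.
        self.episode_return += reward
        return self.episode_return


class CountdownEnv:
    """Hypothetical environment that terminates after a fixed number of updates."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0
        self.terminal = False

    def reset(self):
        self.t = 0
        self.terminal = False
        return self.t, 0.0

    def update(self, action):
        self.t += 1
        self.terminal = self.t >= self.horizon
        return self.t, 1.0  # constant reward of 1 per update


def run(episodes=3):
    """Same loop shape as Example 1, returning the total reward of each episode."""
    agent, env = RandomAgent(), CountdownEnv()
    returns = []
    for _ in range(episodes):
        agent.start()
        state, reward = env.reset()
        while not env.terminal:
            action = agent.step(state, reward)
            state, reward = env.update(action)
        returns.append(agent.end(reward))
    return returns
```

In this sketch step receives the reward produced by the previous action, which is why the last reward of an episode is handed to end rather than to step.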

Example 2: test

# Required import: from agent import Agent [as alias]
# Or: from agent.Agent import step [as alias]
import time

import numpy as np

def test(world, restore=False, show=False, agent_name=None):
    """ 
    Run BECCA with world.  
    
    If restore=True, this method loads a saved agent if it can find one.
    Otherwise it creates a new one. It connects the agent and
    the world together and runs them for as long as the 
    world dictates.
    
    To profile BECCA's performance with world, manually set
    profile_flag in the top level script environment to True.
    """
    if agent_name is None:
        agent_name = '_'.join((world.name, 'agent'))
    agent = Agent(world.num_sensors, world.num_actions, 
                  agent_name=agent_name, show=show)
    if restore:
        agent = agent.restore()

    # If configured to do so, the world sets some BECCA parameters to 
    # modify its behavior. This is a development hack, and 
    # should eventually be removed as BECCA matures and settles on 
    # a good, general purpose set of parameters.
    world.set_agent_parameters(agent)
    actions = np.zeros((world.num_actions,1))
    
    # Repeat the loop through the duration of the existence of the world 
    
    totalTime = 0
    loops = 0
        
    while world.is_alive():
        a = time.time()
        sensors, reward = world.step(actions)
        if show:
            world.visualize(agent)
        actions = agent.step(sensors, reward)
        totalTime += time.time() - a
        loops += 1
        
    # Python 3 print call; skip the report if the world was never alive
    if loops:
        print(totalTime / loops, ' time per loop (', loops, ')')
    
    
    return agent.report_performance()

Developer: automenta, project: becca, lines of code: 46, source: tester.py
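
Example 2 couples a world and an agent in a loop that runs for as long as the world reports it is alive, and measures the average time per iteration. The sketch below reproduces only that pattern with stub objects, so it can be run without BECCA. StubWorld and StubAgent are hypothetical, plain lists replace the NumPy action array, and time.perf_counter is swapped in for time.time because it is better suited to measuring short intervals.

```python
import time


class StubWorld:
    """Hypothetical world that stays alive for a fixed number of steps."""

    def __init__(self, num_sensors=3, num_actions=2, lifespan=10):
        self.num_sensors = num_sensors
        self.num_actions = num_actions
        self.lifespan = lifespan
        self.age = 0

    def is_alive(self):
        return self.age < self.lifespan

    def step(self, actions):
        # Advance one tick; the reward is the sum of the actions just taken.
        self.age += 1
        sensors = [float(self.age)] * self.num_sensors
        return sensors, float(sum(actions))


class StubAgent:
    """Hypothetical agent that always activates its first action."""

    def __init__(self, num_sensors, num_actions):
        self.num_actions = num_actions
        self.total_reward = 0.0

    def step(self, sensors, reward):
        self.total_reward += reward
        actions = [0.0] * self.num_actions
        actions[0] = 1.0
        return actions


def timed_run(world):
    """Run agent and world together; return (loops, mean seconds per loop, total reward)."""
    agent = StubAgent(world.num_sensors, world.num_actions)
    actions = [0.0] * world.num_actions
    total_time, loops = 0.0, 0
    while world.is_alive():
        started = time.perf_counter()
        sensors, reward = world.step(actions)
        actions = agent.step(sensors, reward)
        total_time += time.perf_counter() - started
        loops += 1
    # Guard against a world that was never alive.
    mean_time = total_time / loops if loops else 0.0
    return loops, mean_time, agent.total_reward
```

The first call to world.step receives an all-zero action vector, mirroring the np.zeros initialisation in Example 2, so the first reward in this sketch is zero.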

Example 3: main

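# Required import: from agent import Agent [as alias]
# Or: from agent.Agent import step [as alias]
# Note: SeqGenerator and RewardGenerator are not defined in this excerpt;
# they come from the source project.
import matplotlib.pyplot as plt
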
def main():
    Tw = 100  # data width
    Y = []
    A = []
    R = []
    seq_gen = SeqGenerator()
    rew_gen = RewardGenerator()
    agent = Agent(3, 2)
    total_r = 0
    for t in range(Tw):
        (y, l) = seq_gen.step()
        a = agent.step(y)
        r = rew_gen.evaluate(l, a)

        total_r += r
        Y.append(y)
        A.append(a)
        R.append(r)

    print("Simulation finished.")
    print("Average Reward:", total_r / float(Tw), " ", total_r)

    f = plt.figure()
    spy = f.add_subplot(311)
    spy.stem(Y, label="observations", linefmt='b-', markerfmt='bo')
    plt.ylabel('Observations')
    plt.xlabel('step')
    spa = f.add_subplot(312)
    spa.stem(A, linefmt='r-', markerfmt='ro')
    plt.ylabel('Actions')
    plt.xlabel('step')
    spr = f.add_subplot(313)
    spr.stem(R, linefmt='g-', markerfmt='go')
    plt.ylabel('Rewards')
    plt.xlabel('step')
    plt.show()  # only this way does the figure window stay open
    # f.show()  # original code for showing the figure
    g = plt.figure()

Developer: ptresende, project: work, lines of code: 40, source: learning_rl.py
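
Unlike the first two examples, Example 3 calls step with the observation only, and the reward is computed separately from a label the agent never sees. The excerpt does not include SeqGenerator, RewardGenerator, or the agent itself, so the following is a sketch under assumed behaviour. All three Toy classes are hypothetical, and the parity rule linking observations to labels is invented purely so the loop can be run end to end without plotting.

```python
import random


class ToySeqGenerator:
    """Hypothetical generator: emits an observation y and a hidden label."""

    def __init__(self, num_obs=3, seed=0):
        self.num_obs = num_obs
        self.rng = random.Random(seed)

    def step(self):
        y = self.rng.randrange(self.num_obs)
        label = y % 2  # assumed rule linking the observation to its label
        return y, label


class ToyRewardGenerator:
    """Hypothetical scorer: reward 1 when the action matches the label, else 0."""

    def evaluate(self, label, action):
        return 1 if action == label else 0


class ToyAgent:
    """Hypothetical fixed-policy agent whose step() takes only the observation."""

    def __init__(self, num_obs, num_actions):
        self.num_obs = num_obs
        self.num_actions = num_actions

    def step(self, y):
        return y % self.num_actions


def simulate(steps=100):
    """Same loop shape as Example 3, returning the average reward."""
    seq_gen, rew_gen = ToySeqGenerator(), ToyRewardGenerator()
    agent = ToyAgent(3, 2)
    rewards = []
    for _ in range(steps):
        y, label = seq_gen.step()
        a = agent.step(y)
        rewards.append(rew_gen.evaluate(label, a))
    return sum(rewards) / steps
```

Because the toy policy happens to match the toy labelling rule, simulate returns an average reward of 1.0; a learning agent would need some other channel to receive the reward, which the excerpt does not show.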

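Note: The agent.Agent.step method examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects contributed by their respective developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.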