

Python Model.transition Method Code Example

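This article collects typical usage of the Model.Model.transition method in Python. If you are wondering how Model.transition is used in practice, or are looking for a working example of it, the selected code example here may help. You can also look further into usage examples of Model.Model, the class this method belongs to.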


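One code example of Model.transition is shown below; examples are sorted by popularity by default. Note that in this example `transition` is assigned and read as an attribute of the model and passed to other functions, rather than being called directly.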

Example 1: extract_info

# Required import: from Model import Model [as alias]
# Or: from Model.Model import transition [as alias]
import numpy as np
# Initial reward function

# model.reward_f = np.random.uniform(-2,-0.1,model.reward_f.shape)
r_initial = model.reward_f
examples, distribution = extract_info(disc_model, steps, dist=True)
policy_ref, lg = caus_ent_backward(model.transition, model.reward_f, examples[1]["end_state"], steps)
start_states = [example["start_state"] for example in examples]
state_freq_ref, state_action_frequencies_ref = forward_sa(policy_ref, model.transition, start_states, steps)
iterations = 300


# initialise reward model
feat = {"function": continouous, "inputs": None}
disc_model = DiscModel(feature=feat)
model = Model_non_linear(disc_model)
model.transition = trans
model.reward_f = np.zeros(model.reward_f.shape)
model.reward_f += r_initial
model.reward_f[1, :] -= 0.5
# model.reward_f = r_initial
actions, states, features = model.feature_f.shape
for itera in range(iterations):
    # Recompute the policy and visitation frequencies under the current reward
    policy_test, lg = caus_ent_backward(model.transition, model.reward_f, examples[1]["end_state"], steps)
    state_freq_test, state_action_frequencies_test = forward_sa(policy_test, model.transition, start_states, steps)
    # Total absolute deviation from the reference reward and reference policy
    reward_diff = np.sum(np.absolute(model.reward_f - r_initial))
    policy_diff = np.sum(np.absolute(policy_test - policy_ref))
    print("Difference in Reward --->", reward_diff)
    print("Difference in Policy --->", policy_diff)
    # Flatten features and the frequency mismatch into one row per state-action pair
    X = np.reshape(model.feature_f, (disc_model.tot_states * disc_model.tot_actions, 4))
    Y = (state_action_frequencies_ref - state_action_frequencies_test).reshape(
        (disc_model.tot_states * disc_model.tot_actions)
    )
    # The original excerpt is cut off at this point.
Developer: KyriacosShiarli, Project: MDP, Lines of code: 33, Source: fg_learn_test.py
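
The example passes `model.transition` to `caus_ent_backward` and `forward_sa` without showing what those helpers do. As a rough illustration only, the sketch below shows one way a backward pass (soft value iteration) and a forward pass (visitation frequencies) could consume a transition array. The function names `soft_backward` and `forward_frequencies`, the `(actions, states, next_states)` array layout, and the toy data are all assumptions made for this sketch; they are not taken from the MDP project and may differ from its actual implementation.

```python
import numpy as np


def soft_backward(transition, reward, steps):
    """Soft (log-sum-exp) value iteration for `steps` backups (steps >= 1).

    transition: assumed shape (A, S, S), transition[a, s, t] = P(t | s, a)
    reward:     assumed shape (A, S)
    Returns the stochastic policy from the final backup, shape (A, S),
    where policy[:, s] sums to 1 for every state s.
    """
    n_actions, n_states, _ = transition.shape
    v = np.zeros(n_states)
    for _ in range(steps):
        q = reward + transition @ v  # (A, S, S) @ (S,) -> (A, S)
        q_max = q.max(axis=0)
        # Numerically stable log-sum-exp over actions
        v = q_max + np.log(np.exp(q - q_max).sum(axis=0))
    return np.exp(q - v)


def forward_frequencies(policy, transition, start_dist, steps):
    """Propagate a start-state distribution forward under `policy`.

    Returns (state_freq, state_action_freq): expected visit counts
    accumulated over `steps` time steps.
    """
    n_actions, n_states, _ = transition.shape
    d = start_dist.copy()
    sa_freq = np.zeros((n_actions, n_states))
    for _ in range(steps):
        sa = policy * d  # joint probability of (action, state) at this step
        sa_freq += sa
        # Next-state distribution: sum over actions and current states
        d = np.einsum("as,ast->t", sa, transition)
    return sa_freq.sum(axis=0), sa_freq


# Toy problem (hypothetical data): 2 actions, 3 states
rng = np.random.default_rng(0)
transition = rng.dirichlet(np.ones(3), size=(2, 3))  # each row sums to 1
reward = rng.uniform(-2.0, -0.1, size=(2, 3))
start_dist = np.array([1.0, 0.0, 0.0])
steps = 5

policy = soft_backward(transition, reward, steps)
state_freq, sa_freq = forward_frequencies(policy, transition, start_dist, steps)
print("policy column sums:", policy.sum(axis=0))
print("total expected visits:", state_freq.sum())
```

In this sketch the difference between two such state-action frequency arrays plays the role that `state_action_frequencies_ref - state_action_frequencies_test` plays in the example: it measures how far the behaviour under the current reward is from the reference behaviour.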


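Note: The Model.Model.transition example in this article was compiled by 纯净天空 from open-source code and documentation platforms such as Github/MSDocs. The code snippet was selected from open-source projects contributed by their developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project; do not reproduce without permission.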