

Python World.newPosition Method Code Example

This article collects a typical usage example of the World.World.newPosition method in Python. If you are wondering how World.newPosition is used in practice, the selected code example here may help. You can also look further into usage examples of World.World, the class this method belongs to.


One code example of the World.newPosition method is shown below. Examples are sorted by popularity by default.

Example 1: valueIteration

# Requires: from World import World
# newPosition is called as a method on a World instance.
def valueIteration(defaultReward):
    from World import World

    discountedValue = 0.9  # discount factor

    instance = World()
    instance.default_Reward = defaultReward

    # Each intended action maps to the moves that can actually occur,
    # with the probability of each move.
    actions = {
        "right": {"right": 0.8, "down": 0.2},
        "left": {"left": 1.0},
        "up": {"up": 0.8, "left": 0.2},
        "down": {"down": 1.0},
    }

    # Initialize the value of every state to 0.
    valueGrid = [[0 for _ in range(instance.world_Column)] for _ in range(instance.world_Row)]

    iterations = 0
    stop = False

    # copyMatrix and convergence are helpers that are not shown in this excerpt.
    while not stop:
        iterations += 1
        previousValueGrid = copyMatrix(valueGrid, instance.world_Row, instance.world_Column)

        # For all states
        for row in range(instance.world_Row):
            for col in range(instance.world_Column):
                # One entry per action; wall cells keep a value of 0.
                valueActions = [0, 0, 0, 0]
                if not instance.isWalls(row, col):
                    # For all actions
                    for count, outcomes in enumerate(actions.values()):
                        # Expected value of the successor states for this action.
                        total = 0.0
                        for move, probability in outcomes.items():
                            # Moves that would leave the world contribute nothing.
                            if instance.isWithinWorld(move, row, col):
                                newCoordinates = instance.newPosition(move, row, col)
                                total += probability * valueGrid[newCoordinates[0]][newCoordinates[1]]

                        valueActions[count] = instance.getRewards(row, col) + (discountedValue * total)

                # Keep the value of the best action.
                valueGrid[row][col] = max(valueActions)

        stop = convergence(valueGrid, previousValueGrid, instance.world_Row, instance.world_Column)

    print(valueGrid)
    print("The number of iterations is " + str(iterations))
Developer: srikirank, Project: Machine-Learning, Lines of code: 57, Source file: run.py
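
The example above depends on a World class and on two helpers, copyMatrix and convergence, none of which are shown in the excerpt. The following is a minimal sketch of what they might look like, so the calls in valueIteration are easier to follow. Everything here is an assumption made for illustration: the class name GridWorld, the grid size, the wall position, the reward scheme, the convention that row indices grow downward, and the tolerance parameter are all hypothetical and may differ from the original project.

```python
class GridWorld:
    """Hypothetical stand-in for World.World (the real class is not shown)."""

    # Assumed (row delta, column delta) per action; rows grow downward.
    MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

    def __init__(self, rows=3, cols=4, walls=((1, 1),), default_reward=-0.04):
        self.world_Row = rows
        self.world_Column = cols
        self.walls = set(walls)
        self.default_Reward = default_reward

    def isWalls(self, row, col):
        # True if the cell is a wall.
        return (row, col) in self.walls

    def newPosition(self, action, row, col):
        # Coordinates reached by taking `action` from (row, col).
        d_row, d_col = self.MOVES[action]
        return (row + d_row, col + d_col)

    def isWithinWorld(self, action, row, col):
        # True if taking `action` from (row, col) stays on the grid.
        new_row, new_col = self.newPosition(action, row, col)
        return 0 <= new_row < self.world_Row and 0 <= new_col < self.world_Column

    def getRewards(self, row, col):
        # Assumed reward scheme: every cell yields the default reward.
        return self.default_Reward


def copyMatrix(matrix, rows, cols):
    # Element-wise copy, so later updates do not alias the original grid.
    return [[matrix[r][c] for c in range(cols)] for r in range(rows)]


def convergence(current, previous, rows, cols, tolerance=1e-4):
    # True when no cell changed by `tolerance` or more between two sweeps.
    return all(
        abs(current[r][c] - previous[r][c]) < tolerance
        for r in range(rows)
        for c in range(cols)
    )
```

In this sketch newPosition only computes coordinates and does no bounds checking, which is why valueIteration calls isWithinWorld first before indexing into valueGrid.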


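Note: The World.World.newPosition method example in this article was compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The code snippet was selected from open-source projects, and copyright in the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.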