

Python Reward_observation_terminal.terminal method code examples

This article collects typical usage examples of the Python method rlglue.types.Reward_observation_terminal.terminal. If you are wondering what exactly Reward_observation_terminal.terminal does, or how to call it in practice, the curated code examples below may help. You can also explore further usage of rlglue.types.Reward_observation_terminal, the class this method belongs to.


The sections below present 7 code examples of the Reward_observation_terminal.terminal method, sorted by popularity by default.
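Before the examples, here is a minimal standalone sketch of the pattern they all share: an environment's env_step builds a Reward_observation_terminal, stores a reward in r, an Observation in o, and an end-of-episode flag in terminal, and the caller reads that flag back. Only the rlglue.types classes that appear in the examples below are assumed; the concrete values are made up for illustration.

from rlglue.types import Observation, Reward_observation_terminal

# Build the observation for this step (values are arbitrary).
obs = Observation()
obs.intArray = [42]

# Package reward, observation, and episode-end flag together.
ro = Reward_observation_terminal()
ro.r = 1.0        # scalar reward for the step
ro.o = obs        # the Observation built above
ro.terminal = 0   # set to 1 (or True) once the episode ends

# Callers branch on the flag to decide whether to keep stepping.
if ro.terminal:
    print("episode finished, last reward:", ro.r)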

Example 1: env_step

# Required import: from rlglue.types import Reward_observation_terminal [as alias]
# Or: from rlglue.types.Reward_observation_terminal import terminal [as alias]
    def env_step(self, action):
        self.stepCount += 1

        # Even-numbered episodes return large (50000-element) observations.
        if self.whichEpisode % 2 == 0:
            self.o.intArray = list(range(0, 50000))
            # cheating, might break something
            self.o.doubleArray = list(range(0, 50000))
            terminal = 0
            if self.stepCount == 200:
                terminal = 1
            ro = Reward_observation_terminal()
            ro.r = 1.0
            ro.o = self.o
            ro.terminal = terminal
            return ro

        # Odd-numbered episodes return small (5-element) observations.
        self.o.intArray = list(range(0, 5))
        # cheating, might break something
        self.o.doubleArray = list(range(0, 5))
        terminal = 0
        if self.stepCount == 5000:
            terminal = 1
        ro = Reward_observation_terminal()
        ro.r = 1.0
        ro.o = self.o
        ro.terminal = terminal
        return ro
Developer: steckdenis, Project: rlglue-py3, Lines: 29, Source: test_speed_environment.py

Example 2: env_step

# Required import: from rlglue.types import Reward_observation_terminal [as alias]
# Or: from rlglue.types.Reward_observation_terminal import terminal [as alias]
# This example also uses Observation from rlglue.types.
    def env_step(self, thisAction):
        episodeOver = 0
        theReward = 0

        # Action 0 moves left, action 1 moves right.
        if thisAction.intArray[0] == 0:
            self.currentState = self.currentState - 1
        if thisAction.intArray[0] == 1:
            self.currentState = self.currentState + 1

        # Falling off the left end terminates with reward -1.
        if self.currentState <= 0:
            self.currentState = 0
            theReward = -1
            episodeOver = 1

        # Reaching state 20 terminates with reward +1.
        if self.currentState >= 20:
            self.currentState = 20
            theReward = 1
            episodeOver = 1

        theObs = Observation()
        theObs.intArray = [self.currentState]

        returnRO = Reward_observation_terminal()
        returnRO.r = theReward
        returnRO.o = theObs
        returnRO.terminal = episodeOver

        return returnRO
Developer: AAHays, Project: python-rl, Lines: 30, Source: skeleton_environment.py
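For context, an environment like the skeleton above is normally launched through the RL-Glue Python codec's environment loader, which serves env_step calls over the codec. The sketch below assumes the standard EnvironmentLoader module from the codec; the skeleton_environment import path is an assumption based on the source file name above.

# Hypothetical launch script for the skeleton environment (Example 2).
from rlglue.environment import EnvironmentLoader as EnvironmentLoader
from skeleton_environment import skeleton_environment  # assumed module path

if __name__ == "__main__":
    # Blocks and serves env_start/env_step/env_end requests via RL-Glue.
    EnvironmentLoader.loadEnvironment(skeleton_environment())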

Example 3: env_step

# Required import: from rlglue.types import Reward_observation_terminal [as alias]
# Or: from rlglue.types.Reward_observation_terminal import terminal [as alias]
    def env_step(self, action):
        ro = Reward_observation_terminal()
        terminal = False

        if self.stepCount < 5:
            # First five steps: the observation carries only the step counter.
            self.o.doubleArray = []
            self.o.charArray = []
            self.o.intArray = [self.stepCount]

            self.stepCount += 1

            if self.stepCount == 5:
                terminal = True

            ro.r = 1.0

        else:
            # Afterwards: exercise extreme values of each array type.
            self.o.doubleArray = [0.0078125, -0.0078125, 0.0, 0.0078125e150, -0.0078125e150]
            self.o.charArray = ['g', 'F', '?', ' ', '&']
            self.o.intArray = [173, -173, 2147483647, 0, -2147483648]

            ro.r = -2.0

        ro.o = self.o
        ro.terminal = terminal
        return ro
Developer: junzhez, Project: rl_glue_python3_codec, Lines: 28, Source: test_1_environment.py

Example 4: env_step

# Required import: from rlglue.types import Reward_observation_terminal [as alias]
# Or: from rlglue.types.Reward_observation_terminal import terminal [as alias]
# This example also uses Observation from rlglue.types and numpy (imported as np).
    def env_step(self, thisAction):
        # Move the player.
        self.player.update(thisAction)

        # Compute the reward after the move.
        theReward = self.field.decision(int(self.player.x + 0.5), int(self.player.y + 0.5), thisAction.intArray[0])
        #print("Reward:%d" % theReward)
        episodeOver = self.field.get_gameover()
        #print("EdgeTracer:episodeOver %03d" % episodeOver)

        # Draw the field.
        self.draw_field()

        returnObs = Observation()
        # 128 leading zeros followed by the flattened 2-D image state.
        returnObs.intArray = np.append(np.zeros(128), [item for innerlist in self.img_state for item in innerlist])
        #scipy.misc.imsave('l_screen.png', img_src)
        #scipy.misc.imsave('r_screen.png', img_afn)

        returnRO = Reward_observation_terminal()
        returnRO.r = theReward
        returnRO.o = returnObs
        returnRO.terminal = episodeOver

        return returnRO
Developer: hashima, Project: DQN_Framework, Lines: 27, Source: env.py
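The intArray built in this example prepends 128 zeros (presumably mimicking an Atari-style RAM block) to a flattened 2-D image. The nested list comprehension can be hard to read at first sight; the standalone sketch below reproduces it with a made-up 2x3 image so the layout is easy to verify.

import numpy as np

# A toy 2x3 "image" standing in for self.img_state (values are made up).
img_state = [[1, 2, 3],
             [4, 5, 6]]

# Flatten row by row, exactly like the comprehension in env_step above.
flat = [item for innerlist in img_state for item in innerlist]  # [1, 2, 3, 4, 5, 6]

# Prepend 128 zeros, matching the observation layout of the example.
int_array = np.append(np.zeros(128), flat)
print(int_array.shape)  # (134,)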

Example 5: env_step

# Required import: from rlglue.types import Reward_observation_terminal [as alias]
# Or: from rlglue.types.Reward_observation_terminal import terminal [as alias]
    def env_step(self, action):
        state, reward, terminal = self.environment.step(self.get_action(action))

        rot = Reward_observation_terminal()
        rot.r = reward
        rot.o = self.create_observation(state)
        rot.terminal = terminal
        return rot
Developer: ProjectGameTheory, Project: PyALE, Lines: 10, Source: RLGlueEnvironment.py

Example 6: env_step

# Required import: from rlglue.types import Reward_observation_terminal [as alias]
# Or: from rlglue.types.Reward_observation_terminal import terminal [as alias]
# This example also uses Observation from rlglue.types.
    def env_step(self, thisAction):
        intAction = thisAction.intArray[0]
        theReward, episodeOver = self.takeAction(intAction)

        theObs = Observation()
        theObs.doubleArray = self.state.tolist()
        returnRO = Reward_observation_terminal()
        returnRO.r = theReward
        returnRO.o = theObs
        returnRO.terminal = int(episodeOver)

        return returnRO
Developer: AAHays, Project: python-rl, Lines: 14, Source: bicycle.py

Example 7: env_step

# Required import: from rlglue.types import Reward_observation_terminal [as alias]
# Or: from rlglue.types.Reward_observation_terminal import terminal [as alias]
# This example also uses Observation from rlglue.types.
    def env_step(self, thisAction):
        intAction = int(thisAction.intArray[0])
        theReward = self.takeAction(intAction)

        theObs = Observation()
        theObs.intArray = self.getState()

        returnRO = Reward_observation_terminal()
        returnRO.r = theReward
        returnRO.o = theObs
        # This environment never signals termination itself.
        returnRO.terminal = 0

        return returnRO
Developer: AAHays, Project: python-rl, Lines: 14, Source: chain.py
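On the experiment side, the terminal flag that all of these environments set is what ends the stepping loop. A minimal sketch of that loop follows, assuming the rlglue.RLGlue experiment module from the RL-Glue Python codec; the 1000-step cap is an arbitrary safety limit, not part of any example above.

import rlglue.RLGlue as RLGlue

RLGlue.RL_init()
RLGlue.RL_start()

# Step until the environment reports a terminal state (or 1000 steps).
for step in range(1000):
    roat = RLGlue.RL_step()  # reward/observation/action/terminal struct
    if roat.terminal:
        print("episode ended at step", step, "with reward", roat.r)
        break

RLGlue.RL_cleanup()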


Note: The rlglue.types.Reward_observation_terminal.terminal method examples in this article were compiled by 純淨天空 from GitHub, MSDocs, and other open-source code and documentation platforms. The snippets were selected from open-source projects contributed by many developers; copyright in the source code remains with the original authors. For redistribution and use, please follow the license of the corresponding project; do not repost without permission.