

Java MAValueIteration Class Code Examples

This article collects typical usage examples of the Java class burlap.behavior.stochasticgames.madynamicprogramming.dpplanners.MAValueIteration. If you have been wondering what exactly MAValueIteration does and how to use it, the curated code examples below should help.


The MAValueIteration class belongs to the burlap.behavior.stochasticgames.madynamicprogramming.dpplanners package. Four code examples of the class are shown below, sorted by popularity by default.
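Before working through the examples, here is a minimal end-to-end sketch of the class's typical lifecycle: build a stochastic-games domain, construct the planner, and run value iteration. The constructor arguments mirror Example 2 below; the import paths and the planFromState call assume the BURLAP 3 API, so verify them against your library version.

import burlap.behavior.stochasticgames.madynamicprogramming.backupOperators.CoCoQ;
import burlap.behavior.stochasticgames.madynamicprogramming.dpplanners.MAValueIteration;
import burlap.domain.stochasticgames.gridgame.GridGame;
import burlap.mdp.core.TerminalFunction;
import burlap.mdp.core.state.State;
import burlap.mdp.stochasticgames.model.JointRewardFunction;
import burlap.mdp.stochasticgames.oo.OOSGDomain;
import burlap.statehashing.simple.SimpleHashableStateFactory;

public class MAVIQuickstart {
	public static void main(String[] args) {
		//a small stochastic-games domain to plan over
		GridGame gridGame = new GridGame();
		OOSGDomain domain = gridGame.generateDomain();
		State s = GridGame.getPrisonersDilemmaInitialState();
		JointRewardFunction rf = new GridGame.GGJointRewardFunction(domain, -1, 100, false);
		TerminalFunction tf = new GridGame.GGTerminalFunction(domain);

		//arguments: domain, joint reward, terminal function, discount,
		//hashing factory, Q-value initialization, backup operator,
		//max Bellman delta, max iterations (as in Example 2 below)
		MAValueIteration vi = new MAValueIteration(domain, rf, tf, 0.99,
				new SimpleHashableStateFactory(), 0., new CoCoQ(), 0.00015, 50);

		//run value iteration over the states reachable from s
		vi.planFromState(s);
	}
}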

Example 1: getPlannerInstance

import burlap.behavior.stochasticgames.madynamicprogramming.dpplanners.MAValueIteration; //import the required package/class
@Override
public MADynamicProgramming getPlannerInstance() {
	//construct a fresh multi-agent VI planner from the factory's configuration fields
	return new MAValueIteration(domain, agentDefinitions, jointReward, terminalFunction, discount, hashingFactory, qInit, backupOperator, maxDelta, maxIterations);
}
 
Developer ID: f-leno, Project: DOO-Q_BRACIS2016, Line count: 5, Source: MADPPlannerFactory.java
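A method like this typically lives in a planner-factory class whose fields carry the constructor arguments, so that each agent can request its own planner rather than share one. A minimal self-contained sketch of that pattern follows. The SimpleMAVIFactory class is hypothetical, its field types are inferred from the calls in this article, and it uses the shorter constructor variant from Examples 2 and 4 (which omits agent definitions); BURLAP itself ships a comparable nested factory (MADPPlannerFactory.MAVIPlannerFactory), so prefer the library's version in real code.

import burlap.behavior.stochasticgames.madynamicprogramming.MADynamicProgramming;
import burlap.behavior.stochasticgames.madynamicprogramming.SGBackupOperator;
import burlap.behavior.stochasticgames.madynamicprogramming.dpplanners.MAValueIteration;
import burlap.mdp.core.TerminalFunction;
import burlap.mdp.stochasticgames.SGDomain;
import burlap.mdp.stochasticgames.model.JointRewardFunction;
import burlap.statehashing.HashableStateFactory;

//hypothetical factory illustrating the pattern in Example 1; the fields
//referenced by getPlannerInstance() are plain constructor-injected state
public class SimpleMAVIFactory {

	private final SGDomain domain;
	private final JointRewardFunction jointReward;
	private final TerminalFunction terminalFunction;
	private final double discount;
	private final HashableStateFactory hashingFactory;
	private final double qInit;
	private final SGBackupOperator backupOperator;
	private final double maxDelta;
	private final int maxIterations;

	public SimpleMAVIFactory(SGDomain domain, JointRewardFunction jointReward,
			TerminalFunction terminalFunction, double discount,
			HashableStateFactory hashingFactory, double qInit,
			SGBackupOperator backupOperator, double maxDelta, int maxIterations) {
		this.domain = domain;
		this.jointReward = jointReward;
		this.terminalFunction = terminalFunction;
		this.discount = discount;
		this.hashingFactory = hashingFactory;
		this.qInit = qInit;
		this.backupOperator = backupOperator;
		this.maxDelta = maxDelta;
		this.maxIterations = maxIterations;
	}

	//each call returns a fresh planner, so agents need not share a Q-source
	public MADynamicProgramming getPlannerInstance() {
		return new MAValueIteration(domain, jointReward, terminalFunction,
				discount, hashingFactory, qInit, backupOperator, maxDelta,
				maxIterations);
	}
}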

Example 2: VICoCoTest

import burlap.behavior.stochasticgames.madynamicprogramming.dpplanners.MAValueIteration; //import the required package/class
public static void VICoCoTest(){

		//grid game domain
		GridGame gridGame = new GridGame();
		final OOSGDomain domain = gridGame.generateDomain();

		final HashableStateFactory hashingFactory = new SimpleHashableStateFactory();

		//run the grid game version of prisoner's dilemma
		final State s = GridGame.getPrisonersDilemmaInitialState();

		//define joint reward function and termination conditions for this game
		JointRewardFunction rf = new GridGame.GGJointRewardFunction(domain, -1, 100, false);
		TerminalFunction tf = new GridGame.GGTerminalFunction(domain);

		//both agents are standard: access to all actions
		SGAgentType at = GridGame.getStandardGridGameAgentType(domain);

		//create our multi-agent planner
		MAValueIteration vi = new MAValueIteration(domain, rf, tf, 0.99, hashingFactory, 0., new CoCoQ(), 0.00015, 50);

		//instantiate a world in which our agents will play
		World w = new World(domain, rf, tf, s);


		//create a greedy joint policy from our planner's Q-values
		EGreedyMaxWellfare jp0 = new EGreedyMaxWellfare(0.);
		jp0.setBreakTiesRandomly(false); //don't break ties randomly

		//create agents that each follow their side of the computed joint policy
		MultiAgentDPPlanningAgent a0 = new MultiAgentDPPlanningAgent(domain, vi, new PolicyFromJointPolicy(0, jp0), "agent0", at);
		MultiAgentDPPlanningAgent a1 = new MultiAgentDPPlanningAgent(domain, vi, new PolicyFromJointPolicy(1, jp0), "agent1", at);

		w.join(a0);
		w.join(a1);

		//run some games of the agents playing that policy
		GameEpisode ga = null;
		for(int i = 0; i < 3; i++){
			ga = w.runGame();
		}

		//visualize results
		Visualizer v = GGVisualizer.getVisualizer(9, 9);
		new GameSequenceVisualizer(v, domain, Arrays.asList(ga));


	}
 
Developer ID: jmacglashan, Project: burlap_examples, Line count: 49, Source: GridGameExample.java
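One detail worth noting about Example 2: vi is never told to plan explicitly; in BURLAP, MultiAgentDPPlanningAgent triggers the planner on demand once the world starts running games. If you prefer to run planning up front, say to time it separately from game play, a one-line sketch (assuming planFromState, BURLAP's usual planning entry point, with s being the initial state from the example):

		//run multi-agent value iteration from the initial state before any
		//games are played; agents joined afterwards reuse the computed Q-values
		vi.planFromState(s);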

Example 3: getPlannerInstance

import burlap.behavior.stochasticgames.madynamicprogramming.dpplanners.MAValueIteration; //import the required package/class
@Override
public MADynamicProgramming getPlannerInstance() {
	return new MAValueIteration(domain, agentDefinitions, jointRewardFunction, terminalFunction, discount, hashingFactory, qInit, backupOperator, maxDelta, maxIterations);
}
 
Developer ID: jmacglashan, Project: burlap, Line count: 5, Source: MADPPlannerFactory.java

Example 4: VICorrelatedTest

import burlap.behavior.stochasticgames.madynamicprogramming.dpplanners.MAValueIteration; //import the required package/class
public static void VICorrelatedTest(){

		GridGame gridGame = new GridGame();
		final OOSGDomain domain = gridGame.generateDomain();

		final HashableStateFactory hashingFactory = new SimpleHashableStateFactory();

		final State s = GridGame.getPrisonersDilemmaInitialState();

		JointRewardFunction rf = new GridGame.GGJointRewardFunction(domain, -1, 100, false);
		TerminalFunction tf = new GridGame.GGTerminalFunction(domain);

		SGAgentType at = GridGame.getStandardGridGameAgentType(domain);
		//multi-agent VI with a correlated-Q backup under the utilitarian objective
		MAValueIteration vi = new MAValueIteration(domain, rf, tf, 0.99, hashingFactory, 0., new CorrelatedQ(CorrelatedEquilibriumSolver.CorrelatedEquilibriumObjective.UTILITARIAN), 0.00015, 50);

		World w = new World(domain, rf, tf, s);


		//for correlated Q, use a correlated equilibrium policy joint policy
		ECorrelatedQJointPolicy jp0 = new ECorrelatedQJointPolicy(CorrelatedEquilibriumSolver.CorrelatedEquilibriumObjective.UTILITARIAN, 0.);


		MultiAgentDPPlanningAgent a0 = new MultiAgentDPPlanningAgent(domain, vi, new PolicyFromJointPolicy(0, jp0, true), "agent0", at);
		MultiAgentDPPlanningAgent a1 = new MultiAgentDPPlanningAgent(domain, vi, new PolicyFromJointPolicy(1, jp0, true), "agent1", at);

		w.join(a0);
		w.join(a1);

		GameEpisode ga = null;
		List<GameEpisode> games = new ArrayList<GameEpisode>();
		for(int i = 0; i < 10; i++){
			ga = w.runGame();
			games.add(ga);
		}

		Visualizer v = GGVisualizer.getVisualizer(9, 9);
		new GameSequenceVisualizer(v, domain, games);


	}
 
Developer ID: jmacglashan, Project: burlap_examples, Line count: 41, Source: GridGameExample.java
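Examples 2 and 4 both come from GridGameExample.java in the jmacglashan/burlap_examples project; to try them, you need an entry point. A minimal sketch, assuming both static methods sit in the same class:

public static void main(String[] args) {
	//run one demo at a time; each opens its own visualizer window
	VICoCoTest();
	//VICorrelatedTest();
}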


Note: the burlap.behavior.stochasticgames.madynamicprogramming.dpplanners.MAValueIteration class examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by many developers, and copyright in the source code remains with the original authors. Consult the corresponding project's license before distributing or using the code, and do not republish without permission.