Java SolverDerivedPolicy Class Code Examples

This article collects typical usage examples of the burlap.behavior.policy.SolverDerivedPolicy class in Java. If you are wondering how the SolverDerivedPolicy class is used in practice, or are looking for working examples of it, the selected code examples below may help.

The SolverDerivedPolicy class belongs to the burlap.behavior.policy package. Four code examples of the class are shown below, sorted by popularity by default.

Example 1: DeterministicTerminationOption

import burlap.behavior.policy.SolverDerivedPolicy; // import the required package/class
/**
 * Initializes the option by planning with the provided planner. The planner is called once on each state in
 * the list <code>seedStatesForPlanning</code>, and then
 * this option's policy is set to the provided solver-derived policy.
 * @param name the name of the option
 * @param init the initiation conditions of the option
 * @param terminationStates the termination states of the option
 * @param seedStatesForPlanning the states that should be used as initial states for the planner
 * @param planner the planner that is used to create the policy for this option
 * @param p the solver-derived policy to use after planning from each initial state is performed.
 */
public DeterministicTerminationOption(String name, StateConditionTest init, StateConditionTest terminationStates, List<State> seedStatesForPlanning,
									  Planner planner, SolverDerivedPolicy p){
	
	if(!(p instanceof Policy)){
		throw new RuntimeErrorException(new Error("SolverDerivedPolicy p is not an instance of Policy"));
	}
	
	
	this.name = name;
	
	this.initiationTest = init;
	this.terminationStates = terminationStates;
	
	// now plan from each possible initiation state so the policy can be derived from the planner
	for(State si : seedStatesForPlanning){
		planner.planFromState(si);
	}
	
	p.setSolver(planner);
	this.policy = (Policy)p;
	
}

Developer: f-leno, Project: DOO-Q_BRACIS2016, Lines of code: 34, Source: DeterministicTerminationOption.java
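The constructor above follows a plan-then-bind pattern: run the planner from each seed state, hand the planner to the policy through setSolver, and only then use the object as a Policy. The following is a minimal, self-contained sketch of that pattern. Every Toy* type is a hypothetical stand-in written for illustration and is not part of BURLAP; states and actions are reduced to plain integers, and IllegalArgumentException is used here in place of the exception type in the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// All Toy* types are hypothetical stand-ins for illustration, not BURLAP classes.
class SolverBindingSketch {

    /** Stand-in for a planner that exposes Q-values. */
    interface ToySolver {
        void planFromState(int state);
        double qValue(int state, int action);
    }

    /** Stand-in for a policy: maps a state to an action. */
    interface ToyPolicy {
        int action(int state);
    }

    /** Stand-in for SolverDerivedPolicy: something that needs a solver bound to it. */
    interface ToySolverDerivedPolicy {
        void setSolver(ToySolver solver);
    }

    /** Toy planner on a line of integer states; action 0 moves left, action 1 moves right. */
    static class ToyLinePlanner implements ToySolver {
        final int goal;
        final List<Integer> plannedStates = new ArrayList<>();

        ToyLinePlanner(int goal) { this.goal = goal; }

        @Override public void planFromState(int state) {
            plannedStates.add(state); // a real planner would do its computation here
        }

        @Override public double qValue(int state, int action) {
            int next = state + (action == 1 ? 1 : -1);
            return -Math.abs(goal - next); // closer to the goal is better
        }
    }

    /** Greedy policy that reads Q-values from whichever solver it has been bound to. */
    static class ToyGreedyPolicy implements ToyPolicy, ToySolverDerivedPolicy {
        private ToySolver solver;
        private final int numActions;

        ToyGreedyPolicy(int numActions) { this.numActions = numActions; }

        @Override public void setSolver(ToySolver solver) { this.solver = solver; }

        @Override public int action(int state) {
            if (solver == null) {
                throw new IllegalStateException("setSolver must be called before action");
            }
            int best = 0;
            for (int a = 1; a < numActions; a++) {
                if (solver.qValue(state, a) > solver.qValue(state, best)) {
                    best = a;
                }
            }
            return best;
        }
    }

    /** Mirrors the constructor's steps: type check, plan from each seed, bind, cast. */
    static ToyPolicy buildPolicy(List<Integer> seedStates, ToySolver planner, ToySolverDerivedPolicy p) {
        if (!(p instanceof ToyPolicy)) {
            throw new IllegalArgumentException("p must also implement ToyPolicy");
        }
        for (int s : seedStates) {
            planner.planFromState(s);
        }
        p.setSolver(planner);
        return (ToyPolicy) p;
    }

    public static void main(String[] args) {
        ToyLinePlanner planner = new ToyLinePlanner(3);
        ToyPolicy policy = buildPolicy(Arrays.asList(0, 5), planner, new ToyGreedyPolicy(2));
        System.out.println(policy.action(0)); // 1: move right toward the goal
        System.out.println(policy.action(5)); // 0: move left toward the goal
    }
}
```

Checking the type before planning, as the original constructor does, means a wrong argument fails before any planning work is spent.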

Example 2: main

import burlap.behavior.policy.SolverDerivedPolicy; // import the required package/class
public static void main(String[] args) {

        // Learning constants
        double gamma = 0.99;
        int replayStartSize = 50000;
        int memorySize = 1000000;
        double epsilonStart = 1;
        double epsilonEnd = 0.1;
        double testEpsilon = 0.05;
        int epsilonAnnealDuration = 1000000;
        int staleUpdateFreq = 10000;

        // Caffe solver file
        String solverFile = "example_models/grid_world_dqn_solver.prototxt";

        // Load Caffe
        Loader.load(caffe.Caffe.class);

        // Setup the network
        GridWorldDQN gridWorldDQN = new GridWorldDQN(solverFile, gamma);

        // Create the policies
        SolverDerivedPolicy learningPolicy =
                new AnnealedEpsilonGreedy(epsilonStart, epsilonEnd, epsilonAnnealDuration);
        SolverDerivedPolicy testPolicy = new EpsilonGreedy(testEpsilon);

        // Setup the learner
        DeepQLearner deepQLearner =
                new DeepQLearner(gridWorldDQN.domain, gamma, replayStartSize, learningPolicy, gridWorldDQN.dqn);
        deepQLearner.setExperienceReplay(new FixedSizeMemory(memorySize), gridWorldDQN.dqn.batchSize);
        deepQLearner.useStaleTarget(staleUpdateFreq);

        // Setup the tester
        Tester tester = new SimpleTester(testPolicy);

        // Set the QProvider for the policies
        learningPolicy.setSolver(deepQLearner);
        testPolicy.setSolver(deepQLearner);

        // Setup the visualizer
        VisualExplorer exp = new VisualExplorer(
                gridWorldDQN.domain, gridWorldDQN.env, GridWorldVisualizer.getVisualizer(gridWorldDQN.gwdg.getMap()));
        exp.initGUI();
        exp.startLiveStatePolling(33);

        // Setup helper
        TrainingHelper helper = new TrainingHelper(
                deepQLearner, tester, gridWorldDQN.dqn, actionSet, gridWorldDQN.env);
        helper.setTotalTrainingSteps(50000000);
        helper.setTestInterval(500000);
        helper.setTotalTestSteps(125000);
        helper.setMaxEpisodeSteps(10000);

        // Run helper
        helper.run();
    }

Developer: h2r, Project: burlap_caffe, Lines of code: 57, Source: GridWorldDQN.java
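The learning policy in this example is constructed from epsilonStart, epsilonEnd, and epsilonAnnealDuration. The sketch below shows one common way such a schedule can work, assuming linear interpolation over the anneal duration. The example does not show how AnnealedEpsilonGreedy computes epsilon, so the linear form is an assumption, and LinearEpsilonSchedule is a hypothetical class that is not part of BURLAP or burlap_caffe.

```java
// Hypothetical helper for illustration; the linear schedule is an assumption.
class LinearEpsilonSchedule {
    private final double start;
    private final double end;
    private final long duration;

    LinearEpsilonSchedule(double start, double end, long duration) {
        if (duration <= 0) {
            throw new IllegalArgumentException("duration must be positive");
        }
        this.start = start;
        this.end = end;
        this.duration = duration;
    }

    /** Returns epsilon for the given step: start at step 0, end at or beyond duration. */
    double epsilonAt(long step) {
        if (step <= 0) {
            return start;
        }
        if (step >= duration) {
            return end;
        }
        double fraction = (double) step / duration;
        return start + fraction * (end - start);
    }

    public static void main(String[] args) {
        // Same constants as the example: anneal from 1 to 0.1 over 1,000,000 steps.
        LinearEpsilonSchedule schedule = new LinearEpsilonSchedule(1.0, 0.1, 1_000_000);
        for (long step : new long[] {0, 500_000, 1_000_000, 2_000_000}) {
            System.out.println("step " + step + " -> epsilon " + schedule.epsilonAt(step));
        }
    }
}
```

Under this assumption, exploration is highest at the start of training and stays fixed at epsilonEnd once the anneal duration has passed.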

Example 3: setPolicy

import burlap.behavior.policy.SolverDerivedPolicy; // import the required package/class
/**
 * Sets the policy to the provided one. Should be a policy that operates on a {@link burlap.behavior.valuefunction.QFunction}. Will automatically set its
 * Q-source to this object.
 * @param policy the policy to use.
 */
public void setPolicy(SolverDerivedPolicy policy){
	this.policy = (Policy)policy;
	policy.setSolver(this);
	
}

Developer: f-leno, Project: DOO-Q_BRACIS2016, Lines of code: 11, Source: ARTDP.java

Example 4: setPolicy

import burlap.behavior.policy.SolverDerivedPolicy; // import the required package/class
/**
 * Sets the policy to the provided one. Should be a policy that operates on a {@link QProvider}. Will automatically set its
 * Q-source to this object.
 * @param policy the policy to use.
 */
public void setPolicy(SolverDerivedPolicy policy){
	this.policy = (Policy)policy;
	policy.setSolver(this);
	
}

Developer: jmacglashan, Project: burlap, Lines of code: 11, Source: ARTDP.java
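Both setPolicy examples cast the argument to Policy at runtime, so passing an object that does not also implement Policy would fail with a ClassCastException. The sketch below contrasts that unchecked cast with a compile-time alternative based on a generic intersection bound. It is an illustration of the design choice only, not how ARTDP is written, and all Toy* types plus BoundPolicy and NotAPolicy are hypothetical stand-ins rather than BURLAP classes.

```java
// All types here are hypothetical stand-ins for illustration, not BURLAP classes.
class SetPolicySketch {

    interface ToySolver { }

    interface ToyPolicy {
        String action(String state);
    }

    interface ToySolverDerivedPolicy {
        void setSolver(ToySolver solver);
    }

    static class ToyLearner implements ToySolver {
        ToyPolicy policy;

        /** Same shape as the examples: the cast is only checked at runtime. */
        void setPolicyUnchecked(ToySolverDerivedPolicy p) {
            this.policy = (ToyPolicy) p; // throws ClassCastException if p is not a ToyPolicy
            p.setSolver(this);
        }

        /** Alternative: the intersection bound makes the compiler require both interfaces. */
        <P extends ToyPolicy & ToySolverDerivedPolicy> void setPolicy(P p) {
            this.policy = p;
            p.setSolver(this);
        }
    }

    /** Implements both interfaces, so it is accepted by either method. */
    static class BoundPolicy implements ToyPolicy, ToySolverDerivedPolicy {
        ToySolver solver;

        @Override public void setSolver(ToySolver solver) { this.solver = solver; }

        @Override public String action(String state) { return "noop"; }
    }

    /** Implements only the solver-binding interface; the checked method rejects it at compile time. */
    static class NotAPolicy implements ToySolverDerivedPolicy {
        @Override public void setSolver(ToySolver solver) { }
    }

    public static void main(String[] args) {
        ToyLearner learner = new ToyLearner();

        BoundPolicy good = new BoundPolicy();
        learner.setPolicy(good);
        System.out.println("bound to learner: " + (good.solver == learner));

        try {
            learner.setPolicyUnchecked(new NotAPolicy());
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed at runtime");
        }
        // learner.setPolicy(new NotAPolicy()); // would not compile
    }
}
```

The trade-off is that the generic signature is harder to read and only helps when the caller's static type already carries both interfaces, as in Example 2 where the variables are declared as SolverDerivedPolicy only.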

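Note: The burlap.behavior.policy.SolverDerivedPolicy class examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects contributed by their respective developers, and copyright in the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.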