C++ Environment::act Method Code Examples

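This article collects typical usage examples of the Environment::act method in C++. If you are unsure how Environment::act is called or what it looks like in real code, the examples selected here may help. You can also look at usage examples of Environment, the class this method belongs to.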

Two code examples of Environment::act are shown below, sorted by popularity by default.

Example 1: evaluatePolicy

double HumanAgent::evaluatePolicy(Environment<bool>& env){
#ifdef __USE_SDL
	Action action;
	int reward = 0;
	int totalReward = 0;
	int cumulativeReward = 0;

	//Repeat (for each episode):
	for(int episode = 0; episode < numEpisodesToEval; episode++){
		int step = 0;
		//Play until the game ends or the per-episode step limit is reached:
		while(!env.game_over() && step < maxStepsInEpisode) {
			action = receiveAction();
			//If one wants to save trajectories, this is where the trajectory is saved:
			if(toSaveTrajectory){
				saveTrajectory(action);
			}
			//Apply the action and accumulate the reward it returns:
			reward = env.act(action);
			cumulativeReward += reward;
			step++;
		}
		printf("Episode %d, Cumulative Reward: %d\n", episode + 1, cumulativeReward);
		totalReward += cumulativeReward;
		cumulativeReward = 0;
		env.reset_game(); //Start the game again when the episode is over
	}
	//Average reward per episode:
	return double(totalReward)/numEpisodesToEval;
#else
	//Assumption: the snippet as listed has no matching #endif; without SDL nothing is evaluated.
	return 0.0;
#endif
}

Developer: mcmachado, Project: ALEResearch, Source file: HumanAgent.cpp
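The loop in Example 1 depends on only three calls on the environment: act, game_over, and reset_game. To try the same loop shape without ALE or SDL, here is a minimal sketch. StubEnvironment, its reward rule, the int alias Action, and the free function evaluatePolicy with a chooseAction callback are all hypothetical stand-ins introduced for illustration; they are not part of ALEResearch.

```cpp
#include <cassert>

// Hypothetical stand-in for the action type used in the examples.
using Action = int;

// Hypothetical environment: an episode ends after a fixed number of steps,
// and act() pays a reward of 1 for action 1 and 0 for anything else.
class StubEnvironment {
public:
    explicit StubEnvironment(int episodeLength) : episodeLength_(episodeLength) {}

    int act(Action a) {
        ++stepsTaken_;
        return a == 1 ? 1 : 0;
    }
    bool game_over() const { return stepsTaken_ >= episodeLength_; }
    void reset_game() { stepsTaken_ = 0; }

private:
    int episodeLength_;
    int stepsTaken_ = 0;
};

// Same structure as the evaluatePolicy examples: run a number of episodes,
// sum the rewards returned by act(), and return the per-episode average.
// chooseAction is called with the current step index and returns an Action.
template <typename Env, typename Policy>
double evaluatePolicy(Env& env, Policy chooseAction, int numEpisodes, int maxSteps) {
    assert(numEpisodes > 0);  // avoid dividing by zero below
    long totalReward = 0;
    for (int episode = 0; episode < numEpisodes; ++episode) {
        int cumulativeReward = 0;
        int step = 0;
        while (!env.game_over() && step < maxSteps) {
            cumulativeReward += env.act(chooseAction(step));
            ++step;
        }
        totalReward += cumulativeReward;
        env.reset_game();  // start the next episode from a fresh game
    }
    return static_cast<double>(totalReward) / numEpisodes;
}
```

Calling evaluatePolicy(env, [](int){ return 1; }, 3, 100) on a StubEnvironment with an episode length of 10 plays three full episodes. Lowering maxSteps below the episode length cuts each episode short, which mirrors the role of maxStepsInEpisode in the example.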

Example 2: evaluatePolicy

double RandomAgent::evaluatePolicy(Environment<bool>& env){
	int reward = 0;
	int totalReward = 0;
	int cumulativeReward = 0;
	int numActions;
	ActionVect actions;
	//Check if one wants to sample from all possible actions or only the valid ones:
	if(useMinActions){
		actions = env.getMinimalActionSet();
	}
	else{
		actions = env.getLegalActionSet();
	}
	numActions = actions.size();
	printf("Number of Actions: %d\n\n", numActions);
	//Repeat (for each episode):
	for(int episode = 0; episode < numEpisodesToEval; episode++){
		int step = 0;
		while(!env.game_over() && step < maxStepsInEpisode) {
			reward = env.act(actions[rand()%numActions]);
			cumulativeReward += reward;
			step++;
		}
		printf("Episode %d, Cumulative Reward: %d\n", episode + 1, cumulativeReward);
		totalReward += cumulativeReward;
		cumulativeReward = 0;
		env.reset_game(); //Start the game again when the episode is over
	}
	return double(totalReward)/numEpisodesToEval;
}

Developer: mcmachado, Project: ALEResearch, Source file: RandomAgent.cpp
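Example 2 picks an action with rand() % numActions. That relies on seeding done elsewhere, is slightly biased whenever numActions does not evenly divide RAND_MAX + 1, and divides by zero if the action set is empty. One possible alternative, sketched here with the standard <random> header, is shown below. The helper name sampleAction is hypothetical and does not come from ALEResearch.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: return one element of `actions`, chosen uniformly
// at random using the caller-supplied random engine.
template <typename ActionT, typename Rng>
ActionT sampleAction(const std::vector<ActionT>& actions, Rng& rng) {
    assert(!actions.empty());  // an empty action set cannot be sampled
    // Both bounds are inclusive, so the upper bound is size() - 1.
    std::uniform_int_distribution<std::size_t> pick(0, actions.size() - 1);
    return actions[pick(rng)];
}
```

With an engine such as std::mt19937 rng(seed), the call env.act(actions[rand()%numActions]) could then be written as env.act(sampleAction(actions, rng)), assuming ActionVect behaves like a std::vector of actions. Passing the engine in explicitly also makes runs reproducible for a fixed seed.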

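Note: The Environment::act examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets are taken from open-source projects, and copyright in the source code remains with the original authors; consult each project's License before distributing or using the code. Do not reproduce without permission.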