
Java TrainingParameters Class Code Examples

This article collects typical usage examples of opennlp.tools.util.TrainingParameters in Java. If you are wondering how the Java TrainingParameters class is used in practice, or are looking for working examples of it, the curated class examples here may help.


The TrainingParameters class belongs to the opennlp.tools.util package. The 15 code examples of the class shown below are sorted by popularity by default.

Example 1: getNLPModel

import opennlp.tools.util.TrainingParameters; // import the required package/class
public static DoccatModel getNLPModel(File openNLPTraining) throws IOException {
	DoccatModel model = null;

	FeatureGenerator[] def = { new BagOfWordsFeatureGenerator() };
	WhitespaceTokenizer tokenizer = WhitespaceTokenizer.INSTANCE;

	DoccatFactory factory = new DoccatFactory(tokenizer, def);
	InputStreamFactory isf = new MarkableFileInputStreamFactory(openNLPTraining);
	ObjectStream<String> lineStream = new PlainTextByLineStream(isf, "UTF-8");
	ObjectStream<DocumentSample> sampleStream = new DocumentSampleStream(lineStream);

	TrainingParameters params = TrainingParameters.defaultParams();
	System.out.println(params.algorithm());
	params.put(TrainingParameters.CUTOFF_PARAM, Integer.toString(0));
	params.put(TrainingParameters.ITERATIONS_PARAM, Integer.toString(4000));

	model = DocumentCategorizerME.train("en", sampleStream, params, factory);
	
	evaluateDoccatModel(model, openNLPTraining);

	return model;

}
 
Developer ID: SOBotics, Project: SOCVFinder, Lines of code: 24, Source: ModelCreator.java

Example 2: main

import opennlp.tools.util.TrainingParameters; // import the required package/class
public static void main(String[] args) {
    if (args.length < 2) {
        System.out.println("usage: <input> <output>\n");
        System.exit(0);
    }

    String input = args[0];
    String output = args[1];

    TrainingParameters params = new TrainingParameters();
    params.put(TrainingParameters.CUTOFF_PARAM, Integer.toString(0));
    params.put(TrainingParameters.ITERATIONS_PARAM, Integer.toString(100));
    //params.put(TrainingParameters.ALGORITHM_PARAM, NaiveBayesTrainer.NAIVE_BAYES_VALUE);

    AgeClassifyModel model;
    try {
        model = AgeClassifySparkTrainer.createModel("en", input,
            "opennlp.tools.tokenize.SentenceTokenizer", "opennlp.tools.tokenize.BagOfWordsTokenizer", params);
    } catch (IOException e) {
        throw new TerminateToolException(-1,
            "IO error while reading training data or indexing data: " + e.getMessage(), e);
    }
    CmdLineUtil.writeModel("age classifier", new File(output), model);
}
 
Developer ID: USCDataScience, Project: AgePredictor, Lines of code: 25, Source: AgeClassifySparkTrainer.java

Example 3: train

import opennlp.tools.util.TrainingParameters; // import the required package/class
public static AgeClassifyModel train(String languageCode,
        ObjectStream<AuthorAgeSample> samples, TrainingParameters trainParams,
        AgeClassifyFactory factory) throws IOException {

    Map<String, String> entries = new HashMap<String, String>();

    MaxentModel ageModel = null;

    TrainerType trainerType = AgeClassifyTrainerFactory
        .getTrainerType(trainParams.getSettings());

    ObjectStream<Event> eventStream = new AgeClassifyEventStream(samples,
        factory.createContextGenerator());

    EventTrainer trainer = AgeClassifyTrainerFactory
        .getEventTrainer(trainParams.getSettings(), entries);
    ageModel = trainer.train(eventStream);

    Map<String, String> manifestInfoEntries = new HashMap<String, String>();

    return new AgeClassifyModel(languageCode, ageModel, manifestInfoEntries,
        factory);
}
 
Developer ID: USCDataScience, Project: AgePredictor, Lines of code: 24, Source: AgeClassifyME.java

Example 4: trainSentences

import opennlp.tools.util.TrainingParameters; // import the required package/class
public static void trainSentences(final String inResource, String outFile) throws IOException {
    InputStreamFactory inputStreamFactory = new InputStreamFactory() {
        @Override
        public InputStream createInputStream() throws IOException {
            return Trainer.class.getResourceAsStream(inResource);
        }
    };
    SentenceSampleStream samples = new SentenceSampleStream(new PlainTextByLineStream(inputStreamFactory, StandardCharsets.UTF_8));
    TrainingParameters trainingParameters = new TrainingParameters();
    trainingParameters.put(TrainingParameters.ALGORITHM_PARAM, ModelType.MAXENT.name());
    trainingParameters.put(TrainingParameters.ITERATIONS_PARAM, "100");
    trainingParameters.put(TrainingParameters.CUTOFF_PARAM, "0");
    SentenceDetectorFactory sentenceDetectorFactory = SentenceDetectorFactory.create(null, "en", true, null, ".?!".toCharArray());
    SentenceModel sentdetectModel = SentenceDetectorME.train("en", samples, sentenceDetectorFactory, trainingParameters);
    //.train("en", samples, true, null, 100, 0);
    samples.close();
    FileOutputStream out = new FileOutputStream(outFile);
    sentdetectModel.serialize(out);
    out.close();
}
 
Developer ID: jprante, Project: elasticsearch-analysis-opennlp, Lines of code: 21, Source: Trainer.java

Example 5: trainChunker

import opennlp.tools.util.TrainingParameters; // import the required package/class
public static void trainChunker(final String inResource, String outFile) throws IOException {
    InputStreamFactory inputStreamFactory = new InputStreamFactory() {
        @Override
        public InputStream createInputStream() throws IOException {
            return Trainer.class.getResourceAsStream(inResource);
        }
    };
    ChunkSampleStream samples = new ChunkSampleStream(new PlainTextByLineStream(inputStreamFactory, StandardCharsets.UTF_8));
    TrainingParameters trainingParameters = new TrainingParameters();
    trainingParameters.put(TrainingParameters.ITERATIONS_PARAM, "70");
    trainingParameters.put(TrainingParameters.CUTOFF_PARAM, "1");

    ChunkerFactory chunkerFactory = ChunkerFactory.create(null);
    ChunkerModel model = ChunkerME.train("en", samples, trainingParameters, chunkerFactory);
    //ChunkerME.train("en", samples, 1, 70);
    samples.close();
    FileOutputStream out = new FileOutputStream(outFile);
    model.serialize(out);
    out.close();
}
 
Developer ID: jprante, Project: elasticsearch-analysis-opennlp, Lines of code: 21, Source: Trainer.java

Example 6: trainNameFinder

import opennlp.tools.util.TrainingParameters; // import the required package/class
public static void trainNameFinder(final String inResource, String outFile) throws IOException {
    InputStreamFactory inputStreamFactory = new InputStreamFactory() {
        @Override
        public InputStream createInputStream() throws IOException {
            return Trainer.class.getResourceAsStream(inResource);
        }
    };
    NameSampleDataStream samples = new NameSampleDataStream(new PlainTextByLineStream(inputStreamFactory, StandardCharsets.UTF_8));
    TrainingParameters trainingParameters = new TrainingParameters();
    trainingParameters.put(TrainingParameters.ITERATIONS_PARAM, "5");
    trainingParameters.put(TrainingParameters.CUTOFF_PARAM, "200");
    byte[] featureGeneratorBytes = null;
    Map<String, Object> resources = Collections.<String, Object>emptyMap();
    SequenceCodec<String> seqCodec = new BioCodec();
    TokenNameFinderFactory tokenNameFinderFactory = TokenNameFinderFactory.create(null, featureGeneratorBytes, resources, seqCodec);
    TokenNameFinderModel model = NameFinderME.train("en", "person", samples, trainingParameters, tokenNameFinderFactory);
    //NameFinderME.train("en", "person", samples, Collections.<String, Object>emptyMap(), 200, 5);
    samples.close();
    FileOutputStream out = new FileOutputStream(outFile);
    model.serialize(out);
    out.close();
}
 
Developer ID: jprante, Project: elasticsearch-analysis-opennlp, Lines of code: 24, Source: Trainer.java
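
Examples 4 to 6 close their output streams by hand, so an exception thrown inside serialize would leave the file handle open. The write step could instead be sketched with try-with-resources, as below. This is an illustration under assumptions, not the project's code: `SerializeSketch`, `SerializableModel`, and `writeModel` are hypothetical names, and the interface stands in for any model type that exposes `serialize(OutputStream)`, so the block runs without OpenNLP on the classpath.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: none of these names come from OpenNLP or the projects above.
final class SerializeSketch {

    // Stand-in for a trained model that can write itself to a stream.
    interface SerializableModel {
        void serialize(OutputStream out) throws IOException;
    }

    // Writes the model to outFile; the stream is closed even if serialize throws.
    static void writeModel(SerializableModel model, Path outFile) throws IOException {
        try (OutputStream out = Files.newOutputStream(outFile)) {
            model.serialize(out);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("model", ".bin");
        // A dummy "model" that just writes a few bytes.
        writeModel(out -> out.write("dummy".getBytes(StandardCharsets.UTF_8)), tmp);
        System.out.println("bytes written: " + Files.size(tmp));
        Files.delete(tmp);
    }
}
```

With a real model object, a method reference such as `model::serialize` could be passed as the first argument, provided its signature matches the hypothetical interface.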

Example 7: train

import opennlp.tools.util.TrainingParameters; // import the required package/class
public final LemmatizerModel train(final TrainingParameters params) {
  // features
  if (getLemmatizerFactory() == null) {
    throw new IllegalStateException(
        "Classes derived from AbstractLemmatizerTrainer must "
            + "create LemmatizerFactory features!");
  }
  // training model
  LemmatizerModel trainedModel = null;
  LemmatizerEvaluator lemmatizerEvaluator = null;
  try {
    trainedModel = LemmatizerME.train(this.lang, this.trainSamples, params,
        getLemmatizerFactory());
    final LemmatizerME lemmatizer = new LemmatizerME(trainedModel);
    lemmatizerEvaluator = new LemmatizerEvaluator(lemmatizer);
    lemmatizerEvaluator.evaluate(this.testSamples);
  } catch (final IOException e) {
    System.err.println("IO error while loading training and test sets!");
    e.printStackTrace();
    System.exit(1);
  }
  System.out.println("Final result: " + lemmatizerEvaluator.getWordAccuracy());
  return trainedModel;
}
 
Developer ID: ixa-ehu, Project: ixa-pipe-pos, Lines of code: 25, Source: AbstractLemmatizerTrainer.java

Example 8: AbstractTaggerTrainer

import opennlp.tools.util.TrainingParameters; // import the required package/class
/**
 * Construct an AbstractTrainer. In the params parameter there is information
 * about the language, the featureset, and whether to use pos tag dictionaries
 * or automatically created dictionaries from the training set.
 * 
 * @param params
 *          the training parameters
 * @throws IOException
 *           the io exceptions
 */
public AbstractTaggerTrainer(final TrainingParameters params) throws IOException {
  this.lang = Flags.getLanguage(params);
  final String trainData = Flags.getDataSet("TrainSet", params);
  final String testData = Flags.getDataSet("TestSet", params);
  final ObjectStream<String> trainStream = InputOutputUtils
      .readFileIntoMarkableStreamFactory(trainData);
  this.trainSamples = new MorphoSampleStream(trainStream);
  final ObjectStream<String> testStream = InputOutputUtils
      .readFileIntoMarkableStreamFactory(testData);
  this.testSamples = new MorphoSampleStream(testStream);
  final ObjectStream<String> dictStream = InputOutputUtils
      .readFileIntoMarkableStreamFactory(trainData);
  setDictSamples(new MorphoSampleStream(dictStream));
  this.dictCutOff = Flags.getAutoDictFeatures(params);
  this.ngramCutOff = Flags.getNgramDictFeatures(params);

}
 
Developer ID: ixa-ehu, Project: ixa-pipe-pos, Lines of code: 28, Source: AbstractTaggerTrainer.java

Example 9: train

import opennlp.tools.util.TrainingParameters; // import the required package/class
public final POSModel train(final TrainingParameters params) {
  // features
  if (getPosTaggerFactory() == null) {
    throw new IllegalStateException(
        "Classes derived from AbstractTrainer must "
            + "create POSTaggerFactory features!");
  }
  // training model
  POSModel trainedModel = null;
  POSEvaluator posEvaluator = null;
  try {
    trainedModel = POSTaggerME.train(this.lang, this.trainSamples, params,
        getPosTaggerFactory());
    final POSTaggerME posTagger = new POSTaggerME(trainedModel);
    posEvaluator = new POSEvaluator(posTagger);
    posEvaluator.evaluate(this.testSamples);
  } catch (final IOException e) {
    System.err.println("IO error while loading training and test sets!");
    e.printStackTrace();
    System.exit(1);
  }
  System.out.println("Final result: " + posEvaluator.getWordAccuracy());
  return trainedModel;
}
 
Developer ID: ixa-ehu, Project: ixa-pipe-pos, Lines of code: 25, Source: AbstractTaggerTrainer.java

Example 10: train

import opennlp.tools.util.TrainingParameters; // import the required package/class
/**
 * Main entry point for training.
 * 
 * @throws IOException
 *           throws an exception if errors in the various file inputs.
 */
public final void train() throws IOException {
  // load training parameters file
  final String paramFile = this.parsedArguments.getString("params");
  final TrainingParameters params = InputOutputUtils
      .loadTrainingParameters(paramFile);
  String outModel = null;
  if (params.getSettings().get("OutputModel") == null
      || params.getSettings().get("OutputModel").length() == 0) {
    outModel = Files.getNameWithoutExtension(paramFile) + ".bin";
    params.put("OutputModel", outModel);
  } else {
    outModel = Flags.getModel(params);
  }
  final Trainer chunkerTrainer = new DefaultTrainer(params);
  final ChunkerModel trainedModel = chunkerTrainer.train(params);
  CmdLineUtil.writeModel("ixa-pipe-chunk", new File(outModel), trainedModel);
}
 
Developer ID: ixa-ehu, Project: ixa-pipe-chunk, Lines of code: 24, Source: CLI.java
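
The fallback in Example 10 (use the configured `OutputModel`, otherwise derive `<params file name>.bin`) can be sketched with the standard library alone. This is an illustration under assumptions, not the project's code: `java.util.Properties` stands in for `TrainingParameters`, `nameWithoutExtension` approximates the Guava helper called above, and the sketch assumes that `Flags.getModel` simply returns the `OutputModel` setting. All class and method names in the block are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: Properties replaces TrainingParameters so this runs without OpenNLP.
final class OutputModelSketch {

    // Drops any directory part and the last extension: "conf/en-chunk.properties" -> "en-chunk".
    static String nameWithoutExtension(String path) {
        int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        String name = path.substring(slash + 1);
        int dot = name.lastIndexOf('.');
        return dot == -1 ? name : name.substring(0, dot);
    }

    // Returns the configured OutputModel, or "<param file name>.bin" when it is
    // missing or empty. The fallback is also stored back into the settings.
    static String resolveOutputModel(Properties settings, String paramFile) {
        String configured = settings.getProperty("OutputModel");
        if (configured == null || configured.isEmpty()) {
            String fallback = nameWithoutExtension(paramFile) + ".bin";
            settings.setProperty("OutputModel", fallback);
            return fallback;
        }
        return configured;
    }

    // Parses key=value lines, the same layout a parameters file would use.
    static Properties parse(String text) {
        Properties settings = new Properties();
        try {
            settings.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return settings;
    }

    public static void main(String[] args) {
        Properties withoutModel = parse("Language=en\nIterations=100\n");
        System.out.println(resolveOutputModel(withoutModel, "conf/en-chunk.properties"));

        Properties withModel = parse("Language=en\nOutputModel=custom.bin\n");
        System.out.println(resolveOutputModel(withModel, "conf/en-chunk.properties"));
    }
}
```

Keeping the fallback in one small method makes the empty-string case explicit, which is the branch most easily missed when the setting is present in the file but left blank.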

Example 11: train

import opennlp.tools.util.TrainingParameters; // import the required package/class
public final ChunkerModel train(final TrainingParameters params) {
  // features
  if (getChunkerFactory() == null) {
    throw new IllegalStateException(
        "Classes derived from AbstractTrainer must "
            + "create ChunkerFactory features!");
  }
  // training model
  ChunkerModel trainedModel = null;
  ChunkerEvaluator chunkerEvaluator = null;
  try {
    trainedModel = ChunkerME.train(lang, trainSamples, params,
        getChunkerFactory());
    final Chunker chunker = new ChunkerME(trainedModel);
    chunkerEvaluator = new ChunkerEvaluator(chunker);
    chunkerEvaluator.evaluate(this.testSamples);
  } catch (IOException e) {
    System.err.println("IO error while loading training and test sets!");
    e.printStackTrace();
    System.exit(1);
  }
  System.out.println("Final result: " + chunkerEvaluator.getFMeasure());
  return trainedModel;
}
 
Developer ID: ixa-ehu, Project: ixa-pipe-chunk, Lines of code: 25, Source: AbstractTrainer.java

Example 12: train

import opennlp.tools.util.TrainingParameters; // import the required package/class
public static void train(String file_train, String file_model) throws IOException {
	DoccatModel model = null;
	ObjectStream<String> lineStream =
			new PlainTextByLineStream(new MarkableFileInputStreamFactory(
					new File(file_train)), "UTF-8");
	ObjectStream<DocumentSample> sampleStream =
			new DocumentSampleStream(lineStream);

	TrainingParameters param = TrainingParameters.defaultParams();
	DoccatFactory factory = new DoccatFactory();
	model = DocumentCategorizerME.train("en", sampleStream, param, factory);

	// Close the output stream even if serialization fails.
	try (FileOutputStream out = new FileOutputStream(file_model)) {
		model.serialize(out);
	}
}
 
Developer ID: jackeylu, Project: NLP_with_Java_zh, Lines of code: 15, Source: SentenceTest.java

Example 13: crossValidate

import opennlp.tools.util.TrainingParameters; // import the required package/class
/**
 * Main access to the cross validation.
 * @throws IOException
 *           input output exception if problems with corpora
 */
public final void crossValidate() throws IOException {

  final String paramFile = this.parsedArguments.getString("params");
  final TrainingParameters params = InputOutputUtils
      .loadTrainingParameters(paramFile);
  final POSCrossValidator crossValidator = new POSCrossValidator(params);
  crossValidator.crossValidate(params);
}
 
Developer ID: ixa-ehu, Project: ixa-pipe-pos, Lines of code: 14, Source: CLI.java

Example 14: getComponent

import opennlp.tools.util.TrainingParameters; // import the required package/class
public static String getComponent(final TrainingParameters params) {
  String component = null;
  if (params.getSettings().get("Component") == null) {
    componentException();
  } else {
    component = params.getSettings().get("Component");
  }
  return component;
}
 
Developer ID: ixa-ehu, Project: ixa-pipe-pos, Lines of code: 10, Source: Flags.java

Example 15: getLanguage

import opennlp.tools.util.TrainingParameters; // import the required package/class
public static String getLanguage(final TrainingParameters params) {
  String lang = null;
  if (params.getSettings().get("Language") == null) {
    langException();
  } else {
    lang = params.getSettings().get("Language");
  }
  return lang;
}
 
Developer ID: ixa-ehu, Project: ixa-pipe-pos, Lines of code: 10, Source: Flags.java
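
Examples 14 and 15 repeat the same null-check-then-read pattern for different keys. One possible way to factor it out is sketched below. `requireSetting` is a hypothetical helper that works on a plain `Map<String, String>`, like the one `getSettings()` returns in these examples. The sketch assumes that throwing `IllegalArgumentException` is an acceptable replacement for `componentException()` and `langException()`, whose behavior is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a single lookup helper instead of one method per key.
final class RequiredSettingSketch {

    // Returns the value stored under key, or fails loudly when it is absent.
    static String requireSetting(Map<String, String> settings, String key) {
        String value = settings.get(key);
        if (value == null) {
            throw new IllegalArgumentException(
                "Missing required training parameter: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put("Language", "en");

        System.out.println(requireSetting(settings, "Language"));

        try {
            requireSetting(settings, "Component");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The per-key methods could then delegate to the helper, for example by passing `params.getSettings()` and `"Language"`, which keeps the key names in one place per caller.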


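Note: The opennlp.tools.util.TrainingParameters class examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects contributed by their developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.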