Java Tesseract.doOCR Method Code Examples

This article collects typical usage examples of the Java method net.sourceforge.tess4j.Tesseract.doOCR. If you are wondering how Tesseract.doOCR is used in Java or how to call it, the selected examples here may help. You can also look further into usage examples of the enclosing class, net.sourceforge.tess4j.Tesseract.


Twelve code examples of the Tesseract.doOCR method are shown below, sorted by popularity by default.

Example 1: map

import net.sourceforge.tess4j.Tesseract; // import the package/class the method depends on
public void map(LongWritable key, Text url, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {

            File videoDownloadDir = Files.createTempDir();
            VGet v = new VGet(new URL(url.toString()), videoDownloadDir);
            v.download();
            System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
            File[] videoFiles = videoDownloadDir.listFiles();
            Arrays.sort(videoFiles);
            File[] videoFramesFiles = VideoProcessing.parseVideo(videoFiles[0], 70);
            File[] processedVideoFrames = VideoProcessing.cutImages(videoFramesFiles);

            Tesseract instance = Tesseract.getInstance();
            instance.setDatapath("/usr/share/tesseract-ocr");
            instance.setTessVariable("LC_NUMERIC", "C");

            for (File image: processedVideoFrames) {
                String result = null;
                try {
                    result = instance.doOCR(image);
                } catch (TesseractException e) {
                    e.printStackTrace();
                }
                // result stays null when doOCR throws, so guard against a NullPointerException
                if (result != null && !result.isEmpty()) {
                    word.set(result);
                    output.collect(url, word);
                }
            }
        }
 
Developer ID: yurinnick, Project: hadoop-video-ocr, Lines of code: 29, Source: HadoopOCR.java

Example 2: main

import net.sourceforge.tess4j.Tesseract; // import the package/class the method depends on
public static void main(final String[] args) {
  try {
    String _property = System.getProperty("java.io.tmpdir");
    System.out.println(_property);
    String _env = System.getenv("TESSDATA_PREFIX");
    System.out.println(_env);
    Image _image = new Image("d:\\test\\pdf\\test10.png");
    BufferedImage orgin = _image.getAsBufferedImage();
    BufferedImage textImage = ImageHelper.convertImageToGrayscale(orgin);
    int _width = textImage.getWidth();
    int _multiply = (_width * 5);
    int _height = textImage.getHeight();
    int _multiply_1 = (_height * 5);
    BufferedImage _scaledInstance = ImageHelper.getScaledInstance(textImage, _multiply, _multiply_1);
    textImage = _scaledInstance;
    Tesseract instance = Tesseract.getInstance();
    instance.setLanguage("chi_sim");
    System.out.println("instance done");
    String result = instance.doOCR(textImage);
    System.out.println(result);
  } catch (Throwable _e) {
    throw Exceptions.sneakyThrow(_e);
  }
}
 
Developer ID: East196, Project: maker, Lines of code: 25, Source: Tess4Java.java
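
Example 2 converts the image to grayscale and enlarges it fivefold with tess4j's ImageHelper before calling doOCR. As a rough illustration of that preprocessing step using only java.awt, the sketch below may help; the class name OcrPreprocess and its method names are hypothetical and are not part of tess4j, and the bicubic interpolation is an assumption rather than what ImageHelper necessarily does.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of tess4j: plain java.awt preprocessing before OCR.
public class OcrPreprocess {

    /** Returns a grayscale copy of the source image. */
    public static BufferedImage toGrayscale(BufferedImage src) {
        BufferedImage gray = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = gray.createGraphics();
        try {
            // Drawing into a TYPE_BYTE_GRAY target performs the color conversion.
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return gray;
    }

    /** Returns a copy enlarged by an integer factor (bicubic interpolation assumed). */
    public static BufferedImage scale(BufferedImage src, int factor) {
        if (factor < 1) {
            throw new IllegalArgumentException("factor must be >= 1");
        }
        int width = src.getWidth() * factor;
        int height = src.getHeight() * factor;
        // TYPE_CUSTOM cannot be passed to the BufferedImage constructor, so fall back to ARGB.
        int type = src.getType() == BufferedImage.TYPE_CUSTOM
                ? BufferedImage.TYPE_INT_ARGB : src.getType();
        BufferedImage out = new BufferedImage(width, height, type);
        Graphics2D g = out.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(src, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

The result of OcrPreprocess.scale(OcrPreprocess.toGrayscale(image), 5) could then be handed to doOCR(BufferedImage) in the same way as textImage in Example 2.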

Example 3: recognizeText

import net.sourceforge.tess4j.Tesseract; // import the package/class the method depends on
/**
 * Zooms the text image to make it easier to read
 * */
public static String recognizeText(Image image) {
	LibraryLoaderSingleton.getInstance();
	Image scaledImage = image.scale(8);
	Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
	instance.setLanguage("eng");
	System.setProperty("jna.encoding", "UTF8");
	instance.setOcrEngineMode(TessAPI.TessOcrEngineMode.OEM_DEFAULT);
	try {
		String result = instance.doOCR(scaledImage.getInnerImage());
		return result;
	} catch (TesseractException e) {
		throw new IllegalStateException(e);
	} catch (Exception ex) {
		// Keep the original exception as the cause so its stack trace is not lost
		throw new IllegalStateException("An error during text recognition was encountered.", ex);
	}
	
}
 
Developer ID: gpeshterski, Project: chart-recognition-library, Lines of code: 22, Source: OCRReader.java

Example 4: recognizeYText

import net.sourceforge.tess4j.Tesseract; // import the package/class the method depends on
public static String recognizeYText(Image image) {
	LibraryLoaderSingleton.getInstance();
	Image scaledImage = image.scale(8);
	Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
	instance.setLanguage("eng");
	System.setProperty("jna.encoding", "UTF8");
	instance.setOcrEngineMode(TessAPI.TessOcrEngineMode.OEM_DEFAULT);
	try {
		String result = instance.doOCR(scaledImage.getInnerImage());
		return result;
	} catch (TesseractException e) {
		throw new IllegalStateException(e);
	} catch (Exception ex) {
		// Keep the original exception as the cause so its stack trace is not lost
		throw new IllegalStateException("An error during text recognition was encountered.", ex);
	}
	
}
 
Developer ID: gpeshterski, Project: chart-recognition-library, Lines of code: 19, Source: OCRReader.java

Example 5: main

import net.sourceforge.tess4j.Tesseract; // import the package/class the method depends on
public static void main(String[] args){
		try {
			boolean load = true;
			load = false;
//			BufferedImage image = ImageIO.read(new URL("http://www.miitbeian.gov.cn/captcha.jpg")) ;
//			if(load){
//				ImageIO.write(image, "jpg", new File("E:/captcha.jpg") );
//			}else{
//				image = ImageIO.read(new File("D:\\爬虫测试\\yzm\\111.png")) ;
//			}
			BufferedImage image = ImageIO.read(new File("D:\\爬虫测试\\yzm\\11.jpg")) ;
//			image = ImageUtil.grayFilter(image);
			image = ImageUtil.binaryFilter(image);
			image = ImageUtil.lineFilter(image);
//			image = ImageUtil.lineFilter(image);
//			image = ImageUtil.line2Filter(image);
//			image = ImageUtil.point2Filter(image);
//			image = ImageUtil.lineFilter(image);
			image = ImageUtil.meanFilter(image);
//			image = ImageUtil.lineFilter(image);
//			image = ImageUtil.binaryFilter(image);
			
			
			File imageFile = new File("E:/captcha5.jpg");
//			imageFile = new File("E:/test/test.jpg");
			
			ImageIO.write(image, "jpg", imageFile);
			
			Tesseract tesseract = Tesseract.getInstance();
			tesseract.setLanguage("eng");
			String code = tesseract.doOCR(imageFile);

			System.out.println(code);
			
		} catch (Exception e) {
			e.printStackTrace();
		}

	}
 
Developer ID: DMinerJackie, Project: JewelCrawler, Lines of code: 40, Source: ImageUtil.java

Example 6: detect

import net.sourceforge.tess4j.Tesseract; // import the package/class the method depends on
@Override
public String detect(String filePath) {
    File imageFile = new File(filePath);
    Tesseract tess = new Tesseract();

    tess.setLanguage("hun");

    try {
        String result = tess.doOCR(imageFile);
        return result;
    } catch (TesseractException e) {
        return "ERROR";
    }
}
 
Developer ID: gaborvecsei, Project: OCR-libraries, Lines of code: 15, Source: TesseractDetection.java

Example 7: getConvertedBoard

import net.sourceforge.tess4j.Tesseract; // import the package/class the method depends on
public static char[][] getConvertedBoard(BufferedImage[] tiles) throws TesseractException {
	Tesseract reader = new Tesseract();
	char[][] ret = new char[Player.BOARD_HEIGHT][Player.BOARD_WIDTH];
	BufferedImage processedImage;
	String convertedTile;
	for(int j = 0; j < Player.BOARD_HEIGHT; j++) {
		for(int k = 0; k < Player.BOARD_WIDTH; k++) {
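			// Row-major index: the row stride is the board width (the original used BOARD_HEIGHT, which only works for square boards)
			processedImage = ImageHelper.convertImageToGrayscale(tiles[j*Player.BOARD_WIDTH+k]);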
			convertedTile = reader.doOCR(processedImage);
			ret[j][k] = BoardConstructor.getLastAlpha(convertedTile);
		}
	}
	return ret;
}
 
Developer ID: akshaths, Project: WordamentPlayer, Lines of code: 15, Source: BoardConstructor.java

Example 8: ocr

import net.sourceforge.tess4j.Tesseract; // import the package/class the method depends on
public static String ocr(File file) {
    Tesseract instance = Tesseract.getInstance();  // JNA Interface Mapping
    instance.setDatapath(tessdataPath);
    instance.setLanguage("eng");
    //instance.setLanguage("number");
    String result = "";
    try {
        result = instance.doOCR(file);
    } catch (TesseractException e) {
        System.err.println(e.getMessage());
    }
    return result;
}
 
Developer ID: fivesmallq, Project: tesseract-ocr-demo, Lines of code: 16, Source: App.java

Example 9: recognizeXText

import net.sourceforge.tess4j.Tesseract; // import the package/class the method depends on
public static String recognizeXText(Image image) {
	LibraryLoaderSingleton.getInstance();
	Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
	instance.setOcrEngineMode(TessAPI.TessOcrEngineMode.OEM_TESSERACT_ONLY);
	BufferedImage img = getScaledImage(image.getInnerImage(), image.getInnerImage().getWidth()*2, image.getInnerImage().getHeight()*2);  
	img = thresholdImage(img, 165);
	
	try {
		String result = instance.doOCR(img);
		return result;
	} catch (TesseractException e) {
		throw new IllegalStateException(e);
	} catch (Exception ex) {
		// Keep the original exception as the cause so its stack trace is not lost
		throw new IllegalStateException("An error during text recognition was encountered.", ex);
	}
	
}
 
Developer ID: gpeshterski, Project: chart-recognition-library, Lines of code: 19, Source: OCRReader.java
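
Example 9 calls thresholdImage(img, 165), a helper whose body is not shown on this page. The sketch below is one plausible form of fixed-threshold binarization, not the project's actual implementation. It assumes that pixels whose luminance is at or above the threshold become white and all others black; the class name Binarizer and the luminance weights are likewise assumptions.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper illustrating fixed-threshold binarization before OCR.
public class Binarizer {

    /** Returns a black-and-white copy: luminance >= threshold becomes white, otherwise black. */
    public static BufferedImage threshold(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of the common 0.299/0.587/0.114 luminance weights.
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                out.setRGB(x, y, luma >= threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return out;
    }
}
```

A threshold such as 165 suits dark text on a light background; whether it helps recognition depends on the input images, so it is worth checking against real samples.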

Example 10: performOcr

import net.sourceforge.tess4j.Tesseract; // import the package/class the method depends on
/**
 * Perform the actual OCR using Tesseract.
 *
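 * @param image An image to be processed by OCR. Should be cropped and filtered to ensure the contrast is sufficient.
 * @param iteration Passed to getTesseractPageSegMode to select the page segmentation mode for this attempt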
 * @return The text that was recognised in the image
 */
protected String performOcr(BufferedImage image, int iteration) throws OcrException {
    try {
        Tesseract instance = Tesseract.getInstance();
        instance.setPageSegMode(getTesseractPageSegMode(iteration));
        String output = instance.doOCR(image);
        return output.trim();
    } catch (Exception e) {
        throw new OcrException("Error performing OCR", e);
    }
}
 
Developer ID: HearthStats, Project: HearthStats.net-Uploader, Lines of code: 17, Source: OcrBase.java

Example 11: doOcrFile

import net.sourceforge.tess4j.Tesseract; // import the package/class the method depends on
@RequestMapping(value = "ocr/v0.9/upload", method = RequestMethod.POST, consumes = MediaType.APPLICATION_JSON_VALUE, produces = MediaType.APPLICATION_JSON_VALUE)
public Status doOcrFile(@RequestBody final Image image) throws Exception {
    File tmpFile = File.createTempFile("ocr_image", image.getExtension());
    try {
        FileUtils.writeByteArrayToFile(tmpFile, Base64.decodeBase64(image.getImage()));
        Tesseract tesseract = Tesseract.getInstance(); // JNA Interface Mapping
        String imageText = tesseract.doOCR(tmpFile);
        LOGGER.debug("OCR Image Text = " + imageText);
    } catch (Exception e) {
        LOGGER.error("Exception while converting/uploading image: ", e);
        // Pass the original exception along as the cause instead of discarding it
        throw new TesseractException(e);
    } finally {
        tmpFile.delete();
    }
    return new Status("success");
}
 
Developer ID: arun0009, Project: ocr-tess4j-rest, Lines of code: 17, Source: Tess4jV1.java

Example 12: doOcr

import net.sourceforge.tess4j.Tesseract; // import the package/class the method depends on
@RequestMapping(value = "ocr/v1/upload", method = RequestMethod.POST, consumes = MediaType.APPLICATION_JSON_VALUE, produces = MediaType.APPLICATION_JSON_VALUE)
public Status doOcr(@RequestBody Image image) throws Exception {
    try {
        //FileUtils.writeByteArrayToFile(tmpFile, Base64.decodeBase64(image.getImage()));
        ByteArrayInputStream bis = new ByteArrayInputStream(Base64.decodeBase64(image.getImage()));
        Tesseract tesseract = Tesseract.getInstance(); // JNA Interface Mapping
        String imageText = tesseract.doOCR(ImageIO.read(bis));
        image.setText(imageText);
        repository.save(image);
        LOGGER.debug("OCR Result = " + imageText);
    } catch (Exception e) {
        LOGGER.error("TesseractException while converting/uploading image: ", e);
        // Pass the original exception along as the cause instead of discarding it
        throw new TesseractException(e);
    }

    return new Status("success");
}
 
Developer ID: arun0009, Project: ocr-tess4j-rest, Lines of code: 18, Source: Tess4jV1.java
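
Examples 11 and 12 receive the image as a Base64 string. ImageIO.read returns null when no registered reader can decode the stream, so checking the result before passing it to doOCR avoids handing a null image to the OCR call. The sketch below uses java.util.Base64 from the JDK rather than the Commons Codec class used above; the class name ImageDecoder is hypothetical.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Base64;
import javax.imageio.ImageIO;

// Hypothetical helper: decode a Base64 payload into an image and fail early if it is unreadable.
public class ImageDecoder {

    /**
     * Decodes Base64 text into a BufferedImage.
     *
     * @throws IllegalArgumentException if the text is not valid Base64
     * @throws IOException if the bytes cannot be decoded as an image
     */
    public static BufferedImage decode(String base64) throws IOException {
        byte[] bytes = Base64.getDecoder().decode(base64);
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(bytes));
        if (image == null) {
            // ImageIO.read signals "no suitable reader" with null rather than an exception.
            throw new IOException("Unsupported or corrupt image data");
        }
        return image;
    }
}
```

In a handler like doOcr, the returned image could replace the direct ImageIO.read(bis) argument, so an unreadable upload produces a clear error message.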


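Note: The net.sourceforge.tess4j.Tesseract.doOCR examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.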