c++ - How to OCR multiple columns in a document using Tesseract
I am working on a project to OCR the Sinhala language using Tesseract. The goal is to OCR a document whose text is laid out in multiple columns and to write the output file in the correct format. Is there a method to identify the columns in a document using Tesseract?
You can try the solution below to identify columns when scanning a picture.
    TessBaseAPI baseApi = new TessBaseAPI();
    baseApi.setDebug(true);
    // DATA_PATH: directory that contains the "tessdata" folder with the trained data files
    // lang: Tesseract language code, e.g. "eng" for English
    baseApi.init(DATA_PATH, lang);
    // This line sets how the captured image is segmented; I hope it is the line you are looking for.
    baseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_COLUMN);
    baseApi.setImage(bitmap);
    // Recognized text after the captured image has been processed.
    String recognizedText = baseApi.getUTF8Text();
If this is not the solution you were expecting, please try another PageSegMode value; I hope one of them resolves the issue.
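To make "try another PageSegMode" a bit more concrete, here is a minimal sketch in plain Java that lists a few of Tesseract's page segmentation modes with the numeric values used by its `--psm` option, and suggests one based on how many columns the page has. `PsmGuide`, `Mode`, and `suggest` are hypothetical names made up for this illustration; they are not part of Tesseract or tess-two. The rule "more than one column means automatic segmentation" is an assumption to start experimenting from, not a guarantee of good results.

```java
// Hypothetical helper for illustration only; not part of Tesseract or tess-two.
final class PsmGuide {

    // A subset of Tesseract's page segmentation modes and their numeric values.
    enum Mode {
        AUTO_OSD(1, "automatic page segmentation with orientation and script detection"),
        AUTO(3, "fully automatic page segmentation, no orientation detection"),
        SINGLE_COLUMN(4, "assume a single column of text of variable sizes"),
        SINGLE_BLOCK(6, "assume a single uniform block of text"),
        SPARSE_TEXT(11, "find as much text as possible in no particular order");

        final int value;
        final String description;

        Mode(int value, String description) {
            this.value = value;
            this.description = description;
        }
    }

    // Assumption: a page with several columns is left to automatic layout
    // analysis, while a page known to have one column uses SINGLE_COLUMN.
    static Mode suggest(int columnCount) {
        if (columnCount < 1) {
            throw new IllegalArgumentException("columnCount must be at least 1");
        }
        return columnCount == 1 ? Mode.SINGLE_COLUMN : Mode.AUTO;
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode.value + " " + mode + ": " + mode.description);
        }
        System.out.println("Suggested mode for a two-column page: " + suggest(2));
    }
}
```

With tess-two, the suggested mode would presumably correspond to a constant such as `TessBaseAPI.PageSegMode.PSM_AUTO` passed to `setPageSegMode`; check the constants available in the version you are using.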