c++ - How to OCR multiple columns in a document using Tesseract


I am working on a project to OCR the Sinhala language using Tesseract. My goal is to OCR a document whose text is laid out in multiple columns and to get the output file in the correct format. Is there a method to identify the columns in a document using Tesseract?

You can try the solution below to identify columns when scanning a picture.

TessBaseAPI baseApi = new TessBaseAPI();
baseApi.setDebug(true);
// DATA_PATH - directory that contains the "tessdata" folder
// lang - language code of the trained data, e.g. "eng" for English
baseApi.init(DATA_PATH, lang);
// This line sets how the captured image is segmented - hope this is the line you are looking for
baseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_COLUMN);
baseApi.setImage(bitmap);
// Recognized text after the captured image has been processed
String recognizedText = baseApi.getUTF8Text();

If this is not the solution you were expecting, please try a different PageSegMode; I hope that may resolve your issue.
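If the page segmentation mode alone does not give the output in the right order, one possible workaround is to sort the recognized blocks yourself. The following is only a minimal sketch in plain Java, not part of Tesseract: `ColumnOrder`, `Block`, and `readingOrder` are hypothetical names, and it assumes you already have each block's text and bounding box from the OCR engine and that no block spans more than one column. Blocks whose horizontal ranges overlap are treated as one column, columns are read left to right, and blocks inside a column are read top to bottom.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not a Tesseract class.
final class ColumnOrder {

    /** Hypothetical container: one recognized block and its bounding box in pixels. */
    static final class Block {
        final int left, top, right, bottom;
        final String text;

        Block(int left, int top, int right, int bottom, String text) {
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
            this.text = text;
        }
    }

    /** Returns the block texts in column-by-column reading order, one block per line. */
    static String readingOrder(List<Block> blocks) {
        // Visit blocks from left to right so that columns are created in page order.
        List<Block> byLeft = new ArrayList<>(blocks);
        byLeft.sort(Comparator.comparingInt((Block b) -> b.left));

        List<List<Block>> columns = new ArrayList<>();
        List<int[]> ranges = new ArrayList<>(); // {minLeft, maxRight} of each column

        for (Block b : byLeft) {
            // Find the first column whose horizontal range overlaps this block.
            int target = -1;
            for (int i = 0; i < ranges.size(); i++) {
                int[] r = ranges.get(i);
                if (b.left < r[1] && b.right > r[0]) {
                    target = i;
                    break;
                }
            }
            if (target < 0) {
                // No overlap with any existing column: start a new one.
                columns.add(new ArrayList<>());
                ranges.add(new int[] {b.left, b.right});
                target = columns.size() - 1;
            } else {
                // Widen the column's range to include this block.
                int[] r = ranges.get(target);
                r[0] = Math.min(r[0], b.left);
                r[1] = Math.max(r[1], b.right);
            }
            columns.get(target).add(b);
        }

        // Within each column, read the blocks from top to bottom.
        StringBuilder out = new StringBuilder();
        for (List<Block> column : columns) {
            column.sort(Comparator.comparingInt((Block b) -> b.top));
            for (Block b : column) {
                if (out.length() > 0) {
                    out.append('\n');
                }
                out.append(b.text);
            }
        }
        return out.toString();
    }
}
```

A full-width heading that overlaps every column would merge them all into one column in this sketch, so such blocks would need to be handled separately.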

