c++ - How to OCR multiple columns in a document using Tesseract

- February 15, 2011

I am working on a project to OCR the Sinhala language using Tesseract. The goal is to OCR a document whose text is laid out in multiple columns and to write the output file in the correct format. Is there a method to identify the columns in a document using Tesseract?

You can try the solution below to identify columns when scanning a picture.

TessBaseAPI baseApi = new TessBaseAPI();
baseApi.setDebug(true);
// DATA_PATH: directory that contains the tessdata folder; lang: language code, e.g. "eng" for English
baseApi.init(DATA_PATH, lang);
// This line sets how the captured image is segmented; it is probably the line you are looking for
baseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_COLUMN);
baseApi.setImage(bitmap);
// Recognized text, available once the captured image has been processed
String recognizedText = baseApi.getUTF8Text();

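If this is not the solution you were expecting, please try other PageSegMode values (for example PSM_AUTO, which lets Tesseract segment the page automatically). I hope that resolves your issue.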
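PSM_SINGLE_COLUMN assumes the image already holds one column of text, so on a multi-column page one possible workaround is to find the column boundaries yourself and recognize each column separately. The sketch below is not Tesseract's own layout analysis. It is a swapped-in technique, a simple vertical projection profile over a binarized page, and the class name ColumnFinder, the method findColumns, and the minGap parameter are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Tesseract or tess-two.
final class ColumnFinder {

    /**
     * Returns {startX, endXExclusive} pairs, one per detected text column.
     * ink[y][x] is true where the binarized page has a dark pixel (assumed
     * rectangular). minGap is the narrowest run of blank pixel columns that
     * is treated as a gutter between text columns.
     */
    static List<int[]> findColumns(boolean[][] ink, int minGap) {
        List<int[]> columns = new ArrayList<>();
        if (ink.length == 0) {
            return columns;
        }
        int width = ink[0].length;

        // Vertical projection: mark every x position that contains any ink.
        boolean[] hasInk = new boolean[width];
        for (boolean[] row : ink) {
            for (int x = 0; x < width; x++) {
                if (row[x]) {
                    hasInk[x] = true;
                }
            }
        }

        int start = -1;   // left edge of the column being built, -1 if none
        int lastInk = -1; // most recent x position that contained ink
        for (int x = 0; x < width; x++) {
            if (!hasInk[x]) {
                continue;
            }
            if (start < 0) {
                start = x; // first ink on the page opens the first column
            } else if (x - lastInk - 1 >= minGap) {
                // The blank run since the last ink is wide enough to be a gutter.
                columns.add(new int[] {start, lastInk + 1});
                start = x;
            }
            lastInk = x;
        }
        if (start >= 0) {
            columns.add(new int[] {start, lastInk + 1});
        }
        return columns;
    }

    public static void main(String[] args) {
        // Toy page, 12 pixels wide: ink at x = 1..3 and at x = 8..10.
        boolean[][] page = new boolean[2][12];
        for (int x = 1; x <= 3; x++) page[0][x] = true;
        for (int x = 8; x <= 10; x++) page[1][x] = true;
        for (int[] c : findColumns(page, 3)) {
            System.out.println(c[0] + ".." + c[1]); // prints 1..4 then 8..11
        }
    }
}
```

Each returned range could then be used to restrict recognition to one column at a time (the C++ API offers SetRectangle for this) while keeping PSM_SINGLE_COLUMN. Real scans are noisy, so a practical version would likely need a threshold on ink count per pixel column rather than a strict blank test.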
