Poker-AI.org • View topic

All times are UTC

Rule Based OCR

notSoEasy

Post subject: Rule Based OCR

Posted: Tue Apr 07, 2015 8:26 am

Junior Member

Joined: Wed Aug 27, 2014 12:15 pm
Posts: 12

Hello, right now I am facing the problem of developing a robust OCR system. I have read about and tried many of the widely known approaches, but in the end none of them is very robust. My idea now is to limit my OCR to digits and write custom rules for them. As far as I can see this will be a pain in the ass at best and may even turn out to be impossible, especially since I could not find anything written about this approach. I hope that is only because most OCR aims to recognize the entire alphabet, for which hand-written rules would be infeasible.
What do you think about this?


shalako

Post subject: Re: Rule Based OCR

Posted: Tue Apr 07, 2015 3:57 pm

Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269

The Tesseract OCR engine has a built-in rule for recognizing only numbers, if that is your problem. You do have to create a few rules unless you train it specifically (I avoided that as it's too complicated). If you're still having problems, my advice is to scale the image 300% before running it through the OCR engine. This will help quite a bit. The other thing I have noticed is that sometimes if you distort the image, say 80% horizontally, it reads it better. No idea why.

OCR can be a pain. Mine all of a sudden has problems with the number 8. It thinks it's a 3, but not all the time, which makes it a major headache...
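
The digits-only setting and the upscaling described in this post could look roughly like the sketch below. It is only an illustration under assumptions: the image is a plain list of pixel rows, `scale_nearest` and `tesseract_digits_command` are hypothetical helper names, and the command line assumes a Tesseract build that accepts `--psm` and the `tessedit_char_whitelist` variable (the poster may have used a different mechanism, such as the bundled `digits` config).

```python
def scale_nearest(image, fx, fy):
    """Resize a 2D list of pixel values by nearest-neighbour sampling.

    fx and fy are the horizontal and vertical factors, so (3, 3) is a
    300% upscale and (0.8, 1) squeezes the image to 80% of its width.
    """
    src_h, src_w = len(image), len(image[0])
    dst_h = max(1, round(src_h * fy))
    dst_w = max(1, round(src_w * fx))
    return [
        [image[min(src_h - 1, int(y / fy))][min(src_w - 1, int(x / fx))]
         for x in range(dst_w)]
        for y in range(dst_h)
    ]


def tesseract_digits_command(image_path, whitelist="0123456789"):
    """Build (but do not run) a tesseract CLI call restricted to `whitelist`.

    Assumption: "--psm 7" (treat the image as a single text line) suits a
    cropped stack or pot value; adjust for other layouts.
    """
    return [
        "tesseract", image_path, "stdout",
        "--psm", "7",
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


# Toy 2x2 "image" scaled to 300% in both directions.
enlarged = scale_nearest([[1, 2], [3, 4]], 3, 3)
command = tesseract_digits_command("stack.png")
```

The returned list can be handed to `subprocess.run` once the scaled image has been written to disk; building it separately keeps the OCR settings in one place when experimenting with different scale factors.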


notSoEasy

Post subject: Re: Rule Based OCR

Posted: Tue Apr 07, 2015 5:49 pm

Junior Member

Joined: Wed Aug 27, 2014 12:15 pm
Posts: 12

Hey, thanks for your reply. I tried Tesseract, also with scaling, the numbers-only rule and manually binarizing the image, but it was still regularly confusing 3 and 8, and some other stuff. Today I was playing around with the trial of ABBYY and got better results, but it's impossible to incorporate into my bot. I'll probably have to go with some less robust methods for now.


jukofyork

Post subject: Re: Rule Based OCR

Posted: Wed Apr 08, 2015 8:26 am

Junior Member

Joined: Thu Nov 14, 2013 2:56 pm
Posts: 12

This PhD thesis might be of interest: Recognition of ultra low resolution, anti-aliased text with small font sizes

Juk


spears

Post subject: Re: Rule Based OCR

Posted: Wed Apr 08, 2015 1:02 pm

Site Admin

Joined: Sun Feb 24, 2013 9:39 pm
Posts: 642

Interesting. Have you come over to the dark side Juk?


jukofyork

Post subject: Re: Rule Based OCR

Posted: Wed Apr 08, 2015 3:35 pm

Junior Member

Joined: Thu Nov 14, 2013 2:56 pm
Posts: 12

spears wrote:

Interesting. Have you come over to the dark side Juk?

LOL no - I don't even play poker any more, but do still maintain an interest in the AI side of things.

As for that thesis: I just remembered seeing it posted somewhere (possibly even here?) and thought it might be of interest.

Juk


shalako

Post subject: Re: Rule Based OCR

Posted: Wed Jul 22, 2015 4:59 pm

Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269

I revamped my OCR after a long-overdue rewrite and I am finally getting 100% accuracy with Tesseract on numbers. The first thing I had to do was init Tesseract so that it only recognizes digits, commas and periods via a whitelist. Tesseract was designed for black letters on white, so to get there you have to do a few things before running the image through the engine:

1. Convert to Greyscale
2. Invert the image (i.e. make black areas white and vice versa)
3. Apply a Threshold filter (very high like over 200)
4. Scale the image 100%

That solved all my issues with 5 and 8 for good. Tesseract does work if you give it a good image to work with.
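
The four steps above can be sketched in plain Python as follows. This is a rough illustration, not the poster's actual code: the image is assumed to be a list of rows of `(r, g, b)` tuples, the greyscale weights are the common luminance weights, "over 200" is read as "pixels brighter than 200 after inversion become white", and "scale 100%" is read as enlarging by 100% (doubling), which is a guess. All function names are hypothetical.

```python
def to_greyscale(rgb_image):
    """Step 1: collapse (r, g, b) pixels to one 0-255 luminance value."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


def invert(grey):
    """Step 2: light text on a dark table becomes dark text on light."""
    return [[255 - p for p in row] for row in grey]


def threshold(grey, cutoff=200):
    """Step 3: binarize; only pixels brighter than `cutoff` stay white."""
    return [[255 if p > cutoff else 0 for p in row] for row in grey]


def scale_up(image, factor=2):
    """Step 4 (assumed meaning): repeat every pixel `factor` times each way."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def preprocess(rgb_image, cutoff=200, factor=2):
    """Run the four steps in the order given in the post."""
    return scale_up(threshold(invert(to_greyscale(rgb_image)), cutoff), factor)


# One white "text" pixel next to one dark background pixel.
sample = [[(255, 255, 255), (30, 30, 30)]]
cleaned = preprocess(sample)
```

With a high cutoff, anti-aliased edge pixels fall on the black side after inversion, so the strokes come out solid and slightly thicker, which may be why the high threshold helps.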


cantina

Post subject: Re: Rule Based OCR

Posted: Thu Jul 23, 2015 6:38 am

Veteran Member

Joined: Thu Feb 28, 2013 2:39 am
Posts: 437

If you can find the character edges, why not use hashes? If they're anti-aliased, you can train your own NNs with Encog or just do a difference map if you want to keep it simple (combined with a hash cache). Tesseract is OK, but bulky and slow.
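
One way to read the hash plus difference-map suggestion is sketched below. It assumes the glyphs have already been cut out as equal-sized lists of 0-255 pixel rows, and that a small set of labelled template bitmaps exists; `glyph_hash`, `difference` and `GlyphReader` are hypothetical names, and the 3x3 templates are toy data, not real digit shapes.

```python
import hashlib


def glyph_hash(bitmap):
    """Hash a glyph bitmap; the width is mixed in so shapes do not collide."""
    digest = hashlib.sha1(str(len(bitmap[0])).encode())
    for row in bitmap:
        digest.update(bytes(row))
    return digest.hexdigest()


def difference(a, b):
    """Difference map score: sum of absolute per-pixel differences."""
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


class GlyphReader:
    def __init__(self, templates):
        self.templates = templates  # label -> bitmap
        # Exact matches are answered by a single dictionary lookup.
        self.cache = {glyph_hash(bmp): label for label, bmp in templates.items()}

    def read(self, bitmap):
        key = glyph_hash(bitmap)
        if key not in self.cache:
            # Unseen (e.g. anti-aliased) glyph: pick the closest template
            # of the same size and remember the answer for next time.
            shape = (len(bitmap), len(bitmap[0]))
            candidates = [label for label, t in self.templates.items()
                          if (len(t), len(t[0])) == shape]
            if not candidates:
                raise ValueError("no template with matching size")
            self.cache[key] = min(
                candidates,
                key=lambda label: difference(bitmap, self.templates[label]))
        return self.cache[key]


reader = GlyphReader({
    "1": [[0, 255, 0], [0, 255, 0], [0, 255, 0]],
    "7": [[255, 255, 255], [0, 0, 255], [0, 0, 255]],
})
blurry_one = [[0, 200, 0], [40, 255, 0], [0, 255, 0]]
label = reader.read(blurry_one)
```

The cache means the slower difference comparison only runs the first time a particular rendering of a digit is seen; a real bot would likely also want a maximum acceptable difference so that garbage input is rejected rather than mapped to the nearest digit.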


shalako

Post subject: Re: Rule Based OCR

Posted: Sun Jul 26, 2015 2:56 pm

Veteran Member

Joined: Mon Mar 04, 2013 9:40 pm
Posts: 269

Nasher wrote:

Hmm... Hashing is an idea I never thought about before. That would definitely work, but as you know that NN stuff is way above my head...
