Image and Video Processing Assignment: Image Preparation for Optical Character Recognition

Problem -

Optical character recognition (OCR) is the process of translating an image (or video) of text into machine-encoded text. OCR algorithms typically work well when applied to clear images that contain only text, but can struggle when the image contains elements other than text or is noisy. As such, image processing techniques can be applied to images to better prepare them for OCR. In this assignment, you will design an algorithm for preparing images for OCR and investigate its performance. You will apply this algorithm in an application context of your choice. You will be required to:

  • Conduct personal research to choose a potentially suitable algorithm for document image preparation.
  • Implement your chosen algorithm in Matlab and apply it to a set of test images.
  • Analyse the results (either subjectively or objectively) of applying your chosen algorithm and draw conclusions from them.

While it is expected that your algorithm will improve the performance of OCR in comparison to applying OCR to unprepared images, it is not expected that your algorithm will allow OCR to perfectly recognise all characters in all test images.
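One objective way to compare OCR on prepared and unprepared images is the character error rate: the edit distance between the OCR output and a hand-typed ground-truth transcription, divided by the length of the ground truth. The sketch below is written in Python purely for illustration (the assignment itself asks for a Matlab implementation). The function names levenshtein and character_error_rate are hypothetical, and the availability of a ground-truth transcription for each test image is an assumption.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ocr_text, ground_truth):
    """Edit distance normalised by ground-truth length (hypothetical helper)."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return levenshtein(ocr_text, ground_truth) / len(ground_truth)


# Illustrative strings only: one substituted character out of eleven.
cer = character_error_rate("he1lo world", "hello world")
```

Computing this rate once for the raw image and once for the prepared image gives a pair of numbers that can be compared directly; a lower rate after preparation supports the claim that the algorithm helps, without requiring a rate of zero.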

You will report on this work in the style of an academic paper, making clear your experimental methodology and your results and conclusions.

Test Images -

For this coursework, you should select an application of OCR to work on. To demonstrate your algorithm's performance, you will need a representative set of test images on which to apply it. You can choose your application freely but, to give you a start, you might want to look at the following applications and images:

  • License plate recognition: The "Cars 1999 (Rear) 2" dataset from Caltech (www.vision.caltech.edu/html-files/archive.html) would be a good source of test images.
  • Document recognition: The DIBCO 2011 benchmark data set (http://utopia.duth.gr/~ipratika/DIBCO2011/benchmark/dataset/) is a good source of document images.
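For the document recognition application, one commonly used candidate is global thresholding with Otsu's method, which picks the grey level that maximises the between-class variance of the image histogram. The sketch below is written in Python purely for illustration (the assignment itself asks for a Matlab implementation). The function names otsu_threshold and binarise are hypothetical, the image is assumed to be 8-bit greyscale stored as a list of rows, and this is only one possible starting point, not the required algorithm.

```python
def otsu_threshold(pixels):
    """Return the 8-bit grey level that maximises between-class variance.

    pixels: flat iterable of integers in the range 0..255 (assumed).
    """
    hist = [0] * 256
    total = 0
    for p in pixels:
        hist[p] += 1
        total += 1
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of intensities at or below the candidate threshold
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels in the lower class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break     # every pixel is already in the lower class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarise(image, threshold):
    """Map pixels brighter than the threshold to 1 and all others to 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Tiny made-up image: three dark pixels and three bright pixels.
image = [[20, 30, 200], [25, 210, 220]]
t = otsu_threshold([p for row in image for p in row])
binary = binarise(image, t)
```

For dark text on a light page, the pixels mapped to 0 are the candidate text pixels. A single global threshold tends to struggle with uneven illumination or stains, which is one reason to research locally adaptive alternatives before settling on an algorithm.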

Report Instructions -

You are required to work individually and produce a technical report of between 1000 and 2000 words (excluding references and appendices), accompanied by figures and tables where appropriate, on the topic outlined above.

    If you make a claim in your report, you should either give evidence for the claim (i.e. cite the resource in which you found it) or explain how the claim follows from earlier information in the report. One of the key features that should be present in the report is well-reasoned argument. For instance, do not just state what an algorithm outputs; also explain why this output is more useful than what was available before applying the algorithm.

    As this report is largely research based, you are expected to consult significant texts or web-based material to prepare the report, but it is essential that the report itself is your own work. You should cite a minimum of FIVE relevant sources, excluding the lecture notes and Journal Class 3 presentation slides. These should be clearly cited at the points at which they are relevant, using the Harvard referencing system. Any direct quotes should be clearly delineated from the text with quotation marks and clearly attributed, although it is expected that the report will be written in your own words.

