I’ve been away from my desk a lot over the last three weeks. I’ve spent many hours working with my supervisor to design a palaeography experiment involving a group of very willing and wonderful calligraphers from the York Scribes and Wyke Scribes groups, a WACOM digitising tablet and an adapted digital pen. I’m still grappling with the results, many of which were unexpected but exciting. I’ll publish more when I have more!
Now that’s done, I’ve donned my programming hat again (which looks very much like my historian hat, but with added complications). I’ve found that blogging my experience has created a handy record of what I’ve done up until now. I can see that previously I was playing with ways of processing my images of medieval handwriting, to better represent the shape of the letters. I was attempting to standardise the thick and thin strokes and see only a skeleton of the letter. I’d come up with this:
First, I’d performed a ‘closing’ operation to fill in any gaps in the ink and smooth the shape’s contours. Then I ‘thinned’ the image, to show only the basic shape of the letter. However, the thinned image resulted in some strange details – the kind of ‘spurs’ that you see above. I wondered: how can we see an even better representation of the shape of letters, without these added extras? So, I decided to ‘close’ the image as I did before but, instead of thinning it, apply a Canny edge-detection program.
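For the programmers among us, the ‘closing’ step can be sketched in a few lines. Here it is in Python with SciPy’s morphology routines — a minimal illustration of the idea, not my actual program, and the toy ‘letter’ below is entirely made up:

```python
import numpy as np
from scipy import ndimage

# A toy binary 'letter': True wherever there is ink.
letter = np.zeros((9, 9), dtype=bool)
letter[2:7, 2:7] = True
letter[4, 4] = False   # a small gap in the ink

# 'Closing' = dilation followed by erosion: small holes in the ink are
# filled and the contour is smoothed, while the overall shape is kept.
closed = ndimage.binary_closing(letter, structure=np.ones((3, 3), dtype=bool))
```

After closing, the one-pixel gap has been filled in, which is exactly the gap-filling and smoothing effect described above.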
For the historians among us, the Canny edge detector was developed by the Australian computer scientist John F. Canny, in 1986.
In researching this blog post, I spent way too much time seeking an explanation of Canny’s method that did not send me on a Wikipedia loop. Here’s the best – very simplified – explanation that I could manage, based on this very helpful guide…
The Canny edge detector works by:
1) removing noise from the image by creating a blurred version of it, using what is called a Gaussian convolution.
2) finding the edges where the grayscale intensity of the image changes the most (e.g. where the white of the page turns to the black of ink). The direction of each edge is then determined and recorded.
3) converting the ‘blurred’ edges in the image to ‘sharp’ edges, by keeping only the points where the change in intensity is at its strongest.
4) applying a threshold, so that only edges stronger than a certain value are preserved.
5) interpreting strong edges as ‘certain edges’, which are immediately included in the final edge image. Weak edges are included if and only if they are connected to strong edges.
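For the curious, the five steps above can be sketched in Python with NumPy and SciPy. This is a much-simplified illustration of the algorithm, not the program I actually used, and the threshold values are arbitrary:

```python
import numpy as np
from scipy import ndimage

def canny_sketch(image, sigma=1.4, low=0.1, high=0.3):
    # 1) Remove noise by creating a blurred image (Gaussian convolution).
    blurred = ndimage.gaussian_filter(image.astype(float), sigma)

    # 2) Find where the intensity changes most, via Sobel gradients;
    #    keep both the strength and the direction of the change.
    gx = ndimage.sobel(blurred, axis=1)
    gy = ndimage.sobel(blurred, axis=0)
    magnitude = np.hypot(gx, gy)
    if magnitude.max() > 0:
        magnitude /= magnitude.max()
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180

    # 3) Convert 'blurred' edges to 'sharp' ones: keep only local maxima
    #    along the gradient direction (quantised to four directions).
    nms = np.zeros_like(magnitude)
    for i in range(1, magnitude.shape[0] - 1):
        for j in range(1, magnitude.shape[1] - 1):
            a = angle[i, j]
            if a < 22.5 or a >= 157.5:       # horizontal gradient
                n1, n2 = magnitude[i, j - 1], magnitude[i, j + 1]
            elif a < 67.5:                   # one diagonal
                n1, n2 = magnitude[i - 1, j + 1], magnitude[i + 1, j - 1]
            elif a < 112.5:                  # vertical gradient
                n1, n2 = magnitude[i - 1, j], magnitude[i + 1, j]
            else:                            # the other diagonal
                n1, n2 = magnitude[i - 1, j - 1], magnitude[i + 1, j + 1]
            if magnitude[i, j] >= n1 and magnitude[i, j] >= n2:
                nms[i, j] = magnitude[i, j]

    # 4) Double threshold: split surviving pixels into strong and weak.
    strong = nms >= high
    weak = (nms >= low) & ~strong

    # 5) Hysteresis: keep weak edges only if they touch a strong edge.
    labels, _ = ndimage.label(strong | weak)
    keep = np.unique(labels[strong])
    return np.isin(labels, keep[keep > 0])
```

Running this on a plain white square gives edges along its outline and nothing in its interior — the same behaviour, in miniature, as running the detector over a page of ink.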
This was the image that resulted from the process: a very faithful rendering of the letter shapes, showing quite clearly the shape of their constituent strokes.
I still have a problem, as the edges of the letters are connected where the scribe joined his strokes in the process of writing. However, it seems that for the purposes of my current project, I’ll be drawing bounding boxes around the letters to segment them manually. This is ‘expensive’ in terms of the time and effort required, but seems to be the best option since I’m new to programming.
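In case it helps anyone attempting the same thing, a hand-drawn bounding box translates into code as nothing more than an array slice. The coordinates below are invented purely for illustration:

```python
import numpy as np

# Stand-in for a processed page image (0 = background, 255 = edge pixels).
page = np.zeros((100, 200), dtype=np.uint8)
page[30:60, 40:70] = 255          # pretend one letter's edges sit here

# A bounding box picked by eye: (top, left, bottom, right).
top, left, bottom, right = 25, 35, 65, 75
letter = page[top:bottom, left:right]   # the cropped-out letter
```

The expense is all in choosing the coordinates by hand; the cropping itself is instant.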
In order to regain some of this time, I’ve learned how to import and process multiple images in one go. My supervisor helped me by devising a simple program which reads in every file in a folder, from first to last, so long as its name ends in ‘.jpg’. I could then apply my edge-detection program to the entire set, producing a whole batch of processed images, pronto!
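The batch idea boils down to something like this in Python — a sketch, where the folder and the processing function are placeholders for whatever you want to run:

```python
from pathlib import Path

def process_folder(folder, process):
    """Apply `process` to every '.jpg' file in `folder`, in name order."""
    results = {}
    for path in sorted(Path(folder).glob("*.jpg")):
        results[path.name] = process(path)   # e.g. an edge-detection routine
    return results
```

Anything in the folder that doesn’t end in ‘.jpg’ — notes, spreadsheets, stray files — is simply skipped.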
In the upcoming weeks, I’ll be trying out some basic computer vision techniques, attempting to segment letters and compare their shapes over the course of a scribe’s career. More on that anon…