We've got a rather strange png file. Very strange png. Something isn't right about it...
Stego challenges are not my favorite, but I gave this one a try because the point value suggested it would be a reasonably quick solve.
When viewed, the PNG just appeared to be a single 256x256 image of the letter "i". Sort of like this (this is not the actual PNG from the challenge):
When we investigated further, though, the file was rather large for a single image:
So pngcheck says there's additional data after the IEND chunk. Let's try carving the file with foremost:
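You can confirm the trailing data yourself by walking the PNG chunk list and seeing what follows IEND. Here's a stdlib-only sketch (the filename is hypothetical, not the actual challenge file):

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def trailing_data(data):
    """Return any bytes that follow the first IEND chunk."""
    assert data[:8] == PNG_SIG, "not a PNG"
    off = 8
    while off < len(data):
        # each chunk: 4-byte length, 4-byte type, payload, 4-byte CRC
        length, ctype = struct.unpack(">I4s", data[off:off + 8])
        off += 8 + length + 4
        if ctype == b"IEND":
            return data[off:]
    return b""

# with open("challenge.png", "rb") as f:   # hypothetical filename
#     print(len(trailing_data(f.read())), "bytes after IEND")
```

Anything non-empty coming back from this is exactly what pngcheck is complaining about.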
Oh cool, 1892 PNG files embedded inside. Wow, what are all those files of I wonder:
Each PNG file contained a single letter. At first I thought this was the flag, that there would be a message here and I could just read it back and win the points. Unfortunately, this was just step one. After scrolling through the image previews in my Linux file explorer, it quickly became clear that these images spelled out a base64-encoded string. The biggest giveaway was that the final image in the list was an "=".
So it seemed the next step was to decode these images into a base64 string, decode that string into a binary, and get the flag. How to do that?
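The final decode step, at least, is one-liner territory with the standard library (the output filename here is just an assumption for illustration):

```python
import base64

def decode_to_file(b64_string, out_path):
    """Decode a base64 string and write the raw bytes to disk."""
    raw = base64.b64decode(b64_string)
    with open(out_path, "wb") as f:
        f.write(raw)
    return raw

# decode_to_file(ocr_result, "flag.png")   # hypothetical variable/filename
```

The hard part, of course, is getting a correct base64 string out of 1892 images in the first place.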
I've read other writeups of this challenge and saw that other people approached it more cleverly than I did. I did it the long way, with OCR. If you want to know the best way to solve this challenge, read those writeups. If you want to know how to OCR large groups of single-letter images, read on!
Having never done any OCR before, this was going to be fun. I first found that one of the standard OCR tools on Linux is Tesseract OCR, and that PyTesser provides a Python interface to it.
So I grabbed those quickly and read up on using them...
To use PyTesser, we just import it and call image_to_string on a PIL image in memory. Note that we used PyTesser 0.0.1, which relies on PIL. There's a newer version of PyTesser on GitHub that uses OpenCV. I'm sure OpenCV is better, but I'm more familiar with PIL, so I'm happy to use the old version.
Our first attempt was a flop:
This gave no output at all. When I looked into it, I found that Tesseract is tunable, and out of the box PyTesser leaves everything at the default settings.
So I wanted to use page segmentation mode ("pagesegmode") 10, which treats each image as a single character. Let's modify the args value in pytesser.py to suit our needs:
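Conceptually, the change amounts to adding the PSM flag to the command line PyTesser builds before it shells out to the tesseract binary. A sketch of the idea (this is my own helper, not PyTesser's actual internals, and the flag spelling varies by Tesseract version: older 3.x builds take "-psm", newer ones "--psm"):

```python
def tesseract_args(input_file, output_base, psm=10):
    """Build a tesseract command line with an explicit page
    segmentation mode. PSM 10 = treat image as a single character."""
    return ["tesseract", input_file, output_base, "-psm", str(psm)]
```

PyTesser then runs this argv with subprocess and reads back the text file tesseract writes.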
Let's try our script again:
Cool! Output! Oh, but wait. It's totally wrong. Doh! Firstly, base64 strings don't contain "." at all; Tesseract must be interpreting the lowercase "i" characters as ".". Half of the other characters are similarly mangled. Ouch. Not good enough. It seems our OCR library needs more context about the letters so it can do a better job.
First, let's tweak Tesseract a bit more. I learned about character whitelists, so I set one up containing only the base64 alphabet:
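A whitelist config is just a plain-text file of Tesseract variables, one per line. For the base64 alphabet it would look something like this (the filename you save it under is up to you):

```
tessedit_char_whitelist ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=
```

Passing this file on the tesseract command line restricts recognition to those characters, so stray "." guesses are no longer possible.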
Next I tweaked pytesser.py to use PSM 7 (treat the image as a single text line) and to include our Tesseract configuration file, again by changing the args array:
Next I modified our script so that it combines a batch of letter images into a single image and OCRs them all at once. This greatly improves our OCR efficiency, but the results still aren't perfect:
And the output:
And when we assemble a PNG file from the output:
At this point I was really unhappy: I'd spent a fair bit of time on these 150 points and I wanted to just solve it. No more pretty code, just a solution. So instead of fussing with this further, I resigned myself to manually weeding out the final byte errors using the following script.
It assembles a string of characters, OCRs it, displays the result, and asks the user to proofread. If any issues are found, it can correct them. This takes about 10-20 minutes of the user's time, but as I said, I just wanted a solution, and 10-20 minutes of focused reading seemed like a good trade-off at this point:
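Stripped of the interactive display and prompting, the heart of that proof-reading pass is just patching single positions in the OCR output. A minimal sketch (the function name and the dict-of-corrections shape are my own invention, not the actual script's interface):

```python
def apply_corrections(ocr_text, corrections):
    """Return ocr_text with user-supplied fixes applied.
    corrections maps character index -> correct character."""
    chars = list(ocr_text)
    for idx, ch in corrections.items():
        chars[idx] = ch
    return "".join(chars)
```

The interactive loop just builds that corrections dict from whatever the user types while comparing the displayed string against the images.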
And the output looks like this; it'll throw an image up on your screen:
And so on... You get the idea. You'd think we'd be finished by now, right? Noooo. There was a time bomb in this challenge that I only discovered at this point: the uppercase letter "I" looks identical to the lowercase letter "l". Neither the OCR step nor human proofreading has any way to tell them apart. Even my best OCR plus my best human recognition still produced a dud file:
So I was stuck here and ready to throw in the towel, but I had one more idea. I didn't need to make the file perfect, I just needed enough image data to read a flag. I counted the ambiguous I/l characters in the base64: there were 29 of them, giving 2^29 (536,870,912) possible file combinations. But I don't NEED a perfect image. So here was my final idea:
- First, only permute the characters in the part of the base64 where the actual useful image data lives. This part is obvious when you look at all the base64 data.
- Second, only try some reasonable number of permutations.
The region of base64 data to focus on was obvious to the naked eye; I've highlighted it below. It stands out because the surrounding data is repetitive (QEEBAYhEBAY, etc.), which looks like repeating data, perhaps the blank background of the image:
The "reasonable" number of I/l permutations I settled on was 2^11, which is just 2,048 combinations. If I could capture enough image data within the first 11 I/l substitutions, I would be super happy.
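Generating those candidates is a job for itertools.product. A sketch of the idea (names are mine; it varies only the first max_subs ambiguous positions and leaves the rest as-is):

```python
from itertools import product

def il_variants(b64, max_subs=11):
    """Yield every variant of b64 with the first max_subs
    ambiguous 'I'/'l' positions swapped through both cases."""
    spots = [i for i, c in enumerate(b64) if c in "Il"][:max_subs]
    for combo in product("Il", repeat=len(spots)):
        chars = list(b64)
        for pos, ch in zip(spots, combo):
            chars[pos] = ch
        yield "".join(chars)
```

With 11 ambiguous spots this yields exactly 2,048 strings, each of which can then be base64-decoded and written out as a candidate PNG.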
Here's the final code:
And when we run it, we get 2,048 PNG files. Copying them to a Windows 7 system allows simple browsing with Extra Large thumbnail mode right in Explorer:
And finally we spot the "best" attempt:
Which is by no means perfect, but was enough for me to get the flag:
Incredible indeed. That's how you do a challenge the wrong way!