Capturing the Flag: at CodeMash & Beyond!
January 23, 2020 Jamey Alea
A few weeks ago, I attended CodeMash 2020 and it was as big and exciting as ever! I got to see all my friends, I built my own IoT badge, I taught a bunch of kids (and a few adults!) how to solder—and I participated in the annual CodeMash CTF competition, which is what I want to talk about.
If you haven’t heard of them, CTF (Capture the Flag) contests are hacking games that are popular in the programming and particularly infosec communities. It’s common to see challenges relating to encryption, file forensics, scripting, web security and reverse engineering. I always thought CTFs sounded interesting but too intimidating to break into. Then I participated in my first one at CodeMash 2017, had a great time, and have done it every year since. CTFs are challenging, but they’re also fun and I’ve learned a lot from them, so I want to encourage newbies like me to at least give them a try. I’ll go over some basic tips for common types of CTF challenges, and show the solutions for a few of this year’s CodeMash flags.
Encoding & Encryption
Finding encoded messages in different forms is often a big part of CTF challenges, so being able to look at a long string of gibberish and figure out what type of encoding/cipher has been done to it is an important CTF skill. Binary and hexadecimal are easy to spot at a glance. (0s and 1s? Binary! Numerals and letters a-f? Hex!) Base64 is another common one, and it has way more characters so it looks more like gibberish, but it’s still easy to spot because it very often ends with an equals sign. (Why? It’s a form of padding used to make sure the encoded message conforms to Base64’s syntactical requirements.)
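Here’s a rough sketch of that “spot it at a glance” logic in Python. The function name guess_and_decode is made up for illustration, and the checks are deliberately naive: a string of only 0s and 1s is also valid hex, so the order of the tests matters, and short Base64 strings can be mistaken for hex.

```python
import base64
import binascii
import re


def guess_and_decode(text: str) -> tuple[str, bytes]:
    """Return (guess, decoded_bytes) for a binary, hex, or Base64 string."""
    compact = "".join(text.split())  # ignore spaces and newlines

    # Only 0s and 1s, in whole bytes? Treat it as binary.
    if re.fullmatch(r"[01]+", compact) and len(compact) % 8 == 0:
        data = bytes(int(compact[i:i + 8], 2) for i in range(0, len(compact), 8))
        return "binary", data

    # Only digits and a-f, in whole bytes? Treat it as hex.
    if re.fullmatch(r"[0-9a-fA-F]+", compact) and len(compact) % 2 == 0:
        return "hex", bytes.fromhex(compact)

    # Otherwise try strict Base64; bad characters or padding raise an error.
    try:
        return "base64", base64.b64decode(compact, validate=True)
    except (binascii.Error, ValueError):
        return "unknown", b""


if __name__ == "__main__":
    for sample in ("01000011 01001101", "434d3230", "Q00yMC0="):
        print(sample, "->", guess_and_decode(sample))
```

If the guess comes back wrong or “unknown”, that’s the point where a tool with more decoders is the better choice.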
If you know something is meaningful, but you’re not sure how it’s encoded, that’s okay too. CyberChef is a great tool for any sort of string you think has a hidden message in it. It can support a ton of different types of encoding, encryption and other file analysis, and you can easily drag and drop different decoders, either to try a bunch if you’re not sure which is right or to do several in a row if it’s been encoded multiple times.
Encryption and ciphers are a little bit different, but they are similar in that the most important piece is figuring out what cipher you’re working with. A very simple one is the Caesar cipher, where all the letters are just shifted by a certain amount. (ROT13 is a popular example and always the first one I try, where the shift is equal to 13.) The Vigenere cipher is a more complex version, which requires a keyword instead of just a number, so it’s much harder to brute force—but if you’ve got some sort of password you don’t know what to do with, it’s a good try. There are way too many ciphers to list here, but often the description of the challenge will have a clue. One of the CodeMash challenges was called “Where’s the Bacon?” and a quick google search revealed a cipher that I wasn’t previously familiar with called the Bacon cipher that only uses 2 characters. Eureka! (A good tool for researching and decrypting ciphers is dCode.)
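Since a Caesar cipher only has 26 possible shifts, brute forcing it is cheap. This is a minimal sketch (the function names are mine, not from any library) that prints every candidate so you can eyeball the readable one. A shift of 13 is ROT13.

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Shift every ASCII letter by `shift` places, leaving other characters alone."""
    result = []
    for ch in text:
        if ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            result.append(ch)
    return "".join(result)


def brute_force_caesar(ciphertext: str) -> list[tuple[int, str]]:
    """Return all 26 candidate plaintexts, paired with the shift that produced them."""
    return [(shift, caesar_shift(ciphertext, shift)) for shift in range(26)]


if __name__ == "__main__":
    for shift, candidate in brute_force_caesar("Uryyb, jbeyq!"):
        print(f"{shift:2d}: {candidate}")
```

A Vigenère cipher can’t be cracked this way because the shift changes from letter to letter according to the keyword.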
In another challenge, I got a long string that looked like this: “c4e5Nc3Nc6Nf3Nf6e4Bb4d3Bxc3+bxc3d6Be” – which isn’t any encoding format I’m familiar with. But the theme of the challenge was the trials at the end of the first Harry Potter book, and a notable one was chess. Which ended up leading me to discover and learn how to read “algebraic chess notation”!
It’s sometimes also possible to brute force encryption challenges, particularly because all the flags in a CTF will follow the same format. In the CodeMash competition, they began with the prefix “CM20-”, which often gives you insight about the beginning of your translation. For instance, I knew “CCCMCCMCMM20” would decrypt to CM20, which can help decrypt the rest of it, so it’s a good thing to keep in mind.
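To show how a known prefix narrows things down, here’s a sketch that assumes the “CCCMCCMCMM” string is a Bacon cipher written with the letters C and M. That is my reading of the example, not something the challenge confirmed. A Bacon cipher has two unknowns: which symbol stands for “a”, and whether the 24-letter or 26-letter alphabet is in use. Trying all four combinations and keeping the ones that start with “CM” settles both.

```python
import string

# Classic Bacon alphabet: I/J and U/V share a code, so only 24 letters.
BACON_24 = "ABCDEFGHIKLMNOPQRSTUWXYZ"
# Modern variant: every letter gets its own 5-bit code.
BACON_26 = string.ascii_uppercase


def bacon_decode(text: str, a: str, b: str, alphabet: str) -> str:
    """Decode groups of five a/b symbols; any other character is skipped."""
    bits = "".join("0" if ch == a else "1" for ch in text if ch in (a, b))
    letters = []
    for i in range(0, len(bits) - len(bits) % 5, 5):
        index = int(bits[i:i + 5], 2)
        letters.append(alphabet[index] if index < len(alphabet) else "?")
    return "".join(letters)


def candidates(text: str, symbols: str = "CM") -> list[tuple[str, str, str]]:
    """Try both symbol assignments and both alphabets."""
    x, y = symbols
    results = []
    for a, b in ((x, y), (y, x)):
        for name, alphabet in (("24-letter", BACON_24), ("26-letter", BACON_26)):
            results.append((a, name, bacon_decode(text, a, b, alphabet)))
    return results


if __name__ == "__main__":
    for a, name, plaintext in candidates("CCCMCCMCMM"):
        marker = "<- matches the CM prefix" if plaintext.startswith("CM") else ""
        print(f"a={a} {name}: {plaintext} {marker}")
```

Under these assumptions only one combination (C as “a”, 24-letter alphabet) produces “CM”, and that’s the setting to use on the rest of the message.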
File Forensics, Analysis & Steganography
Steganography refers to the practice of concealing something inside of another file, and it’s my personal favorite type of CTF challenge. There are lots of ways to go about doing this, but I’ll talk about a few that I’ve seen and some of the first things I check when given a file.
First off, it’s always good to check what kind of file you’re actually working with, especially if you can’t open it with the program you expect, because it won’t always match the file extension! Most files will include a header that contains information about what kind of file it is. (CyberChef actually has a feature that will detect this for you, if the file isn’t corrupted.) One of the CodeMash challenges gave me a .txt file filled with a whole bunch of Base64. Decoding it gave me mostly gibberish, but it started with a PNG header, so I knew there was an image file hidden inside. I was having trouble opening it as a PNG on my computer, probably because sometimes copy/pasting from a browser will change the encoding subtly, so I ended up finding a tool online that could convert straight from Base64 to PNG for me. (There are lots of specific tools like this that you can just google for specific conversions you’re trying to make.)
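If you’d rather skip the online converter, a few lines of Python can decode the Base64 and check the header for you. This is a sketch, not what I actually ran during the competition: the function names and the short signature table are my own, and the table only covers a handful of common formats.

```python
import base64

# A few well-known file signatures ("magic numbers").
MAGIC_NUMBERS = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"\xff\xd8\xff": "JPEG image",
    b"PK\x03\x04": "ZIP archive",
    b"\xca\xfe\xba\xbe": "Java class file",
    b"%PDF-": "PDF document",
    b"GIF8": "GIF image",
}


def identify(data: bytes) -> str:
    """Name the file type based on its first few bytes."""
    for magic, name in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return "unknown"


def decode_and_identify(b64_text: str) -> tuple[bytes, str]:
    """Decode Base64 text (line breaks are ignored) and identify the result."""
    raw = base64.b64decode(b64_text)
    return raw, identify(raw)
```

Reading the .txt file, calling decode_and_identify on its contents, and writing the returned bytes to a new file opened in "wb" mode keeps everything binary, which sidesteps the copy/paste encoding problem.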
Data can also be hidden within the raw bytes of an image, so it’s helpful to be able to look directly at the file contents in hexadecimal. This is easy to do from the command line or Terminal. On Linux or Mac, the command hexdump will do that conversion for you. hexdump -C [filename] will even show you the translation directly beside it, like this:
A shortcut is to use the strings command, which will pull out only the ASCII text to display to you. You can also filter by length of the string: strings -n 8 [filename] will show you only ASCII strings 8 characters or longer, for instance. If there’s a lot of other noise, like in the example above, that’s where knowing the flag format can be helpful, because you can use search or grep. strings [filename] | grep picoCTF would bring the flag up immediately and ignore the noise.
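If you’re on a machine without the strings command, the same idea is easy to approximate in Python. This is a simplified stand-in rather than a faithful reimplementation: it only looks for runs of printable ASCII, and the function names are mine.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Return every run of printable ASCII at least `min_length` bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def find_flags(path: str, prefix: str, min_length: int = 4) -> list[str]:
    """Roughly `strings -n min_length path | grep prefix`."""
    with open(path, "rb") as f:
        data = f.read()
    return [s for s in extract_strings(data, min_length) if prefix in s]
```

Calling find_flags with the file path and the flag prefix (for example "picoCTF" or "CM20-") returns only the strings that contain it.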
Even if the flag isn’t printed right there, this could still give you other useful information or hints. (This can be true even for non-steganography challenges. There was a file in the CodeMash challenge that had some Java themed strings inside it, and the first few hex characters were ca fe ba be – that’s the header for a Java class file!) Or perhaps there’s a whole other file hidden inside the file you’re looking at? Look at this hexdump of a jpg file, for instance.
When I ran strings on it, I saw those “secret” references at the end, but a hexdump gives more info. On line 001e17a0, you see ff d9 – the two-byte end-of-image marker, which denotes the end of a jpg file. On that same line, we see a PK – which indicates the start of a .zip file. Now we know there’s an extra zip file hidden in this image! Figuring out what you’re looking for is the hard part; once you realize what you’re trying to extract, you can always find tools online to extract it. (Binwalk would be a good one in this case.)
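For this specific case you can also do the carving by hand. The sketch below assumes the zip data starts immediately after the jpg’s ff d9 marker, which is what the hexdump suggested but isn’t guaranteed in general. The function names are made up for this example.

```python
from typing import Optional

JPEG_END = b"\xff\xd9"      # end-of-image marker
ZIP_START = b"PK\x03\x04"   # signature of a zip local file header


def find_appended_zip(data: bytes) -> int:
    """Return the offset of a zip signature that directly follows ff d9, or -1."""
    position = data.find(JPEG_END + ZIP_START)
    return -1 if position == -1 else position + len(JPEG_END)


def extract_appended_zip(data: bytes) -> Optional[bytes]:
    """Return everything from the zip signature onward, or None if absent."""
    offset = find_appended_zip(data)
    return None if offset == -1 else data[offset:]
```

Read the image with open(path, "rb"), pass the bytes to extract_appended_zip, and write the result to a .zip file to open it normally. A tool like Binwalk handles the messier cases, such as several embedded files or extra bytes between the two formats.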
There are other ways of hiding something directly in an image or sound file, without messing around with the bytes. In image files, there could be text hidden right in the image that can’t be seen until it’s messed around with an image editor, because it’s too dark or because it’s only visible in certain color ranges. Here’s an image I was given as part of the CodeMash CTF, before and after I brought the levels all the way up.
Another challenge gave me an .mp3 file of the song American Pie. If you listened to it all the way through, you’d hear a few seconds of beeps in the middle, around the 6 minute mark. I opened the file up in Sonic Visualiser and started messing around with it, and something interesting came up when I took a look at the Spectrogram! (I’ve also seen audio challenges where you have to listen backwards, or where the beeps denote something like binary or Morse code.)
Again, this is only a sampling of some fairly basic steganography challenges. But there are lots of resources online if you’d like to dig deeper. I’d start here; I found this field guide really helpful.
Web Security & Pen-Testing
As I mentioned before, CTFs are a particularly common hobby among the infosec community, and it’s because another common type of challenge deals with web security, pen-testing and exploiting security vulnerabilities. As a beginner without a background in pen-testing, these types of challenges are more difficult for me, but they’re way too big of a category to ignore, so I’ll talk about them to the best of my ability.
In my experience, pen-testing challenges are often presented as web pages that are specifically built such that they can be exploited in a specific way. A basic, building block technique for this kind of attack is SQL injection. If an application takes in user input and doesn’t validate it to make sure no additional SQL was added, that causes a vulnerability that can be exploited to get or manipulate data that shouldn’t be accessible. Here’s an extremely basic example. Let’s imagine that a website takes user input to get a $email variable, and then runs the following SQL on it:
mysql_query("SELECT * FROM users WHERE email='$email'")
But let’s say, instead of entering an email, we entered ' OR 1=1 -- (the trailing --, followed by a space, comments out the query’s leftover closing quote). Because 1=1 resolves as true, when this gets appended to the end of the SQL query, it’s now essentially saying, “if the email matches OR true” which of course, always resolves as true, for every row in the database. So this would get us a list of all the emails in the database. More advanced SQL injection can be used to manipulate or delete database rows, or even entire tables, like in the famous XKCD comic.
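You can try this safely on your own machine. The sketch below swaps in Python’s built-in sqlite3 module and an in-memory database instead of PHP and MySQL, and the table contents are invented. The vulnerable function builds the query by pasting the input into the string, just like the example above. The safe one uses a parameter placeholder, so the input is only ever treated as data.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (email TEXT, name TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice@example.com", "Alice"), ("bob@example.com", "Bob")],
    )
    return conn


def lookup_vulnerable(conn: sqlite3.Connection, email: str) -> list:
    """String-built query: whatever the user types becomes part of the SQL."""
    query = f"SELECT * FROM users WHERE email='{email}'"
    return conn.execute(query).fetchall()


def lookup_safe(conn: sqlite3.Connection, email: str) -> list:
    """Parameterized query: the driver passes the input as a plain value."""
    return conn.execute("SELECT * FROM users WHERE email=?", (email,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "' OR 1=1 -- "  # the trailing "-- " comments out the leftover quote
    print("vulnerable:", lookup_vulnerable(conn, payload))
    print("safe:      ", lookup_safe(conn, payload))
```

The vulnerable lookup returns every row for the payload, while the safe one returns nothing because no user has that literal string as an email.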
Here’s an example of a different kind of web security challenge from the CodeMash CTF. We were given a webpage with a simple game: every time you clicked a gold coin, it would increment your gold by 1. After clicking it, the page would count down for 20 seconds and then respawn to be clicked again. We were given the instruction to “become the leprechaun with the most gold.” Problem? The first place user already had 87,983 coins. Hard to compete with!
Turns out the instruction to “become” the user with the most gold was pretty literal—and it was achieved by manipulating cookies. When logging in, there was a standard “stay logged in” checkbox, and checking it created a cookie called “stayloggedin” with a value in… base64. Decoding it gave me a value of “jameybash__ThisIsBadSalt” so I replaced my username with the first place user, re-encoded it into base64, and replaced the value of the cookie. Suddenly I was logged in as the other user instead of myself, which revealed the flag.
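The cookie swap is only a few lines if you’d rather script it than use a web tool. This sketch assumes the cookie format is exactly username, two underscores, then the salt, which is what the decoded value looked like. The target username below is a placeholder, not the real first place user.

```python
import base64


def forge_cookie(cookie_value: str, new_username: str) -> str:
    """Swap the username inside a Base64 'username__salt' cookie value."""
    decoded = base64.b64decode(cookie_value).decode("utf-8")
    _old_username, separator, salt = decoded.partition("__")
    if not separator:
        raise ValueError("cookie does not look like 'username__salt'")
    forged = f"{new_username}__{salt}"
    return base64.b64encode(forged.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    original = base64.b64encode(b"jameybash__ThisIsBadSalt").decode("ascii")
    # "top_leprechaun" is a hypothetical username for illustration.
    print(forge_cookie(original, "top_leprechaun"))
```

The trick works because the server trusts a value the browser controls. The salt is the same for everyone and nothing is signed, so anyone who can decode the cookie can rewrite it.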
Like I said, these challenges aren’t necessarily my forte and there are a lot of resources out there that are better than anything I could possibly write. Kali Linux is often recommended as a useful tool for pen-testing and CTFs: it’s a Linux distribution that comes with hundreds of pen-testing tools already included, a list of which they include in their documentation.
As always with CTFs, google is your friend. It’s common for people to write walkthroughs for solutions that they figured out in previous competitions, which is a great way to learn some of the methods and try them out or modify them slightly for whatever challenge you’re working on.
So you’re ready to try a CTF?
Cool, I think you should give it a try! I had a really good experience starting with the CodeMash CTFs, which I definitely recommend: you don’t have to attend the conference to play! But if you don’t want to wait until next January, let’s talk a little about CTF formats and how to find them.
The points I’ve been making are mainly useful for jeopardy style CTFs. This means that there are different categories (like cryptography, steganography and web security), and you solve challenges and score points for them. These can come in both “individual” and “team” varieties, and it can be useful to put together a team of people who have different specialties. (I could contribute more for more puzzle-y challenges, and playing together with someone who was better at pen-testing would cover my weaknesses.)
There are also attack/defense CTFs, which were actually the original kind. I haven’t actually participated in one myself, but here’s the basic gist: teams are given a network with vulnerabilities already in it. They try to patch the vulnerabilities to protect their network from other teams, while also trying to exploit the other teams’ networks for points.
Even if you’re a beginner and not playing super competitively, CTFs are still competitions and, as such, they generally run for a certain amount of time and then end, so there’s an element of “finding” games to play in. CTFTime is a good resource for that, as they maintain a list of upcoming CTFs, with info about them. If you don’t want to wait to get some practice in, check out picoCTF: they run a yearly game that’s aimed at students, so it’s good for beginners, and even after the competition ends, they keep it open year round for people to play and practice. Hack The Box describes itself as a “pen-testing lab” and is more geared toward learning and leveling up your skills in pen-testing. But they also have challenges year round, as well as a community of security professionals and hobbyists that can also be tapped into as a resource.
I hope this was a good introduction into the world of CTFs! Happy hacking!