Methods for learning a project’s code base

Picking up an existing code base can be intimidating. If you’re lucky, the code is documented and the previous developers are around and willing to explain the architecture. But if you are like most of us, you find yourself staring at thousands of lines of undocumented code with nowhere to turn for help. There are a number of ways to attack the problem, and this post goes over some that have worked for me.

Where to start.

When a project has more than a few files, the first problem is deciding where to start your investigation. For executable applications, don’t get creative: find the entry point of the application and start there. For projects where the output is a library, you most likely won’t have a single entry point from which to start. In these cases, treat each entry point of the library as its own mini application. Pick one and start there.

Depending on the size of the code base, you may not be able to go through all of the files and functions. Don’t worry about this. If you can get an idea of which areas of the application are used most, focus on them. Or take a look at the open issues and feature requests, and use those to guide where you spend your time.

Do while reading unfamiliar code.

Comment. If you can’t look at a file and immediately know what the point of the code is or how it achieves its end result, add comments. Break down complex logic into an easily understood explanation in the comments. The write-up, 13 Tips to Comment Your Code, gives a great overview of where and how to comment code. If you are in a situation where you can’t comment the code, use your favorite note-taking application or notebook and put your comments there.

If the code is in GitHub or some other repository host that allows easy linking into the code, take advantage of it. As you make notes in your application, include a link to the file, function, or line number. When that isn’t possible, be sure to include the filename and the function or line of code in the notes so you can easily associate the comments back to the code.

When I am able to comment in the code, my preference is to, at a minimum, comment at the function level. Explain the purpose of the code and provide the IDE with details it can show in IntelliSense. This requires using specific commenting styles based on the language or IDE, but it is worth it. When you use the function elsewhere in the code, your comments will automatically appear in a well-formatted structure. In cases where the function is large and can’t be broken up right now, I’ll add comments to point out where things could be split out should the time come to refactor it.
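As a rough illustration of function-level commenting, here is what it might look like in Python, where a docstring is what most editors surface in hover and completion tooltips. The function apply_discount and its rules are hypothetical, made up purely for this example, and the docstring layout is just one common convention.

```python
def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by the given percentage.

    Args:
        price: Original price, in the currency's main unit.
        percent: Discount to apply, from 0 to 100.

    Returns:
        The discounted price, rounded to two decimal places.

    Raises:
        ValueError: If percent is outside the 0 to 100 range.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Refactor candidate: the rounding rule could move into a shared helper.
    return round(price * (1 - percent / 100), 2)
```

Anywhere apply_discount is called, an editor that understands docstrings can show the summary, arguments, and return value without you opening the file again.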

Diving deeper into the code.

With the entry point(s) found, it is time to start tracing through the code. If you can’t execute the code in a test environment, manually walk through the logic and take notes. Write down test cases of possible input values and the output each one generates. In some cases you may need to copy the code out to a new project and run it there to understand what the logic is doing. Don’t feel bad doing this; if it helps you understand the logic, do it. If you remember better by drawing out the logic flow, draw it and be sure to include your drawing in your notes. In scenarios where I had an expert in the code available for questions, I’ve even used video conferencing applications and screen recording to document both the user interface and the underlying logic as they explained it, for future reference.
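Here is a minimal sketch of the copy-it-out approach, assuming the logic can be lifted into a scratch file on its own. The function legacy_shipping_cost is a hypothetical stand-in for whatever you copied; the idea is simply to run it against a handful of inputs, including values on either side of any branch, and record what comes back.

```python
def legacy_shipping_cost(weight_kg, express):
    # Hypothetical copy of a function lifted from the inherited code base.
    cost = 5 + weight_kg * 2
    if express:
        cost = cost * 1.5
    if weight_kg > 20:
        cost += 10
    return cost


# Inputs chosen to sit on both sides of each branch in the function.
cases = [(1, False), (1, True), (20, False), (21, False), (21, True)]

# Record each input alongside its output so it can be pasted into notes.
observations = [(w, e, legacy_shipping_cost(w, e)) for w, e in cases]

for weight, express, cost in observations:
    print(f"weight={weight:>3} express={express!s:<5} -> {cost}")
```

The printed table goes straight into my notes next to a link to the original function.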

In some cases a function is independent enough from the rest of the application that it lends itself to unit testing. When this happens, by all means create a unit test. It will not only help you understand the purpose of the function, it will also reassure you later on that the function still operates properly when changes are made.
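A small sketch of what such a test might look like, using Python’s built-in unittest module. The function normalize_username is hypothetical and stands in for whatever self-contained function you found; each test pins down one behavior I observed while reading the code.

```python
import unittest


def normalize_username(raw: str) -> str:
    # Hypothetical stand-in for a self-contained function in the code base.
    return raw.strip().lower().replace(" ", "_")


class NormalizeUsernameTest(unittest.TestCase):
    def test_trims_and_lowercases(self):
        self.assertEqual(normalize_username("  Alice "), "alice")

    def test_replaces_inner_spaces(self):
        self.assertEqual(normalize_username("Bob Smith"), "bob_smith")

    def test_empty_input_stays_empty(self):
        self.assertEqual(normalize_username(""), "")
```

Saved in a file of its own, this can be run with python -m unittest. If a later change alters one of these behaviors, the failing test tells you exactly which assumption no longer holds.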

When you do find yourself struggling to understand what you are looking at, don’t be afraid to ask for another set of eyes and ears. It doesn’t matter if the other person can code; having someone there to talk to about the code, explain the logic to, and bounce ideas off of will help. If the person can code, let them take a look before you start explaining what you think is happening. This will let them formulate their own thoughts without the influence of yours. Hopefully, even if neither of you understands the logic at first, you can work through it from different perspectives and figure it out. I’ve also had a lot of success getting a user of the application to explain what is going on from their perspective. This viewpoint helps get you out of the weeds of the code and shows you what the visible results should be.

When you can execute the code in a debug session, by all means do it. Step through the code with example inputs and see how the logic handles different use cases. This should simplify the learning process by letting you see exactly how the application executes. Even better is when you can also comment the code while debugging it.

Remember.

One of many things to remember: if you can’t believe the person wrote the code the way they did, have decided the original author was a complete idiot, or want to curse them for life, someone else out there looking at your code will probably feel the same way about you. Building software systems is not an easy task, nor one with a single perfect solution. The person could have been new, thrown into the project at the last minute, having a bad day, looking at problems differently than you do, or years beyond your expertise. So don’t judge the author; focus on the logic.

TL;DR

  • Find a good entry point to start your investigation.
  • Add comments in the code anywhere you don’t understand the purpose of the code.
  • Don’t be afraid to copy code to a test project, create unit tests, or get a second person to look at it with you.
  • Don’t judge the author; focus on understanding the logic.