In my 20+ years in the IT industry, one constant has been watching software engineers and teams struggle with legacy or inherited codebases. I completely understand that handling one is not easy.
This is what I hear from these teams all the time:
- We inherited this code from some aliens who did not know how to write clean code
- These aliens did not believe in documentation
- These aliens did not even believe in writing tests
- These aliens believed in a big-ball-of-mud architecture
- These aliens have now left for their distant planets
This is what I hear from clients:
- This team does not have enough skills
- This team does not care about the product / application
- They can't seem to stabilize the product / application
- Every release has to be followed by hotfixes
Inheriting a codebase is like moving into an old house
Working on a legacy / inherited software project is like moving into an old, unfamiliar house. The first thing to do is take a tour of the house to understand its layout—how many rooms there are, what they’re used for, and its overall structure. Similarly, you start by understanding what the software does and who uses it. Next, you make sure the utilities—like water, electricity, and gas—are connected and working, just like setting up the software on your machine and ensuring all its dependencies are in place. You might find old manuals, blueprints, or notes left by the previous owners, which can give you insights into why certain rooms were designed the way they are; in software, this corresponds to reading any available documentation or commit histories. Then, you test the appliances, like turning on lights and faucets, to see what works and what doesn’t—this is akin to running the application to observe its behavior.
As you explore the house further, you open cupboards, closets, and drawers, uncovering hidden or rarely used spaces, much like navigating the software’s code structure and identifying its key components. You focus on high-traffic areas, such as the kitchen, bathroom, or living room, because they’re used the most; in software, you prioritize critical features that are frequently used or prone to issues. Over time, you start repairing and renovating—fixing a leaky faucet here or replacing an old light fixture there—rather than trying to rebuild the entire house at once, which mirrors the approach of improving the codebase incrementally. Along the way, you talk to neighbors who know the house’s history, much like consulting senior developers or team members familiar with the software.
What should you do?
- Understand the Big Picture
- Set Up the Environment
- Explore Documentation
- Explore Tests
- Run the Application
- Understand the Code Structure
- Focus on Key Areas
- Refactor in Small Steps
- Learn From the Team
- Document What You Learn
- Tools and Resources
Understand the Big Picture
Zoom out and look at the system from 10,000 feet.
- Purpose and Functionality: Gain clarity on what the application does, who the end users are, and its primary features.
- Domain Knowledge: Learn about the business domain the application serves. This will help you connect the code to its purpose.
- Stakeholders: Identify key stakeholders like the product manager, QA team, or senior developers who can provide insights.
Set Up the Environment
Take the plunge and download the code. You will become comfortable with it as you play with it locally on your machine.
- Local Setup: Ensure the application runs on your local machine. Follow the README or setup guide (if available).
- Dependencies: Familiarize yourself with the tech stack, libraries, and frameworks the project uses.
- Access Tools: Get access to version control systems (e.g., Git), CI/CD pipelines, and monitoring tools used for the project.
Explore Documentation
Look for whatever documentation is available, even if it is outdated. If documentation is missing or outdated, note this for later improvement.
- Architecture Diagrams: High-level design, data flow, or system interaction diagrams.
- Code Comments: Read comments to understand why certain decisions were made.
- Commit History: Use Git logs to understand how the code evolved and what areas are frequently updated.
- Swagger: Look up and experiment with the Swagger API documentation, if available.
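One way to act on the Commit History point is to count how often each file shows up in the Git log. This is only a minimal Python sketch, not a definitive tool: the function names `count_file_changes` and `most_changed_files` are my own, and it assumes `git` is on your PATH and that you point it at a real repository.

```python
import subprocess
from collections import Counter


def count_file_changes(log_output: str) -> Counter:
    """Count how often each path appears in the output of
    `git log --name-only --pretty=format:` (one path per line,
    blank lines between commits)."""
    counts = Counter()
    for line in log_output.splitlines():
        path = line.strip()
        if path:  # skip the blank separator lines
            counts[path] += 1
    return counts


def most_changed_files(repo_dir: str = ".", top: int = 10):
    """Run git in repo_dir and return the `top` most frequently changed files.
    Assumes repo_dir is inside a Git working tree."""
    result = subprocess.run(
        ["git", "log", "--name-only", "--pretty=format:"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return count_file_changes(result.stdout).most_common(top)
```

Calling `most_changed_files(".")` from the repository root returns a list of `(path, change_count)` pairs. Files near the top are usually a good place to start reading, because that is where the team has spent most of its time.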
Explore Tests
Look for whatever tests are available. If you find none, note this for later improvement.
- Unit Tests: Go through the unit tests. Execute these test cases.
- QA Test Scenarios: Go through the test scenarios prepared by QA Team members.
- Integration Tests: Go through the integration tests to understand the overall flows.
- Regression Tests: Go through regression tests to understand system functions.
Run the Application
The best way to learn is to run the application and walk through it. Do this for each persona the system supports.
- Experience It: Use the running application to experience its functionality firsthand.
- Debugging Tools: Attach a debugger to understand the flow of code execution.
- Error Scenarios: Test edge cases and scenarios to identify potential pitfalls.
- Network Tab: Look at the network tab in the browser to understand which APIs are being called, what data is being passed (headers, cookies, body, and URL), and what responses are being received.
Understand the Code Structure
This is one of the most important aspects of handling a legacy or inherited codebase. You should spend most of your time here.
- Folder Layout: Identify the entry points (e.g., main()), key modules, and file organization.
- Read Code: Start with the main application flow, then explore specific components. It is very important that you read code whenever you get a chance. It will not only help you understand the codebase but will also teach you the value of clean code.
- Dependencies: Look for external APIs, libraries, or services the application depends on.
- Configurations: Understand configuration files and environment settings.
- Schema: Understand the database schema, look at the relationships, and read queries, procedures, functions and triggers.
- Batch Jobs: Identify background jobs, their purpose, and schedule.
- Reporting: Spend time going over reports in the system.
- Logs: Find out where the logs are, what is being logged, and how to access or query them.
- Queues: Find out the message exchange patterns, topics, subscriptions, publishers, and subscribers of the messages.
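For the Folder Layout point, a quick tally of file types per repository can tell you what the codebase is made of before you open a single file. Here is a small Python sketch under my own assumptions: the name `summarize_extensions` and the `SKIP_DIRS` list are hypothetical, so adjust them to your stack.

```python
import os
from collections import Counter

# Assumption: directories that hold tooling or dependencies, not project code.
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv"}


def summarize_extensions(root: str) -> Counter:
    """Walk the tree under `root` and count files by extension."""
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "(no extension)"
            counts[ext] += 1
    return counts
```

Printing `summarize_extensions(".").most_common()` gives a rough inventory: a large number of `.sql` files, for example, hints that much of the logic may live in the database, which ties back to the Schema point above.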
Focus on Key Areas
- Critical Paths: Identify frequently used paths like login, data retrieval, or payment processing.
- Buggy Areas: Focus on parts of the code with a high density of past bug fixes (look at Git history).
- Complex Areas: Pay attention to code that is hard to understand or has minimal comments.
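To find the buggy areas, you can count how many bug-fix commits touched each file. The Python sketch below is a heuristic, not a precise measure: it assumes the log was produced with `git log --name-only --pretty=format:@@%s`, the `@@` marker and the keyword pattern are my own choices, and it only works if the team mentions words like "fix" or "bug" in commit subjects.

```python
import re
from collections import Counter

COMMIT_MARKER = "@@"  # hypothetical prefix that marks a commit subject line
# Assumption: bug-fix commits mention one of these words in the subject.
FIX_PATTERN = re.compile(r"\b(fix|bug|hotfix)", re.IGNORECASE)


def bug_fix_counts(log_output: str) -> Counter:
    """Count, per file, the commits whose subject looks like a bug fix."""
    counts = Counter()
    in_fix_commit = False
    for line in log_output.splitlines():
        if line.startswith(COMMIT_MARKER):
            # A new commit starts here; decide whether it looks like a fix.
            subject = line[len(COMMIT_MARKER):]
            in_fix_commit = bool(FIX_PATTERN.search(subject))
        elif line.strip() and in_fix_commit:
            counts[line.strip()] += 1
    return counts
```

Expect some false positives (a subject like "Add test fixture" also matches), so treat the ranking as a list of places to investigate rather than a verdict.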
Refactor in Small Steps
- Code Quality: Look for opportunities to simplify complex logic, reduce duplication, and follow coding standards.
- Unit Tests: Add or update tests to ensure changes don't introduce new bugs.
- Incremental Changes: Avoid large rewrites; instead, improve the code gradually.
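When the code has no tests, one common way to make small refactors safe is a characterization test: record what the code does today and assert exactly that, even if the behavior looks odd. The Python sketch below uses `legacy_shipping_cost`, a made-up stand-in for an inherited function; the expected values were obtained by running it, not from any specification.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    # Hypothetical stand-in for an inherited function with undocumented rules.
    cost = 5.0
    if weight_kg > 10:
        cost = cost + (weight_kg - 10) * 0.5
    if express:
        cost = cost * 2
    return cost


class ShippingCostCharacterization(unittest.TestCase):
    """Pins the current behavior so a refactor cannot change it unnoticed."""

    def test_recorded_outputs(self):
        # (arguments, output observed from the current implementation)
        recorded = [
            ((1, False), 5.0),
            ((10, False), 5.0),
            ((12, False), 6.0),
            ((12, True), 12.0),
        ]
        for args, expected in recorded:
            with self.subTest(args=args):
                self.assertEqual(legacy_shipping_cost(*args), expected)
```

Save this in a test file and run it with `python -m unittest` before and after every small change. If a test fails, either the refactor changed behavior or you have found a rule worth asking the team about.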
Learn From the Team
- Knowledge Transfer: Pair with senior developers or team members who have worked on the project.
- Code Reviews: Actively participate in code reviews to learn coding standards and team practices.
- Ask Questions: Don't hesitate to clarify doubts about functionality or design. This is another important aspect: instead of assuming, ask questions and confirm your assumptions about the code.
Document What You Learn
- Update Docs: As you understand the system, update or create documentation to help future developers.
- Create Notes: Maintain personal notes about quirks, tips, or useful commands.
Tools and Resources
Make use of all the tools that are available and can aid your learning.
- Static Code Analysis: Use tools like SonarQube to identify code smells.
- Dependency Mapping: Use Dependency Walker or a similar tool for your language.
- Version Control Analysis: Use Git's blame tooling to see when a line was last changed and in which commit; the commit message often explains why.
- AI Tools: Use AI tools like Claude or ChatGPT to explain functions or classes that are hard to understand.
I hope these steps help you go from a newcomer in a legacy or inherited codebase to an effective contributor to it.