In growing companies, software systems become complex and heavily engineered, and maintaining them can be a challenging problem. High-profile bugs or system outages can seriously disrupt a business, so there is little room for mistakes in these crucial systems.
Software robustness is key to reliability and quality, and many strategies for achieving it have been developed over the years. Test coverage, documentation, branching, and deployment strategies are typically strictly regulated and engineered to minimize risk while keeping development efficiency and quality high.
Nevertheless, while many development techniques keep getting more engineered and standardized, I couldn’t help noticing that others have not evolved at the same pace. One area that seems to me still largely unexplored is code knowledge: in short, how well engineers know the system they are working on.
Code knowledge can be very important in practice. Take two engineers with the same experience, one who wrote the code being worked on and one who is new to it. In an ideal world, where the code is very well tested, well written, and well documented, their results would probably not differ much. In real cases, however, much larger gaps in their efficiency can be observed.
Learning a code base by reverse engineering and guesswork is much slower than asking developers who already know the code. More importantly, it is error prone.
The reason is that many subtle technical constraints that arise while developing a system can’t always be captured in documents or made self-explanatory, so they reside only in the heads of the people who developed it. The risk many companies run is that when the people who hold that knowledge leave, the knowledge is simply lost, and newcomers who are unaware of those constraints introduce regressions, sometimes doing more harm than good.
So how can companies minimize the risk? What arrangement of developers best provides a good level of collaboration, shortens the ramp-up period for developers who are learning specific parts of the code, and guards against mistakes and regressions?
Although an exact answer would be hard to determine, my rudimentary suggestion for keeping technical knowledge from being buried is that each logical component of the code should have at least two knowledgeable people available at all times, so that as soon as someone rolls off, someone else can fill in.
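To make the two-people-per-component rule concrete, here is a minimal Python sketch of how a team might check it. Everything in it is an assumption made for illustration: the component names, the engineer names, and the idea that a company keeps a simple map from each component to the people who know it well. It is a rough starting point, not a staffing tool.

```python
from typing import Dict, List, Set

# Hypothetical structure: component name -> engineers who know it well.
KnowledgeMap = Dict[str, Set[str]]


def under_covered(knowledge: KnowledgeMap, min_people: int = 2) -> List[str]:
    """Return components known by fewer than `min_people` engineers."""
    return sorted(
        component
        for component, people in knowledge.items()
        if len(people) < min_people
    )


def at_risk_if_leaves(
    knowledge: KnowledgeMap, engineer: str, min_people: int = 2
) -> List[str]:
    """Return components `engineer` knows that would be left with
    fewer than `min_people` knowledgeable engineers after a departure."""
    remaining = {c: people - {engineer} for c, people in knowledge.items()}
    return sorted(
        c for c in under_covered(remaining, min_people)
        if engineer in knowledge[c]
    )


# Illustrative data only; these names are made up.
knowledge = {
    "billing": {"alice", "bob"},
    "search": {"carol"},
    "auth": {"alice", "bob", "carol"},
}

print(under_covered(knowledge))             # ['search']
print(at_risk_if_leaves(knowledge, "bob"))  # ['billing']
```

Running a check like this before someone rolls off shows which components need a second person ramped up first. In this made-up data, "search" already breaks the rule, and "billing" would break it if bob left.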
An even better plan is to invest time in thinking through a staffing policy that not only allocates engineers to projects in a way that lets them move faster today, but is also resilient to unexpected future changes. Bad staffing plans do bite back.