Kill It with Fire

Manage Aging Computer Systems (and Future Proof Modern Ones)

Marianne Bellotti

Once you have identified the parts of the system where there is tight coupling and where there is complexity, study the role those areas have played in past problems. Will changing the ratio of complexity to coupling make those problems better or worse? A helpful way to think about this is to classify the types of failures you’ve seen so far. Problems that are caused by human beings failing to read something, understand something, or check something are usually improved by minimizing complexity. Problems that are caused by failures in monitoring or testing are usually improved by loosening the coupling (and thereby creating places for automated testing). Remember also that an incident can include both elements, so be thoughtful in your analysis. A human operator may have made a mistake to trigger an incident, but if that mistake was impossible to discover because the logs weren’t granular enough, minimizing complexity will not pay off as much as changing the coupling.
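One way to make this classification concrete is to tally past incidents by the kind of failure involved and see which remedy the history points toward. The following is a minimal sketch under stated assumptions, not a method from the book: the `Cause` labels, the `Incident` record, and the `tally_remedies` helper are all hypothetical names introduced for illustration, and the rule for mixed incidents simply encodes the observation above that an undiscoverable mistake is better served by changing the coupling.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Cause(Enum):
    # Hypothetical labels for the two failure types described in the text.
    HUMAN_OVERSIGHT = "human_oversight"  # someone failed to read, understand, or check something
    DETECTION_GAP = "detection_gap"      # monitoring or testing failed to surface the problem


# Remedy per failure type, as stated in the paragraph above.
REMEDY = {
    Cause.HUMAN_OVERSIGHT: "minimize complexity",
    Cause.DETECTION_GAP: "loosen coupling",
}


@dataclass(frozen=True)
class Incident:
    name: str
    causes: frozenset  # an incident can include both elements


def tally_remedies(incidents):
    """Count how often each remedy is indicated across past incidents."""
    counts = Counter()
    for incident in incidents:
        # Assumption: when a detection gap hid a human mistake, count only
        # the coupling remedy, since reducing complexity pays off less there.
        if Cause.DETECTION_GAP in incident.causes:
            counts[REMEDY[Cause.DETECTION_GAP]] += 1
        elif Cause.HUMAN_OVERSIGHT in incident.causes:
            counts[REMEDY[Cause.HUMAN_OVERSIGHT]] += 1
    return counts


# Illustrative, made-up incident history.
history = [
    Incident("misread runbook", frozenset({Cause.HUMAN_OVERSIGHT})),
    Incident("silent queue backlog", frozenset({Cause.DETECTION_GAP})),
    Incident(
        "bad config, coarse logs",
        frozenset({Cause.HUMAN_OVERSIGHT, Cause.DETECTION_GAP}),
    ),
]

for remedy, count in tally_remedies(history).most_common():
    print(f"{remedy}: {count}")
```

A tally like this is only a prompt for discussion. The judgment about which failure type an incident really belongs to still has to come from a careful reading of each one.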