
With the limited time and energy that software leaders have, by tuning into Programming with Palermo, you ensure that you accomplish the most for the effort invested.

You deliver software. That's what you do. And it can be frustrating when things take too long, when bugs pop up, or when things break in production. But you have what it takes. Programming with Palermo can help improve your confidence by delivering timeless knowledge, removing unnecessary obstacles, and restoring excellence within your development team, opening team members up to their full potential. Simply put: simplify!

Feb 9, 2023

In this episode, Jeffrey shared how an executive oversees a software team.

Situation

Our industry struggles mightily with failed software projects. On average, half of all projects still fail. Failure is defined as the executive who authorized the budget wishing he hadn't. The project is so over budget and so over schedule that the company would be better off having never started it. Even in the middle of these projects, executives can feel powerless to abort them for fear of sunk costs. And without knowing the right questions to ask or the right reports to demand, the executive in charge doesn't feel in charge at all. He's left choosing between trusting the team still longer or the nuclear option of scrapping the entire thing.

Mission

Right now, if you are an executive overseeing a software group, I want to equip you with the tools to do that well. If you work in a software team, use this video to give your software executive the information he needs to know the project is on track or the insight to know what the team needs to do a good job.

From here out, though, I'll call you the software executive. Even if you've never programmed any code, you are running a software team. Their success will steer your future career, so this is important. Don't keep going on faith. Don't proceed merely trusting that someone else reporting to you knows how to do your oversight job for you. Lean in. And I'll give you the questions to ask, the tools to use, and the practices to deploy so that you can safely guide your software project to success. And most importantly, if your current software project is veering toward failure, I'm going to empower you to stop the bleeding and get it back on track.

Execution

Before diving into the guidance, I want to paint a mental model for you. Think of every other department in the company. Think of every group. Think of every team branch on the org chart. Each one of them is responsible for delivering some output valuable to the business. And each of these teams also needs some inputs in order to deliver those outputs. And if the outputs are not delivered, the team's leader is typically replaced. And the leaders who excel are the ones that can set up the team members for success.

Mental Model

The factory is arranged well and operates efficiently every day in a safe manner. The assembly line flows at a good speed with incoming materials being delivered at the right cadence to keep going. Quality issues are prevented and detected very early. Hourly and daily throughput measures are tallied and reported up the management chain. Quality and throughput measures are paired with acceptable thresholds and established as a standard with better numbers as stretch targets. Then, the executive in charge ensures that the factory or assembly line is organized in a way where each team member understands the job and what activities it will take to meet the targets.

What we don't do is declare a building to be a manufacturing plant, ask a team to come to work inside it, and then come back to check in a month later. The people we staff on the team are typically not the same people needed in order to design the process for how the team should work. And Scrum has done the industry a disservice by spreading the notion of self-organizing teams. Even the certified ScrumMasters are trained to ask the team what they want to do and then work to remove blocking issues to what they want to do. This isn't leadership. Only when a team is working in an efficient manner can the lower-level details be turned over for self-organization. An appropriate leader (you) is always necessary to put the overall structure in place for the team so that real measurable throughput can build momentum.

I started out with a factory and assembly line analogy. And many knowledge workers will rightfully object that the nature of the work is different. And it is. Earlier in my career, I was one of the self-organization promoters, and I was banging the drum about knowledge work being inestimable or unmeasurable. But speaking for myself, what I liked most about that message was that it gave me space to dive into the work without having to report up as much. It gave me more space as a programmer. But what it didn't produce was less risk for the executive who authorized the project budget in the first place.

This challenge exists in all the fields of knowledge work as well. Managerial accountants and CPAs also have tough problems that don't have rote solutions. The rote solutions have been automated away by good accounting software. But if your CPA takes forever to figure something out and then bills you twice as much as what you budgeted, you still have a problem. Sales is another area that has some similarities with the "magic" of software development. You want a certain pace of sales. And the staff just wants to get to work. But seasoned sales executives know that without a good sales team process, closed sales won't happen. And even enterprise sales that can take 3-6 months or longer don't just ride on the "trust me" message of the sales rep. Good sales executives put processes with measures in place: the number of leads contacted, the number of meetings held, the number of emails, phone calls, and networking events.

My goal in this introduction is to suggest that we dispense with any notion that software is too complex to be managed like other departments in the business. I've been managing programmers for 17 years. All we have to do is raise the conversation above the technical jargon, and we can get to a place of business language where all the executive tools apply. Whether you like to use OKRs or EOS L10 meetings with a scorecard, or just regular weekly metrics, you can apply the oversight methods of your other teams to your software team. Let's get into it.

Team Alignment

Before we discuss software-specific issues, let's apply what we already know about team formation and team alignment. If any team is going to be high-performing, it has to be aligned and going in the right direction. The old model of forming-storming-norming-performing applies just as well to the software team. And the Clear Measure Way's Team Alignment Template (TAT) provides a form to document the team's alignment. Just like other parts of the company, without consistently reinforcing the vision for the project and the business strategy that caused a software project to be commissioned, a team will stray. It's human nature. It has nothing to do with software. And regardless of what information is chosen, the team must send a periodic report to you, the software executive. After all, you are giving a report to your executive team or board of directors. And if you have no report from the team, then it's hard to do your briefing. So you need some form of a scorecard. The Clear Measure Way curriculum also includes a Software Team Scorecard Template you can use. We suggest the minimum set of measures to report. As time goes on, you'll want to add more.

Team Skills

Just like any other team in the business, if your software team doesn't have the skills needed to execute a particular project, you won't succeed. But if you haven't cataloged the required skills or taken an inventory of current skills, you don't know. And one of the peculiar traits of many software developers is their inventor personality. If you ask them, "Can you do _?", they will answer, "Yes, I can do that," even when they have never done it before. They will tell you they can. After all, Orville and Wilbur Wright said that they could make a flying machine. It turns out they did, but that process was invention, not implementation. To inventory your skills, you need to know what your team members have done before, not what they believe they can learn to do. If you have a smartphone app project in front of you but no one who has ever put a smartphone app into production before, then you are missing a skill. This is just one example. But you can see again that any department in your business goes through the same skills planning. If your accounting department doesn't have anyone who has ever done inventory accounting for large warehouses, and you intend to build a warehouse, you would need to recognize this and augment the accounting team. There is no such thing as a "full-stack developer". Oh, you'll find it on the job boards, but "full stack" means very different things to different people. It depends on the technology stack. So if someone places "full-stack developer" on their resume, you have to look at the projects they have done, which constrains their definition of "stack". In addition, some skills represent answers to strategic risks. Take security. Security breaches can tank entire lines of business. This is not just another technical skill. It's a department competency. So I encourage you to get specific about needed skills and current skills so that you understand the actual skills you have and the ones that are lacking.
Then you can build a training and staffing plan for your project. Chances are some of your existing people can do some training and add some skills. Then there will be other skills that need to be sourced from the outside, either temporarily or with a permanent hire.

Establishing quality

We all want our team to be able to move fast and deliver at a rapid pace. But from an oversight perspective, first demand a quality output at whatever pace the team can deliver; then measure the pace at which they deliver with that level of quality. Think of when you were learning to type. The measure of typing speed is words per minute with some number of errors. You know that 100 WPM with 100 errors does you no good, because that doesn't represent 100 typed words. It represents 100 misspelled words that have to be fixed. You want 100 WPM with 0 errors.

Capers Jones, in his writing about software metrics, notes that teams who prioritize productivity and treat quality as something to balance against it end up suffering from poor quality. Then, with bugs mounting, more and more of the software team's capacity is used to fix bugs. With less of the team's full capacity going to delivering features, productivity slows, creating more pressure to re-establish productivity. The team members, under more pressure to perform, take more shortcuts in order to "get things done", but this just yields more bugs, which take more team capacity to tackle. With only a small fraction of the overall team's capacity dedicated to new features, overall productivity tanks. Over time, this causes some teams to pitch a new plan to management: "We need to modernize this system", which is code for "fixing this is more effort than starting over from scratch." This is the equivalent of waving the white flag and surrendering. No army that surrenders can later claim victory.

As the software executive, this is where your leadership comes in. Sequence the establishment of a quality standard first, before challenging the team to increase the pace of delivery. Measure the number of bugs caught before production and the number of bugs caught by users in production. Measure how long it takes to fix each bug. All of the modern work-tracking tools will do this well for you. This is the easy part. Your leadership is important here because you are establishing an important principle for your team to abide by: quality should be prioritized over speed of delivery. Because you know that adopting a speed-over-quality strategy yields neither speed nor quality. In weekly team meetings, which every team should have, ask the same question over and over: "Tell me about the bug that escaped into production. What are we changing so that kind of bug can never get to our users again?" Their answer will be different every time, but your question will be the same. Ask for a tour of the code that caused the bug. If you can't understand the explanation or the code, then you've found a quality hot spot that you'll want to ask more questions about. Don't believe the lie that "the code is too complex for you to understand." After all, you wouldn't accept that excuse from an electrician or any other trade. Remember, the purpose of the software is to simplify a domain that has higher complexity without the software.

In any of the teams you oversee, you'll want to understand the engineering practices that are in place. Here are a few that every software team should be using:

- Test-driven development
- Continuous integration
- Static analysis
- Pull request checklists (a modern implementation of a formal change inspection)

I expect the team to have other practices in place as well in order to ensure that quality is kept to a high bar. Without these, your team will struggle unnecessarily to keep quality high on a multi-developer team.
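To make the first of these practices concrete, here is a minimal sketch of test-driven development. The function and scenario are hypothetical, invented purely for illustration; the point is the order of operations: the tests are written first, fail, and then the implementation is written until they pass.

```python
# Hypothetical example: in test-driven development, the tests below are
# written first (and fail), then apply_discount is implemented to make
# them pass. The function name and business rule are illustrative only.

def apply_discount(price: float, percent: float) -> float:
    """Return the price after applying a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

# These tests existed before the implementation above.
def test_ten_percent_off():
    assert apply_discount(200.0, 10) == 180.0

def test_invalid_percent_rejected():
    try:
        apply_discount(100.0, 150)
    except ValueError:
        return  # expected: an out-of-range percent is rejected
    raise AssertionError("expected ValueError for an out-of-range percent")

test_ten_percent_off()
test_invalid_percent_rejected()
```

The value for an overseeing executive is that every behavior the team commits to is pinned down by an executable test before the code exists, which is what makes the "bugs caught before production" measure possible.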

Achieving stability

Once a team has the practices in place that enable it to deliver code free from defects (bugs), the next priority is to get that code onto a production environment in a stable fashion. Chances are that you don't just have a new piece of software; you have existing pieces of software in production that have stability issues from time to time. Stability issues can have one or more of the following symptoms:

- Sluggishness
- Outages/goes offline
- Error messages or frozen screens
- Abnormal behavior/bugs that can't be reproduced by engineers

When users report any of these symptoms, you have a production issue. Having good language around these symptoms gives you clarity in your oversight duties. You'll want to make sure the appropriate stability measures are in place to track the stability of your software as they run in production.

Sometimes, teams can be gun-shy about production deployments. They might advocate for monthly deployments or after-hours deployment events with many hands on deck. This is technically unnecessary but commonly born from a previous unpleasant experience making changes to a production environment. After a deployment goes bad, developers can become hesitant, wary, and distrustful of the process because they consider it dangerous. But a large inventory of undeployed software is not only a large investment that isn't generating a return, but it is also a growing risk of unproven system changes. All departments that manage throughput understand the power of limiting work-in-process (WIP). Infrequent deployments queue up far too many changes waiting for a stressful, error-prone deployment event.

Ultimately, your two goals to achieve stability are:

- Prevent production issues
- Minimize undeployed software

You can measure these on the team's scorecard by tracking weekly metrics:

- Number of deployments for the week
- Number of production issues for the week (separated by severity)
- MTTR (mean/average time to issue recovery/issue resolution)
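The MTTR metric is just an average over timestamps your work-tracking tool already records. As a sketch, assuming a hypothetical weekly export of issue open/resolve times (the data format here is an assumption, not any specific tool's export):

```python
from datetime import datetime

# Hypothetical weekly export of production issues: (opened, resolved) pairs.
issues = [
    (datetime(2023, 2, 6, 9, 0), datetime(2023, 2, 6, 10, 30)),   # 90 min
    (datetime(2023, 2, 7, 14, 0), datetime(2023, 2, 7, 14, 45)),  # 45 min
    (datetime(2023, 2, 9, 8, 0), datetime(2023, 2, 9, 9, 0)),     # 60 min
]

def mttr_minutes(issues):
    """Mean time to recovery, in minutes, across resolved issues."""
    durations = [(resolved - opened).total_seconds() / 60
                 for opened, resolved in issues]
    return sum(durations) / len(durations)

print(f"Issues this week: {len(issues)}")
print(f"MTTR: {mttr_minutes(issues):.0f} minutes")  # (90 + 45 + 60) / 3 = 65
```

In practice you would pull these numbers from the tracking tool's reports rather than compute them by hand, but seeing the arithmetic makes the scorecard number harder to game.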

As with overseeing bugs, mentioned above, you can ask your team the same questions to drive the right behaviors:

- "What features/changes are tested and ready for production?"
- "What was the root cause of that production issue, and what are we changing so that type of issue can never happen again?"
- "What should we strengthen about our environment so that we are able to resolve issues faster next time?"

As with quality, there is a minimum set of practices that every team should employ if you have the expectation of running a stable software system in a production environment:

- Automated DevOps from day 1 of a new project (eliminate manual, monthly deployments)
- Small releases
- Runtime automated health checks (built-in self-diagnostics)
- Explicit secrets management
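The "runtime automated health checks" item deserves a quick illustration: the application reports on its own dependencies so a dashboard or load balancer can poll it. This is a minimal sketch; the dependency names and check functions are illustrative assumptions, not part of any particular framework.

```python
# A minimal sketch of a built-in self-diagnostic health check.
# The dependencies checked here (database, message queue) are hypothetical.

def check_database() -> bool:
    # In a real system this would open a connection and run a trivial query.
    return True

def check_message_queue() -> bool:
    # In a real system this would ping the broker.
    return True

def health_report() -> dict:
    """Run each self-diagnostic and summarize overall status."""
    checks = {
        "database": check_database(),
        "message_queue": check_message_queue(),
    }
    return {
        "status": "healthy" if all(checks.values()) else "degraded",
        "checks": checks,
    }

print(health_report())
```

A report like this, exposed over an HTTP endpoint, is what turns "is production okay?" from a question for the developers into a number on a dashboard.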

When production issues crop up (and they will from time to time), the following practices enable your team to diagnose them more quickly and come to a resolution:

- Centralized OpenTelemetry logging, metrics, and traces
- An APM (Application Performance Management) tool with a shared operations dashboard
- A formal support desk tool with ticket tracking, anomaly alerts, and emergency alarms

If some of this sounded familiar, it's because many of these practices are the software parallel of practices used to operate any other factory or assembly line. In a factory, if part of the production line experiences an issue, an obvious alert goes off, and staff spring into action to resolve it locally before it becomes a factory outage. For more serious problems, emergency alarms stop the line and call everyone's attention to rally around the problem and get the production line back up and functioning. While the tools are different, the way of thinking is the same. Here are some questions to ask your team in order to gain insight into how these may or may not be implemented:

- "Would you please give me a tour of our logs and telemetry that allow me to see how users are using our software?"
- "How do we currently train a new team member to be on-call for production support, and what dashboards should they be looking at to ensure the software is functioning in a stable fashion?"
- "What events currently trigger an alert, and what events currently trigger alarms? Who receives alerts and how? How do we all receive alarms?"

Increasing speed (productivity)

Let's finally turn our attention to increasing speed. That was quite a bit of information to digest before discussing productivity, but for good reason. With quality problems, our team is diverted to diagnosing and fixing bugs rather than working on new changes and features. With stability problems, our team is yet again distracted from working on new changes because the production environment rightfully takes priority. Even if we staff dedicated systems engineers to be responsible for supporting the production environment, they typically can only operate a stable system. For high-scale systems, it's normal to constantly change the number of servers or cloud CPUs or Kubernetes pods based on load. And it's normal to watch the queue length as data flows through, to be sure it's being processed within established SLAs. But when errors are happening and the system is not behaving as the systems engineers have been trained to expect, those issues are escalated to the software development team. And that is where development capacity goes.

The power of prioritizing quality and stability first is that the result is 100% of your team's capacity actually going to the new work set before it. With this achieved, we can look at what then causes a team to be able to move fast when it is actually able to work on new software changes.

From an oversight perspective, I'd like to paint a picture of how to think about your team's productivity, throughput, or pace. Let's take an analogy of the Baja 1000 desert race. To do well in this race, you need to finish. That means you need to pick a pace that will not cause your driver or machine to expire. Then, you need to navigate well. If your drivers get lost or go off course, they drive many more miles than necessary. Picking a good course and staying on that course shortens the miles necessary to finish the race. Even so, an obstacle may emerge that needs a change of course because of new information learned. Finally, the drivers must drive FAST along the chosen route.

Let's apply this analogy to a software team. The Team Alignment Template has given us a tool to ensure everyone is clear on where we are going, that is, what business outcome we want to achieve. This is the finish line. Feature A or Feature B is not a finish line. Any individual feature is akin to a particular route on the race course. We are choosing Feature A because we reasonably believe that changing the software in this way will progress us toward our objective. But as we move along, we need to watch out for new information that would help us learn that Feature A might not be the progress toward our objective that we hoped it would be.

Let's pause now and tackle a fallacy that's been promoted heavily in our industry: the fallacy of the "Product Owner". The Scrum curriculum heavily touted the Product Owner as the role that knew the customer so well that he could correctly prioritize the backlog. Because the Product Owner had prioritized the items, they were deemed to be the right software changes to make. In practice, so few teams have been able to find a person with that depth of customer knowledge that the role of Product Owner hasn't worked. The 2018 State of DevOps Report by Puppet Labs shared a study finding that teams using Product Owners had a batting average of about .333. In other words, the Product Owner was right about 1/3 of the time: when those changes were put into production, they yielded the desired outcome. What's interesting is that another 1/3 of the changes put into production yielded no progress toward the business objective. And the final 1/3 of the changes actually hurt the performance of the software and moved the business away from its objective. Those changes had to be hurriedly backed out.

In your oversight role, don't rely on anyone to be so prescient that you trust them implicitly to decide what changes to prioritize. Instead, think about it like any other department in the company. Measure the result and adjust based on the actual data you collect. This is another reason for prioritizing stability ahead of moving faster. The same practices that achieve stability yield a capability for collecting data used in business analysis for what features yield a desired result.

Now that we have a good mental model for how to increase speed toward a goal, we need to measure the current actual speed. You'll want to add more measures to the team's scorecard. Add weekly numbers that represent progress toward the business objective of the software. If the software is related to e-commerce, you may add daily revenue. If it's an inventory system, you may add numbers that are reported on executive team scorecards. This gives your software team ownership of targets that other executives see. And they can participate more fully in improving those business measures. When it comes to software-specific measures for the scorecard, I suggest these as a minimum:

- Desired # of issues delivered per week
- Current # of issues delivered this week
- Average time an issue spends in each status

If you are just starting this type of measurement, you might not know what target to set for the desired # of issues delivered per week. Go ahead and defer that until you've measured actuals for at least a month. An important principle on which these measures depend is commonality. The shipping industry is able to deliver an object of any size or shape to a destination. That is because of packaging standards. There are envelopes, small boxes, long tubes, pallets, and even shipping containers. In software, no two features are the same. In previous decades, and still today, teams have attempted to use methods of estimation to get to numbers that could be relied on. No method of estimation today has reached that goal. If our work-tracking system has some features in it that are 10x or 5x or 2x the size of other features, it's hard to get the team into a flow of consistent delivery. Again, other departments that measure throughput know that the work needs to be made common-sized in order to empower the team to shine and deliver at an increased rate.

In software, the practice to embrace is Feature Decomposition. In project management, there is a practice called Work Breakdown Structure. Breaking units of work into smaller tasks is used widely elsewhere to make the work more approachable and manageable. Feature Decomposition is the Work Breakdown Structure of software. Guide the team to break down software changes into tasks that can each be reasonably completed in one day of effort. For some features, you will challenge the intended design in order to accomplish this. The result will be development tasks that are all roughly one day of work in size. And with a common-sized unit of work, you can measure throughput. But measuring throughput isn't the only reason for doing this. Large software changes that are not broken down are typically where other problems hide: faulty analysis, undefined architecture, incorrect assumptions, and undiscovered unknowns. Breaking down development work exposes these hidden problems, further increasing the quality of what is delivered. It also forces more design work upstream of coding, because design decisions have to be made before the initial code. Starting the code on a feature that is too large mixes missed analysis and design conversations right into the middle of unfinished code, when the developer reaches a point where he finds an unanswered question. Then coding has to stop, and an impromptu meeting has to happen because coding on that feature is now blocked. You can safely assume that a feature or change expected to take several days to complete will not take several days. It will take 2x or 5x or 10x longer than that. The several-days estimate is an estimate of no confidence. Only when you have an estimate of one day can you be certain that all needed work has been identified and understood well.
In this process, you'll also see more detailed design diagrams since more knowledge will be flowing throughout the team. As you increase your team's delivery speed, here are some minimum practices to expect:

- Kanban-style work tracking (a work board where items move from left to right)
- Feature decomposition
- Design diagrams
- Refactoring

The last item in this list, refactoring, is a mature practice. You can find very good books on it. It recognizes the reality that since we are going to learn from how our users use the software after the software has been built, we need to expect to make changes based on that learning. Refactoring is our method for making those changes. We are going to learn that a feature should behave and be designed differently. Refactoring is a means by which we change the software so that the feature becomes designed in a new way, as if we had been designing it in that manner from the start. Here is a suggested question to ask when you learn something new from users in production:

- "Since we need to change Feature A, what parts need to change so that the outcome is as if we intended to design it this way from the start?"

The lack of refactoring will compound over time into a code base that is hard to understand and hard to follow. Refactoring ensures that the code is always easy to understand at a glance.

As you measure your team each week, look for the current # of issues delivered each week to increase. You'll also notice bottlenecks to increased speed because you are measuring the average time an item takes in each of the statuses on your work-tracking board. When bottlenecks are discovered, you resolve them. The lack of tracking time per status is what allowed bottlenecks to remain hidden.
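The time-per-status measure is simple enough to sketch. Assuming a hypothetical export of work-item transitions (the item names, statuses, and data format below are invented for illustration), the bottleneck shows up as the status with the highest average:

```python
from collections import defaultdict

# Hypothetical transition log: (item, status, days spent in that status),
# as most work-tracking tools can export. The format is an assumption.
transitions = [
    ("FEAT-1", "Analysis", 1), ("FEAT-1", "In Progress", 2), ("FEAT-1", "Review", 4),
    ("FEAT-2", "Analysis", 1), ("FEAT-2", "In Progress", 1), ("FEAT-2", "Review", 6),
]

def average_days_per_status(transitions):
    """Average days items spend in each board status."""
    totals, counts = defaultdict(float), defaultdict(int)
    for _, status, days in transitions:
        totals[status] += days
        counts[status] += 1
    return {status: totals[status] / counts[status] for status in totals}

for status, days in average_days_per_status(transitions).items():
    print(f"{status}: {days:.1f} days")
# In this invented data, Review averages 5 days while the other statuses
# average 1-1.5 days, pointing to a review bottleneck worth investigating.
```

Once the averages are on the scorecard, the follow-up question writes itself: "Why do items sit in that status so long, and what would cut that time in half?"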

With all this, you have a very strong oversight position for your software team. It will be important to keep quality, stability, and speed in the proper order. Schedule pressures can tempt teams to forget this order and allow quality or stability to slip by tolerating shortcuts. But it doesn't take long for shortcuts to accumulate, resulting again in poor delivery speed. Whenever a bug makes it out to users, or whenever a production issue happens, reinforce to the team the importance of taking action so that this kind of bug or issue can never happen again. It's a logical progression. For a bug or issue to never happen again, it's not just about the team trying harder or gritting their teeth tighter. It's about real root cause analysis and fixing the problem at the root so that it is impossible for the bug or production issue to happen in that way again.

Leading and strengthening the team

While you are enjoying your high-performing team, stay vigilant. Recognize when you need to go back and redo some parts of the process. When you add a person to the team. When a person leaves the team. Back up to skills assessment and inventory. Review the Team Alignment Template. Allow the now-changed team to form again, storm again, and norm again, so they can perform again. Keep them equipped. Make sure every member of your team has an avenue for ongoing professional development. Keep measuring the team. After all, "A" players look forward to the report card. Create an environment where "A" players can celebrate. Your less-than-"A" players will select themselves out. When you identify a "B" player, craft a professional development plan with them so that they become an "A" player. Your "A" players want to work with other "A" players, and they want to work in an environment where it is normal to succeed. The working environment that you have crafted for them will empower them to succeed, and they won't want to work anywhere else. You will have created a team with longevity that has established quality, achieved stability, and is increasing its speed of delivery each day.

Conclusion

You have what it takes to oversee your team as a software executive. You can do it. By implementing these principles, leading your team with a scorecard of relevant measures, and putting into place these team practices, you will have a team that is an asset to your business. I know you can do it. And we are here to guide. May God grant you wisdom as you lead your team.

Thanks to Clear Measure for sponsoring this sample and episode of Programming with Palermo.

This program is syndicated on many channels. To send a question or comment to the show, email programming@palermo.network. We’d love to hear from you.

To use the private and confidential Chaplain service, use the following
Gentleman: 512-619-6950
Lady: 512-923-8178