Software Development: Sprint vs. Marathon Mindsets

Two mindsets.

Sprint Mindset. Ship it! Go for speed! SPRINT!

Marathon Mindset. Clean Architecture and Code results in lasting quality and speed of development, but may take longer in the short term.

Sprint Mindset. Focused on Time To Market (TTM)

The Agile proponents take it as a given that we should be optimizing for Time To Market (TTM). “The Fast Way is the Right Way” are a few words I have heard to describe this mindset. Now don’t get me wrong, there are times when shipping software fast is the only thing that matters.

Short runway, company is going to die if the product isn’t launched within 2 months? Build it quick and dirty!

Need to hack together a script to do a customer-specific one-off migration? Go for it. As long as it gets the job done right.

Prototyping something to see if focus groups respond well?

I get it. There are plenty of times it makes sense to go the fast route.

But what happens when organizations continually make the short-term expediency choice time and time again?

Well, by fitting each feature together in a “build the thing as quickly as possible” kind of way, we keep building a taller tower on an unsound foundation and rickety support beams.

I have witnessed larger organizations fail to remain relevant as smaller organizations out-maneuvered them. Where had their speed gone?

When developers are afraid of breaking things, they move slowly and cautiously.

“If I change this, it will impact these 4 other things.”

These are thoughts that slow a developer down immensely. Now the choice is: change the 4 other things (start by seeing what impact that might have) or put in another hacky, band-aid solution. So let’s play Choose Your Own Adventure.

Hack it = Technical Debt

Another band-aid solution it is.

This added technical debt is frequently brought on by rushing an implementation. Developers code towards features rather than capabilities. Feature-based development often results in tightly-coupled code. It bakes in assumptions. You know, the things that are difficult and costly to change.

Now we are living in a codebase that is a nightmare to work in. Making changes is like solving a sudoku puzzle of logic balancing.

For some reason, things have started to slow down. So you crack the executive whip harder… Maybe that will work?

Refactor = Artificial Risk and Cost

Duct-tape solutions won’t work. Let’s refactor.

Perhaps you’d prefer to go change the four other places in the code (and make the cascading changes those may require). Or perhaps you’re breaking the habit of making the short-term choice again and again, and now is the time to do some refactoring.

Well refactoring artificially induces risk. I’m not saying don’t refactor. By all means, it may be time to pay the piper.

But perhaps this path was more avoidable than we had previously thought. Which brings us to… The Marathon Mindset.

Marathon Mindset. Focused on Total Cost of Ownership (TCO)

What we might instead consider is to focus on Total Cost of Ownership (TCO).

When we focus on TCO, we will build systems with capabilities and not features. Features will only come out of our capabilities. There will be a clean separation of concerns which will allow development teams to grow and still remain fast.

In many cases we may rely on distributed or even cloud-native systems. If we are working with distributed systems, we will prioritize testability. We need to get the granularity of services right so that there is a simple rule for where new functionality belongs. If it needs to be in a new service: where does that service belong, and is it the Stripe service or the Payment Service?
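One way to picture the Stripe service versus Payment Service question is the minimal sketch below. It assumes a hypothetical `PaymentGateway` interface and a fake vendor adapter; none of these names come from a real SDK, and the adapter makes no real Stripe call. The point is only that features depend on the capability ("take a payment") rather than on the vendor.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Receipt:
    provider: str
    amount_cents: int


class PaymentGateway(Protocol):
    """The capability: 'take a payment'. Hypothetical interface for illustration."""

    def charge(self, customer_id: str, amount_cents: int) -> Receipt: ...


class FakeStripeGateway:
    """Stand-in for a vendor adapter. It does not call any real Stripe API."""

    def charge(self, customer_id: str, amount_cents: int) -> Receipt:
        return Receipt(provider="stripe", amount_cents=amount_cents)


class PaymentService:
    """Owns the capability. Features call this and never see the vendor."""

    def __init__(self, gateway: PaymentGateway) -> None:
        self._gateway = gateway

    def pay_invoice(self, customer_id: str, amount_cents: int) -> Receipt:
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        return self._gateway.charge(customer_id, amount_cents)


service = PaymentService(FakeStripeGateway())
receipt = service.pay_invoice("cust-1", 2500)
```

With this shape, swapping the vendor means writing one new adapter. The features built on `PaymentService` do not change.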

Whenever we are working on a product of any respectable size, we can reduce the project duration and cost by thinking ahead and building a cohesive system. Measure twice, cut once.

Remember: capabilities, not features.

Balance

However, there is always a balance to be had. At the extreme end of optimizing for Time to Market, we wind up with spaghetti code and, worse, a spaghetti architecture.

At the extreme of optimizing for Total Cost of Ownership, we have overengineering and diminishing returns for time invested.

So, balance.

Which way to lean?

When do we want to put more importance on TCO? Given that the bulk of the effort goes into maintaining software and not just building it, optimizing for TCO is ideal for situations where we want to keep moving fast.

And when do we want to put more importance on TTM? Prototypes, true one-offs, and for startups without runway. These cases may demand a priority put on TTM.

Wrapup

So which problem is your organization facing?

All too often I see companies that cannot move quickly anymore and it can be costly. They are encumbered by system design quality and code quality from choosing the short term path each time.

Or there are organizations that use heavyweight processes. A barrage of meetings, stakeholder approvals, specs, heavyweight analyses. I will talk about this in a future article.

There are even organizations that are simultaneously encumbered by both!

Is your organization encumbered? I help organizations move fast again. Get in contact with me.

Nimble Manifesto

Nimble Manifesto: Agility Without Agile

So what is the Nimble Manifesto?

The Nimble Manifesto is a call for sanity in software engineering. Agile paints a picture of an artificial fight between itself and Waterfall. This is a false dichotomy. The choice is not between just-in-time decision making and a rigid, detailed up-front plan.

What we want is a method that keeps up with changing plans and gives some degree of predictability too. As an added bonus, we can move faster without meetings eating up our teams’ calendars, we can keep our teams productive and effective without such frequent rewrites, and we can keep our developers from burning out sprinting all the time.

 

Nimble Values

Able to evolve OVER grind-to-a-halt technical debt

Absorbent to business evolution OVER constant reactive scrambling

Predictable cost and duration OVER the guess-and-check loop

Sustainable pace OVER perpetual sprinting

 

Bad for:

  • Precision Industries. Think NASA, autopilot systems, or medical devices. In these cases, we prefer exactness over flexibility. Space shuttles were not designed to be evolvable; they were designed to perform one role a particular way and to perfection. Time and Cost are less important but Quality is extremely important. Waterfall would work better.
  • Widget Factories. Predictably and similarly sized software tasks that don’t involve a high degree of creativity to build. Use “Capital-A” Agile.
  • Initial Prototypes. Particularly if they are prototypes of something small. Best to just do it- don’t be encumbered by process. When the prototype is proven or adopted, it’s often best to rebuild it from the ground up – you can significantly reduce Total Cost of Ownership.

 

Good for:

  • Most others. The VAST middle ground in between Waterfall and Agile.

 

But nothing is free. What is the caveat?

This is our tradeoff. In order to gain project efficiency, predictability, and maintainability, we choose to give up the fastest possible initial date of a feature. Working nimbly, we can complete the project quicker, know our costs up front, and mindfully choose the right balance of Time, Cost, and Risk for our needs.

 

My recommendation on how to use the Nimble Manifesto (on a 2-12 month project):

We have two goals to accomplish before starting the project: we need to put together system level requirements and get executive approval for the project, knowing the Time, Cost, and Risk.

System Level Requirements

  • Discover project requirements. These are the features.
  • Tease apart the system requirements- we’re working with capabilities here and not features.
  • Estimate these requirements. 1 person for each service or component (for backend) or each area (for frontend). We will use weeks, not hours, to estimate. Why by week? We value accuracy over precision. When averaged together for the project, precision will be lost anyway. So coarse-grained estimation gives us accuracy.

 

Project Sign-Off

  • Build multiple potential plans that show various balances of Time, Cost, Risk.
  • Include staffing needs of each role- Architect, Product Manager, Project Manager, Developer, QA, Configuration Manager. The first 3 of these will be needed throughout the course of the project. The other roles will be introduced after the initial plan. Choose a realistic plan for staffing- not everyone can start on day 1, people can’t reasonably join and leave and join and leave the project. Mirror your plan to realities.
  • Present plans to the executive team for approval. Work together to explain the various balances of Time, Cost, Risk. Choose what makes sense for the business. Sometimes, it means building in-house, other times working with a consulting firm can expedite the project.

 

During the project:

  • Build software with engineering in mind. Use an architect to supervise the project.
  • When business changes come in, have a technical architect meet with a business architect to work together to measure impact and make decisions. The impact can be applied to the project plan and reassessed- get executive approval for anything major.
  • Don’t meet so much. It can encumber the team. Rather than be prescriptive, I will leave this one open ended. Are meetings adding the amount of value they are costing? Getting 5-10 people together in a room is expensive and distracting. The lightest possible process to keep things moving tends to make a big impact on added efficiency.
  • Features are integration points of our activities. We can demonstrate our capabilities coming together as progress. Progress will come more efficiently, but the features will not be here as quickly. At the end of the project, we will see the features come together.

 

Life’s a marathon, not a sprint. Let’s move adeptly, intelligently, and with a purpose.

Moving fast in a legacy monolith codebase- safely!

A friend of mine told me about a problem he’s facing. He’s working with a client who has a globally distributed team, much of which is offshore, working in a monolith Java codebase. I hear the groans already! It gets worse… As you might imagine, there are quality issues with the software. They release only once or twice per week, and when issues are found in production, they scramble to fix them and quickly ship again (fix forward). The leaders of the company are unhappy with the development team’s inability to produce quality software.

The scenario he described is all too common!

What strategies can be used to deal with this?

First, let’s break the problem down into its issues:

  1. Multiple teams sharing one codebase
  2. Testing is a challenge and quality issues slip through to production
  3. There is no automatic way to find production issues so they rely on QA staff and customers reporting issues
  4. Deployments are a manual process and therefore they are not easy to do frequently

So how can we address these issues?

Migrate parts of the monolith to a microservices architecture

If we can split the codebase into smaller, independently deployable services, each service can be assigned a team to own it. With ownership comes pride in one’s work and code quality naturally improves. If there’s an issue with a service, guess who is on the hook? The team who wrote it! Each service endpoint is a contract- you give me these inputs and I will give you these outputs. Now teams within the organization can independently build their services.

How can we get to microservices safely though? Using the Strangler pattern and the Anti-Corruption Layer pattern.
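As a rough illustration of how those two patterns fit together, here is a minimal sketch. The handler names, the `/invoices` route, and the legacy column names (`INV_NO`, `AMT_CENTS`) are all hypothetical. A facade sits in front of the monolith and sends only migrated routes to the new service, while a small translation function keeps legacy naming out of the new model.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


def legacy_monolith(request: dict) -> dict:
    """Stand-in for an existing monolith endpoint."""
    return {"handled_by": "monolith", "path": request["path"]}


def invoices_service(request: dict) -> dict:
    """Stand-in for a newly extracted service."""
    return {"handled_by": "invoices-service", "path": request["path"]}


class StranglerFacade:
    """Routes migrated path prefixes to new services, everything else to the monolith."""

    def __init__(self, fallback: Handler) -> None:
        self._fallback = fallback
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, prefix: str, handler: Handler) -> None:
        self._migrated[prefix] = handler

    def handle(self, request: dict) -> dict:
        path = request["path"]
        # The longest matching prefix wins, so specific routes can move first.
        for prefix in sorted(self._migrated, key=len, reverse=True):
            if path.startswith(prefix):
                return self._migrated[prefix](request)
        return self._fallback(request)


def to_invoice(legacy_row: dict) -> dict:
    """Anti-corruption layer: translate the legacy shape into the new model."""
    return {
        "invoice_id": str(legacy_row["INV_NO"]),
        "total_cents": int(legacy_row["AMT_CENTS"]),
    }


facade = StranglerFacade(fallback=legacy_monolith)
facade.migrate("/invoices", invoices_service)

new_path = facade.handle({"path": "/invoices/42"})
old_path = facade.handle({"path": "/customers/7"})
invoice = to_invoice({"INV_NO": 42, "AMT_CENTS": "1999"})
```

Each call to `migrate` moves one more slice of traffic off the monolith. Removing the entry sends that traffic back, which is what makes the approach comparatively safe.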

Invest in the right kinds of testing

Where this client was concerned, there was only surface-level testing occurring: they had few automated tests to simulate clicks and confirm behavior of the web application itself. They had dismal code coverage with their minimal unit tests. Bugs slipped through the cracks. And what’s worse? When they found a bug, there were no tests written to confirm the fix.

My advice: every time there is a bug, write unit tests to reproduce the bug and fix it. These tests will confirm that the fix is in place. This is Test Driven Development (TDD) in practice. Start by writing the test the way it should work (it will fail because there is a bug). Then fix the code. Run the test again to demonstrate that this code path produces the desired result. Check it in!
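Here is a minimal sketch of that loop using Python’s built-in unittest module. The function `apply_discount` and the bug (a discount over 100% producing a negative price) are invented for illustration. The first test is written to describe the correct behavior and fails against the buggy code. The function below already contains the fix, so the test now passes.

```python
import unittest


def apply_discount(price_cents: int, percent: int) -> int:
    """Return the discounted price in cents.

    Hypothetical bug: percent was used unchecked, so 150% gave a negative price.
    The fix clamps percent to the 0..100 range.
    """
    percent = max(0, min(percent, 100))
    return price_cents - (price_cents * percent) // 100


class ApplyDiscountRegressionTest(unittest.TestCase):
    def test_discount_over_100_percent_never_goes_negative(self):
        # Reproduces the reported bug. This failed before the fix.
        self.assertEqual(apply_discount(1000, 150), 0)

    def test_ordinary_discount_is_unchanged(self):
        # Guards against the fix breaking the normal path.
        self.assertEqual(apply_discount(1000, 25), 750)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ApplyDiscountRegressionTest)
result = unittest.TextTestRunner().run(suite)
```

The test stays in the suite after the fix ships, so the same bug cannot quietly come back.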

Additionally, for a web application of this size, more automation tests should be considered. There are easy ways to get started with these tests. Oftentimes people work directly with Selenium, but there are more reliable ways to get started. I would recommend using Cucumber or, if you’re in a .NET shop, Canopy.

Ship code faster

When there are more people in a codebase, there is more reason to ship code faster! Why is that? Well suppose we have 20 changes from various contributors all ready to be delivered together. And suppose we find a bug in the software while it is being QA tested (or worse: once it is in production). Now we have to track down which change was the offending change. I don’t know about you but my life is much easier when it’s not spent searching for a needle in the haystack.

What’s the solution? Ship code faster! Better to push code every day (except not on Friday!) or even multiple times every day! This couples nicely with microservices architectures because each team is responsible for getting their codebase into production. That’s the DevOps way!

Have an issue? It becomes much easier to diagnose and determine a course of action when you can tell exactly when it started and which codebase it was in! Only one problem… shipping code is painful! What can we do…?

Build a CI/CD Pipeline

To reduce the pain of deployments, we need to make packaging up code easy! To steal a Wikipedia definition,

Continuous integration is the practice of merging all developer working copies to a shared mainline several times a day.

So now for each codebase, we are constantly merging back into the develop branch as changes are happening. This means the develop branch is the source of truth for the current state of the team’s work.

And now for Continuous Delivery (from Wikipedia),

Continuous delivery is a software engineering approach in which teams produce software in short cycles, ensuring that the software can be reliably released at any time. It aims at building, testing, and releasing software faster and more frequently. The approach helps reduce the cost, time, and risk of delivering changes by allowing for more incremental updates to applications in production.

So we need to make it easy to automatically build our software when a commit happens to develop. The build process should include running all unit tests to ensure things are still healthy.

When a release branch is created, we can have a system run through the same build/test process, package it up for various environments, and store the packages in an artifact repository. The process then deploys it to a testing environment and runs our automation suite of tests against it to validate that level.

Manual QA is still likely a requirement for most organizations, but now driven off of nothing but a developer commit, we have releasable packages that just need to be signed off on by our QA staff and they can make it to production. And you guessed it– production deployments should be automated too!
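The gating idea behind the pipeline can be sketched in a few lines. Real pipelines are configured in a CI server rather than written like this, and the stage names below are assumptions chosen to mirror the steps described above. Each lambda stands in for a real build, test, or deploy command.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[List[str], str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that passed and the name of the stage
    that failed ("" if every stage passed).
    """
    passed: List[str] = []
    for name, step in stages:
        if not step():
            return passed, name
        passed.append(name)
    return passed, ""


# Hypothetical stages. Each lambda stands in for a real command.
stages: List[Stage] = [
    ("build", lambda: True),
    ("unit-tests", lambda: True),
    ("package", lambda: True),
    ("deploy-to-test", lambda: True),
    ("automation-suite", lambda: False),  # simulate a failing automation test
    ("await-qa-sign-off", lambda: True),
    ("deploy-to-production", lambda: True),
]

passed, failed_at = run_pipeline(stages)
```

Because the automation suite fails in this run, nothing after it executes. A broken build never reaches QA sign-off or production.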

Production Monitoring

So what can we do to find our production errors before our customers do? Well, we could have a wired QA staff frantically clicking through to manually verify each behavior after each deployment to production. But there are better options!

One option I’ll put forward is to ensure we have centralized logging. Each codebase should report its errors and audits (and to different log stores). There are a variety of ways to do this, but one that is gaining a lot of traction (and rightfully so) is the ELK stack. ELK stands for Elasticsearch, Logstash, and Kibana. The idea is that each codebase will store its logs locally on the server on a temporary basis. Logstash is a service installed on each server that monitors for these log files and ships the log entries to a central log store. This log store is running Elasticsearch, and the log entries are combined within it so they are searchable. Finally, developers and QA staff use Kibana as a frontend to search through the errors that are occurring.

Best practices here include:

  • using a Correlation ID to identify a call working its way through a variety of microservices
  • logging which Codebase an error or audit happened in
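Both practices can be sketched with Python’s standard logging module. The header name `X-Correlation-ID`, the codebase name `billing-service`, and the JSON field names are assumptions for illustration, not an ELK requirement. The example writes to an in-memory stream so it is self-contained. In practice the handler would write to the local log file that gets shipped to the central store.

```python
import contextvars
import io
import json
import logging
import uuid

# Holds the correlation ID for the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")
CODEBASE = "billing-service"  # hypothetical codebase name


class ContextFilter(logging.Filter):
    """Attach the correlation ID and codebase name to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        record.codebase = CODEBASE
        return True


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per line so a log shipper can parse it easily."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "correlation_id": record.correlation_id,
            "codebase": record.codebase,
        })


stream = io.StringIO()  # in-memory stand-in for a local log file
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
handler.addFilter(ContextFilter())

logger = logging.getLogger("billing")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False


def handle_request(headers: dict) -> str:
    """Reuse the caller's correlation ID, or start a new one at the edge."""
    cid = headers.get("X-Correlation-ID") or str(uuid.uuid4())
    correlation_id.set(cid)
    logger.error("payment failed")
    return cid  # pass this along on any outgoing calls to other services


handle_request({"X-Correlation-ID": "abc-123"})
entry = json.loads(stream.getvalue().splitlines()[-1])
```

Searching the central store for one correlation ID then returns every entry for that call across services, each tagged with the codebase it came from.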

Learn more about getting the ELK stack set up here.

Feature Toggles

So what do we do with all of these code deployments happening so frequently our heads are spinning? We can wrap our changes in something called a feature toggle.

A feature toggle is branching logic. If the toggle is on, go to the new path. If the toggle is off, go to the old path.

One use of feature toggles is for safely releasing. We can deploy code with the toggle off and there is no change to system behavior. Once we are ready- and maybe the product team has some involvement here- we toggle the feature on. Now code goes down the new path. Don’t forget to monitor the error logs! If all is well, we still need to clean up after the feature toggle on a subsequent deployment to remove the old path. This is very important as feature toggles should be short-lived so they don’t clutter your codebase.
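A feature toggle can be as small as the sketch below. The `FeatureToggles` class, the toggle name `tiered-shipping`, and the pricing rules are hypothetical. A real system would usually read toggle state from configuration or a toggle service so it can change without a deployment.

```python
class FeatureToggles:
    """In-memory toggle store, for illustration only."""

    def __init__(self, enabled=None) -> None:
        self._enabled = set(enabled or [])

    def is_on(self, name: str) -> bool:
        return name in self._enabled

    def turn_on(self, name: str) -> None:
        self._enabled.add(name)


def shipping_cost_cents(weight_kg: int, toggles: FeatureToggles) -> int:
    if toggles.is_on("tiered-shipping"):
        # New path: base fee plus a per-kilogram charge.
        return 300 + 50 * weight_kg
    # Old path: flat fee. Delete this branch once the toggle is retired.
    return 500


toggles = FeatureToggles()
before = shipping_cost_cents(2, toggles)  # deployed with the toggle off

toggles.turn_on("tiered-shipping")
after = shipping_cost_cents(2, toggles)   # same code, new behavior
```

The cleanup step then amounts to removing the `if` and the old branch, leaving only the new path.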

You can read more about feature toggles from Martin Fowler here.

Wrapup

So that was a lot! With some work, we can safely move fast and deploy even several times per day. We can build confidence in the quality of the software we produce and can separate out the code into microservices to enable larger teams to safely work together.

I hope that helps your organization!

Honing Your Craft 2017

A few of the developers I work with asked me for recommendations on where to go to continue learning things in software development to hone your craft or just stay up with new things in the industry. So I put together some of my favorite resources here. Most of these are geared towards backend and distributed / cloud systems. I’d be happy to hear from you if you have some recommendations as well.

Conferences and Slides

Tons of value here!

Podcasts

These are some of my favorites to listen to in the mornings- subscribe in your favorite podcast app.

Development Blogs

Personal Tech Blogs

You can find and converse with top developers on twitter!

Service Patterns

Interesting Manifestos

I hope you can find a valuable educational resource or two from this list that helps your software development growth.

© 2021 John Bindel
