Reliable by Design: Building Guardrails for AI and Other Unpredictable Systems

Presenter: Naresh Jain & Elisabeth Hendrickson
Event: Curious Duck Dev Podcast
Location: Online

Presentation summary

Microservices were supposed to make software delivery faster and easier, so why are releases slower, bugs more frequent, and teams more frustrated than ever? In this insightful conversation, Naresh Jain and Elisabeth Hendrickson explore the hidden costs of modern software architecture, the myth of shift-left, and why API design-first thinking may be the way out of integration purgatory. From contract testing to AI-driven development and the inspection mindset trap, this talk is packed with practical lessons for teams struggling with scale, complexity, and quality.

Transcript

In this in-depth interview, Elisabeth Hendrickson sits down with her longtime friend and accomplished technologist, Naresh Jain, to explore the intricate challenges enterprises face with modern software architectures. From the early days of Agile testing tools to the ongoing struggles with microservices and the dawn of AI-driven development, this conversation offers a wealth of insights into how organizations can rethink their approach to design, testing, and deployment to build truly reliable systems. Naresh shares his extensive experience with large-scale enterprises, his passion for community building, and his pioneering work on contract testing and API design-first methodologies that promise to transform the way we handle complexity and unpredictability in software systems.

Table of Contents

Getting to Know Naresh Jain: From Early Agile Days to Community Building
Enterprise Systems: Complexity, Scale, and Legacy Challenges
The Microservices Mirage: When Promises Meet Reality
Testing Realities: The Integration Trap and Contract Testing
Tools and Mindset: The Path to Reliable Systems
Facing the AI Era: New Challenges and Opportunities
Final Thoughts: One Action to Improve Today

Getting to Know Naresh Jain: From Early Agile Days to Community Building

 

Elisabeth Hendrickson: Naresh, we’ve known each other for almost two decades now, going back to the first AAFTT meeting in Portland in 2007. What was that experience like for you, and how has your journey evolved since then?

 

Naresh Jain: Yes, it’s been almost twenty years, which is incredible. I remember flying in jet-lagged but energized, eager to connect with others passionate about agile testing. Since then, my journey has been quite diverse. Beyond working in technology, I founded the Agile India community and organized numerous conferences like Agile India, Simple Design and Test, API Days, and even niche ones around Selenium, Appium, functional programming, and data science.

One of my core motivations has been to bridge the exposure gap for brilliant technologists in India. Too often, folks only see marketing-driven conferences rather than authentic technical content. So, bringing real movers and shakers to India has been both a mission and a way for me to engage with the people I admire.

Elisabeth Hendrickson: You also created your own conference organizing platform, ConfEngine. Can you tell us about how that came about?

 

Naresh Jain: Absolutely. Initially, conference organizing was purely volunteer-driven, with communications flying around via emails and spreadsheets. We wanted a single source of truth for schedules, proposals, and feedback. So, I started automating parts of the process as a hobby. To my surprise, other conference organizers wanted to use it too. That led to expanding ConfEngine into a platform supporting multiple conferences, even internationally in places like Japan. It's been rewarding to see a shared value system around community events emerge through this tool.

Enterprise Systems: Complexity, Scale, and Legacy Challenges

 

Elisabeth Hendrickson: Shifting gears to your work with enterprises, you've seen firsthand the complexity and scale at which they operate. What stands out the most about these mission-critical systems?

 

Naresh Jain: It's truly mind-blowing. Some systems date back to the 1960s: decades-old codebases written in obscure or even proprietary languages. The scale is immense, and many systems have been incredibly stable over long periods. That's a treasure trove of learning. Of course, these systems face challenges, which is why external experts like us are brought in. But it's a two-way street: we learn from their successes and share new ideas to improve things.

Elisabeth Hendrickson: And that led to our conversation about modernizing these systems, particularly with microservices. There was this belief that microservices would solve all the problems. What's your take on that?

 

Naresh Jain: It's a story as old as time. I remember when COM objects and Enterprise JavaBeans (EJBs) were hyped as the solution. The belief was that if you tested a component thoroughly, the whole system would just work. That illusion carried into microservices. Everyone wanted the magic recipe: microservices, cloud, domain-driven design, and suddenly you'd be the next Amazon or Netflix. But the reality is much messier.

The Microservices Mirage: When Promises Meet Reality

 

Elisabeth Hendrickson: Can you describe some of the pitfalls enterprises encounter after adopting microservices?

 

Naresh Jain: Many start with independent teams deploying small microservices rapidly, sometimes multiple times a day, which works initially. But then something breaks, and they add integration testing environments to catch issues. Soon, the scope of testing balloons. Instead of testing a handful of services, they find themselves needing to deploy thousands of microservices simultaneously to ensure compatibility. It becomes a nightmare to track versions, dependencies, and failures. What was meant to increase velocity ends up slowing everything down.

This is when observability becomes critical (distributed tracing, centralized logging), and the ecosystem around Kubernetes and microservices exploded. But the fundamental problem remains: microservices often become distributed monoliths where you must deploy and test the entire system before releasing anything.
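To make the observability point concrete, here is a minimal sketch of distributed tracing using OpenTelemetry's Python SDK. The service name, span name, and order lookup are hypothetical, and a real deployment would export spans to a collector rather than the console.

```python
# Minimal distributed-tracing sketch (pip install opentelemetry-api opentelemetry-sdk).
# Each service wraps its work in spans so a single request can be followed
# across many microservices when something breaks. Names here are hypothetical.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    SimpleSpanProcessor(ConsoleSpanExporter())  # real setups export to a collector
)

tracer = trace.get_tracer("orders-service")  # hypothetical service name

def fetch_order(order_id: str) -> dict:
    # The span records timing and metadata for this hop in the request path.
    with tracer.start_as_current_span("fetch-order") as span:
        span.set_attribute("order.id", order_id)
        return {"id": order_id, "status": "SHIPPED"}  # stand-in for a real call

fetch_order("42")
```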

Elisabeth Hendrickson: That distributed monolith kills productivity. What advice do you give organizations wrestling with this complexity?

 

Naresh Jain: If possible, avoid microservices; the cost and complexity are high. Getting microservices right is not trivial. It's about more than just domain-driven design. You must carefully design service boundaries and interfaces to prevent integration hell. Many organizations miss that the hidden logic inside systems gets pushed into service interfaces, which must be well thought out and maintained.

When interfaces are poorly designed or undocumented, you get contract mismatches, version conflicts, and cascading failures. The key is to ensure that your microservices truly operate independently and that the contracts between them are explicit, executable, and testable.

Testing Realities: The Integration Trap and Contract Testing

 

Elisabeth Hendrickson: You mentioned integration hell. How do enterprises typically try to solve that, and why does it often fail?

 

Naresh Jain: Many enterprises fall into the inspection trap: they rely heavily on end-of-cycle, heavyweight integration testing in shared environments. The problem is these environments are unstable; there's downtime, waiting for microservices to be deployed or fixed, and test data synchronization issues. This leads to long feedback loops where a developer might only find out months later that their change broke something.

Meanwhile, developers keep pushing new changes to avoid missing deadlines, piling up untested changes and causing the proverbial train never to leave the station. The complexity of orchestrating multiple CI/CD pipelines, versions, and dependencies also compounds the problem. The end result is infrequent releases, bug leakage, frustrated teams, and unhappy customers.

Elisabeth Hendrickson: That sounds like a vicious cycle. How can organizations break out of it?

 

Naresh Jain: One effective approach is embracing API design-first thinking combined with contract testing. Instead of letting microservices evolve haphazardly, organizations bring all stakeholders (providers, consumers, architects) together upfront to collaboratively design and agree on API specifications.

These API specs become a single source of truth, used as executable contracts to validate backward compatibility and integration correctness continuously. Consumers can generate mocks from the contract, enabling parallel development and early testing. This shift-left approach moves integration validation from the end of the cycle into design and development phases, drastically reducing surprises and delays.
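As a rough illustration of what an executable contract can look like, the sketch below validates a payload against a shared response schema. The endpoint, schema, and mock payload are hypothetical; in practice the schema would be generated from the shared OpenAPI document rather than written by hand.

```python
# Sketch of an executable contract shared by provider and consumer
# (pip install jsonschema). The schema is a hypothetical fragment of the
# agreed spec for GET /orders/{id}; real projects derive it from OpenAPI.
from jsonschema import ValidationError, validate

ORDER_RESPONSE_SCHEMA = {
    "type": "object",
    "required": ["id", "status", "total"],
    "properties": {
        "id": {"type": "string"},
        "status": {"enum": ["PENDING", "SHIPPED", "DELIVERED"]},
        "total": {"type": "number", "minimum": 0},
    },
}

def contract_compatible(payload: dict) -> bool:
    """Return True if a payload satisfies the shared contract."""
    try:
        validate(instance=payload, schema=ORDER_RESPONSE_SCHEMA)
        return True
    except ValidationError:
        return False

# Consumer side: the mock used for parallel development must itself honour
# the contract, so provider and consumer cannot silently drift apart.
mock_response = {"id": "42", "status": "SHIPPED", "total": 18.5}
assert contract_compatible(mock_response)
```

The provider can run the same check against its real responses in continuous integration, which is what makes the contract testable rather than merely documented.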

Another key practice is ephemeral integration environments, where all services except the one under test remain fixed and stable. This allows for rapid, isolated testing of changes before full integration, improving confidence and speed.

Elisabeth Hendrickson: Can you share examples of how this has helped organizations?

 

Naresh Jain: We analyzed a large client with a six-month cycle time and found that 70% of that was spent waiting after development was “done” before deployment to production. Most bugs originated from contract mismatches. By instituting API design-first and contract testing practices, they caught these issues early, reducing integration surprises significantly and shortening cycle times.

Tools and Mindset: The Path to Reliable Systems

 

Elisabeth Hendrickson: This sounds like a big investment upfront. How can organizations make it tractable?

 

Naresh Jain: Thankfully, there are open-source tools and platforms to help. For example, I developed Specmatic, which can take API specifications like OpenAPI or AsyncAPI and spin up wire-compatible mocks without writing code. This ensures your mocks always align with your API contract, reducing drift and maintenance overhead.
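The snippet below is not Specmatic itself; it is a simplified Python sketch of the underlying idea, with a hypothetical schema fragment: stub responses are generated from the contract rather than hand-written, so the mock cannot drift away from what the specification allows.

```python
# Spec-driven stubbing, illustrated (not Specmatic's actual implementation):
# responses are generated from the contract's schema, so the stub can never
# return a shape the real provider would not be allowed to return.
import random
import string

ORDER_RESPONSE_SCHEMA = {  # hypothetical fragment of the agreed contract
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "status": {"enum": ["PENDING", "SHIPPED", "DELIVERED"]},
        "total": {"type": "number", "minimum": 0},
    },
}

def stub_from_schema(schema: dict):
    """Generate a contract-conforming example value for a schema fragment."""
    if "enum" in schema:
        return random.choice(schema["enum"])
    kind = schema.get("type")
    if kind == "object":
        return {name: stub_from_schema(sub)
                for name, sub in schema.get("properties", {}).items()}
    if kind == "string":
        return "".join(random.choices(string.ascii_lowercase, k=8))
    if kind == "number":
        return float(schema.get("minimum", 0))
    raise ValueError(f"unsupported schema fragment: {schema}")

# Consumers develop and test against this stub while the provider is still
# being built, confident the shape always matches the agreed contract.
print(stub_from_schema(ORDER_RESPONSE_SCHEMA))
```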

Of course, you can’t do everything at once. Start where changes are fastest and most painful, and scale gradually. The tooling helps, but equally important is the mindset shift: moving away from the inspection mindset where you trust only end-to-end testing towards trusting design-time validation and continuous collaboration.

I often use the analogy of buying a car: would you buy a car without inspecting individual components like the engine? No. Yet, many teams only test the integrated system late, which is risky and inefficient. Shifting testing and validation to design time is both safer and faster.

Elisabeth Hendrickson: So, shifting left is more than a buzzword?

 

Naresh Jain: Exactly. It's a cliché now, unfortunately. But in reality, organizations struggle to operationalize it. Platforms and shift-down approaches are promising, but they must be tailored to the organization's context. The real value lies in cultivating a continuous improvement culture, using data to identify and close feedback loops rapidly.

Facing the AI Era: New Challenges and Opportunities

 

Elisabeth Hendrickson: With AI accelerating code generation and change velocity, how do you see quality assurance evolving?

 

Naresh Jain: AI fundamentally changes the game. Developers can pump out code faster than ever, but the old inspection-based quality models won't scale. AI-driven systems are inherently nondeterministic: running the same input multiple times can yield different outputs. This makes traditional testing approaches ineffective.

Instead, we need deterministic guardrails around these nondeterministic systems. Think of it as sandwiching the AI components between layers that enforce expected properties and behaviors. For example, property-based testing defines invariants the system must satisfy, while shadow mode testing runs new models in parallel to compare outputs before full deployment.
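Here is a minimal sketch of the property-based side of that idea using the hypothesis library. The summarize function is a hypothetical stand-in for a model call, and the invariants are examples of the kind of properties a team might agree on, not a definitive list.

```python
# Property-based guardrails for a nondeterministic component (sketch).
# pip install hypothesis. `summarize` stands in for a model call; we assert
# invariants every acceptable output must hold, never exact outputs.
from hypothesis import given, settings, strategies as st

def summarize(text: str) -> str:
    """Placeholder for a nondeterministic model call."""
    return text[:100].strip()  # real code would call the model here

@settings(max_examples=200)
@given(st.text(min_size=1, max_size=2000))
def test_summary_invariants(text):
    summary = summarize(text)
    assert len(summary) <= 100          # respects the agreed length budget
    assert summary == summary.strip()   # no stray leading/trailing whitespace

if __name__ == "__main__":
    test_summary_invariants()  # hypothesis exercises the property across many inputs
```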

In practice, this means embedding feedback mechanisms during generation, not after. It's a shift from inspection to design-time and runtime monitoring with self-correcting feedback loops, much like the self-stopping looms that inspired Toyota's lean manufacturing.
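A rough sketch of what that sandwich can look like in code, with a hypothetical generate callable standing in for the model: a deterministic validator checks each output at generation time, and violations are fed back into the prompt for a bounded number of retries before a human is pulled in.

```python
# Deterministic guardrails sandwiching a nondeterministic generator (sketch).
# `generate` is a hypothetical model call; the validator and retry loop are
# the deterministic layers that enforce expected properties during generation.
from typing import Callable

def violations(answer: str) -> list[str]:
    """Deterministic checks; an empty list means the answer is acceptable."""
    problems = []
    if not answer.strip():
        problems.append("empty answer")
    if len(answer) > 500:
        problems.append("exceeds 500-character budget")
    return problems

def guarded_generate(generate: Callable[[str], str], prompt: str,
                     max_attempts: int = 3) -> str:
    """Call the model, feed violations back into the prompt, retry a few times."""
    for _ in range(max_attempts):
        answer = generate(prompt)
        problems = violations(answer)
        if not problems:
            return answer
        # Feedback during generation: tell the model what to fix now,
        # instead of discovering the defect in a late inspection step.
        prompt = f"{prompt}\n\nPrevious answer rejected: {'; '.join(problems)}"
    raise RuntimeError("guardrails not satisfied after retries; escalate to a human")

# Usage with a trivial stand-in generator:
print(guarded_generate(lambda p: "A short, valid answer.", "Summarize this proposal"))
```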

Elisabeth Hendrickson: Can you share how this applies to AI-driven products, say, like conference proposal evaluation?

 

Naresh Jain: Certainly. For instance, when someone submits a conference talk proposal, traditionally a program committee reviews it, provides feedback, and refines the selection. In an AI-driven system, you might have an automated committee member giving real-time feedback as the proposal is entered.

This AI could assess content relevance, factual accuracy, tone, style, and even detect toxicity or AI-generated text. But testing such a system is complex: the feedback it provides could vary each time, and you need to ensure it's meaningful and fair.

Training models on thousands of past proposals helps, but validating the AI’s behavior requires new testing paradigms, including continuous monitoring, human-in-the-loop checks, and adaptive prompts. This is a glimpse into the emerging challenges of testing AI-first systems.
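One way to picture the human-in-the-loop piece is a routing rule in front of the AI reviewer. The fields and thresholds below are hypothetical, but the principle is that low-confidence or risky evaluations are escalated to the committee rather than shown to the submitter automatically.

```python
# Human-in-the-loop routing for an AI proposal reviewer (sketch).
# Field names and thresholds are hypothetical; the point is that risky or
# low-confidence evaluations are escalated, not auto-published.
from dataclasses import dataclass

@dataclass
class Review:
    relevance: float   # 0..1, fit with the conference theme
    toxicity: float    # 0..1, higher means more problematic language
    confidence: float  # 0..1, the model's own certainty
    feedback: str      # drafted feedback for the submitter

def route(review: Review) -> str:
    """Decide whether AI feedback can be shown directly or needs a human."""
    if review.toxicity > 0.2:
        return "human"   # never auto-send feedback flagged as toxic
    if review.confidence < 0.7 or review.relevance < 0.3:
        return "human"   # uncertain or off-topic cases go to the committee
    return "auto"        # safe to show the submitter in real time

# Every decision is also logged for monitoring, so drift in the model's
# behaviour shows up in dashboards before it shows up in complaints.
print(route(Review(relevance=0.9, toxicity=0.05, confidence=0.85,
                   feedback="Add concrete takeaways to the abstract.")))
```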

Final Thoughts: One Action to Improve Today

 

Elisabeth Hendrickson: For enterprises currently in a world of hurt with complexity, delays, and bugs, if they could do just one thing starting tomorrow to improve, what would you advise?

 

Naresh Jain: That's a tough question, as it depends on the organization's context. But generally, I'd say focus on building the muscle for continuous improvement through data-driven feedback loops. Don't get overwhelmed trying to solve the biggest problem first. Pick any problem, collect data, reflect, and close the loop with an improvement.

This approach builds momentum and cultivates a culture of learning and adaptation. It applies across testing, quality, architecture, and process. Avoid the trap of feature factories that ship more and more without clear direction or impact. Instead, aim to understand where youโ€™re headed and make incremental, measurable progress.

With this mindset and a willingness to collaborate on API design, embrace contract testing, and rethink integration, organizations can gradually reclaim velocity, quality, and confidence even in the face of sprawling microservices or AI-driven complexity.

About the Interviewer

 

This conversation was led by Elisabeth Hendrickson, a renowned agile testing expert and founder of Curious Duck. You can learn more about Elisabeth and her work on LinkedIn and at Curious Duck.

 
