Together with co-sponsor AWS, nClouds recently added our second entry to the 5C’s Series of webinars discussing customer journeys with CI, CD, Containers, Cloud and Culture. The new webinar explores the story of nDimensional, a pioneer in machine learning, AI, big data, and IoT, and how they scaled their infrastructure to support rapid growth. I was joined by Dr. Rakesh Chalasani, Vice President, Technology at nDimensional; Doug Cliché, Solutions Architect at AWS; and Alan Shimel, Editor-in-Chief at DevOps.com.
For context, our client nDimensional developed an end-to-end machine learning/AI platform, called nD, to operationalize industrial-scale big data and IoT applications in real-time production settings. Their customers span banking, insurance, healthcare, energy and more, and use self-service data visualization and analytics to detect credit card fraud, authorize radiology services, optimize electric power generators, and myriad other applications.
To create the powerful insights delivered by their platform, nDimensional was receiving data from a few thousand customer devices. But in late 2016, that number suddenly grew to hundreds of thousands and was trending to millions. As the incoming data swelled, the volume outmatched nDimensional’s infrastructure. The company was thriving, but the infrastructure wasn’t keeping pace.
In 2017, nDimensional partnered with nClouds, and they began re-architecting the infrastructure with AWS. Now, hundreds of thousands of devices are no trouble. Adding nodes to a cluster went from hours to minutes, downtime is virtually gone, and customer responsiveness is up. Plus, the infrastructure can be scaled to handle data from millions of devices. Now able to scale their infrastructure, nDimensional can scale their business.
To learn the details of how nDimensional changed their infrastructure and the corresponding benefits, you’ll want to watch the webinar replay. There’s also a case study. But I’ve outlined three of the biggest takeaways below.
Just because your infrastructure meets today’s needs doesn’t mean it will in the long run. As your business grows, it changes, and its infrastructure needs to change with it. Planning ahead can be tough when there’s a barrage of urgent issues popping up daily. But the longer you wait, the more technical debt you accumulate and the more difficult it is to transform your infrastructure.
For starters, moving to the cloud offers scalability, reliability, performance efficiency, and cost optimization that are tough to get from an on-premises infrastructure. But to realize these benefits, you need to craft the right architecture and bring your resources together with a DevOps approach. The sooner these components are in place, the less risk there is that infrastructure will hold back a nascent company.
When we first engaged with nDimensional, they were on the cloud already, but their architecture wasn’t built to scale with the eruption of data. Turning data into insight, the real value of nDimensional, became more taxing, as the infrastructure was hindered by high latency, downtimes of five to six hours, and more. “We needed compute resources to scale up and down depending on how often we updated our machine learning models,” said Rakesh.
As part of nClouds’ solution, we used AWS OpsWorks to create different layers based on server roles and enable auto-scaling across the infrastructure. Now, nDimensional can scale clusters up and down with zero downtime.
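For readers who want a feel for what threshold-based auto-scaling involves, here is a minimal sketch in Python. It is not nDimensional's configuration and it does not call the OpsWorks API; the `ScalingRule` class, the `desired_node_count` function, and every threshold value are hypothetical, chosen only to illustrate the idea of growing or shrinking a cluster within fixed bounds.

```python
from dataclasses import dataclass


@dataclass
class ScalingRule:
    """Hypothetical thresholds for one server-role layer (illustrative values)."""
    cpu_up_threshold: float = 80.0    # add capacity at or above this average CPU %
    cpu_down_threshold: float = 30.0  # remove capacity at or below this average CPU %
    min_nodes: int = 2                # never shrink below this many nodes
    max_nodes: int = 20               # never grow beyond this many nodes
    step: int = 1                     # nodes added or removed per decision


def desired_node_count(current_nodes: int, avg_cpu_percent: float, rule: ScalingRule) -> int:
    """Return the node count the layer should move to, clamped to the rule's bounds."""
    if avg_cpu_percent >= rule.cpu_up_threshold:
        target = current_nodes + rule.step
    elif avg_cpu_percent <= rule.cpu_down_threshold:
        target = current_nodes - rule.step
    else:
        target = current_nodes
    return max(rule.min_nodes, min(rule.max_nodes, target))


if __name__ == "__main__":
    rule = ScalingRule()
    for nodes, cpu in [(4, 90.0), (4, 50.0), (4, 10.0), (2, 10.0)]:
        print(f"{nodes} nodes at {cpu}% CPU -> {desired_node_count(nodes, cpu, rule)} nodes")
```

Keeping a gap between the scale-up and scale-down thresholds helps avoid a cluster that adds and removes nodes in rapid succession when load hovers near a single cutoff.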
nDimensional’s new, scalable infrastructure is faster, smarter and less demanding on resources than its predecessor. In addition to auto-scaling with OpsWorks, the infrastructure uses VPC Peering to connect a Utility VPC, containing Jenkins and OpenVPN, with a separate production environment VPC for improved security and performance.
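One practical detail when peering VPCs is that their IPv4 CIDR blocks must not overlap, so it is worth checking address ranges before creating the connection. The sketch below uses only Python's standard `ipaddress` module; the two CIDR values and the `cidrs_overlap` helper are assumptions for illustration, not nDimensional's actual network layout.

```python
import ipaddress


def cidrs_overlap(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDR blocks share any addresses."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


# Hypothetical address ranges for the two VPCs described above.
UTILITY_VPC_CIDR = "10.0.0.0/16"      # assumed range for the Jenkins/OpenVPN VPC
PRODUCTION_VPC_CIDR = "10.1.0.0/16"   # assumed range for the production VPC

if __name__ == "__main__":
    if cidrs_overlap(UTILITY_VPC_CIDR, PRODUCTION_VPC_CIDR):
        print("CIDR blocks overlap; these VPCs cannot be peered as-is.")
    else:
        print("CIDR blocks are disjoint; peering is possible.")
```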
While this is all good news, nDimensional still isn’t in the business of maintaining IT architecture. Infrastructure is table stakes, a mandatory platform for their business, but their core competency and focus are delivering smart, innovative, pioneering applications of machine learning and AI based on customer and market requirements.
Nonetheless, the infrastructure requires ongoing upkeep: latency and downtime have to be kept in check, and governance, automation, security and 24/7 support all need attention. That could distract nDimensional from what they do best, and that’s a dilemma faced by many businesses.
Rather than investing in the resources to create and maintain their infrastructure by themselves, they engaged with us at nClouds and took a DevOps approach centered on collaboration and speed. Now, nClouds provides 24/7/365 support, freeing the nDimensional team to focus on their core business. “At this point, we maintain an SLA of 15 minutes for any major issues that may come up in our infrastructure,” said Rakesh. “It’s great and really responsive.”
With the freedom to focus on their machine learning solutions while nClouds focuses on the infrastructure, nDimensional has more resources to apply to innovation. That means greater velocity for new product features. “Our engineering team now has great insight on the stability and functionality of new features we roll out,” said Rakesh. “That gives us a lot of focus and helps us determine what to work on next, and what new features to add in upcoming releases.”
Innovating faster than your competition makes you the market innovator. Not all organizations can take this strategic path. But going fast is a matter of survival in hyper-growth segments like machine learning/AI. That’s part of what makes an integrated cloud and DevOps solution critical for companies like nDimensional.
A DevOps foundation, combined with powerful infrastructure automation in an AWS environment, is needed for organizations and industries where technology is the differentiator. These are at the core of nClouds’ passion. It’s what we have seen work repeatedly for clients. It’s about turning infrastructure from ball-and-chain into launch pad, a platform for digital innovation and business differentiation.
To learn more about nDimensional’s journey, watch the 5C’s Series webinar, “Scaling Cloud Infrastructure for Millions of Devices,” or read the case study.