For the last few years, several members of my team and I have been working with a Fortune 500 global company to help build a Node.js ecommerce platform. We learned many lessons along the way, and we feel they are too valuable not to share.
This post covers the four major lessons we took away from this incredible Node.js project.
1. No One Architecture Works for Every Application
Monolith
When Bitovi first got involved in this project, the codebase we were working with was a Node.js monolith that was created by a startup the company had acquired. This was not the ideal architecture for the application, so we began looking into other options.
Microservices
One of the first things we did was break this monolith into individual microservices. The code now runs as dozens of microservices, which provides the scalability such a large brand needs. Without that split, the platform we’ve been building wouldn’t be able to keep up with the volume of orders coming in during peak periods.
But microservices can be expensive to implement and operate. If the startup that originally wrote the codebase had started with microservices instead of a monolith, they likely would have run out of money long before their acquisition.
Serverless Functions
Aside from microservices and monoliths, serverless functions are often a performant and cost-effective choice for backend cloud architectures. Many of the systems that feed information into the ecommerce platform I’ve mentioned utilize serverless functions.
These systems generate product lists, send orders from client applications, and send orders to the downstream systems. Many of these things happen in bursts—businesses don’t continuously change their products, for example. If these were written as microservices, they would often be sitting unutilized while still incurring charges from the cloud provider. Using serverless functions for this use case greatly reduces cost and complexity.
In fact, the first version of the order workflow we helped develop was written using AWS Lambda serverless functions along with Amazon Simple Notification Service (SNS) and Simple Queue Service (SQS). Using serverless functions allowed the platform to scale with order volume, but as the workflow expanded to handle more use cases, it became overly complex to manage. The web of topics, queues, functions, dead-letter queues, and database records became very difficult to work with.
Workflow Frameworks
To simplify the growing complexity of serverless functions, we moved to Temporal’s workflow engine, which handles much of the error handling and retry logic we were doing manually before. Using a dedicated workflow engine made it much easier to handle all of the possible failures in such a large distributed workflow.
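To give a sense of the kind of retry logic that has to be written by hand when no workflow engine is involved, here is a minimal sketch of a retry helper with exponential backoff. It is not the project's code and it does not use Temporal's API; the names `withRetry`, `RetryOptions`, and `Sleep` are made up for illustration.

```typescript
// Hypothetical options for the hand-rolled retry helper below.
interface RetryOptions {
  maxAttempts: number;    // total attempts, including the first one
  initialDelayMs: number; // wait before the second attempt
  backoffFactor: number;  // multiplier applied to the delay after each failure
}

// The sleep function is injectable so the helper can be tested without waiting.
type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `step` until it succeeds or `maxAttempts` is reached,
// waiting longer between each failed attempt.
async function withRetry<T>(
  step: () => Promise<T>,
  options: RetryOptions,
  sleep: Sleep = defaultSleep
): Promise<T> {
  if (options.maxAttempts < 1) {
    throw new RangeError("maxAttempts must be at least 1");
  }
  let delayMs = options.initialDelayMs;
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await step();
    } catch (err) {
      lastError = err;
      if (attempt === options.maxAttempts) break; // no wait after the final failure
      await sleep(delayMs);
      delayMs *= options.backoffFactor;
    }
  }
  throw lastError;
}
```

A helper like this only covers one step in one process. Once every step in a distributed workflow needs its own variant, plus persistence of progress between steps, the appeal of handing that responsibility to a dedicated engine becomes clearer.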
Each of these architectures (monoliths, microservices, serverless functions, and workflow engines) has use cases it is best suited for. There is no single backend architecture that will work well in all situations.
2. GraphQL’s Best Feature: Its Schema
The first version of the ecommerce platform we’ve been building was a set of RESTful services. We later transitioned to GraphQL using Apollo Federation.
With the REST services, we used OpenAPI to document our APIs. We also built lots of tooling on top of OpenAPI for things like validation and generating documentation. There are many great open-source tools for this, but none of them provide what you get with GraphQL out of the box unless you put in a sizable amount of effort.
When using GraphQL, your server automatically validates incoming requests and outgoing responses against your schema. Your schema also provides your users with most of the documentation they need to understand what operations your API supports and what their inputs and outputs are.
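As a small illustration of how a schema doubles as validation rules and documentation, here is a sketch of type definitions held in a TypeScript string, the way they are commonly passed to a GraphQL server. The types and fields (`Product`, `AddToCartInput`, `addToCart`, and so on) are invented for this example and are not the platform's actual schema.

```typescript
// Hypothetical schema for illustration only.
// Quoted strings above a type or field are GraphQL descriptions; schema-aware
// tools can surface them as documentation. The `!` marks a value as non-null,
// which is one of the rules a GraphQL server enforces against the schema.
const typeDefs = `
  """
  A product that can be added to a cart.
  """
  type Product {
    id: ID!
    name: String!
    "Price in the smallest currency unit, for example cents."
    priceInCents: Int!
  }

  type CartItem {
    product: Product!
    quantity: Int!
  }

  type Cart {
    id: ID!
    items: [CartItem!]!
  }

  input AddToCartInput {
    cartId: ID!
    productId: ID!
    "Required. A request that omits it or sends a non-integer fails validation."
    quantity: Int!
  }

  type Query {
    "Returns null when no product has the given ID."
    product(id: ID!): Product
  }

  type Mutation {
    addToCart(input: AddToCartInput!): Cart!
  }
`;
```

With definitions like these, a request that sends `quantity: "two"` would be rejected before any resolver code runs, and a client developer can read the accepted inputs and returned fields directly from the schema.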
Not to be an ad for Apollo and their Studio product, but if you’re using Apollo Studio, you also get tools like Operation checks, which can track what calls your users are making and automatically detect when a proposed change to your schema will be a breaking change to your users.
Getting all of this tooling out of the box, on top of the open-source tools available for things like code generation, makes the schema GraphQL’s best feature.
3. GraphQL’s Downside: Clients Can Decide What to Query
The fact that clients can query only the fields they need is often given as one of the biggest selling points of using GraphQL. This is absolutely true; you can support multiple clients, and each client can optimize their own query for exactly what they need. But this benefit is an optimization for the client.
For the server, this means that your clients can make extremely expensive queries. Unless you add limits, clients can nest queries arbitrarily deep and request as much data as the schema allows.
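One way to guard against deeply nested queries is to reject them before execution. The sketch below is a deliberately crude approximation, not a production validator: it estimates depth by counting nested braces in the query text, so it ignores fragments and also counts braces used by input objects in arguments. The function names and the example query are hypothetical; a real service would more likely apply a validation rule to the parsed query.

```typescript
// Estimates how deeply selection sets are nested by tracking braces.
// Braces inside quoted strings are skipped.
function estimateQueryDepth(query: string): number {
  let depth = 0;
  let maxDepth = 0;
  let inString = false;
  for (let i = 0; i < query.length; i++) {
    const ch = query[i];
    if (ch === '"' && query[i - 1] !== "\\") {
      inString = !inString;
      continue;
    }
    if (inString) continue;
    if (ch === "{") {
      depth++;
      if (depth > maxDepth) maxDepth = depth;
    } else if (ch === "}") {
      depth--;
    }
  }
  return maxDepth;
}

// Throws when the estimated depth exceeds the configured limit.
function assertWithinDepth(query: string, maxAllowedDepth: number): void {
  const depth = estimateQueryDepth(query);
  if (depth > maxAllowedDepth) {
    throw new Error(
      `Query depth ${depth} exceeds the allowed maximum of ${maxAllowedDepth}`
    );
  }
}

// Hypothetical query: cart -> items -> product -> name is four levels deep.
const exampleQuery = `{ cart { items { product { name } } } }`;
```

Calling `assertWithinDepth(exampleQuery, 3)` throws, while a limit of 4 or more lets the query through. Choosing the limit is a judgment call that depends on how deep legitimate client queries actually go.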
As a result, it can be very hard to give accurate SLAs or guidance on how long different requests might take. With GraphQL services, it’s very important to pay close attention to what queries your clients are making. You can use tools like DataLoader to batch and cache lookups so that each entity is fetched from your database at most once per request. It’s also possible to add tooling that prevents these issues by limiting things like the maximum query depth you will allow, but this is all extra development effort.
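To show the idea behind per-request batching and caching, here is a minimal hand-rolled sketch. It is not the DataLoader library and does not reproduce its full behavior; `RequestLoader` and `BatchFn` are hypothetical names, and the batch function is assumed to return values in the same order as the keys it receives.

```typescript
// Assumed contract: the returned array lines up index-for-index with `keys`.
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (error: unknown) => void;
}

// Create one instance per incoming request so the cache never outlives it.
class RequestLoader<K, V> {
  private readonly batchFn: BatchFn<K, V>;
  private readonly cache = new Map<K, Promise<V>>();
  private queue: Pending<K, V>[] = [];

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    // Repeated loads of the same key share one promise.
    const cached = this.cache.get(key);
    if (cached) return cached;

    const promise = new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // The first key queued schedules a flush after the current
      // synchronous code finishes, so later keys join the same batch.
      if (this.queue.length === 1) {
        void Promise.resolve().then(() => this.flush());
      }
    });
    this.cache.set(key, promise);
    return promise;
  }

  private async flush(): Promise<void> {
    const batch = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(batch.map((item) => item.key));
      batch.forEach((item, index) => item.resolve(values[index] as V));
    } catch (error) {
      batch.forEach((item) => item.reject(error));
    }
  }
}
```

If several resolvers in one request each call `load` for product IDs, the batch function runs once with the distinct IDs collected so far instead of once per resolver, which is where the savings come from on heavily nested queries.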
4. “Dogfooding” Is Crucial for Microservices
“Dogfooding” is generally understood to mean having your employees use the software you’re building in order to find issues before your customers use it. This is a great way to find bugs earlier, clear up documentation, and make your code production-ready.
Microservices often need to communicate with one another—a Cart Service needs to request the menu from a Menu Service in order to make sure added products are available in the menu, for example. With microservices, you can get all of the benefits of dogfooding by consciously deciding that API calls between your microservices will use the same flow as calls made by external clients.
With the Federated GraphQL platform we helped build, all of these service-to-service requests are made through the GraphQL gateway in exactly the same way that clients make these calls. There are solutions like gRPC that could have some performance benefits in these service-to-service flows, but these are outweighed by the benefits of having these internal calls go through the exact same code path as client-to-service calls. This means that by the time clients start to use an API, it often has already been used by one or more internal service calls, the documentation has been vetted, and bugs have been found and fixed.
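A service-to-service call through the gateway can look just like a client call: an HTTP POST with a GraphQL query and variables. The sketch below shows a Cart Service checking menu availability that way. The query, the `menuItem` field, the function names, and the gateway URL passed in are all assumptions for illustration, and the HTTP function is injected so the example does not depend on any particular client library.

```typescript
// Minimal shapes for the injected HTTP function (hypothetical names).
interface MinimalResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<MinimalResponse>;

interface GraphQLResult<T> {
  data?: T | null;
  errors?: { message: string }[];
}

// Sends a GraphQL operation to the gateway the same way an external client would.
async function queryGateway<T>(
  fetchFn: FetchLike,
  gatewayUrl: string,
  query: string,
  variables: Record<string, unknown>
): Promise<T> {
  const response = await fetchFn(gatewayUrl, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ query, variables }),
  });
  if (!response.ok) {
    throw new Error(`Gateway responded with HTTP ${response.status}`);
  }
  const result = (await response.json()) as GraphQLResult<T>;
  if (result.errors && result.errors.length > 0) {
    throw new Error(result.errors.map((e) => e.message).join("; "));
  }
  if (result.data == null) {
    throw new Error("Gateway response contained no data");
  }
  return result.data;
}

// Hypothetical operation exposed by a Menu Service.
const MENU_ITEM_QUERY = `
  query MenuItem($productId: ID!) {
    menuItem(productId: $productId) { available }
  }
`;

// What a Cart Service might call before adding a product to a cart.
async function isProductOnMenu(
  fetchFn: FetchLike,
  gatewayUrl: string,
  productId: string
): Promise<boolean> {
  const data = await queryGateway<{ menuItem: { available: boolean } | null }>(
    fetchFn,
    gatewayUrl,
    MENU_ITEM_QUERY,
    { productId }
  );
  return data.menuItem?.available ?? false;
}
```

Because the Cart Service uses the same operation, schema, and error handling that an external client would, any gap in the Menu Service's API or documentation tends to surface internally first.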
Conclusion
Building a massive ecommerce platform is no small task, even for a team of highly skilled Node.js consultants. We learned invaluable lessons along the way: the importance of finding the right architecture, the best way to leverage GraphQL, and the magic of dogfooding.
Overall, the lessons we learned highlight the importance of careful consideration and planning when building scalable and efficient platforms.