Scaling The Smurfs' Society with Gas Optimized Transactions

Alchemy Team
February 24, 2023
Case Study

Partnering with Alchemy to scale transactions on-chain

By The Smurfs’ Society Team

The Smurfs’ Society is the official web3 project of the Smurfs. The project is an invite-only, fully on-chain gamification platform that launched in December 2022 and generated 30,000+ wallet connections and 1.5 million on-chain transfers in less than a month, making it one of the TOP 100 most used Dapps according to DappRadar.

In this article, the project's CTO and co-founder tackles the technical challenges of scaling high volumes of on-chain transactions and explains how the team is collaborating with Alchemy, using the Transact API suite, to deliver a smooth and consistent user experience to tens of thousands of players.

The relativity of ‘scalability’ in the context of blockchains

Most of us wouldn’t consider 100 transactions per minute an “at scale” or volume problem, myself included. In the past, I've collaborated with many companies to optimize Cassandra deployments, running clusters across multiple data centers with terabytes of data, low latency and high volume. Back then, I would never have thought that such a low transaction volume could be hard to achieve when working with blockchain infrastructure.

The challenge is two-fold: 

  1. We need to be able to physically execute the transaction 
  2. We need to ensure the correctness of the transaction and more precisely its eventual data consistency and accuracy. 

The first challenge is much the same as sizing any platform cluster deployed in k8s with auto-scaling capabilities. One small difference is the way we sharded the load: we use an HD wallet in a StatefulSet so that each pod gets its own wallet to manage directly. Remember, too, that a blockchain mandates that transactions be executed in order for a given wallet.
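
As a rough illustration of this sharding idea, here is a minimal sketch, not our production code. It assumes the standard Kubernetes convention that StatefulSet pods are named `<name>-<ordinal>`, and it assumes a BIP-44 style derivation path; the path prefix, the hostname `tx-sender-3`, and the `NonceTracker` helper are hypothetical names chosen for the example.

```python
import re

# Assumption: a BIP-44 style path prefix for Ethereum-compatible accounts.
BIP44_PREFIX = "m/44'/60'/0'/0"


def pod_ordinal(hostname: str) -> int:
    """Extract the ordinal suffix that StatefulSet pods carry (name-0, name-1, ...)."""
    match = re.search(r"-(\d+)$", hostname)
    if match is None:
        raise ValueError(f"not a StatefulSet pod hostname: {hostname!r}")
    return int(match.group(1))


def derivation_path(hostname: str) -> str:
    """Map a pod to its own child key of the shared HD wallet."""
    return f"{BIP44_PREFIX}/{pod_ordinal(hostname)}"


class NonceTracker:
    """Hands out strictly increasing nonces for the single wallet a pod owns."""

    def __init__(self, next_nonce: int) -> None:
        self._next = next_nonce

    def allocate(self) -> int:
        nonce = self._next
        self._next += 1
        return nonce


path = derivation_path("tx-sender-3")  # hypothetical pod name
tracker = NonceTracker(next_nonce=42)
first, second = tracker.allocate(), tracker.allocate()
```

Because each pod derives a different child key, no two pods ever compete for the same nonce sequence, which is what makes the per-wallet ordering constraint manageable.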

For the second, think about a transaction that sits in the mempool but does not get mined because of its gas settings.

Our options are:

  1. Wait for the network to cool down (if possible) for the transaction to just be mined
  2. Overwrite the transaction with a new one (forcing nonce) and increased gas settings
  3. Simply use a different wallet

Clearly, in our case, we can’t be in bucket 1 and we can’t be in bucket 3. So bucket 2 was our only option, which created its own complexities.
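
To make option 2 concrete, here is a minimal sketch of building a replacement transaction: it reuses the nonce of the stuck transaction and raises both EIP-1559 fee fields. The 10% minimum bump is an assumption based on the default replacement rule in geth-style nodes, and the dictionary layout and the `replacement_for` helper are illustrative only.

```python
# Assumption: geth-style nodes reject a replacement unless fees rise by at least 10%.
MIN_BUMP_PERCENT = 10


def replacement_for(stuck_tx: dict, bump_percent: int = 25) -> dict:
    """Return a copy of stuck_tx with the same nonce and higher fee caps."""
    if bump_percent < MIN_BUMP_PERCENT:
        raise ValueError("bump too small to replace the pending transaction")

    def bump(value_wei: int) -> int:
        # Integer ceiling so the new fee is never rounded below the target.
        return (value_wei * (100 + bump_percent) + 99) // 100

    replacement = dict(stuck_tx)  # nonce, to, data, ... stay identical
    replacement["maxFeePerGas"] = bump(stuck_tx["maxFeePerGas"])
    replacement["maxPriorityFeePerGas"] = bump(stuck_tx["maxPriorityFeePerGas"])
    return replacement


stuck = {"nonce": 7, "maxFeePerGas": 100, "maxPriorityFeePerGas": 30}
bumped = replacement_for(stuck)
```

The complexity comes from everything around this function: deciding when a transaction counts as stuck, and tracking which of the competing versions was eventually mined.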

This is a quick look at our experience partnering with Alchemy as we moved from Reinforced Transactions to Gas Optimized Transactions.

Mining 20,000 transactions per day while maximizing correctness

Why is it so hard to scale an application or game that makes heavy use of the blockchain as a backbone? 

  • Are we limited by the number of transactions that can be mined in a single block?
  • Are we limited by the capacity of the network as a whole? 
  • Are we limited by network congestion and specific intraday usage patterns?
  • Are we limited by the way the transactions on the blockchain get executed? 
  • Are we limited by gas price volatility? 

After several weeks of operations and a lot of iterating on all our controls, we now have a platform that lets the game operate in optimal conditions.

Before I explain all the obstacles we had to overcome to get here, let me introduce the game dynamics and give an overview of the actions we automate from a blockchain standpoint, along with their challenges. Then I will explain how we solved the problems by partnering with Alchemy.

How does The Smurfs’ on-chain game work?

The game, as it stands right now, is a disruption of the normal concept of a “waitlist,” where the players mix ingredients to try to find recipes that create potions, some of which can give access to a crystal (guaranteed mint access with discount).

All the assets we use in the game are represented as NFTs. These are either ERC-1155 tokens for items we consider fungible and consumable (e.g. ingredients, potions, blue clay, statuettes, etc.), or ERC-721 tokens for items that aren’t fungible (e.g. crystals).

The Smurfs' Game interactions

Below are the interactions our platform has with the Polygon network for the game to run:

  1. Claim drop box(es) composed of ingredients - players are invited to claim their box of 3 ingredients on a daily basis
  2. Mint statuette (the main component of the game) - all players are required to possess one to be able to complete actions in the game. The statuette is dropped for free when the player logs into the game. The statuette’s role will be announced later in the game.
  3. Burn clay - when a player attempts a recipe and fails, the blue clay is burned (x1)

The player on the other hand performs a single operation on the Polygon network: 

  1. Mix recipe - the player is invited to mix 3 ingredients to create a recipe (the player is the one paying the gas fees in this case).
Figure 1: Player mixing a recipe

For the first two functions, we can work in a 100% asynchronous fashion without real impact to the game if a transaction we attempted wasn’t mined for a long time. Obviously, our preference from an experience point of view would be to have these times as low as possible. 

Burning clay is a bit more challenging, and our implementation choices were mainly driven by the game itself. When a player proposes a recipe, they do so by signing an EIP-712 struct that contains all the ingredients they want to mix, with a 3-hour deadline, as shown below.

Figure 2: Player signs EIP-712 to propose a recipe

This signature is used as authorization to burn if no recipe is found for the associated mix. We execute this burnClay transaction on behalf of the player and absorb the associated transaction fees.
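
For readers unfamiliar with EIP-712, the sketch below shows the general shape of such a typed-data payload. Only the overall structure (`types`, `primaryType`, `domain`, `message`) comes from the EIP-712 standard; the struct name `RecipeProposal`, its fields, the domain name, and the addresses are hypothetical and do not describe our actual contract.

```python
import time

THREE_HOURS = 3 * 60 * 60  # the 3-hour deadline mentioned above, in seconds
POLYGON_CHAIN_ID = 137


def build_recipe_proposal(player, ingredient_ids, verifying_contract, now=None):
    """Build an EIP-712 typed-data payload for a wallet to sign (hypothetical schema)."""
    issued_at = int(time.time()) if now is None else now
    return {
        "types": {
            "EIP712Domain": [
                {"name": "name", "type": "string"},
                {"name": "version", "type": "string"},
                {"name": "chainId", "type": "uint256"},
                {"name": "verifyingContract", "type": "address"},
            ],
            # Hypothetical struct: the real field names are not given in this article.
            "RecipeProposal": [
                {"name": "player", "type": "address"},
                {"name": "ingredientIds", "type": "uint256[]"},
                {"name": "deadline", "type": "uint256"},
            ],
        },
        "primaryType": "RecipeProposal",
        "domain": {
            "name": "ExampleGame",  # hypothetical domain name
            "version": "1",
            "chainId": POLYGON_CHAIN_ID,
            "verifyingContract": verifying_contract,
        },
        "message": {
            "player": player,
            "ingredientIds": list(ingredient_ids),
            "deadline": issued_at + THREE_HOURS,
        },
    }


def is_expired(typed_data: dict, now: int) -> bool:
    """A signature past its deadline must no longer authorize a burn."""
    return now > typed_data["message"]["deadline"]


proposal = build_recipe_proposal(
    player="0x" + "11" * 20,
    ingredient_ids=[3, 8, 21],
    verifying_contract="0x" + "22" * 20,
    now=1_700_000_000,
)
```

The deadline is what bounds how long a burn can stay pending: once it passes, the signed authorization can no longer be used.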

From now on, we have to manage several states:

  1. Transaction is in transit - stored off-chain -> moving on-chain. Used for reconciliations.
  2. Transaction is in the mempool and waiting to be mined - this transition state requires us to reconcile what we believe the user has in flight versus what’s being processed - all with no guarantee on how long it will take
  3. Transaction is mined - the blue clay quantity displayed on screen is now accurate. This is the final state and does not pose any challenge - the blockchain is the source of truth
  4. Transaction is mined but considered as aborted
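
The four states above can be sketched as a small state machine plus a reconciliation helper. This is an illustration under assumptions: the allowed transitions and the `displayed_clay` rule (on-chain balance minus burns still in flight, one clay per burn) are our reading of the list above, not a description of the production system.

```python
from enum import Enum


class TxState(Enum):
    IN_TRANSIT = "in_transit"        # stored off-chain, not yet broadcast
    IN_MEMPOOL = "in_mempool"        # broadcast, waiting to be mined
    MINED = "mined"                  # final: the chain is the source of truth
    MINED_ABORTED = "mined_aborted"  # mined but considered aborted


# Assumed transitions; both mined states are terminal.
ALLOWED = {
    TxState.IN_TRANSIT: {TxState.IN_MEMPOOL},
    TxState.IN_MEMPOOL: {TxState.MINED, TxState.MINED_ABORTED},
    TxState.MINED: set(),
    TxState.MINED_ABORTED: set(),
}


def advance(current: TxState, target: TxState) -> TxState:
    """Move a tracked transaction forward, rejecting impossible jumps."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


def displayed_clay(on_chain_balance: int, burn_states: list) -> int:
    """Blue clay to show on screen: chain balance minus burns not yet mined."""
    in_flight = sum(
        1 for state in burn_states
        if state in (TxState.IN_TRANSIT, TxState.IN_MEMPOOL)
    )
    return max(on_chain_balance - in_flight, 0)


shown = displayed_clay(
    5, [TxState.IN_TRANSIT, TxState.IN_MEMPOOL, TxState.MINED]
)
```

Already-mined burns are not subtracted because the on-chain balance reflects them; only the in-flight ones need reconciling.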

Why The Smurfs' team chose Polygon

In an ideal world, we expect all states to converge in 1 minute or less to guarantee the correctness of operations and game experience. For this use case, we decided to use the Polygon network for the gamification phase. Below are some of the most important criteria that helped us choose Polygon as our network:

  • Fast finalization times for reactiveness
  • Reduced transaction fees for operational costs management
  • Ethereum interoperability for long-term vision

Our challenge: mine 20,000 transactions per day while maximizing consistency 

Reducing Polygon gas volatility impact while delivering a high success rate

Polygon can experience congestion, which affects us directly by adding delays to the different actions we described earlier. There are basic actions we can take to address this:

  1. Sharding - have multiple wallets interacting with the blockchain. As long as there is an operational wallet (similar to a connection pool for DB), transactions can be done. 
  2. Over pricing - to guarantee our transactions are processed in a timely manner, we can boost the incentive on priority fees (EIP-1559).
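
To illustrate the over-pricing idea, here is a minimal sketch of computing EIP-1559 fee fields with a boosted priority fee. The 50% boost and the "twice the base fee plus tip" cap are assumptions (the latter is a common client-side heuristic), not the values we ran in production.

```python
GWEI = 10**9


def overpriced_fees(base_fee_wei: int, market_tip_wei: int,
                    tip_boost_percent: int = 50) -> dict:
    """Return EIP-1559 fee fields that outbid the current market tip."""
    # Boost the priority fee so validators prefer our transaction.
    tip = market_tip_wei * (100 + tip_boost_percent) // 100
    # Assumed heuristic: leave headroom for the base fee to double.
    max_fee = 2 * base_fee_wei + tip
    return {"maxPriorityFeePerGas": tip, "maxFeePerGas": max_fee}


fees = overpriced_fees(base_fee_wei=100 * GWEI, market_tip_wei=30 * GWEI)
```

With EIP-1559, the sender pays at most `maxFeePerGas`, so the headroom is only spent if the base fee actually rises. The weakness of a fixed boost is that it is computed once, at signing time.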

Our initial setup was done using the Reinforced transaction solution and over-pricing the market on gas fees to optimize the transaction finalization times. 

Example of extreme gas prices on Polygon.
Example of dramatic gas price volatility on Polygon.

The two screenshots were taken about 5 minutes apart. You can see the big variation in gas within a very short time frame, which has a direct impact on us: with any rapid surge in gas prices, our transaction would get stuck and block one of our wallets!

How to Optimize Polygon Transactions During Periods of Volatile Gas

We started to work on optimizing transactions during periods of volatile gas by:

  • Adding resiliency to our application and tracking the state of transactions better, for longer periods of time
  • Tracking and alerting when symptoms appear - we did this by tracking the time it takes for a message to be processed
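
A minimal sketch of this kind of tracking follows, assuming each message carries the timestamp at which it was enqueued. The `Message` type and the 20-second threshold are illustrative assumptions rather than our actual alerting configuration.

```python
from dataclasses import dataclass

ALERT_THRESHOLD_SECONDS = 20.0  # assumed threshold for the example


@dataclass(frozen=True)
class Message:
    message_id: str
    enqueued_at: float  # unix timestamp, in seconds


def stale_messages(messages, now: float,
                   threshold: float = ALERT_THRESHOLD_SECONDS):
    """Return (id, age in seconds) for every message older than the threshold."""
    return [
        (m.message_id, now - m.enqueued_at)
        for m in messages
        if now - m.enqueued_at > threshold
    ]


queue = [Message("a", enqueued_at=995.0), Message("b", enqueued_at=800.0)]
alerts = stale_messages(queue, now=1000.0)
```

In practice the ages would be exported as a metric and the alert raised by the monitoring system, but the comparison is the same.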

When everything looks good in micro, we see this:

A healthy process with a processing time that varies but stays below the 20-second mark.

And this is from a macro level when things get more interesting:

Example of messages that are older than 200 seconds.

Occasionally, some messages were nearly 200 seconds old (the example above is from one of our hardest days). When this gap grows too large, players start to notice: the biggest flag is that the blue clay counter doesn’t decrease.

On our worst day, post Polygon upgrade, we had to stop the game for 30 minutes because some transactions could not be mined. 

Since then, we have been actively collaborating with Alchemy to attack this challenge with one of their new offerings, and we decided to join the beta for Gas Optimized Transactions.

The promise of gas optimized transactions is to manage unpredictable outliers.

Imagine if, instead of signing a single transaction with hard-coded logic to determine the gas, you could sign 5 or 10 transactions with different gas price conditions and leave it up to Alchemy to escalate the gas price for that nonce automatically!

In our case, we decided to opt for the following escalation policy: 1.25, 1.5, 1.75, 2, 2.25, 2.5
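
To show what such a policy produces, here is a minimal sketch that expands one base transaction into a ladder of variants sharing the same nonce. It assumes the policy values are multipliers applied to the base fee settings; the `escalation_ladder` helper and the transaction layout are hypothetical, and the sketch deliberately stops before signing or submitting anything to Alchemy.

```python
from fractions import Fraction

# The policy from the text, read here (as an assumption) as fee multipliers.
ESCALATION_POLICY = [Fraction(s) for s in
                     ("1.25", "1.5", "1.75", "2", "2.25", "2.5")]


def escalation_ladder(base_tx: dict, multipliers=ESCALATION_POLICY) -> list:
    """Build one variant per multiplier; every variant keeps the same nonce."""
    ladder = []
    for factor in multipliers:
        variant = dict(base_tx)
        # Fractions keep the arithmetic exact before truncating to whole wei.
        variant["maxFeePerGas"] = int(base_tx["maxFeePerGas"] * factor)
        variant["maxPriorityFeePerGas"] = int(
            base_tx["maxPriorityFeePerGas"] * factor
        )
        ladder.append(variant)
    return ladder


base = {"nonce": 12, "maxFeePerGas": 100, "maxPriorityFeePerGas": 40}
ladder = escalation_ladder(base)
```

Since all variants share one nonce, at most one of them can ever be mined; each variant would be signed separately so the provider can escalate without needing the private key.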

So far, we've had great results with it. 

We have successfully run several optimizations and seen them live in the system. We look forward to continuing our collaboration with the Alchemy team, whom I want to personally thank for their responsiveness the day shit hit the fan and for their quick turnaround. It’s all about choosing the right partners!

Smurfily,

The Smurfs’ Society team. 
