Redpanda can save you a lot of headache

No, this post is not sponsored by Redpanda (yet)

Feb 1, 2023

Last year we were building an ambitious “cryptocurrency arbitrage platform”. Simply put, arbitrage means buying crypto apples for $5 on one market, swapping them for $5 worth of crypto bananas on another market, and then selling those bananas for $10 worth of crypto apples on yet another market.

It sounds simple enough until you start listing the modules you require to pull something like that off:

You need constant price feeds from every market (e.g. CEXs, DEXs, AMMs) for every token pair (e.g. BTC/USDT) that you’re interested in.
You need an algorithm to detect arbitrage opportunities (e.g. 1 apple → 3 bananas → 2 apples).
You need an order execution engine that instantly grabs any opportunity detected.
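To make the “1 apple → 3 bananas → 2 apples” example concrete, here is a minimal sketch of what the detection step boils down to: multiply the exchange rates along a cycle and check whether you end up with more than you started with. The token names, the rates and the `fee_per_hop` parameter are made up for illustration; this is not the algorithm we actually ran.

```python
from math import prod

# Hypothetical rates: rates[(a, b)] = units of b you receive for 1 unit of a.
rates = {
    ("APPLE", "BANANA"): 3.0,        # 1 apple  -> 3 bananas
    ("BANANA", "APPLE"): 2.0 / 3.0,  # 3 bananas -> 2 apples
}


def cycle_return(path, rates):
    """Multiply the rates along each consecutive hop of `path`."""
    return prod(rates[(a, b)] for a, b in zip(path, path[1:]))


def is_arbitrage(path, rates, fee_per_hop=0.0):
    """True if the cycle returns more than 1 unit after (assumed) flat fees."""
    hops = len(path) - 1
    net = cycle_return(path, rates) * (1.0 - fee_per_hop) ** hops
    return net > 1.0


path = ["APPLE", "BANANA", "APPLE"]
gross = cycle_return(path, rates)        # roughly 2 apples back per apple
profitable = is_arbitrage(path, rates, fee_per_hop=0.001)
```

A real detector has to search over many such cycles across markets, which is exactly why it needs a constant stream of fresh rates.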

In this post, we’re going to focus on the 1st module: Price Feeds.

Normally, getting the price feeds isn’t a huge issue. You have several products (Chainlink for Ethereum, Birdeye for Solana, etc.) that provide convenient REST APIs (mostly paid). But for something like this, you need constant streaming updates, and polling a REST API doesn’t cut it.

Naturally, you need an arrangement like this:

But that bloats up the “Algorithm” module. Not to mention, the execution engine is also going to need the current prices before it can execute any arbitrage. Prices may change in a split second, and you may end up executing a trade at a loss if you don’t verify before executing.
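As a rough illustration of that pre-execution check, the sketch below recomputes the return of a detected opportunity against the latest known rates and refuses to trade if it is no longer profitable. The `Opportunity` class, the `still_profitable` helper and all the numbers are hypothetical, not part of our codebase.

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    path: list               # e.g. ["APPLE", "BANANA", "APPLE"]
    expected_return: float   # return multiple computed at detection time


def still_profitable(opp, latest_rates, min_return=1.0):
    """Recompute the cycle return using the most recent rates."""
    current = 1.0
    for a, b in zip(opp.path, opp.path[1:]):
        rate = latest_rates.get((a, b))
        if rate is None:      # no fresh price for this hop: do not trade
            return False
        current *= rate
    return current > min_return


opp = Opportunity(path=["APPLE", "BANANA", "APPLE"], expected_return=2.0)

# Rates at detection time vs. a split second later (made-up numbers).
rates_then = {("APPLE", "BANANA"): 3.0, ("BANANA", "APPLE"): 0.66}
rates_now = {("APPLE", "BANANA"): 3.0, ("BANANA", "APPLE"): 0.30}

ok_then = still_profitable(opp, rates_then)  # 3.0 * 0.66 = 1.98 > 1
ok_now = still_profitable(opp, rates_now)    # 3.0 * 0.30 = 0.90 < 1
```

The point is that the execution engine needs its own view of the latest prices, which is awkward if the feeds are wired only into the Algorithm module.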

The (watered down) solution is:

What the hell is Redpanda?

Simply put, it’s a message broker that receives “Messages” from “Producers” and passes them onto “Consumers”.

In our case, the price feeds from the DEXs, AMMs, etc. are the Producers, and the Algorithm module is the Consumer.

The great thing about Redpanda is that it guards against message loss (it takes care of replicating data across its nodes) and it tracks how far each consumer group has read, so under normal operation a group does not process the same message twice (meaning an update is not applied twice).
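If the Producer/Consumer idea is new to you, this toy in-memory model may help. It is emphatically not Redpanda or any real client library: `ToyTopic` and its methods are invented here, and it ignores partitions, replication, networking and failure handling. It only shows the two ideas that matter for this post: messages are appended to a log, and each consumer group has its own read position.

```python
from collections import defaultdict


class ToyTopic:
    """Toy stand-in for a broker topic: an append-only log plus read positions."""

    def __init__(self):
        self.log = []                        # messages in arrival order
        self.next_offset = defaultdict(int)  # consumer group -> next offset to read

    def produce(self, message):
        """Append a message and return the offset it was stored at."""
        self.log.append(message)
        return len(self.log) - 1

    def poll(self, group, max_messages=10):
        """Return unread messages for `group` and advance its position."""
        start = self.next_offset[group]
        batch = self.log[start:start + max_messages]
        self.next_offset[group] = start + len(batch)
        return batch


prices = ToyTopic()

# Producers: price feeds pushing ticks (made-up values).
prices.produce({"pair": "BTC/USDT", "price": 23000.0})
prices.produce({"pair": "BTC/USDT", "price": 23010.5})

# Consumers: each group reads the same log independently.
algo_first = prices.poll("algorithm")    # both ticks
algo_second = prices.poll("algorithm")   # nothing new, so an empty list
exec_first = prices.poll("execution")    # both ticks again, for a different group
```

In a real deployment you would talk to the cluster through a Kafka-compatible client, and delivery guarantees depend on how the topic, producers and consumers are configured.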

But why can’t you just use Apache Kafka for that?

Good question. Redpanda *is* compatible with the Kafka API, so existing Kafka clients work with it. And it’s faster. Much faster.

Plugging a Redpanda cluster into the middle of the architecture also gave us an easy mental model of the overall system: the Price Feeds were now “Producers” and the Algorithm & Execution Engine were “Consumers”. That meant slight changes to the Algorithm could be tested against old data (which Redpanda stored for easy replay), so we could compare which tweaks worked without polluting the data pipeline or introducing any look-ahead bias.

Every message in Redpanda is tagged with an offset, which means you can start your replay from any desired point. This convenience in debugging was honestly a benefit we did not anticipate, but it saved us a lot of hours we would otherwise have spent trying to figure out what went wrong.
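Here is a small sketch of what replaying looks like in principle, using a plain Python list as a stand-in for the stored log (the list index plays the role of the offset). The `signals` function is a deliberately simple, hypothetical “algorithm” that flags a tick when the price moves more than a threshold relative to the previous tick; the prices are invented.

```python
def replay(log, from_offset=0):
    """Yield (offset, message) pairs starting at `from_offset`."""
    for offset in range(from_offset, len(log)):
        yield offset, log[offset]


def signals(log, threshold, from_offset=0):
    """Offsets where the price moved more than `threshold` vs. the previous tick.

    Only past ticks are consulted, so the replay has no look-ahead bias.
    """
    flagged = []
    previous = None
    for offset, price in replay(log, from_offset):
        if previous is not None and abs(price - previous) / previous > threshold:
            flagged.append(offset)
        previous = price
    return flagged


recorded_prices = [100.0, 101.0, 103.0, 102.0, 106.0]  # stand-in for stored ticks

loose = signals(recorded_prices, threshold=0.015)   # flags offsets 2 and 4
strict = signals(recorded_prices, threshold=0.03)   # flags offset 4 only
late = signals(recorded_prices, threshold=0.015, from_offset=3)  # replay from offset 3
```

Because both variants read the same recorded ticks in the same order, any difference in their output comes from the tweak itself and not from the data.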

In the next post, I’ll go into more detail about setting up the Consumers and Producers (of which we had hundreds).

Cross-posted from: https://muhammadrassam.substack.com/p/serving-deep-learning-models-for-7ee

Subscribe to the newsletter for weekly updates direct in your inbox: https://muhammadrassam.substack.com/

Written by Antematter
