Choosing a serverless database for a serverless app
When using services such as Next.js on Vercel, Cloudflare Workers, or even Lambda functions, you cannot use just any database. There are a few reasons why this is the case:
- Some of these environments are not full Node.js implementations, so you cannot use your regular database driver. They may be so limited that sometimes the only way to communicate with other services is fetch; you cannot create a TCP socket.
- These environments do not keep an instance of your application running all the time. Instead, they create instances of your app when a request comes in and destroy them when the request is done. Databases, by contrast, are typically designed to keep a persistent connection open with your application, for a couple of reasons:
  - Opening a connection to a database is not cheap. The underlying protocol may require multiple roundtrips just to authenticate and select the database you want to use. In a long-running application you simply keep the connection open, but in a serverless environment you cannot do that.
  - Not only is it not cheap, but the number of connections you can open is also limited.
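To make the connection cost concrete, here is a rough back-of-the-envelope sketch. The round-trip time and the number of handshake roundtrips are illustrative assumptions, not measurements of any particular database, and `RequestProfile`, `coldRequestMs` and `warmRequestMs` are hypothetical names.

```typescript
// All numbers below are illustrative assumptions, not measurements.
interface RequestProfile {
  rttMs: number;               // network round-trip time to the database
  handshakeRoundTrips: number; // roundtrips to connect, authenticate and select the database
  queryRoundTrips: number;     // roundtrips needed to run the request's queries
}

// Time spent talking to the database when a connection must be opened first
// (the typical situation in a freshly created serverless instance).
function coldRequestMs(p: RequestProfile): number {
  return p.rttMs * (p.handshakeRoundTrips + p.queryRoundTrips);
}

// Time spent when a connection is already open (a long-running server).
function warmRequestMs(p: RequestProfile): number {
  return p.rttMs * p.queryRoundTrips;
}

const profile: RequestProfile = { rttMs: 20, handshakeRoundTrips: 4, queryRoundTrips: 1 };
console.log(coldRequestMs(profile)); // 100
console.log(warmRequestMs(profile)); // 20
```

Under these assumptions the handshake dominates the request: most of the time goes into setting up a connection that is thrown away immediately afterwards.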
So, what are some options?
DynamoDB is a fully managed database offered by Amazon. It's promoted as a key-value database, but thanks to some of its capabilities it can be used for more complex data models than just storing key-value pairs. However, it is much harder to use than a relational database, and evolving the schema can be very painful.
Nevertheless, it has some very competitive benefits:
- When used correctly it can be very cheap and fast. Amazon promotes it as offering single-digit millisecond performance at any scale.
- Your application connects to it using HTTP, so it just works in any serverless environment.
- It is fully managed and has optional global replication.
If you are interested in using DynamoDB, I recommend reading this book, which is a great introduction to DynamoDB. It covers all you need to know about single-table design, which is the recommended way to model relational-like data in DynamoDB.
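As a rough illustration of the idea behind single-table design, the sketch below keeps several entity types in one collection and retrieves related items through a composite key. It is an in-memory stand-in, not DynamoDB client code: the `pk`/`sk` attribute names, the `USER#` and `ORDER#` prefixes and the `query` helper are assumptions chosen for the example.

```typescript
type TableItem = { pk: string; sk: string } & Record<string, unknown>;

// In-memory stand-in for a single table holding several entity types.
const table: TableItem[] = [
  { pk: "USER#1", sk: "PROFILE", name: "Ada" },
  { pk: "USER#1", sk: "ORDER#2023-01-05", total: 30 },
  { pk: "USER#1", sk: "ORDER#2023-02-11", total: 12 },
  { pk: "USER#2", sk: "PROFILE", name: "Linus" },
];

// Mimics a query on the partition key plus a "begins with" condition on the
// sort key, returning the matching items ordered by sort key.
function query(pk: string, skPrefix = ""): TableItem[] {
  return table
    .filter((item) => item.pk === pk && item.sk.startsWith(skPrefix))
    .sort((a, b) => (a.sk < b.sk ? -1 : a.sk > b.sk ? 1 : 0));
}

// All orders of user 1, fetched by key alone, without a join.
const orders = query("USER#1", "ORDER#");
console.log(orders.map((o) => o.sk));
```

The point of the layout is that the access pattern ("all orders of a user") is answered by the key alone. That is also why evolving the schema later can hurt: the keys encode the queries you planned for.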
PlanetScale is a database built on top of Vitess, a database clustering system almost fully compatible with MySQL that is used by companies like Slack and GitHub. It is a relational database, so it is much easier to use than DynamoDB. It is also fully managed and has optional global replication (with a feature called "Portals"). The only thing I know of so far that keeps it from being 100% compatible with MySQL is that it doesn't support foreign keys. So, in many cases it is not a drop-in replacement for MySQL. For example, if your application logic depends on a column being set to NULL when the row it references is deleted (ON DELETE SET NULL), you will have to change your application logic.
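Here is a minimal sketch of what moving that behavior into the application could look like. The `posts` and `authors` tables, the `author_id` column, and the `SqlStatement` and `deleteAuthorStatements` names are hypothetical; the function only builds the statements and leaves running them (ideally inside a single transaction) to whatever driver you use.

```typescript
interface SqlStatement {
  sql: string;
  args: unknown[];
}

// Without a foreign key constraint, the application has to null out the
// references itself before deleting the referenced row.
function deleteAuthorStatements(authorId: number): SqlStatement[] {
  return [
    { sql: "UPDATE posts SET author_id = NULL WHERE author_id = ?", args: [authorId] },
    { sql: "DELETE FROM authors WHERE id = ?", args: [authorId] },
  ];
}

for (const statement of deleteAuthorStatements(42)) {
  console.log(statement.sql, statement.args);
}
```

The order matters: nulling the references first means no reader ever sees a post pointing at an author that no longer exists.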
For serverless environments, PlanetScale offers a serverless driver that internally connects to PlanetScale using HTTP. With this driver you can execute raw SQL queries from a serverless environment. I'm not sure whether ORMs such as Prisma or TypeORM have added or will add support for this, but I know that there is at least a driver for Kysely: kysely-planetscale.
Another great thing about PlanetScale is that you don't have to worry about how many instances, or how much CPU or memory, you need for your databases. PlanetScale will automatically scale the infrastructure for you. You just pay for the number of rows read and written, and for storage.
Neon.tech is a serverless PostgreSQL service. It is fully managed and fully compatible with PostgreSQL. Their best selling point is that they have developed an alternative storage layer with copy-on-write capabilities that allows them to be very cost efficient and fast. This mechanism is also how they can branch your database (including the data, not just the schema) in a fast and reliable way. They also have a serverless driver that allows you to execute raw SQL queries from a serverless environment. The way this driver is implemented is different from the PlanetScale one: Neon.tech has implemented a drop-in replacement for pg that runs in serverless environments. pg is the package that Prisma and TypeORM, for example, use to connect to PostgreSQL, so any application using these ORMs can potentially migrate easily thanks to this replacement.
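To give an intuition for why copy-on-write makes branching cheap, here is a toy sketch. It is not how Neon's storage is actually implemented; the `PageLayer` and `Branch` classes and the page-level granularity are assumptions made purely for illustration.

```typescript
// A layer holds only the pages written to it and falls back to its parent.
class PageLayer {
  readonly pages = new Map<number, string>();
  constructor(readonly parent?: PageLayer) {}

  read(pageId: number): string | undefined {
    return this.pages.has(pageId) ? this.pages.get(pageId) : this.parent?.read(pageId);
  }
}

class Branch {
  private head: PageLayer;

  constructor(base?: PageLayer) {
    this.head = new PageLayer(base);
  }

  read(pageId: number): string | undefined {
    return this.head.read(pageId);
  }

  write(pageId: number, data: string): void {
    this.head.pages.set(pageId, data); // only this branch's top layer changes
  }

  // Branching copies no pages: the current layer is frozen and shared, and
  // both branches continue writing into fresh, empty layers on top of it.
  branch(): Branch {
    const frozen = this.head;
    this.head = new PageLayer(frozen);
    return new Branch(frozen);
  }
}

const main = new Branch();
main.write(1, "users v1");
const dev = main.branch();
dev.write(1, "users v2");  // visible only on dev
main.write(2, "orders v1"); // written after branching, so dev never sees it
console.log(main.read(1)); // "users v1"
console.log(dev.read(1));  // "users v2"
console.log(dev.read(2));  // undefined
```

In this sketch, creating a branch takes the same amount of work no matter how much data exists, which is the property that makes branching including the data practical.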
Neon.tech doesn't have public pricing yet, and they don't provide global replication yet, but the service is very promising.
(disclaimer: I worked at Xata)
Xata is a serverless database built on top of PostgreSQL and Elasticsearch. However, these services are not exposed. Instead, you use a TypeScript SDK or a REST API. This means Xata can be used in serverless environments natively. Xata is fully managed but you need to reserve capacity to scale.
Xata is the only database in this comparison that offers powerful search capabilities (thanks to the underlying Elasticsearch instances), but it doesn't offer all the data modeling functionality you may find in a relational database.
Supabase scales automatically but it doesn't support global replication.
Cloudflare D1 is fully managed and built on top of SQLite. It's a very promising offering because, if you are a Cloudflare Workers user, it will let you keep a globally distributed database as close as possible to your workers.
It is still in beta and very limited at the moment (e.g. databases are limited to 100 MB), and it's still unclear how well it will scale.
Cloudflare Durable Objects
Cloudflare Durable Objects can be a simple and efficient mechanism if you are using Cloudflare Workers and your use case is simple enough. It's not a relational database service, but it's an interesting way of storing data.
Prisma data proxy
Prisma data proxy provides connection pooling and avoids the connection cost of cold starts in serverless functions: you don't connect directly to the database, but to a proxy hosted by Prisma, which pools the connections and spares you the multiple roundtrips needed to establish one.
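The pooling idea itself is simple enough to sketch. The following is a minimal in-memory illustration of connection reuse, not Prisma's implementation; `DbConnection`, `ConnectionPool` and the `max` limit are hypothetical names, and a real pool would queue callers instead of throwing when it is exhausted.

```typescript
// Hypothetical connection type; `id` only exists to make reuse visible.
interface DbConnection {
  id: number;
}

class ConnectionPool {
  private idle: DbConnection[] = [];
  private opened = 0;

  constructor(private readonly max: number) {}

  // Reuse an idle connection if there is one; otherwise open a new one, up to `max`.
  acquire(): DbConnection {
    const existing = this.idle.pop();
    if (existing) return existing;
    if (this.opened >= this.max) throw new Error("pool exhausted");
    this.opened += 1;
    return { id: this.opened }; // stands in for the expensive handshake
  }

  // Hand the connection back so that the next request can reuse it.
  release(conn: DbConnection): void {
    this.idle.push(conn);
  }

  get openedCount(): number {
    return this.opened;
  }
}

// Simulates three sequential requests served through the same pool.
const pool = new ConnectionPool(2);
for (let i = 0; i < 3; i++) {
  const conn = pool.acquire();
  pool.release(conn);
}
console.log(pool.openedCount); // 1
```

Three requests end up sharing one connection, so the handshake is paid once by the long-lived proxy rather than once per short-lived function instance.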
There are a few choices, and picking one or the other will depend mainly on these factors:
- Whether your use case is simple or you prefer to use a relational database.
- Whether you are a Cloudflare Workers user.
- Whether you need global replication.
- Whether you need powerful search capabilities.
- Whether you need to scale automatically.
- Whether you have an existing app and need compatibility, or you are starting from scratch.