Using Cloudflare Workers With SQL Databases
Sam Aybar
Jul 04, 2023Overview
Cloudflare Workers are serverless functions that run on Cloudflare’s edge network. They can be used to intercept and modify requests before they reach your origin server. This can be used for a variety of purposes, such as caching, security, and performance optimization.
Historically, the ability to connect SQL databases to Cloudflare Workers was limited due to lack of support for TCP connections from a Worker. However, with the recent introduction of the TCP Sockets API and database integrations, Cloudflare has now enabled several options.
In this article, I am going to explore several of the ways that Cloudflare Workers can connect to SQL databases, specifically Postgres. The goal here is to not compare databases or vendors as such, but help developers understand the different options for connecting a Postgres database to Cloudflare Workers, along with highlighting performance nuances for consideration. I will provide some basic performance results and I will also (shamelessly!) compare PolyScale Serverless for database access, pooling and caching at the edge.
With that context, specifically I will focus on:
- TCP Connections with PG
- Supabase-JS
- Neon Serverless
- PolyScale Serverless
Connection Library | Databases Supported | Connection Method |
---|---|---|
node-postgres | All Postgres databases | Workers Socket API |
Supabase-JS | Supabase (Postgres) | HTTP Fetch |
Neon Serverless | Neon (Postgres) | WebSockets |
PolyScale Serverless | Postgres, MySQL, MariaDB and MS SQL Server databases | HTTP Fetch |
Connection Options
Until recently, Cloudflare workers could only make outbound connections using the HTTP Fetch API. As noted, with the introduction of the TCP Sockets API, support for direct TCP connections is available. In addition, a number of different database providers have developed HTTP-based or WebSocket-based drivers, allowing non-TCP connections to SQL databases. Let’s break down the Postgres options further:
TCP Connections with PG
Cloudflare Workers can now connect directly to a database using the PG library. This allows for a TCP connection to any Postgres database directly from a CloudFlare Worker.
Supabase-JS
The Supabase-JS client is a JavaScript client which allows for HTTP based connectivity between an application and a Supabase Postgres instance. Every Supabase instance is running PostgREST, which provides an HTTP API to access the database. The Supabase-JS client provides HTTP access to the underlying Supabase Postgres instance.
Neon Serverless
Neon Serverless is a library built to enable connectivity to a database using Websockets instead of TCP. It is designed as a drop-in replacement for node-postgres, so can be used in place of direct TCP connections from Cloudflare Workers. While in theory it could be used with databases other than a Neon Postgres, it needs to be used in conjunction with a proxy component that makes the translation from Websocket to TCP.
PolyScale Serverless
PolyScale.ai is a serverless, plug-and-play database cache. It dramatically accelerates database read performance, lowers network latency and reduces database infrastructure costs. PolyScale is plug-and-play requiring no code to implement, and zero configuration or tuning. It can be implemented in minutes with a simple configuration change. PolyScale offers a global edge network SaaS platform as well as a self hosted option.
PolyScale can be used in conjunction with any Postgres, MySQL, MariaDB or MS SQL Server (more databases coming soon) with direct TCP connections. PolyScale also supports a lightweight HTTP API that processes SQL queries over HTTP. The HTTP requests are serviced by the closest PolyScale Point of Presence (PoP). Here, a native TCP connection is made to the database. This provides a high-performance, HTTP to TCP translation along with connection pooling and of course, intelligent caching.
Cloudflare Worker Database Considerations
There are several, significant factors to take into consideration with Cloudflare Workers and databases.
Database Location
When connecting to databases from non-serverless environments, the traditional best-practice would be to locate the database as close as possible to the application tier. This way, latency between the application and the database is minimized. With Cloudflare Workers, because the Workers can run in so many different locations, it means it is impossible to have the database always near to the application making the request. A few different database location strategies can be taken:
- Locate the database closest to the location where the most requests come from
- Locate the database so as to minimize the request-weighted distance between the CloudFlare Worker locations and the database
Cloudflare Worker Location
There are additional nuances as well. Generally speaking, Cloudflare Workers are designed to execute closest to the location where they are called. However, the creation of the TCP connection to the database often requires multiple round trips to authenticate and retrieve data, as compared to an http connection wherein the single fetch retrieves the data. As a result, it may make sense to execute the Worker closest to the database, rather than the one closest to the user, so as to minimize the distance of the round trips to establish the database connection.
Cloudflare illustrate this well in the diagrams below: making one short HTTP roundtrip between the user and the worker and three roundtrips between the worker and the database, is best served with a Worker closer to the database rather than the user.
To address a pattern like this, Cloudflare has introduced Smart Placement, which dynamically determines the optimal location from which to execute the Worker. However, at time of publication, it appears to be HTTP based (requests being made between a Worker and an endpoint), rather than TCP. Thus, at present, for TCP connections, Smart Placement is not relevant.
Caching and Cloudflare Workers
Caching database queries makes a lot of sense when working with Cloudflare Workers. Because the code can be executed in so many different locations - many of which may be far from the database - providing cached responses can allow for improved performance. However, as with all manual caching solutions, this means developers must determine what queries to cache and how and when to invalidate.
In the case of Cloudflare Workers, the “what queries to cache” question is easier in the sense that a typical Worker is likely running a single query (microservice). Cloudflare offers a Key Value store that is often used in conjunction with Cloudflare Workers. See How KV works.
However, that said, dealing with cache freshness and eviction remains complex. Developers are forced to decide how long to cache responses for and when to refresh the values from the database. And while the KV Store will propagate changes, it can take over a minute for Workers to get the updated value, though recent changes by Cloudflare may reduce this latency.
Connection Pooling
Because Cloudflare Workers are short-lived processes that create individual connections to a database, and can be executed from multiple locations simultaneously, (and have multiple instances executing from the same location i.e. concurrency), the complexity of managing connections to the database is an additional consideration.
With traditional applications, client side connection pooling is a good strategy for decreasing connection times. The client can warm a number of connections and then just hand an open connection to a process as needed. This saves connection initiation time and allows for a fixed, manageable number of connections to the database. However, since Cloudflare Workers are ephemeral and run independently, there is no ability to create a reusable client side pool.
Server side connection pooling is another way to manage the many connections from Cloudflare Workers to databases. Solutions like PG Bouncer allow many inbound connections from Workers to funnel into a smaller number of connections to the origin database, ensuring that the database isn’t overwhelmed. However, with PG Bouncer, the Workers do still need to create a new connection - in this case to PG Bouncer.
Connecting to Postgres from Cloudflare Workers
Having discussed some of the challenges and complexities associated with Cloudflare Workers and SQL database use, let’s now take a more in depth look at how to connect to a database via the four previously outlined methods:
- Direct TCP Connection with PG
- Supabase-JS
- Neon Serverless
- PolyScale Serverless
Option 1: TCP Connections with PG
Using PG to connect to a database via TCP from a Cloudflare Worker is a straightforward way to connect any Postgres database to a Cloudflare Worker. With the introduction of connect(), Cloudflare’s new API for making TCP sockets from Cloudflare Workers, Postgres databases can be accessed from a Worker with the following code:
import { Client } from "pg";
export interface Env {
DB: string;
}
export default {
async fetch(
request: Request,
env: Env,
ctx: ExecutionContext
): Promise<Response> {
const client = new Client(env.DB);
await client.connect();
const result = await client.query({
text: "SELECT * from todos",
});
console.log(JSON.stringify(result.rows));
const resp = Response.json(result.rows);
// Close the database connection, but don't block returning the response
ctx.waitUntil(client.end());
return resp;
},
};
Note that, when running locally, you must use the --experimental-local
flag to ensure local behavior matches production behavior.
Using an environment variable of DB
which has the postgres connection string for your database (eg. postgres://username:password@hostname:5432/databasename
), this code will connect your Cloudflare Worker to your database.
Benefits:
- Can connect any Postgres database to Cloudflare Workers
- Doesn’t require any additional services
- Supports transactions (within single Worker execution)
Negatives:
- Geographic latency
- Connectivity latency - every execution requires new connection to the database
Option 2: Connecting via Supabase-JS
Many Supabase users are familiar with the Supabase JS client, which provides an HTTP interface to their Supabase instance. Since Cloudflare Workers have native support for fetch, the Supabase-JS client can easily be used for database access. Example:
import { createClient } from "@supabase/supabase-js";
export default {
async fetch(request, env) {
const supabase = createClient(env.SUPABASE_URL, env.SUPABASE_KEY);
const { data, error } = await supabase.from("todos").select("*");
if (error) throw error;
return new Response(JSON.stringify(data), {
headers: {
"Content-Type": "application/json",
},
});
},
};
The environment variables of SUPABASE_URL
and SUPABASE_KEY
identify your Supabase instance and your credentials to connect via the built-in PostgREST instance.
Benefits:
- If you already use Supabase, you may already be familiar with the Supabase-JS library
- Automatically handles database connection limits
- Supports connection pooling using PgBouncer
Negatives:
- Geographic latency
- Only works with Supabase
- Doesn’t use native SQL for queries (a benefit for some?!)
Option 3: Connecting via Neon Serverless
Neon has taken an innovative approach to providing access to their Postgres database, by using a Websocket connection. Neon Serverless is designed as a drop-in replacement for node-postgres, so can be used in place of direct TCP connections from Cloudflare Workers. Neon can be accessed from Cloudflare Workers with the following code:
import { Client } from "@neondatabase/serverless";
interface Env {
DATABASE_URL: string;
}
export default {
async fetch(request: Request, env: Env, ctx: ExecutionContext) {
const client = new Client(env.DATABASE_URL);
await client.connect();
const result = await client.query("select * from todos;");
const resp = Response.json(result.rows);
ctx.waitUntil(client.end()); // this doesn't hold up the response
return resp;
},
};
Using a DATABASE_URL
environment variable like postgresql://username:password@hostname:5432/databasename
will provide access to your Neon database.
You can read more about Neon Serverless here
Benefits:
- Drop in replacement for PG library, providing native use of SQL queries
- Can be used with any Postgres instance if you set up a Websocket-to-TCP Proxy
- Supports transactions (within single Worker execution)
- Supports connection pooling using PgBouncer
Negatives:
- Geographic latency
- At time of writing, Neon Websocket-to-TCP proxy only runs in Frankfurt, which means higher latency for non-European access. No doubt this is a roadmap item for further expansion.
- Need to run your own Websocket-to-TCP proxy if you want to connect different Postgres database than Neon
Option 4: Connecting via PolyScale Serverless
As noted earlier, PolyScale Serverless is a lightweight API that supports SQL queries over HTTP. The HTTP requests are processed (SQL extracted) and proxied through to the database at the closest PolyScale PoP. This uses native database TCP connectivity from the PoP to the database. When used in conjunction with Cloudflare Workers, firstly, this has the benefit of providing connection pooling at the PolyScale PoP. This dramatically lowers the latency for establishing new connections as those are initiated between the Worker and the PolyScale PoP, which will be warm and geographically close.
Secondly, PolyScale will automatically cache any cacheable queries at the closest PoP, reducing latency without needing to write code or make decisions about cache time to live and invalidation.
PolyScale can be used not only with any Postgres database, but also with MySQL, MariaDB, and MS SQL Server, with others coming soon. To use PolyScale, first create a free PolyScale account, and then use your database hostname to create a PolyScale cache.
You can then connect to PolyScale with the following code:
import { Client } from "@polyscale/serverless-js";
const polyscale = new Client("https://serverless.aws.polyscale.global", {
cacheId: "CACHE_ID",
username: "USERNAME",
password: "PASSWORD",
database: "DATABASE",
provider: "postgres", //mysql, postgres, mariadb or mssql
});
export default {
async fetch(): Promise<Response> {
const result = await polyscale.query("SELECT * from todos;");
return new Response(JSON.stringify(result));
},
};
With PolyScale Serverless, you can connect to your database with an automatic low latency cache.
Benefits:
- Low global latency for cacheable queries
- Connection pooling for low latency new connections and massive concurrency
- Native use of SQL for queries
- Can connect any Postgres, MySQL, MariaDB or SQL Server database to Cloudflare Workers
Negatives:
- Requires PolyScale service
- Doesn’t support transactions (yet!)
Performance Comparison
To compare the performance of these four different connection methods, we set up the following instances of Postgres databases running in Virginia:
- AWS RDS Postgres
- Neon Postgres
- Supabase Postgres
We then connected to these databases from Cloudflare Workers using the four methods described above. (The PolyScale Serverless connection was made to the AWS RDS instance.) These Workers were then triggered 10 times each from the Virginia area and from Mumbai, India. We then compared the median performance for each method for:
- A simple SELECT query
- A more complex SELECT query (average database execution time of 500ms)
- A database write
The chart below compares queries and their execution times from the Virginia area.
For simple reads and writes, PolyScale and Direct TCP connections have similar latency; however, for complex queries, the built in cache from PolyScale allows PolyScale to deliver much lower latency, since the database processing time is eliminated for cache hits.
As we go further from the database, both Supabase-JS and Neon Serverless show better performance than a direct TCP connection. However, the benefit of PolyScale is even greater. While all of the other methods have the added latency of needing to move the data from Virginia to India, for cache hits, PolyScale achieves low latency from the nearby PoP. Perhaps surprisingly, PolyScale also delivers much lower write speeds. This is because PolyScale maintains an open connection between its PoP and the database, which results in a much lower connection time as well. While writes are slower than reads (because they do necessitate a trip to the origin database), they can still be much faster than the other methods due to the possibility of leveraging an existing connection.
Conclusion
For a long time, it had been difficult to use Cloudflare Workers with SQL databases due to the lack of TCP support. Database vendors have evolved client libraries utilizing HTTP and with the introduction of the TCP Sockets API, further connectivity options are now available.
The purpose of this article was not to compare databases, vendors or performance, but to illustrate the different methods currently available for connecting Postgres databases to Cloudflare Workers. Of course these methods extend well beyond Postgres and the specific chosen vendors - Neon and Supabase (for example Planetscale’s serverless JS driver), however both highlight different approaches.
Use of PG library enables Direct TCP connections to query any Postgres database. Supabase and Neon each offer different options for connecting to their respective databases. Finally, PolyScale’s Serverless API offers an access method using HTTP (and TCP), supporting connection pooling and automated caching across multiple databases.
Interested in trying PolyScale? Sign up for your own free PolyScale account here (no credit card required). Connect your database to a cache and start delivering your data faster, either locally or around the globe.