Supercharging Postgres on Neon with PolyScale
Sam Aybar
Jan 12, 2023
Overview
Neon is a relatively new serverless Postgres database that is gaining popularity due to its simplicity, high performance, the promise of serverless pricing, and its innovative features. Between November 2022 and January 2023, the number of Neon databases more than doubled, from five thousand to over ten thousand. In this article, we describe how to create a Neon database and then show how to make it faster with PolyScale.
PolyScale.ai is a serverless, plug-and-play database cache. It dramatically accelerates database read performance, lowers network latency and reduces database infrastructure costs. PolyScale requires no code changes and no cache configuration or tuning. It can be implemented in minutes by updating a connection string. PolyScale offers a global edge network SaaS platform as well as a self-hosted option.
PolyScale is Postgres wire protocol compatible, so connecting any database is easy – you can see how here. But read on to learn more specifically about working with Neon and PolyScale.
What is Neon
Neon is a fully managed serverless Postgres database. The product is still in technical preview, but Neon describes itself as offering “a generous free tier.” Differentiating features that Neon provides include:
- Separation of storage and compute
- Database branching
- Bottomless storage
Neon is designed to be easy to use and get up and running quickly. This makes it an attractive choice for startups, as well as for developers who want a quick and easy way to store and retrieve data with Postgres.
Separation of Storage and Compute
Separating storage and compute for a database refers to the practice of using separate physical or virtual machines for the storage and processing of data. In a traditional database setup, both storage and compute functions are typically handled by a single machine or cluster of machines. By separating these functions, it is possible to scale the storage and compute resources of a database independently, allowing for more flexibility and better resource utilization.
With Neon, storage is optimized so that cold data can be stored in less expensive locations, such as Amazon S3, while being able to provide rapid access to frequently used/recent data.
There are several benefits to separating storage and compute for a database, such as:
- Improved scalability: By separating storage and compute, it is possible to scale each component independently, allowing you to add more storage or processing power as needed. This can be especially useful in environments where the workloads are highly variable or where there is a need to handle sudden spikes in traffic.
- Better resource utilization: Separating storage and compute also improves resource utilization, as it allows you to allocate resources more efficiently. For example, if you have a high-capacity storage system but relatively low computing needs, you can use a smaller, less expensive compute cluster to process the data, rather than overprovisioning compute resources.
- Improved availability: Separating storage and compute increases the availability of your database, as it allows you to deploy redundant compute clusters that can take over if there are issues with the primary cluster.
However, there are also some tradeoffs to consider when separating storage and compute for a database. One potential drawback is that a system with separated storage and compute can be more complex to set up and manage, as there are multiple components to maintain. Because Neon is fully managed, however, this complexity is hidden from the database user and is not a concern.
Database Branching
Database branching refers to the ability to create a new branch of a database from a specific point in the database’s history. This is done using a feature called “physical replication”.
One benefit of database branching is the ability to perform testing and debugging on a separate branch of the database, without affecting the production database. This allows developers to make changes and run tests without risking data loss or corruption in the production database.
Database branching can also be useful for creating development or staging environments, where new features can be tested before being deployed to the production database.
Overall, database branching provides a way to isolate changes to a database, allowing for more controlled and safe development and testing processes.
Neon offers easy-to-use branching: you can instantly create a branch of your database that captures its current state and users, and then create an endpoint to access that branch. This is very useful for testing and development, as you can work with real data without spending time generating test data or risking corruption of production data.
Getting Started with Neon
You can quickly get started with Neon by navigating to https://console.neon.tech and then signing in with a GitHub or Google account. Once signed in, you can create your first project.
When creating a project, you need to select the region in which to host your database. At present, Neon has four regions, all hosted on AWS:
- US East (Ohio): aws-us-east-2
- US West (Oregon): aws-us-west-2
- Europe (Frankfurt): aws-eu-central-1
- Asia Pacific (Singapore): aws-ap-southeast-1
In general, you will want to choose the region closest to your primary application server to minimize latency between your database and your application. (But more on this later!)
When you create your project, Neon will provide you with an env file that includes connection details for the database.
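The individual connection details (host, port, database name) are needed later when setting up a cache, so it can help to split the connection URL into its parts. The following is a minimal sketch, assuming the env file exposes a standard Postgres connection URL; the helper name `connection_parts` is hypothetical, and the example URL simply reuses the sample string shown later in this article.

```python
from urllib.parse import urlsplit


def connection_parts(database_url: str) -> dict:
    """Split a Postgres connection URL into its components."""
    parts = urlsplit(database_url)
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 5432,  # fall back to the Postgres default port
        "database": parts.path.lstrip("/"),
    }


# Sample URL in the same shape as the one used later in this article.
example = (
    "postgres://postgres:zqGHFAbPLvVCKw@"
    "ep-black-flask-024570.ap-southeast-1.aws.neon.tech:5432/main"
)
details = connection_parts(example)
print(details["host"], details["port"], details["database"])
```

In a real application the URL would typically be read from an environment variable rather than hard-coded.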
Branching with Neon
As noted previously, one of the great features of Neon is the easy ability to create branches from your database. With the click of a button you can create a copy-on-write clone of your data that is independent of the original branch. Once created, changes to the branch are not impacted by changes to the original branch and vice versa. Resources are separate, so load on one branch does not impact load on another branch.
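To make the copy-on-write idea concrete, here is a toy sketch in Python. It is not how Neon's storage engine is implemented; the `Branch` class and its methods are hypothetical and only illustrate the behavior described above: creating a branch copies no data, and later writes on either side stay invisible to the other.

```python
class Branch:
    """Toy key-value 'database' with copy-on-write branching."""

    def __init__(self, layers=()):
        self._layers = layers  # shared, never-mutated snapshots (newest first)
        self._delta = {}       # writes that are private to this branch

    def get(self, key):
        if key in self._delta:
            return self._delta[key]
        for layer in self._layers:
            if key in layer:
                return layer[key]
        raise KeyError(key)

    def put(self, key, value):
        self._delta[key] = value

    def branch(self):
        # Freeze the current writes into a shared layer; no data is copied.
        self._layers = (self._delta,) + self._layers
        self._delta = {}
        return Branch(self._layers)


main = Branch()
main.put("plan", "free")

dev = main.branch()                    # instant: shares the existing layer
dev.put("plan", "pro")                 # visible only on the dev branch
main.put("region", "aws-us-east-2")    # visible only on the main branch

print(main.get("plan"), dev.get("plan"))  # prints "free pro"
```

Both branches read shared history from the frozen layers, while each keeps its own private set of changes.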
Branches are based in the same location as the original branch, so branching does not help address geographic latency if your application runs far from the database. Branching also does not reduce execution time for time-consuming queries (beyond the fact that load on one branch does not impact load on another branch).
When creating a branch, you can choose to create an endpoint, which allows you to work with the branch just as you did the original database. For example, if you want to use a branch in staging, you would just replace the original (production) endpoint with the branch endpoint.
Enhancing Neon with PolyScale
While Neon has many great features that make it an excellent choice as a database, like most database solutions, there are two areas where improvements can be made: latency for applications running far from the database, and execution time for expensive queries. As indicated above, it is best practice to locate your database close to your application. But if you have multiple locations that host your application (multi-region deployments), this becomes impossible. Many databases (though not yet Neon) offer read replicas that allow for geographic distribution, but this can introduce additional expense and complexity. Alternatively, you can write caching logic in your application, but this requires development time, and it is difficult to determine which data to cache and for how long.
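For a sense of what hand-written caching logic involves, here is a minimal sketch of a time-to-live (TTL) cache in Python. The class name, the `run_query` callback and the 60-second TTL are all illustrative assumptions; choosing that TTL per query is exactly the kind of decision that makes this approach hard to get right.

```python
import time


class TTLCache:
    """Hand-rolled query cache: each entry expires after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # sql text -> (expires_at, rows)

    def fetch(self, sql, run_query):
        now = self.clock()
        hit = self._entries.get(sql)
        if hit is not None and hit[0] > now:
            return hit[1]          # still fresh: skip the database
        rows = run_query(sql)      # miss or expired: go to the database
        self._entries[sql] = (now + self.ttl, rows)
        return rows


# Stand-in for a real database call, so the sketch runs on its own.
calls = []


def fake_query(sql):
    calls.append(sql)
    return [("row for", sql)]


cache = TTLCache(ttl_seconds=60)
cache.fetch("SELECT 1", fake_query)
cache.fetch("SELECT 1", fake_query)  # served from the cache
print(len(calls))  # prints 1
```

Note that a TTL alone can serve stale rows for up to the full TTL after a write, which is one reason hand-rolled caches tend to grow more complex over time.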
This is where PolyScale can supercharge Neon. PolyScale is a serverless, plug-and-play database cache. It dramatically accelerates database read performance, lowers network latency and reduces database infrastructure costs. PolyScale requires no code to implement, and zero configuration and tuning. It can be implemented in minutes with a simple configuration change.
By creating a PolyScale cache for your Neon database, in literally a matter of minutes you can have a global cache that will allow you to retrieve data from close to your applications, wherever they are located, as well as provide lightning fast execution even for complex queries.
Connecting PolyScale
Creating a PolyScale cache can be done either through the PolyScale UI or via API. Below we describe how to do so via the website.
Step 0: Create a PolyScale account
If you do not already have a PolyScale account, you can create an account here. PolyScale offers a free tier and no credit card is required.
Step 1: Create your PolyScale Cache
- In your PolyScale account, click on the New Cache button in the top-right corner
- Give the cache a Name
- Select PostgreSQL for the Type
- Enter the Host from your Neon env file (this can also be found on the Endpoints tab of Neon)
- Enter 5432 for the Port
- Click Create
Your cache is now created. PolyScale automatically checks to see that your database is accessible from all our global regions and alerts you if there are any connectivity issues. By default, Neon databases are accessible from all IP addresses, so there should be no issue.
Step 2: Connect to your PolyScale Cache
Using your PolyScale cache is simple — instead of connecting to your database directly, you’ll update your original connection string with the PolyScale credentials.
For example, if your original connection string was: postgres://postgres:zqGHFAbPLvVCKw@ep-black-flask-024570.ap-southeast-1.aws.neon.tech:5432/main
Your PolyScale connection string would be: postgres://postgres:zqGHFAbPLvVCKw@psedge.global:5432/main?application_name=a645cb93-fa53-46b2-9d6c-227e357e5bfb
Note that the hostname has been updated (psedge.global) to route traffic via the PolyScale edge network, and an application_name (in this case, a645cb93-fa53-46b2-9d6c-227e357e5bfb) has been appended to authenticate with PolyScale.
Keep this string for the next step.
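The rewrite described above is mechanical, so it can be scripted. The sketch below uses only Python's standard library; the function name `to_polyscale_url` is hypothetical, and it assumes the only changes needed are the hostname swap and the added application_name parameter, as in the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

POLYSCALE_HOST = "psedge.global"  # edge hostname from the example above


def to_polyscale_url(database_url: str, cache_id: str) -> str:
    """Swap the host for the PolyScale edge and set application_name."""
    parts = urlsplit(database_url)
    # Keep credentials and port as they are; replace only the hostname.
    userinfo, at, _ = parts.netloc.rpartition("@")
    port = f":{parts.port}" if parts.port else ""
    netloc = f"{userinfo}{at}{POLYSCALE_HOST}{port}"
    # Preserve any other query parameters, replacing application_name.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "application_name"]
    query.append(("application_name", cache_id))
    return urlunsplit(
        (parts.scheme, netloc, parts.path, urlencode(query), parts.fragment)
    )


neon_url = (
    "postgres://postgres:zqGHFAbPLvVCKw@"
    "ep-black-flask-024570.ap-southeast-1.aws.neon.tech:5432/main"
)
cache_id = "a645cb93-fa53-46b2-9d6c-227e357e5bfb"
print(to_polyscale_url(neon_url, cache_id))
```

Run against the sample values, this produces the same PolyScale connection string as the one shown in the example.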
Step 3: Update your application DATABASE_URL
In whichever application you are using with Neon, simply replace your direct connection string with the PolyScale connection string.
Now all requests will be routed through PolyScale, and PolyScale will automatically cache responses. If your database writes are routed through PolyScale, PolyScale will intelligently clear the cache when it spots updates.
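The general idea of clearing cached reads when a write passes through the same proxy can be sketched in a few lines. This is a deliberately naive toy and not PolyScale's actual algorithm: the class, the regular expression and the table-name matching are all assumptions made for illustration, and the `first_name` column is just sample data.

```python
import re


class InvalidatingCache:
    """Toy proxy: caches SELECTs and evicts them when a write hits their table."""

    WRITE = re.compile(
        r"^\s*(?:INSERT\s+INTO|UPDATE|DELETE\s+FROM)\s+(\w+)", re.IGNORECASE
    )

    def __init__(self, run_query):
        self.run_query = run_query
        self._reads = {}  # sql text -> cached result

    def execute(self, sql):
        write = self.WRITE.match(sql)
        if write:
            table = write.group(1).lower()
            # Naive invalidation: evict every cached read that mentions the table.
            pattern = rf"\b{re.escape(table)}\b"
            for cached in [q for q in self._reads if re.search(pattern, q.lower())]:
                del self._reads[cached]
            return self.run_query(sql)
        if not sql.lstrip().upper().startswith("SELECT"):
            return self.run_query(sql)  # anything else is passed straight through
        if sql not in self._reads:
            self._reads[sql] = self.run_query(sql)
        return self._reads[sql]


# Stand-in for the real database, recording every statement it receives.
sent = []


def fake_db(sql):
    sent.append(sql)
    return len(sent)


proxy = InvalidatingCache(fake_db)
read = "SELECT * FROM employees WHERE emp_no = 10001"
proxy.execute(read)  # miss: goes to the database
proxy.execute(read)  # hit: served from the cache
proxy.execute("UPDATE employees SET first_name = 'A' WHERE emp_no = 10001")
proxy.execute(read)  # miss again: the write evicted the cached read
print(len(sent))  # prints 3
```

The sketch also shows why writes need to flow through the cache for this to work: a write sent directly to the database would never trigger the eviction.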
That’s it. You now have a Neon database with a global cache, no code necessary.
Benefits of PolyScale in Action
To illustrate the benefits of PolyScale with Neon, I created a Neon database hosted in Singapore, and then ran queries both from Singapore and Virginia.
I ran two queries:
Simple:
SELECT * FROM employees WHERE emp_no = 10001;
Complex:
SELECT
  COUNT(*) AS employ_count,
  MAX(salary) AS max_salary,
  MIN(salary) AS min_salary
FROM
  employees e
  INNER JOIN dept_emp de ON e.emp_no = de.emp_no
  INNER JOIN departments d ON de.dept_no = d.dept_no
  INNER JOIN salaries s ON s.emp_no = e.emp_no
WHERE
  d.dept_name = 'Sales';
I first ran the queries from a location in Singapore close to the database, both directly against the database and via PolyScale.
Query Times from Singapore (time to retrieve data)
Query | Direct | PolyScale |
---|---|---|
Simple | 2ms | 2ms |
Complex | 212ms | 2ms |
Note that for the simple query – where execution time is inconsequential – response times direct to Neon and via PolyScale were similar. In both cases, the time to retrieve data was simply a function of the network round trip from the application to the database (or to the cache, in PolyScale’s case).
However, for the more complex query, even when located so close to the database, making the query via PolyScale is dramatically faster, since the compute time required for the query is eliminated.
I then ran these same queries from an application in Virginia.
Query Times from Virginia (time to retrieve data)
Query | Direct | PolyScale |
---|---|---|
Simple | 232ms | 1ms |
Complex | 381ms | 1ms |
Here you can see that both queries via PolyScale are dramatically faster than the direct connections to Neon. And regardless of location, queries via PolyScale served out of cache can be equally fast around the globe (assuming equal proximity to PolyScale’s network).
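The article does not describe the tooling behind these measurements, but a comparison of this kind can be approximated with a small timing harness. The sketch below is one possible approach, not the method used for the numbers above; `median_latency_ms` is a hypothetical helper, and `fake_query` stands in for a function that runs one query over an already-open connection.

```python
import statistics
import time


def median_latency_ms(run_once, repeats=20):
    """Call run_once() repeatedly and return the median wall-clock time in ms."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload so the sketch runs without a database.
def fake_query():
    time.sleep(0.002)


print(f"{median_latency_ms(fake_query, repeats=5):.1f} ms")
```

Using the median rather than the mean keeps a single slow request, such as the first uncached one, from dominating the result.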
PolyScale with Neon Branching
PolyScale works easily with Neon branching as well. Simply create a new cache for the branch. Then, rather than replacing the host in the application (as you would when moving from the main branch to a new branch without PolyScale), you replace the application_name parameter with the new PolyScale cache ID.
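As a small illustration of that switch, the sketch below changes only the application_name parameter and leaves the host untouched. The helper name `with_cache_id` is hypothetical, and the branch cache ID is an obvious placeholder rather than a real value.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cache_id(polyscale_url: str, cache_id: str) -> str:
    """Return the same PolyScale URL with application_name set to cache_id."""
    parts = urlsplit(polyscale_url)
    params = dict(parse_qsl(parts.query))
    params["application_name"] = cache_id
    return urlunsplit(parts._replace(query=urlencode(params)))


main_url = (
    "postgres://postgres:zqGHFAbPLvVCKw@psedge.global:5432/main"
    "?application_name=a645cb93-fa53-46b2-9d6c-227e357e5bfb"
)
# Placeholder only: use the ID of the cache created for the branch.
BRANCH_CACHE_ID = "00000000-0000-0000-0000-000000000000"

branch_url = with_cache_id(main_url, BRANCH_CACHE_ID)
print(branch_url)
```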
Summary
Neon is a powerful, easy-to-use database option that provides bottomless storage, a separation of storage and compute, and the ability to create instant branches. Combined with PolyScale, which provides code-free global caching, you can have a global database within minutes that will scale to your needs while providing low-latency response times.