A First Look at Neon: A Postgres Database That Branches

Have you tried this NewSQL Engine?

Relational databases have a long, long history. The first appeared in the 1970s, and while the technology has certainly evolved, the relational model has remained the most popular over the last five decades.

Is there room for innovation after 50 years of history? The folks at Neon are showing us that one can, in fact, teach an old dog new tricks.

Neon is an open-source (Apache 2.0) alternative to AWS Aurora or Google's Cloud SQL for Postgres. Neon is a serverless, scalable implementation of PostgreSQL that you can run on-premises or try through its managed service.

Neon decomposes the PostgreSQL architecture into two layers: compute and storage. The compute layer consists of stateless PostgreSQL running on Kubernetes, allowing pods to be scaled on demand — even to zero.

Persistence is achieved with the Neon storage engine, a custom-made layer that handles both transactions and data. The transaction log is processed through a set of redundant safekeeper services, while data pages are managed on disk by the pageserver.

An architecture diagram. Many compute nodes are running as pods in a Kubernetes cluster. The nodes are part of the compute plane. The plane connects with the storage plane in two ways: first, compute nodes connect to a pageserver. The pageserver then stores the data in the storage backend. Second, compute nodes communicate with safekeeper nodes, which deal with the transaction log (WAL) stream. Neon architecture splits the database into scalable compute and storage planes. Compute nodes can be started and stopped depending on demand.

Currently, the managed service is running a free tech preview with some limitations that we'll discuss later.

Neon was launched in June 2021. Being a new project, the managed service may have fewer features than the competition. But Neon has one feature that, to my knowledge, no one else has: branches.

Every developer is familiar with branches. In Neon, branches work pretty much the same as in Git, except they cannot be merged (although there are plans to add schema-based merging in the future). You can, at any point, branch off the main trunk, effectively creating an “alternate timeline”.

Since branches in Neon are writable, this feature allows us to do things no other database engine can do. For instance:

  • Freely experiment without impacting the main branch.
  • Instantly back up the database. So, if data is lost by mistake, we can switch to the last good branch.
  • Simplify integration testing. Developers can run tests in disposable test-specific branches.
  • Safely try out automated database migrations on production.
  • Run analytics or machine learning workloads in isolation.
  • Instantly duplicate all the databases that serve a specific cluster of microservices.

You can't do any of these things on traditional database engines. Not easily at least. Some database engines like SQL Server have snapshots, which indeed can create instant copies of a database. But snapshots are read-only and this limits their utility. On most database engines, we have to resort to clunkier mechanisms like backup and restore or replication.

The diagram shows 4 branches. Three branches split from the main one. They are called 'test A', 'test B', and 'test C'. The branches have the contents of the database at the branching point. From that moment on, they follow their own timelines and can differ from each other and the main branch. A few use cases for Neon branches.

Branches are per-project. And a project can have multiple databases. That means that creating a branch duplicates all the databases in that project. We can take advantage of this project > database hierarchy to clone a group of related databases in one operation.

Let's try out Neon's managed service. To create a tech preview free account, just follow these steps:

  1. Sign up at neon.tech/sign_in.
  2. Click on Create a project.
  3. Click on Download env.txt. This file contains everything you need to connect to the database instance.
  4. Click on Settings and copy the project id.

We also need to generate an API Key, as shown below:

  1. Click on your avatar and select Account > Developer Settings > Create new API key.
  2. Edit env.txt and add the following lines:
    • export NEON_API_KEY=Your-API-Key
    • export NEON_PROJECT_ID=Your-Project-ID
  3. Add the keyword export before every other variable in the file.

The final env.txt file should look like this example:

# Connection details
export PGHOST=ep-random-name.us-east-2.aws.neon.tech
export PGDATABASE=neondb
export PGUSER=Tommy
export PGPASSWORD=sekret1

# Connection string
export DATABASE_URL=postgres://Tommy:sekret1@ep-random-name.us-east-2.aws.neon.tech/neondb

# Neon config
export NEON_API_KEY=MyApiKey
export NEON_PROJECT_ID=random-name-140532

We’ll need this file to connect to the Neon database and API.
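Since everything that follows depends on these variables, it can help to confirm they are all set before going further. Here is a minimal sketch, assuming the variable names from the example env.txt above; the function name check_neon_env is my own and not part of any Neon tooling.

```shell
# Report any variable that env.txt was expected to export but did not.
# The names match the example env.txt above.
check_neon_env() {
    missing=0
    for name in PGHOST PGDATABASE PGUSER PGPASSWORD NEON_API_KEY NEON_PROJECT_ID; do
        # Indirect variable lookup that works in plain POSIX sh.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use (commented out so the snippet has no side effects):
#   . ./env.txt && check_neon_env && echo "env.txt looks complete"
```

The function prints one line per missing variable and returns a non-zero status if anything is absent, so it can gate later commands with &&.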

The Neon dashboard includes an SQL editor to run commands and controls for creating branches or endpoints.

Screenshot of the managed service dashboard at console.neon.tech.

On the Branches page, we’ll find options for creating a new branch. Here, you choose what to branch from and up to which point: select the parent branch and how much of its data to include:

  • Head: the new branch is a copy of the current database.
  • Time: the branch has the parent's data up to a specified date and time.
  • LSN: the branch has the parent's data up to a specified log sequence number.

Screenshot of the create branch UI. It shows a parent branch selector, three options to create the branch: head, time, and LSN, and the option to create an endpoint for the new branch. The create branch UI.

Endpoints for the branches can be created on the same screen or on the Endpoints page.
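The same three choices can, as far as I can tell, be expressed through the API. The sketch below only builds the JSON body for a "create branch" request, so it runs without network access. The field names (parent_id, parent_timestamp, parent_lsn, endpoints) and the /branches path are my reading of the Neon API v2 reference and should be checked against the current documentation; the branch ID in the example is hypothetical.

```shell
# Build the JSON body for a "create branch" request.
# Field names are assumed from the Neon API v2 reference; verify before use.
branch_payload() {
    parent_id=$1   # ID of the parent branch
    mode=$2        # head | time | lsn
    value=${3:-}   # timestamp or LSN; unused for head
    case $mode in
        head) extra='' ;;
        time) extra=",\"parent_timestamp\":\"$value\"" ;;
        lsn)  extra=",\"parent_lsn\":\"$value\"" ;;
        *)    echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # Also request a read-write endpoint for the new branch.
    printf '{"branch":{"parent_id":"%s"%s},"endpoints":[{"type":"read_write"}]}\n' \
        "$parent_id" "$extra"
}

# Example request (commented out; it needs network access and a real API key):
#   curl -s -X POST \
#     "https://console.neon.tech/api/v2/projects/$NEON_PROJECT_ID/branches" \
#     -H 'accept: application/json' -H 'content-type: application/json' \
#     -H "Authorization: Bearer $NEON_API_KEY" \
#     -d "$(branch_payload br-example-123456 time 2022-12-22T18:16:13Z)"
```

Keeping the payload builder separate from the curl call makes it easy to inspect the request body before sending anything.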

Neon is a PostgreSQL database, so we'll need to install the client tools. Check which version you have installed with:

$ psql --version
psql (PostgreSQL) 15.1 (Ubuntu 15.1-1.pgdg20.04+1)

Neon works best with versions 14 and 15 of the client tools. So, if needed, head to postgresql.org/download to get the latest release.

Now, let's source the env.txt and try connecting.

$ source env.txt
$ psql

psql (15.1 (Ubuntu 15.1-1.pgdg20.04+1), server 14.6)

neondb=> SELECT version();
 version
---------------------------------------------------------------------------------------------------
 PostgreSQL 14.6 on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
(1 row)
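Running psql with no arguments works because it reads the PG* variables from the environment. The env.txt file also carries the same details as a single DATABASE_URL, which is the form many application frameworks expect. As a small illustration of how the two relate, this sketch rebuilds the URL from its parts; pg_url is a hypothetical helper name, and it assumes the user name and password contain no characters that need URL-encoding.

```shell
# Assemble a postgres:// URL from the PG* variables that env.txt exports.
# Assumes the credentials need no URL-encoding.
pg_url() {
    printf 'postgres://%s:%s@%s/%s\n' "$PGUSER" "$PGPASSWORD" "$PGHOST" "$PGDATABASE"
}

# Usage (commented out; it needs network access):
#   psql "$(pg_url)"     # same target as: psql "$DATABASE_URL"
```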

You can also do browser-based authentication by running the command shown below. This will open a browser window and let you select the project you want to connect to.

$ psql -h pg.neon.tech

We can also test the Neon API key with curl:

$ curl -s 'https://console.neon.tech/api/v2/projects' \
 -H 'accept: application/json' \
 -H "Authorization: Bearer $NEON_API_KEY"

{
 "projects": [
 {
 "id": "calm-guy-140532",
 "platform_id": "aws",
 "region_id": "aws-us-east-2",
 "name": "test-project",
 "provisioner": "k8s-pod",
 "pg_version": 14,
 "locked": false,
 "created_at": "2022-12-22T18:16:13Z",
 "updated_at": "2022-12-22T18:16:13Z",
 "proxy_host": "us-east-2.aws.neon.tech"
 }
 ]
}

One of the most delicate parts of a deployment is the database migration step, because there is always the possibility of data loss. There are several techniques to make this process safer. With Neon, we can add one more: rehearse the migration on a branch of the production data, and apply it to the main branch only once it has succeeded there.
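To make the idea concrete, here is a rough sketch of a "branch first, main second" migration. It assumes a branch with its own endpoint already exists and that PGDATABASE, PGUSER and PGPASSWORD come from env.txt; the function names and host arguments are placeholders of mine. A clean run on the branch does not guarantee success on main if the data has changed in the meantime, so treat this as a safety net and not as proof.

```shell
# Command used to talk to the database; override it (e.g. with a stub)
# to preview or test the flow without a live server.
PSQL=${PSQL:-psql}

# Run one migration file against a host, in a single transaction,
# stopping at the first SQL error.
run_migration() {
    host=$1   # endpoint host of the branch or of main
    file=$2   # path to the .sql migration file
    "$PSQL" -h "$host" -v ON_ERROR_STOP=1 --single-transaction -f "$file"
}

# Rehearse on a throwaway branch; touch main only if the rehearsal succeeds.
migrate_safely() {
    branch_host=$1
    main_host=$2
    file=$3
    run_migration "$branch_host" "$file" || {
        echo "migration failed on branch; main left untouched" >&2
        return 1
    }
    run_migration "$main_host" "$file"
}

# Example (commented out; both host names are placeholders):
#   migrate_safely "$BRANCH_HOST" "$PGHOST" migrations/001_add_column.sql
```

Using --single-transaction together with ON_ERROR_STOP means a failing statement rolls back the whole file, so even the rehearsal branch is left in a known state.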

For a live example of using branches with continuous integration and delivery, check out this tutorial.

At the time of writing, Neon is in a technical preview stage. While the managed service is entirely free, it comes with some limitations:

  • You can only have one project per user. But a project can have multiple databases.
  • A project can have up to nine branches in addition to the main branch.
  • You can have up to three endpoints. One is always reserved for the main database. That leaves only two endpoints accessible for two other branches.
  • The size limit is 3GB per branch on the free tier.
  • Point-in-time branches can only go up to seven days into the past.
  • There are no backup or restore options in the UI. The only alternative seems to be running pg_dump neondb to take a remote backup.
  • There are a few caveats around importing data from another PostgreSQL instance or a backup.
  • There is a limit of 100 concurrent connections. You can enable connection pooling to raise the limit to 1,000 connections.
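For the pg_dump workaround mentioned in the list above, a small wrapper can give each dump a timestamped name. This is only a sketch: backup_name is a hypothetical helper, and the pg_dump call is left as a comment because it needs the client tools and a live database.

```shell
# Build a timestamped file name for a dump of the given database.
# An explicit timestamp can be passed as the second argument.
backup_name() {
    db=$1
    stamp=${2:-$(date -u +%Y%m%dT%H%M%SZ)}
    printf '%s-%s.dump\n' "$db" "$stamp"
}

# pg_dump reads PGHOST, PGUSER and PGPASSWORD from the environment,
# so sourcing env.txt first is enough to reach the Neon instance.
# (Commented out: needs pg_dump and network access.)
#   . ./env.txt
#   pg_dump --format=custom --file="$(backup_name "$PGDATABASE")" "$PGDATABASE"
```

The custom format is a reasonable default here because pg_restore can list and selectively restore its contents later.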

The good news is that if you like the database, you can always run it on-premises or in your cloud of choice to remove these limitations.

Neon's branching feature presents new options for development and database management. Even in its current technical preview stage, I can see great potential for this engine. Of course, we'll have to see how the project evolves, especially how the final pricing model will turn out.

I hope you found this project interesting, and if so, you might want to consider contributing to Neon.

Thanks for reading!