Configuring the database connection pool

Chapter 5.3.

Configuring the database connection pool

In the first Let’s Go book we talked through the sql.DB connection pool at a high level and demonstrated the core principles of how to use it. But in this chapter we’re going to go in-depth — explaining how the connection pool works behind the scenes, and exploring the settings we can use to change and optimize its behavior.

So how does the sql.DB connection pool work?

The most important thing to understand is that a sql.DB pool contains two types of connections — ‘in-use’ connections and ‘idle’ connections. A connection is marked as in-use when you are using it to perform a database task, such as executing a SQL statement or querying rows, and when the task is complete the connection is then marked as idle.

When you instruct Go to perform a database task, it will first check if any idle connections are available in the pool. If one is available, then Go will reuse this existing connection and mark it as in-use for the duration of the task. If there are no idle connections in the pool when you need one, then Go will create a new additional connection.

When Go reuses an idle connection from the pool, any problems with the connection are handled gracefully. Bad connections will automatically be retried twice before giving up, at which point Go will remove the bad connection from the pool and create a new one to carry out the task.

Configuring the pool

The connection pool has four methods that we can use to configure its behavior. Let’s talk through them one-by-one.

The SetMaxOpenConns method

The SetMaxOpenConns() method allows you to set an upper MaxOpenConns limit on the number of ‘open’ connections (in-use + idle connections) in the pool. By default, the number of open connections is unlimited.

Broadly speaking, the higher that you set the MaxOpenConns limit, the more database queries can be performed concurrently and the lower the risk is that the connection pool itself will be a bottleneck in your application.

But leaving it unlimited isn’t necessarily the best thing to do. By default PostgreSQL has a hard limit of 100 open connections and, if this hard limit is hit under heavy load, it will cause our pq driver to return a "sorry, too many clients already" error.

To avoid this error, it makes sense limit the number of open connections in our pool to comfortably below 100 — leaving enough headroom for any other applications or sessions that also need to use PostgreSQL.

The other benefit of setting a MaxOpenConns limit is that it acts as a very rudimentary throttle, and prevents the database from being swamped by a huge number of tasks all at once.

But setting a limit comes with an important caveat. If the MaxOpenConns limit is reached, and all connections are in-use, then any further database tasks will be forced to wait until a connection becomes free and marked as idle. In the context of our API, the user’s HTTP request could ‘hang’ indefinitely while waiting for a free connection. So to mitigate this, it’s important to always set a timeout on database tasks using the context.Context type. We’ll explain how to do that later in the book.

The SetMaxIdleConns method

The SetMaxIdleConns() method sets an upper MaxIdleConns limit on the number of idle connections in the pool. By default, the maximum number of idle connections is 2.

In theory, allowing a higher number of idle connections in the pool will improve performance because it makes it less likely that a new connection needs to be established from scratch — therefore helping to save resources.

But it’s also important to realize that keeping an idle connection alive comes at a cost. It takes up memory which can otherwise be used for your application and database, and it’s also possible that if a connection is idle for too long then it may become unusable. For example, by default MySQL will automatically close any connections which haven’t been used for 8 hours.

So, potentially, setting MaxIdleConns too high may result in more connections becoming unusable and more resources being used than if you had a smaller idle connection pool — with fewer connections that are used more frequently. As a guideline: you only want to keep a connection idle if you’re likely to be using it again soon.

Another thing to point out is that the MaxIdleConns limit should always be less than or equal to MaxOpenConns. Go enforces this and will automatically reduce the MaxIdleConns limit if necessary.

The SetConnMaxLifetime method

The SetConnMaxLifetime() method sets the ConnMaxLifetime limit — the maximum length of time that a connection can be reused for. By default, there’s no maximum lifetime and connections will be reused forever.

If we set ConnMaxLifetime to one hour, for example, it means that all connections will be marked as ‘expired’ one hour after they were first created, and cannot be reused after they’ve expired. But note:

This doesn’t guarantee that a connection will exist in the pool for a whole hour; it’s possible that a connection will become unusable for some reason and be automatically closed before then.
A connection can still be in use more than one hour after being created — it just cannot start to be reused after that time.
This isn’t an idle timeout. The connection will expire one hour after it was first created — not one hour after it last became idle.
Once every second Go runs a background cleanup operation to remove expired connections from the pool.

In theory, leaving ConnMaxLifetime unlimited (or setting a long lifetime) will help performance because it makes it less likely that new connections will need to be created from scratch. But in certain situations, it can be useful to enforce a shorter lifetime. For example:

If your SQL database enforces a maximum lifetime on connections, it makes sense to set ConnMaxLifetime to a slightly shorter value.
To help facilitate swapping databases gracefully behind a load balancer.

If you do decide to set a ConnMaxLifetime on your pool, it’s important to bear in mind the frequency at which connections will expire (and subsequently be recreated). For example, if you have 100 open connections in the pool and a ConnMaxLifetime of 1 minute, then your application can potentially kill and recreate up to 1.67 connections (on average) every second. You don’t want the frequency to be so great that it ultimately hinders performance.

The SetConnMaxIdleTime method

The SetConnMaxIdleTime() method sets the ConnMaxIdleTime limit. This works in a very similar way to ConnMaxLifetime, except it sets the maximum length of time that a connection can be idle for before it is marked as expired. By default there’s no limit.

If we set ConnMaxIdleTime to 1 hour, for example, any connections that have sat idle in the pool for 1 hour since last being used will be marked as expired and removed by the background cleanup operation.

This setting is really useful because it means that we can set a relatively high limit on the number of idle connections in the pool, but periodically free up resources by removing any idle connections that we know aren’t really being used anymore.

Putting it into practice

So that’s a lot of information to take in… and what does it mean in practice? Let’s summarize all the above into some actionable points.

As a rule of thumb, you should explicitly set a MaxOpenConns value. This should be comfortably below any hard limits on the number of connections imposed by your database and infrastructure, and you may also want to consider keeping it fairly low to act as a rudimentary throttle.

For this project we’ll set a MaxOpenConns limit of 25 connections. I’ve found this to be a reasonable starting point for small-to-medium web applications and APIs, but ideally you should tweak this value for your hardware depending on the results of benchmarking and load-testing.
In general, higher MaxOpenConns and MaxIdleConns values will lead to better performance. But the returns are diminishing, and you should be aware that having a too-large idle connection pool (with connections that are not frequently reused) can actually lead to reduced performance and unnecessary resource consumption.

Because MaxIdleConns should always be less than or equal to MaxOpenConns, we’ll also limit MaxIdleConns to 25 connections for this project.
To mitigate the risk from point 2 above, you should generally set a ConnMaxIdleTime value to remove idle connections that haven’t been used for a long time. In this project we’ll set a ConnMaxIdleTime duration of 15 minutes.
It’s probably OK to leave ConnMaxLifetime as unlimited, unless your database imposes a hard limit on connection lifetime, or you need it specifically to facilitate something like gracefully swapping databases. Neither of those things apply in this project, so we’ll leave this as the default unlimited setting.

Configuring the connection pool

Rather than hard-coding these settings, let’s update the cmd/api/main.go file to accept them as command-line flags.

The command-line flag for the ConnMaxIdleTime value is particularly interesting, because we want it to convey a duration of time, like 5s (5 seconds) or 10m (10 minutes). To assist with this we can use the flag.DurationVar() function to read in the command-line flag value, which will automatically convert it to a time.Duration type for us.

Go ahead and update the cmd/api/main.go file as follows:

File: cmd/api/main.go

package main

...

// Add maxOpenConns, maxIdleConns and maxIdleTime fields to hold the configuration
// settings for the connection pool.
type config struct {
    port int
    env  string
    db   struct {
        dsn          string
        maxOpenConns int
        maxIdleConns int
        maxIdleTime  time.Duration
    }
}

...

func main() {
    var cfg config

    flag.IntVar(&cfg.port, "port", 4000, "API server port")
    flag.StringVar(&cfg.env, "env", "development", "Environment (development|staging|production)")

    flag.StringVar(&cfg.db.dsn, "db-dsn", os.Getenv("GREENLIGHT_DB_DSN"), "PostgreSQL DSN")

    // Read the connection pool settings from command-line flags into the config struct.
    // Note that the default values we're using are the ones we discussed above.
    flag.IntVar(&cfg.db.maxOpenConns, "db-max-open-conns", 25, "PostgreSQL max open connections")
    flag.IntVar(&cfg.db.maxIdleConns, "db-max-idle-conns", 25, "PostgreSQL max idle connections")
    flag.DurationVar(&cfg.db.maxIdleTime, "db-max-idle-time", 15*time.Minute, "PostgreSQL max connection idle time")

    flag.Parse()

    ...
}

func openDB(cfg config) (*sql.DB, error) {
    db, err := sql.Open("postgres", cfg.db.dsn)
    if err != nil {
        return nil, err
    }

    // Set the maximum number of open (in-use + idle) connections in the pool. Note that
    // passing a value less than or equal to 0 will mean there is no limit.
    db.SetMaxOpenConns(cfg.db.maxOpenConns)

    // Set the maximum number of idle connections in the pool. Again, passing a value
    // less than or equal to 0 will mean there is no limit.
    db.SetMaxIdleConns(cfg.db.maxIdleConns)

    // Set the maximum idle timeout for connections in the pool. Passing a duration less
    // than or equal to 0 will mean that connections are not closed due to their idle time. 
    db.SetConnMaxIdleTime(cfg.db.maxIdleTime)

    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    err = db.PingContext(ctx)
    if err != nil {
        db.Close()
        return nil, err
    }

    return db, nil
}

If you go ahead and run the application again now, everything should still work correctly. For the db-max-idle-time flag, you can pass in any value acceptable to the time.ParseDuration() function, such as 300ms (300 milliseconds), 5s (5 seconds) or 2h45m (2 hours and 45 minutes). The valid time units are ns, us (or µs), ms, s, m and h.

For example:

$ go run ./cmd/api -db-max-open-conns=50 -db-max-idle-conns=50 -db-max-idle-time=2h30m

At this point you won’t really notice any obvious changes to the application, and there’s not much we can do to demonstrate the impact of these settings. But later in the book we’ll do a bit of load testing, and explain how to monitor the state of the connection pool in real-time using the db.Stats() method. At that point, you’ll be able to see some of the behavior that we’ve talked about in this chapter in action.