“PG::UnableToSend: SSL error: sslv3 alert bad record mac”: not Postgres, not SSL

If you see this error in theĀ build log for your Rails app (we were getting it during our Cucumber tests):

PG::UnableToSend: SSL error: sslv3 alert bad record mac
: ...SQL query...

then what you have is probably not an SSL problem, and probably not a Postgres problem. You should not turn off the SSL connection to Postgres, because more than likely not SSL causing the problem, and it’s not (directly) Postgres either, so don’t disable SSL in Postgres.

What it most likely is, especially if it’s intermittent, is Postgres dropping a connection and then Rails trying to reuse it. This is still not Postgres’s fault.

In my particular case, it turned out my app was using Puma to serve the app, and when you’ve got Puma in there, you need to properly configure it to work with Postgres by setting up a connection pool, like this:

workers Integer(ENV['WEB_CONCURRENCY'] || 2)
threads_count = Integer(ENV['MAX_THREADS'] || 1)
threads threads_count, threads_count

preload_app!

rackup DefaultRackup
port ENV['PORT'] || 3000
environment ENV['RACK_ENV'] || 'development'

on_worker_boot do
  # Worker specific setup for Rails 4.1+
  ActiveRecord::Base.establish_connection
end

on_worker_shutdown do
  ActiveRecord::Base.connection.disconnect!
end

This is a minimal configuration; we’re probably going to want to add more workers later. The important part is the on_worker_boot block. This tells ActiveRecord that it’s time for a new connection pool for this worker. I’ve cut the number of threads to one, as connections cannot be shared across threads.

Once I had this in place, our build passed and all seems good. We may have to add some more workers once we go to production, but that’s a simple matter of updating the Ansible playbook to bump WEB_CONCURRENCY to an acceptable level.

Why didn’t this show up until today? Probably because we hadn’t gotten our application’s test suite up to the point that we used up all our connections and had some drop on us. We’re still early on in our development and are just starting to hit the database a fair amount.

I’m posting this because I searched for hours today for a solution that wasn’t just “turn off your security and it’ll be fine” – which wouldn’t have actually fixed the issue anyway.

[Edited 11/24/15 to add on_worker_shutdown block.]

Tags: , ,

Reply