OBS 2.5.x + clockwork = ActiveRecord::ConnectionTimeoutError

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

OBS 2.5.x + clockwork = ActiveRecord::ConnectionTimeoutError

Srinidhi B
Hi,

A couple of months back, we updated our local OBS server to OBS 2.5.4. Ever since this update, we've had a lot of issues with clockwork and delayed_job.

As soon as we updated OBS, /srv/www/obs/api/tmp/clockworkd.clock.output said:

  I, [2014-09-12T20:30:55.163170 #29041]  INFO -- : Triggering 'fetch notifications'
  E, [2014-09-12T20:31:00.163931 #29041] ERROR -- : could not obtain a database connection within 5.000 seconds (waited 5.000 seconds) (ActiveRecord::ConnectionTimeoutError)
  /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:190:in `block in wait_poll'
  /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `loop'
  /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `wait_poll'
  /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:136:in `block in poll'
  /usr/lib64/ruby/2.0.0/monitor.rb:211:in `mon_synchronize'
  /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:146:in `synchronize'
  /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:134:in `poll'
  /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:423:in `acquire_connection'
  /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:356:in `block in checkout'
  /usr/lib64/ruby/2.0.0/monitor.rb:211:in `mon_synchronize'
  /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:355:in `checkout'
  /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:265:in `block in connection'
  /usr/lib64/ruby/2.0.0/monitor.rb:211:in `mon_synchronize'
  /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:264:in `connection'
  /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:294:in `with_connection'
  /srv/www/obs/api/config/clock.rb:39:in `block in <module:Clockwork>'
  /usr/lib64/ruby/gems/2.0.0/gems/clockwork-0.7.0/lib/clockwork/event.rb:51:in `call'
  /usr/lib64/ruby/gems/2.0.0/gems/clockwork-0.7.0/lib/clockwork/event.rb:51:in `execute'
  /usr/lib64/ruby/gems/2.0.0/gems/clockwork-0.7.0/lib/clockwork/event.rb:41:in `block in run'

This continued for a few days until I realized we weren't receiving any build failure notifications. Since I couldn't debug the issue - and I wasn't aware of the fact that clockwork log file was in "tmp" directory instead of "log" - I simply restarted /etc/init.d/obsapidelayed service. Immediately, our users were bombarded with a build failure mails!

Running "mysqladmin processlist", showed that there were more than 34-36 database connections open - almost each and every one of them had "Sleep" mentioned in the "Command" column.

The same story repeated a couple of weeks back which triggered me to make the following changes in database.yml:

  production:
  ..
  -  pool: 30
  +  pool: 50
  +  checkout_timeout: 15

After this change, I managed to keep delayed_job happy for almost 2 weeks. Last week, things once again went to bad state:

I, [2014-11-12T16:09:31.940219 #6089]  INFO -- : Triggering 'status.refresh'
E, [2014-11-12T16:09:36.938688 #6089] ERROR -- : could not obtain a database connection within 15.000 seconds (waited 15.000 seconds) (ActiveRecord::ConnectionTimeoutError)
/usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:190:in `block in wait_poll'
/usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `loop'
/usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `wait_poll'
/usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:136:in `block in poll'
/usr/lib64/ruby/2.0.0/monitor.rb:211:in `mon_synchronize'
/usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:146:in `synchronize'
/usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:134:in `poll'
/usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:423:in `acquire_connection'
/usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:356:in `block in checkout'
/usr/lib64/ruby/2.0.0/monitor.rb:211:in `mon_synchronize'
/usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:355:in `checkout'
/usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:265:in `block in connection'
/usr/lib64/ruby/2.0.0/monitor.rb:211:in `mon_synchronize'
/usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:264:in `connection'
/usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:294:in `with_connection'
/srv/www/obs/api/config/clock.rb:39:in `block in <module:Clockwork>'
/usr/lib64/ruby/gems/2.0.0/gems/clockwork-0.7.0/lib/clockwork/event.rb:51:in `call'
/usr/lib64/ruby/gems/2.0.0/gems/clockwork-0.7.0/lib/clockwork/event.rb:51:in `execute'
/usr/lib64/ruby/gems/2.0.0/gems/clockwork-0.7.0/lib/clockwork/event.rb:41:in `block in run'

As of now, I've 57 connections to the database:

  # mysqladmin processlist | grep frontend | wc -l
  57

(Ignore the "grep", I'm just being lazy for skipping the table borders and the table header. There is just one database served by the MySQL server)

Regarding mysql's max connections:

  # mysqladmin -p variables | grep max_conn
  Enter password:
  | max_connect_errors                                | 10                                                                                                                     |
  | max_connections                                   | 151                                                                                                                    |

Apparently, we keep filling up ActiveRecord's connection pool without freeing any of the connections. Not sure if the issue occurs because we have set "fetch_notifications" job to be run in a different thread:

  every(27.seconds, 'fetch notifications', thread: true) do
    ActiveRecord::Base.connection_pool.with_connection do |sql|
      # this will return if there is already a thread running
      UpdateNotificationEvents.new.perform
    end
  end

(Yes, I changed the duration from 17 to 27 seconds after I increased the timeout in database.yml from 5 to 15)

Has someone else found this issue? I can't find any solution other than setting a daily cron job to restart obsapidelayed :-/

Regards,
Srinidhi.
--
To unsubscribe, e-mail: [hidden email]
To contact the owner, e-mail: [hidden email]