Back Midas Rome Roody Rootana
  Midas DAQ System  Not logged in ELOG logo
Entry  22 Feb 2023, Stefano Piacentini, Info, connection to a MySQL server: retry procedure in the Logger 
    Reply  22 Feb 2023, Stefan Ritt, Info, connection to a MySQL server: retry procedure in the Logger 
       Reply  07 Mar 2023, Stefano Piacentini, Info, connection to a MySQL server: retry procedure in the Logger 
Message ID: 2464     Entry time: 07 Mar 2023     In reply to: 2459
Author: Stefano Piacentini 
Topic: Info 
Subject: connection to a MySQL server: retry procedure in the Logger 
> > Dear all,
> > 
> > we are experiencing a connection problem to the MySQL server that we use to log informations. Is there an 
> > option to retry multiple times the I/O on the MySQL?
> > 
> > The error we are experiencing is the following (hiding the IP address):
> > 
> > [Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't 
> > connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)
> > 
> > Then the logger stops, and must be restarted. This eventually happens only during the BOR or the EOR.
> 
> What would you propose? If the connection does not work, most likely the server is down or busy. If we retry, 
> the connection still might not work. If we retry many times, people will complain that the run start or stop 
> takes very long. If we then just continue (without stopping the logger), the MySQL database will miss important 
> information and the runs probably cannot be analyzed later. So I believe it's better to really stop the logger 
> so that people get aware that there is a problem and fix the source, rather than curing the symptoms.
> 
> In the MEG experiment at PSI we run the logger with a MySQL database and we never see any connection issue, 
> except when the MySQL server gets in maintenance (once a year), but usually we don't take data then. Since we 
> use the same logger code, it cannot be a problem there. So I would try to fix the problem on the MySQL side.
> 
> Best,
> Stefan


Dear Stefan,

a possible solution could be to define the number of times to retry as a parameter that is 0 by default, as well as a wait time between two subsequent tries. This 
would leave the decision on how to handle a possible failed connection to the user. In our case, for example, we would prefer to not stop the acquisition in case 
of a failed connection to the external SQL. In addition, we have other software that, with a retry procedure, doesn’t fail: with 1 re-try and a sleep time of 0.5 s 
we already recover 100% of the faults.

Anyway, we implemented a local database, which is a mirror of the external one, and the problems disappeared.

Thanks,
Stefano.
ELOG V3.1.4-2e1708b5