Back Midas Rome Roody Rootana
  Midas DAQ System, Page 65 of 146  Not logged in ELOG logo
ID Date Author Topic Subjectdown
  939   20 Nov 2013 Konstantin OlchanskiBug ReportToo many bm_flush_cache() in mfe.c
I was looking at something in the mserver and noticed that for remote frontends, for every periodic event, 
there are about 3 RPC calls to bm_flush_cache().

Sure enough, in mfe.c::send_event(), for every event sent, there are 2 calls to bm_flush_cache() (once for 
the buffer we used, second for all buffers). Then, for a good measure, the mfe idle loop calls 
bm_flush_cache() for all buffers about once per second (even if no events were generated).

So what is going on here? To allow good performance when processing many small events,
the MIDAS event buffer code (bm_send_event()) buffers small events internally, and only after this internal
buffer is full, the accumulated events are flushed into the shared memory event buffer,
where they become visible to the mlogger, mdump and other consumers.

Because of this internal buffering, infrequent small size periodic events can become
stuck for quite a long time, confusing the user: "my frontend is sending events, how come I do not
see them in mdump?"

To avoid this, mfe.c manually flushes these internal event buffers by calling bm_flush_buffer().

And I think that works just fine for frontends directly connected to the shared memory, one call to 
bm_flush_buffer() should be sufficient.

But for remote fronends connected through the mserver, it turns out there is a race condition between 
sending the event data on one tcp connection and sending the bm_flush_cache() rpc request on another 
tcp connection.

I see that the mserver always reads the rpc connection before the event connection, so bm_flush_cache() 
is done *before* the event is written into the buffer by bm_send_event(). So the newly
send event is stuck in the buffer until bm_flush_cache() for the *next* event shows up:

mfe.c: send_event1 -> flush -> ... wait until next event ... -> send_event2 -> flush
mserver: flush -> receive_event1 -> ... wait ... -> flush -> receive_event2 -> ... wait ...
mdump -> ... nothing ... -> ... nothing ... -> event1 -> ... nothing ...

Enter the 2nd call to bm_flush_cache in mfe.c (flush all buffers) - now because mserver seems to be 
alternating between reading the rpc connection and the event connection, the race condition looks like 
this:

mfe.c: send_event -> flush -> flush
mserver: flush -> receive_event -> flush
mdump: ... -> event -> ...

So in this configuration, everything works correctly, the data is not stuck anywhere - but by accident, and 
at the price of an extra rpc call.

But what about the periodic 1/second bm_flush_cache() on all buffers? I think it does not quite work
either because the race condition is still there: we send an event, and the first flush may race it and only 
the 2nd flush gets the job done, so the delay between sending the event and seeing it in mdump would be 
around 1-2 seconds. (no more than 2 seconds, I think). Since users expect their events to show up "right
away", a 2 second delay is probably not very good.

Because periodic events are usually not high rate, the current situation (4 network transactions to send 1 
event - 1x send event, 3x flush buffer) is probably acceptable. But this definitely sets a limit on the 
maximum rate to 3x (2x?) the mserver rpc latency - without the rpc calls to bm_flush_buffer() there
would be no limit - the events themselves are sent through a pipelined tcp connection without 
handshaking.

One solution to this would be to implement periodic bm_flush_buffer() in the mserver, making all calls to 
bm_flush_buffer() in mfe.c unnecessary (unless it's a direct connection to shared memory).

Another solution could be to send events with a special flag telling the mserver to "flush the buffer right 
away".

P.S. Look ma!!! A race condition with no threads!!!

K.O.
  940   21 Nov 2013 Stefan RittBug ReportToo many bm_flush_cache() in mfe.c
> And I think that works just fine for frontends directly connected to the shared memory, one call to 
> bm_flush_buffer() should be sufficient.

That's correct. What you want is once per second or so for polled events, and once per periodic event (which anyhow will typically come only every 10 seconds or so). If there are 3 calls 
per event, this is certainly too much.


> But for remote fronends connected through the mserver, it turns out there is a race condition between 
> sending the event data on one tcp connection and sending the bm_flush_cache() rpc request on another 
> tcp connection.
> 
> ...
> 
> One solution to this would be to implement periodic bm_flush_buffer() in the mserver, making all calls to 
> bm_flush_buffer() in mfe.c unnecessary (unless it's a direct connection to shared memory).
> 
> Another solution could be to send events with a special flag telling the mserver to "flush the buffer right 
> away".

That's a very good and useful observation. I never really thought about that. 

Looking at your proposed solutions, I prefer the second one. mserver is just an interface for RPC calls, it should not do anything "by itself". This was a strategic decision at the beginning. 
So sending a flag to punch through the cache on mserver seems to me has less side effects. Will just break binary compatibility :-)

/Stefan
  624   01 Sep 2009 Jimmy NgaiForumTimeout during run transition
Dear All,

I'm using SL5 and MIDAS rev 4528. Occasionally, when I stop a run in odbedit, 
a timeout would occur: 
[midas.c:9496:rpc_client_call,ERROR] rpc timeout after 121 sec, routine 
= "rc_transition", host = "computerB", connection closed
Error: Unknown error 504 from client 'Frontend' on host computerB

This error seems to be random without any reason or pattern. After this error 
occurs, I cannot start or stop any run. Sometime restarting MIDAS can bring 
the system working again, but sometime not.

Another transition timeout occurs after I change any ODB value using the web 
interface:
[midas.c:8291:rpc_client_connect,ERROR] timeout on receive remote computer 
info: 
[midas.c:3642:cm_transition,ERROR] cannot connect to client "Frontend" on host 
computerB, port 36255, status 503
Error: Cannot connect to client 'Frontend'

This error is reproducible: start run -> change ODB value within webpage -> 
stop run -> timeout!

Any idea?

Thanks,
Jimmy
  625   03 Sep 2009 Stefan RittForumTimeout during run transition
> Dear All,
> 
> I'm using SL5 and MIDAS rev 4528. Occasionally, when I stop a run in odbedit, 
> a timeout would occur: 
> [midas.c:9496:rpc_client_call,ERROR] rpc timeout after 121 sec, routine 
> = "rc_transition", host = "computerB", connection closed
> Error: Unknown error 504 from client 'Frontend' on host computerB
> 
> This error seems to be random without any reason or pattern. After this error 
> occurs, I cannot start or stop any run. Sometime restarting MIDAS can bring 
> the system working again, but sometime not.
> 
> Another transition timeout occurs after I change any ODB value using the web 
> interface:
> [midas.c:8291:rpc_client_connect,ERROR] timeout on receive remote computer 
> info: 
> [midas.c:3642:cm_transition,ERROR] cannot connect to client "Frontend" on host 
> computerB, port 36255, status 503
> Error: Cannot connect to client 'Frontend'
> 
> This error is reproducible: start run -> change ODB value within webpage -> 
> stop run -> timeout!

A few hints for debugging:

- do the run stop via odbedit and the "-v" flag, like

[local:Online:R]/> stop -v

then you see which computer is contacted when.

- Then put some debugging code into your front-end end_of_run() routine at the 
beginning and the end of that routine, so you see when it's executed and how long 
this takes. If you do lots of things in your EOR routine, this could maybe cause a 
timeout.

- Then make sure that cm_yield() in mfe.c is called periodically by putting some 
debugging code there. This function checks for any network message, such as the 
stop command from odbedit. If you trigger event readout has an endless loop for 
example, cm_yield() will never be called and any transition will timeout.

- Make sure that not 100% CPU is used on your frontend. Some OSes have problems 
handling incoming network connections if the CPU is completely used of if 
input/output operations are too heavy.

- Stefan
  2144   09 Apr 2021 Lars MartinSuggestionTime zone selection for web page
The new history as well as the clock in the web page header show the local time 
of the user's computer running the browser.
Would it be possible to make it either always use the time zone of the Midas 
server, or make it selectable from the config page?
It's not ideal trying to relate error messages from the midas.log to history 
plots if the time stamps don't match.
  2152   14 Apr 2021 Stefan RittSuggestionTime zone selection for web page
> The new history as well as the clock in the web page header show the local time 
> of the user's computer running the browser.
> Would it be possible to make it either always use the time zone of the Midas 
> server, or make it selectable from the config page?
> It's not ideal trying to relate error messages from the midas.log to history 
> plots if the time stamps don't match.

I implemented a new row in the config page to select the time zone. 

"Local": Time zone where the browser runs
"Server": Time zone where the midas server runs (you have to update mhttpd for that)
"UTC+X": Any other time zone

The setting affects both the status header and the history display.

I spent quite some time with "named" time zones like "PST" "EST" "CEST", but the 
support for that is not that great in JavaScript, so I decided to go with simple 
UTC+X. Hope that's ok.

Please give it a try and let me know if it's working for you.

Best,
Stefan
Attachment 1: Screenshot_2021-04-14_at_16.54.12_.png
Screenshot_2021-04-14_at_16.54.12_.png
  2157   29 Apr 2021 Pierre-Andre AmaudruzSuggestionTime zone selection for web page
> > The new history as well as the clock in the web page header show the local time 
> > of the user's computer running the browser.
> > Would it be possible to make it either always use the time zone of the Midas 
> > server, or make it selectable from the config page?
> > It's not ideal trying to relate error messages from the midas.log to history 
> > plots if the time stamps don't match.
> 
> I implemented a new row in the config page to select the time zone. 
> 
> "Local": Time zone where the browser runs
> "Server": Time zone where the midas server runs (you have to update mhttpd for that)
> "UTC+X": Any other time zone
> 
> The setting affects both the status header and the history display.
> 
> I spent quite some time with "named" time zones like "PST" "EST" "CEST", but the 
> support for that is not that great in JavaScript, so I decided to go with simple 
> UTC+X. Hope that's ok.
> 
> Please give it a try and let me know if it's working for you.
> 
> Best,
> Stefan

Hi Stefan,

This is great, the UTC+x is perfect, thank you.
PAA
  2132   23 Mar 2021 Lars MartinBug ReportTime shift in history CSV export
Version: release/midas-2020-12

I'm exporting the history data shown in elog:2132/1 to CSV, but when I look at the 
CSV data, the step no longer occurs at the same time in both data sets (elog:2132/2)
Attachment 1: Cooling-MoxaCalib-20212118-190450-20212119-102151.png
Cooling-MoxaCalib-20212118-190450-20212119-102151.png
Attachment 2: Screenshot_from_2021-03-23_12-29-21.png
Screenshot_from_2021-03-23_12-29-21.png
  2133   23 Mar 2021 Lars MartinBug ReportTime shift in history CSV export
History is from two separate equipments/frontends, but both have "Log history" set to 1.
  2134   23 Mar 2021 Lars MartinBug ReportTime shift in history CSV export
Tried with export of two different time ranges, and the shift appears to remain the same, 
about 4040 rows.
  2135   24 Mar 2021 Stefan RittBug ReportTime shift in history CSV export
I confirm there is a problem. If variables are from the same equipment, they have the same 
time stamps, like

t1 v1(t1) v2(t1)
t2 v1(t2) v2(t2)
t3 v1(t3) v2(t3)

when they are from different equipments, they have however different time stamps

t1 v1(t1)
t2    v2(t2)
t3 v1(t3)
t4    v2(t4)

The bug in the current code is that all variables use the time stamps of the first variable, 
which is wrong in the case of different equipments, like

t1 v1(t1) v2(*t2*)
t3 v1(t3) v2(*t4*)

So I can change the code, but I'm not sure what would be the bast way. The easiest would be to 
export one array per variable, like

t1 v1(t1)
t2 v1(t2)
...
t3 v2(t3)
t4 v2(t4)
...

Putting that into a single array would leave gaps, like

t1 v1(t1) [gap]
t2 [gap]  v2(t2)
t3 v1(t3) [gap]
t4 [ga]]  v2(t4)

plus this is programmatically more complicated, since I have to merge two arrays. So which 
export format would you prefer?

Stefan
  2136   24 Mar 2021 Lars MartinBug ReportTime shift in history CSV export
I think from my perspective the separate files are fine. I personally don't really like the format 
with the gaps, so don't see an advantage in putting in the extra work.
I'm surprised the shift is this big, though, it was more than a whole hour in my case, is it the 
time difference between when the frontends were started?
  2154   14 Apr 2021 Stefan RittBug ReportTime shift in history CSV export
I finally found some time to fix this issue in the latest commit. Please update and check if it's 
working for you.

Stefan
  594   15 Jun 2009 Jimmy NgaiForumTime limit of each run
Dear All,

Can one set a time limit for each run? I can only find event limit in ODB. 
Thanks.

Jimmy
  615   04 Aug 2009 Exaos LeeForumThe contents of the attachment
As requested from K.O., I paste the "00README.txt" as the following:
#-*- mode: outline -*-
#-*- encoding: utf-8 -*-
#AUTHOR: Exaos Lee <Exaos DOT Lee AT gmail DOT com>

* Directories
  +--> 00README.txt : This file
  |
  +--> bustester : Directory contains utilities for VME bus testing
  |
  +--> modules   : APIs to handle VME modules
  |
  +--> pyutil    : Uitilies in Python, including PyMVME
  |
  +--> sis3100   : Provide lib_sis3100mvme.a/so using with "mvmestd.h"

* Utilities in Python

** PyMVME module

   The module "PyMVME" provides the following stuff:
      a. class StdVME
      	 -- contains standard VME informations.
      b. class MVME_INTERFACE
      	 -- the C structure MVME_INTERFACE wrapped in Python
      c. dict MVME_STATUS
      	 -- the return information defined in "mvmestd.h"
      d. the related useful aliases from "mvmestd.h"
      	 -- including "mvme_addr_t", "mvme_locaddr_t", "mvme_size_t"
      e. class MvmeDev
      	 -- the major class which provides methods to access VME bus.

   You may find examples of how to use module "PyMVME" from "find_caen.py" or
   scripts in dir "test". All of the examples are using "lib_sis3100mvme.so".
   You may find information later in this introduction.

** find_caen.py

   The script to find VME modules from CAEN. Now, it is still in test status
   and can only find ADCs, TDCs or QDCs.

* SIS3100 library to be used togather with "mvmestd.h"

  The directory "sis3100" contains sources to build libraries as the following:
  a. lib_sis3100.a     -- APIs declared in "sis3100_vme_calls.h"
  b. lib_sis3100mvme.a -- APIs declared in "mvmestd.h". It also contains the
     		       	  same APIs from lib_sis3100.a

  If you want to use shared libraries, especially when you are using utilities
  wrote in Python, you may rebuild the libraries as the following:

    $ cd sis3100
    $ make shared

* APIs to handle VME modules

** vadc_caen.h/c

   Provides APIs to handle ADC-type modules from CAEN, including:
      a. ADCs --- V785, V785N
      b. TDCs --- V775, V775N
      c. QDCs --- V792, V792N

* VME bus testers

  Still under development.


  1930   04 Jun 2020 Hisataka YOSHIDAForumTemplate of slow control frontend
I’m beginner of Midas, and trying to develop the slow control front-end with the latest Midas.
I found the scfe.cxx in the “example”, but not enough to refer to write the front-end for my own devices 
because it contains only nulldevice and null bus driver case...
(I could have succeeded to run the HV front-end for ISEG MPod, because there is the device driver...)

Can I get some frontend examples such as simple TCP/IP and/or RS232 devices?
Hopefully, I would like to have examples of frontend and device driver.
(if any device driver which is included in the package is similar, please tell me.)

Thanks a lot.
  1931   04 Jun 2020 Pintaudi GiorgioForumTemplate of slow control frontend
> I’m beginner of Midas, and trying to develop the slow control front-end with the latest Midas.
> I found the scfe.cxx in the “example”, but not enough to refer to write the front-end for my own devices 
> because it contains only nulldevice and null bus driver case...
> (I could have succeeded to run the HV front-end for ISEG MPod, because there is the device driver...)
> 
> Can I get some frontend examples such as simple TCP/IP and/or RS232 devices?
> Hopefully, I would like to have examples of frontend and device driver.
> (if any device driver which is included in the package is similar, please tell me.)
> 
> Thanks a lot.

Dear Yoshida-san,
my name is Giorgio and I am a Ph.D. student working on the T2K experiment.
I had to write many MIDAS frontends recently, so I think that my code could be of some help to you.

As you might already know, the MIDAS slow control system is structured into three layers/levels.

 - The highest layer is the "class" layer that directly interfaces with the user and the ODB. It is called 
"class" layer because it refers to a class of devices (for example all the high voltage power supplies, 
etc...). The idea is that in the same experiment you can have many different models of power supplies but 
all of them can be controlled with a single class driver.

 - Then there is the "device" layer that implements the functions specific to the particular device.

 - Finally, there is the "BUS" layer that directly communicates with the device. The BUS can be Ethernet 
(TCP/IP), Serial (RS-232 / RS-422 / RS-485), USB, etc ...

You can read more about the MIDAS slow control system here:
https://midas.triumf.ca/MidasWiki/index.php/Slow_Control_System

Anyway, you need to write code for all those layers. If you are lucky you can reuse some of the already 
existing MIDAS code. Keep in mind that all the examples that you find in the MIDAS documentation and the 
MIDAS source code are written in C (even if it is then compiled with g++). But, you can write a frontend in 
C++ without any problem so choose whichever language you are familiar the most with.

I am attaching an archive with some sample code directly taken from our experiment. It is just a small 
fraction of the code not meant to be compilable. The code is disclosed with the GPL3 license, so you can use 
it as you please but if you do, please cite my name and the WAGASCI-T2K experiment somewhere visible.

In the archive, you can find two example frontends with the respective drivers. The "Triggers" frontend is 
written in C++ (or C+ if you consider that the mfe.cxx API is very C-like). The "WaterLevel" frontend is 
written in plain C. The "Triggers" frontend controls our trigger board called CCC and the "WaterLevel" 
frontend controls our water level sensors called PicoLog 1012. They share a custom implementation of the 
TCP/IP bus. Anyway, this is not relevant to you. You may just want to take a look at the code structure.

Finally, recently there have been some very interesting developments regarding the ODB C++ API. I would 
definitely take a look at that. I wish I had that when I was developing these frontends.

Good luck

--
Pintaudi Giorgio, Ph.D. student
Neutrino and Particle Physics Minamino Laboratory
Faculty of Science and Engineering, Yokohama National University
giorgio.pintaudi.kx@ynu.jp
TEL +81(0)45-339-4182
Attachment 1: MIDAS_frontend_sample.zip
  1932   04 Jun 2020 Stefan RittForumTemplate of slow control frontend
> I’m beginner of Midas, and trying to develop the slow control front-end with the latest Midas.
> I found the scfe.cxx in the “example”, but not enough to refer to write the front-end for my own devices 
> because it contains only nulldevice and null bus driver case...
> (I could have succeeded to run the HV front-end for ISEG MPod, because there is the device driver...)
> 
> Can I get some frontend examples such as simple TCP/IP and/or RS232 devices?
> Hopefully, I would like to have examples of frontend and device driver.
> (if any device driver which is included in the package is similar, please tell me.)

Have you checked the documentation?

https://midas.triumf.ca/MidasWiki/index.php/Slow_Control_System

Basically you have to replace the nulldevice driver with a "real" driver. You find all existing drivers under 
midas/drivers/device. If your favourite is not there, you have to write it. Use one which is close to the one 
you need and modify it.

Best,
Stefan
  1933   04 Jun 2020 Hisataka YOSHIDAForumTemplate of slow control frontend
Dear Stefan,

Thank you for you quick reply.


> Have you checked the documentation?
> 
> https://midas.triumf.ca/MidasWiki/index.php/Slow_Control_System

Yes, I have read the wiki, but not easy to figure out how I treat the individual case.

> Basically you have to replace the nulldevice driver with a "real" driver. You find all existing drivers under 
> midas/drivers/device. If your favourite is not there, you have to write it. Use one which is close to the one 
> you need and modify it.

Okay, I will try to write drivers for my own devices using existing drivers.
(maybe I can find some device drivers which uses TCP/IP, RS232)

Best regards,
Hisataka Yoshida
  1934   04 Jun 2020 Hisataka YOSHIDAForumTemplate of slow control frontend
Dear Giorgio,

Thank you very much for your kind and quick reply!

I appreciate you giving me such a nice explanation, experience, and great sample codes (This is what I desired!).
They all are useful for me. I will try to write my frontend codes using gift from you.

Thank you again!

Best regards,
Hisataka Yoshida
ELOG V3.1.4-2e1708b5