Back Midas Rome Roody Rootana
  Midas DAQ System, Page 1 of 39  Not logged in ELOG logo
Entry  16 Dec 2021, Zaher Salman, Forum, Device driver for modbus 
Dear all, does anyone have an example of for a device driver using modbus or modbus tcp to communicate with a device and willing to share it? Thanks.
Entry  12 Dec 2021, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  dummy_fe.cpp
Dear all,

in 13 Feb 2020 to 21 Feb 2020 we had a talk about how I try to create MIDAS events directly on a FPGA and 
than use DMA to hand the event over to MIDAS. In the thread I also explained how I do it in my MIDAS frontend. 

For testing the DAQ I created a dummy frontend which was emulating my FPGA (see attached file). The interesting code is 
in the function read_stream_thread and there I just fill a array according to the 32b BANKS which are 64b aligned (more or less
the lines 306-369). And than I do:

    uint32_t * dma_buf_volatile;
    dma_buf_volatile = dma_buf_dummy;

    copy_n(&dma_buf_volatile[0], sizeof(dma_buf_dummy)/4, pdata);

    pdata+=sizeof(dma_buf_dummy);
    rb_increment_wp(rbh, sizeof(dma_buf_dummy)); // in byte length

to send the data to the buffer.

This summer (Mai - July) everything was working fine but today I did not get the data into MIDAS. 
I was hopping around a bit with the commits and everything was at least working until: 3921016ce6d3444e6c647cbc7840e73816564c78.

Thanks,
Marius
Entry  02 Dec 2021, Alexey Kalinin, Bug Report, some frontend kicked by cm_periodic_tasks 
Hello,
We have a small experiment with MIDAS based DAQ.
Status page shows :
ES	ESFrontend@192.168.0.37	207	0.2	0.000
Trigger06	Sample Frontend06@192.168.0.37	1.297M	0.3	0.000
Trigger01	Sample Frontend01@192.168.0.37	1.297M	0.3	0.000
Trigger16	Sample Frontend16@192.168.0.37	1.297M	0.3	0.000
Trigger38	Sample Frontend38@192.168.0.37	1.297M	0.3	0.000
Trigger37	Sample Frontend37@192.168.0.37	1.297M	0.3	0.000
Trigger03	Sample Frontend03@192.168.0.38	1.297M	0.3	0.000
Trigger07	Sample Frontend07@192.168.0.38	1.297M	0.3	0.000
Trigger04	Sample Frontend04@192.168.0.38	59898	0.0	0.000
Trigger08	Sample Frontend08@192.168.0.38	59898	0.0	0.000
Trigger17	Sample Frontend17@192.168.0.38	59898	0.0	0.000


And SYSTEM buffers page shows:
ESFrontend	1968	198	47520	0	0x00000000	0		
193 ms
Sample Frontend06	1332547	1330826	379729872	0	0x00000000	
0		1.1 sec
Sample Frontend16	1332542	1330839	361988208	0	0x00000000	
0		94 ms
Sample Frontend37	1332530	1330841	337798408	0	0x00000000	
0		1.1 sec
Sample Frontend01	1332543	1330829	467136688	0	0x00000000	
0		34 ms
Sample Frontend38	1332528	1330830	291453608	0	0x00000000	
0		1.1 sec
Sample Frontend04	63254	61467	20882584	0	0x00000000	
0		208 ms
Sample Frontend08	63262	61476	27904056	0	0x00000000	
0		205 ms
Sample Frontend17	63271	61473	20433840	0	0x00000000	
0		213 ms
Sample Frontend03	1332549	1330818	386821728	0	0x00000000	
0		82 ms
Sample Frontend07	1332554	1330821	462210896	0	0x00000000	
0		37 ms
Logger	968742	0w+9500418r	0w+2718405736r	0	0x00000000	0	
GET_ALL Used 0 bytes 0.0%	303 ms
rootana	254561	0w+29856958r	0w+8718288352r	0	0x00000000	0		
762 ms


The problem is that eventually some of frontend closed with message 
:19:22:31.834 2021/12/02 [rootana,INFO] Client 'Sample Frontend38' on buffer 
'SYSMSG' removed by cm_periodic_tasks because process pid 9789 does not exist

in the meantime mserver loggging :
mserver started interactively
mserver will listen on TCP port 1175
double free or corruption (!prev)
double free or corruption (!prev)
free(): invalid next size (normal)
double free or corruption (!prev)


I can find some correlation between number of events/event size produced by 
frontend, cause its failed when its become big enough. 

frontend scheme is like this:

poll event time set to 0;

poll_event{
//if buffer not transferred return (continue cutting the main buffer)
//read main buffer from hardware
//buffer not transfered
}

read event{
// cut the main buffer to subevents (cut one event from main buffer) return;
//if (last subevent) {buffer transfered ;return}
}

What is strange to me that 2 frontends (1 per remote pc) causing this.

Also, I'm executing one FEcode with -i # flag , put setting eventid in 
frontend_init , and using SYSTEM buffer for all.

Is there something I'm missing?
Thanks. 
A.
Entry  19 Nov 2021, Jacob Thorne, Forum, Sequencer error with ODB Inc 
Hi,

I am having problems with the midas sequencer, here is my code:

1  COMMENT "Example to move a Standa stage"
2  RUNDESCRIPTION "Example movement sequence - each run is one position of a single stage
3  
4  PARAM numRuns
5  PARAM sequenceNumber
6  PARAM RunNum
7  
8  PARAM positionT2
9  PARAM deltapositionT2
10 
11 ODBSet "/Runinfo/Run number", $RunNum
12 ODBSet "/Runinfo/Sequence number", $sequenceNumber
13 
14 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Type of Measurement", 2
15 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Number of Time Bins", 10
16 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Number of Sweeps", 1
17 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Dwell Time", 100000
18 
19 ODBSet "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Device Driver/Set Position", $positionT2
20 
21 LOOP $numRuns
22  WAIT ODBvalue, "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Ready", ==, 1
23  TRANSITION START
24  WAIT ODBvalue, "/Equipment/Neutron Detector/Statistics/Events sent", >=, 1
25  WAIT ODBvalue, "/Runinfo/State", ==, 1
26  WAIT ODBvalue, "/Runinfo/Transition in progress", ==, 0
27  TRANSITION STOP
28  ODBInc "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Device Driver/Set Position", $deltapositionT2
29 
30 ENDLOOP
31 
32 ODBSet "/Runinfo/Sequence number", 0

The issue comes with line 28, the ODBInc does not work, regardless of what number I put I get the following error:

[Sequencer,ERROR] [odb.cxx:7046:db_set_data_index1,ERROR] "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Device Driver/Set Position" invalid element data size 32, expected 4

I don't see why this should happen, the format is correct and the number that I input is an int.

Sorry if this is a basic question.

Jacob
    Reply  02 Dec 2021, Stefan Ritt, Forum, Sequencer error with ODB Inc 
Thanks for reporting that bug. Indeed there was a problem in the sequencer code which I fixed now. Please try the updated develop branch.

Stefan
Entry  01 Dec 2021, Lars Martin, Bug Report, Off-by-one in sequencer documentation 
The documentation for the sequencer loop says:

<quote>
LOOP [name ,] n ... ENDLOOP	To execute a loop n times. For infinite loops, "infinite" 
can be specified as n. Optionally, the loop variable running from 0...(n-1) can be accessed 
inside the loop via $name.
</quote>

In fact the loop variable runs from 1...n, as can be seen by running this exciting 
sequencer code:

1 COMMENT "Figuring out MSL"
2 
3 LOOP n,4
4   MESSAGE $n,1
5 ENDLOOP
    Reply  02 Dec 2021, Stefan Ritt, Bug Report, Off-by-one in sequencer documentation 
> The documentation for the sequencer loop says:
> 
> <quote>
> LOOP [name ,] n ... ENDLOOP	To execute a loop n times. For infinite loops, "infinite" 
> can be specified as n. Optionally, the loop variable running from 0...(n-1) can be accessed 
> inside the loop via $name.
> </quote>
> 
> In fact the loop variable runs from 1...n, as can be seen by running this exciting 
> sequencer code:
> 
> 1 COMMENT "Figuring out MSL"
> 2 
> 3 LOOP n,4
> 4   MESSAGE $n,1
> 5 ENDLOOP

Indeed you're right. The loop variable runs from 1...n. I fixed that in the documentation.

Stefan
Entry  09 Nov 2021, Hunter Lowe, Forum, MityCAMAC Login  
Hello all,

I've recently acquired a MityCAMAC system that was built at TRIUMF and I'm 
having issues accessing it over ethernet.

The system: Ubuntu VM inside Windows 10 machine.

I've tried reconfiguring the network settings for the VM but nmap and arp/ip 
commands have yielded me no results in finding the crate controller. 

I was getting help from Pierre Amaudruz but I think he is now busy for some 
time. I have the mac address of the crate controller and its name. The 
controller seems to initialize fine inside of the CAMAC crate. The windows side 
of the workstation also tells me that an unknown network is in fact connected.

I suspect I either need to do something with an ssh key (which I thought we 
accomplished but maybe not), or perhaps the domain name in the controller needs 
to be changed. 

If anybody has experience working with MityARM I would appreciate any advice I 
could get.

Best,
Hunter Lowe
UNBC Graduate Physics
    Reply  11 Nov 2021, Thomas Lindner, Forum, MityCAMAC Login  
Hi Hunter 

This sounds like a Triumf specific problem; 
not a MIDAS problem.  Please email me directly 
and we can try to solve this problem.

Thomas Lindner
TRIUMF DAQ


 Hello all,
> 
> I've recently acquired a MityCAMAC system 
that was built at TRIUMF and I'm 
> having issues accessing it over ethernet.
> 
> The system: Ubuntu VM inside Windows 10 
machine.
> 
> I've tried reconfiguring the network 
settings for the VM but nmap and arp/ip 
> commands have yielded me no results in 
finding the crate controller. 
> 
> I was getting help from Pierre Amaudruz but 
I think he is now busy for some 
> time. I have the mac address of the crate 
controller and its name. The 
> controller seems to initialize fine inside 
of the CAMAC crate. The windows side 
> of the workstation also tells me that an 
unknown network is in fact connected.
> 
> I suspect I either need to do something with 
an ssh key (which I thought we 
> accomplished but maybe not), or perhaps the 
domain name in the controller needs 
> to be changed. 
> 
> If anybody has experience working with 
MityARM I would appreciate any advice I 
> could get.
> 
> Best,
> Hunter Lowe
> UNBC Graduate Physics
Entry  09 Nov 2021, Francesco Renga, Forum, Issue in data writing speed 
Dear all,
       I've a frontend writing a quite big bunch of data into a MIDAS bank (16bit output from a 4MP photo camera). 
I'm experiencing a writing speed problem that I don't understand. When the photo camera is triggered at a low rate (< 2 Hz) 
writing into the bank takes a very short time for each event (indeed, what I measure is the time to write and go back 
into the polling function). If I increase the rate to 4 Hz, I see that writing the first two events takes a sort time, 
but the third event takes a very long time (hundreds of ms), then again the fourth and fifth events are very fast, and 
the sixth is very slow. If I further increase the rate, every other event is very slow. The problem is not in the readout 
of the camera, because if I just remove the bank writing and keep the camera readout, the problem disappears. Can you 
explain this behavior? Is there any way to improve it?

Below you can also find the code I use to copy the data from the camera buffer into the bank. If you have any suggestion 
to improve it, it would be really appreciated.

Thank you very much,
          Francesco



  const char* pSrc = (const char*)bufframe.buf;

  for(int y = 0; y < bufframe.height; y++ ){

    //Copy one row
    const unsigned short* pDst = (const unsigned short*)pSrc;

    //go through the row
    for(int x = 0; x < bufframe.width; x++ ){

      WORD tmpData = *pDst++; 

      *pdata++ = tmpData;

    }

    pSrc += bufframe.rowbytes;

  }
 
    Reply  10 Nov 2021, Stefan Ritt, Forum, Issue in data writing speed 
Midas uses various buffers (in the frontend, at the server side before the SYSTEM buffer, the SYSTEM buffer itself, on the 
logger before writing to disk. All these buffers are in RAM and have fast access, so you can fill them pretty quickly. When
they are full, the logger writes to disk, which is slower. So I believe at 2 Hz your disk can keep up with your writing 
speed, but at 4 Hz (2x8MBx4=32 MB/sec) your disk starts slowing down the writing process. Now 32MB/s is pretty slow for
a disk, so I presume you have turned compression on which takes quite some time.

To verify this, disable logging. The disable compression and keep logging. Then report back here again.

> Dear all,
>        I've a frontend writing a quite big bunch of data into a MIDAS bank (16bit output from a 4MP photo camera). 
> I'm experiencing a writing speed problem that I don't understand. When the photo camera is triggered at a low rate (< 2 Hz) 
> writing into the bank takes a very short time for each event (indeed, what I measure is the time to write and go back 
> into the polling function). If I increase the rate to 4 Hz, I see that writing the first two events takes a sort time, 
> but the third event takes a very long time (hundreds of ms), then again the fourth and fifth events are very fast, and 
> the sixth is very slow. If I further increase the rate, every other event is very slow. The problem is not in the readout 
> of the camera, because if I just remove the bank writing and keep the camera readout, the problem disappears. Can you 
> explain this behavior? Is there any way to improve it?
> 
> Below you can also find the code I use to copy the data from the camera buffer into the bank. If you have any suggestion 
> to improve it, it would be really appreciated.
> 
> Thank you very much,
>           Francesco
> 
> 
> 
>   const char* pSrc = (const char*)bufframe.buf;
> 
>   for(int y = 0; y < bufframe.height; y++ ){
> 
>     //Copy one row
>     const unsigned short* pDst = (const unsigned short*)pSrc;
> 
>     //go through the row
>     for(int x = 0; x < bufframe.width; x++ ){
> 
>       WORD tmpData = *pDst++; 
> 
>       *pdata++ = tmpData;
> 
>     }
> 
>     pSrc += bufframe.rowbytes;
> 
>   }
>  
Entry  29 Oct 2021, Kushal Kapoor, Bug Report, Unknown Error 319 from client Screenshot_2021-10-26_114015.png
I’m trying to run MIDAS using a frontend code/client named “fetiglab”. Run stops 
after 2/3sec with an error saying “Unknown error 319 from client “fetiglab” on 
localhost.

Frontend code compiled without any errors and MIDAS reads the frontend 
successfully, this only comes when I start the new run on MIDAS, here are a few 
more details from the terminal-

11:46:32 [fetiglab,ERROR] [odb.cxx:11268:db_get_record,ERROR] struct size 
mismatch for "/" (expected size: 1, size in ODB: 41920)

11:46:32 [Logger,INFO] Deleting previous file 
"/home/rcmp/online3/run00621_000.root"

11:46:32 [ODBEdit,ERROR] [midas.cxx:5073:cm_transition,ERROR] transition START 
aborted: client "fetiglab" returned status 319

11:46:32 [ODBEdit,ERROR] [midas.cxx:5246:cm_transition,ERROR] Could not start a 
run: cm_transition() status 319, message 'Unknown error 319 from client 
'fetiglab' on host "localhost"'

TR_STARTABORT transition: cleanup after failure to start a run

&#8204;

I’ve also enclosed a screenshot for the same, any suggestions would be highly 
appreciated. thanks
Entry  29 Oct 2021, Frederik Wauters, Bug Report, midas::odb::iterator + operator 
I have 16 array odb key

{"FIR Energy", {
            {"Energy Gap Value", std::array<uint32_t,16>(10) },

I can get the maximum of this array like 


uint32_t max_value = *std::max_element(values.begin(),values.end());

but when I need the maximum of a sub range

uint32_t max_value = *std::max_element(values.begin(),values.begin()+4);

I get

/home/labor/new_daq/frontends/SIS3316Module.cpp:584:62: error: no match for ‘operator+’ (operand types are ‘midas::odb::iterator’ and ‘int’)
  584 |   max_value = *std::max_element(values.begin(),values.begin()+4);
      |                                                ~~~~~~~~~~~~~~^~
      |                                                            |  |
      |                                                            |  int
      |               

As the + operator is overloaded for midas::odb::iterator, I was expected this to work.

(and yes, I can find the max element by accessing the elements on by one)
    Reply  29 Oct 2021, Frederik Wauters, Bug Report, midas::odb::iterator + operator | work around 
ok, so retrieving as a std::array (as it was defined) does not work

    std::array<uint32_t,16> avalues = settings["FIR Energy"]["Energy Gap Value"];

but retrieving as an std::vector does, and then I have a standard c++ iterator which I can use in std stuff

    std::vector<uint32_t> values = settings["FIR Energy"]["Energy Gap Value"];

> I have 16 array odb key
> 
> {"FIR Energy", {
>             {"Energy Gap Value", std::array<uint32_t,16>(10) },
> 
> I can get the maximum of this array like 
> 
> 
> uint32_t max_value = *std::max_element(values.begin(),values.end());
> 
> but when I need the maximum of a sub range
> 
> uint32_t max_value = *std::max_element(values.begin(),values.begin()+4);
> 
> I get
> 
> /home/labor/new_daq/frontends/SIS3316Module.cpp:584:62: error: no match for ‘operator+’ (operand types are ‘midas::odb::iterator’ and ‘int’)
>   584 |   max_value = *std::max_element(values.begin(),values.begin()+4);
>       |                                                ~~~~~~~~~~~~~~^~
>       |                                                            |  |
>       |                                                            |  int
>       |               
> 
> As the + operator is overloaded for midas::odb::iterator, I was expected this to work.
> 
> (and yes, I can find the max element by accessing the elements on by one)
Entry  25 Oct 2021, Francesco Renga, Forum, Logger crash 
Hello,
     I'm experiencing crashes of the mlogger program on the time scale of a couple 
of days. The only messages from MIDAS are:

05:34:47.336 2021/10/24 [mhttpd,INFO] Client 'Logger' (PID 14281) on database 
'ODB' removed by db_cleanup called by cm_periodic_tasks (idle 10.2s,TO 10s)
05:34:47.335 2021/10/24 [mhttpd,INFO] Client 'Logger' on buffer 'SYSMSG' removed 
by cm_periodic_tasks (idle 10.2s, timeout 10s)

Any suggestion to further investigate this issue?

Thank you very much,
         Francesco
    Reply  25 Oct 2021, Stefan Ritt, Forum, Logger crash 
The short term solution would be to increase the logger timeout in the ODB under

/Programs/Logger/Watchdog timeout

and set it to 6000 (one minute). But that is curing just the symptoms. It would be 
interesting to understand the cause of this error. Probably the logger takes more than 10 
seconds to start or stop the run. The reason could be that the history grow too big (what 
we have right now in MEG II), or some disk problems. But that needs detailed debugging on 
the logger side.

Stefan
Entry  22 Oct 2021, Francesco Renga, Forum, mhttpd error 
Dear all,
      I am trying to make the MIDAS web server for my DAQ project accessible from other machines. In the ODB, I activated the necessary flags:

[local:CYGNUS_RD:S]/WebServer>ls
Enable localhost port           y
localhost port                  8080
localhost port passwords        n
Enable insecure port            y
insecure port                   8081
insecure port passwords         y
insecure port host list         y
Enable https port               y
https port                      8443
https port passwords            y
https port host list            n
Host list
                                localhost
Enable IPv6                     y
Proxy                           
mime.types

Following the instructions on the Wiki I enabled the SSL support. When running mhttpd, I get these messages:

Mongoose web server will use HTTP Digest authentication with realm "CYGNUS_RD" and password file "/home/cygno/DAQ/online/htpasswd.txt"
Mongoose web server will use the hostlist, connections will be accepted only from: localhost
Mongoose web server listening on http address "localhost:8080", passwords OFF, hostlist OFF
Mongoose web server listening on http address "[::1]:8080", passwords OFF, hostlist OFF
Mongoose web server listening on http address "8081", passwords enabled, hostlist enabled
[mhttpd,ERROR] [mhttpd.cxx:19166:mongoose_listen,ERROR] Cannot mg_bind address "[::0]:8081"
Mongoose web server will use https certificate file "/home/cygno/DAQ/online/ssl_cert.pem"
Mongoose web server listening on https address "8443", passwords enabled, hostlist OFF
[mhttpd,ERROR] [mhttpd.cxx:19166:mongoose_listen,ERROR] Cannot mg_bind address "[::0]:8443"

and the server is not accessible from other machines. Any suggestion to solve or better investigate this problem?

Thank you very much,
         Francesco
    Reply  22 Oct 2021, Stefan Ritt, Forum, mhttpd error 
> Enable IPv6                     y

Probably the IPv6 problem, see here elog:2269

I asked to turn off IPv6 by default, or at least mention this in the documentation,
but unfortunately nothing happened.

Stefan
       Reply  25 Oct 2021, Francesco Renga, Forum, mhttpd error 
It worked, thank you very much!

Francesco

> > Enable IPv6                     y
> 
> Probably the IPv6 problem, see here elog:2269
> 
> I asked to turn off IPv6 by default, or at least mention this in the documentation,
> but unfortunately nothing happened.
> 
> Stefan
Entry  14 Oct 2021, Amy Roberts, Suggestion, Adding (or improving discoverability) of TID for odbset 
Creating an ODB key requires users to know the Type ID that are defined in 
https://bitbucket.org/tmidas/midas/src/develop/include/midas.h starting at line 320.

I can't find any information on the Midas Wiki about these values or how to find 
them.

Am I missing something obvious?  Is there a way to improve how to find these values?  
Or is this not the best way to interact with the ODB?
    Reply  15 Oct 2021, Stefan Ritt, Suggestion, Adding (or improving discoverability) of TID for odbset 
> Creating an ODB key requires users to know the Type ID that are defined in 
> https://bitbucket.org/tmidas/midas/src/develop/include/midas.h starting at line 320.
> 
> I can't find any information on the Midas Wiki about these values or how to find 
> them.
> 
> Am I missing something obvious?  Is there a way to improve how to find these values?  
> Or is this not the best way to interact with the ODB?

Well, you found them in midas.h, so where is the problem?

If you want a more detailed description, just look in the midas documentation (RTFM):

https://midas.triumf.ca/MidasWiki/index.php/Midas_Data_Types

If you want a more modern interface to the ODB without these data types, look here:

https://midas.triumf.ca/MidasWiki/index.php/Odbxx

Best regards,
Stefan
Entry  11 Oct 2021, Konstantin Olchanski, Forum, midas forum updated, moved 
The midas forum software (elogd) was updated to latest version and moved from our old server 
(ladd00.triumf.ca) to our new server (daq00.triumf.ca).

The following URLs should work:

https://daq00.triumf.ca/elog-midas/Midas/ (new URL)
https://midas.triumf.ca/elog/Midas/ (old URL, redirects to daq00)
https://midas.triumf.ca/forum (link from midas wiki)

The configuration on the old server ladd00.triumf.ca is quite tangled between
several virtual hosts and several DNS CNAMEs. I think I got all the redirects
correct and all old URLs and links in old emails & etc still work.

If you see something wrong, please reply to this message here or email me directly.

K.O.
Entry  11 Oct 2021, Konstantin Olchanski, Forum, test 
test, no email. K.O.
    Reply  11 Oct 2021, Konstantin Olchanski, Forum, test 
> test, no email. K.O.

test reply, no email. K.O.
       Reply  11 Oct 2021, Konstantin Olchanski, Forum, test image.png
> > test, no email. K.O.
> 
> test reply, no email. K.O.

test attachment, no email. K.O.
          Reply  11 Oct 2021, Konstantin Olchanski, Forum, test 
> > > test, no email. K.O.
> > 
> > test reply, no email. K.O.
> 
> test attachment, no email. K.O.

test email. K.O.
Entry  11 Oct 2021, Stefan Ritt, Info, Modification in the history logging system 
A requested change in the history logging system has been made today. Previously, history values were
logged with a maximum frequency (usually once per second) but also with a minimum frequency, meaning
that values were logged for example every 60 seconds, even if they did not change. This causes a problem.
If a frontend is inactive or crashed which produces variables to be logged, one cannot distinguish between
a crashed or inactive frontend program or a history value which simply did not change much over time.
The history system was designed from the beginning in a way that values are only logged when they actually
change. This design pattern was broken since about spring 2021, see for example this issue:

https://bitbucket.org/tmidas/midas/issues/305/log_history_periodic-doesnt-account-for

Today I modified the history code to fix this issue. History logging is now controlled by the value of 
common/Log history in the following way:

* Common/Log history = 0 means no history logging
* Common/Log history = 1 means log whenever the value changes in the ODB
* Common/Log history = N means log whenever the value changes in the ODB and 
  the previous write was more than N seconds ago

So most experiments should be happy with 0 or 1. Only experiments which have fluctuating values due to noisy 
sensors might benefit from a value larger than 1 to limit the history logging. Anyhow this is not the preferred 
way to limit history logging. This should be done by the front-end limiting the updates to the ODB. Most of the 
midas slow control drivers have a “threshold” value. Only if the input changes by more then the threshold are 
written to the ODB. This allows a per-channel “dead band” and not a per-event limit on history logging 
as ‘log history’ would do. In addition, the threshold reduces the write accesses to the ODB, although that is
only important for very large experiments.

Stefan
Entry  30 Sep 2021, Francesco Renga, Forum, OPC client within MIDAS 
Dear all,
     I need to integrate in my MIDAS project the communication with an OPC UA 
server. My plan is to develop an OPC UA client as a "device" in 
midas/drivers/device.

Two questions:

1) Is anybody aware of some similar effort for some other project, so that I can 
get some example?

2) What could be the more appropriate driver's class to be used? generic.cxx? 
multi.cxx?

Thank you for your help,
           Francesco
Entry  29 Sep 2021, Richard Longland, Bug Report, nstall clash between MIDAS 2020-08 and mscb 
Thank you, Stefan.

I found these instructions under
1) The changelog: https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12
2) Konstantin's elog announcements (e.g. https://midas.triumf.ca/elog/Midas/2089)

I do see reference to updating the submodules under the TRIUMF install 
instructions 
(https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF#Inst
all_MIDAS) although perhaps it can be clarified.

Cheers,
Richard
    Reply  29 Sep 2021, Stefan Ritt, Bug Report, nstall clash between MIDAS 2020-08 and mscb 
> Thank you, Stefan.
> 
> I found these instructions under
> 1) The changelog: https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12
> 2) Konstantin's elog announcements (e.g. https://midas.triumf.ca/elog/Midas/2089)
> 
> I do see reference to updating the submodules under the TRIUMF install 
> instructions 
> (https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF#Inst
> all_MIDAS) although perhaps it can be clarified.
> 
> Cheers,
> Richard

Hi Richard,

I updated the documentation at

https://midas.triumf.ca/MidasWiki/index.php/Changelog#Updating_midas

by putting the submodule update command everywhere.

Best,
Stefan
Entry  28 Sep 2021, Richard Longland, Bug Report, Install clash between MIDAS 2020-08 and mscb 
All,

I am performing a fresh install of MIDAS on an Ubuntu linux box. I follow the 
usual installation procedure:

1) git clone https://bitbucket.org/tmidas/midas --recursive
2) cd midas
3) git checkout release/midas-2020-08
4) mkdir build
5) cd build
6) cmake ..
7) make

Step 3 warns me that 
"warning: unable to rmdir 'manalyzer': Directory not empty" and 
"warning: unable to rmdir 'midasio': Directory not empty"

Step 7 fails.
Compilation fails with an mhttp error related to mscb:
mhttpd.cxx:8224:59: error: too few arguments to function 'int mscb_ping(int, 
short unsigned int, int, int)'
 8224 |             status = mscb_ping(fd, (unsigned short) ind, 1);


I was able to get around this by rolling mscb back to some old version (commit 
74468dd), but am extremely nervous about mix-and-matching the code this way.

Any advice would be greatly appreciated.

Cheers,
Richard 
    Reply  28 Sep 2021, Stefan Ritt, Bug Report, Install clash between MIDAS 2020-08 and mscb 
> 1) git clone https://bitbucket.org/tmidas/midas --recursive
> 2) cd midas
> 3) git checkout release/midas-2020-08
> 4) mkdir build
> 5) cd build
> 6) cmake ..
> 7) make

When you do step 3), you get

~/tmp/midas$ git checkout release/midas-2020-08
warning: unable to rmdir 'manalyzer': Directory not empty
warning: unable to rmdir 'midasio': Directory not empty
M	mjson
M	mscb
M	mvodb
M	mxml

The 'M' in front of the submodules like mscb tell you that you
have an older version of midas (namely midas-2020-08), but the 
*current* submodules, which won't match. So you have to roll back
also the submodules with:

3.5) git submodule update --recursive

This fetched those versions of the submodules which match the
midas version 2020-08. See here for details: 

https://git-scm.com/book/en/v2/Git-Tools-Submodules

From where did you get the command

git checkout release/xxxx ???

If you tell me the location of that documentation, I will take
care that it will be amended with the command

git submodule update --recursive

Best,
Stefan
Entry  19 Sep 2021, Stefan Ritt, Bug Fix, Chat working again Screenshot_2021-09-19_at_21.27.19_.png
Not sure how many people are using it, but the Chat facility in midas was broken 
for some time now and got fixed today again.

Just for your information: Chat can be used like WhatsApp & Co, and connects all 
people who access a midas experiment through their browser. It's good to 
communicate between shift crew members located at different places. One advantage 
is that the chat messages can get 'spoken' by the text-to-speech engine of your 
browser, so it can be used to "wake up" shifters. Can be configured through the 
"Config" page.

Stefan
Entry  06 Sep 2021, Andreas Suter, Forum, mhttpd crash 
midas version used: midas-2019-05-cxx-1461-g906be8b

I find in the systemd log every couple of days/weeks the following error message related to the mhttpd:

[mhttpd,ERROR] [mhttpd.cxx:18886:on_work_complete,ERROR] Should not send response to request from socket 28 to socket 26, abort!

with various socket numbers of course.

Can anybody hint me what is going wrong here?

The bad thing on the crash is, that sometimes it is leading to a "chain-reaction" killing multiple midas frontends, which essentially stop the experiment.

Help would be very much appreciated!

Andreas 
    Reply  06 Sep 2021, Konstantin Olchanski, Forum, mhttpd crash 
> [mhttpd,ERROR] [mhttpd.cxx:18886:on_work_complete,ERROR] Should not send response to request from socket 28 to socket 26, abort!
> Can anybody hint me what is going wrong here?
> The bad thing on the crash is, that sometimes it is leading to a "chain-reaction" killing multiple midas frontends, which essentially stop the experiment.

This is my code. I am the culprit. I had a bit of discussion about this with Stefan.

Bottom line is something is rotten in the multithreading code inside mhttpd and under conditions unknown,
it sends the wrong data into the wrong socket. This causes midas web pages to be really confused (RPC replies
processed as CSS file, HTML code processed at RPC replies, a mess), this wrong data is cached by the browser,
so restarting mhttpd does not fix the web pages. So a mess.

I find this is impossible to replicate, and so cannot debug it, cannot fix it. Best I was able to do
is to add a check for socket numbers, and thankfully it catches the condition before web browser caches
become poisoned. So, broken web pages replaced by mhttpd crash.

This situation reinforces my opinion that multi-threading and C++ classes "do not mix" (like H2 and O2 do not mix).
If you write a multithreaded C++ program and it works, good for you, if there is a malfunction, good luck with it,
C++ just does not have any built-in support for debugging typical multithreading problems. I think others have come
to the same conclusion and invented all these new "safe" programming languages, like Rust and Go.

Back to your troubles.

1) If you see a way to replicate this crash, or some way to reliably cause
the crash within 5-10 minutes after starting mhttpd, please let me know. I can work with that
and I wish to fix this problem very much.

2) My "wrong socket" check calls abort() to produce a core dump. In my experience these core dumps
are useless for debugging the present problem. There is just no way to examine the state of each
thread and of each http request using gdb by hand.

3) this abort() causes linux to write a core dump, this takes a long time and I think it causes
other MIDAS program to stop, timeout and die. You can try to fix this by disabling core dumps (set "enable core dumps"
to "false" in ODB and set core dump size limit to 0), or change abort() to exit(). (You can also disable
the "wrong socket" check, but most likely you will not like the result).

4) run mhttpd inside a script: "while (1) { start mhttpd; sleep 1 sec; rinse, repeat; }" (run mhttpd without "-D", yes?)

In other news, the mongoose web server library have a new version available, they again changed their
multithreading scheme (I think it is an improvement). If I update mhttpd to this new version, it is very
likely the code with the "wrong socket" bug will be deleted. (with new bugs added to replace old bugs, of course).

K.O.
       Reply  07 Sep 2021, Andreas Suter, Forum, mhttpd crash 
Dear Konstantin,

thanks for the prompt response, this helps a lot!

> 1) If you see a way to replicate this crash, or some way to reliably cause
> the crash within 5-10 minutes after starting mhttpd, please let me know. I can work with that
> and I wish to fix this problem very much.

I wished I could! This happens 3-4 times per year only, so close to impossible to trigger.

> 2) My "wrong socket" check calls abort() to produce a core dump. In my experience these core dumps
> are useless for debugging the present problem. There is just no way to examine the state of each
> thread and of each http request using gdb by hand.
> 
> 3) this abort() causes linux to write a core dump, this takes a long time and I think it causes
> other MIDAS program to stop, timeout and die. You can try to fix this by disabling core dumps (set "enable core dumps"
> to "false" in ODB and set core dump size limit to 0), or change abort() to exit(). (You can also disable
> the "wrong socket" check, but most likely you will not like the result).
> 

I changed now to exit() rather than abort on the production machine. Perhaps this should be the default?

Andreas
          Reply  17 Sep 2021, Stefan Ritt, Forum, mhttpd crash mhttpdScreenshot_2021-09-17_at_21.11.15_.png
To limit the impact of the numerous crashes of mhttpd, I installed the monit tool at MEG at PSI 
(https://en.wikipedia.org/wiki/Monit). It monitors mhttpd, and if it cannot connect to it for a certain
time, it kills the process and restarts it. This covers endless loops, simple crashes (caused by the
known multi-threading issue in mongoose), and also cases where mhttpd develops a memory leak and becomes
unresponsive. 

To configure monit for mhttpd, first install the package, make sure the daemon gets started automatically
after reboot (typically "sysemctl enable monit"), and put the attached file into

/etc/monit.d/mhttpd

You have to adjust the <path-to-midas> according to your midas installation, and probably also the port
under which mhttpd is listening (8082 in my case). Put 

set daemon 10

into /etc/monitrc if you want monit to check mhttpd every 10 seconds (default is 30 seconds). Then, every
10 seconds monit request "midas.css" from mhttpd, and if it cannot obtain it after 30 seconds, it kills
mhttpd and restarts it.

Loading long history plots taking more than 30 seconds should probably not be an issue since mhttpd is 
multi-threaded, but I haven't tested this in detail.

Attached below is a typical status page produced by monit, which has its own built-in web server (normally
listening at port 2812, accessible only from localhost by default).

I hope this helps some of you.

Stefan
ELOG V3.1.4-2e1708b5