Back Midas Rome Roody Rootana
  Midas DAQ System, Page 1 of 43  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
Entry  22 Nov 2023, Pavel Murat, Forum, run number from an external (*SQL) db? 
Dear MIDAQ developers,

I wonder if there is a non-intrusive way to have an external (wrt MIDAS)*SQL database 
serving as a primary source of the run number information for a MIDAS-based DAQ system? 
- like a plugin with a getNextRunNumber() function, for example, or a special client?

Here is the use case: 

- multiple subdetectors are taking test data during early commissioning 
- a postgres db is a single sorce of run numbers.
- test runs taken by different subsystems are assigned different [unique] run numbers and 
  the data taken by the subsystem are identified not by the run number/dataset name , but 
  by the run type, different for different susbsystems.

-- many thanks, regards, Pasha
    Reply  22 Nov 2023, Ben Smith, Forum, run number from an external (*SQL) db? 
> I wonder if there is a non-intrusive way to have an external (wrt MIDAS)*SQL database 
> serving as a primary source of the run number information for a MIDAS-based DAQ system? 
> - like a plugin with a getNextRunNumber() function, for example, or a special client?

One of my experiments has special rules for run numbering as well. I created a client that registers a begin-of-run transition handler with sequence 1 (so it's the first client to handle the begin-of-run transition). That client updates "/Runinfo/Run number" in the ODB. 

This mostly works. mlogger will create .mid files based on the new run number, the ODB dumps within those files show the new run number etc.

But there are 2 quirks. Let's say your client changed the number from 11 to 400. The message log will say "Run #11 started" and "Run #400 stopped". And the history system will record the start/stop times the same way. That only matters for when you're viewing history plots on the webpage and zoom in far enough to see the run transitions (represented by green and red vertical dashed lines) - the green line will be labelled 11 and the red line 400.

Depending on the exact logic you need, you may be able to avoid these quirks by also recomputing the run number before the user even tries to start a run (e.g. after the end of the previous run, or when the user changes an important setting in the ODB). If you're changing the run number between runs, make sure to set it to "desired number - 1", as midas will increment the run number automatically before handling the next start run request.
    Reply  22 Nov 2023, Stefan Ritt, Forum, run number from an external (*SQL) db? 
> - multiple subdetectors are taking test data during early commissioning 
> - a postgres db is a single sorce of run numbers.
> - test runs taken by different subsystems are assigned different [unique] run numbers and 
>   the data taken by the subsystem are identified not by the run number/dataset name , but 
>   by the run type, different for different susbsystems.

For that purpose I would not "mis-use" run numbers. Run number are meant to be incremented 
sequentially, like if you have a time-stamp in seconds since 1.1.1970 (Unix time). Intead, I 
would add additional attributes under /Experiment/Run Parameters like "Subsystem type", "Run 
mode (production/commissioning)" etc. You have much more freedom in choosing any number of 
attributes there. Then, send this attributes to your postgred db via "/Logger/Runlog/SQL/Links 
BOR". Then you can query your database to give you all runs of a certain subtype or mode.

See https://daq00.triumf.ca/MidasWiki/index.php/Logging_to_a_mySQL_database

Stefan
Entry  21 Nov 2023, Ivo Schulthess, Forum, Polled frontend writes data to ODB without RO_ODB 
Good morning, 

In our setup, we have a neutron detector that creates up to 16 MB of polled (EQ_POLLED) data in one event (event limit = 1) that we do not want to have saved into the ODB. Nevertheless, I cannot disable it. The equipment has only the read-on-flag RO_RUNNING, and the read-on value in the ODB is 1. The data are also not saved to the history. I also tried with a minimal example frontend with the same settings, but also those "data" get written to the ODB. For now, I increased the size of the ODB to 40 MB (50% for keys and 50% for data is automatic), but in principle, I do not want it to be saved to the ODB at all. Is there something I am missing?

Thanks in advance for your advice. 

Cheers,
Ivo
    Reply  22 Nov 2023, Stefan Ritt, Forum, Polled frontend writes data to ODB without RO_ODB 
I cannot confirm that. I just tried myself with examples/experiment/frontend.cxx, removed the RO_ODB, and the trigger events did NOT get copied to the ODB.

Actually you can debug the code yourself. The relevant line is in mfe.cxx:2075:

   /* send event to ODB */
   if (pevent->data_size && (eq_info->read_on & RO_ODB)) {
      if (actual_millitime - eq->last_called > ODB_UPDATE_TIME) {
         eq->last_called = actual_millitime;
         update_odb(pevent, eq->hkey_variables, eq->format);
         eq->odb_out++;
      }
   }

so if read_on is equal 1, the function update_odb should never be called.

So the problem must be on your side.

Best,
Stefan
Entry  13 Nov 2023, Ivo Schulthess, Forum, mlogger does not HAVE_ROOT 
Good evening, 

I am setting up Midas (v2.1) for a new experiment. We want to save the data in the ROOT format. We installed ROOT from source (v6.28/06), and ROOTSYS is set. When we compile Midas, it says that it found ROOT. We set up a second logger channel where we set the filename to run%05d.root, the format to ROOT, and the output to ROOT. Nevertheless, when starting a run, the logger writes the error that "channel '1' requested ROOT output, but mlogger is built without HAVE_ROOT". From the CMake file, I would assume that it is set automatically if ROOT is found. Do you have any idea why the mlogger does not find ROOT or save the data in the ROOT format?

Thanks in advance for your ideas, input, and help. 

Cheers,
Ivo
    Reply  13 Nov 2023, Konstantin Olchanski, Forum, mlogger does not HAVE_ROOT 
> I am setting up Midas (v2.1) for a new experiment. We want to save the data in the ROOT format. We installed ROOT from source (v6.28/06), and ROOTSYS is set. When we compile Midas, it says that it found ROOT. We set up a second logger channel where we set the filename to run%05d.root, the format to ROOT, and the output to ROOT. Nevertheless, when starting a run, the logger writes the error that "channel '1' requested ROOT output, but mlogger is built without HAVE_ROOT". From the CMake file, I would assume that it is set automatically if ROOT is found. Do you have any idea why the mlogger does not find ROOT or save the data in the ROOT format?

when you build midas using "make cmake", it prints information about packages that it finds (or does not). please post this here. it would be even more helpful if you post the whole output of "make cmake" (make cmake >& make.log, post make.log here as attachment).

historically, this problem has been a major annoyance over the years, mlogger would not find ROOT when needed, will find the wrong ROOT when not needed or ROOT at run time will be different from ROOT at build time. "cmake" has been of no help in improving on this, only made all debugging more difficult.

K.O.
    Reply  13 Nov 2023, Stefan Ritt, Forum, mlogger does not HAVE_ROOT 
When you do "cmake .." in the build directory, you will see

-- MIDAS: Found ROOT version xxx in yyy

which will tell you that ROOT has been found. Then you should check if it has been turned off manually by doing\

ccmake ..

in the build directory. You will then see all the control variables. Make sure NO_ROOT is turned OFF (meaning ROOT is enabled).

Finally, make sure you start "rmlogger" and not "mlogger". Only "rmlogger" contains the ROOT binding.

Stefan
       Reply  14 Nov 2023, Konstantin Olchanski, Forum, mlogger does not HAVE_ROOT 
> Finally, make sure you start "rmlogger" and not "mlogger". Only "rmlogger" contains the ROOT binding.

Stefan is right. I forgot this. As solution to our troubles, mlogger is built without root support. use rmlogger instead.

K.O.
          Reply  14 Nov 2023, Ivo Schulthess, Forum, mlogger does not HAVE_ROOT 
> Stefan is right. I forgot this. As solution to our troubles, mlogger is built without root support. use rmlogger instead.
> 
> K.O.

Thanks, Stefan and Konstantin, for your feedback. 

So I checked the cmake file, and already the existence of the rmlogger shows that HAVE_ROOT was set. It was really only the problem of not being aware of rmlogger. This now works, and it produces root files that are readable. 

However, we encountered a new problem not that it does not find a bank that is produced by a multi-threaded slow-control frontend. The logger triggers the error "mlogger.cxx:3328:root_book_bank,ERROR] received unknown bank 'MSRD' in event #8". After this, we get a segmentation violation, but I guess this is then coming from the error. If we run only the polled FE, it works fine. If we run the polled and the multi-threaded FE with only the logger saving mid files, it works fine as well. Are you aware of issues with multi-threaded slow-control frontends and saving their banks in root format?

Cheers,
Ivo
             Reply  14 Nov 2023, Stefan Ritt, Forum, mlogger does not HAVE_ROOT 
No, I'm not aware of this problem, but I suspect that your events somehow got corrupted. You can try the mdump utility
or the "Event Dump" web page to peek into your events, maybe you see an issue there. To give you more detailed information,
I would have to reproduce your problem, which is probably hard without your hardware.

Stefan
                Reply  15 Nov 2023, Ivo Schulthess, Forum, mlogger does not HAVE_ROOT 
> No, I'm not aware of this problem, but I suspect that your events somehow got corrupted. You can try the mdump utility
> or the "Event Dump" web page to peek into your events, maybe you see an issue there. To give you more detailed information,
> I would have to reproduce your problem, which is probably hard without your hardware.
> 
> Stefan

Hi Stefan,

So I did a few things:
- I checked with mdump online, the data stream looks good, and I can see the bank name properly
- I checked with mdump offline the .mid files, the banks are there, and the data look good
- I removed the creating of the bank MSRD in the class driver. This stopped the writing of the data in the midas/root file but kept the stream to the history files. In principle, this is a quick and dirty fix because we still have all the data in the history files. Do you see any bigger problem with that solution?
- I tried to run the multi-threaded slow-control frontend with the generic class driver (generic.cxx) and the nulldev device driver (nulldev.cxx). This produces the DMND and MSRD bank with and also produces the error in the with the logger when trying to save in the root format (received unknown bank "DMND" in event #8). This means it is not related to the devices (maybe some other part of my user code of course). 

Cheers,
Ivo
Entry  23 Oct 2023, Francesco Renga, Forum, Device with inputs and outputs 
Dear all,
       I'm writing a very simple device driver starting from the nulldev.cxx 
example. 

I define an equipment as reported at the end of this message then, if all 
variables are Input variables, I define them with:

  mdevice device("myEquimpent", "Input", DF_INPUT | DF_MULTITHREAD, mydevice);
  device.define_var("Var1", 0.1);
  device.define_var("Var2", 0.1);
  ...

If all variables are output variables, I define them with:

  mdevice device("myEquipment", "Output", DF_OUTPUT | DF_MULTITHREAD, mydevice);
  device.define_var("Var1", 0.1);
  device.define_var("Var2", 0.1);

But I don't know what to do if I have mixed input and output variables in the same 
device. I think I can do:

  mdevice device_in("myEquipment", "Input", DF_INPUT | DF_MULTITHREAD, mydevice);
  device.define_var("Var1", 0.1);

  mdevice device_out("myEquipment", "Output", DF_OUTPUT | DF_MULTITHREAD, 
mydevice);
  device.define_var("Var2", 0.1);

but in this case, inside mydevice.cxx, I don't know how to distinguish Var1 and 
Var2, because they are both identified as channel 0.

Do you have any suggestion?

Thank you,
      Francesco




-------------------------------------------------------------------



   {"SourceMotor",                       /* equipment name */
    {7, 0,                       /* event ID, trigger mask */
     "SYSTEM",                  /* event buffer */
     EQ_SLOW,                   /* equipment type */
     0,                         /* event source */
     "MIDAS",                   /* format */
     TRUE,                      /* enabled */
     RO_ALWAYS,        /* read when running and on transitions */
     60000,                     /* read every 60 sec */
     0,                         /* stop run after this event limit */
     0,                         /* number of sub events */
     1,                         /* log history every event */
     "", "", ""} ,
    cd_multi_read,                 /* readout routine */
    cd_multi,                      /* class driver main routine */
   },
    Reply  24 Oct 2023, Stefan Ritt, Forum, Device with inputs and outputs 
The "multi" class driver takes care of that. It properly calls the SET and GET functions 
with the correct index. The code for that is in multi.cxx:105:

 device_driver(m_info->driver_input[i], CMD_GET,
               i - m_info->channel_offset_input[i],
               &m_info->var_input[i]);

The "channel_offset_input" and "channel_offset_output" store the first index of the 
channel in the overall ODB array (where inputs and outputs are staggered together), so 
the device_driver is always called with an index 0...n each for input and output, but 
with different commands CMD_GET and CMD_SET. You can take the mscbdev.cxx device driver 
as a working example.

Stefan
Entry  03 Oct 2023, Gennaro Tortone, Bug Report, Python midas.file_reader get_eor_odb_dump() 
Hi,

the method get_eor_odb_dump() of midas.file_reader does not contain an
initial jump_to_start() and this is a problem if the following access
pattern is used:

---

mfile = midas.file_reader.MidasFile("run00008.mid.lz4")
begin_odb = mfile.get_bor_odb_dump().data

# loop on data events
...

end_odb = mfile.get_eor_odb_dump().data

---

in this case the script ends with a RuntimeError (Unable to find EOR event) and
force user to do a manual mfile.jump_to_start() before mfile.get_eor_odb_dump();

Thanks,
Gennaro
    Reply  16 Oct 2023, Ben Smith, Bug Report, Python midas.file_reader get_eor_odb_dump() 
Thanks for the bug report Gennaro! 

I've fixed the code so that we'll now find the end-of-run ODB dump even if the user is already at the end of the file when they call get_eor_odb_dump().

Ben
Entry  06 Oct 2023, Stefan Ritt, Info, New equipment display 
Since a long time we tried to convert all "static" mhttpd-generated pages to 
dynamic JavaScript. With the new history panel editor we were almost there. Now I 
committed the last missing piece - the equipment display. This is shown when you 
click on some equipment on the main status page, or if you define some Alias with 

?cmd=eqtable&eq=Trigger

This is now a dynamic display, so the values change if they change in the ODB. The 
also flash briefly in yellow to visually highlight any change. In addition, these 
pages have a unit display, and some values can be edited. This is controlled by 
following settings:

/Equipment/<name>/settings/Unit <variable>

where <name> is the name of the equipment and <variable> the variable array name 
under /Equipment/<name>/Variables/<variable>

If the unit setting is not present, just a blank column is shown.

The other setting is

/Equipment/<name>/settings/Editable 

which may contain a comma-separated string of variables which can be editied on 
the equipment page.

In addition, one can save/export the equipment in a json file, which is the same 
as a ODB save of that branch. A load or import however only loads values into the 
ODB which are under the "Editable" setting above. This allows a simple editor for 
HV values etc.

Stefan
    Reply  09 Oct 2023, Stefan Ritt, Info, New equipment display Screenshot_2023-10-09_at_21.56.25.png
An additional functionality has been implemented on the equipment table:

You can now select several elements by Ctrl/Shift-Click on their names, then change the 
first one. After a confirmation dialog, all selected variables are then set to the new 
value. This way one can very easily change all values to zero etc.

Stefan
Entry  06 Oct 2023, Konstantin Olchanski, Info, new history panel editor 
the new history panel editor has been activated. it is meant to work the same as 
the old editor, with some improvements to the history variables selection page. 
this new version is written in html+javascript and it will be easier to improve, 
update and maintain compared to the old version written in c++. the old history 
panel editor is still usable and accessible by pressing the "edit in old editor" 
button. please report any problem, quirks and improvements in this thread or in 
the bitbucket bug reports. K.O.
Entry  06 Oct 2023, Konstantin Olchanski, Info, default midas history switched to "FILE" and "PerVariable" history 
We are very happy with the "FILE" implementation of MIDAS history and it is time 
to make it the default for new experiments. This history driver works best if 
"per variable" history is alos enabled. (SQL history already only works in "per-
variable" mode). commit 676051b3024965bd8a04da112965a141d5f61a39
K.O.
Entry  02 Aug 2023, Stefan Ritt, Bug Report, Error accessing history files log.txt
We sporadically (like once per few hours) have an error message when we access the 
history plots through mhttpd:

07:21:35.109 2023/08/03 [mhttpd,ERROR] 
[history_schema.cxx:2345:FileHistory::read_data,ERROR] Cannot read 
'/data2/history/mhf_1690890685_20230801_dc_hv.dat', read() errno 2 (No such file 
or directory)

When I log in to the machine, I properly see the file and also can access it

[meg@megon02 history]$ ls -l mhf_1690890685_20230801_dc_hv.dat
-rw-rw-r--. 1 meg meg 34176312 Aug  3 07:23 mhf_1690890685_20230801_dc_hv.dat

and I also can dump that file. 

When I try again with mhttpd, I properly see that file. 

Now in principle this is not a problem, but the error message is annoying, since this 
is the only error we get in 24 hours. I attached a 24h log to see what I mean. If this 
is an OS issue, I wonder if we should add code to retry the file access in case we get 
that error.

Anybody seen a similar thing?

Best,
Stefan
    Reply  09 Aug 2023, Konstantin Olchanski, Bug Report, Error accessing history files 
I confirm I see same on the agmini system. Two problems: (a) error message is wrong, it's a 
short read, not a read error (clue: read() syscall does not return "no such file"). (b) 
mlogger is supposed to write history in record-size blocks, read in the same record size 
blocks. UNIX file semantics require that both reader and writer see read() and write() as 
atomic, even on NFS, so mhttpd should never see partially written history records. I can 
debug this on the agmini system. Probably should.

Problem (a) fixed in commit bb423c8680cc67220312534403840442868f2b3b, if you update, you 
should see error messages about "short read" and the read sizes it reports are very 
interesting, please put them in the elog here.

K.O.


> We sporadically (like once per few hours) have an error message when we access the 
> history plots through mhttpd:
> 
> 07:21:35.109 2023/08/03 [mhttpd,ERROR] 
> [history_schema.cxx:2345:FileHistory::read_data,ERROR] Cannot read 
> '/data2/history/mhf_1690890685_20230801_dc_hv.dat', read() errno 2 (No such file 
> or directory)
> 
> When I log in to the machine, I properly see the file and also can access it
> 
> [meg@megon02 history]$ ls -l mhf_1690890685_20230801_dc_hv.dat
> -rw-rw-r--. 1 meg meg 34176312 Aug  3 07:23 mhf_1690890685_20230801_dc_hv.dat
> 
> and I also can dump that file. 
> 
> When I try again with mhttpd, I properly see that file. 
> 
> Now in principle this is not a problem, but the error message is annoying, since this 
> is the only error we get in 24 hours. I attached a 24h log to see what I mean. If this 
> is an OS issue, I wonder if we should add code to retry the file access in case we get 
> that error.
> 
> Anybody seen a similar thing?
> 
> Best,
> Stefan
       Reply  16 Aug 2023, Stefan Ritt, Bug Report, Error accessing history files 
Tonight we got another error of that type after the update:

04:17 - [mhttpd,ERROR] [history_schema.cxx:2913:FileHistory::read_data,ERROR] Cannot read 
'/data2/history/mhf_1692128214_20230815_gassystem.dat', read() errno 2 (No such file or directory)

This morning I looked at the file, and it was there:

[meg@megon02 history]$ ls -alg mhf_1692128214_20230815_gassystem.dat
-rw-rw-r--. 1 meg 4663228 Aug 17 08:50 mhf_1692128214_20230815_gassystem.dat
[meg@megon02 history]$


Stefan
          Reply  17 Aug 2023, Konstantin Olchanski, Bug Report, Error accessing history files 
Confirmed. The error message is wrong. It is printed after a short read(), but short read() does not 
set errno, and errno reported by the error message is from some previous syscall. Corrected error 
message is already committed. K.O.


> Tonight we got another error of that type after the update:
> 
> 04:17 - [mhttpd,ERROR] [history_schema.cxx:2913:FileHistory::read_data,ERROR] Cannot read 
> '/data2/history/mhf_1692128214_20230815_gassystem.dat', read() errno 2 (No such file or directory)
> 
> This morning I looked at the file, and it was there:
> 
> [meg@megon02 history]$ ls -alg mhf_1692128214_20230815_gassystem.dat
> -rw-rw-r--. 1 meg 4663228 Aug 17 08:50 mhf_1692128214_20230815_gassystem.dat
> [meg@megon02 history]$
> 
> 
> Stefan
             Reply  19 Aug 2023, Stefan Ritt, Bug Report, Error accessing history files 
Still get the same error with the latest version:

3:28 [mhttpd,ERROR] [history_schema.cxx:2913:FileHistory::read_data,ERROR] Cannot read 
'/data2/history/mhf_1692391703_20230818_hv_tc.dat', read() errno 2 (No such file or directory)

Stefan
                Reply  06 Oct 2023, Konstantin Olchanski, Bug Report, Error accessing history files 
> Still get the same error with the latest version:
> 3:28 [mhttpd,ERROR] [history_schema.cxx:2913:FileHistory::read_data,ERROR] Cannot read 
> '/data2/history/mhf_1692391703_20230818_hv_tc.dat', read() errno 2 (No such file or directory)

I figured it out. I claim defense of temporary insanity and old age senility.

1) I added the "short read" check in one place, missed the second place
2) writes of history were meant to be atomic, and they are atomic in my head, but not in the midas 
code:

history_schema.cxx:HsFileSchema::write_event()
...
   status = write(s->writer_fd, &t, 4);
   if (status != 4) {
      cm_msg(MERROR, "FileHistory::write_event", "Cannot write to \'%s\', write(timestamp) errno 
%d (%s)", s->file_name.c_str(), errno, strerror(errno));
      return HS_FILE_ERROR;
   }

   status = write(s->writer_fd, data, expected_size);
   if (status != expected_size) {
      cm_msg(MERROR, "FileHistory::write_event", "Cannot write to \'%s\', write(%d) errno %d 
(%s)", s->file_name.c_str(), data_size, errno, strerror(errno));
      return HS_FILE_ERROR;
   }
...

that's not atomic, that's two separate writes. history reader hits the history file between the 
two writes and gets a short read of 4 bytes timestamp instead of full record size. that's the 
error message reported by mhttpd.

two fixes forthcoming:
a) check for short read in the 2nd place that I missed
b) two write() are replaced by 2 memcpy() to a preallocated buffer and 1 write()

Overall, I am pretty happy that this is the only bug in the FILE history code found in N years, 
and it does not even cause data corruption...

K.O.
                   Reply  06 Oct 2023, Konstantin Olchanski, Bug Report, Error accessing history files 
> two fixes forthcoming:
> a) check for short read in the 2nd place that I missed
> b) two write() are replaced by 2 memcpy() to a preallocated buffer and 1 write()

commit 713ec4a583365d57ffcd700ceeb09dcc14518295

K.O.
Entry  03 Oct 2023, Konstantin Olchanski, Bug Fix, wrong array size after loading xml or json file 
both the xml and the json decoders have a bug (fix pending). loading saved odb 
from xml and json file did not truncate arrays in odb to the size of arrays in 
the file. for example, if /example/double_array has size 20 in odb, but size 5 
in xml or json file, after loading the file, array size is still 20.

this is unexpected: after loading an odb save file we expect odb to return to 
same state as when odb save file was created. we do not expect some arrays to 
have half of their elements restored from file and half their elements left 
unchanged.

save and restore from .odb file does not have this problem.

I think this is a bug and I committed (but did not yet push) a fix for both xml 
and json odb decoder.

I have run this problem while writing the new history panel editor, where 
deleting variables did not work because json rpc db_paste() was not truncating 
any arrays.

I am still finishing up the last few bits of the new history panel editor, and 
there is a bit of time to discuss and comment this odb change before I push it 
to midas.

K.O.
Entry  30 Sep 2023, Gennaro Tortone, Bug Report, ODB page and hex values 10.png
Hi,

I was playing with MIDAS devel branch and I realized that
if I set an ODB INT32 key to a value using new ODB web interface 
it is reported in parenthesis always as (0xFFFFFFFF);

I tested with different browser and result is the same while this 
never happens in OldODB web interface...

Cheers,
Gennaro
    Reply  01 Oct 2023, Stefan Ritt, Bug Report, ODB page and hex values 
Thanks for reporting this bug, I fixed it in the last commit.

Best,
Stefan
Entry  26 Sep 2023, Stefan Ritt, Info, mjsonrpc_db_save / mjsonrpc_db_load have been dropped 
The JavaScript function

mjsonrpc_db_save / mjonrpc_db_load

have been dropped from the API because they were not considered safe. Users 
should use now the new function

file_save_ascii()

and

file_load_ascii()

These function have the additional advantage that the file is not loaded 
directly into the ODB but goes into the JavaScript code in the browser, which 
can check or modify it before sending it to the ODB via mjsonrpc_db_paste(). 

Access of these functions is limited to <experiment>/userfiles/* where 
<experiment> is the normal MIDAS experiment directory defined by "exptab" or 
"MIDAS_DIR". This ensures that there is no access to e.g. system-level files. If 
you need to access a directory not under "userfile", us symbolic links.

These files can be combined with file_picker(), which lets you select files on 
the server interactively.

Stefan
Entry  24 Sep 2023, Frederik Wauters, Suggestion, scroll when browsing for a link 
Another small user experience request:

When making a link in the odb (web interface) a nice browser window pop's up. There is however not scrolling possible in the window. As a result, you can not reach a odb key if it is nested to deeply. 

Trying to type out the Link target in the field only allows for 32 characters

context: we are setting up a bunch of Links in the History
    Reply  26 Sep 2023, Stefan Ritt, Suggestion, scroll when browsing for a link 
> When making a link in the odb (web interface) a nice browser window pop's up. There is however not scrolling possible in the window. As a result, you can not reach a odb key if it is nested to deeply. 
> 
> Trying to type out the Link target in the field only allows for 32 characters

Thanks for reporting the bug with the pop-up not being able to scroll, I fixed that and committed the change.

I do however not understand the issue with 32 characters. The link NAME should not be more than 32 chars (which applies to all ODB keys). 
But if I try I can write more than 32 chars in the link target.

Stefan
Entry  19 Sep 2023, Frederik Wauters, Bug Report, epics fe "Start Command" 
The epics frontend overwrites the "start command" odb after each start:

  // set start command in ODB
   midas::odb efe("/Programs/EPICS Frontend");
   std::string p(__FILE__);
   std::string s("build/epics_fe");
   auto i = p.find("epics_fe.cxx");
   p.replace(i, s.length(), s);
   p = p.substr(0, i + s.length());
   efe["Start command"].set_string_size(p, 256);

this should be set such that it only writes when the key is not there. It causes the following issue: on a pc with multiple experiments defined, you need to start the fe's with a "-e <name>" flag. 
    Reply  20 Sep 2023, Stefan Ritt, Bug Report, epics fe "Start Command" 
Thanks for reporting this problem. It has been fixed today, so the start command is only written if it's emtpy.

Stefan
Entry  08 Sep 2023, Nick Hastings, Forum, Hide start and stop buttons 
The wiki documents an odb variable to enable the hiding of the Start and Stop buttons on the mhttpd status page
https://daq00.triumf.ca/MidasWiki/index.php//Experiment_ODB_tree#Start-Stop_Buttons

However mhttpd states this option is obsolete. See commit:
https://bitbucket.org/tmidas/midas/commits/2366eefc6a216dc45154bc4594e329420500dcf7

I note that that commit also made mhttpd report that the "Pause-Resume Buttons" variable is also obsolete, however that code seems to have since been removed.

Is there now some other mechanism to hide the start and stop buttons?
Note that this is for a pure slow control system that does not take runs.
    Reply  08 Sep 2023, Nick Hastings, Forum, Hide start and stop buttons 
> Is there now some other mechanism to hide the start and stop buttons?
> Note that this is for a pure slow control system that does not take runs.

Just wanted to add that I realize that this can be done by copying
status.html and/or midas.css to the experiment directory and then modifying
them/it, but wonder if the is some other preferred way.
       Reply  13 Sep 2023, Stefan Ritt, Forum, Hide start and stop buttons 
Indeed the ODB settings are obsolete. Now that the status page is fully dynamic 
(JavaScript), it's much more powerful to modify the status.html page directly. You 
can not only hide the buttons, but also remove the run numbers, the running time, 
and so on. This is much more flexible than steering things through the ODB.

If there is a general need for that, I can draft a "non-run" based status page, but 
it's a bit hard to make a one-fits-all. Like some might even remove the logging 
channels and the clients, but add certain things like if their slow control front-
end is running etc.

Best,
Stefan
          Reply  13 Sep 2023, Nick Hastings, Forum, Hide start and stop buttons screenshot-20230914-085054.png
Hi Stefan,

> Indeed the ODB settings are obsolete.

I just applied for an account for the wiki.
I'll try add a note regarding this change.

> Now that the status page is fully dynamic 
> (JavaScript), it's much more powerful to modify the status.html page directly. You 
> can not only hide the buttons, but also remove the run numbers, the running time, 
> and so on. This is much more flexible than steering things through the ODB.

Very true. Currently I copied the resources/midas.css into the experiment directory and appended:

#runNumberCell { display: none;}
#runStatusStartTime { display: none;}
#runStatusStopTime { display: none;}
#runStatusSequencer { display: none;}
#logChannel { display: none;}

See screenshot attached. :-)

But if feels a little clunky to copy the whole file just to add five lines.
It might be more elegant if status.html looked for a user css file in addition
to the default ones.

> If there is a general need for that, I can draft a "non-run" based status page, but 
> it's a bit hard to make a one-fits-all. Like some might even remove the logging 
> channels and the clients, but add certain things like if their slow control front-
> end is running etc.

The logging channels are easily removed with the css (see attachment), but it might be
nice if the string "Run Status" table title was also configurable by css. For this
slow control system I'd probably change it to something like "GSC Status". Again
this is a minor thing, I could trivially do this by copying the resources/status.html
to the experiment directory and editing it.

Lots of fun new stuff migrating from circa 2012 midas to midas-2022-05-c :-)

Cheers,

Nick.
             Reply  13 Sep 2023, Stefan Ritt, Forum, Hide start and stop buttons 
> Hi Stefan,
> 
> > Indeed the ODB settings are obsolete.
> 
> I just applied for an account for the wiki.
> I'll try add a note regarding this change.

Please coordinate with Ben Smith at TRIUMF <bsmith@triumf.ca>, who coordinates the documentation. 


> Very true. Currently I copied the resources/midas.css into the experiment directory and appended:
> 
> #runNumberCell { display: none;}
> #runStatusStartTime { display: none;}
> #runStatusStopTime { display: none;}
> #runStatusSequencer { display: none;}
> #logChannel { display: none;}
> 
> See screenshot attached. :-)
> 
> But if feels a little clunky to copy the whole file just to add five lines.
> It might be more elegant if status.html looked for a user css file in addition
> to the default ones.

I would not go to change the CSS file. You only can hide some tables. But in a while I'm sure you
want to ADD new things, which you only can do by editing the status.html file. You don't have to
change midas/resources/status.html, but can make your own "custom status", name it differently, and
link /Custom/Default in the ODB to it. This way it does not get overwritten if you pull midas.


> The logging channels are easily removed with the css (see attachment), but it might be
> nice if the string "Run Status" table title was also configurable by css. For this
> slow control system I'd probably change it to something like "GSC Status". Again
> this is a minor thing, I could trivially do this by copying the resources/status.html
> to the experiment directory and editing it.

See above. I agree that the status.html file is a bit complicated and not so easy to understand
as the CSS file, but you can do much more by editing it.

> Lots of fun new stuff migrating from circa 2012 midas to midas-2022-05-c :-)

I always advise people to frequently pull, they benefit from the newest features and avoid the
huge amount of work to migrate from a 10 year old version.

Best,
Stefan
                Reply  14 Sep 2023, Konstantin Olchanski, Forum, Hide start and stop buttons 
I believe the original "hide run start / stop" was added specifically for ND280 GSC MIDAS. I do not know 
why it was removed. "hide pause / resume" is still there. I will restore them. Hiding logger channel 
section should probably be automatic of there is no /logger/channels, I can check if it works and what 
happens if there is more than one logger channel. K.O.
                   Reply  14 Sep 2023, Stefan Ritt, Forum, Hide start and stop buttons 
> I believe the original "hide run start / stop" was added specifically for ND280 GSC MIDAS. I do not know 
> why it was removed. "hide pause / resume" is still there. I will restore them. Hiding logger channel 
> section should probably be automatic of there is no /logger/channels, I can check if it works and what 
> happens if there is more than one logger channel. K.O.

Very likely it was "forgotten" when the status page was converted to a dynamic page by Shouyi Ma. Since he is 
not around any more, it's up to us to adapt status.html if needed.

Stefan
                   Reply  15 Sep 2023, Stefan Ritt, Forum, Hide start and stop buttons 
> I believe the original "hide run start / stop" was added specifically for ND280 GSC MIDAS. I do not know 
> why it was removed. "hide pause / resume" is still there. I will restore them. Hiding logger channel 
> section should probably be automatic of there is no /logger/channels, I can check if it works and what 
> happens if there is more than one logger channel. K.O.

Actually one thing is the functionality of the /Experiment/Start-Stop button in status.html, but the other is 
the warning we get from mhttpd:

[mhttpd,ERROR] [mhttpd.cxx:1957:init_mhttpd_odb,ERROR] ODB "/Experiment/Start-Stop Buttons" is obsolete, please 
delete it.

This was added by KO on Nov. 29, 2019 (commit 2366eefc). So we have to decide re-enable this feature (and 
remove the warning above), or keep it dropped and work on changes of status.hmtl.

Stefan
                Reply  14 Sep 2023, Nick Hastings, Forum, Hide start and stop buttons 
Hi

> > > Indeed the ODB settings are obsolete.
> > 
> > I just applied for an account for the wiki.
> > I'll try add a note regarding this change.
> 
> Please coordinate with Ben Smith at TRIUMF <bsmith@triumf.ca>, who coordinates the documentation. 

I will tread lightly. 

> I would not go to change the CSS file. You only can hide some tables. But in a while I'm sure you
> want to ADD new things, which you only can do by editing the status.html file. You don't have to
> change midas/resources/status.html, but can make your own "custom status", name it differently, and
> link /Custom/Default in the ODB to it. This way it does not get overwritten if you pull midas.

We have *many* custom pages. The submenus on the status page:

&#9656; FGD
&#9656; TPC
&#9656; TRIPt

hide custom pages with all sorts of good stuff.

> > The logging channels are easily removed with the css (see attachment), but it might be
> > nice if the string "Run Status" table title was also configurable by css. For this
> > slow control system I'd probably change it to something like "GSC Status". Again
> > this is a minor thing, I could trivially do this by copying the resources/status.html
> > to the experiment directory and editing it.
> 
> See above. I agree that the status.html file is a bit complicated and not so easy to understand
> as the CSS file, but you can do much more by editing it.

I may end up doing this since the events and data columns do not provide particularly
useful information in this instance. But for now, the css route seems like a quick and
fairly clean way to remove irrelevant stuff from a prominent place at the top of the page.
 
> > Lots of fun new stuff migrating from circa 2012 midas to midas-2022-05-c :-)
> 
> I always advise people to frequently pull, they benefit from the newest features and avoid the
> huge amount of work to migrate from a 10 year old version.

The long delay was not my choice. The group responsible for the system departed in 2018, and
and were not replaced by the experiment management. Lack of personnel/expertise resulted in
a "if it's not broken then don't fix it" situation. Eventually, the need to update the PCs/OSs
and the imminent introduction of new sub-detectors resulted people agreeing to the update. 

Cheers,

Nick.
Entry  12 Sep 2023, Maia Henriksson-Ward, Suggestion, Syntax highlighting for sequencer scripts 
Recently I was trying to read sequencer scripts written by a previous student, and realized it would be easier to 
quickly read/skim sequencer code with some form of syntax highlighting. I've been using Visual Studio Code as my 
editor, so I made myself an extension for VS Code that provides basic syntax highlighting (with help from 
ChatGPT-3.5). It's good enough for my purposes, but is missing some features you'd expect for full language 
support. This got me wondering - does anything like this already exist, perhaps with more complete support?

If it doesn't already exist, and if there is interest, I could to publish mine 
to vscode's "Extension Marketplace" for easy installations (I'd also welcome contributions for 
more features). For now, I've installed it on my computer directly from the .vsix file, which I've put on my own 
github at https://github.com/maia-hw/midas-sequencer-support . There is also a readme with screenshot showing what scripts 
will look like with the highlighting
    Reply  12 Sep 2023, Stefan Ritt, Suggestion, Syntax highlighting for sequencer scripts 
I like the idea of syntax highlighting, but your solution is just for one editor which not everybody
is using. It would be better if the editor built into mhttpd for MSL files would have the possibility.

I looked at highlighting in an HTML <textarea> tag, and found that we can do it with a 

<div contenteditable="true" style="font-family: monospace"> ... </div>

tag where we can change the color of individual words. If you translate your existing rules of syntax
highlighting into JavaScript, I'm happy to put that into the mhttpd sequencer editor. So I would need
a function which receives a MSL text, then replaces all keywords with some color tagging, like

ODBSET -> <span style="color:red">ODBSET</span>

Best,
Stefan
Entry  01 Jun 2023, Thomas Lindner, Info, MIDAS Workshop 2023 
Dear MIDAS users,

We would like to arrange another MIDAS workshop, following on from previous successful workshops in 2015, 2017 and 2019.  The 
goals of the workshop would include:

- Getting updates from MIDAS developers on new features and other changes.
- Getting reports from MIDAS users on how they are using MIDAS, what is working and what is not
- Making plans for future MIDAS changes and improvements

This would be a one-day virtual workshop, planned for about 4 hours length.  The workshop will probably be after another of 
Stefan's visits to TRIUMF.

If you would be interested in participating in such a workshop, please help us choose the date by filling out this doodle poll:

https://doodle.com/meeting/organize/id/dBPVMQJa

Please fill in the poll by June 9, if you are interested.  We will announce the date soon after that.

Thanks,
Thomas
    Reply  13 Jun 2023, Thomas Lindner, Info, MIDAS Workshop 2023 - Sept 13 
Hi All,

Thanks to everyone who filled out the doodle poll.  

Based on the results we will plan to have this workshop on September 13, at 9AM-1PM (Vancouver) / 6PM-10PM (Geneva).  Apologies to 
those for whom this is a bad time/day; in particular for MIDAS users in Asia.

If you would like to present a report at the workshop on your experiment's MIDAS experience, then please email me (lindner@triumf.ca).  
It would be great to know this in advance so that we can start preparing an agenda.  Feel free to also email me if there are topics 
that you would like addressed at the workshop.

Thanks,
Thomas


> Dear MIDAS users,
> 
> We would like to arrange another MIDAS workshop, following on from previous successful workshops in 2015, 2017 and 2019.  The 
> goals of the workshop would include:
> 
> - Getting updates from MIDAS developers on new features and other changes.
> - Getting reports from MIDAS users on how they are using MIDAS, what is working and what is not
> - Making plans for future MIDAS changes and improvements
> 
> This would be a one-day virtual workshop, planned for about 4 hours length.  The workshop will probably be after another of 
> Stefan's visits to TRIUMF.
> 
> If you would be interested in participating in such a workshop, please help us choose the date by filling out this doodle poll:
> 
> https://doodle.com/meeting/organize/id/dBPVMQJa
> 
> Please fill in the poll by June 9, if you are interested.  We will announce the date soon after that.
> 
> Thanks,
> Thomas
       Reply  17 Aug 2023, Thomas Lindner, Info, MIDAS Workshop 2023 - Sept 12-13 
Dear All,

A quick update on the MIDAS workshop.  Based on the number of planned talks we have made the decision to switch to a two day workshop on Sept 12 and 13 
(rather than just Sept 13).  We decided 4 hours was not enough time to hear all the reports and have fruitful discussions; having a much longer meeting 
on a single day was a bad idea given the time zones involved.

We have a tentative agenda planned for the workshop, which you can see here:

https://indico.psi.ch/event/15025/timetable/

We are still confirming some talks, so the agenda may still change a bit.  But the baseline plan will be that the workshop will be 

8:30AM-12:30PM (PDT) / 5:30PM-9:30PM (CEST)

on Sept 12-13.  We hope that these times still work for everyone planning to attend.

Cheers,
Thomas


> Hi All,
> 
> Thanks to everyone who filled out the doodle poll.  
> 
> Based on the results we will plan to have this workshop on September 13, at 9AM-1PM (Vancouver) / 6PM-10PM (Geneva).  Apologies to 
> those for whom this is a bad time/day; in particular for MIDAS users in Asia.
> 
> If you would like to present a report at the workshop on your experiment's MIDAS experience, then please email me (lindner@triumf.ca).  
> It would be great to know this in advance so that we can start preparing an agenda.  Feel free to also email me if there are topics 
> that you would like addressed at the workshop.
> 
> Thanks,
> Thomas
> 
> 
> > Dear MIDAS users,
> > 
> > We would like to arrange another MIDAS workshop, following on from previous successful workshops in 2015, 2017 and 2019.  The 
> > goals of the workshop would include:
> > 
> > - Getting updates from MIDAS developers on new features and other changes.
> > - Getting reports from MIDAS users on how they are using MIDAS, what is working and what is not
> > - Making plans for future MIDAS changes and improvements
> > 
> > This would be a one-day virtual workshop, planned for about 4 hours length.  The workshop will probably be after another of 
> > Stefan's visits to TRIUMF.
> > 
> > If you would be interested in participating in such a workshop, please help us choose the date by filling out this doodle poll:
> > 
> > https://doodle.com/meeting/organize/id/dBPVMQJa
> > 
> > Please fill in the poll by June 9, if you are interested.  We will announce the date soon after that.
> > 
> > Thanks,
> > Thomas
          Reply  06 Sep 2023, Thomas Lindner, Info, MIDAS Workshop 2023 - Sept 12-13 
Dear All, 

A final reminder about the MIDAS workshop in 6 days.  A (hopefully) finalized agenda is posted here:

https://indico.psi.ch/event/15025/timetable/

In the overview section of the indico page you will find the zoom link for the workshop.

We plan for the workshop to have a lot of time for discussion. This means that the exact schedule of the workshop is a little uncertain; hence the actual start 
time of each talk will also have some uncertainty.  We have scheduled that each day's session will be around 3.5 hours, but it is possible that the sessions will 
be a little longer in reality.  Stefan and Pierre will try to ensure that we stay roughly on schedule.

Looking forward to seeing people there.

Cheers,
Thomas

> Dear All,
> 
> A quick update on the MIDAS workshop.  Based on the number of planned talks we have made the decision to switch to a two day workshop on Sept 12 and 13 
> (rather than just Sept 13).  We decided 4 hours was not enough time to hear all the reports and have fruitful discussions; having a much longer meeting 
> on a single day was a bad idea given the time zones involved.
> 
> We have a tentative agenda planned for the workshop, which you can see here:
> 
> https://indico.psi.ch/event/15025/timetable/
> 
> We are still confirming some talks, so the agenda may still change a bit.  But the baseline plan will be that the workshop will be 
> 
> 8:30AM-12:30PM (PDT) / 5:30PM-9:30PM (CEST)
> 
> on Sept 12-13.  We hope that these times still work for everyone planning to attend.
> 
> Cheers,
> Thomas
> 
> 
> > Hi All,
> > 
> > Thanks to everyone who filled out the doodle poll.  
> > 
> > Based on the results we will plan to have this workshop on September 13, at 9AM-1PM (Vancouver) / 6PM-10PM (Geneva).  Apologies to 
> > those for whom this is a bad time/day; in particular for MIDAS users in Asia.
> > 
> > If you would like to present a report at the workshop on your experiment's MIDAS experience, then please email me (lindner@triumf.ca).  
> > It would be great to know this in advance so that we can start preparing an agenda.  Feel free to also email me if there are topics 
> > that you would like addressed at the workshop.
> > 
> > Thanks,
> > Thomas
> > 
> > 
> > > Dear MIDAS users,
> > > 
> > > We would like to arrange another MIDAS workshop, following on from previous successful workshops in 2015, 2017 and 2019.  The 
> > > goals of the workshop would include:
> > > 
> > > - Getting updates from MIDAS developers on new features and other changes.
> > > - Getting reports from MIDAS users on how they are using MIDAS, what is working and what is not
> > > - Making plans for future MIDAS changes and improvements
> > > 
> > > This would be a one-day virtual workshop, planned for about 4 hours length.  The workshop will probably be after another of 
> > > Stefan's visits to TRIUMF.
> > > 
> > > If you would be interested in participating in such a workshop, please help us choose the date by filling out this doodle poll:
> > > 
> > > https://doodle.com/meeting/organize/id/dBPVMQJa
> > > 
> > > Please fill in the poll by June 9, if you are interested.  We will announce the date soon after that.
> > > 
> > > Thanks,
> > > Thomas
Entry  16 May 2023, Konstantin Olchanski, Bug Report, excessive logging of http requests 
Our default configuration of apache httpd logs every request. MIDAS custom web pages can easily make a huge number of RPC calls creating a 
huge log file and filling system disk to 100% capacity

this example has around 100 RPC requests per second. reasonable/unreasonable? available hardware can handle it (web browser, network 
httpd, mhttpd, etc), so we should try to get this to work. perhaps filter the apache httpd logs to exclude mjsonrpc requests? of course we 
can ask the affected experiment why they make so many RPC calls, is there a bug?

[14/May/2023:03:49:01 -0700] 142.90.111.176 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "POST /?mjsonrpc HTTP/1.1" 299
[14/May/2023:03:49:01 -0700] 142.90.111.176 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "POST /?mjsonrpc HTTP/1.1" 299
[14/May/2023:03:49:01 -0700] 142.90.111.176 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "POST /?mjsonrpc HTTP/1.1" 299
[14/May/2023:03:49:01 -0700] 142.90.111.176 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "POST /?mjsonrpc HTTP/1.1" 299
[14/May/2023:03:49:01 -0700] 142.90.111.176 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "POST /?mjsonrpc HTTP/1.1" 299

K.O.
    Reply  16 May 2023, Konstantin Olchanski, Bug Report, excessive logging of http requests 
> Our default configuration of apache httpd logs every request. MIDAS custom web pages can easily make a huge number of RPC calls creating a 
> huge log file and filling system disk to 100% capacity

perhaps use existing logrotate, add limit on file size (size) and limit of 2 old log files (rotate).

/etc/logrotate.d/httpd

/var/log/httpd/*log { 
    size 100M 
    rotate 2 
    missingok 
    notifempty 
    sharedscripts 
    delaycompress 
    postrotate 
        /bin/systemctl reload httpd.service > /dev/null 2>/dev/null || true 
    endscript 
} 

K.O.
       Reply  16 May 2023, Stefan Ritt, Bug Report, excessive logging of http requests 
Maybe you remember the problems we had with a custom page in Japan loading it from TRIUMF. It took almost one minute since each RPC request took 
about 1s round-trip. This got fixed by the modb* scheme where the framework actually collects all ODB variables in a custom page and puts them 
into ONE rpc request (making the path an actual array of paths). That reduced the requests from 100 to 1 in the above example. Maybe the same 
could be done in your current case. Pulling one ODB variable at a time is not very efficient.

Stefan
       Reply  02 Aug 2023, Konstantin Olchanski, Bug Report, excessive logging of http requests 
> > Our default configuration of apache httpd logs every request. MIDAS custom web pages can easily make a huge number of RPC calls creating a 
> > huge log file and filling system disk to 100% capacity
> perhaps use existing logrotate, add limit on file size (size) and limit of 2 old log files (rotate).

logrotate was ineffective.

following apache httpd config seems to disable logging of mjsonrpc requests. note that we cannot filter on the "mjsonrpc" string because 
Request_URI excludes the query string (ouch!).

#SetEnvIf Request_URI "^POST /?mjsonrpc.*" nolog 
SetEnvIf Request_Method "POST" envpost 
SetEnvIf Request_URI "^\/$" envuri 
SetEnvIfExpr "-T reqenv('envpost') && -T reqenv('envuri')" envnolog 
 
CustomLog logs/ssl_request_log "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b" env=!envnolog 

K.O.
          Reply  03 Aug 2023, Konstantin Olchanski, Bug Report, excessive logging of http requests 
> > > Our default configuration of apache httpd logs every request. MIDAS custom web pages can easily make a huge number of RPC calls creating a 
> > > huge log file and filling system disk to 100% capacity
> > perhaps use existing logrotate, add limit on file size (size) and limit of 2 old log files (rotate).
>  
> CustomLog logs/ssl_request_log "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b" env=!envnolog 
> 

TransferLog is not conditional and has to be commented out to stop logging every jsonrpc request.

K.O.
             Reply  14 Aug 2023, Konstantin Olchanski, Bug Report, excessive logging of http requests 
> Our default configuration of apache httpd logs every request.
> MIDAS custom web pages can easily make a huge number of RPC calls creating a 
> huge log file and filling system disk to 100% capacity.

close but no cigar. mhttpd is not running and /var/log got filled to 100% capacity by http error messages. I do not see any apache facility to filter 
error messages, hmm...

-rw-r--r-- 1 root root 1864421376 Aug 14 12:53 ssl_error_log

[Sun Aug 13 23:53:12.416247 2023] [proxy:error] [pid 18608] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.416538 2023] [proxy:error] [pid 19686] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.416603 2023] [proxy:error] [pid 19681] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.416775 2023] [proxy:error] [pid 19588] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.417022 2023] [proxy:error] [pid 19311] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.421864 2023] [proxy:error] [pid 18620] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.422051 2023] [proxy:error] [pid 19693] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.422199 2023] [proxy:error] [pid 19673] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.422222 2023] [proxy:error] [pid 18608] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.422230 2023] [proxy:error] [pid 19657] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.422259 2023] [proxy:error] [pid 18633] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.427513 2023] [proxy:error] [pid 19686] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.427549 2023] [proxy:error] [pid 19681] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.427645 2023] [proxy:error] [pid 19588] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.427774 2023] [proxy:error] [pid 19693] AH00940: HTTP: disabled connection for (localhost)
[Sun Aug 13 23:53:12.427800 2023] [proxy:error] [pid 18620] AH00940: HTTP: disabled connection for (localhost)

K.O.
                Reply  16 Aug 2023, Konstantin Olchanski, Bug Report, excessive logging of http requests 
> > Our default configuration of apache httpd logs every request.
> > MIDAS custom web pages can easily make a huge number of RPC calls creating a 
> > huge log file and filling system disk to 100% capacity.

added "daily" to /etc/logrotate.d/httpd, default was "weekly", not often enough.

K.O.
                   Reply  17 Aug 2023, Konstantin Olchanski, Bug Report, excessive logging of http requests 
> > > Our default configuration of apache httpd logs every request.
> > > MIDAS custom web pages can easily make a huge number of RPC calls creating a 
> > > huge log file and filling system disk to 100% capacity.
> added "daily" to /etc/logrotate.d/httpd, default was "weekly", not often enough.

this should fix it good, make /var/log bigger:

[root@mpmt-test ~]# df
Filesystem      1K-blocks       Used  Available Use% Mounted on
/dev/sdc2        52403200   52296356     106844 100% /
[root@mpmt-test ~]# 
[root@mpmt-test ~]# xfs_growfs /
data blocks changed from 13107200 to 106367750
[root@mpmt-test ~]# 
[root@mpmt-test ~]# df
Filesystem      1K-blocks       Used  Available Use% Mounted on
/dev/sdc2       425445400   52300264  373145136  13% /

K.O.
Entry  16 Aug 2023, Konstantin Olchanski, Bug Report, midas wants to show notification? 
I started to get web browser popups about "midas wants to show notifications, 
block/allow/x". is this a glitch or a new unannounced/undocumented feature? 
google chrome on macos. K.O.
    Reply  16 Aug 2023, Stefan Ritt, Bug Report, midas wants to show notification? 
> I started to get web browser popups about "midas wants to show notifications, 
> block/allow/x". is this a glitch or a new unannounced/undocumented feature? 
> google chrome on macos. K.O.

https://bitbucket.org/tmidas/midas/commits/e101dea764c647211c560a68db7ecda1834198db

I did not consider this a significant feature to be announced here. Just a few lines 
of code. You can turn it on/off via the "Config" web page.

Stefan
       Reply  16 Aug 2023, Stefan Ritt, Bug Report, midas wants to show notification? 
> > I started to get web browser popups about "midas wants to show notifications, 
> > block/allow/x". is this a glitch or a new unannounced/undocumented feature? 
> > google chrome on macos. K.O.
> 
> https://bitbucket.org/tmidas/midas/commits/e101dea764c647211c560a68db7ecda1834198db
> 
> I did not consider this a significant feature to be announced here. Just a few lines 
> of code. You can turn it on/off via the "Config" web page.
> 
> Stefan

Now as I look at it again I realized that the config check boxes had a bug. I fixed that 
and now the disable should work correctly.

This feature was asked by some people who monitor an experiment and have the browser window 
in the background, also have sound off (large office). So desktop notifications are a good 
thing for them.

Stefan
          Reply  16 Aug 2023, Konstantin Olchanski, Bug Report, midas wants to show notification? 
> This feature was asked by some people ...

"show notifications" popups are strongly associated with disreputable web sites (presumably to 
push spam), it was surprising to see it from midas.

K.O.
             Reply  17 Aug 2023, Stefan Ritt, Bug Report, midas wants to show notification? 
> > This feature was asked by some people ...
> 
> "show notifications" popups are strongly associated with disreputable web sites (presumably to 
> push spam), it was surprising to see it from midas.
> 
> K.O.

I agree. But unlike emails (where you get lots of spam as well), you can nicely blacklist/whitelist 
desktop notifications. I suppress all of them except the one for MIDAS. This allows me to watch our 
experiment without staring on the web page all the time.

The main question here is maybe if the desktop notification should be on or off by default (for a 
fresh browser). While you always can change that via the mhttpd "Config" page, the default value is 
chosen by the system. I thought I put it to "on" so people can experience it, and then turn it off if 
they don't like. Having them off by default, most people never would notice this possibility. But I'm 
open to a discussion here.

Stefan
Entry  15 Aug 2023, Konstantin Olchanski, Info, mlogger update 
A bit of update to the mlogger. In preparation for more cleanup when Stefan is 
here at TRIUMF.

1) fix overwrite of existing files if run number is reset (check for existing 
files was missing in the LZ4, BZ2 & co data path)
2) made output files read-only (midas, json and checksum files)
3) commented out the old code paths

Currently active per-channel ODB settings:

Active - enable or disable mlogger channel
Type - NOT USED
Filename - output filename template, %d are replaced by run number and subrun 
number, also pipe command for PIPE output
Format - NOT USED
Compression - NOT USED
ODB dump - enable/disable writing ODB dump to data file
ODB dump format - "json" is recommended for new experiments
Log messages - write log messages to output file, 0=off, -1=write all messages
Buffer - "SYSTEM" read events from this event buffer
EventID - "-1" for all events
Trigger Mask - "-1" for all events
Event Limit - stop run after so many events
Byte Limit - stop run after so many bytes
Subrun Byte limit - switch to next subrun file after writing so many bytes. 
actual file size is longer than subrun_byte_limit because of ODB dumps.
Tape Capacity - NOT USED
Subdir Format - if not empty, output file name is DIR/SUBDIR/FILENAME, "%" 
format things are expanded by strftime().
Current Filename - updated by mlogger, contains the currently written file name
Data checksum - checksum before compression, use CRC32C for maximum speed, 
SHA512 for maximum security.
File checksum - checksum after compression, CRC32C is good against accidental 
file corruption, SHA512 is cryptographically strong, good against purposeful 
tampering.
Compress - use "lz4" for maximum speed, bzip2 or pbzip2 for maximum compression. 
no compression and gzip are not recommended. (ZFS may apply lz4 compression to 
uncompressed data).
Output - "NULL" do not write anything, "FILE" write to disk, "FTP" write to FTP 
server, "ROOT" write via the mlogger ROOT writer (docs?), "PIPE" pipe data 
through an external command (i.e. for bzip2 compression).
Gzip compression - gzip compression flags (see gzip docs, 1=max speed, 9=max 
compression)
Bzip2 compression - if non-zero, bzip2 compression level (see "bzip2 -h", 1=max 
speed, 9=max compression)
Pbzip2 num cpu - number of CPUs used by parallel bzip2 compression, pbzip2 -p 
flag
Pbzip2 compression - if non-zero, pbzip2 compresison level (see "pbzip2 -h", 
default is 9=max compression)
Pbzip2 options - any additional pbzip2 options, i.e. -l, -m, -p, etc.

Currently active /Logger options:

Data Dir - where to write all output files, if empty, cm_get_path() is used.
Message file date format - not used in mlogger
Message dir - not used in mlogger
Write data - if set to "no", midas file, runlog, etc will not be written.
ODB Dump - at run stop, save odb to disk
ODB Dump File - file name for "ODB Dump" save file. "%d" is replaced by run 
number. "json" format is recommended for new experiments.
ODB Last Dump File - at run start, save ODB to disk. "json" format is 
recommended for new experiments.
Auto restart - run stopped by time limit or event limit is automatically 
restarted
Auto restart delay - wair for some many seconds before restarting the run
Tape message - NOT USED
Run duration - stop the run after so many seconds
Next subrun - change from "no" to "yes" to force mlogger to open a new subrun 
file (should this be per-channel?)
Subrun duration - open new subrun file after so many seconds (should this be 
per-channel?)
History dir - not used in mlogger
Detached transition - "no" use the normal multithreaded transtions 
(recommended), "yes" use mtransition helper to stop and restart runs. sometimes 
files because mtransition is not in the user $PATH or wrong version of 
mtransition is in the user $PATH.

K.O.
Entry  09 Aug 2023, Konstantin Olchanski, Bug Fix, Stefan's improved ODB flush to disk 
This is an important improvement, should have a post of it's own. K.O.

> > > RFE filed:
> > > https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-
periodically
> > 
> > Implemented and closed: https://bitbucket.org/tmidas/midas/issues/367/odb-
should-be-saved-to-disk-periodically
> > 
> > Stefan
> 
> Stefan's comments from the closed bug report:
> 
> Ok I implemented some periodic flushing. Here is what I did:
> 
> Created
> 
> /System/Flush/Flush period : TID_UINT32 /System/Flush/Last flush : TID_UINT32
> 
> which control the flushing to disk. The default value for “Flush period” is 60 
seconds or one minute.
> 
> All clients call db_flush_database() through their cm_yield() function
> db_flush_database() checks the “Last flush” and only flushes the ODB when the 
period has expired. This test is 
> done inside the ODB semaphore so that we don’t get a race condigiton
> If the period has expired, db_flush_database() calls ss_shm_flush()
> ss_shm_flush() tries to allocate a buffer of the shared memory. If the 
allocation is not successful (out of 
> memory), ss_shm_flush() writes directly to the binary file as before.
> If the allocation is successful, ss_shm_flush() copies the share memory to a 
buffer and passes this buffer to a 
> dedicated thread which writes the buffer to the binary file. This causes 
ss_shm_flush() to return immediately and 
> not block the calling program during the disk write operation.
> Added back the “if (destroy_flag) ss_shm_flush()” so that the ODB is flushed 
for sure before the shared memory 
> gets deleted.
> This means now that under normal circumstances, exiting programs like odbedit 
do NOT flush the ODB. This allows to 
> call many “odbedit -c” in a row without the flush penalty. Nevertheless, the 
ODB then gets flushed by other 
> clients latest 60 seconds (or whatever the flush period is) after odbedit 
exits.
> 
> Please note that ODB flushing has two purposes:
> 
> When all programs exit, we need a persistent storage for the ODB. In most 
experiments this only happens very 
> seldom. Maybe at the end of a beam time period.
> If the computer crashes, a recent version of the ODB is kept on disk to 
simplify recovery after the crash.
> Since crashes are not so often (during production periods we have maybe one 
hardware failure every few years) the 
> flushing of the ODB too often does not make sense and just consumes resources. 
Flushing does also not help from 
> corrupted ODBs, since the binary image will also get corrupted. So the only 
reason for periodic flushes is to ease 
> recovery after a total crash. I put the default to 60 seconds, but if people 
are really paranoid they can decrease 
> it to 10 seconds or so. Or increase it to 600 seconds if their system does not 
crash every week and disks are 
> slow.
> 
> I made a dedicated branch feature/periodic_odb_flush so people can test the 
new functionality. If there are no 
> complaints within the next few days, I will merge that into develop.
> 
> Stefan
Entry  31 Mar 2022, Stefan Ritt, Suggestion, Maximum ODB size 
Anybody some idea what the maximum ODB size can be? In the old days, the linux 
kernels had a severe limit on shared memory of usually 8MB, but in the age of 
64GB RAM being a standard, we should be able to grow bigger. Tried

odbinit -s 1024MB --cleanup

which went through without complain, even put that value in to .ODB_SIZE.TXT, but 
when I started odbedit doing "mem", I only see a size of 1MB. Probably somewhere 
deep inside we have a limit which prevents the user to create very large ODBs, 
but this should be mentioned more prominently in odbinit. Like "size too large, 
maximum allowd is xxx MB".

Stefan
    Reply  04 Apr 2022, Konstantin Olchanski, Suggestion, Maximum ODB size 
> Anybody some idea what the maximum ODB size can be?

It turns out ODB size limit is hardwired on db_open_database() at 100 Mbytes.

I now committed an improved error message for this.

I confirm that "odbinit -s 100MB" works and creates ODB with 50 Mbyte data area and 50 
Mbyte key area.

> in the age of 64GB RAM being a standard, we should be able to grow bigger ...

I agree, I think we can safely bump the limit from 100 Mbytes to 1 Gbyte, maybe 1.5 or 
1.99 Gbytes. Above that we run into 32-bit/31-bit cleanliness problems.

And creating extra large 1 GB ODB but using only a few megabytes will not waste any 
RAM, because the .ODB.SHM file is demand-paged and non-used parts of ODB will not be 
mapped into RAM. (It will waste disk space, file .ODB.SHM will be 1 GByte size).

However, 1 GByte (FPGA based) and 4-8 GByte (Raspberry Pi & co) machines are again
becoming popular and relevant for running MIDAS, and they have very slow "disk" 
subsystems, with NAND, SD and USB flash, so we should not go crazy here.

> odbinit -s 1024MB --cleanup

there is a bug in odbinit, if initial odbinit fails, ODB with default size is creates, 
and original rejected ODB size is written to .ODB_SIZE.TXT (an inconsistency).

bitbucket bug 328

> [ how do I resize ODB ??? ]

we need odbresize. bitbucket bug 329.

K.O.
       Reply  27 Apr 2023, Marius Koeppel, Suggestion, Maximum ODB size 
Hi all,

> I agree, I think we can safely bump the limit from 100 Mbytes to 1 Gbyte, maybe 1.5 or 
> 1.99 Gbytes. Above that we run into 32-bit/31-bit cleanliness problems.
We just went in and changed: int odb_size_limit = INT_MAX;//100*1000*1000; in odb.cxx. And we could create ODBs with 1GB and 1.5 GB.

Since the DecodeSize function in odbinit has also foreseen yottabytes ;) (const char units[] = {'k', 'M', 'G', 'T', 'P', 'E', 'Z', 'Y'};) we think going to GB for the maximum ODB size would be create.

> there is a bug in odbinit, if initial odbinit fails, ODB with default size is creates, 
> and original rejected ODB size is written to .ODB_SIZE.TXT (an inconsistency).
Can#t we go with the maximum size here if the user inputs a larger size? So just below       printf("Checking ODB size...\n"); one could check for the odb_size_limit. In general one could move the odb_size_limit to midas.h so its not only available in odb.cxx.

Best,
Marius
          Reply  27 Apr 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > I agree, I think we can safely bump the limit from 100 Mbytes to 1 Gbyte, maybe 1.5 or 
> > 1.99 Gbytes. Above that we run into 32-bit/31-bit cleanliness problems.
>
> We just went in and changed: int odb_size_limit = INT_MAX;//100*1000*1000; in odb.cxx.
>

This is change is wrong. As I wrote, ODB is not 64-bit clean and it is not 32-bit clean. We think is is 31-bit clean, so maximum size would be slightly less than 2 Gbytes.

> And we could create ODBs with 1GB and 1.5 GB.

Congratulations. created != "it works". for proper test, you should fill it with 1.5 GB of stuff, save to json file, reload from json file, save to a different json file and compare that they have same contents (minus timestamps).

We could spend a lot of time making odb 32-bit clean and give you 4GB-max ODB, but would it be useful? For large ODB, "save to .json" already takes a long time ("save to .xml" is slower, "save to .odb" ditto, also buggy). We already have complaints that runs take forever to start because mlogger 
takes a long time to write the ODB save file.

P.S. 64-bit clean ODB will be binary incompatible, all internal pointers are 32-bit right now.

K.O.
             Reply  27 Apr 2023, Marius Koeppel, Suggestion, Maximum ODB size 
> This is change is wrong. As I wrote, ODB is not 64-bit clean and it is not 32-bit clean. We think is is 31-bit clean, so maximum size would be slightly less than 2 Gbytes.

I just wanted to show that changing it and creating bigger ODBs is in general possible.

My main intention was to trigger the discussion again. I also think in general 1GB is enough. But for our applications sometimes 100MB is just on the edge. 



> Congratulations. created != "it works". for proper test, you should fill it with 1.5 GB of stuff, save to json file, reload from json file, save to a different json file and compare that they have same contents (minus timestamps).

You’re right we did not properly test it. I will run this test with a 1GB ODB.



> We could spend a lot of time making odb 32-bit clean and give you 4GB-max ODB, but would it be useful? For large ODB, "save to .json" already takes a long time ("save to .xml" is slower, "save to .odb" ditto, also buggy). We already have complaints that runs take forever to start because mlogger 

> takes a long time to write the ODB save file.

I also agree that going in and making it 32-bit or even 64-bit clean is not worth the effort.

Also concerning the writing speed of the logger etc I am fully with you.

However, having the freedom to choose a bit bigger ODB would be great.



You said the writing into .odb is buggy. Do you mean it’s buggy in general or only in this specific case?

We save the ODB most of the time in the .odb format. 



Cheers,

Marius
                Reply  27 Apr 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> You said the writing into .odb is buggy. Do you mean it’s buggy in general or only in this specific case?
> We save the ODB most of the time in the .odb format. 

I recommend JSON. Main advantage is you can read it using JSON decoder available for any language, no need to write custom code.

Other than that, the main issue is encoding of strings. For ODB this is key names and string values.

JSON was the first to standardize escape characters what can encode all valid UTF-8 UNICODE strings,
the system of escape characters is clean, easy to understand and easy to implement. https://www.json.org/json-en.html

XML is not as well defined as JSON, i.e. go and try to find the XML BNF grammar. I am not sure if the MIDAS XML encoder
and decoder is fully UTF-8 clean, and if some unlucky combinations of characters break string encoding or decoding. This
is usually tested using a fuzzer (generates all possible, unlucky and unlikely string values). Most suspicious
would be quotes, and square and angle brackets. If some character combinations break encoding or decoding, likely
this cannot be fixed in MIDAS without breaking backwards self-compatibility (will not read old ODB files correctly).

Same applies for the ODB format, except that it is even more ad-hoc. Again, any problems are hard to fix without
breaking backward self-compatibility.

In addition, in the past, the ODB and XML decoders had trouble with very long strings, this has been
fixed some time ago.

K.O.
                Reply  27 Apr 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
my vote is to bump the ODB size limit to 1999*1000*1000 (not quite 2GB). but this needs to be tested. especially save and restore from ODB, XML and JSON files, including how long it takes to save and load a 1.9GB ODB. K.O.
                   Reply  28 Apr 2023, Marius Koeppel, Suggestion, Maximum ODB size create_time.pdffails.pdftest_odb.py
> my vote is to bump the ODB size limit to 1999*1000*1000 (not quite 2GB). but this needs to be tested. especially save and restore from ODB, XML and JSON files, including how long it takes to save and load a 1.9GB ODB. K.O.

I had some fun with python and created a test script which can be executed in the MIDASSYS/online folder (test_odb.py). I did not really normalize the time so it will be different at different systems but I guess the trend is important (see create_time.pdf).
What is surprising to me is that even that I only write one STRING key to the time increases. Is this maybe related to what Stefan said about the run start - so that odbedit needs some time to load the bigger ODB?
Second thing is that also the creation / storing and load time is increasing. Should this be or is there a bug in the code I use or again is this related to the previous point?

The test of comparing the ODB after store / load / store already fails for the json format. I know I only test if the dicts are the same, so for timestamps this already fails.
But what is strange here is that sometimes the test works sometimes not and its different from run to run.

I will try to improve the test a bit more but for a short update this is how it looks so fare.

Best,
Marius
                      Reply  28 Apr 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> Is this maybe related to what Stefan said about the run start - so that odbedit needs some time to load the bigger ODB?

At the run start mlogger writes the ODB to the .mid file. This needs conversion (binary ODB -> XML ASCII) which can take time.
This does NOT depend on the ODB size, but on the ODB *content*. Every key in the ODB takes time to convert. So if your ODB as 1.5 GB
but only a few keys, this is still fast. Only if you have 200 million keys int he ODB, then mlogger takes lots of time to convert
200 million values to XML or JSON strings.

Stefan
                         Reply  28 Apr 2023, Marius Koeppel, Suggestion, Maximum ODB size 
> At the run start mlogger writes the ODB to the .mid file. This needs conversion (binary ODB -> XML ASCII) which can take time.
> This does NOT depend on the ODB size, but on the ODB *content*. Every key in the ODB takes time to convert. So if your ODB as 1.5 GB
> but only a few keys, this is still fast. Only if you have 200 million keys int he ODB, then mlogger takes lots of time to convert
> 200 million values to XML or JSON strings.
This was also my assumption. Is this the same for odbedit -c save FILE?
Because this is what I tested with the script and there one can see in the plot that the time increases to write the file if the ODB size increases.
The content of the ODB is always the same - one STRING key in the directory Test.

Best,
Marius
                         Reply  28 Apr 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > Is this maybe related to what Stefan said about the run start - so that odbedit needs some time to load the bigger ODB?
> 
> At the run start mlogger writes the ODB to the .mid file. This needs conversion (binary ODB -> XML ASCII) which can take time.
> This does NOT depend on the ODB size, but on the ODB *content*.
>

Yes and no. They must be storing more than 100 Mbytes of stuff in ODB, if they are asking to bump ODB size from 100 Mbyte to 2 GByte ODB.

On the MIDAS, side, though, we have to plan for the worst case, if max ODB size 1.9 GB and it is full of data,
and mlogger (and odbedit save and load) take 10-30 seconds, then at least all timeouts (watchdog timeout, RPC timeout, etc)
must be increased accordingly.

K.O.
             Reply  27 Apr 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> Congratulations. created != "it works".

Two other tings to consider:

1) The ODB shared memory is dumped into a binary file (".ODB.SHM") after the last client finished and read if the first client starts, to get it persistent. 
So this could slow down starting and stopping, but only the first client, so I guess it's not an issue.

2) Traditionally, the ODB gets dumped to the .mid file at the beginning and end of every run, so that one know the exact configuration of the experiment
for offline analysis. This can be turned off of course, but most experiments use it. If the ODB is dumped in any ASCII format, this can take quite long.
Assume it takes 10 seconds at the beginning of each run, and we take a run every five minutes. Then we loose 48 mins of precious beam time every day.

Best,
Stefan
                Reply  28 Apr 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > Congratulations. created != "it works".
> 
> Two other tings to consider:
> 
> 1) The ODB shared memory is dumped into a binary file (".ODB.SHM") after the last client finished and read if the first client starts, to get it persistent. 
> So this could slow down starting and stopping, but only the first client, so I guess it's not an issue.
>

typical disk writing speed is 100-1000 Mbytes/sec, so writing 1 GB .ODB.SHM will take 1-10 seconds. NFS over 1gige network is 100 Mbytes/sec, so 10 seconds to 
write .ODB.SHM. embedded ARM write speed to SD flash can be as low as 10 Mbytes/sec, so up to 100 seconds.

> 
> 2) Traditionally, the ODB gets dumped to the .mid file at the beginning and end of every run, so that one know the exact configuration of the experiment
> for offline analysis. This can be turned off of course, but most experiments use it. If the ODB is dumped in any ASCII format, this can take quite long.
> Assume it takes 10 seconds at the beginning of each run, and we take a run every five minutes. Then we loose 48 mins of precious beam time every day.
> 

new default is to save as JSON, (as of my last measurement) JSON encoder is faster than the XML (and ODB?) encoder, by default result is compressed by GZIP-1 (66 
Mbytes/sec is my old benchmark, should remeasure on new DDR5 machines), compressed JSON is written .mid.gz file at disk speed (as above). Alternatively, use LZ4 
compression, runs roughly at memcpy() speed, less compression, written to .mid.lz4 at disk speed.

if data storage is ZFS, ZFS built-in LZ4 compression is now enabled by default, so result writing uncompressed .mid file (no compression of ODB dump), should be 
roughly same as when using MIDAS LZ4 compression and writing .mid.lz4.

bottom line, I need to remeasure gzip and lz4 compression speeds on new computers (DDR4 AMD 5000 series and DDR5 AMD 7000 series).

K.O.
                   Reply  09 Jun 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > 1) The ODB shared memory is dumped into a binary file (".ODB.SHM") after the last client finished ...

correction: ODB shared memory is saved to .ODB.SHM each time a client stops, this is db_close_database().

I have just run into a problem with this in the DRAGON experiment. At begin and end of run they run
a script that does a large number of "odbedit" calls to read stuff from ODB and it was taking a very long time.
Each odbedit invocation was taking about 1 second, starting odbedit is quick, stopping odbedit takes about 1 second.

It turns out each invocation of odbedit saves .ODB.SHM, ODB was 100 Mbytes size, home disk is an HDD (~100-200 Mbytes/sec writing speed), so yes, about 1 second to 
stop odbedit.

Solution was to reduce ODB size from 100 Mbytes to 10 Mbytes, odbedit now run quickly, begin and end of run scripts run quickly. problem solved.

K.O.

P.S. no, I am not the dragon experiment, no, I did not write those scripts, no, I will not rewrite them, persons who wrote them are long gone, no, the persons running 
dragon today will not be rewriting them.
                      Reply  12 Jun 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> correction: ODB shared memory is saved to .ODB.SHM each time a client stops, this is db_close_database().

The original design of the midas shared memory (back in the 1990's) was that the ODB shared memory file gets
only saved into the .ODB.SHM when the *last* client exits. This ensures to keep the ODB persistent when the
shared memory gets deleted. I vaguely remember I put something in like:

db_close_database()
...
  destroy_flag = (pheader->num_clients == 0);

  if (destroy_flag)
     ss_shm_flush(pheader->name, pdb->shm_adr, pdb->shm_size, pdb->shm_handle);
...

Now I see that the "if (destory_flag)" is missing. Not sure if it was removed once, or if it actually never
was there. But I see no point in flushing the ODB when a client ends. We need the flushing only before the
shared memory gets deleted. We we have to ensure that the share memory and the binary dump file stay in sync
(like if all midas clients die at the same time), we could add some code to flush the ODB like once per minute,
but not attach it to db_close_database(). I know several experiments using "odbedit -c xxx" in vast quantities,
so all these experiments would then benefit.

Note: Mu3e at PSI also uses 100 MB ODB, and they really need it.

Thoughts and opinions?

Best,
Stefan
                         Reply  12 Jun 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > correction: ODB shared memory is saved to .ODB.SHM each time a client stops, this is db_close_database().
> 
> The original design of the midas shared memory (back in the 1990's) was that the ODB shared memory file gets
> only saved into the .ODB.SHM when the *last* client exits. This ensures to keep the ODB persistent when the
> shared memory gets deleted. I vaguely remember I put something in like:
> 
> db_close_database()
> ...
>   destroy_flag = (pheader->num_clients == 0);
> 
>   if (destroy_flag)
>      ss_shm_flush(pheader->name, pdb->shm_adr, pdb->shm_size, pdb->shm_handle);

I remember the same, but I tracked it down in git to the very first commit, and there is no if() there,
odb is saved to .ODB.SHM on every client shutdown, not just the last client. I guess we both misremebered.

What's more, ss_shm_flush() is done while holding the ODB semaphore, so all other midas programs that try to access
odb at the same time (including the mserver) will stall until write() and close() return. at least we do not fsync(),
and there is no waiting until data is committed to physical media.

$ git annotate 3bb04af4d^ src/odb.c
...
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	875)  destroy_flag = (pheader->num_clients == 0);
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	876)
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	877)  /* flush shared memory to disk */
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	878)  ss_flush_shm(pheader->name, pheader, sizeof(DATABASE_HEADER)+2*pheader->data_size);
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	879)
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	880)  /* unmap shared memory, delete it if we are the last */
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	881)  ss_close_shm(pheader->name, pheader,
ef8320177	(Stefan Ritt	1998-10-08 13:46:02 +0000	882)               _database[hDB-1].shm_handle, destroy_flag);
...

K.O.
                            Reply  13 Jun 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> I remember the same, but I tracked it down in git to the very first commit, and there is no if() there,
> odb is saved to .ODB.SHM on every client shutdown, not just the last client. I guess we both misremebered.

I confirm. Really strange how your mind can trick you. I'm absolutely sure I had this planned originally (1995?), but it got never implemented.

Well, never too late. So I added the "if" and committed to develop. I did a quick test and things seem to work fine here. Actually programs stop 
a bit faster now. So please everybody give it a try and report back here.

BTW, how do I resize the ODB. I remember we discussed this some time ago, and concluded that odbedit needs a resize flag. Has this even been 
done? If not, what is the "official" way to resize the ODB. We had some documentation about that some time ago, but I can't find it anymore.

Stefan
                               Reply  13 Jun 2023, Marius Koeppel, Suggestion, Maximum ODB size 
> BTW, how do I resize the ODB. I remember we discussed this some time ago, and concluded that odbedit needs a resize flag. Has this even been 
> done? If not, what is the "official" way to resize the ODB. We had some documentation about that some time ago, but I can't find it anymore.

I guess this is still not done and the issue is still open: https://bitbucket.org/tmidas/midas/issues/329/need-odbresize
I guess if we touch this maybe the problem with the wrong size should be also fixed: https://bitbucket.org/tmidas/midas/issues/328/odbinit-s-1024mb-creates-odb-with-wrong

Best,
Marius
                                  Reply  13 Jun 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > BTW, how do I resize the ODB.

ODB cannot be resized "online". Everything has to stop, save content to odb.json, get rid of old ODB.SHM, ensure ODB shared memory is destroyed (SysV or POSIX shared memory), 
create new ODB with new size, load odb.json. Feel free to punch this into chatgpt > odbresize.cxx, commit, test, push.

> I remember we discussed this some time ago, and concluded that odbedit needs a resize flag.

ODB cannot be resized online. ODB API has ODB clients holding ODB handles which are pointers (offsets) into ODB shared memory.

> Has this even been done?
> I guess this is still not done and the issue is still open: https://bitbucket.org/tmidas/midas/issues/329/need-odbresize
> I guess if we touch this maybe the problem with the wrong size should be also fixed: https://bitbucket.org/tmidas/midas/issues/328/odbinit-s-1024mb-creates-odb-with-wrong

please contribute 14 distraction-free days to my patreon. thanks in advance!

K.O.
                               Reply  13 Jun 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > I remember the same, but I tracked it down in git to the very first commit, and there is no if() there,
> > odb is saved to .ODB.SHM on every client shutdown, not just the last client. I guess we both misremebered.

small problem. build an experiment, start taking data, observe how ODB is never saved to disk because the "last client" never stops. as bonus, crash 
the computer, observe how all changes to ODB are now lost. if mlogger is configured to save odb.json at the end of run, and to write ODB dumps at 
begin and end of every data file, you can recover some of the lost data.

for better effect, ODB should be dumped to disk at periodic intervals. but. current implementation writes odb to disk while holding the ODB 
semaphore, which means all ODB access stops for the duration, specifically, there will be gaps in the history because mlogger cannot read history 
data from ODB.

a better implementation could take the ODB lock, make a copy of ODB shared memory, release the ODB lock, complete writing to disk without holding the 
lock. protection is needed against 100 midas programs trying to do this all at the same time. computers with 0.5 GB RAM (many ARM FPGA SoCs) will be 
limited to ~100 Mbyte ODB). plus deal with memory allocation failures when taking a copy of a 2GB ODB.

in theory, the mmap() shared memory (already implemented in midas) does this automatically, but we lose control
over disk writes, we see some OSes write odb to disk "too often" and at wrong times, i.e. while we are in the middle
of creating or deleting something. current sequence of open(), atomic write() and close() ensures ODB.SHM always
contains a valid odb. (minus loss of OS and disk caches to crash or power loss).

K.O.
                                  Reply  13 Jun 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> small problem. build an experiment, start taking data, observe how ODB is never saved to disk because the "last client" never stops. as bonus, crash 
> the computer, observe how all changes to ODB are now lost. if mlogger is configured to save odb.json at the end of run, and to write ODB dumps at 
> begin and end of every data file, you can recover some of the lost 

The new behavior is not much worse than before. Assume 10 programs running happily for days, computer crashes, all ODB changes lost. 
So indeed a periodic flush without holding the lock might be best. Use a semaphore to prevent all programs flushing at the same time, or put
the flush only in the logger after an end of run.

Stefan
                                     Reply  13 Jun 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> 
> > small problem. build an experiment, start taking data, observe how ODB is never saved to disk because the "last client" never stops. as bonus, crash 
> > the computer, observe how all changes to ODB are now lost. if mlogger is configured to save odb.json at the end of run, and to write ODB dumps at 
> > begin and end of every data file, you can recover some of the lost 
> 
> The new behavior is not much worse than before. Assume 10 programs running happily for days, computer crashes, all ODB changes lost. 
> So indeed a periodic flush without holding the lock might be best. Use a semaphore to prevent all programs flushing at the same time, or put
> the flush only in the logger after an end of run.

are you sure? when/how often does "last midas program finishes" happen? it does not happen on a system crash, not on power loss, not on "shutdown -r now" 
(I am pretty sure). In the experiments you run, how often do you shut down all programs (and check that you did not forget one somehow)?

sanity check. dragon experiment, very active, .ODB.SHM timestamp is 1 second old. not-very-active agmini, today is June 13th, timestamp of .ODB.SHM is June 
2nd. inactive TACTIC, timestamp of .ODB.SHM is May 16th.

so yes, not great, but in the new scheme, ODB.SHM timestamps would probably be from 2021 or 2020.

my vote is to undo this change, it is dangerous because it causes odb to be saved to ODB.SHM never.

K.O.
                                        Reply  13 Jun 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> are you sure? when/how often does "last midas program finishes" happen? it does not happen on a system crash, not on power loss, not on "shutdown -r now" 
> (I am pretty sure). In the experiments you run, how often do you shut down all programs (and check that you did not forget one somehow)?

Indeed this is almost never the case, maybe once per months. On the other hand, we have a complete crash of the os maybe once a year. Most of the time the programs 
run continuously (we do not need odbedit), so our timestamp is typically one or two days old, so not good either.

> my vote is to undo this change, it is dangerous because it causes odb to be saved to ODB.SHM never.

My vote is to flush the odb either periodically or after each run.

Stefan 
                                           Reply  15 Jun 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> 
> > are you sure? when/how often does "last midas program finishes" happen? it does not happen on a system crash, not on power loss, not on "shutdown -r now" 
> > (I am pretty sure). In the experiments you run, how often do you shut down all programs (and check that you did not forget one somehow)?
> 
> Indeed this is almost never the case, maybe once per months. On the other hand, we have a complete crash of the os maybe once a year. Most of the time the programs 
> run continuously (we do not need odbedit), so our timestamp is typically one or two days old, so not good either.
> 
> > my vote is to undo this change, it is dangerous because it causes odb to be saved to ODB.SHM never.
> 
> My vote is to flush the odb either periodically or after each run.
> 

So we are in agreement.

RFE filed:
https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-periodically

Dangerous change reverted:
60e4c44ad66346b89ba057391acf7a02890049be

K.O.

bash-3.2$ git diff
diff --git a/src/odb.cxx b/src/odb.cxx
index 0d3b88c2..d104ff28 100644
--- a/src/odb.cxx
+++ b/src/odb.cxx
@@ -2199,7 +2199,14 @@ INT db_close_database(HNDLE hDB)
       destroy_flag = (pheader->num_clients == 0);
 
       /* flush shared memory to disk */
-      if (destroy_flag)
+
+      /* if we save ODB to disk only after last client finishes, we will never save ODB to disk
+         in most experiments - none of them ever completely stop MIDAS in normal operation.
+         as result, all changes to ODB contents will be lost on system crash, power loss
+         or normal reboot. see https://daq00.triumf.ca/elog-midas/Midas/2539
+         K.O. June 2023. */
+
+      if (1 || destroy_flag)
          ss_shm_flush(pheader->name, pdb->shm_adr, pdb->shm_size, pdb->shm_handle);
 
       strlcpy(xname, pheader->name, sizeof(xname));

K.O.
                                              Reply  28 Jul 2023, Stefan Ritt, Suggestion, Maximum ODB size 
> RFE filed:
> https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-periodically

Implemented and closed: https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-periodically

Stefan
                                                 Reply  09 Aug 2023, Konstantin Olchanski, Suggestion, Maximum ODB size 
> > RFE filed:
> > https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-periodically
> 
> Implemented and closed: https://bitbucket.org/tmidas/midas/issues/367/odb-should-be-saved-to-disk-periodically
> 
> Stefan

Stefan's comments from the closed bug report:

Ok I implemented some periodic flushing. Here is what I did:

Created

/System/Flush/Flush period : TID_UINT32 /System/Flush/Last flush : TID_UINT32

which control the flushing to disk. The default value for “Flush period” is 60 seconds or one minute.

All clients call db_flush_database() through their cm_yield() function
db_flush_database() checks the “Last flush” and only flushes the ODB when the period has expired. This test is 
done inside the ODB semaphore so that we don’t get a race condigiton
If the period has expired, db_flush_database() calls ss_shm_flush()
ss_shm_flush() tries to allocate a buffer of the shared memory. If the allocation is not successful (out of 
memory), ss_shm_flush() writes directly to the binary file as before.
If the allocation is successful, ss_shm_flush() copies the share memory to a buffer and passes this buffer to a 
dedicated thread which writes the buffer to the binary file. This causes ss_shm_flush() to return immediately and 
not block the calling program during the disk write operation.
Added back the “if (destroy_flag) ss_shm_flush()” so that the ODB is flushed for sure before the shared memory 
gets deleted.
This means now that under normal circumstances, exiting programs like odbedit do NOT flush the ODB. This allows to 
call many “odbedit -c” in a row without the flush penalty. Nevertheless, the ODB then gets flushed by other 
clients latest 60 seconds (or whatever the flush period is) after odbedit exits.

Please note that ODB flushing has two purposes:

When all programs exit, we need a persistent storage for the ODB. In most experiments this only happens very 
seldom. Maybe at the end of a beam time period.
If the computer crashes, a recent version of the ODB is kept on disk to simplify recovery after the crash.
Since crashes are not so often (during production periods we have maybe one hardware failure every few years) the 
flushing of the ODB too often does not make sense and just consumes resources. Flushing does also not help from 
corrupted ODBs, since the binary image will also get corrupted. So the only reason for periodic flushes is to ease 
recovery after a total crash. I put the default to 60 seconds, but if people are really paranoid they can decrease 
it to 10 seconds or so. Or increase it to 600 seconds if their system does not crash every week and disks are 
slow.

I made a dedicated branch feature/periodic_odb_flush so people can test the new functionality. If there are no 
complaints within the next few days, I will merge that into develop.

Stefan
Entry  06 Mar 2023, Gennaro Tortone, Forum, pull request for PostgreSQL support 
Hi,

some minutes ago I published a PR for PostgreSQL support I developed
at INFN-Napoli for Darkside experiment...

I don't know if you receive a notification about this PR and in doubt
I wrote this message...

Thanks in advance,
Gennaro
    Reply  06 Mar 2023, Konstantin Olchanski, Forum, pull request for PostgreSQL support 
> some minutes ago I published a PR for PostgreSQL support I developed
> at INFN-Napoli for Darkside experiment...
> 
> I don't know if you receive a notification about this PR and in doubt
> I wrote this message...

Hi, Gennaro, thank you for the very useful contribution. I saw the previous version 
of your pull request and everything looked quite good. But that pull request was 
for an older version of midas and it would not have applied cleanly to the current 
version. I will take a look at your updated pull request. In theory it should only 
add the Postgres class and modify a few other places in history_schema.cxx and have 
no changes to anything else. (if you need those changes, it should be a separate 
pull request).

Also I am curious what benefits and drawbacks of Postgres vs mysql/mariadb you have 
observed for storing and using midas history data.

K.O.
       Reply  06 Mar 2023, Gennaro Tortone, Forum, pull request for PostgreSQL support 
Hi Konstantin,
thanks for this update |

My main interest for PostgreSQL is usage of TimescaleDB
(https://github.com/timescale/timescaledb) a PostgreSQL extension that
makes possible usage of downsampling functions on time-series...

here at INFN-Napoli we have a large history dataset that we manage
with MIDAS history and MySQL tables. We have a lot of issues 
(wait time, browser hangs, crashes) when we use MIDAS history plot 
pages on large time period because the Javascript web page try to 
download million of records in order to display them on a plot of 
(max) 2000 pixel width...

with native downsampling we can reduce a large dataset keeping the
"shape" of the curve using only the points needed by the plot area;

in TimescaleDB there is "lttb" ( Largest Triangle Three Bucket) a very 
nice and impressive downsampling function that preserve very well the
shape of the series.

If you are interested to see a lttb at work on some data you can open this page:
   https://www.base.is/flot

In next days I will work to add TimescaleDB backend to MIDAS history (it will be
similar to PostgreSQL backend) and we can discuss on how to add these 
downsampling features to history plot web pages, I already developed some
solutions and I will be happy to share them with MIDAS community;

Cheers,
Gennaro

> > some minutes ago I published a PR for PostgreSQL support I developed
> > at INFN-Napoli for Darkside experiment...
> > 
> > I don't know if you receive a notification about this PR and in doubt
> > I wrote this message...
> 
> Hi, Gennaro, thank you for the very useful contribution. I saw the previous version 
> of your pull request and everything looked quite good. But that pull request was 
> for an older version of midas and it would not have applied cleanly to the current 
> version. I will take a look at your updated pull request. In theory it should only 
> add the Postgres class and modify a few other places in history_schema.cxx and have 
> no changes to anything else. (if you need those changes, it should be a separate 
> pull request).
> 
> Also I am curious what benefits and drawbacks of Postgres vs mysql/mariadb you have 
> observed for storing and using midas history data.
> 
> K.O.
       Reply  20 Mar 2023, Gennaro Tortone, Forum, pull request for PostgreSQL support 
Hi,
I have updated the PR with a new one that includes TimescaleDB support and some
changes to mhistory.js to support downsampling queries...

Cheers,
Gennaro

> > some minutes ago I published a PR for PostgreSQL support I developed
> > at INFN-Napoli for Darkside experiment...
> > 
> > I don't know if you receive a notification about this PR and in doubt
> > I wrote this message...
> 
> Hi, Gennaro, thank you for the very useful contribution. I saw the previous version 
> of your pull request and everything looked quite good. But that pull request was 
> for an older version of midas and it would not have applied cleanly to the current 
> version. I will take a look at your updated pull request. In theory it should only 
> add the Postgres class and modify a few other places in history_schema.cxx and have 
> no changes to anything else. (if you need those changes, it should be a separate 
> pull request).
> 
> Also I am curious what benefits and drawbacks of Postgres vs mysql/mariadb you have 
> observed for storing and using midas history data.
> 
> K.O.
          Reply  24 May 2023, Gennaro Tortone, Forum, pull request for PostgreSQL support 
Hi,
is there any news regarding this pull request ? 
(https://bitbucket.org/tmidas/midas/pull-requests/30)

If you agree to merge I can resolve conflicts that now 
(after two months) are listed...

Regards,
Gennaro

> 
> Hi,
> I have updated the PR with a new one that includes TimescaleDB support and some
> changes to mhistory.js to support downsampling queries...
> 
> Cheers,
> Gennaro
> 
> > > some minutes ago I published a PR for PostgreSQL support I developed
> > > at INFN-Napoli for Darkside experiment...
> > > 
> > > I don't know if you receive a notification about this PR and in doubt
> > > I wrote this message...
> > 
> > Hi, Gennaro, thank you for the very useful contribution. I saw the previous version 
> > of your pull request and everything looked quite good. But that pull request was 
> > for an older version of midas and it would not have applied cleanly to the current 
> > version. I will take a look at your updated pull request. In theory it should only 
> > add the Postgres class and modify a few other places in history_schema.cxx and have 
> > no changes to anything else. (if you need those changes, it should be a separate 
> > pull request).
> > 
> > Also I am curious what benefits and drawbacks of Postgres vs mysql/mariadb you have 
> > observed for storing and using midas history data.
> > 
> > K.O.
             Reply  18 Jul 2023, Konstantin Olchanski, Forum, pull request for PostgreSQL support 
> is there any news regarding this pull request ? 
> (https://bitbucket.org/tmidas/midas/pull-requests/30)

apologies for taking a very long time to review the proposed changes.

the main problem with this pull request remains, it tangles together too many changes to the code and I cannot simply 
say "this is okey", merge and commit it.

example of unrelated change is diff of mlogger.cxx, change of function in: "db_get_value(hDB, 0, "/Logger/Multithread 
transitions" ... )". there is also unrelated changes to whitespace sprinkled around.

can you review your diffs again and try to remove as much unrelated and unnecessary changes as you can?

I could do this for you, and merge my version, but next time you merge base midas, you will have a collision.

unrelated change of function is introduction of something called "downsampling", what is the purpose of this? How is it 
different from requesting binned data? Is it just a kludge to reduce the data size? Before we merge it, can you post a 
description/discussion to this forum here? (as a separate topic, separate from discussion of PostgreSQL merge).

the changes to add PostgreSQL so fat look reasonable:
- CMakeLists, is always painful but if you do same a MySQL, should be okey, we always end up rejigging this several 
times before it works everywhere.
- history.h, ok, minus changes for adding the "downsample" feature
- mlogger.cxx, changes are too tangled with "downsample" feature, cannot review
- SetDownsample() API is defective, should have separate Get() and Set() functions
- history_common.cxx, please do not add downsampling code to history providers that do not/will not support it.
- history_odbc.cxx, please do not change it. it does not support downsampling and never will.
- history.cxx, ditto
- mjsonrpc.cxx, history API is changed, we must know: is new JS compatible with old mhttpd? is old JS compatible with 
new mhttpd? (mixed versions are very common in practice). if there is incompatibility, can you recoded it to be 
compatible?
- history_schema.cxx: bitbucket diff is a dog's breakfast, cannot review. I will have to checkout your branch and diff 
by hand.

changes to mhistory.js appear to be extensive and some explanation is needed for what is changed, what bugs/problems 
are fixed, what new features are added.

to move forward, can you generate a pull requests that only adds pgsql to history_schema.cxx, history_common.cxx and 
mlogger.cxx and does not add any other functions, features and does not change any whistespace?

K.O.


> 
> If you agree to merge I can resolve conflicts that now 
> (after two months) are listed...
> 
> Regards,
> Gennaro
> 
> > 
> > Hi,
> > I have updated the PR with a new one that includes TimescaleDB support and some
> > changes to mhistory.js to support downsampling queries...
> > 
> > Cheers,
> > Gennaro
> > 
> > > > some minutes ago I published a PR for PostgreSQL support I developed
> > > > at INFN-Napoli for Darkside experiment...
> > > > 
> > > > I don't know if you receive a notification about this PR and in doubt
> > > > I wrote this message...
> > > 
> > > Hi, Gennaro, thank you for the very useful contribution. I saw the previous version 
> > > of your pull request and everything looked quite good. But that pull request was 
> > > for an older version of midas and it would not have applied cleanly to the current 
> > > version. I will take a look at your updated pull request. In theory it should only 
> > > add the Postgres class and modify a few other places in history_schema.cxx and have 
> > > no changes to anything else. (if you need those changes, it should be a separate 
> > > pull request).
> > > 
> > > Also I am curious what benefits and drawbacks of Postgres vs mysql/mariadb you have 
> > > observed for storing and using midas history data.
> > > 
> > > K.O.
                Reply  21 Jul 2023, Konstantin Olchanski, Forum, pull request for PostgreSQL support 
> > is there any news regarding this pull request ? 
> > (https://bitbucket.org/tmidas/midas/pull-requests/30)
> 
> apologies for taking a very long time to review the proposed changes.
> 

I merged the PgSql bits by hand - the automatic tools make a dog's breakfast from the history_schema.cxx diffs. Ouch.

history_schema.cxx merged pretty much cleanly, but I have one question about CreateSqlColumn() with sql_strict set to "true". Can you say 
more why this is needed? Should this also be made the default for MySQL? The best I can tell the default values are only needed if we write 
to SQL but forget to provide values that should not be NULL? But our code never does this? Or this is for reading from SQL, where NULL values 
are replaced with the default values? I do not have time to look into this right now, I hope you can clarify it for me?

Also notice the fDownsample is set to zero and cannot be changed. I recommend we set it through the MakeMidasHistoryPgsql() factory method.

Please pull, merge, retest, update the pull request, check that there is no unrelated changes (changes in mlogger.cxx is a direct red flag!) 
and we should be able to merge the rest of your stuff pronto.

K.O.

commit e85bb6d37c85f02fc4895cae340ba71ab36de906 (HEAD -> develop, origin/develop, origin/HEAD)
Author: Konstantin Olchanski <olchansk@triumf.ca>
Date:   Fri Jul 21 09:45:08 2023 -0700

    merge PQSQL history in history_schema.cxx

commit f254ebd60a23c6ee2d4870f3b6b5e8e95a8f1f09
Author: Konstantin Olchanski <olchansk@triumf.ca>
Date:   Fri Jul 21 09:19:07 2023 -0700

    add PGSQL Makefile bits

commit aa5a35ba221c6f87ae7a811236881499e3d8dcf7
Author: Konstantin Olchanski <olchansk@triumf.ca>
Date:   Fri Jul 21 08:51:23 2023 -0700

    merge PGSQL support from https://bitbucket.org/gtortone/midas/branch/feature/timescaledb_support except for history_schema.cxx
                   Reply  21 Jul 2023, Gennaro Tortone, Forum, pull request for PostgreSQL support 









Hi Konstantin,

thanks a lot for your work on PostgreSQL and TimescaleDB integration...
and sorry for unrelated changes on source code !

I will return on this task at end of this year (maybe October or November) because
I'm working on different tasks... but I will keep in mind your suggestions in order
to provide good source code.

Thanks,
Gennaro


> 
> I merged the PgSql bits by hand - the automatic tools make a dog's breakfast from the history_schema.cxx diffs. Ouch.
> 
> history_schema.cxx merged pretty much cleanly, but I have one question about CreateSqlColumn() with sql_strict set to "true". Can you say 
> more why this is needed? Should this also be made the default for MySQL? The best I can tell the default values are only needed if we write 
> to SQL but forget to provide values that should not be NULL? But our code never does this? Or this is for reading from SQL, where NULL values 
> are replaced with the default values? I do not have time to look into this right now, I hope you can clarify it for me?
> 
> Also notice the fDownsample is set to zero and cannot be changed. I recommend we set it through the MakeMidasHistoryPgsql() factory method.
> 
> Please pull, merge, retest, update the pull request, check that there is no unrelated changes (changes in mlogger.cxx is a direct red flag!) 
> and we should be able to merge the rest of your stuff pronto.
> 
> K.O.
> 
> commit e85bb6d37c85f02fc4895cae340ba71ab36de906 (HEAD -> develop, origin/develop, origin/HEAD)
> Author: Konstantin Olchanski <olchansk@triumf.ca>
> Date:   Fri Jul 21 09:45:08 2023 -0700
> 
>     merge PQSQL history in history_schema.cxx
> 
> commit f254ebd60a23c6ee2d4870f3b6b5e8e95a8f1f09
> Author: Konstantin Olchanski <olchansk@triumf.ca>
> Date:   Fri Jul 21 09:19:07 2023 -0700
> 
>     add PGSQL Makefile bits
> 
> commit aa5a35ba221c6f87ae7a811236881499e3d8dcf7
> Author: Konstantin Olchanski <olchansk@triumf.ca>
> Date:   Fri Jul 21 08:51:23 2023 -0700
> 
>     merge PGSQL support from https://bitbucket.org/gtortone/midas/branch/feature/timescaledb_support except for history_schema.cxx
                      Reply  28 Jul 2023, Stefan Ritt, Forum, pull request for PostgreSQL support 
The compilation of midas was broken by the last modification. The reason is that 

   Pgsql *fPgsql = NULL;

was not protected by 

#ifdef HAVE_PGSQL

So I put all PGSQL code under a big #ifdef and now it compiles again. You might want to double check my modification at

https://bitbucket.org/tmidas/midas/commits/e3c7e73459265e0d7d7a236669d1d1f2d9292a74

Best,
Stefan
                         Reply  09 Aug 2023, Konstantin Olchanski, Forum, pull request for PostgreSQL support 
> The compilation of midas was broken by the last modification. The reason is that 
>    Pgsql *fPgsql = NULL;
> was not protected by #ifdef HAVE_PGSQL

confirmed, my mistake, I forgot to test with "make cmake NO_PGSQL". your fix is correct, thanks.

K.O.
Entry  02 Aug 2023, Caleb Marshall, Forum, Issues with Universe II Driver  

Hello,

At our lab we are currently in the process of migrating more of our systems over to Midas. However, all of our working systems are dependent on SBCs with the Tsi-148 chips of which we only have a handful. In order to have some backups and spares for testing, we have been attempting to get Midas working with some borrowed SBCs (Concurrent Technologies VX 40x/04x) with Universe-II chips. The SBC is running CentOS 7. I have tried to follow the instructions posted here. The universe-II kernel module appears to load correctly, dmesg gives:

[   32.384826] VME: Board is system controller
[   32.384875] VME: Driver compiled for SMP system
[   32.384877] VME: Installed VME Universe module version: 3.6.KO6
 

However, running vmescan.exe fails with a segfault. Running with gdb shows:

vmic_mmap: Mapped VME AM 0x0d addr 0x00000000 size 0x00ffffff at address 0x80a01000
mvme_open:
Bus handle              = 0x7
DMA handle              = 0x6045d0
DMA area size           = 1048576 bytes
DMA    physical address = 0x7ffff7eea000
vmic_mmap: Mapped VME AM 0x2d addr 0x00000000 size 0x0000ffff at address 0x86ff0000

Program received signal SIGSEGV, Segmentation fault.
mvme_read_value (mvme=0x604010, vme_addr=<optimized out>)
    at /home/jam/midas/packages/midas/drivers/vme/vmic/vmicvme.c:352
352        dst  = *((WORD *)addr);
 

With the pointer addr originating from a call to vmic_mapcheck within the  mvme_read_value functions in the vmicvme.c file. Help with where to go from here would be appreciated.

-Caleb 

 

    Reply  02 Aug 2023, Konstantin Olchanski, Forum, Issues with Universe II Driver  
I maintain the tsi148 and the universe-II drivers. I confirm -KO6 is my latest 
version, last updated for 32-bit Debian-11, and we still use it at TRIUMF.

It is good news that the vme_universe kernel module built, loaded and reported 
correct stuff to dmesg.

It is not clear why mvme_read_value() crashed. We need to know the value of 
vme_addr and addr, can you add printf()s for them using format "%08x" and try 
again?

K.O.

<p>&nbsp;</p>

<table align="center" cellspacing="1" style="border:1px solid #486090; 
width:98%">
	<tbody>
		<tr>
			<td style="background-color:#486090">Caleb Marshall 
wrote:</td>
		</tr>
		<tr>
			<td style="background-color:#FFFFB0">
			<p>Hello,</p>

			<p>At our lab we are currently in the process of 
migrating more of our systems over to Midas. However, all of our working systems 
are dependent on SBCs with the Tsi-148 chips of which we only have a handful. In 
order to have some backups and spares for testing, we have been attempting to 
get Midas working with some borrowed SBCs (Concurrent Technologies VX 40x/04x) 
with Universe-II chips. The SBC is running&nbsp;CentOS 7.&nbsp;I have tried to 
follow the instructions posted <a 
href="https://daq00.triumf.ca/DaqWiki/index.php/VME-
CPU#V7648_and_V7750_BIOS_Settings">here</a>. The universe-II kernel module 
appears to load correctly, dmesg gives:</p>

			<p>[ &nbsp; 32.384826] VME: Board is system 
controller<br />
			[ &nbsp; 32.384875] VME: Driver compiled for SMP 
system<br />
			[ &nbsp; 32.384877] VME: Installed VME Universe module 
version: 3.6.KO6<br />
			&nbsp;</p>

			<p>However, running vmescan.exe fails with a segfault. 
Running with gdb shows:</p>

			<p>vmic_mmap: Mapped VME AM 0x0d addr 0x00000000 size 
0x00ffffff at address 0x80a01000<br />
			mvme_open:<br />
			Bus handle &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 
&nbsp;= 0x7<br />
			DMA handle &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 
&nbsp;= 0x6045d0<br />
			DMA area size &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; = 
1048576 bytes<br />
			DMA &nbsp; &nbsp;physical address = 0x7ffff7eea000<br />
			vmic_mmap: Mapped VME AM 0x2d addr 0x00000000 size 
0x0000ffff at address 0x86ff0000</p>

			<p>Program received signal SIGSEGV, Segmentation fault.
<br />
			mvme_read_value (mvme=0x604010, vme_addr=&lt;optimized 
out&gt;)<br />
			&nbsp; &nbsp; at 
/home/jam/midas/packages/midas/drivers/vme/vmic/vmicvme.c:352<br />
			352&nbsp;&nbsp; &nbsp; &nbsp; &nbsp;dst &nbsp;= *((WORD 
*)addr);<br />
			&nbsp;</p>

			<p>With the pointer addr originating from a call 
to&nbsp;vmic_mapcheck within the&nbsp;&nbsp;mvme_read_value functions in the 
vmicvme.c file. Help with where to go from here would be appreciated.</p>

			<p>-Caleb&nbsp;</p>

			<p>&nbsp;</p>
			</td>
		</tr>
	</tbody>
</table>

<p>&nbsp;</p>
       Reply  03 Aug 2023, Caleb Marshall, Forum, Issues with Universe II Driver  
Here is the output:

vmic_mmap: Mapped VME AM 0x0d addr 0x00000000 size 0x00ffffff at address 0x80a01000
mvme_open:
Bus handle              = 0x3
DMA handle              = 0x158f5d0
DMA area size           = 1048576 bytes
DMA    physical address = 0x7f91db553000
vmic_mmap: Mapped VME AM 0x2d addr 0x00000000 size 0x0000ffff at address 0x86ff0000
vme addr: 00000000 
addr: db543000 
          Reply  03 Aug 2023, Konstantin Olchanski, Forum, Issues with Universe II Driver  
> Here is the output:
> 
> vmic_mmap: Mapped VME AM 0x0d addr 0x00000000 size 0x00ffffff at address 0x80a01000
> mvme_open:
> Bus handle              = 0x3
> DMA handle              = 0x158f5d0
> DMA area size           = 1048576 bytes
> DMA    physical address = 0x7f91db553000
> vmic_mmap: Mapped VME AM 0x2d addr 0x00000000 size 0x0000ffff at address 0x86ff0000
> vme addr: 00000000 
> addr: db543000 

I see the problem. A24 is mapped at 0x80xxxxxx, A16 is mapped at 0x86ffxxxx, but 
mvme_read computed address 0xdb543000, out of range of either mapped vme address. ouch.

One more thing to check, AFAIK, this universe-II codes were never used on 64-bit CPU 
before, we only have 32-bit Pentium-3 and Pentium-4 machines with these chips. The 
tsi148 codes used to work both 32-bit and 64-bit, we used to have both flavours of 
CPUs, but now only have 64-bit.

What is your output for "uname -a"? does it report 32-bit or 64-bit kernel?

If you feel adventurous, you can build 32-bit midas (cd .../midas; make linux32), 
compile vmescan.o with "-m32" and link vmescan.exe against .../midas/linux-m32/lib, and 
see if that works. Meanwhile, I can check if vmicvme.c is 64-bit clean. Checking if 
kernel module is 64-bit clean would be more difficult...

K.O.
             Reply  03 Aug 2023, Caleb Marshall, Forum, Issues with Universe II Driver  
I am looking into compiling the 32 bit midas.

In the meantime, here is the kernel info:

3.10.0-1160.11.1.el7.x86_64 

Thank you for the help.
-Caleb
                Reply  04 Aug 2023, Caleb Marshall, Forum, Issues with Universe II Driver  
I can compile 32 bit midas. Unless I am interpreting the linking error, I don't 
think I can use the driver as built. 

While trying to compile vme_scan, most of the programs fail with:

/usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-
linux/4.8.5/../../../../lib/libvme.so when searching for -lvme
/usr/bin/ld: skipping incompatible /lib/../lib/libvme.so when searching for -lvme
/usr/bin/ld: skipping incompatible /usr/lib/../lib/libvme.so when searching for -
lvme
/usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-redhat-
linux/4.8.5/../../../libvme.so when searching for -lvme
/usr/bin/ld: skipping incompatible //lib/libvme.so when searching for -lvme
/usr/bin/ld: skipping incompatible //usr/lib/libvme.so when searching for -lvme

with libvme.so being built by the universe-II driver. Not sure if I can get around 
this without messing with the driver? Is it possible to build a 32 bit version of 
that shared library without having to touch the actual kernel module? 

-Caleb
                   Reply  04 Aug 2023, Konstantin Olchanski, Forum, Issues with Universe II Driver  
> I can compile 32 bit midas. Unless I am interpreting the linking error, I don't 
> think I can use the driver as built.

I think you are right, Makefile from the Universe package does not build a -m32 version 
of libvme.so. I think I can fix that...

K.O.
Entry  12 May 2023, Stefan Ritt, Info, New environment variable MIDAS_EXPNAME 
A new environment variable MIDAS_EXPNAME has been introduced. This must be
used for cases where people use MIDAS_DIR, and is then equivalent for the
experiment name and directory usually used in the exptab file. This fixes
and issue with creating and deleting shared memory in midas as described in

https://bitbucket.org/tmidas/midas/issues/363/deletion-of-shared-memory-fails

The documentation has been updated at

https://daq00.triumf.ca/MidasWiki/index.php/MIDAS_environment_variables#MIDAS_EXPN
AME

/Stefan
    Reply  12 May 2023, Konstantin Olchanski, Info, New environment variable MIDAS_EXPNAME 
> A new environment variable MIDAS_EXPNAME has been introduced [to be used together with 
MIDAS_DIR]

This is fixes an important buglet. If experiment uses MIDAS_DIR instead of exptab, at the time 
of connecting to ODB, we do not know the experiment name and use name "Default" to create ODB 
shared memory, instead of actual experiment name.

This creates an inconsistency, if some MIDAS programs in the same experiment use MIDAS_DIR while 
others use exptab (this would be unusual, but not impossible) they would connect to two 
different ODB shared memories, former using name "Default", latter using actual experiment name.

As an indication that something is not right, when stopping MIDAS programs, there is an error 
message about failure to delete shared ODB shared memory (because it was created using name 
"Default" and delete using the correct experiment name fails).

Also it can cause co-mingling between two different experiments, depending on the type of shared 
memory used by MIDAS (see $MIDAS_DIR/.SHM_TYPE.TXT):
POSIX - (usually not used) not affected (experiment name is not used)
POSIXv2 - (usually not used) affected (shm name is "$EXP_NAME_ODB")
POSIXv3 - used on MacOS - affected (shm name is "$UID_$EXP_NAME_ODB" so "$UID_Default_ODB" will 
collide)
POSIXv4 - used on Linux - not affected (shm name includes $MIDAS_DIR which is different for 
different experiments)

K.O.
       Reply  20 Jun 2023, Stefan Ritt, Info, New environment variable MIDAS_EXPNAME 
I just realized that we had already MIDAS_EXPT_NAME, and now people get confused with

MIDAS_EXPT_NAME
and
MIDAS_EXPNAME

In trying to fix this confusion, I changed the name of the second variable to MIDAS_EXPT_NAME as well, 
so we only have one variable now. If this causes any problems please report here.

Stefan
          Reply  28 Jul 2023, Stefan Ritt, Info, New environment variable MIDAS_EXPNAME 
Concerning naming of shared memories I went one step further due to some requirement of a local experiment. 
The experiment needs to change the experiment name shown on the web status page depending on the exact 
configuration, but we do not want to change the whole midas experiment each time.

So I simple removed the check that the experiment name coming from the environment and used for the shared 
memory gets checked against the experiment name in the ODB. The only connection there is that the ODB name 
gets set to the environment name is it does not exist or is empty, just to have some default value. So for 
most people nothing should change. If one changes however the name in the ODB (under /Experiment/Name), 
nothing will change internally, just the web display via mhttpd changes its title.

I hope this has no bad side-effects, so please have a look if you see any issue in your experiment.

Stefan
Entry  24 Jul 2023, Nick Hastings, Bug Report, Incompatible data types with mysql odbc interface 
Hello,

I have recently set up a midas-2022-05-c instance and have been trying to configure
it to use the mysql odbc interface. Tables are being created for it but
the logger is issuing errors that some of the column types are incorrect. For example
in the log I see:

14:22:12.689 2023/07/25 [Logger,ERROR] [history_odbc.cxx:1531:hs_define_event,ERROR] Error: History event 'Run transitions': Incompatible data type for tag 'State' type 'UINT32', SQL column 'state' type 'INT UNSIGNED'
14:22:12.689 2023/07/25 [Logger,ERROR] [history_odbc.cxx:1531:hs_define_event,ERROR] Error: History event 'Run transitions': Incompatible data type for tag 'Run number' type 'UINT32', SQL column 'run_number' type 'INT UNSIGNED'

Checking the table in the database I see:

MariaDB [t2kgscND280]> describe run_transitions;
+------------+------------------+------+-----+---------------------+-------------------------------+
| Field      | Type             | Null | Key | Default             | Extra                         |
+------------+------------------+------+-----+---------------------+-------------------------------+
| _t_time    | timestamp        | NO   | MUL | current_timestamp() | on update current_timestamp() |
| _i_time    | int(11)          | NO   | MUL | NULL                |                               |
| state      | int(10) unsigned | YES  |     | NULL                |                               |
| run_number | int(10) unsigned | YES  |     | NULL                |                               |
+------------+------------------+------+-----+---------------------+-------------------------------+
4 rows in set (0.000 sec)

Please note that this is not the only history variable that has this problem. There are multiple variables
for which:

Incompatible data type for tag 'Foo Bar' type 'UINT32', SQL column 'foo_bar' type 'INT UNSIGNED'

Checking history_odbc.cxx, I see:

static const char *sql_type_mysql[] = {
   "xxxINVALIDxxxNULL", // TID_NULL
   "tinyint unsigned",  // TID_UINT8
   "tinyint",           // TID_INT8
   "char",              // TID_CHAR
   "smallint unsigned", // TID_UINT16
   "smallint",          // TID_INT16
   "integer unsigned",  // TID_UINT32
   "integer",           // TID_INT32
   "tinyint",           // TID_BOOL
   "float",             // TID_FLOAT
   "double",            // TID_DOUBLE
   "tinyint unsigned",  // TID_BITFIELD
   "VARCHAR",           // TID_STRING
   "xxxINVALIDxxxARRAY",
   "xxxINVALIDxxxSTRUCT",
   "xxxINVALIDxxxKEY",
   "xxxINVALIDxxxLINK"
};

So it seems that unsigned int should map to UINT32.

The database is:
Server version: 10.5.16-MariaDB MariaDB Server

Please let me know if more information is needed.

Note that the choice of using the odbc interface is because we
plan to import an old database that was created using the odbc interface
with a previous version of midas (yes this is your old friend T2K/ND280).

Regards,

Nick.
    Reply  25 Jul 2023, Nick Hastings, Bug Report, Incompatible data types with mysql odbc interface 
Hello,

wanted add few things:

1. I see the same problem for INT32
2. For now I've worked around these problems with https://bitbucket.org/nickhastings/midas/commits/e4776f7511de0647077c8c80d43c17bbfe2184fd
3. I'm using mariadb-connector-odbc-3.1.12-3.el9.x86_64 (System is AlmaLinux 9)

Regards,

Nick.
Entry  18 Jul 2023, Gennaro Tortone, Bug Report, access to filesystem through mhttpd 
Hi,

after some networks security scans I received some warnings because mhttpd expose
server filesystem through HTTP(S)...

in details a MIDAS user can access to /etc/passwd or download other files from
filesystem using a web browser:

(e.g. http://midas.host:8080/etc/passwd)

I know that /etc/passwd does not contain users password and mhttpd runs as an
unprivileged user but in principle this should be avoided in order to minimize 
security risks: if I authorize a user to use MIDAS interface in order to handle
acquisition tasks this should not authorize the user to access the server filesystem...
but this access should be restricted to MIDAS web pages, custom pages etc.

What do you think about this ?

Cheers,
Gennaro
    Reply  18 Jul 2023, Konstantin Olchanski, Bug Report, access to filesystem through mhttpd 
> (e.g. http://midas.host:8080/etc/passwd)

not again! I complained about this before, and I added a fix, but it must be broken again.

getting a copy of /etc/passwd is reasonably benign, but getting a copy of 
/home/$USER/.ssh/id_rsa, id_rsa.pub, knownhosts and authorized_keys is a disaster.

(running mhttpd behind a web proxy does not solve the problem, number of attackers is 
reduced to only the people who know the proxy password and to local users).

K.O.
       Reply  19 Jul 2023, Zaher Salman, Bug Report, access to filesystem through mhttpd 
Have you actually been able to read /etc/passwd this way? I tested this on a few of our servers and it does not work. As far as I know, there is access to files in resources, custom pages etc.

Other possible ways to access the file system is via mjsonrpc calls, but again these are restricted to certain folders.

Can you please give us more details about this.

Zaher

> > (e.g. http://midas.host:8080/etc/passwd)
> 
> not again! I complained about this before, and I added a fix, but it must be broken again.
> 
> getting a copy of /etc/passwd is reasonably benign, but getting a copy of 
> /home/$USER/.ssh/id_rsa, id_rsa.pub, knownhosts and authorized_keys is a disaster.
> 
> (running mhttpd behind a web proxy does not solve the problem, number of attackers is 
> reduced to only the people who know the proxy password and to local users).
> 
> K.O.
Entry  11 Jul 2023, Anubhav Prakash, Forum, Possible ODB corruption! Webpages https://midptf01.triumf.ca/?cmd=Programs not loading! Screenshot_2023-07-11_153016.pngScreenshot_2023-07-11_153016j.png
The ODB server seems to have crashed/corrupted. I tried reloading the previous
working version of ODB(using the commands in folliwng image) but it didn't work.



I have also attached the screenshot of the site https://midptf01.triumf.ca/?cmd=Programs. Any help to resolve this would be appreciated! Normally Prof. Thomas Lindner would solve such issues, but he is busy working at CERN till 17th of July, and we cannot afford to wait until then.

The following is the error: when I run bash /home/midptf/online/bin/start_daq.sh

[ODBEdit1,INFO] Fixing ODB "/Programs/ODBEdit" struct size mismatch (expected
316, odb size 92)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/ODBEdit" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11381:db_get_record1,ERROR] after db_check_record()
still struct size mismatch (expected 316, odb size 92) of "/Programs/ODBEdit",
calling db_create_record()
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11387:db_get_record1,ERROR] repaired struct size
mismatch of "/Programs/ODBEdit"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/ODBEdit" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [alarm.cxx:702:al_check,ERROR] Cannot get program info record
for program "ODBEdit", db_get_record1() status 319
[ODBEdit1,INFO] Fixing ODB "/Programs/mhttpd" struct size mismatch (expected
316, odb size 60)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/mhttpd" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11381:db_get_record1,ERROR] after db_check_record()
still struct size mismatch (expected 316, odb size 92) of "/Programs/mhttpd",
calling db_create_record()
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11387:db_get_record1,ERROR] repaired struct size
mismatch of "/Programs/mhttpd"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/mhttpd" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [alarm.cxx:702:al_check,ERROR] Cannot get program info record
for program "mhttpd", db_get_record1() status 319
[ODBEdit1,INFO] Fixing ODB "/Programs/Logger" struct size mismatch (expected
316, odb size 60)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/Logger" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11381:db_get_record1,ERROR] after db_check_record()
still struct size mismatch (expected 316, odb size 92) of "/Programs/Logger",
calling db_create_record()
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11387:db_get_record1,ERROR] repaired struct size
mismatch of "/Programs/Logger"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/Logger" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [alarm.cxx:702:al_check,ERROR] Cannot get program info record
for program "Logger", db_get_record1() status 319

14:54:29 [ODBEdit,ERROR] [odb.cxx:1763:db_validate_db,ERROR] Warning: database
data area is 100% full

14:54:29 [ODBEdit,ERROR] [odb.cxx:1283:db_validate_key,ERROR] hkey 643368, path
"/Alarms/Classes/<NULL>/Display BGColor", string value is not valid UTF-8

14:54:29 [ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot
malloc_data(256), called from db_set_link_data

14:54:29 [ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot
reallocate "/System/Tmp/140305391605888I/Start command" with new size 256 bytes,
online database full

14:54:29 [ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of
zero, set to 32, odb path "Start command"

14:54:29 [ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of
zero, set to 32, odb path "Start command"
    Reply  11 Jul 2023, Thomas Lindner, Forum, Possible ODB corruption! Webpages https://midptf01.triumf.ca/?cmd=Programs not loading! 
Hi Anubhav,

I have fixed the ODB corruption problem.

Cheers,
Thomas



Anubhav Prakash wrote:
The ODB server seems to have crashed/corrupted. I tried reloading the previous
working version of ODB(using the commands in folliwng image) but it didn't work.



I have also attached the screenshot of the site https://midptf01.triumf.ca/?cmd=Programs. Any help to resolve this would be appreciated! Normally Prof. Thomas Lindner would solve such issues, but he is busy working at CERN till 17th of July, and we cannot afford to wait until then.

The following is the error: when I run bash /home/midptf/online/bin/start_daq.sh

[ODBEdit1,INFO] Fixing ODB "/Programs/ODBEdit" struct size mismatch (expected
316, odb size 92)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/ODBEdit" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11381:db_get_record1,ERROR] after db_check_record()
still struct size mismatch (expected 316, odb size 92) of "/Programs/ODBEdit",
calling db_create_record()
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11387:db_get_record1,ERROR] repaired struct size
mismatch of "/Programs/ODBEdit"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/ODBEdit" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [alarm.cxx:702:al_check,ERROR] Cannot get program info record
for program "ODBEdit", db_get_record1() status 319
[ODBEdit1,INFO] Fixing ODB "/Programs/mhttpd" struct size mismatch (expected
316, odb size 60)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/mhttpd" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11381:db_get_record1,ERROR] after db_check_record()
still struct size mismatch (expected 316, odb size 92) of "/Programs/mhttpd",
calling db_create_record()
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11387:db_get_record1,ERROR] repaired struct size
mismatch of "/Programs/mhttpd"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/mhttpd" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [alarm.cxx:702:al_check,ERROR] Cannot get program info record
for program "mhttpd", db_get_record1() status 319
[ODBEdit1,INFO] Fixing ODB "/Programs/Logger" struct size mismatch (expected
316, odb size 60)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/Logger" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11381:db_get_record1,ERROR] after db_check_record()
still struct size mismatch (expected 316, odb size 92) of "/Programs/Logger",
calling db_create_record()
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11387:db_get_record1,ERROR] repaired struct size
mismatch of "/Programs/Logger"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/Logger" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [alarm.cxx:702:al_check,ERROR] Cannot get program info record
for program "Logger", db_get_record1() status 319

14:54:29 [ODBEdit,ERROR] [odb.cxx:1763:db_validate_db,ERROR] Warning: database
data area is 100% full

14:54:29 [ODBEdit,ERROR] [odb.cxx:1283:db_validate_key,ERROR] hkey 643368, path
"/Alarms/Classes/<NULL>/Display BGColor", string value is not valid UTF-8

14:54:29 [ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot
malloc_data(256), called from db_set_link_data

14:54:29 [ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot
reallocate "/System/Tmp/140305391605888I/Start command" with new size 256 bytes,
online database full

14:54:29 [ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of
zero, set to 32, odb path "Start command"

14:54:29 [ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of
zero, set to 32, odb path "Start command"
Entry  21 Jun 2023, Gennaro Tortone, Bug Report, mserver and script execution 
Hi,
I have the following setup:

- MIDAS release: release/midas-2022-05-c
- host with MIDAS frontend (mclient)
- host with MIDAS server (mhttpd / mserver)

On mclient I run a frontend with:

./feodt5751 -h mserver -e develop -i 0

On mserver I see frontend ready and ODB variables in place;

I noticed a strange behavior with "/Programs/Execute on start run" and 
"/Programs/Execute on stop run". In details the script to execute at start of run
is executed on "mserver" host but the script to execute at stop of run is executed on
"mclient" host (!)

Is this a bug or I'm missing some documentation links ?

Thanks in advance,
Gennaro
    Reply  26 Jun 2023, Stefan Ritt, Bug Report, mserver and script execution 
Indeed that could well be (and is certainly not intended like that). I checked the code
and found that "execute on start run" and "execute on stop run" are called inside
cm_transition(). That means they are executed on the computer which calls cm_transition().
If you use mhttpd and start a run through the web interface, then mhttpd runs on your
server and "execute on start run" gets executed on your server. If you stop the run
by your frontend running on the client machine (like if a certain number of events 
is reached), then "execute on stop run" gets executed on your client.

An easy way around would not to use "/Equipment/Trigger/Common/Event limit" which
gets check by your frontend and therefore on the client computer, but use 
"/Logger/Channels/0/Settings/Event limit" which gets checked by the logger and
therefore executed on the server computer.

Getting a consistent behaviour (like always executing scripts on the server) would
require a major rework of the run transition framework with probably many undesired
side-effects, so lots of debugging work.

Stefan
       Reply  27 Jun 2023, Gennaro Tortone, Bug Report, mserver and script execution 
Hi Stefan,

> Indeed that could well be (and is certainly not intended like that). I checked the code
> and found that "execute on start run" and "execute on stop run" are called inside
> cm_transition(). That means they are executed on the computer which calls cm_transition().
> If you use mhttpd and start a run through the web interface, then mhttpd runs on your
> server and "execute on start run" gets executed on your server. If you stop the run
> by your frontend running on the client machine (like if a certain number of events 
> is reached), then "execute on stop run" gets executed on your client.

ok, this is clear to me...

> An easy way around would not to use "/Equipment/Trigger/Common/Event limit" which
> gets check by your frontend and therefore on the client computer, but use 
> "/Logger/Channels/0/Settings/Event limit" which gets checked by the logger and
> therefore executed on the server computer.

we never used "/Equipment/Trigger/Common/Event limit" but we always used
"/Logger/Channels/0/Settings/Event limit"...

btw I did some tests and I understand that this issue is related to 'deferred transition'
on frontend. Indeed I disabled deferred transition on frontend side and now script
execution is carried out always on MIDAS server;

Cheers,
Gennaro
          Reply  27 Jun 2023, Stefan Ritt, Bug Report, mserver and script execution 
> btw I did some tests and I understand that this issue is related to 'deferred transition'
> on frontend. Indeed I disabled deferred transition on frontend side and now script
> execution is carried out always on MIDAS server;

Ah, that's clear now. In a deferred transition, the frontend finally stops the run (after the 
condition is given to finish). Since the client calls cm_transition(), the script gets executed on 
the client. Changing that would be a rather large rework of the code. So maybe better call a 
script which executes another script via ssh on the server.

Stefan
Entry  23 Jun 2023, Gennaro Tortone, Bug Report, deferred stop transition 
Hi,

I'm facing some issues with 'stop' deferred transition and I suspect of
a MIDAS bug regarding this...

to reproduce the issue I use the 'deferredfe' MIDAS example (develop branch), 
changing only the equipment name from 'Deferred' to 'Deferred%02d' in order
to be able to run multiple 'deferredfe' instances;

I run *three* 'deferredfe' frontends using:

./deferredfe -i 0
./deferredfe -i 2
./deferredfe -i 3

Everything goes fine on MIDAS web page and 'deferredfe' frontends are initialized
and ready to run; issues occour after 'start' when I stop the frontends: sometimes
at first shot and sometimes at next 'start'/'stop' the deferred 'stop' transition
seems to be handled in wrong way... and often one frontend goes in 'segmentation fault'

The odd thing is when I run *two* instances: in this case no issues are reported...

Thanks in advance,
Gennaro
    Reply  23 Jun 2023, Stefan Ritt, Bug Report, deferred stop transition 
Deferred transitions were only implemented with a single instance of a program deferring the 
transition. To have several instances, MIDAS probably needs to be extended. Certainly this 
was never tested, so it's not a surprise that we get a segmentation fault.

Stefan
       Reply  23 Jun 2023, Gennaro Tortone, Bug Report, deferred stop transition 
Hi Stefan,

so if I have two different frontends (feov1725 and feodt5751) connected on the same 'mserver'
I'm in the same situation ?

Cheers,
Gennaro

> Deferred transitions were only implemented with a single instance of a program deferring the 
> transition. To have several instances, MIDAS probably needs to be extended. Certainly this 
> was never tested, so it's not a surprise that we get a segmentation fault.
> 
> Stefan
          Reply  23 Jun 2023, Stefan Ritt, Bug Report, deferred stop transition 
> I'm in the same situation ?

Yepp.
       Reply  26 Jun 2023, Gennaro Tortone, Bug Report, deferred stop transition 
> Deferred transitions were only implemented with a single instance of a program deferring the 
> transition. To have several instances, MIDAS probably needs to be extended. Certainly this 
> was never tested, so it's not a surprise that we get a segmentation fault.
> 
> Stefan

Hi Stefan,

I copied deferredfe.cxx to mydeferredfe.cxx and I changed mydeferred.cxx to be a different frontend:

const char *frontend_name = "mydeferredfe";

If I start two "different" frontends:

./deferredfe
./mydeferredfe 

and try to start/stop a run... the result is the same: frontend status messing up on next 'start':

---

deferredfe:

Started run 332
Event ID:4 - Event#: 0
Event ID:4 - Event#: 1
Event ID:4 - Event#: 2
Event ID:4 - Event#: 3
Event ID:4 - Event#: 4
Event ID:4 - Event#: 5
Event ID:4 - Event#: 6

mydeferredfe:

Started run 332
Transition ignored, Event ID:2 - Event#: 0
Transition ignored, Event ID:2 - Event#: 1
Transition ignored, Event ID:2 - Event#: 2
Transition ignored, Event ID:2 - Event#: 3
Transition ignored, Event ID:2 - Event#: 4
End of cycle... perform transition
Event ID:2 - Event#: 5
End of cycle... perform transition
Event ID:2 - Event#: 6

---

so, it seems that the issue is not related to different 'instances' of same frontend but
that *at most* one frontend on whole MIDAS server can handle deferred transitions...

is this the case ?

Cheers,
Gennaro 
          Reply  26 Jun 2023, Stefan Ritt, Bug Report, deferred stop transition 
> so, it seems that the issue is not related to different 'instances' of same frontend but
> that *at most* one frontend on whole MIDAS server can handle deferred transitions...
> 
> is this the case ?


Correct.

- Stefan
Entry  13 Jun 2023, Thomas Senger, Forum, Include subroutine through relative path in sequencer 
Hi, I would like to restructure our sequencer scripts and the paths. Until now many things are not generic at all. I would like to ask if it is possible to include files through a relative path for example something like 
INCLUDE ../chip/global_basic_functions
Maybe I just did not found how to do it.
    Reply  13 Jun 2023, Stefan Ritt, Forum, Include subroutine through relative path in sequencer 
> Hi, I would like to restructure our sequencer scripts and the paths. Until now many things are not generic at all. I would like to ask if it is possible to include files through a relative path for example something like 
> INCLUDE ../chip/global_basic_functions
> Maybe I just did not found how to do it.

It was not there. I implemented it in the last commit.

Stefan
       Reply  13 Jun 2023, Marco Francesconi, Forum, Include subroutine through relative path in sequencer 
> > Hi, I would like to restructure our sequencer scripts and the paths. Until now many things are not generic at all. I would like to ask if it is possible to include files through a relative path for example something like 

> > INCLUDE ../chip/global_basic_functions

> > Maybe I just did not found how to do it.

> 

> It was not there. I implemented it in the last commit.

> 

> Stefan



Hi Stefan,

when I did this job for MEG II we decided not to include relative paths and the ".." folder to avoid an exploit called "XML Entity Injection".

In short is to avoid leaking files outside the sequencer folders like  /etc/password or private SSH keys.

I do not remember in this moment why we pushed for absolute paths instead but let's keep this in mind.



Marco
          Reply  13 Jun 2023, Stefan Ritt, Forum, Include subroutine through relative path in sequencer 
> when I did this job for MEG II we decided not to include relative paths and the ".." folder to avoid an exploit called "XML Entity Injection".
> In short is to avoid leaking files outside the sequencer folders like  /etc/password or private SSH keys.
> I do not remember in this moment why we pushed for absolute paths instead but let's keep this in mind.

I thought about that. But before we had absolute paths in the sequencer INCLUDE statement. So having "../../../etc/passwd" is as bad as the
absolute path "/etc/passwd". So nothing really changed. What we really should prevent is to LOAD files into the sequencer from outside the
sequence subdirectory. And this is prevented by the file loader. Actually we will soon replace the file loaded with a modern JS dialog, and
the code restricts all operations to within the experiment directory and below.

Stefan
Entry  09 Jun 2023, Konstantin Olchanski, Info, added IPv6 support for mserver and MIDAS RPC 
as of commit 71fb5e82f3e9a1501b4dc2430f9193ee5d176c82, MIDAS RPC and the mserver 
listen for connections both on IPv4 and IPv6. mserver clients and MIDAS RPC 
clients can connect to MIDAS using both IPv4 and IPv6. In the default 
configuration ("/Expt/Security/Enable non-localhost RPC" set to "n"), IPv4 
localhost is used, as before. Support for IPv6 is a by product from switching 
from obsolete non-thread-safe gethostbyname() and getaddrbyname() to modern 
getaddrinfo() and getnameinfo(). This fixes bug 357, observed crash of mhttpd 
inside gethostbyname(). K.O.
Entry  23 May 2023, Kou Oishi, Bug Report, Event builder fails at every 10 runs 
Dear MIDAS experts,

Greetings! 
I am currently utilizing MIDAS for our experiment and I have encountered an issue with our event builder, which was developed based on the example code 'eventbuilder/mevb.cxx'. I'm uncertain whether this is a genuine bug or an inherent feature of MIDAS.

The event builder fails to initiate the 10th run since its startup, requiring us to relaunch it. Upon investigating the code, I have identified that this issue stems from line 8404 of mfe.cxx (the version's hash is db94df6fa79772c49888da9374e143067a1fff3a). According to the code, the 10-run limit is imposed by the variable MAX_EVENT_REQUESTS in midas.h. While I can increase this value as the code suggests, it does not provide a complete solution, as the same problem will inevitably resurface. This complication unnecessarily hampers our data collection during long observation periods.

Despite the code indicating 'BM_NO_MEMORY: too many requests,' this explanation does not seem logical to me. In fact, other standard frontends do not encounter this problem and can start new runs as required without requiring a frontend relaunch.

I apologize for not yet fully grasping the intricate implementation of midas.cxx and mfe.cxx. However, I would greatly appreciate any suggestions or insights you can offer to help resolve this issue.

Thank you in advance for your kind assistance.
    Reply  31 May 2023, Ben Smith, Bug Report, Event builder fails at every 10 runs 
> The event builder fails to initiate the 10th run since its startup, 
> 'BM_NO_MEMORY: too many requests,'

Hi Kou,

It sounds like you might be calling bm_request_event() when starting a run, but not calling bm_delete_request() when the run stops. So you end up "leaking" event requests and eventually reach the limit of 10 open requests.

In examples/eventbuilder/mevb.c the request deletion happens in source_unbooking(), which is called as part of the "run stopping" logic. I've just updated the midas repository so the example compiles correctly, and was able to start/stop 15 runs without crashing.

Can you check the end-of-run logic in your version to ensure you're calling bm_delete_request()?
       Reply  02 Jun 2023, Kou Oishi, Bug Report, Event builder fails at every 10 runs 
Dear Ben,

Hello. Thank you for your attention to this problem!

> It sounds like you might be calling bm_request_event() when starting a run, but not calling bm_delete_request() when the run stops. So you end up "leaking" event requests and eventually reach the limit of 10 open requests.

I understand. Thanks for the description.

> In examples/eventbuilder/mevb.c the request deletion happens in source_unbooking(), which is called as part of the "run stopping" logic. I've just updated the midas repository so the example compiles correctly, and was able to start/stop 15 runs without crashing.
> 
> Can you check the end-of-run logic in your version to ensure you're calling bm_delete_request()?

I really appreciate your update.
Although I am away at the moment from the DAQ development, I will test it and report the result here as soon as possible.

Best regards,
Kou
Entry  10 May 2023, Lukas Gerritzen, Suggestion, Desktop notifications for messages 
It would be nice to have MIDAS notifications pop up outside of the browser window.

To get enable this myself, I hijacked the speech synthesis and I added the following to mhttpd_speak_now(text) inside mhttpd.js:
let notification = new Notification('MIDAS Message', {
    body: text,
});

I couldn't ask for the permission for notifications here, as Firefox threw the error "The Notification permission may only be requested from inside a short running user-generated event handler". Therefore, I added a button to config.html:
<button class="mbutton" onclick="Notification.requestPermission()">Request notification permission</button>

There might be a more elegant solution to request the permission.
    Reply  10 May 2023, Stefan Ritt, Suggestion, Desktop notifications for messages 

Lukas Gerritzen wrote:
It would be nice to have MIDAS notifications pop up outside of the browser window.


There are certainly dozens of people who do "I don't like pop-up windows all the time". So this has to come with a switch in the config page to turn it off. If there is a switch "allow pop-up windows", then we have the other fraction of people using Edge/Chrome/Safari/Opera saying "it's not working on my specific browser on version x.y.z". So I'm only willing to add that feature if we are sure it's a standard things working in most environments.

Best,
Stefan
       Reply  10 May 2023, Lukas Gerritzen, Suggestion, Desktop notifications for messages 

Stefan Ritt wrote:

people using Edge/Chrome/Safari/Opera saying "it's not working on my specific browser on version x.y.z". So I'm only willing to add that feature if we are sure it's a standard things working in most environments.



[The API looks pretty standard to me. Firefox, Chrome, Opera have been supporting it for about 9 years, Safari for almost 6. I didn't find out when Edge 14 was released, but they're at version 112 now.

Since browsers don't want to annoy their users, many don't allow websites to ask for permissions without user interaction. So the workflow would be something like: The user has to press a button "please ask for permission", then the browser opens a dialog "do you want to grant this website permission to show notifications?" and only then it works. So I don't think it's an annoying popup-mess, especially since system notifications don't capture the focus and typically vanish after a few seconds. If that feature is hidden behind a button on the config page, it shouldn't lead to surprises. Especially since users can always revoke that permission.
          Reply  11 May 2023, Stefan Ritt, Suggestion, Desktop notifications for messages 
Ok, I implemented desktop notifications. In the MIDAS config page, you can now enable browser notifications for the different types of messages. Not sure this works perfectly, but a staring point. So please let me know if there is any issue.

Stefan
Entry  10 May 2023, Lukas Gerritzen, Suggestion, Make sequencer more compatible with mobile devices 
When trying to select a run script on an iPad or other mobile device, you cannot enter subdirectories. This is caused by the following part:
if (script.substring(0, 1) === "[") {
   // refuse to load script if the selected a subdirectory
   return;
}

and the fact that the <option> elements are listening for double click events, which seem to be impossible on a mobile device.

The following modification allows browsing the directories without changing the double click behaviour on a desktop:
diff --git a/resources/load_script.html b/resources/load_script.html
index 41bfdccd..36caa57f 100644
--- a/resources/load_script.html
+++ b/resources/load_script.html
@@ -59,6 +59,28 @@
 </div>

 <script>
+   document.getElementById("msg_sel").onchange = function() {
+      script = this.value;
+      button = document.getElementById("load_button");
+      if (script.substring(0, 4) === "[..]") {
+         // Change button to go back
+         enable_button_by_id("load_button");
+         button.innerHTML = "Back";
+         button.onclick = up_subdir;
+      } else if (script.substring(0, 1) === "[") {
+         // Change button to load subdirectory
+         enable_button_by_id("load_button");
+         button.innerHTML = "Enter subdirectory";
+         button.onclick = load_subdir;
+      } else {
+         // Change button to load script
+         enable_button_by_id("load_button");
+         button = document.getElementById("load_button");
+         button.innerHTML = "Load script";
+         button.onclick = load_script;
+      }
+   }
+
 function set_if_changed(id, value)
 {
    var e = document.getElementById(id);

This makes the code quoted above redundant, so the check can actually be omitted.
    Reply  10 May 2023, Stefan Ritt, Suggestion, Make sequencer more compatible with mobile devices 

Lukas Gerritzen wrote:
When trying to select a run script on an iPad or other mobile device, you cannot enter subdirectories. This is caused by the following part:


We are working right now on a general file picker, which will replace also the file picker for the sequencer. So please wait until the new thing is out and then test it there.

Stefan
Entry  08 May 2023, Alexey Kalinin, Forum, Scrript in sequencer 
Hello,

I tried different ways to pass parameters to bash script, but there are seems to 
be empty, what could be the problem?

We have seuqencer like

ODBGET "/Runinfo/runnumber", firstrun
LOOP n,10
#changing HV
TRANSITION start
WAIT seconds,300
TRANSITION stop
ENDLOOP
ODBGET "/Runinfo/runnumber", lastrun
SCRIPT /.../script.sh ,$firstrun ,$lastrun

and script.sh like
firstrun=$1
lastrun=$2

Thanks. Alexey.
    Reply  08 May 2023, Stefan Ritt, Forum, Scrript in sequencer 
> I tried different ways to pass parameters to bash script, but there are seems to 
> be empty, what could be the problem?

Indeed there was a bug in the sequencer with parameter passing to scripts. I fixed it
and committed the changes to the develop branch.

Stefan
       Reply  09 May 2023, Alexey Kalinin, Forum, Scrript in sequencer 
Thanks. It works perfect.
Another question is:
Is it possible to run .msl seqscript from bash cmd?
Maybe it's easier then
1 odbedit -c 'set "/sequencer/load filename" filename.msl'
2 odbedit -c 'set "/sequencer/load new file" TRUE'
3 odbedit -c 'set "/sequencer/start script" TRUE'

What is the best way to have a button starting sequencer
from  /script (or /alias )?

Alexey.

> > I tried different ways to pass parameters to bash script, but there are seems to 
> > be empty, what could be the problem?
> 
> Indeed there was a bug in the sequencer with parameter passing to scripts. I fixed it
> and committed the changes to the develop branch.
> 
> Stefan
          Reply  10 May 2023, Stefan Ritt, Forum, Scrript in sequencer 
> Thanks. It works perfect.
> Another question is:
> Is it possible to run .msl seqscript from bash cmd?
> Maybe it's easier then
> 1 odbedit -c 'set "/sequencer/load filename" filename.msl'
> 2 odbedit -c 'set "/sequencer/load new file" TRUE'
> 3 odbedit -c 'set "/sequencer/start script" TRUE'

That will work.

> What is the best way to have a button starting sequencer
> from  /script (or /alias )?

Have a look at 

https://daq00.triumf.ca/MidasWiki/index.php/Sequencer#Controlling_the_sequencer_from_custom_pages

where I put the necessary information.

Stefan
Entry  26 Apr 2023, Martin Mueller, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option 
Hi

We have a problem with running midas frontends when they should connect to a experiment on a different machine using the -h option. Starting them locally works fine. Firewall is off on both systems, Enable non-localhost RPC and Disable RPC hosts check are set to 'yes', mserver is running on the machine that we want to connect to. 
Error message looks like this: 

...
Connect to experiment Mu3e on host 10.32.113.210...
OK
Init hardware...
terminate called after throwing an instance of 'mexception'
  what():  
/home/mu3e/midas/include/odbxx.h:1102: Wrong key type in XML file
Stack trace:
1   0x00000000000042D828 (null) + 4380712
2   0x00000000000048ED4D midas::odb::odb_from_xml(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 605
3   0x0000000000004999BD midas::odb::odb(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 317
4   0x000000000000495383 midas::odb::read_key(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 1459
5   0x0000000000004971E3 midas::odb::connect(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool) + 259
6   0x000000000000497636 midas::odb::connect(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) + 502
7   0x00000000000049883B midas::odb::connect_and_fix_structure(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 171
8   0x0000000000004385EF setup_odb() + 8351
9   0x00000000000043B2E6 frontend_init() + 22
10  0x000000000000433304 main + 1540
11  0x0000007F8C6FE3724D __libc_start_main + 239
12  0x000000000000433F7A _start + 42

Aborted (core dumped)


We have the same problem for all our frontends. When we want to start them locally they work. Starting them locally with ./frontend -h localhost also reproduces the error above.

The error can also be reproduced with the odbxx_test.cxx example in the midas repo by replacing line 22 in midas/examples/odbxx/odbxx_test.cxx (cm_connect_experiment(NULL, NULL, "test", NULL);) with cm_connect_experiment("localhost", "Mu3e", "test", NULL); (Put the name of the experiment instead of "Mu3e")

running odbxx_test locally gives us then the same error as our other frontend.

Thanks in advance,
Martin
    Reply  27 Apr 2023, Konstantin Olchanski, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option 
Looks like your MIDAS is built without debug information (-O2 -g), the stack trace does not have file names and line numbers. Please rebuild with debug information and report the stack trace. Thanks. K.O.

> Connect to experiment Mu3e on host 10.32.113.210...
> OK
> Init hardware...
> terminate called after throwing an instance of 'mexception'
>   what():  
> /home/mu3e/midas/include/odbxx.h:1102: Wrong key type in XML file
> Stack trace:
> 1   0x00000000000042D828 (null) + 4380712
> 2   0x00000000000048ED4D midas::odb::odb_from_xml(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 605
> 3   0x0000000000004999BD midas::odb::odb(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 317
> 4   0x000000000000495383 midas::odb::read_key(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 1459
> 5   0x0000000000004971E3 midas::odb::connect(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool) + 259
> 6   0x000000000000497636 midas::odb::connect(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) + 502
> 7   0x00000000000049883B midas::odb::connect_and_fix_structure(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 171
> 8   0x0000000000004385EF setup_odb() + 8351
> 9   0x00000000000043B2E6 frontend_init() + 22
> 10  0x000000000000433304 main + 1540
> 11  0x0000007F8C6FE3724D __libc_start_main + 239
> 12  0x000000000000433F7A _start + 42
> 
> Aborted (core dumped)

K.O.
       Reply  28 Apr 2023, Martin Mueller, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option 
> Looks like your MIDAS is built without debug information (-O2 -g), the stack trace does not have file names and line numbers. Please rebuild with debug information and report the stack trace. Thanks. K.O.
> 
> > Connect to experiment Mu3e on host 10.32.113.210...
> > OK
> > Init hardware...
> > terminate called after throwing an instance of 'mexception'
> >   what():  
> > /home/mu3e/midas/include/odbxx.h:1102: Wrong key type in XML file
> > Stack trace:
> > 1   0x00000000000042D828 (null) + 4380712
> > 2   0x00000000000048ED4D midas::odb::odb_from_xml(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 605
> > 3   0x0000000000004999BD midas::odb::odb(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 317
> > 4   0x000000000000495383 midas::odb::read_key(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 1459
> > 5   0x0000000000004971E3 midas::odb::connect(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool) + 259
> > 6   0x000000000000497636 midas::odb::connect(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) + 502
> > 7   0x00000000000049883B midas::odb::connect_and_fix_structure(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 171
> > 8   0x0000000000004385EF setup_odb() + 8351
> > 9   0x00000000000043B2E6 frontend_init() + 22
> > 10  0x000000000000433304 main + 1540
> > 11  0x0000007F8C6FE3724D __libc_start_main + 239
> > 12  0x000000000000433F7A _start + 42
> > 
> > Aborted (core dumped)
> 
> K.O.

As i said we can easily reproduce this with midas/examples/odbxx/odbxx_test.cpp  (with cm_connect_experiment changed to "localhost")
Stack trace of odbxx_test with line numbers:

Set ODB key "/Test/Settings/String Array 10[0...9]" = ["","","","","","","","","",""]
Created ODB key "/Test/Settings/Large String Array 10"
Set ODB key "/Test/Settings/Large String Array 10[0...9]" = ["","","","","","","","","",""]
[test,ERROR] [system.cxx:5104:recv_tcp2,ERROR] unexpected connection closure
[test,ERROR] [system.cxx:5158:ss_recv_net_command,ERROR] error receiving network command header, see messages
[test,ERROR] [midas.cxx:13900:rpc_call,ERROR] routine "db_copy_xml": error, ss_recv_net_command() status 411, program abort

Program received signal SIGABRT, Aborted.
0x00007ffff6665cdb in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: zypper install libgcc_s1-debuginfo-11.3.0+git1637-150000.1.9.1.x86_64 libstdc++6-debuginfo-11.2.1+git610-1.3.9.x86_64 libz1-debuginfo-1.2.11-3.24.1.x86_64
(gdb) bt
#0  0x00007ffff6665cdb in raise () from /lib64/libc.so.6
#1  0x00007ffff6667375 in abort () from /lib64/libc.so.6
#2  0x0000000000431bba in rpc_call (routine_id=11249) at /home/labor/midas/src/midas.cxx:13904
#3  0x0000000000460c4e in db_copy_xml (hDB=1, hKey=1009608, buffer=0x7ffff7e9c010 "", buffer_size=0x7fffffffadbc, header=false) at /home/labor/midas/src/odb.cxx:8994
#4  0x000000000046fc4c in midas::odb::odb_from_xml (this=0x7fffffffb3f0, str=...) at /home/labor/midas/src/odbxx.cxx:133
#5  0x000000000040b3d9 in midas::odb::odb (this=0x7fffffffb3f0, str=...) at /home/labor/midas/include/odbxx.h:605
#6  0x000000000040b655 in midas::odb::odb (this=0x7fffffffb3f0, s=0x4a465a "/Test/Settings") at /home/labor/midas/include/odbxx.h:629
#7  0x0000000000407bba in main () at /home/labor/midas/examples/odbxx/odbxx_test.cxx:56
(gdb) 
          Reply  28 Apr 2023, Konstantin Olchanski, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option 
> As i said we can easily reproduce this with midas/examples/odbxx/odbxx_test.cpp  (with cm_connect_experiment changed to "localhost")
> [test,ERROR] [system.cxx:5104:recv_tcp2,ERROR] unexpected connection closure
> [test,ERROR] [system.cxx:5158:ss_recv_net_command,ERROR] error receiving network command header, see messages
> [test,ERROR] [midas.cxx:13900:rpc_call,ERROR] routine "db_copy_xml": error, ss_recv_net_command() status 411, program abort

ok, cool. looks like we crashed the mserver. either run mserver attached to gdb or enable mserver core dump, we need it's stack trace,
the correct stack trace should be rooted in the handler for db_copy_xml.

but most likely odbxx is asking for more data than can be returned through the MIDAS RPC.

what is the ODB key passed to db_copy_xml() and how much data is in ODB at that key? (odbedit "du", right?).

K.O.
             Reply  28 Apr 2023, Martin Mueller, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option 
> > As i said we can easily reproduce this with midas/examples/odbxx/odbxx_test.cpp  (with cm_connect_experiment changed to "localhost")
> > [test,ERROR] [system.cxx:5104:recv_tcp2,ERROR] unexpected connection closure
> > [test,ERROR] [system.cxx:5158:ss_recv_net_command,ERROR] error receiving network command header, see messages
> > [test,ERROR] [midas.cxx:13900:rpc_call,ERROR] routine "db_copy_xml": error, ss_recv_net_command() status 411, program abort
> 
> ok, cool. looks like we crashed the mserver. either run mserver attached to gdb or enable mserver core dump, we need it's stack trace,
> the correct stack trace should be rooted in the handler for db_copy_xml.
> 
> but most likely odbxx is asking for more data than can be returned through the MIDAS RPC.
> 
> what is the ODB key passed to db_copy_xml() and how much data is in ODB at that key? (odbedit "du", right?).
> 
> K.O.

Ok. Maybe i have to make this more clear. ANY odbxx access of a remote odb reproduces this error for us on multiple machines. 
It does not matter how much data odbxx is asking for.

Something as simple as this reproduces the error, asking for a single integer:

int main() {
   cm_connect_experiment("localhost", "Mu3e", "test", NULL);
   midas::odb o = {
      {"Int32 Key", 42}
   };
   o.connect("/Test/Settings");
   cm_disconnect_experiment();
   return 1;
}

at the same time this runs fine:

int main() {
   cm_connect_experiment(NULL, NULL, "test", NULL);
   midas::odb o = {
      {"Int32 Key", 42}
   };
   o.connect("/Test/Settings");
   cm_disconnect_experiment();
   return 1;
}

in both cases mserver does not crash. I do not have a stack trace. There is also no error produced by mserver.

Last year we did not have these problems with the same midas frontends (For example in midas commit 9d2ef471 the code from above runs 
fine). I am trying to pinpoint the exact commit where this stopped working now. 
                Reply  28 Apr 2023, Konstantin Olchanski, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option 
> > > As i said we can easily reproduce this with midas/examples/odbxx/odbxx_test.cpp
> > ok, cool. looks like we crashed the mserver.
> Ok. Maybe i have to make this more clear. ANY odbxx access of a remote odb reproduces this error for us on multiple machines. 
> It does not matter how much data odbxx is asking for.
> midas commit 9d2ef471 the code from above runs fine

so, a regression. ouch.

if core dumps are turned off, you will not "see" the mserver crash, because the main mserver is still running. it's the mserver forked to 
serve your RPC connection that crashes.

> int main() {
>    cm_connect_experiment("localhost", "Mu3e", "test", NULL);
>    midas::odb o = {
>       {"Int32 Key", 42}
>    };
>    o.connect("/Test/Settings");
>    cm_disconnect_experiment();
>    return 1;
> }

to debug this, after cm_connect_experiment() one has to put ::sleep(1000000000); (not that big, obviously),
then while it is sleeping do "ps -efw | grep mserver", this will show the mserver for the test program,
connect to it with gdb, wait for ::sleep() to finish and o.connect() to crash, with luck gdb will show
the crash stack trace in the mserver.

so easy to debug? this is why back in the 1970-ies clever people invented core dumps, only to have
even more clever people in the 2020-ies turn them off and generally make debugging more difficult (attaching
gdb to a running program is also disabled-by-default in some recent linuxes).

rant-off.

to check if core dumps work, to "killall -7 mserver". to enable core dumps on ubuntu, see here:
https://daq00.triumf.ca/DaqWiki/index.php/Ubuntu

last known-working point is:

commit 9d2ef471c4e4a5a325413e972862424549fa1ed5
Author: Ben Smith <bsmith@triumf.ca>
Date:   Wed Jul 13 14:45:28 2022 -0700

    Allow odbxx to handle connecting to "/" (avoid trying to read subkeys as "//Equipment" etc.

K.O.
                   Reply  02 May 2023, Niklaus Berger, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option 
Thanks for all the helpful hints. When finally managing to evade all timeouts and attach the debugger in just the right moment, we find that we get a segfault in mserver at L827:
   case RPC_DB_COPY_XML:
      status = db_copy_xml(CHNDLE(0), CHNDLE(1), CSTRING(2), CPINT(3), CBOOL(4));
Some printf debugging then pointed us to the fact that the culprit is the pointer de-referencing in CBOOL(4). This in turn can be traced back to mrpc.cxx L282 ff, where the line with the arrow was missing:
  {RPC_DB_COPY_XML, "db_copy_xml",
    {{TID_INT32, RPC_IN},
     {TID_INT32, RPC_IN},
     {TID_ARRAY, RPC_OUT | RPC_VARARRAY},
     {TID_INT32, RPC_IN | RPC_OUT},
 ->  {TID_BOOL, RPC_IN},
     {0}}},

If we put that in, the mserver process completes peacfully and we get a segfault in the client ("Wrong key type in XML file") which we will attempt to debug next. Shall I create a pull request for the additional RPC argument or will you just fix this on the fly?
                      Reply  02 May 2023, Niklaus Berger, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option 
And now we also fixed the client segfault, odb.cxx L8992 also needs to know about the header:
if (rpc_is_remote())
      return rpc_call(RPC_DB_COPY_XML, hDB, hKey, buffer, buffer_size, header);

(last argument was missing before).
                      Reply  02 May 2023, Stefan Ritt, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option 

>  Shall I create a pull request for the additional RPC argument or will you just fix this on the fly?



Just fix it in the fly yourself. It’s an obvious bug, so please commit to develop.



Stefan 
Entry  01 May 2023, Giovanni Mazzitelli, Bug Report, python issue with mathplot lib vs odb query Screenshot_2023-05-01_at_09.57.01.png
Ciao, 
we have a very strange issue with python lib with client.odb_get("/") function 
when running as midas process and matplotlib is used.


we are developing a remote console by means of sending via kafka producer the odb, 
camera image and pmt waveforms, in the INFN cloud where grafana make available 
data for non expert shifters, as well as sending midas events for online 
reconstruction to the htcondr queue on cloud. The process work perfectly and allow 
use to parallelise to standard midas pipeline for file production, ecc the online 
monitoring and data processing where we have computing resources (our DAQ is 
underground at LNGS). Part of the work will be presented next weak at CHEP
the full code is available at https://github.com/CYGNUS-
RD/middleware/blob/master/dev/event_producer_s3.py
but to get the strange behaviour I report here a test script:

----
def main(verbose=False):
    from matplotlib import pyplot as plt

    import time

    import midas
    import midas.client

    
    client = midas.client.MidasClient("middleware")
    buffer_handle = client.open_event_buffer("SYSTEM",None,1000000000)
    request_id = client.register_event_request(buffer_handle, sampling_type = 2) 
    
    fpath = os.path.dirname(os.path.realpath(sys.argv[0]))
    
    
    while True:
        # 
        odb = client.odb_get("/")
        if verbose:
            print(odb)
        start1 = time.time()
              
        client.communicate(10)
        time.sleep(1)
        

    client.deregister_event_request(buffer_handle, request_id)

    client.disconnect()
----
if I run it as cli interactivity including or not matplotlib the everything si ok. 
As I run it as midas "program" I get: 
-----
Traceback (most recent call last):
  File "/home/standard/daq/middleware/dev/test_midas_error.py", line 48, in 
<module>
    main(verbose=options.verbose)
  File "/home/standard/daq/middleware/dev/test_midas_error.py", line 29, in main
    odb = client.odb_get("/")
  File "/home/standard/packages/midas/python/midas/client.py", line 354, in 
odb_get
    retval = midas.safe_to_json(buf.value, use_ordered_dict=True)
  File "/home/standard/packages/midas/python/midas/__init__.py", line 552, in 
safe_to_json
    return json.loads(decoded, strict=False, 
object_pairs_hook=collections.OrderedDict)
  File "/usr/lib/python3.8/json/__init__.py", line 370, in loads
    return cls(**kw).decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: 
line 300 column 26 (char 17535)
----
if I comment out the import of matplotlib every think works perfectly again also 
as midas program. 

it seams that there is a difference between the to way of use the code, and that 
is sufficient the call to matplotlib to corrupt in some way the odb. any ideas?
    Reply  01 May 2023, Ben Smith, Bug Report, python issue with mathplot lib vs odb query 
> it seams that there is a difference between the to way of use the code, and that 
> is sufficient the call to matplotlib to corrupt in some way the odb. any ideas?

I can't reproduce this on my machines, so this is going to be fun to debug!

Can you try running the program below please? It takes the important bits from odb_get() but prints out the string before we try to parse it as JSON. Feel free to send me the output via email (bsmith@triumf.ca) if you don't want to post your entire ODB dump in the elog.




import sys
import os
import time
import midas
import midas.client
import ctypes

def debug_get(client):
    c_path = ctypes.create_string_buffer(b"/")
    hKey = ctypes.c_int()
    client.lib.c_db_find_key(client.hDB, 0, c_path, ctypes.byref(hKey))

    buf = ctypes.c_char_p()
    bufsize = ctypes.c_int()
    bufend = ctypes.c_int()

    client.lib.c_db_copy_json_save(client.hDB, hKey, ctypes.byref(buf), ctypes.byref(bufsize), ctypes.byref(bufend))

    print("-" * 80)
    print("FULL DUMP")
    print("-" * 80)
    print(buf.value)
    print("-" * 80)
    print("Chars 17000-18000")
    print("-" * 80)
    print(buf.value[17000:18000])
    print("-" * 80)

    as_dict = midas.safe_to_json(buf.value, use_ordered_dict=True)

    client.lib.c_free(buf)

    return as_dict

def main(verbose=False):
    client = midas.client.MidasClient("middleware")
    buffer_handle = client.open_event_buffer("SYSTEM",None,1000000000)
    request_id = client.register_event_request(buffer_handle, sampling_type = 2)

    fpath = os.path.dirname(os.path.realpath(sys.argv[0]))

    while True:
        # odb = client.odb_get("/")
        odb = debug_get(client)

        if verbose:
            print(odb)
        start1 = time.time()

        client.communicate(10)
        time.sleep(1)


    client.deregister_event_request(buffer_handle, request_id)

    client.disconnect()

if __name__ == "__main__":
    main()
       Reply  01 May 2023, Giovanni Mazzitelli, Bug Report, python issue with mathplot lib vs odb query output.txt
> > it seams that there is a difference between the to way of use the code, and that 
> > is sufficient the call to matplotlib to corrupt in some way the odb. any ideas?
> 
> I can't reproduce this on my machines, so this is going to be fun to debug!
> 
> Can you try running the program below please? It takes the important bits from odb_get() but prints out the string before we try to parse it as JSON. Feel free to send me the output via email (bsmith@triumf.ca) if you don't want to post your entire ODB dump in the elog.

Thank you!
if I added the matplotlib as follow:

#!/usr/bin/env python3

import sys
import os
import time
import midas
import midas.client
import ctypes
from matplotlib import pyplot as plt

def debug_get(client):
    c_path = ctypes.create_string_buffer(b"/")
    hKey = ctypes.c_int()
    client.lib.c_db_find_key(client.hDB, 0, c_path, ctypes.byref(hKey))

    buf = ctypes.c_char_p()
    bufsize = ctypes.c_int()
    bufend = ctypes.c_int()

    client.lib.c_db_copy_json_save(client.hDB, hKey, ctypes.byref(buf), ctypes.byref(bufsize), ctypes.byref(bufend))

    print("-" * 80)
    print("FULL DUMP")
    print("-" * 80)
    print(buf.value)
    print("-" * 80)
    print("Chars 17000-18000")
    print("-" * 80)
    print(buf.value[17000:18000])
    print("-" * 80)

    as_dict = midas.safe_to_json(buf.value, use_ordered_dict=True)

    client.lib.c_free(buf)

    return as_dict

def main(verbose=False):
    client = midas.client.MidasClient("middleware")
    buffer_handle = client.open_event_buffer("SYSTEM",None,1000000000)
    request_id = client.register_event_request(buffer_handle, sampling_type = 2)

    fpath = os.path.dirname(os.path.realpath(sys.argv[0]))

    while True:
        # odb = client.odb_get("/")
        odb = debug_get(client)

        if verbose:
            print(odb)
        start1 = time.time()

        client.communicate(10)
        time.sleep(1)


    client.deregister_event_request(buffer_handle, request_id)

    client.disconnect()

        
if __name__ == "__main__":
    from optparse import OptionParser
    parser = OptionParser(usage='usage: %prog\t ')
    parser.add_option('-v','--verbose', dest='verbose', action="store_true", default=False, help='verbose output;');
    (options, args) = parser.parse_args()
    main(verbose=options.verbose)


then tested the code in interactive mode without any error. as soon as I submit as midas "Program" I get the attached output.
thank you again, Giovanni
          Reply  01 May 2023, Ben Smith, Bug Report, python issue with mathplot lib vs odb query 
Looks like a localisation issue. Your floats are formatted as "6,6584e+01", whereas the JSON decoder expects "6.6584e+01".

Can you run the following few lines please? Then I'll be able to write a test using the same setup as you:


import locale
print(locale.getlocale())
from matplotlib import pyplot as plt
print(locale.getlocale())
 
             Reply  01 May 2023, Ben Smith, Bug Report, python issue with mathplot lib vs odb query 
> Looks like a localisation issue. Your floats are formatted as "6,6584e+01", whereas the JSON decoder expects "6.6584e+01".

This should be fixed in the latest commit to the midas develop branch. The JSON specification requires a dot for the decimal separator, so we must ignore the user's locale when formatting floats/doubles for JSON.

I've tested the fix on my machine by manually changing the locale, and also added an automated test in the python directory.
                Reply  01 May 2023, Giovanni Mazzitelli, Bug Report, python issue with mathplot lib vs odb query 
> > Looks like a localisation issue. Your floats are formatted as "6,6584e+01", whereas the JSON decoder expects "6.6584e+01".
> 
> This should be fixed in the latest commit to the midas develop branch. The JSON specification requires a dot for the decimal separator, so we must ignore the user's locale when formatting floats/doubles for JSON.
> 
> I've tested the fix on my machine by manually changing the locale, and also added an automated test in the python directory.

Thanks very macth Ben,
so if I understand correctly we have to update MIDAS to latest develop branch available? can you sand me the link to be sure of install the right update. 
can you also tell me how you fix manually? we are restarting and then well be difficult install and makes updete.
thank you again, regards, Giovanni
Entry  21 Apr 2023, Grzegorz Nieradka, Forum, Setup Midas with Caen vx2740 - ask for help 
I'm trying to setup Midas with the Caen vx2740 VME digitizer board.
As the backend driver I used the software from Darkside located here:

https://bitbucket.org/ttriumfdaq/dsproto_vx2740/src/develop/

They implemented some helpers program and one from them should diagnose correct running of digitizer. But when I'm trying to run example program "vx2740_readout_test" I have segmentation fault:

Thread 1 "vx2740_readout_" received signal SIGSEGV, Segmentation fault.
0x00005555555c2ee1 in rpc_register_function (id=id@entry=18000, func=func@entry=0x5555555a2790 <jrpc_helper(int, void**)>) at /home/astrocent/workspace/packages/midas/src/midas.cxx:11947

During the calling this program I have running mhttpd, mlogger and the backend for vx2740 from the repository.

I'm not able to find documentation what is purpose of the RPC? Could someone give any indicators how I can start debug this behavior? Or there is some documentation about the RPC?


I'm freshman in the Midas world, so at this moment everything seems for me very complicated - and I'm learning by doing.

Regards,
Grzegorz

The backtrace from gdb which indicates the function in Midas package:

#0  0x00005555555c2ee1 in rpc_register_function (id=id@entry=18000, func=func@entry=0x5555555a2790 <jrpc_helper(int, void**)>)
    at /home/astrocent/workspace/packages/midas/src/midas.cxx:11947
#1  0x00005555555c2f12 in cm_register_function (id=id@entry=18000, func=func@entry=0x5555555a2790 <jrpc_helper(int, void**)>)
    at /home/astrocent/workspace/packages/midas/src/midas.cxx:5840
#2  0x00005555555a26f6 in VX2740GroupFrontend::init (this=this@entry=0x7fffffffcba0, group_idx=group_idx@entry=-1, hDB=hDB@entry=0, enable_jrpc=enable_jrpc@entry=true)
    at /home/astrocent/workspace/packages/ttriumfdaq-dsproto_vx2740-8122058cacd1/vx2740_fe_class.cxx:134
#3  0x000055555557e492 in do_fe (board_name=..., is_scope=<optimized out>) at /home/astrocent/workspace/packages/ttriumfdaq-dsproto_vx2740-8122058cacd1/vx2740_readout_test.cxx:185
#4  0x000055555557adc9 in main (argc=<optimized out>, argv=0x7fffffffd1f8) at /home/astrocent/workspace/packages/ttriumfdaq-dsproto_vx2740-8122058cacd1/vx2740_readout_test.cxx:253
    Reply  21 Apr 2023, Ben Smith, Forum, Setup Midas with Caen vx2740 - ask for help 
> I'm not able to find documentation what is purpose of the RPC? Could someone give any indicators how I can start debug this behavior? Or there is some documentation about the RPC?

The RPC system allows midas clients to issue commands to each other. In the case of the VX2740 code we use it so the midas webserver (mhttpd) can tell the frontend to perform some actions when a user clicks a button on a webpage.

I've been writing most of the code for the VX2740 for Darkside, so will contact you directly to help debug the issue.
    Reply  21 Apr 2023, Konstantin Olchanski, Forum, Setup Midas with Caen vx2740 - ask for help 
> I'm trying to setup Midas with the Caen vx2740 VME digitizer board.

welcome to the world of daq and midas! Ben already answered and he will help you with this specific hardware. (we work together)

> #0  0x00005555555c2ee1 in rpc_register_function (id=id@entry=18000, func=func@entry=0x5555555a2790 <jrpc_helper(int, void**)>)
>     at /home/astrocent/workspace/packages/midas/src/midas.cxx:11947

I look at this line in midas and I do not see any problems other than all functions that touch rpc_list are not thread safe,
and calling them at the same time as rpc calls are active will cause memory corruption and crash. This is not a problem
in most programs because rpc_register_function() is usually called once at the beginning of everything, before any RPCs
are received, sent or processed. I filed a bug against this problem. https://bitbucket.org/tmidas/midas/issues/362/rpc_list-is-not-thread-safe

> with is "jrpc"?

I implemented it years ago to allow web pages to call mhttpd (XHR/HTTP) to call user frontends (MIDAS RPC) to perform real-time actions,
i.e. to turn power supplied on or off. "j" stands for "json", but most experiments send very simple commands
and do not use json encoding.

K.O.
Entry  17 Mar 2023, Konstantin Olchanski, Info, T2K/ND280 - Many warning from ten year old variables in ODB 
Forwarded from the T2K/ND280 elog:

Author              : Nick Hastings
Subject             : Many warning from ten year old variables in ODB
Logbook URL         : http://elog.nd280.org/elog/FGD/2553

Midas does period checks that the variables in the ODB are ok.
One of these is a check to see if each variable was set with +/- 10
years. Since this experiment has been running for longer than 10
years there are *many* variables that fail this check.

As a result the midas.log and messages in mhttpd are spammed
with many warnings. Eg

Mon Feb 13 14:49:18 2023 [ODBEdit,ERROR] [odb.c:548:db_validate_key,ERROR] Warning: invalid access time, key 
"/System/Prompt", time 1288763123

These can be removed by simply setting the variable again with its current value.

So I wonder if it would be best to just do a full odbdump and then load all the values
back in. Comments from MIDAS experts would be appreciated. Eg:

odbedit -c 'save fgddaq.odb'
odbedit -c 'load fgddaq.odb'

Note this problem is currently seen on both the FGD DAQ and the global slow control MIDAS instances.
It may also be a problem on the INGRID GSC and the DAQs of other ND280 systems but I did not check.
Entry  16 Mar 2023, Konstantin Olchanski, Forum, bitbucket issue spam cleaned 
midas bitbucket repository had a spam attack, about 40 spam messages were posted 
into the issues. I was able to delete them manually. No idea how they got past 
bitbucket spam filters and if they are spam or an attack against automated issue 
tracker tools or an attack against the repo owner (who is vulnerable as they rush 
in to deal with the spam). if this happens again, "anonymous issues" may have to 
be disabled, bitbucket login required. K.O.
    Reply  16 Mar 2023, Konstantin Olchanski, Forum, bitbucket issue spam cleaned 
> midas bitbucket repository had a spam attack, about 40 spam messages were posted 
> into the issues. I was able to delete them manually. No idea how they got past 
> bitbucket spam filters and if they are spam or an attack against automated issue 
> tracker tools or an attack against the repo owner (who is vulnerable as they rush 
> in to deal with the spam). if this happens again, "anonymous issues" may have to 
> be disabled, bitbucket login required. K.O.

Two more spam messages, deleted. "Anonymous users can create issues" is now turned off.

K.O.
       Reply  16 Mar 2023, Konstantin Olchanski, Forum, bitbucket issue spam cleaned 
> > midas bitbucket repository had a spam attack, about 40 spam messages were posted 
> > into the issues. I was able to delete them manually. No idea how they got past 
> > bitbucket spam filters and if they are spam or an attack against automated issue 
> > tracker tools or an attack against the repo owner (who is vulnerable as they rush 
> > in to deal with the spam). if this happens again, "anonymous issues" may have to 
> > be disabled, bitbucket login required. K.O.
> 
> Two more spam messages, deleted. "Anonymous users can create issues" is now turned off.

Also, same for: rootana.
Also, empty issue trackers disabled: mvodb, midasio, mscb.

K.O.
Entry  15 Mar 2023, Casey, Forum, Having trouble with MIDAS setup 
Hi

I'm not sure if this is the right forum for this query (if it is not, I would truly appreciate it if someone could point me at the right forum). I'm having a little bit of trouble with the setup of a Midas system.

Now, this is a system that I inherited after the previous guy who was looking after it went to other employment. There was a point at which it was working. And then there was an unrelated issue in the electrical system which, as a side effect, meant that the building lost power for a time, and the entire system had to be rebooted.

No problem, I thought. I'll just reset and restart all of the software...

...and I can't seem to get it to work. I keep getting the error message "mvme_read_value: Could not perform read!: Bad address". So far as I can tell, this seems to be related to the idea that the base address being used to read from the boards is incorrect. The base addresses are hardcoded in the software (not autogenerated) and, aside from the power going down and up again, the hardware hasn't been touched since the system was working.

I imagine that there is something that needs to be set, twiddled, tweaked, or turned on in the driver. The output of 'lsmod | grep vme' is:

vmedriver             117742  0

so presumably the driver is at least *present*, even if I have no idea how to twiddle anything on it.

Could anyone perhaps suggest a way forward? Is there some way to gather the information that I need, perhaps, or some way to twiddle anything twiddle-able on the driver?

Casey
    Reply  16 Mar 2023, Konstantin Olchanski, Forum, Having trouble with MIDAS setup 
> I'm not sure if this is the right forum for this query

this might be the right place, depending.

> I'm having a little bit of trouble with the setup of a Midas system ...
> inherited after the previous guy ...

a rather major problem, but a typical situation.

> There was a point at which it was working.

this is very good. if it worked before, there is good chance it will work again.

> And then there was an unrelated issue in the electrical system which, as a side effect,
> meant that the building lost power for a time, and the entire system had to be rebooted.
> No problem, I thought. I'll just reset and restart all of the software...
> ...and I can't seem to get it to work.

this happens often enough. several things are likely to happen:
- unexpected software updates, i.e. new linux kernel was installed but inactive, waiting for a reboot
- hardware failures, i.e. we usually see blown up power supplies. check that all VME crate voltages are okey. (ask me how).
- firmware corruption, i.e. we have seen VME modules lose their firmware after power outage, had to be reloaded by jtag

> I keep getting the error message "mvme_read_value: Could not perform read!: Bad address".

this is a generic error, it does not mean that software suddenly is trying to read from wrong address.

> I imagine that there is something that needs to be set, twiddled, tweaked, or turned on in the driver. The output of 'lsmod | grep vme' is:
> vmedriver             117742  0

this is not the vme driver we use at TRIUMF, so I am not familiar with it's errors. we use the vme_tsi148 driver and the vme_universe driver. (ask me about them).

> so presumably the driver is at least *present*, even if I have no idea how to twiddle anything on it.

could be the wrong version of the driver or the wrong version of the linux kernel. worth checking log files
to see if kernel and driver version numbers are the same.

> Could anyone perhaps suggest a way forward?

Yes. You will have to tell me much more about your system. You can do this publicly here or privately by email to olchansk@triumf.ca

To start, I need to know your VME setup, what is the crate, what is the VME processor, what OS you run, what VME driver you use, what VME modules you have installed.

K.O.
Entry  22 Feb 2023, Stefano Piacentini, Info, connection to a MySQL server: retry procedure in the Logger 
Dear all,

we are experiencing a connection problem to the MySQL server that we use to log informations. Is there an 
option to retry multiple times the I/O on the MySQL?

The error we are experiencing is the following (hiding the IP address):

[Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't 
connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)

Then the logger stops, and must be restarted. This eventually happens only during the BOR or the EOR.

Best,
Stefano.
    Reply  22 Feb 2023, Stefan Ritt, Info, connection to a MySQL server: retry procedure in the Logger 
> Dear all,
> 
> we are experiencing a connection problem to the MySQL server that we use to log informations. Is there an 
> option to retry multiple times the I/O on the MySQL?
> 
> The error we are experiencing is the following (hiding the IP address):
> 
> [Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't 
> connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)
> 
> Then the logger stops, and must be restarted. This eventually happens only during the BOR or the EOR.

What would you propose? If the connection does not work, most likely the server is down or busy. If we retry, 
the connection still might not work. If we retry many times, people will complain that the run start or stop 
takes very long. If we then just continue (without stopping the logger), the MySQL database will miss important 
information and the runs probably cannot be analyzed later. So I believe it's better to really stop the logger 
so that people get aware that there is a problem and fix the source, rather than curing the symptoms.

In the MEG experiment at PSI we run the logger with a MySQL database and we never see any connection issue, 
except when the MySQL server gets in maintenance (once a year), but usually we don't take data then. Since we 
use the same logger code, it cannot be a problem there. So I would try to fix the problem on the MySQL side.

Best,
Stefan
       Reply  07 Mar 2023, Stefano Piacentini, Info, connection to a MySQL server: retry procedure in the Logger 
> > Dear all,
> > 
> > we are experiencing a connection problem to the MySQL server that we use to log informations. Is there an 
> > option to retry multiple times the I/O on the MySQL?
> > 
> > The error we are experiencing is the following (hiding the IP address):
> > 
> > [Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't 
> > connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)
> > 
> > Then the logger stops, and must be restarted. This eventually happens only during the BOR or the EOR.
> 
> What would you propose? If the connection does not work, most likely the server is down or busy. If we retry, 
> the connection still might not work. If we retry many times, people will complain that the run start or stop 
> takes very long. If we then just continue (without stopping the logger), the MySQL database will miss important 
> information and the runs probably cannot be analyzed later. So I believe it's better to really stop the logger 
> so that people get aware that there is a problem and fix the source, rather than curing the symptoms.
> 
> In the MEG experiment at PSI we run the logger with a MySQL database and we never see any connection issue, 
> except when the MySQL server gets in maintenance (once a year), but usually we don't take data then. Since we 
> use the same logger code, it cannot be a problem there. So I would try to fix the problem on the MySQL side.
> 
> Best,
> Stefan


Dear Stefan,

a possible solution could be to define the number of times to retry as a parameter that is 0 by default, as well as a wait time between two subsequent tries. This 
would leave the decision on how to handle a possible failed connection to the user. In our case, for example, we would prefer to not stop the acquisition in case 
of a failed connection to the external SQL. In addition, we have other software that, with a retry procedure, doesn’t fail: with 1 re-try and a sleep time of 0.5 s 
we already recover 100% of the faults.

Anyway, we implemented a local database, which is a mirror of the external one, and the problems disappeared.

Thanks,
Stefano.
Entry  05 Nov 2022, Zaher Salman, Suggestion, histories capture 'ruy' 
The histories capture key events from 'r' 'u' 'y' and 'Escape' for various functions like rescaling etc. However, this also means that if we include an editable modbvalue and a history in the same custom page then changing the modbvalue to something that includes 'ruy' is not possible.

In mhistory.js we have

// Keyboard event handler (has to be on the window!)
window.addEventListener("keydown", this.keyDown.bind(this));

I am not sure why it "has to be on the window". For now, I am bypassing this issue by changing the event listener to "keyup" but maybe there is a more elegant solution for this. Adding the event listener to the div element that includes the history does not seem to work.
    Reply  08 Feb 2023, Stefan Ritt, Suggestion, histories capture 'ruy' 
> The histories capture key events from 'r' 'u' 'y' and 'Escape' for various functions like rescaling etc. However, this also means that if we include an editable modbvalue and a history in the same custom page then changing the modbvalue to something that includes 'ruy' is not possible.
> 
> In mhistory.js we have
> 
> // Keyboard event handler (has to be on the window!)
> window.addEventListener("keydown", this.keyDown.bind(this));
> 
> I am not sure why it "has to be on the window". For now, I am bypassing this issue by changing the event listener to "keyup" but maybe there is a more elegant solution for this. Adding the event listener to the div element that includes the history does not seem to work.

I could reproduce the problem. I see two options there:

1) We replace 'r' with 'Ctrl-r' etc. 

2) We change the history JS code not to process keyboard inputs if we are currently editing a value.

I added 2) to give it a try. It works fine for me. The additional code is

MhistoryGraph.prototype.keyDown = function (e) {
   // don't consume events if we are editing a value
   if (e.target.tagName === "INPUT")
      return;
   ...
}


Feedback is welcome.

Stefan
       Reply  09 Feb 2023, Zaher Salman, Suggestion, histories capture 'ruy' 
I agree with you, option 2 is better and works well. 
The only problem is that if you are showing multiple histograms in the same window the keyDown even will affect all of the histories in the window. 
This may be the intended behaviour, but I think that if we can find a way to have the event affecting only the intended history (focused element for example) it would be better.

Zaher 

> > The histories capture key events from 'r' 'u' 'y' and 'Escape' for various functions like rescaling etc. However, this also means that if we include an editable modbvalue and a history in the same custom page then changing the modbvalue to something that includes 'ruy' is not possible.
> > 
> > In mhistory.js we have
> > 
> > // Keyboard event handler (has to be on the window!)
> > window.addEventListener("keydown", this.keyDown.bind(this));
> > 
> > I am not sure why it "has to be on the window". For now, I am bypassing this issue by changing the event listener to "keyup" but maybe there is a more elegant solution for this. Adding the event listener to the div element that includes the history does not seem to work.
> 
> I could reproduce the problem. I see two options there:
> 
> 1) We replace 'r' with 'Ctrl-r' etc. 
> 
> 2) We change the history JS code not to process keyboard inputs if we are currently editing a value.
> 
> I added 2) to give it a try. It works fine for me. The additional code is
> 
> MhistoryGraph.prototype.keyDown = function (e) {
>    // don't consume events if we are editing a value
>    if (e.target.tagName === "INPUT")
>       return;
>    ...
> }
> 
> 
> Feedback is welcome.
> 
> Stefan
          Reply  09 Feb 2023, Stefan Ritt, Suggestion, histories capture 'ruy' 
> I agree with you, option 2 is better and works well. 
> The only problem is that if you are showing multiple histograms in the same window the keyDown even will affect all of the histories in the window. 
> This may be the intended behaviour, but I think that if we can find a way to have the event affecting only the intended history (focused element for example) it would be better.

The history panels are simple pictures from the perspective of the browser and have no concept of a focus. This would have to be tweaked somehow. If you want to reset a single history, 
just click on its reset button (circle arrow at the right top).

Stefan
Entry  31 Jan 2023, Lukas Gerritzen, Suggestion, "Soft interlock" possible? 
Is it possible to impose requirements on certain output variables in an interlock-like fashion? For example: "As long as the temperature exceeds a certain threshold, a light switched by a relay cannot be turned on."

A workaround would be to set an alarm on that variable and call a script which turns the light back off, but that might not be ideal in certain scenarios. For safety-critical situations, a PLC would be preferred, but I am missing an option between those two.
    Reply  31 Jan 2023, Stefan Ritt, Suggestion, "Soft interlock" possible? 
> Is it possible to impose requirements on certain output variables in an interlock-like fashion? For example: "As long as the temperature exceeds a certain threshold, a light switched by a relay cannot be turned on."
> 
> A workaround would be to set an alarm on that variable and call a script which turns the light back off, but that might not be ideal in certain scenarios. For safety-critical situations, a PLC would be preferred, but I am missing an option between those two.

No, interlocks of that kind are not implemented in midas. And that is for a good reason. Interlocks are critical, so one must be sure 100% that they are working. This cannot be done with complex software such as midas. You have to use dedicated hardware for that. 
Most people use a PLC controller which is made for that. Midas is then only used to read and display the status of the interlock controller.

Stefan
Entry  13 Jan 2023, Denis Calvet, Suggestion, Debug printf remaining in mhttpd.cxx 
Hi everyone,

It has been a long time since my last message. While porting Midas front-end 
programs developed for the T2K experiment in 2008 to a modern version of Midas, 
I noticed that some debug printf remain in mhttpd.cxx.

A number of debug messages are printed on stdout each time a graph is displayed 
in the OldHistory window (which is the style of history plots that will continue 
to be used in the upgraded T2K experiment for some technical reasons).

Here is an example of such debug messages:
Load from ODB History/Display/HA_EP_0/V33: hist plot: 8 variables
timescale: 1h, minimum: 0.000000, maximum: 0.000000, zero_ylow: 0, log_axis: 0, 
show_run_markers: 1, show_values: 1, show_fill: 0, show_factor 0, enable_factor: 
1
var[0] event [HA_TPC_SC][E0M00FEM_V33] formula [], colour [#00AAFF] label 
[Mod_0] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
10
var[1] event [HA_TPC_SC][E0M01FEM_V33] formula [], colour [#FF9000] label 
[Mod_1] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
20
var[2] event [HA_TPC_SC][E0M02FEM_V33] formula [], colour [#FF00A0] label 
[Mod_2] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
30
var[3] event [HA_TPC_SC][E0M03FEM_V33] formula [], colour [#00C030] label 
[Mod_3] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
40
var[4] event [HA_TPC_SC][E0M04FEM_V33] formula [], colour [#A0C0D0] label 
[Mod_4] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
50
var[5] event [HA_TPC_SC][E0M05FEM_V33] formula [], colour [#D0A060] label 
[Mod_5] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
60
var[6] event [HA_TPC_SC][E0M06FEM_V33] formula [], colour [#C04010] label 
[Mod_6] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
70
var[7] event [HA_TPC_SC][E0M07FEM_V33] formula [], colour [#807060] label 
[Mod_7] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
80
read_history: nvars 10, hs_read() status 1
read_history: 0: event [HA_TPC_SC], var [E0M00FEM_V33], index 0, odb index 0, 
status 1, num_entries 475
read_history: 1: event [HA_TPC_SC], var [E0M01FEM_V33], index 0, odb index 1, 
status 1, num_entries 475
read_history: 2: event [HA_TPC_SC], var [E0M02FEM_V33], index 0, odb index 2, 
status 1, num_entries 475
read_history: 3: event [HA_TPC_SC], var [E0M03FEM_V33], index 0, odb index 3, 
status 1, num_entries 475
read_history: 4: event [HA_TPC_SC], var [E0M04FEM_V33], index 0, odb index 4, 
status 1, num_entries 475
read_history: 5: event [HA_TPC_SC], var [E0M05FEM_V33], index 0, odb index 5, 
status 1, num_entries 475
read_history: 6: event [HA_TPC_SC], var [E0M06FEM_V33], index 0, odb index 6, 
status 1, num_entries 475
read_history: 7: event [HA_TPC_SC], var [E0M07FEM_V33], index 0, odb index 7, 
status 1, num_entries 475
read_history: 8: event [Run transitions], var [State], index 0, odb index -1, 
status 1, num_entries 0
read_history: 9: event [Run transitions], var [Run number], index 0, odb index 
-2, status 1, num_entries 0


Looking at the source code of mhttpd, these messages originate from:

[mhttpd.cxx line 10279] printf("Load from ODB %s: ", path.c_str());
[mhttpd.cxx line 10280] PrintHistPlot(*hp);
...
[mhttpd.cxx line  8336] int read_history(...
...
[mhttpd.cxx line  8343] int debug = 1;
...

Although seeing many debug messages in the mhttpd does not hurt, these can hide 
important error messages. I would rather suggest to turn off these debug 
messages by commenting out the relevant lines of code and setting the debug 
variable to 0 in read_history().

That is a minor detail and it is always a pleasure to use Midas.

Best regards,
Denis.
 
    Reply  13 Jan 2023, Stefan Ritt, Suggestion, Debug printf remaining in mhttpd.cxx 
These debug statements were added by K.O. on June 24, 2021. He should remove it.

https://bitbucket.org/tmidas/midas/commits/21f7ba89a745cfb0b9d38c66b4297e1aa843cffd

Best,
Stefan
Entry  19 Aug 2022, Konstantin Olchanski, Bug Fix, "Detected duplicate or non-monotonous data" in history files 
serious (but rare) bug was fixed in the history reader. unlucky experiment would see 
errors about "Detected duplicate or non-monotonous data" in some history file, fixed by 
removing/renaming the offending file. (reported by MEG experiment)

it turns out there was nothing wrong with the data files (good), but there
was a nasty bug in the history reader. it did not ensure that we read history
files in chronological order. under some conditions order of files could be
reversed, older files would be read after newer files and trip the built-in
protection against returning non-monotonically increasing history data to the user.

fixed commit 
https://bitbucket.org/tmidas/midas/commits/9893f85ebe33e96cc63f501a0f89e1f8932c894d

for more details, see https://bitbucket.org/tmidas/midas/issues/350/file-history-non-
monotonic-time

K.O.
    Reply  23 Aug 2022, Konstantin Olchanski, Bug Fix, "Detected duplicate or non-monotonous data" in history files 
> serious (but rare) bug was fixed in the history reader.

previous fix was incomplete. please update to git commit
https://bitbucket.org/tmidas/midas/commits/b343c3c98e4e6fd00a00cf686c74c7ccc6da0c63

K.O.
       Reply  17 Nov 2022, Konstantin Olchanski, Bug Fix, "Detected duplicate or non-monotonous data" in history files 
> > serious (but rare) bug was fixed in the history reader.
> previous fix was incomplete. please update to git commit
> https://bitbucket.org/tmidas/midas/commits/b343c3c98e4e6fd00a00cf686c74c7ccc6da0c63

a race condition between reading history file in mhttpd and writing history file in 
mlogger was accidentally introduced. mhttpd would file spurious errors about "timestamp 
is after last timestamp".

fixed, please update to git commit
https://bitbucket.org/tmidas/midas/commits/7a9f6e0c58ffddcacb9ee19934ce3e2033a805ef

fix race condition in history file reader - a race condition was added accidentally - 
first the reader remembers the history file size and the time of the last entry, then it 
goes to read the file and bombs if at the same time mlogger added more entries - their 
time is after the remembered time of last entry and error "timestamp is after last 
timestamp" is triggered.

K.O.
Entry  11 Nov 2022, Frederik Wauters, Bug Fix, O_CREAT in open in split.cxx 
midas currently does not compile on linux

/usr/include/x86_64-linux-gnu/bits/fcntl2.h:50:24: error: call to ‘__open_missing_mode’ declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments
   50 |    __open_missing_mode ();

giving the mode is mandatory: https://man7.org/linux/man-pages/man2/open.2.html

fix is to give open in midas/examples/lowlevel/split.cxx a default mode, e.g. 006600
    Reply  12 Nov 2022, Stefan Ritt, Bug Fix, O_CREAT in open in split.cxx 
> midas currently does not compile on linux
> 
> /usr/include/x86_64-linux-gnu/bits/fcntl2.h:50:24: error: call to ‘__open_missing_mode’ declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments
>    50 |    __open_missing_mode ();
> 
> giving the mode is mandatory: https://man7.org/linux/man-pages/man2/open.2.html
> 
> fix is to give open in midas/examples/lowlevel/split.cxx a default mode, e.g. 006600

Thanks. Fixed.

Stefan
       Reply  17 Nov 2022, Konstantin Olchanski, Bug Fix, O_CREAT in open in split.cxx 
> > midas currently does not compile on linux
> > fix is to give open in midas/examples/lowlevel/split.cxx a default mode, e.g. 006600

I got more warnings from split.cxx, looked at the code and see so many problems that it is easier
to delete it than it is to fix it.

Check for end of file is done incorrectly (check for read() return 0, -1 or short read),
memory overrun if given file name is longer than 80 bytes, no check for valid event length
read from the file, and so on and so on.

A better example for reading and writing midas files is in midasio/test_midasio.cxx. Proper c++ coding, and can read compressed files.

K.O.
Entry  22 Oct 2022, Lars Martin, Suggestion, read_only odbxx? 
I really like the concept of the odbxx interface.
I think it would be a nice feature if one could have a read_only connection, e.g. by declaring a "const midas::odb".
Just for fun I tried if this already works, but the compiler doesn't allow const midas::odb for e.g. the [] operator. I'm guessing this would be non-trivial to implement, but I like the idea of certain Midas clients being able to read the odb without risking corruption.
    Reply  24 Oct 2022, Stefan Ritt, Suggestion, read_only odbxx? 
> I really like the concept of the odbxx interface.
> I think it would be a nice feature if one could have a read_only connection, e.g. by declaring a "const midas::odb".
> Just for fun I tried if this already works, but the compiler doesn't allow const midas::odb for e.g. the [] operator. I'm guessing this would be non-trivial to implement, but I like the idea of certain Midas clients being able to read the odb without risking corruption.

Having a "const midas::odb" probably does not work (at least I would not know how to implement that).

But I could make an internal flag analog to the auto refresh flags. So you would have

    o.set_write_protect(true);

to turn on write protection. Would that work for you?

Best,
Stefan
       Reply  26 Oct 2022, Lars Martin, Suggestion, read_only odbxx? 
> Having a "const midas::odb" probably does not work (at least I would not know how to implement that).
> 
> But I could make an internal flag analog to the auto refresh flags. So you would have
> 
>     o.set_write_protect(true);
> 
> to turn on write protection. Would that work for you?

Absolutely. Looking at the underlying code I was also at a loss how const would work.
I'm mostly just interested in having small clients that only read from the odb (for whatever reason) without risking corrupting it by messing something 
up in the code, especially since such small clients are almost by definition hacked together quickly on the fly.
          Reply  29 Oct 2022, Stefan Ritt, Suggestion, read_only odbxx? 
Ok, I implemented and committed that. Just call

o.set_write_protect(true)

on a key you don't want to modify. If you do so, an exception gets thrown.

Best,
Stefan
Entry  14 Oct 2022, Lars Martin, Suggestion, Allow onchange to refer to arbitrary js function 
Maybe this is already possible, I have a hard time understanding the mhttpd source code.

I would like to use a function defined in the <script> block of my custom page as an onchange callback.

Specific example:
I have an modbthermo that I would like to change to three different colours for "too cold", "just right", and "too hot" (measuring porridge, presumably). The examples only show the explicit (condition)?(val1):(val2) syntax, which doesn't allow more than two values, so I had hoped to replace

onchange="this.dataset.color=this.value > 40?'red':'blue';"

with something like

onchange="this.dataset.color=check_Temp(this.value);"

or

onchange="check_Temp(this.value, this.dataset.color);"

if that's easier somehow. The function itself would then return the colour string, or set the color argument to that string (I'm not sure if JS passes references or just values.)

Is this a possibility?
    Reply  14 Oct 2022, Ben Smith, Info, Allow onchange to refer to arbitrary js function 
> I would like to use a function defined in the <script> block of my custom page as an onchange callback.
>
> Is this a possibility?

Yes, this is already possible. An example was shown in the "modb" section of the custom page documentation, but not in the "Changing properties of controls dynamically" section. I've updated the wiki with an example.

https://daq00.triumf.ca/MidasWiki/index.php/Custom_Page#Changing_properties_of_controls_dynamically
       Reply  22 Oct 2022, Lars Martin, Info, Allow onchange to refer to arbitrary js function 
I figured I wasn't the first to have this idea.
Works great, thanks!
          Reply  22 Oct 2022, Lars Martin, Info, Allow onchange to refer to arbitrary js function 
Actually, now that I look again, there is a mistake in the instructions:
you say

onchange="this.dataset.color=check_therm(this)"

but check_therm doesn't return anything and instead changes the color itself. So you either want the function to return the string and use the above assignment, or use the function you provide and use

onchange="check_therm(this)"
Entry  10 Oct 2022, Zaher Salman, Suggestion, JSON-RPC function to read files mjsonrpc_user.cxx
Hello ,

The midas sequencer uses the function js_seq_list_files to get a list of files in the /Sequencer/State/Path with extension *.msl. It would be nice to generalize this function to be able to read files with other (or any) extension.

Based on the js_seq_list_files I added a function in js_any_list_files mjsonrpc_user.cxx (attached) which does the job. Maybe a better/safer implementation can be made in midas. Are there any plans to do this?

thanks.
Entry  21 Aug 2022, Joseph McKenna, Suggestion, mvodb functionality to get the 'LastWritten' property of a key 


I want to read data from the ODB with the mvodb interface in one of my frontends, it's useful to know how old that data is, so I prototyped functionality in a pull request to mvodb:

https://bitbucket.org/tmidas/mvodb/pull-requests/2/add-readkeylastwritten-function-to-extract
    Reply  22 Aug 2022, Stefan Ritt, Suggestion, mvodb functionality to get the 'LastWritten' property of a key 
> I want to read data from the ODB with the mvodb interface in one of my frontends, it's useful to know how old that data is, so I prototyped functionality in a pull request to mvodb:
> 
> https://bitbucket.org/tmidas/mvodb/pull-requests/2/add-readkeylastwritten-function-to-extract

Thanks for raising that point. I realized that the odbxx API was also missing that functionality, so I added it: 
https://bitbucket.org/tmidas/midas/commits/6991a92c19292eaf67721cb80f182c61db077f45

Best,
Stefan
Entry  15 Aug 2022, Zaher Salman, Bug Report, firefox hangs due to mhistory 
Firefox is hanging/becoming unresponsive due to javascript code. After stopping the script manually to get firefox back in control I have the following message in the console

17:21:28.821 Script terminated by timeout at:
MhistoryGraph.prototype.drawTAxis@http://lem03.psi.ch:8081/mhistory.js:2828:7
MhistoryGraph.prototype.draw@http://lem03.psi.ch:8081/mhistory.js:1792:9
mhistory.js:2828:7

Any ideas how to resolve this??
    Reply  15 Aug 2022, Stefan Ritt, Bug Report, firefox hangs due to mhistory 
> Firefox is hanging/becoming unresponsive due to javascript code. After stopping the script manually to get firefox back in control I have the following message in the console
> 
> 17:21:28.821 Script terminated by timeout at:
> MhistoryGraph.prototype.drawTAxis@http://lem03.psi.ch:8081/mhistory.js:2828:7
> MhistoryGraph.prototype.draw@http://lem03.psi.ch:8081/mhistory.js:1792:9
> mhistory.js:2828:7
> 
> Any ideas how to resolve this??

I have to reproduce the problem. Can you send me the full URL from your browser when you see that problem? Probably you have some "special" axis limits, so we don't see that 
problem anywhere else.

Stefan
       Reply  16 Aug 2022, Zaher Salman, Bug Report, firefox hangs due to mhistory 
> > Firefox is hanging/becoming unresponsive due to javascript code. After stopping the script manually to get firefox back in control I have the following message in the console
> > 
> > 17:21:28.821 Script terminated by timeout at:
> > MhistoryGraph.prototype.drawTAxis@http://lem03.psi.ch:8081/mhistory.js:2828:7
> > MhistoryGraph.prototype.draw@http://lem03.psi.ch:8081/mhistory.js:1792:9
> > mhistory.js:2828:7
> > 
> > Any ideas how to resolve this??
> 
> I have to reproduce the problem. Can you send me the full URL from your browser when you see that problem? Probably you have some "special" axis limits, so we don't see that 
> problem anywhere else.
> 
> Stefan

Hi Stefan and Konstantin,

The URL (reachable only within PSI) is http://lem03.psi.ch:8081/?cmd=custom&page=Mudas
Firefox is version 91.12.0esr (64-bit), but I had similar issues with chrome/chromium too.
The hangs seem to happen randomly so I have not been able to reproduce it yet. 
I have histories here  http://lem03.psi.ch:8081/?cmd=custom&page=Mudas&tab=3 (30 minutes each), but I have also histories popping up in modals though they do not cause any issues. 
I'll try to reproduce it in the coming few days and report again.

thanks,
Zaher
          Reply  16 Aug 2022, Zaher Salman, Bug Report, firefox hangs due to mhistory 
I found the bug. The problem is triggered by changing the firefox window. This calls a function that is supposed to change the size of the history plot and it works well when the history plots are visible but not if the history plots are hidden in a javascript tab (not another firefox tab).

Is there a clean way to resize the history plot if the parent div changes size?? The offending code is
mhist[i].mhg = new MhistoryGraph(mhist[i]);
mhist[i].mhg.initializePanel(i);
mhist[i].mhg.resize();
mhist[i].resize = function () {
   mhis.mhg.resize();
};
             Reply  17 Aug 2022, Stefan Ritt, Bug Report, firefox hangs due to mhistory 
The problem lies in your function mhistory_init_one() in Mudas.js:1965. You can only call "new MhistoryGraph(e)" with an element "e" which is something like

<div class="mjshistory" data-group="..." data-panel="..." data-base-u-r-l="https://host.psi.ch/?cmd=history" title="">

Please note the "data-base-u-r-l". This gets automatically added by the function mhistory_init() in mhistory.js:48. The URL is necessary sot that the upper right button in a history graph works which goes to a history page only showing the current graph.

In you function mhistory_init_one() you forgot the call

   mhist.dataset.baseURL = baseURL; 

where baseURL has to come from the current address bar like

   let baseURL = window.location.href;
   if (baseURL.indexOf("?cmd") > 0)
      baseURL = baseURL.substr(0, baseURL.indexOf("?cmd"));
   baseURL += "?cmd=history";

If you duplicate some functionality from mhistory.js, please make sure to duplicate it completely.

Best,
Stefan
                Reply  17 Aug 2022, Zaher Salman, Bug Report, firefox hangs due to mhistory 
> The problem lies in your function mhistory_init_one() in Mudas.js:1965. You can only call "new MhistoryGraph(e)" with an element "e" which is something like
> 
> <div class="mjshistory" data-group="..." data-panel="..." data-base-u-r-l="https://host.psi.ch/?cmd=history" title="">
> 
> Please note the "data-base-u-r-l". This gets automatically added by the function mhistory_init() in mhistory.js:48. The URL is necessary sot that the upper right button in a history graph works which goes to a history page only showing the current graph.
> 
> In you function mhistory_init_one() you forgot the call
> 
>    mhist.dataset.baseURL = baseURL; 
> 
> where baseURL has to come from the current address bar like
> 
>    let baseURL = window.location.href;
>    if (baseURL.indexOf("?cmd") > 0)
>       baseURL = baseURL.substr(0, baseURL.indexOf("?cmd"));
>    baseURL += "?cmd=history";
> 
> If you duplicate some functionality from mhistory.js, please make sure to duplicate it completely.
> 

Thanks Stefan, but this was not the problem since I am setting the baseURL. You may have looked at the code during my debugging.

Some of my histories are placed in an IFrame object. I eventually realized that my code fails when it tries to resize a history which is placed in an invisible IFrame. I resolved the issue by making sure that I am resizing plots only if they are in a visible IFrame.
  
                   Reply  17 Aug 2022, Stefan Ritt, Bug Report, firefox hangs due to mhistory 
> Some of my histories are placed in an IFrame object. I eventually realized that my code fails 
> when it tries to resize a history which is placed in an invisible IFrame. I resolved the issue 
> by making sure that I am resizing plots only if they are in a visible IFrame.

Just to be clear: You could resolve everything on your side, or do you need to change anything in mhistory.js?

Just a tip: IFrames are not good to put anything in. I recommend just to dynamically crate a <div> element, 
append it to the document body, make it floating and initially invisible. Then put all inside that div. Have
a look how control.js do it. This takes less resources than a complete IFrame and is much easier to handle.

Stefan
          Reply  16 Aug 2022, Konstantin Olchanski, Bug Report, firefox hangs due to mhistory 
> > > Firefox is hanging/becoming unresponsive due to javascript code.
> 
> The URL (reachable only within PSI) is http://lem03.psi.ch:8081/?cmd=custom&page=Mudas

so malfunction is not in the midas history page, but in a custom page. I could help you debug it,
but you would have to provide the complete source code (javascript and html).

> Firefox is version 91.12.0esr (64-bit), but I had similar issues with chrome/chromium too.

my firefox is 103.something. when you say google-chrome has "similar issues",
I read it as "google-chrome does not show this same bug, but shows some other
bug somewhere else". (if I misread you, you have to write better).

but this gives you a front to attack your bugs. basically all browsers should render your
custom page exactly the same (unless you use some obscure or experimental feature, which I
recommend against).

so you tweak your page to identify the source of different rendering results, and try to eliminate it,
hopefully by the time you get your page render exactly the same everywhere, all the real bugs
have gotten shaken out, too. (this is similar to debugging a c++ program by compiling
it on linux, mac, windows, vax, raspbery pi, etc and checking that you get the same result everywhere).

> The hangs seem to happen randomly so I have not been able to reproduce it yet.

I find that javascript debuggers are not setup to debug hangs. I think debugger runs partially
inside the same javascript engine you are debugging, so both hang and debugging is impossible.

(latest google-chrome has another improvement, all pages from the same computer run in the same
javascript engine, so if one midas page stops (on exception or because I debug it), all midas pages
stop and I have to run two different browsers if I want to debug (i.e.) a history page crash
and look at odb at the same time. fun).

K.O.
Entry  05 Aug 2022, Stefan Ritt, Info, Information for midas updates though git 
Several submodules of midas have been re-organized, so if you want to pull the 
newest version, you need a 

git pull --recurse-submodules
git submodule update --init --recursive

before you can build again. To do this automatically the next time, you can do

git config submodule.recurse true

which needs git 2.14 or later. I hope this works for everybody. If there is a 
better way to do that (I'm not a big expert on git) please reply here.

Stefan
    Reply  08 Aug 2022, Konstantin Olchanski, Info, Information for midas updates though git 
> git pull --recurse-submodules
> git submodule update --init --recursive
> git config submodule.recurse true

does not work for me, macos 12.4 git 2.32.1.

after I set "submodule.recurse true", I still have to type "git submodule update --
init --recursive", without --recursive, mscb/mxml is empty and the build bombs.

P.S. the underlying issue is that the mxml submodule is now included twice 
(midas/mxml and midas/mscb/mxml) and there is nothing to enforce that both copies are 
the same. (No idea what happens if the two mxml's are different).
       Reply  08 Aug 2022, Stefan Ritt, Info, Information for midas updates though git 
> after I set "submodule.recurse true", I still have to type "git submodule update --
> init --recursive", without --recursive, mscb/mxml is empty and the build bombs.

Indeed, doesn't work for me either. If some git guru has some more insight, please post 
here!

 
> P.S. the underlying issue is that the mxml submodule is now included twice 
> (midas/mxml and midas/mscb/mxml) and there is nothing to enforce that both copies are 
> the same. (No idea what happens if the two mxml's are different).

The version of each mxml is defined by last commit of the parent repository, which contains 
the hash of the submodule version. If we have to update mxml for some reason, we have to 
commit also mscb with the new version, and then midas with the same version of mxml. If one 
checks out midas then with 

git clone https://bitbucket.org/tmidas/midas --recursive

one gets the same versions for mxml.
Entry  08 Aug 2022, Konstantin Olchanski, Info, odb disallow key names that start or end with spaces 
while testing the new odb editor, we ran into a number of problems with key names 
that start or end with spaces. we cannot think of any valid use case for such key 
names (subdirectories and variables) and we think they could only have been 
created by mistake. ODB now disallows such names. K.O.
Entry  08 Aug 2022, Konstantin Olchanski, Info, midas on ubuntu LTS 22.04 
reporting that as of commit 78f707c0686d22f8329c7a1f1c46d7dccf35ceff, midas builds 
without errors or warnings on Ubuntu LTS 22.04, 20.04, CentOS-7 and MacOS 12.4. 
(except for some warnings from mscb and msc). K.O.
Entry  06 Aug 2022, Stefan Ritt, Info, Improvement of odbxx API 
While the odbxx API has been successfully used since the last months, a potential 
problem with large ODBs surfaced. If you have lots of data in the ODB and load it 
into an object like

midas::odb o("/Equipment");

this might take quite long, since each ODB value is fetched separately, which is 
very quick on a local machine but can take long over a client-server connection. 
For large experiments this can take up to minutes (!).

To get rid of this problem, the underlying object model has been modified. When an 
object is instantiated like above, then the whole ODB tree is fetched in an XML 
buffer in a single transfer, which even for large ODBs usually takes much less 
than a second. Then the XML buffer is decomposed on the client side and converted 
into the proper midas::odb objects. In one case this gave an improvement from 35 
seconds to 0.5 seconds which is significant. To enable the new method, the object 
can be created with a flag like

midas::odb o("/Equipment", true);

which then switches to the new method. One has to take care not to fool oneself 
(like I did) by printing the object like

midas::odb o("/Equipment", true);
std::cout << o << std::endl;

because each read access to any sub-object of o causes a separate read request to 
the server which again can take long. Therefore, one has to switch off the auto 
refresh via

midas::odb o("/Equipment", true);
o.set_auto_refresh_read(false);
std::cout << o << std::endl;

Accessing any sub-object of o then does not cause a client-server request, which 
is not necessary if all objects just have been pulled from the server before. If 
one keeps the object however for a long time in memory, one has to be aware that 
it only contains "old" values from the time if instantiation. If one needs more 
current ODB values, the auto read refresh has to be turned on again.

Stefan
    Reply  08 Aug 2022, Stefan Ritt, Info, Improvement of odbxx API 
After some thought, I changed the API again and removed the flag in the constructor,
so the system now automatically choses the best algorithm depending if the client
is connected to a local or a remote API. So in all cases you use again the old syntax:

midas::odb o("/Equipment");



Stefan
Entry  18 Jul 2022, Konstantin Olchanski, Release, midas-2022-05 
There is a release branch for midas-2022-05 and corresponding git tag midas-2022-
05-b. This branch is known to be stable and is working well for the ALPHA 
experiment at CERN. Latest update to this branch fixes two problems in the 
mserver (rpc timeout and a use-after-free internal error).

https://bitbucket.org/tmidas/midas/branch/release/midas-2022-05
K.O.
Entry  25 Jun 2022, Joseph McKenna, Bug Report, RPC timeout for manalyzer over network 

In ALPHA, I get RPC timeouts running a (reasonably heavy) analyzer on a remote machine (connected directly via a ~30 meter 10Gbe Ethernet cable) after ~5 minutes of running. If I run the analyser locally, I dont not see a timeout...

gdb trace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff5d35859 in __GI_abort () at abort.c:79
#2  0x00005555555a2a22 in rpc_call (routine_id=11111) at /home/alpha/packages/midas/src/midas.cxx:13866
#3  0x000055555562699d in bm_receive_event_rpc (buffer_handle=buffer_handle@entry=2, buf=buf@entry=0x0, buf_size=buf_size@entry=0x0, ppevent=ppevent@entry=0x0, pvec=pvec@entry=0x7fffffffd700,
    timeout_msec=timeout_msec@entry=100) at /home/alpha/packages/midas/src/midas.cxx:10510
#4  0x0000555555631082 in bm_receive_event_vec (buffer_handle=2, pvec=pvec@entry=0x7fffffffd700, timeout_msec=timeout_msec@entry=100) at /home/alpha/packages/midas/src/midas.cxx:10794
#5  0x0000555555673dbb in TMEventBuffer::ReceiveEvent (this=this@entry=0x555557388b30, e=e@entry=0x7fffffffd700, timeout_msec=timeout_msec@entry=100) at /home/alpha/packages/midas/src/tmfe.cxx:312
#6  0x0000555555607b56 in ReceiveEvent (b=0x555557388b30, e=0x7fffffffd6c0, timeout_msec=100) at /home/alpha/packages/midas/manalyzer/manalyzer.cxx:1411
#7  0x000055555560d8dc in ProcessMidasOnlineTmfe (args=..., progname=<optimized out>, hostname=<optimized out>, exptname=<optimized out>, bufname=<optimized out>, event_id=<optimized out>,
    trigger_mask=<optimized out>, sampling_type_string=<optimized out>, num_analyze=0, writer=<optimized out>, multithread=<optimized out>, profiler=<optimized out>,
    queue_interval_check=<optimized out>) at /home/alpha/packages/midas/manalyzer/manalyzer.cxx:1534
#8  0x000055555560f93b in manalyzer_main (argc=<optimized out>, argv=<optimized out>) at /usr/include/c++/9/bits/basic_string.h:2304
#9  0x00007ffff5d37083 in __libc_start_main (main=0x5555555b1130 <main(int, char**)>, argc=8, argv=0x7fffffffdda8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7fffffffdd98) at ../csu/libc-start.c:308
#10 0x00005555555b184e in _start () at /usr/include/c++/9/bits/stl_vector.h:94

Any suggestions? Many thanks
    Reply  18 Jul 2022, Konstantin Olchanski, Bug Report, RPC timeout for manalyzer over network 
> In ALPHA, I get RPC timeouts running a (reasonably heavy) analyzer on a remote machine (connected directly via a ~30 meter 10Gbe Ethernet cable) after ~5 minutes of running. If I run the analyser locally, I dont not see a timeout...

there is a subtle bug in the mserver. under rare conditions, ss_suspend() will recurse in an unexpected way
and mserver will go to sleep waiting for data from a udp socket (that will never arrive, so sleep forever).
remote client will see it as an rpc timeout. in my tests (and in ALPHA-g at CERN, as reported by Joseph),
I see this rare condition to happen about every 5 minutes. in normal use, this is the first time we become
aware of this problem, the best I can tell this bug was in the mserver since day one.

commit https://bitbucket.org/tmidas/midas/commits/fbd06ad9d665b1341bd58b0e28d6625877f3cbd0
to develop and
to release/midas-2022-05

The stack trace that shows the mserver hang/crash (sleep() is the stand-in for the sleep-forever socket read).

(gdb) bt
#0  0x00007f922c53f9e0 in __nanosleep_nocancel () from /lib64/libc.so.6
#1  0x00007f922c53f894 in sleep () from /lib64/libc.so.6
#2  0x0000000000451922 in ss_suspend (millisec=millisec@entry=100, msg=msg@entry=1) at /home/agmini/packages/midas/src/system.cxx:4433
#3  0x0000000000411d53 in bm_wait_for_more_events_locked (pbuf_guard=..., pc=pc@entry=0x7f920639b93c, timeout_msec=timeout_msec@entry=100, 
    unlock_read_cache=unlock_read_cache@entry=1) at /home/agmini/packages/midas/src/midas.cxx:9429
#4  0x00000000004238c3 in bm_fill_read_cache_locked (timeout_msec=100, pbuf_guard=...) at /home/agmini/packages/midas/src/midas.cxx:9003
#5  bm_read_buffer (pbuf=pbuf@entry=0xdf8b50, buffer_handle=buffer_handle@entry=2, bufptr=bufptr@entry=0x0, buf=buf@entry=0x7f9203d75020, 
    buf_size=buf_size@entry=0x7f920639aa20, vecptr=vecptr@entry=0x0, timeout_msec=timeout_msec@entry=100, convert_flags=0, 
    dispatch=dispatch@entry=0) at /home/agmini/packages/midas/src/midas.cxx:10279
#6  0x0000000000424161 in bm_receive_event (buffer_handle=2, destination=0x7f9203d75020, buf_size=0x7f920639aa20, timeout_msec=100)
    at /home/agmini/packages/midas/src/midas.cxx:10649
#7  0x0000000000406ae4 in rpc_server_dispatch (index=11111, prpc_param=0x7ffcad70b7a0) at /home/agmini/packages/midas/progs/mserver.cxx:575
#8  0x000000000041ce9c in rpc_execute (sock=10, buffer=buffer@entry=0xe11570 "g+", convert_flags=0)
    at /home/agmini/packages/midas/src/midas.cxx:15003
#9  0x000000000041d7a5 in rpc_server_receive_rpc (idx=idx@entry=0, sa=0xde6ba0) at /home/agmini/packages/midas/src/midas.cxx:15958
#10 0x0000000000451455 in ss_suspend (millisec=millisec@entry=1000, msg=msg@entry=0) at /home/agmini/packages/midas/src/system.cxx:4575
#11 0x000000000041deb2 in rpc_server_loop () at /home/agmini/packages/midas/src/midas.cxx:15907
#12 0x0000000000405266 in main (argc=9, argv=<optimized out>) at /home/agmini/packages/midas/progs/mserver.cxx:390
(gdb) 

K.O.
Entry  19 Jun 2022, Francesco Renga, Forum, Alarm on variable not updating 
Dear all,
I've an ODB equipment that sometimes loses the connection with the hardware, so that the variables are not updated anymore. The connection can be restored by restarting the frontend. It would be useful to have an alarm based on the time from the last update of some variable (i.e. the alarm is triggered if the variable is not updated for more than X seconds). Is there a method to implement such an alarm in MIDAS?

Thank you very much,
Francesco
    Reply  20 Jun 2022, Stefan Ritt, Forum, Alarm on variable not updating 
There are two functions to do that, one check the last write access, the other the last write access if the run is running. The alarm condition looks like:

access(/Equipment/.../Variables/Input[10]) > 60

which will cause an alarm if the Input[10] is not written for more than 60 seconds. The other function which checks the run status as well is like:

access_running(...odb key...) > 60

You can actually see an example on the MEG alarm page.

Rather than having an alarm for that I would however recommend that you program you frontend such that it realizes if it looses connections, then tries automatically to reconnect or trigger an alarm itself (so-called "internal" alarm). This is also how the MSCB system is working and is much more robust.

Stefan
Entry  20 Jun 2022, jianrun, Bug Report, Error in "midas/src/mana.cxx" 
Dear Midas developers,

When we are running the examples in $MIDASSYS/examples/experiment/, we meet some 
problems when analyzing the results:
1. When we analyze the data using the analyzer: ./analyzer -i run00001.mid -o 
run00001.rz  , we find some bugs: 
"
Root server listening on port 9090...
Running analyzer offline. Stop with "!"
[Analyzer,ERROR] [mana.cxx:1832:bor,ERROR] HBOOK support is not compiled in
[Analyzer,INFO] Set run number 6 in ODB
Load ODB from run 6...OK
run00006.mid:2680  events, 0.00s
"
We think this occurs in the "midas/src/mana.cxx ". How can we solve this?

2. When we analyze the above data, an error also occurs: 
[Analyzer,ERROR] [odb.cxx:847:db_validate_name,ERROR] Invalid name 
"/Analyzer/Tests/Always true/Rate [Hz]" passed to db_create_key_wlocked: should 
not contain "["

We simply fixed that just by replacing the "Rate [Hz]" with "Rate" in the 
test_write in midas/src/mana.cxx 
We are curious whether you can fix the problem permanently in the next version, 
or we are not running the code properly. Thanks!
Entry  15 Apr 2019, Konstantin Olchanski, Info, switch of MIDAS to C++ 
For a long time now we have keep the core of midas (odb.c, midas.c, etc) compatible with plain C and by default 
we have built the MIDAS library using the plain C compiler. Over time, we have switched most MIDAS programs 
(mhttpd, mlogger, mdump, odbedit, etc) to C++ (with happy results). (and for a long time now, all of MIDAS 
could be build as C++, even if the default build remained plain C).

The main reason for keeping the core of MIDAS as C has been to allow writing MIDAS frontends in C - for 
example, in environments with no C++ compilers or no C++ runtime (VxWorks) or where C++ had too much 
overhead (small memory machines, etc).

Today, all concerns against using C++ seem to have receded into the past. C++ compilers are now always 
available, even for small embedded systems. C++ overheads are now well understood and one can easily write 
C++ code that is as efficient as C for using limited CPU and memory resources. (While at the same time, today's 
embedded systems tend to have more CPU and RAM than "big" MIDAS DAQ machines had in the past - 1GHz 
CPU, 1GB RAM is pretty typical for embedded ARM).

As examples of small hardware where MIDAS frontends written in C++ worked just fine, consider the T2K ND280 
FGD data collector running on XILINX FPGA with a 300MHz PowerPC and 128 Mbytes of RAM (standard Linux 
kernel) and the GRIFFIN Clock distribution module control running on a Microsemi FPGA with a 300MHz ARM 
CPU (ucLinux without an MMU). More typical Cyclone-5 ARM SoCs with 1GB RAM and 1GHz CPU run standard 
Linux (CentOS7) and can build MIDAS natively (no need for cross-compiling).

With the removal of the requirement to make it possible to write MIDAS frontends in C, we can switch the MIDAS 
default build to C++ and start using C++ features in the MIDAS API (std::string, std::vector, etc).

Next to consider is "which C++ should we use?".

K.O.
    Reply  15 Apr 2019, Konstantin Olchanski, Info, switch of MIDAS to C++, which C++? 
>
> With the removal of the requirement to make it possible to write MIDAS frontends in C, we can switch the MIDAS 
> default build to C++ and start using C++ features in the MIDAS API (std::string, std::vector, etc).
> 

Consider the most basic C++ construct, std::string, and observe how many member functions are annotated "c++11", "c++17", etc:
https://en.cppreference.com/w/cpp/string/basic_string

For MIDAS this means that we cannot target "a" C++ or "the" C++, we have to chose between C++ "before C++11", C++11, C++17 
(plus the incoming c++20).

For example, the ROOT 6 package requires C++11 *and* g++ >= 4.8.

Now consider the platforms we use at TRIUMF:

- Linux RHEL/SL/CentOS6 - gcc 4.4.7, no C++11.
- Linux RHEL/SL/CentOS7 - gcc 4.8.5, full C++11, no C++14, no C++17
- Ubuntu 18.04.2 LTS - gcc 7.3.0, full C++11, full C++14, "experimental" C++17.
- MacOS 10.13 - llvm 10.0.0 (clang-1000.11.45.5), full C++11, full C++14, full C++17.

(see here for GCC C++ support: https://gcc.gnu.org/projects/cxx-status.html)
(see here for LLVM clang c++ support: https://clang.llvm.org/cxx_status.html)

As is easy to see from the std::string reference how C++17 has a large number of very useful new features.

Alas, at TRIUMF we still run MIDAS on many SL6 machines where C++11 and C++17 is not normally available. I estimate another 1-2 
years before all our SL6 machines are upgraded to RHEL/SL/CentOS7 (or Ubuntu LTS).

This means we cannot use C++11 and C++17 in MIDAS yet. We are stuck with pre-C++11 for now.

Remarks:
- there will be trouble right away as both Stefan and myself do MIDAS development on MacOS where full C++17 is available and is 
tempting to use. (as they say, watch this space)
- it is possible to install a newer C++ compiler into RHEL/SL/CentOS 6 and 7 systems, but we are loath to require this (same as we 
are loath to require cmake for building MIDAS) - the "I" in MIDAS means integrated, meaning "does not require installing 100 
additional packages before one can use it".
- the MS Windows situation is unclear, but since one has to install the C++ compiler as an additional package anyway, I do not see 
any problem with requiring C++17 support, with a choice of MS compilers, GCC and LLVM. I doubt we will support anything older 
than Windows 10.

K.O.
       Reply  15 Apr 2019, Konstantin Olchanski, Info, switch of MIDAS to C++, how much C++? 
> >
> > With the removal of the requirement to make it possible to write MIDAS frontends in C, we can switch the MIDAS 
> > default build to C++ and start using C++ features in the MIDAS API (std::string, std::vector, etc).
> > 

C++ is a big animal. Obviously we want to use std::string, std::vector and similar improvements over plain C (we already use "//" for comments).

But in keeping with the Camel's nose fable (https://en.wikipedia.org/wiki/Camel%27s_nose), there are some parts of C++ we definitely do not want to use in MIDAS. Even the C++ FAQ talks 
about "evil features", see https://isocpp.org/wiki/faq/big-picture#use-evil-things-sometimes

Here is my list of things to use and to avoid. Comments on this are very welcome - as everybody's experience with C++ is different (and everybody's experience is very valuable and very 
welcome).

- std::string, std:vector, etc are in. I am already using them in the MIDAS API (midas.h)
- extern "C" is out, everything has to be C++, will remove "extern "C"" from all midas header files.
- exceptions are out, see https://stackoverflow.com/questions/1736146/why-is-exception-handling-bad
- std::thread and std::mutex are in, at least for writing new frontends, but see discussion of "cannot use c++11". (maybe replace ss_mutex_xxx() with out own std::mutex look-alike).
- heavy use of templates and heavy use of argument overloading is out - just by looking at the code, impossible to tell what function will be called
- "auto" is on probation. I need to know if "auto v=f()" is an integer or a double when I write "auto w=v/2" or "auto w=v/2.0". see 
https://softwareengineering.stackexchange.com/questions/180216/does-auto-make-c-code-harder-to-understand
- unreadable gibberish is out (lambdas, etc)
- C-style malloc()/free() is in. C++ new and delete are okey, but "delete[]" confuses me.
- C-style printf() is in. C++ cout and "<<" gunk provide no way to easily format the output for easy reading.

K.O.
          Reply  16 Apr 2019, Pintaudi Giorgio, Info, switch of MIDAS to C++, how much C++? 
Dear Konstantin,

even if I am still quite young and have only limited experience (but not null), I would like to give my two cents. I have reflected a bit about the C++ issue, also because I am developing a 
brand new MIDAS interface for the WAGASCI-T2K experiment, and I feel that the future of MIDAS could influence the future of our DAQ system, too. I'll start from the conclusions: I completely 
agree with you on a practical level, even if I kind of disagree on an "ethical" level.

What you propose in essence is to migrate the MIDAS core from pure C to a version of C with some fancy C++ features. Let's say a kind of C+ with only one plus. Theoretically speaking, even if 
on the surface C and C++ are very similar, they are completely different languages and require different mindsets (and I am sure that everyone is aware of it). This is the reason why even if I 
would have preferred to develop the MIDAS frontend for our experiment in C++, I have chosen to stick to pure C because I feel that MIDAS is still very C-like in its architecture (or from what 
I can see from the documentation). So I wanted to "keep on track" for better internal coherence. What I mean is that, if someone told me to port a C project of mine to C++, I would end up 
rewriting it almost completely, instead of just modifying it (I really don't know how much of the MIDAS core has been written with C++ in mind, so if a large part of it is already C++-like, 
please ignore my comment above).

Anyway, on a practical level, I completely agree with your approach, because I imagine that a complete rewrite of MIDAS is off the table but, at the same time, some new C++ features like 
better string and vector handling are very tempting to use. Moreover, in general, physicists are more familiar with the C syntax than with the C++ one (but thanks to ROOT that is changing). As 
for the use of MIDAS in embedded devices, I have no experience so I refrain from judging. So, in the particular case of MIDAS, what you propose is probably the best and only option.

As far as the C++ standard to adopt, I would say that the C++11 standard is the best fit for the T2K experiment since the official OS for T2K is CentOS7 and, out of the box, it supports C++11 
only. Anyway, I acknowledge that there are many other experiments and requirements. For the records, I do development on Ubuntu 18.04.

Best regards
Giorgio
          Reply  17 Apr 2019, John M O'Donnell, Info, switch of MIDAS to C++, how much C++? 
some semi-random thoughts:

no templates strictly means you can't use std::string, std::vector etc.

printf is in any case part of C++ (#include <cstdio>), but std::ostreams can be faster (for std::cout, endl line causes buffer flushing, whereas "\n" does not flush the buffer but printf
always flushes the buffer), and formatting is possible (though very long winded).  printf does not allow to print things other than simple data, e.g. BANK_HEADER* bh; printf( "%?", *bh);

I've been writing all our DAQ code in C++ for a while now.

> > >
> > > With the removal of the requirement to make it possible to write MIDAS frontends in C, we can switch the MIDAS 
> > > default build to C++ and start using C++ features in the MIDAS API (std::string, std::vector, etc).
> > > 
> 
> C++ is a big animal. Obviously we want to use std::string, std::vector and similar improvements over plain C (we already use "//" for comments).
> 
> But in keeping with the Camel's nose fable (https://en.wikipedia.org/wiki/Camel%27s_nose), there are some parts of C++ we definitely do not want to use in MIDAS. Even the C++ FAQ talks 
> about "evil features", see https://isocpp.org/wiki/faq/big-picture#use-evil-things-sometimes
> 
> Here is my list of things to use and to avoid. Comments on this are very welcome - as everybody's experience with C++ is different (and everybody's experience is very valuable and very 
> welcome).
> 
> - std::string, std:vector, etc are in. I am already using them in the MIDAS API (midas.h)
> - extern "C" is out, everything has to be C++, will remove "extern "C"" from all midas header files.
> - exceptions are out, see https://stackoverflow.com/questions/1736146/why-is-exception-handling-bad
> - std::thread and std::mutex are in, at least for writing new frontends, but see discussion of "cannot use c++11". (maybe replace ss_mutex_xxx() with out own std::mutex look-alike).
> - heavy use of templates and heavy use of argument overloading is out - just by looking at the code, impossible to tell what function will be called
> - "auto" is on probation. I need to know if "auto v=f()" is an integer or a double when I write "auto w=v/2" or "auto w=v/2.0". see 
> https://softwareengineering.stackexchange.com/questions/180216/does-auto-make-c-code-harder-to-understand
> - unreadable gibberish is out (lambdas, etc)
> - C-style malloc()/free() is in. C++ new and delete are okey, but "delete[]" confuses me.
> - C-style printf() is in. C++ cout and "<<" gunk provide no way to easily format the output for easy reading.
> 
> K.O.
             Reply  22 Apr 2019, Pintaudi Giorgio, Info, switch of MIDAS to C++, how much C++? example.cpp
Dear Konstantin and others,
our recent discussion stimulated my curiosity and I wrote a small frontend for the trigger board of our experiment in C++.
The underlying hardware details are not relevant here. I would just like to briefly report and discuss what I found out.

I have written all the frontend files (but the bus driver) in C++11:
  • my_frontend.cpp
  • driver/class/my_class_driver.cpp
  • driver/device/my_device_driver.cpp

All went quite smoothly, but I feel that the overall structure is still very C-like (that may be a good thing or a bad thing depending on the point of view).
As far as I know, the MIDAS frontend mfe.c has still only the C version (I couldn't find any mfe.cxx). This means that all the points of contact between the MIDAS frontend code and the user
frontend code must be C compatible (no C++ features or name mangling). To accomplish this I needed to slightly modify the midas.h header file like this:
@@ -1141,7 +1141,13 @@ typedef struct eqpmnt {
+#ifdef __cplusplus
+extern "C" {
+#endif
 INT device_driver(DEVICE_DRIVER *device_driver, INT cmd, ...);
+#ifdef __cplusplus
+}
+#endif

I also tested the new strcomb1 function and it seems to work OK.

I have attached a source file to show how I implemented the device driver in C++. The code is not meant to be compilable: it is just to show how I implemented it. This is the most C++-like syntax that I could come out with. Feel free to comment it and if you think that it could be improved let me know.

Best Regards
Giorgio
                Reply  23 Apr 2019, Konstantin Olchanski, Info, switch of MIDAS to C++, how much C++? 
> Dear Konstantin and others, our recent discussion stimulated my curiosity and I wrote a small frontend for the trigger board of our 
experiment in C++.

Yay!

> my_frontend.cpp

In MIDAS we are using .cxx, not .cpp, per ROOT coding convention https://root.cern.ch/coding-conventions

> the overall structure is still very C-like

this is object-oriented programming done in C. (actually C++ looks exactly the same if you look behind the curtain)

right now we do not hope to rewrite the slow control class driver framework in C++, but if somebody does it,
we should be happy to add it to midas.

for the mfe.c framework, I have a new C++ class based frontend framework in development (and already in use
in the ALPHA-g experiment at CERN). There is a number of lose ends to polish befire I can add it to midas.
And as usual the last 10% of the work consume 90% of the time.

> the MIDAS frontend mfe.c has still only the C version (I couldn't find any mfe.cxx). 
> This means that all the points of contact between the MIDAS frontend code and the user frontend code must be C compatible
> (no C++ features or name mangling).

this will change with the switch to C++, mfe.c will become mfe.cxx and I shall add the required definitions to mfe.h (or midas.h, TBD)

> To accomplish this I needed to slightly modify the midas.h header file like this:
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
>  INT device_driver(DEVICE_DRIVER *device_driver, INT cmd, ...);

I intend for all "extern "C"" to go away, everything will use the C++ linkage (and name mangling). This will break existing frontends
and I will need to write clear instructions on converting them to the new scheme.

> I also tested the new strcomb1 function and it seems to work OK.

good.

> I have attached a source file to show how I implemented the device driver in C++

Yup, looks familiar, I have a couple of C++ frontends written like this, too.

K.O.
       Reply  11 May 2019, Konstantin Olchanski, Info, switch of MIDAS to C++, which C++? 
> [which c++]
> 
> - Linux RHEL/SL/CentOS6 - gcc 4.4.7, no C++11.
> - Linux RHEL/SL/CentOS7 - gcc 4.8.5, full C++11, no C++14, no C++17
>

The construct I now always use:

class X {
int a = 0; // do not leave data members uninitialized, see "Non-static data member initializers", N2756 and N2628
}

is only available starting from gcc 4.7, see https://gcc.gnu.org/projects/cxx-status.html

Another nail into the coffin of "pre c++11" c++ and el < el7.

Hmm...

K.O.
    Reply  22 May 2019, Konstantin Olchanski, Info, switch of MIDAS to C++ 
> switch MIDAS to C++

switch to C++ will proceed as follows:

- create a new branch off develop (feature/switch_to_cxx)
- remove all extern "C", ifdef c++, etc
- switch Makefile from gcc to g++
- test
- merge into develop
- before merge, tag the last "C" midas
- cut a new release branch (tentatively feature/midas-2019-06)

the last recommended "pre-C++" midas will remain the midas-2019-03 release (where we can retroactively apply bug fixes, as I just did a few minutes ago).

K.O.
       Reply  05 Jun 2019, Konstantin Olchanski, Info, MIDAS switched to C++ 
The last bits of code to switch MIDAS to C++ have been committed, see tag midas-2019-05-cxx.

Since the cmake conversion is still in progress, for now, I recommend using the old "make" build for trying this update.

From the switch to C++, the biggest change is the requirement that frontend programs be build and linked
using the C++ compiler. Since mfe.o and the rest of MIDAS are built with C++, building frontends
with C is no longer possible.

To help with this, I will post a short guide for converting C frontends to C++.

K.O.
          Reply  17 May 2022, Razvan Stefan Gornea, Info, MIDAS switched to C++ 
Hi, I have three naive questions about this:
 - have you posted somewhere this guide about converting C frontends to C++?
 - it was mentioned previously that there will be a 'tag the last "C" midas', which version is it?
 - it means that even a simple example like odb_test.c cannot be compile anymore? Even when using g++?

Something like

g++ -I $HOME/daq/packages/midas/include/ -L $HOME/daq/packages/midas/lib/ odb_test.c -l midas

is expected to fail or is just me glitching? Is it because of thread library differences?

Thanks!


> The last bits of code to switch MIDAS to C++ have been committed, see tag midas-2019-05-cxx.
> 
> Since the cmake conversion is still in progress, for now, I recommend using the old "make" build for trying this update.
> 
> From the switch to C++, the biggest change is the requirement that frontend programs be build and linked
> using the C++ compiler. Since mfe.o and the rest of MIDAS are built with C++, building frontends
> with C is no longer possible.
> 
> To help with this, I will post a short guide for converting C frontends to C++.
> 
> K.O.
             Reply  17 May 2022, Konstantin Olchanski, Info, MIDAS switched to C++ 
> Hi, I have three naive questions about this:

all good questions, ask more of them.

>  - have you posted somewhere this guide about converting C frontends to C++?

yes, in this elog here I posted a guide for converting C mfe.c frontends to C++ and
a guide for converting mfe.c frontend to C++ TMFE frontend. please use the "find" function,
if you cannot find them, let me know, I will look for it for you.

>  - it was mentioned previously that there will be a 'tag the last "C" midas', which version is it?

correct. please run "git tag", tags before "midas-2019-05-cxx"is "C", after is "C++".

>  - it means that even a simple example like odb_test.c cannot be compile anymore? Even when using g++?
> g++ -I $HOME/daq/packages/midas/include/ -L $HOME/daq/packages/midas/lib/ odb_test.c -l midas
> is expected to fail or is just me glitching? Is it because of thread library differences?

yes, it is expected to fail, you have spaces after "-I", "-L" and "-l", incorrect g++ command syntax. after
correcting this, it may or may not work depending on what you have inside odb_test.c. I would be happy
to help you debug this, but please start a separate thread instead of necroposting into the C++ announcements.

K.O.
                Reply  17 May 2022, Ben Smith, Info, MIDAS switched to C++ 
>  - have you posted somewhere this guide about converting C frontends to C++?

There's documentation in the wiki at:
https://daq00.triumf.ca/MidasWiki/index.php/Changelog#2019-06

It includes a step-by-step guide of how to upgrade, what changes need to be made to frontends, and common issues that people had.
Entry  08 May 2022, Stefan Ritt, Info, RO_STOPPED with triggered events 
We had issues in one of our experiment that people used RO_STOPPED in the 
equipment list together with triggered events (EQ_USER). If events are sent when 
a run is stopped, this leads to many unexpected results, so I added a check in 
the mfe.cxx code which prevents RO_STOPPED (or RO_ALWAYS which includes 
RO_STOPPED) together with EQ_TRIGGERED, EQ_INTERRUPT, EQ_MULTITHREAD and EQ_USER 
type of events.

I got now complaints that some old front-end are not running any more since they 
do use RO_ALWAYS together with triggered events. Can the author of these frontend 
please tell me the rationale why this is needed, then I can maybe add a better 
fix for that.

Stefan
    Reply  08 May 2022, Konstantin Olchanski, Info, RO_STOPPED with triggered events 
> some old front-end are not running any more since they do use RO_ALWAYS together with 
triggered events.

I confirm, if you have mfe.c frontends that have RO_ALWAYS, after you update MIDAS, 
some of these frontends will fail to start.
https://bitbucket.org/tmidas/midas/commits/1961af0d657e4f76ab9db17f9b70c0c492172b6d

tmfe c++ frontends do not have this restriction but by default only read data when run 
is active (per-equipment fEqConfReadOnlyWhenRunning default is true).

K.O.
       Reply  16 May 2022, Konstantin Olchanski, Info, RO_STOPPED with triggered events 
> > some old front-end are not running any more since they do use RO_ALWAYS together with 
> triggered events.
> 
> I confirm, if you have mfe.c frontends that have RO_ALWAYS, after you update MIDAS, 
> some of these frontends will fail to start.
> https://bitbucket.org/tmidas/midas/commits/1961af0d657e4f76ab9db17f9b70c0c492172b6d
> 
> tmfe c++ frontends do not have this restriction but by default only read data when run 
> is active (per-equipment fEqConfReadOnlyWhenRunning default is true).

As of commit 
https://bitbucket.org/tmidas/midas/commits/28d9c96bd6d4f65346ebcd6a04492ea764c90823 mfe.c 
frontends will no longer fail to start. an error will still be issued "Equipment \"%s\" 
contains RO_STOPPED or RO_ALWAYS. This can lead to undesired side-effect and should be 
removed."

BTW 1:

Some of our old frontends use EQ_MULTITHREAD to implement multithreaded periodic equipments. 
They do not generate any events when there is no run (some of them do not generate any 
events at all). Now they will start printing this error message, for no reason. (no we will 
not be rewriting them justy to get rid of this message. life is too short).

BTW 2:

the c++ tmfe frontend does not have any protections against these "undersired side-effects".

What are these undesired side effects and should we add protection against them?

K.O.
          Reply  17 May 2022, Stefan Ritt, Info, RO_STOPPED with triggered events 
> > > some old front-end are not running any more since they do use RO_ALWAYS together with 
> > triggered events.
> > 
> > I confirm, if you have mfe.c frontends that have RO_ALWAYS, after you update MIDAS, 
> > some of these frontends will fail to start.
> > https://bitbucket.org/tmidas/midas/commits/1961af0d657e4f76ab9db17f9b70c0c492172b6d
> > 
> > tmfe c++ frontends do not have this restriction but by default only read data when run 
> > is active (per-equipment fEqConfReadOnlyWhenRunning default is true).
> 
> As of commit 
> https://bitbucket.org/tmidas/midas/commits/28d9c96bd6d4f65346ebcd6a04492ea764c90823 mfe.c 
> frontends will no longer fail to start. an error will still be issued "Equipment \"%s\" 
> contains RO_STOPPED or RO_ALWAYS. This can lead to undesired side-effect and should be 
> removed."
> 
> BTW 1:
> 
> Some of our old frontends use EQ_MULTITHREAD to implement multithreaded periodic equipments. 
> They do not generate any events when there is no run (some of them do not generate any 
> events at all). Now they will start printing this error message, for no reason. (no we will 
> not be rewriting them justy to get rid of this message. life is too short).
> 
> BTW 2:
> 
> the c++ tmfe frontend does not have any protections against these "undersired side-effects".
> 
> What are these undesired side effects and should we add protection against them?
> 
> K.O.

The undesired side-effects are the following: The logger tries to collect all events at the end of 
the run by emptying the SYSTEM buffer. If events keep coming after the run is stopped, this loop in 
the logger might be an endless loop, crashing the whole experiment in the end. 

Another issue (and actually the reason for this change) is the funciton receive_trigger_event() in 
mfe.cxx which will get confused if events are still coming in after a run has been stopped and 
actually enters an infinite loop.

Combining EQ_MULTITHREAD with EQ_PERIODIC or EQ_SLOW is a wrong parameter combination as written in 
the documentation. If one wants to have multi-threaded slow control events, one has to use the 
DF_MULTITHREAD flag in the DEVICE_DRIVER structure.

Having triggered events being sent to the system after a run has been stopped I would consider 
simply wrong. Why should we ever use a run start/stop if events are always flowing? Adding 
protections in all places for this case is certainly much more work than just changing one flag for 
frontends which produce this error message now for a wrong parameter combination.
Entry  24 Apr 2022, Konstantin Olchanski, Bug Fix, mserver buffer overrun and crash 
There is a memory allocation bug in the mserver.

ALIGN8() was missing when receiving events from the event socket and data buffer 
was allocated 4 bytes too short. but only for some received events and only in 
very unlucky sequence of received events. result was a rare but obnoxious crash 
of fevme frontend in alpha-2 at CERN. (we do not see any crash from this in 
alpha-g or anywhere else, the best I can tell).

fixed in commit 4dc06ba47ff7caa5251fd8c48d8533f35799f3a6.

If you use the mserver, please update to this commit or apply following patch in 
midas.cxx:

-   int bufsize = sizeof(INT) + event_size;
+   int bufsize = sizeof(INT) + total_size;

K.O.
    Reply  16 May 2022, Konstantin Olchanski, Bug Fix, mserver buffer overrun and crash 
> There is a memory allocation bug in the mserver.

Fix for this problem introduced a new problem, an infinite loop in bm_flush_cache, 
bitbucket bugs https://bitbucket.org/tmidas/midas/issues/339/infinite-loop-in-
mserver-due-to-mfes and https://bitbucket.org/tmidas/midas/issues/331/stuck-
semaphore-of-system-buffer

This is now fixed and the buffer write cache logic and size was rejigged
according to calculations in https://daq00.triumf.ca/elog-midas/Midas/2401

Event buffer write cache (as set via ODB Equipment/Common and via 
bm_set_cache_size()) now take 2 possible values:
0 - write cache is disabled and
MIN_WRITE_CACHE_SIZE - (10 Mbytes) minimum permitted cache size
bigger cache size values are permitted, up to buffer_size/3, but probably not useful 
if my calculations are right.
smaller cache size values are generally not useful, if my calculations are right.

mfe.c and tmfe c++ frontends updated to request the new write cache size by default.

if events are getting stuck in the write cache for too long, instead of reducing the 
cache size, one should increase frequency of bm_flush_cache() calls (1/sec by 
default).

commit 373bcc3ab7f83c3c7bf6c051c237de043a982502

K.O.
Entry  13 May 2022, Konstantin Olchanski, Info, analysis of corner cases in event buffer write cache 
introduction:

to remember, bm_send_event() writes an event to the write cache, bm_flush_cache() 
writes the contents of the write cache into the shared memory event buffer, buffer 
free space is consumed. in the usual case, mlogger is reading events from the shared 
memory event buffer, buffer free space is released. there is also a read cache, not 
part of this discussion.

the purpose of the write cache is to reduce contention for the shared memory 
semaphore. in the case of large number of small events, semaphore is locked per 
cache-flush, instead of per-event. correct tuning of write cache and event size can 
reduce lock rate from >100 kHz to around 100 Hz or lower.

analysis:

for correct operation of bm_send_event() under all conditions we need to consider 
all corner cases:

1) no write cache: (cache size set to 0)

- event_size > buffer_size -> reject the event (obviously)
- event_size > 0.5 * buffer_size -> only 1 event fits into the buffer, next write 
will stall until mlogger reads the previous event (sequential operation, bad)
- event_size < 0.3 * buffer_size -> at least 2 events fit into the buffer (good)

decision: limit event size to 0.5 to 0.3 * buffer_size (current limit is 0.5 * 
buffer_size, I think).

consequence: buffer size limit is 2 Gbytes (32-bit byte offsets, code is only 31-
bit-clean), max event size is between 1 Gbytes and 0.6 Gbytes.

2) writing to write cache:

- event_size > cache_size -> flush cache, write event to directly to buffer
- event_size > 0.5 * cache_size -> inefficient use of cache: write to cache, next 
event does not fit, flush to buffer, repeat. no gain in semaphore locking (bad), one 
additional memcpy() (event to cache and cache to buffer) (bad)
- event_size < 0.3 * cache_size -> multiple events fit into cache, but probably no 
gain in semaphore locking

decision: events that are bigger than 0.3 to 0.1 * cache_size should not go through 
the cache. (flush cache, write directly to buffer).

3) flush write cache to buffer:

- cache_size > buffer_size -> cannot flush in 1 operation, must have a loop and 
flush the cache in pieces
- cache_size between 0.5 and 1.0 * buffer_size -> can flush in 1 operation, but must 
wait for mlogger to fully empty the buffer (sequential operation, bad)
- cache size < 0.3 * buffer_size -> can flush in 1 operation, at least 2 "flushes" 
fit inside the buffer (good)

decision: limit write cache size to 0.3 * buffer_size. (current limit is 
0.25*buffer_size).

consequences:

- write cache size limit is 0.3..0.25 * 2GB = 0.6..0.5 Gbytes
- cached event size limit is 0.3..0.1 * 0.5 GBytes = 150..50 Mbytes
- minimum number of cached events: 3 to 10
- semaphore locks reduced: 3 to 10 locks become 1 lock (all events cached),
4 to 11 locks become 2 locks (big event causes cache flush).

4) complications:

- there is a periodic 1/second bm_flush_cache() that flushes the cache early and 
reduces it's efficiency (but needed to avoid having data stuck in cache for long 
time)
- if multiple frontends use large write cache (~ 0.3..0.5 * buffer_size), again, 
sequential operation can happen (bad)
- write cache is per-frontend, not per-equipment. if different equipments request 
different cache sizes, mfe.c and tmfe c++ frontends complain about this, but the 
user has to sort it out.

K.O.
    Reply  16 May 2022, Konstantin Olchanski, Info, analysis of corner cases in event buffer write cache 
> for correct operation of bm_send_event() under all conditions we need to ...

to continue computation from last message:

default SYSTEM buffer size: 32 MiBytes
default max event size: 4 MiBytes

hard max buffer size: 2 Gbytes (code is only 31-bit-clean)
hard max event size: 2 Gbytes (code is only 31-bit-clean)

max event size currently: 32 Mbytes (same as buffer size)
max event size per (1) in previous post: 32*0.5..0.3 = 16..9 MiBytes

number of default-max-size events buffered: 32/4 = 8.
number of per (1) max-size events buffered: 2 or 3
number of current max-size events buffered: 0 (bad, frontend is serialized with mlogger)

default write cache size: 100 kbytes

max write cache size currently: buffer size / 4 = 32/4 = 8 MiBytes
max write cache size per (3) in previous post: buffer_size / 3 = 10 Mbytes
hard max write cache size per (3): 2 Gbytes/3 = 600 Mbytes

max size of cached events:

current: 100 kbytes (size as cache size)
per (2) in previous post: 0.1..0.3 * cache size = 10..30 kbytes
per (2), 1 Mbyte cahe: 0.1..0.3 * cache size = 100..300 kbytes
hard max size: 0.1..0.3 * hard_max_cache_size = 0.1..0.3 * 600 = 60..180 Mbytes.

max data rate before event buffer semaphore locking rate exceeds 100 Hz:

1 kbyte events, no write cache: 100 kbytes/sec
1 kbyte events, 100 kbyte cache: 100 events cached, cache flush rate 100 Hz -> 100*1kbyte*100Hz -> 10 Mbytes/sec
1 kbyte events, 1 Mbyte cache: 1000 events cached, cache flush rate 100 Hz -> 100 Mbytes/sec (1gige ethernet)
N kbyte events, 1 Mbyte cache: same thing (data rate is limited by cache flush rate 100 Hz)
100 kbyte events, 1 Mbyte cache, not cached per (2): 100kbyte*100Hz = 10 Mbytes/sec
300 kbyte events, 1 Mbyte cache, not cached per (2): 300kbyte*100Hz = 30 Mbytes/sec
N00 kbyte events: N0 Mbytes/sec (500->50, etc)
1 kbyte events, 10 Mbyte cache: 10000 events cached, cache flush rate 100 Hz -> 1000 Mbytes/sec (10gige ethernet)
N kbyte events, 10 Mbyte cache: same thing (data rate is limited by cache flush rate 100 Hz)
1000 kbyte events, 10 Mbyte cache, not cached per (2): 1000kbyte*100Hz = 100 Mbytes/sec
3000 kbyte events, 10 Mbyte cache, not cached per (2): 3000kbyte*100Hz = 300 Mbytes/sec
N000 kbyte events: N00 Mbytes/sec (4000->400, 5000->500, etc)
default max event size: 4 Mibytes*100Hz = 400 Mbytes/sec (exceeds 1gige ethernet)
hard max event size (divided by 10 to buffer 10 events): 200 Mbytes*100Hz -> 20 Gbytes/sec

max event rate before event buffer semaphore locking rate exceeds 100 Hz:

1 kbyte events, no write cache: 100 Hz (obviously)
1 kbyte events, 100 kbyte cache: 100 events cached, cache flush rate 100 Hz -> 10 kHz
1 kbyte events, 1 Mbyte cache: 1000 events cached, cache flush rate 100 Hz -> 100 kHz
N kbyte events, 1 Mbyte cache: 1000/N events cached, cache flush rate 100 Hz -> 100/N kHz
1 kbyte events, 10 Mbyte cache: 10000 events cached, cache flush rate 100 Hz -> 1000 kHz
N kbyte events, 10 Mbyte cache: 10000/N events cached, cache flush rate 100 Hz -> 1000/N kHz
100 kbyte events, not cached per (2): 100 Hz (obviously)
300 kbyte events, not cached per (2): 100 Hz (obviously)
default max event size: 100 Hz (obviously)

K.O.
       Reply  16 May 2022, Konstantin Olchanski, Info, analysis of corner cases in event buffer write cache 
> > for correct operation of bm_send_event() under all conditions we need to ...
> to continue computation from last message:

if I got my numbers right, for present-day hardware (1gige/10gige data rates, 100 Hz max locking rate), we should 
increase the default buffer write cache size from 100 kbytes to 10 Mbytes.

this cache size will permit processing of the full mix of small/big events
at the full mix of event rates without exceeding the 100 Hz semaphore locking rate.

with the 10 Mbyte write cache, default event buffer size should be 30-40 Mbytes (current size is 33 Mbytes, so does 
not need to change).

this computation is for 1 writer (1 reader, mlogger). it is a typical case for our experiments.

multiple writers can run into contention for event buffer space.

consider 10 writers want to flush their 10 Mbyte write cache all at the same time:

if buffer size is the default 33 Mbytes, the first 3 writers will have successful write cache flush,
but the other 7 will stall, there is no space in the buffer, we have to wait for mlogger to free
some (mlogger writing X Mbytes/sec will take Y milliseconds to liberate 10 Mbytes of space for the 4th writer
to successfully flush, writers 5..10 are still stalled).

but in a system with 10 writers writing at 10 Mbytes/sec (1 Hz default cache flush rate) is 100 Mbytes/sec
will likely have SYSTEM buffer size at least 200-300 Mbytes (to buffer 1-2 seconds of data against
any delays in writing to disk/network storage).

so there should be no problem in practice.

K.O.
Entry  06 May 2022, Stefan Ritt, Info, Increased timeout for program shut down 
We had the problem in our lab that a frontend took about 6 seconds to gracefully 
shut down, mainly it needed to park some motors. I found that the shutdown command 
had a hard-coded timeout of 5 seconds, after which the frontend gets killed, and 
cannot finish the park operation. I change the code so that the client timeout 
stored in the ODB is taken instead of the hard-coded 5 seconds. This allows each 
client to fine-tune its timeout, to allow graceful shutdown, but also not let the 
user wait too long if the client gets stuck and needs a hard kill.

The default timeout for mfe.cxx based frontends has been changed to 10 seconds 
now, but in the frontend_init function this can be changed by the user code 
easily.

I hope this char does not trigger any bad side effects, but if it does, please 
report here.

Stefan
Entry  04 May 2022, Konstantin Olchanski, Bug Fix, mysql history update 
the code for writing midas history to mysql has been updated to work against 
MYSQL 8.0.23 (CERN ALPHA-2):

- as ever mysql reports inconsistent data types (I create column with type 
"integer", mysql reports it has type "int" and so forth), the special kludge to 
take care of this had to be tweaked.

- this caused some columns to be marked "inactive" and the code to "reactivate" 
them was missing (fixed)

- binary history event data size was computed incorrectly for events with 
"inactive" columns (fixed) and caused assert() failure and mlogger crash.

- mysql read of column definitions for history event "system" (as in 
/history/links/system) bombed because of incorrect quoting (worked before, why? 
why bombed now?). this caused duplicate columns to be created in mysql table 
"system" and mlogger bomb-out with complaint about "duplicated columns" 
(actually the error message was missing, so it was a silent bomb-out). quoting 
fixed, missing error message fixed, but cleanup of duplicate columns has to be 
done by hand. in case of alpha-2 the fix was to remove the unused 
/history/links/system).

if you are using mysql history please update or patch src/history_schema.cxx.

commit 9d17d2fef233cf457121ca7c2a283c4c76ed33bc

K.O.
Entry  30 Apr 2022, Konstantin Olchanski, Info, added web pages for "show odb clients" and "show open records" 
for a long time, midas web pages have been missing the equivalent of odbedit 
"scl" and "sor" to display current odb clients and current odb open records.

this is now added as buttons "show open records" and "show odb clients" in the 
odb editor page.

as in odbedit, "sor" shows open records under the current subtree, i.e. if you 
are looking at /equipment, you will not see open records for /experiment. to see 
all open records, go to "/".

commit b1ab7e67ecf785744fff092708d8389f222b14a4

K.O.
    Reply  04 May 2022, Stefan Ritt, Info, added web pages for "show odb clients" and "show open records" 
Concerning the "scl" page, we are currently having a discussion. At the moment, one can 
see midas clients in three different places:

1) the main status page at the bottom, only names and hosts are there

2) the programs page, where one can also start/stop program

3) now the new page "Show ODB clients" in the ODB editor page, which shows also the 
alive status, PID and timeout

I'm thinking that three locations are two too much, so we are considering to merge the 
tree pages into one. That would mean that 1) goes away, and the "Programs" page will 
show more information. We have some rare cases that programs are removed from 
/System/Clients in the ODB but still attached to the ODB. For those "zombies" we would 
add a "hard kill" function.

I would like to hear feedback from the midas community before we proceed with the 
plans. Anybody desperately in need of the programs shown on the status page?

Best,
Stefan
Entry  01 May 2022, Konstantin Olchanski, Info, added web page for "mdump" 
added JSON RPC for bm_receive_event() and added a web page for "mdump".

the event dump is a hex dump for now.

if somebody can contribute a javascript decoder for midas bank format, it would be greatly appreciated.

otherwise, I will eventually write my own decoder library patterned on midasio.h and midasio.cxx.

as of commit 5882d55d1f5bbbdb0d9238ada639e63ac27d8825
K.O.
    Reply  01 May 2022, Konstantin Olchanski, Info, added web page for "mdump" 
> added JSON RPC for bm_receive_event()

there is a number of problems with implementing bm_receive_event() as a RPC:

1) mhttpd has only event buffer 1 read pointer for all javascript connections, if two browser tabs are 
running mdump, they will "steal" events from each other.
2) javascript connections are state-less and we cannot specify per-connection event_id and trigger_mask 
filters to bm_receive_event(). our bm_request_event() has to be for all event_id and all trigger_mask.
3) for same reason, we cannot have some requests to be GET_ALL, some to be GET_RECENT and some to be 
GET_OLD (if GET_OLD is ever implemented).

Problem (1) is hard to fix. Only solution I can see is to have mhttpd have it's own event buffer that can 
somehow track which events have been sent to which javascript connection.

The same scheme allows implementing GET_ALL and per-connection event_id and trigger_mask filters.

The difficulty is in detecting javascript connections that are no longer active and it's event request and 
events we have buffered for it can be deleted. Unlike proper rpc clients, javascript browser tabs can be 
closed without warning and without opportunity to tell rpc server that they are closed, gone.

K.O.
    Reply  01 May 2022, Konstantin Olchanski, Info, added web page for "mdump" 
> added a web page for "mdump".

missing functions:
- get a list of existing event buffers (should read event buffer names from /Experiment/Buffer sizes)
- selector box to select event buffer
- button for "get next" and "get new" (should call bm_skip_event() before bm_receive_event())
- entry fields for event_id and trigger_mask event filter
- check box for "keep getting new data" and entry field for update frequency
- (eventually) entry field for bank name filter

K.O.
       Reply  02 May 2022, Stefan Ritt, Info, added web page for "mdump" 
Here are some of my thoughts:

- I volunteer to write the JavaScript midas bank decoder. Just a couple of pure javascript functions, no 
midasio.cxx library needed.

- If different javascript connections "steal" events from each other, I would not be concerned. Actually I 
would rather like that all connections see the SAME event. So mhttpd keeps one event, serves it to all 
links, so displays are consistent. If a browser wants to see the "next" event, it send the old serial 
number and days "please send next event AFTER serial number". If the serial number is larger than the 
event in the buffer, mhttpd fetches a new event and puts it into its buffer.

- Since javascript connections are connectionless, I would rather pass event_id and trigger_mask with each 
request. Then mhttpd can retrieve events until event_id and trigger_mask match, then serve that event. 
Since reading events from a midas buffer is fast (many 10'000s of events per second), the won't be much of 
a delay.

- GET_ALL does not make sense for browsers, you don't want to slow down any frontend. If someone wants to 
do histogramming in the browser, then GET_SOME (which is kind of GET_OLD) would make sense, but most of 
the cases we have some single event display, and there a GET_RECENT is most appropriate.
Entry  30 Apr 2022, Giovanni Mazzitelli, Forum, S3 Object Storage 
Dear all,
We are storing raw MIDAS files to S3 Object Storage, but MIDAS file are not 
optimised for readout from such kind of storage. There is any work around on 
evolution of midas raw output or, beyond simulated posix fs,  to develop midas 
python library optimised to stream data from S3 (is not really clear to me if this 
is possible).
    Reply  30 Apr 2022, Konstantin Olchanski, Forum, S3 Object Storage 
> We are storing raw MIDAS files to S3 Object Storage, but MIDAS file are not 
> optimised for readout from such kind of storage. There is any work around on 
> evolution of midas raw output or, beyond simulated posix fs,  to develop midas 
> python library optimised to stream data from S3 (is not really clear to me if this 
> is possible).

We have plans for adding S3 object storage support to lazylogger, but have not gotten 
around to it yet.

We do not plan to add this in mlogger. mlogger works well for writing data to locally-
attached storage (local ext4, XFS, ZFS) but always runs into problems with timeouts and 
delays when writing to anything network-attached (even writing to NFS).

I envision that each midas raw data file (mid.gz or mid.lz4 or mid.bz2) will
be stored as an S3 object and there will be some kind of directory object
to map object ids to run and subrun numbers.

Choice of best file size is open, normally we use subruns to limit file size to 1-2 
Gbytes. If cloud storage prefers some other object size, we can easily to up to 10 
Gbytes and down to "a few megabytes" (ODB dumps will have to be turned off for this).

Other than that, in your view, what else is needed to optimize midas files for storage 
in the Amazon S3 could?

P.S. For reading files from the cloud, code needs to be written and added to 
midasio/midasio.cxx, for example, see the code that is already there for reading ssh-
attached files and dcache/dccp-attached files. (CERN EOS files can be read directly 
from POSIX mount point /eos).

K.O.
       Reply  01 May 2022, Giovanni Mazzitelli, Forum, S3 Object Storage 
> > We are storing raw MIDAS files to S3 Object Storage, but MIDAS file are not 
> > optimised for readout from such kind of storage. There is any work around on 
> > evolution of midas raw output or, beyond simulated posix fs,  to develop midas 
> > python library optimised to stream data from S3 (is not really clear to me if this 
> > is possible).
> 
> We have plans for adding S3 object storage support to lazylogger, but have not gotten 
> around to it yet.
> 
> We do not plan to add this in mlogger. mlogger works well for writing data to locally-
> attached storage (local ext4, XFS, ZFS) but always runs into problems with timeouts and 
> delays when writing to anything network-attached (even writing to NFS).
> 
> I envision that each midas raw data file (mid.gz or mid.lz4 or mid.bz2) will
> be stored as an S3 object and there will be some kind of directory object
> to map object ids to run and subrun numbers.
> 
> Choice of best file size is open, normally we use subruns to limit file size to 1-2 
> Gbytes. If cloud storage prefers some other object size, we can easily to up to 10 
> Gbytes and down to "a few megabytes" (ODB dumps will have to be turned off for this).
> 
> Other than that, in your view, what else is needed to optimize midas files for storage 
> in the Amazon S3 could?
> 
> P.S. For reading files from the cloud, code needs to be written and added to 
> midasio/midasio.cxx, for example, see the code that is already there for reading ssh-
> attached files and dcache/dccp-attached files. (CERN EOS files can be read directly 
> from POSIX mount point /eos).
> 
> K.O.

thanks, 
actually a I made a small work around with python boto3 library with file of any size (with 
the obviously limitation of opportunity and time to wait) eg:

key = 'TMP/run00060.mid.gz'

aws_session = creds.assumed_session("infncloud-iam")
s3 = aws_session.client('s3', endpoint_url="https://minio.cloud.infn.it/", 
                        config=boto3.session.Config(signature_version='s3v4'),verify=True)

s3_obj = s3.get_object(Bucket='cygno-data',Key=key)
buf = BytesIO(s3_obj["Body"]._raw_stream.data)

for event in MidasSream(gzip.GzipFile(fileobj=buf)):
    if event.header.is_midas_internal_event():
        print("Saw a special event")
        continue

    bank_names = ", ".join(b.name for b in event.banks.values())
    print("Event # %s of type ID %s contains banks %s" % (event.header.serial_number, 
event.header.event_id, bank_names))
    ....


where in MidasSream I just bypass the open, and the code work, but obviously in this way I 
need to have all the buffer in memory and it take time get all the buffer. I was interested to 
understand if some one have already develop the stream event by event (better in python but 
not mandatory). I'll look to the code you underline.
Thanks, G. 
 
Entry  16 Mar 2022, Stefan Ritt, Info, New midas sequencer version 
A new version of the midas sequencer has been developed and now available in the 
develop/seq_eval branch. Many thanks to Lewis Van Winkle and his TinyExpr library 
(https://codeplea.com/tinyexpr), which has now been integrated into the sequencer 
and allow arbitrary Math expressions. Here is a complete list of new features:


* Math is now possible in all expressions, such as "x = $i*3 + sin($y*pi)^2", or 
in "ODBSET /Path/value[$i*2+1], 10"


* "SET <var>,<value>" can be written as "<var>=<value>", but the old syntax is 
still possible.


* There are new functions ODBCREATE and ODBDLETE to create and delete ODB keys, 
including arrays


* Variable arrays are now possible, like "a[5] = 0" and "MESSAGE $a[5]"


If the branch works for us in the next days and I don't get complaints from 
others, I will merge the branch into develop next week.

Stefan
    Reply  22 Mar 2022, Stefan Ritt, Info, New midas sequencer version 
After several days of testing in various experiments, the new sequencer has
been merged into the develop branch. One more feature was added. The path to
the ODB can now contain variables which are substituted with their values.
Instead writing

ODBSET /Equipment/XYZ/Setting/1/Switch, 1
ODBSET /Equipment/XYZ/Setting/2/Switch, 1
ODBSET /Equipment/XYZ/Setting/3/Switch, 1

one can now write

LOOP i, 3
   ODBSET /Equipment/XYZ/Setting/$i/Switch, 1
ENDLOOP

Of course it is not possible for me to test any possible script. So if you 
have issues with the new sequencer, please don't hesitate to report them 
back to me.

Best,
Stefan
       Reply  15 Apr 2022, Stefan Ritt, Info, New midas sequencer version sequencer.pdf
I prepared some slides about the new features of the sequencer and post it here so 
people can have a quick look at get some inspiration.

Stefan
Entry  05 Apr 2013, Konstantin Olchanski, Info, ODB JSON support 
odbedit can now save ODB in JSON-formatted files. (JSON is a popular data encoding standard associated 
with Javascript). The intent is to eventually use the ODB JSON encoder in mhttpd to simplify passing of 
ODB data to custom web pages. In mhttpd I also intend to support the JSON-P variation of JSON (via the 
jQuery "callback=?" notation).

JSON encoding implementation follows specifications at:
http://json.org/
http://www.json-p.org/
http://api.jquery.com/jQuery.getJSON/  (seek to JSONP)

The result passes validation by:
http://jsonlint.com/

Added functions:
   INT EXPRT db_save_json(HNDLE hDB, HNDLE hKey, const char *file_name);
   INT EXPRT db_copy_json(HNDLE hDB, HNDLE hKey, char **buffer, int *buffer_size, int *buffer_end, int 
save_keys, int follow_links);

For example of using this code, see odbedit.c and odb.c::db_save_json().

Example json file:

Notes:
1) hex numbers are quoted "0x1234" - JSON does not permit "hex numbers", but Javascript will 
automatically convert strings containing hex numbers into proper integers.
2) "double" is encoded with full 15 digit precision, "float" with full 7 digit precision. If floating point values 
are actually integers, they are encoded as integers (10.0 -> "10" if (value == (int)value)).
3) in this example I deleted all the "name/key" entries except for "stringvalue" and "sbyte2". I use the 
"/key" notation for ODB KEY data because the "/" character cannot appear inside valid ODB entry names. 
Normally, depending on the setting of "save_keys" argument, KEY data is present or absent for all entries.

ladd03:midas$ odbedit
[local:testexpt:S]/>cd /test
[local:testexpt:S]/test>save test.js
[local:testexpt:S]/test>exit
ladd03:midas$ more test.js
# MIDAS ODB JSON
# FILE test.js
# PATH /test
{
  "test" : {
    "intarr" : [ 15, 0, 0, 3, 0, 0, 0, 0, 0, 9 ],
    "dblvalue" : 2.2199999999999999e+01,
    "fltvalue" : 1.1100000e+01,
    "dwordvalue" : "0x0000007d",
    "wordvalue" : "0x0141",
    "boolvalue" : true,
    "stringvalue" : [ "aaa123bbb", "", "", "", "", "", "", "", "", "" ],
    "stringvalue/key" : {
      "type" : 12,
      "num_values" : 10,
      "item_size" : 1024,
      "last_written" : 1288592982
    },
    "byte1" : 10,
    "byte2" : 241,
    "char1" : "1",
    "char2" : "-",
    "sbyte1" : 10,
    "sbyte2" : -15,
    "sbyte2/key" : {
      "type" : 2,
      "last_written" : 1365101364
    }
  }
}

svn rev 5356
K.O.
    Reply  10 May 2013, Konstantin Olchanski, Info, mhttpd JSON support 
> odbedit can now save ODB in JSON-formatted files.
> Added functions:
>    INT EXPRT db_save_json(HNDLE hDB, HNDLE hKey, const char *file_name);
>    INT EXPRT db_copy_json(HNDLE hDB, HNDLE hKey, char **buffer, int *buffer_size, int *buffer_end, int  save_keys, int follow_links);
> 

Added JSON encoding format to Javascript ODBCopy() ("jcopy"). Use format="json", Javascript example updated with an example example.

Also updated db_copy_json():
- always return NUL-terminated string
- "save_keys" values: 0 - do not save any KEY data, 1 - save all KEY data, 2 - save only KEY.last_written

odb.c, mhttpd.cxx, example.html
svn rev 5362
K.O.
       Reply  17 May 2013, Konstantin Olchanski, Info, mhttpd JSON-P support 
> 
> Added JSON encoding format to Javascript ODBCopy(path,format) ("jcopy"). Use format="json", Javascript example updated with an example example.
> 

More ODBCopy() expansion: format="json-p" returns data suitable for JSON-P ("script tag") messaging.

Also implemented multiple-paths for "jcopy" (similar to "jget"/ODBMGet()). An example ODBMCopy(paths,callback,format) is present in example.html (will move to mhttpd.js).

Added JSON encoding options:
- format="json-nokeys" will omit all KEY information except for "last_written"
- "json-nokeys-nolastwritten" will also omit "last_written"
- "json-nofollowlinks" will return ODB symlink KEYs instead of following them (ODBGet/ODBMGet always follows symlinks)
- "json-p" adds JSON-P encapsulation
All these JSON format options can be used at the same time, i.e. format="json-p-nofollowlinks"

To see how it all works, please look at examples/javascript1/example.html.

The new code seems to be functional enough, but it is still work in progress and there are a few problems:
- ODBMCopy() using the "xml" format returns gibberish (the MIDAS XML encoder has to be told to omit the <?xml> header)
- example.html does not actually parse any of the XML data, so we do not know if XML encoding is okey
- JSON encoding has an extra layer of objects (variables.Variables.foo instead of variables.foo)
- ODBRpc() with JSON/JSON-P encoding not done yet.

mhttpd.cxx, example.html
svn rev 5364
K.O.
          Reply  31 May 2013, Konstantin Olchanski, Info, mhttpd JSON-P support 
> To see how it all works, please look at examples/javascript1/example.html.
> 
> - JSON encoding has an extra layer of objects (variables.Variables.foo instead of variables.foo)
>

This is now fixed. See updated example.html. Current encoding looks like this:

{
  "System" : {
    "Clients" : {
      "24885" : {
        "Name/key" : { "type" : 12, "item_size" : 32, "last_written" : 1370024816 },
        "Name" : "ODBEdit",
        "Host/key" : { "type" : 12, "item_size" : 256, "last_written" : 1370024816 },
        "Host" : "ladd03.triumf.ca",
        "Hardware type/key" : { "type" : 7, "last_written" : 1370024816 },
        "Hardware type" : 44,
        "Server Port/key" : { "type" : 7, "last_written" : 1370024816 },
        "Server Port" : 52539
      }
    },
    "Tmp" : {
...

odb.c, example.html
svn rev 5368
K.O.
    Reply  27 Sep 2013, Konstantin Olchanski, Info, ODB JSON support 
> odbedit can now save ODB in JSON-formatted files.
> 
> JSON encoding implementation follows specifications at:
> http://json.org/
> 
> The result passes validation by:
> http://jsonlint.com/
> 

A bug was reported in my JSON ODB encoder: NaN values are not encoded correctly. A quick review found this:

1) the authors of JSON smoked some bad mushrooms and specifically disallowed NaN and Inf values for floating point numbers: 
http://tools.ietf.org/html/rfc4627
2) most JSON encoders and decoders do reasonable and unreasonable things with NaN and Inf values. The worst ones encode them as zero. More bad 
mushrooms.

There is a quick survey at: http://lavag.org/topic/16217-cr-json-labview/?p=99058

<pre>
Some Javascript engines allow it since it is valid Javascript but not valid Json however there is no concensus.
cmj-JSON4Lua: raw tostring() output (invalid JSON).
dkjson: 'null' (like in the original JSON-implementation).
Fleece: NaN is 0.0000, freezes on +/-Inf.
jf-JSON: NaN is 'null', Inf is 1e+9999 (the encode_pretty function still outputs raw tostring()).
Lua-Yajl: NaN is -0, Inf is 1e+666.
mp-CJSON: raises invalid JSON error by default, but runtime configurable ('null' or Nan/Inf).
nm-luajsonlib: 'null' (like in the original JSON-implementation).
sb-Json: raw tostring() output (invalid JSON).
th-LuaJSON: JavaScript? constants: NaN is 'NaN', Inf is 'Infinity' (this is valid JavaScript?, but invalid JSON).
</pre>

For the MIDAS JSON encoder (and decoder) I have several choices:
a) encode NaN and Inf using the printf("%f") encoding (as strings, making it valid JSON)
b) encode NaN and Inf as strings using the Javascript special values: "NaN", "Infinity" and "-Infinity", see 
http://www.w3schools.com/jsref/jsref_positive_infinity.asp

I note that the Python JSON encoder does (b), see section 18.2.3.3 at http://docs.python.org/2/library/json.html

In either case, behaviour of the JSON decoder on the Javascript side needs to be tested. (Silent conversion to value of zero is not acceptable).

If anybody has an suggestion on this, please let me know.



P.S. If you do not know all about NaN, Inf, "-0" and other floating point funnies, please read:  https://www.ualberta.ca/~kbeach/phys420_580_2010/docs/ACM-Goldberg.pdf

P.P.S. If you ever used the type "float" or "double", used the "/" operator or the function "sqrt()" you also should read that reference.

K.O.
       Reply  09 Oct 2013, Konstantin Olchanski, Info, ODB JSON support 
> > odbedit can now save ODB in JSON-formatted files.
> A bug was reported in my JSON ODB encoder: NaN values are not encoded correctly.

Tested the browser-builtin JSON.stringify() function in google-chrome, firefox, safari, opera:
everybody encodes numeric values NaN and Inf as JSON value [null].

To me, this clearly demonstrates a severe defect in the JSON standard and in it's Javascript implementation:
a) NaN, Inf and -Inf are valid, useful and commonly used numeric values defined by the IEEE754/854 standard (as opposed to the special value "-0", which is also defined by the standard, but is not nearly as useful)
b) they are all distinct numeric values, encoding them all into the same JSON value [null] is the same as encoding all even numbers into the JSON value [42].
c) on the decoding end, JSON value [null] is decoded into Javascript value [null], which works as 0 for numeric computation, so effectively NaN, Inf and -Inf are made equal to zero. A neat trick.

Note that (c) - NaN, Inf is same as 0 - eventually produces incorrect numerical results by breaking the IEEE754/854 standard specification that number+NaN->NaN, number+infinity->infinity, etc.

In MIDAS we have a requirement that results be numerically correct: if an ODB value is "infinity", the corresponding web page should not show "0".

In addition we have a requirement that JSON encoding should be lossess: i.e. ODB contents encoded by JSON should decode back into the same ODB contents.

To satisfy both requirements, I now encode NaN, Inf and -Inf as JSON string values "NaN", "Infinity" and "-Infinity". (Corresponding to the respective Javascript values).

Notes:
1) this is valid JSON
2) it survives decode/encode in the browser (ODBMCopy()/JSON.parse/modify some values/JSON.stringify/ODBMPaste() does not destroy these special values)
3) it is numerically correct for "NaN" values (Javascript [1+"NaN"] -> NaN)
4) it fails in an obvious way for Inf and -Inf values (Javascript [1+"Infinity"] is NaN instead of Infinity).

https://bitbucket.org/tmidas/midas/commits/82dd203cc95dacb6ec9c0a24bc97ffd45bb58427
K.O.
          Reply  17 Mar 2014, Konstantin Olchanski, Info, ODB JSON support 
> > > odbedit can now save ODB in JSON-formatted files.
> encode NaN, Inf and -Inf as JSON string values "NaN", "Infinity" and "-Infinity". (Corresponding to the respective Javascript values).

A new standard just came out - Oasis OData JSON format 4.0 - 
http://docs.oasis-open.org/odata/odata-json-format/v4.0/os/odata-json-format-v4.0-os.html

Section 7.1 reads:

> Values of types [...] Edm.Single, Edm.Double, and Edm.Decimal are represented as JSON numbers, except for NaN, INF, and –INF which are represented as strings.

This is consistent with what we do in MIDAS - encode special numbers as strings. For now I think we stay with Javascript-standard "Infinity", "-Infinity",
but if more standards start using "INF", "-INF", maybe we will switch. It is easy enough to support both encodings in the JSON parser and in the ODB decoder.

https://xkcd.com/927/
K.O.
             Reply  12 Apr 2022, Konstantin Olchanski, Info, ODB JSON support 
> > > > odbedit can now save ODB in JSON-formatted files.
> > encode NaN, Inf and -Inf as JSON string values "NaN", "Infinity" and "-Infinity". (Corresponding to the respective Javascript values).
> http://docs.oasis-open.org/odata/odata-json-format/v4.0/os/odata-json-format-v4.0-os.html
> > Values of types [...] Edm.Single, Edm.Double, and Edm.Decimal are represented as JSON numbers,
> except for NaN, INF, and –INF which are represented as strings "NaN", "INF" and "-INF".
> https://xkcd.com/927/

Per xkcd, there is a new json standard "json5". In addition to other things, numeric
values NaN, +Infinity and -Infinity are encoded as literals NaN, Infinity and -Infinity (without quotes):
https://spec.json5.org/#numbers

Good discussion of this mess here:
https://stackoverflow.com/questions/1423081/json-left-out-infinity-and-nan-json-status-in-ecmascript

K.O.
                Reply  13 Apr 2022, Stefan Ritt, Info, ODB JSON support 
> Per xkcd, there is a new json standard "json5". In addition to other things, numeric
> values NaN, +Infinity and -Infinity are encoded as literals NaN, Infinity and -Infinity (without quotes):
> https://spec.json5.org/#numbers

Just for curiosity: Is this implemented by the midas json library now?
                   Reply  13 Apr 2022, Konstantin Olchanski, Info, ODB JSON support 
> > Per xkcd, there is a new json standard "json5". In addition to other things, numeric
> > values NaN, +Infinity and -Infinity are encoded as literals NaN, Infinity and -Infinity (without quotes):
> > https://spec.json5.org/#numbers
> 
> Just for curiosity: Is this implemented by the midas json library now?

MIDAS encodes NaN, Infinity and -Infinity as javascript compatible "NaN", "Infinity" and "-Infinity",
this encoding is popular with other projects and allows correct transmission of these values
from ODB to javascript. The test code for this is on the MIDAS "Example" page, scroll down
to "Test nan and inf encoding".

I think this type of encoding, using strings to encode special values, is more in the spirit of json,
compared to other approaches such as adding special literals just for a few special cases
leaving other special cases in the cold (ieee-754 specifies several different types of NaN,
you can encode them into different nan-strings, but not into the one nan-literal (need more nan-literals,
requires change to the standard and change to every json parser).

As editorial comment, it boggles my mind, what university or kindergarden these people went to
who made the biggest number, the smallest number and the imaginary number (sqrt(-1))
all equal to zero (all encoded as literal null).

K.O.
Entry  31 Mar 2022, Konstantin Olchanski, Bug Fix, "run stop" trouble in mlogger, fixed 
while debugging something else, I ran into a bit of trouble in mlogger.

I set the mlogger event limit to 100, and after reaching 100 events, mlogger
sayd "stopping run", but nothing happened, run kept going.

it turns out mlogger tried stopping the run too soon, the run-start
transition did not finish yet and the error message about trying
to stop a run while another transition is in progress was missing.

(fixed - if another transition is in progress, we try again later)

it also turns out that cm_transition() checks if another transition
is in progress way too late, all the way in the transition thread,
where it cannot return it is an error to mlogger.

(fixed - first thing done in cm_transition() is this check).

while debugging this, I tested the ODB flags "/Logger/Async transitions"
and /Logger/Multithread transitions". It turns out only two transition
types still work from inside mlogger - multithread transition
and detached transition (via the mtransition helper).

the issue is the dead lock between mlogger and frontend. while mlogger
is inside cm_transition(), it is not reading the SYSTEM buffer,
while at the same time frontends are writing into it. If SYSTEM
buffer happens to be pretty full, we dead lock - frontends are waiting
for free space in the SYSTEM buffer do not respond to RPCs, mlogger is not 
reading from the SYSTEM and it stuck trying to issue "run stop" RPC
to frontend. (this dead lock is not forever, eventually frontend
is killed by RPC timeout, mlogger survives and stops the run).

this is a well known problem and as solution, mlogger has been using the 
multithreaded transitions for years.

now I removed the OBD /Logger/Async transition and /Logger/Multithread 
transition flags, instead, there is now a flag /Logger/Detached transitions
set to FALSE by default. Setting it to TRUE will cause mlogger to fork 
"mtransition STOP" and "mtransition START" for stopping and starting runs,
this is useful in case there is trouble with multithreading in mlogger.

K.O.
Entry  30 Mar 2022, Konstantin Olchanski, Bug Fix, erroneous removal of odb clients, fixed 
commit https://bitbucket.org/tmidas/midas/commits/b1fe21445109774be3f059c2124727b414abf835
made on 2022-02-21 fixed a serious bug in ODB.

a multithread race condition against an incorrectly updated shared variable caused removal of 
random clients from ODB with error message:

My client index %d in ODB is invalid: out of range 0..%d. Maybe this client was removed by a 
timeout, see midas.log. Cannot continue, aborting...

the race is between db_open_database() in one program (executed when any midas program starts) and 
db_get_my_client_locked() in all running midas programs.

as long as no midas programs are started (db_open_database() is not executed), this bug does not 
happen.

if i.e. odbedit is executed very often, i.e. from a script, probability of hitting this bug becomes 
quite high.

fixed now.

K.O.
Entry  29 Mar 2022, Konstantin Olchanski, Bug Fix, mdump can read lz4 and bz2 files now 
I converted mdump file i/o from older mdsupport library to newer midasio library 
and it can now read .mid, .mid.gz, .mid.lz4 and .mid.bz2 files. Output should be 
identical to what it printed before, if you see any differences, please report 
them here or on bitbucket. K.O.
Entry  29 Mar 2022, Hunter Lowe, Forum, Triggering without LAM signal - mcstd_libgpmc_camac driver 
Hello,

I have a question for anyone experienced with simple CAMAC systems.
 My understanding is that for a single ADC system you can use a gate to generate a
 LAM signal for triggering on ADC.
 The driver that I have "mcstd_libgpmc_camac" has LAM "not implemented" though,
 so I'm not sure how I should trigger DAQ. The frontend code that I have seems to use a TDC
 as trigger for ADC via "EQ_POLLED" type equipment setting. Should I simply plug in TDC in my
 system and use this as trigger? Is it as simple as TDC generates signal via gate and ADC performs job? 

Sorry if question is super basic, just confused how to trigger without LAM signal.

Thank you :)

Hunter Lowe
UNBC Grad Physics
Entry  12 Dec 2021, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  dummy_fe.cpp
Dear all,

in 13 Feb 2020 to 21 Feb 2020 we had a talk about how I try to create MIDAS events directly on a FPGA and 
than use DMA to hand the event over to MIDAS. In the thread I also explained how I do it in my MIDAS frontend. 

For testing the DAQ I created a dummy frontend which was emulating my FPGA (see attached file). The interesting code is 
in the function read_stream_thread and there I just fill a array according to the 32b BANKS which are 64b aligned (more or less
the lines 306-369). And than I do:

    uint32_t * dma_buf_volatile;
    dma_buf_volatile = dma_buf_dummy;

    copy_n(&dma_buf_volatile[0], sizeof(dma_buf_dummy)/4, pdata);

    pdata+=sizeof(dma_buf_dummy);
    rb_increment_wp(rbh, sizeof(dma_buf_dummy)); // in byte length

to send the data to the buffer.

This summer (Mai - July) everything was working fine but today I did not get the data into MIDAS. 
I was hopping around a bit with the commits and everything was at least working until: 3921016ce6d3444e6c647cbc7840e73816564c78.

Thanks,
Marius
    Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Writting MIDAS Events via FPGAs  
> today I did not get the data into MIDAS. 

Any error messages printed by the frontend? any error message in midas.log? core dumps? crashes?

I do not understand what you mean by "did not get the data into midas". You create events
and send them to a midas event buffer and you do not see them there? With mdump?
Do you see this both connected locally and connected remotely through the mserver?

BTW, I see you are using the mfe.c frontend. Event data handling in mfe.c frontends
is quite convoluted and impossible to straighten out. I recommend that you use
the tmfe c++ frontend instead. Event data handling is much simplified and is easier to debug
compared to the mfe.c frontend. There is examples in the midas repository and there are
tutorials for converting frontends from mfe.c to tmfe posted in this forum here.

BTW, the commit you refer to only changed some html files, could not have affected
your data.

K.O.
       Reply  26 Jan 2022, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  
> Any error messages printed by the frontend? any error message in midas.log? core dumps? crashes? 
> I do not understand what you mean by "did not get the data into midas". You create events
> and send them to a midas event buffer and you do not see them there? With mdump?
> Do you see this both connected locally and connected remotely through the mserver?

I simply don't see the event counter counting up and I also don't see them using mdump. No logs, no dumps and no crashes - every is quite. I only tested it locally.
 
> BTW, I see you are using the mfe.c frontend. Event data handling in mfe.c frontends
> is quite convoluted and impossible to straighten out. I recommend that you use
> the tmfe c++ frontend instead. Event data handling is much simplified and is easier to debug
> compared to the mfe.c frontend. There is examples in the midas repository and there are
> tutorials for converting frontends from mfe.c to tmfe posted in this forum here.

I know the code I used is really old that's why I was so surprised that it suddenly did not work. But I am on the way to change it. Also Stefan gave me some comments on how to improve the code. But still changing them did not really change the behavior. 

> BTW, the commit you refer to only changed some html files, could not have affected
> your data.

I just hopped around and the commit I send was the first one which worked again. But it's of course not the one where the stuff broke. I did a bit of git-bisect and ended up with this commit as the first one where my frontend is not working anymore: 91582e4172d534bf9b10e661a423c399fd1a69f4

Cheers,
Marius
          Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Writting MIDAS Events via FPGAs  
> 
> > Any error messages printed by the frontend? any error message in midas.log? core dumps? crashes? 
> > I do not understand what you mean by "did not get the data into midas". You create events
> > and send them to a midas event buffer and you do not see them there? With mdump?
> > Do you see this both connected locally and connected remotely through the mserver?
> 
> I simply don't see the event counter counting up and I also don't see them using mdump. No logs, no dumps and no crashes - every is quite. I only tested it locally.
>

If you are connected locally (no mserver), I want to know the value returned by bm_send_event(). Simplest
if you edit mfe.c and everywhere it calls bm_send_event() and rpc_send_event(), print the returned value.

It would be very interesting to see if bm_send_event() returns 1 (SUCCESS), but the event vanishes
without a trace.

Before you do that, try something simpler:

Run "mdump -s -d", it will print some event buffer internals.

Watch to see if any data pointers change when you send your events ("wp", "rp", etc).

If nothing changes at all, then we are not sending anything (fault is in your code or on mfe.c).

If you see "wp" counting up, then we definitely write your events into the buffer and mdump & mlogger should see them.

But there is some funny logic for event_id and trigger_mask and it is worth checking their
values. For a good test, set event_id=1 and trigger_mask=0x1. There might be trouble if either is set to zero.

K.O.
             Reply  26 Jan 2022, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  
> If you are connected locally (no mserver), I want to know the value returned by bm_send_event(). Simplest
> if you edit mfe.c and everywhere it calls bm_send_event() and rpc_send_event(), print the returned value.
> 
> It would be very interesting to see if bm_send_event() returns 1 (SUCCESS), but the event vanishes
> without a trace.

I checked bm_send_event(rbh, (EVENT_HEADER*)(&pdata[0]), 0, 20); which gives me back 1. I also check the status of rb_increment_wp which is also 1.

> Before you do that, try something simpler:
> Run "mdump -s -d", it will print some event buffer internals.
> Watch to see if any data pointers change when you send your events ("wp", "rp", etc).

"rp" & "wp" are not counting up. 

> But there is some funny logic for event_id and trigger_mask and it is worth checking their
> values. For a good test, set event_id=1 and trigger_mask=0x1. There might be trouble if either is set to zero.

Changing both to 0x1 did not change the behavior. 

Cheers,
Marius
    Reply  28 Jan 2022, Stefan Ritt, Bug Report, Writting MIDAS Events via FPGAs  dummy_fe.cpp
I finally got the dummy program working. There were several issues:

- event_buffer_size was defined as 10000 * 32 MB = 320 GB, exceeding the RAM of the computer

- SERIAL number starting with 1. Actually in midas, event serial numbers always started with zero, but this was wrong in the documentation at 
https://midas.triumf.ca/MidasWiki/index.php/Event_Structure, so I also fixed the documentation

- the event header time stamp must be seconds since 1.1.1970, and thus the function ss_time() should be used to set it

- calling set_equipment_status() for each event slows down the event collection considerably, since this function access the ODB each time

- dma_buf_dummy is defined inside the event loop, so it gets allocated and de-allocated on the stack for each event. Of course this might vanish 
when the real FPGA buffer will be used.

- The line pdata+=sizeof(dma_buf_dummy); is wrong. pdata is pointer to uint32_t, but the sizeof() operation returns the size of the 
dma_buf_dummy in bytes. Therefore, pdata gets incremented by four times the size of dma_buf_dummy

- Instead the call to std::this_thread::sleep_for(std::chrono::milliseconds(2000)); one can call the standard midas call ss_sleep(2000); which 
is a bit shorter

- Finally, sending many events to the ring buffer triggered a bug in the midas ring buffer functions which were lingering there since 2007. I'm 
glad that this happened and now could be fixed. Not sure if other experiments where affected in the last decade by that. This could have 
manifested itself in lost events or crashing front-ends. Anyhow, now it's fixed. You need to update midas to get the fix.

I attached a working version of the dummy program for your reference. Banks a different but the principle should become clear.

Stefan
       Reply  16 Feb 2022, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  
I just came back to this and started to use the dummy frontend.
Unfortunately, I have a problem during run cycles: 

Starting the frontend and starting a run works fine -> seeing events with mdump and also on the web GUI. 
But when I stop the run and try to start the next run the frontend is sending no events anymore.
It get stuck at line 221 (if (status == DB_TIMEOUT)).
I tried to reduce the nEvents to 1 which helped in terms of DB_TIMEOUT but still I don't get any events after I did a stop / start cycle -> no events in mdump and no events counting up at the web GUI.
If I kill the frontend in the terminal (ctrl+c) and restart it, while the run is still running, it starts to send events again.

Cheers,
Marius
          Reply  03 Mar 2022, Stefan Ritt, Bug Report, Writting MIDAS Events via FPGAs  
> Starting the frontend and starting a run works fine -> seeing events with mdump and also on the web GUI. 
> But when I stop the run and try to start the next run the frontend is sending no events anymore.
> It get stuck at line 221 (if (status == DB_TIMEOUT)).
> I tried to reduce the nEvents to 1 which helped in terms of DB_TIMEOUT but still I don't get any events after I did a stop / start cycle -> no events in mdump and no events counting up at the web GUI.
> If I kill the frontend in the terminal (ctrl+c) and restart it, while the run is still running, it starts to send events again.

This problem has (likely) been fixed in the current version. Please pull develop and try again. Was a recursive call to the event collection routine which is only triggered if you send events faster than 
the logger can digest, so not many people see it.

Best,
Stefan
             Reply  07 Mar 2022, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  
> This problem has (likely) been fixed in the current version. Please pull develop and try again. Was a recursive call to the event collection routine which is only triggered if you send events faster than 
> the logger can digest, so not many people see it.

I just pulled the current version (d945fa9) but the problem as explained in 2347 stays the same.

Best,
Marius
                Reply  25 Mar 2022, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  
I finally found the problem why the readout stops after a run transition. 

In my dummy frontend the serial number was not reset to zero at run start. 
This leads to a mismatch of the serial number in the function receive_trigger_event of mfe.cxx:1247.
Which is than resulting in the problem that the function founds never a new event in all ring buffers and nothing get read out of the buffer.

Nevertheless, it would be nice that the system would tell the user that there is a mismatch in the serial number (printing a warning / error etc.). 

Cheers,
Marius
Entry  23 Mar 2022, Konstantin Olchanski, Bug Fix, mhttpd bug fixed 
the mhttpd bug should be fixed now (branch feature/buffer_mutex).

simplest way to reproduce:

wget http://localhost:8080/
quickly ctrl-C it
wget http://localhost:8080/
inside mhttpd (by hook or crook) observe that the second wget got the data meant for the first wget.

if you cannot ctrl-C the first wget quickly enough, put a sleep somewhere in the worker thread (in 
mongoose_write(), I think).

this is what happens.

1st wget stops (by ctrl-C), socket is closed, mongoose frees it's mg_connection object
(corresponding worker is still labouring, hmm... actually sleeping, and now has a stale nc pointer)

2nd wget starts, new socket is opened, mongoose allocates a new mg_connection object,
but malloc() gives it back the same memory we just freed(), and the 1st wget's worker thread
nc pointer is no longer stale, but points to 2nd wget's connection.

so we think we are clever and we check the socket file descriptors. but same thing
happens there, too. if 1st wget was file descriptor 7, it is closed, (1st wget worker now has
a stale file handle), then reopened for the 2nd wget, per POSIX, we get back the same
file descriptor 7. 1st wget worker now has the file handle for the 2nd wget tcp socket and
the famous test/crash for "sending data to wrong socket" is defeated.

now, worker thread for the 1st wget wants to send a reply, it has a valid nc pointer (points to 2nd wget's
mg_connection object) and a valid file descriptor (points to 2nd wget's tcp socket),
reply meant for the 1st wget is successfully sent to the 2nd wget, 2nd wget finishes, it's socket
is closed, mg_connection object is free'ed. Now the worker thread for the 2nd wget has stale
connection info, but this is okey, mongoose does not find a matching connection, 2nd wget
worked thread reply goes nowhere, thread finishes silently (no memory leaks here, I checked).

so, connection for 2nd wget completely impersonates the closed connection of 1st wget (I guess I could
check the full socket address info, remote ip address, remote port number, etc, but...)

in practice, this bug does not happen often because modern browsers tend to keep tcp sockets open
for very long time. (not sure about sundry web proxies, etc).

solution of course is very simple. match worker thread data to mongoose mg_connection objects
using our own connection sequential number, which are unique and very easy to keep track
of through the mongoose event handler. all this mess runs in the main thread,
so no locking trouble here, small blessing.

K.O.
    Reply  24 Mar 2022, Stefan Ritt, Bug Fix, mhttpd bug fixed 
> 1st wget stops (by ctrl-C), socket is closed, mongoose frees it's mg_connection object
> (corresponding worker is still labouring, hmm... actually sleeping, and now has a stale nc pointer)
> 
> 2nd wget starts, new socket is opened, mongoose allocates a new mg_connection object,
> but malloc() gives it back the same memory we just freed(), and the 1st wget's worker thread
> nc pointer is no longer stale, but points to 2nd wget's connection.

Why don't we CLEAR the memory (memset(object,0,sizeof(object)) before the free(), this way it cannot be 
mistakenly re-used by the next thread.

Stefan
       Reply  24 Mar 2022, Konstantin Olchanski, Bug Fix, mhttpd bug fixed 
> > 1st wget stops (by ctrl-C), socket is closed, mongoose frees it's mg_connection object
> > (corresponding worker is still labouring, hmm... actually sleeping, and now has a stale nc pointer)
> > 
> > 2nd wget starts, new socket is opened, mongoose allocates a new mg_connection object,
> > but malloc() gives it back the same memory we just freed(), and the 1st wget's worker thread
> > nc pointer is no longer stale, but points to 2nd wget's connection.
> 
> Why don't we CLEAR the memory (memset(object,0,sizeof(object)) before the free(), this way it cannot be 
> mistakenly re-used by the next thread.
> 

My description was unclear. I will try better now.

When http replies are generated by worker threads, matching of reply to mg_connection is done
by checking the address of the mg_connection object. (mongoose itself unhelpfully offers
to send the reply to every mg_connection, see the responder to mg_broadcast() messages).

This works for open/active connections, addresses of all mg_connections are unique.

But if connection is closed and a new connection is opened, the address is reused (by malloc()/free()
reusing memory blocks or by mongoose using a pool of mg_connection objects, does not matter).

So matching http reply to mg_connection using only address of mg_connection can match the wrong connection.

(contents of mg_connection object does not matter, only address is used by matching. so memzero() of
mg_connection object does not help).

I saw this during my testing - wrong data was sent to wrong browser often enough - but did
not understand that the above problem is happening.

Because I was unable to reliably reproduce the problem, I could not debug it. I tried to add
a check for the tcp socket file descriptor number, in case there is a straight bug or multithread race
or simple memory corruption. This replaced "we sent wrong data to wrong browser, poisoned browser
cache, confused the user" with a crash. This "fix" seemed effective at the time.

Maybe I should mention browser cache poisoning again. What happened is html pages and rpc replies
were returned as responses to load things like CSS files, these bad responses are cached by the browser
pretty much forever, so all subsequent midas pages will look wrong (bad css!) forever, until
user manually clears browser cache. reload of page did not help, restart of browser did not help (I think).

So a very bad bug.

Unfortunately, the check for file descriptor was not effective because file descriptors are also
reused. And I did see wrong data returned by mhttpd, but even more rarely. And everybody (myself
included) complained about mhttpd crashes.

Now, matching of responses to connections is done by connection sequential/serial number,
which is unique 32-bit counter. Mismatch of reply to connection should not happen again.

P.S. Latest version of the mongoose web server library does not help with this problem,
the example code for matching reply to connection in their multithread example looks bogus:
https://github.com/cesanta/mongoose/blob/master/examples/multi-threaded/main.c

K.O.
          Reply  24 Mar 2022, Stefan Ritt, Bug Fix, mhttpd bug fixed 
I see, now I understand.

As for the browser cache problem: This Chrome extension is your friend: 

https://chrome.google.com/webstore/detail/clear-cache/cppjkneekbjaeellbfkmgnhonkkjfpdn?hl=en

I use it all the time I change the CSS or a JS file. Having the "Developer Tools" open in Chrome helps as well 
(cache is then turned off). Firefox has similar extensions.

Stefan
             Reply  24 Mar 2022, Konstantin Olchanski, Bug Fix, mhttpd bug fixed 
> As for the browser cache problem: This Chrome extension is your friend ...

for google chrome, it is easy, open the javascript debugger (left-click "inspect"),
the reload button becomes a left-click menu, one left-click option is "clear cache and reload".
(there is no button for "clear cookies and reload", re recent elog cookie problem).

but this does not help me personally any. if midas web pages get confused, I will also get confused, too,
and I will spend hours debugging mhttpd before thinking "hmm... maybe I should clear the browser cache!"

not sure about firefox, safari, microsoft edge and opera. if I ever need it, I google it.

K.O.
Entry  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
Dear all

We just started our beam time at ILL and just found yesterday that for certain 
settings of our detector the data is not saved into the .mid files. Running "mdump 
-l 10" online we see the data coming in as they should. Nevertheless, if we run 
"mdump -x runXXXXXX.mid" offline, the data file has no events and the banks are 
missing. Any ideas where the data could go lost?

Thanks in advance,
Ivo
    Reply  10 Aug 2020, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid 
> Dear all
> 
> We just started our beam time at ILL and just found yesterday that for certain 
> settings of our detector the data is not saved into the .mid files. Running "mdump 
> -l 10" online we see the data coming in as they should. Nevertheless, if we run 
> "mdump -x runXXXXXX.mid" offline, the data file has no events and the banks are 
> missing. Any ideas where the data could go lost?
> 
> Thanks in advance,
> Ivo

Have you checked 

/Logger/Channels/0/Settings/Event ID = -1
/Logger/Channels/0/Settings/Trigger mask = -1

If these settings are not -1, they filter the data stream for certain events and trigger 
masks.

Stefan
       Reply  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> > Dear all
> > 
> > We just started our beam time at ILL and just found yesterday that for certain 
> > settings of our detector the data is not saved into the .mid files. Running "mdump 
> > -l 10" online we see the data coming in as they should. Nevertheless, if we run 
> > "mdump -x runXXXXXX.mid" offline, the data file has no events and the banks are 
> > missing. Any ideas where the data could go lost?
> > 
> > Thanks in advance,
> > Ivo
> 
> Have you checked 
> 
> /Logger/Channels/0/Settings/Event ID = -1
> /Logger/Channels/0/Settings/Trigger mask = -1
> 
> If these settings are not -1, they filter the data stream for certain events and trigger 
> masks.
> 
> Stefan

Good morning Stefan

Both set to -1. We only have one logging channel. If we run a sequence with a few runs and the 
same settings, sometimes data is in the .mid file and sometimes it is not.

Best,
Ivo 
          Reply  10 Aug 2020, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid 
> Both set to -1. We only have one logging channel. If we run a sequence with a few runs and the 
> same settings, sometimes data is in the .mid file and sometimes it is not.

Then I'm running out of ideas. Things I would check:

- Are the file sizes about the same? 

- When you dump the .mid file, you do you see your bank names? 

This would tell you if the events are really missing or if mdump would just not find them.

But I guess without being able to debug the system at ILL I cannot be of any more help. You are the 
first one reporting such a problem, so it must have to do with your local setup.

Stefan
             Reply  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> Then I'm running out of ideas. Things I would check:
> 
> - Are the file sizes about the same? 
> 
> - When you dump the .mid file, you do you see your bank names? 
> 
> This would tell you if the events are really missing or if mdump would just not find them.
> 
> But I guess without being able to debug the system at ILL I cannot be of any more help. You are the 
> first one reporting such a problem, so it must have to do with your local setup.
> 
> Stefan

So I did a quick check. The file size is about the same (322K and 329K). When I dump the .mid I don't see 
the banks. It only prints two lines with "------ Event# 0 ------" and "------ Event# 1 ------" whereas for 
the file with data I get the two banks with all the data. Our online analyzer also fails to see the banks. 
Is there another way to check what is in the .mid file?

Best,
Ivo
                Reply  10 Aug 2020, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid 
> So I did a quick check. The file size is about the same (322K and 329K). When I dump the .mid I don't see 
> the banks. It only prints two lines with "------ Event# 0 ------" and "------ Event# 1 ------" whereas for 
> the file with data I get the two banks with all the data. Our online analyzer also fails to see the banks. 
> Is there another way to check what is in the .mid file?

with "dump" I meant a true object dump like "hexdump -C run000001.mid". I produced a file with ADC0 and TDC0 
banks (that's the example from the distribution under exampels/experiments/frontend.cxx), and I get

....
00024220  01 00 00 00 41 44 43 30  04 00 08 00 eb 06 35 04  |....ADC0......5.|
00024230  31 09 4f 06 54 44 43 30  04 00 08 00 93 04 fb 07  |1.O.TDC0........|
00024240  5c 09 88 0b 01 00 00 00  01 00 00 00 2a 0b 31 5f  |\...........*.1_|
00024250  28 00 00 00 20 00 00 00  01 00 00 00 41 44 43 30  |(... .......ADC0|
00024260  04 00 08 00 c3 09 24 05  85 05 f3 06 54 44 43 30  |......$.....TDC0|
00024270  04 00 08 00 88 08 2d 03  3b 0d d6 02 01 00 00 00  |......-.;.......|
00024280  02 00 00 00 2a 0b 31 5f  28 00 00 00 20 00 00 00  |....*.1_(... ...|
00024290  01 00 00 00 41 44 43 30  04 00 08 00 a5 0a 69 09  |....ADC0......i.|

where you clearly see the ADC0 and TDC0 banks.

Stefan
                   Reply  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> with "dump" I meant a true object dump like "hexdump -C run000001.mid". I produced a file with ADC0 and TDC0 
> banks (that's the example from the distribution under exampels/experiments/frontend.cxx), and I get
> 
> ....
> 00024220  01 00 00 00 41 44 43 30  04 00 08 00 eb 06 35 04  |....ADC0......5.|
> 00024230  31 09 4f 06 54 44 43 30  04 00 08 00 93 04 fb 07  |1.O.TDC0........|
> 00024240  5c 09 88 0b 01 00 00 00  01 00 00 00 2a 0b 31 5f  |\...........*.1_|
> 00024250  28 00 00 00 20 00 00 00  01 00 00 00 41 44 43 30  |(... .......ADC0|
> 00024260  04 00 08 00 c3 09 24 05  85 05 f3 06 54 44 43 30  |......$.....TDC0|
> 00024270  04 00 08 00 88 08 2d 03  3b 0d d6 02 01 00 00 00  |......-.;.......|
> 00024280  02 00 00 00 2a 0b 31 5f  28 00 00 00 20 00 00 00  |....*.1_(... ...|
> 00024290  01 00 00 00 41 44 43 30  04 00 08 00 a5 0a 69 09  |....ADC0......i.|
> 
> where you clearly see the ADC0 and TDC0 banks.
> 
> Stefan

So at least I learned something new. I tried it with the hexdump and the banks are not existent in the .mid file. I 
only have the ODB inside the file. The 7K difference in size is actually just about what I expect to be the data 
(1792 x 4 bytes)

Best, Ivo
                      Reply  10 Aug 2020, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid 
Have you tried longer files? Maybe a few 100 MB or so. Maybe a buffer is not flushed correctly at the end of a run.
                         Reply  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> Have you tried longer files? Maybe a few 100 MB or so. Maybe a buffer is not flushed correctly at the end of a run.

Yes, I did. This 7 KB of the data bank is about the limit. If we go only 1 KB higher it seems that we save all data. In 
our specific case, this is the number of time bins (256 pixels with 7 time bins results in data loss, with 8 time bins it 
seems to be okay, data type is DWORD). 

Of course, a workaround for us is to save at least 8 time bins and throw 7 of them away later on. Nevertheless, since we 
are only in the commissioning phase now this is okay, I would just like to avoid data loss in the data taking phase of the 
experiment so knowing where the problem origins could help. 

I did another test with another FE running that produces a lot of data. The behavior is the same though. If the bank size 
is less than about 8 KB, the bank is not saved anymore. But probably this is anyway the expected behavior since it is a 
different FE that produces the data. 

So if it is coming from the buffer, is there something I could change to test or solve the problem?

Best, Ivo
                            Reply  10 Aug 2020, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid 
I have to reproduce the problem to fix it. Why don't you go and modify midas/examples/experiment/frontend.cxx in such a way that 
it creates exactly the banks you have, just with random data. If you see the same problem, send me your frontend file so that I 
can reproduce it.
                               Reply  11 Aug 2020, Konstantin Olchanski, Bug Report, data missing in runXXXXXX.mid 
> I have to reproduce the problem to fix it. Why don't you go and modify midas/examples/experiment/frontend.cxx in such a way that 
> it creates exactly the banks you have, just with random data. If you see the same problem, send me your frontend file so that I 
> can reproduce it.

It would be good to pin point there the data is lost. This is the sequence:

frontend user code -> mfe.c code -> SYSTEM buffer -> mlogger -> disk

To see if correct data arrives to the SYSTEM buffer, run:
mdump -z SYSTEM

To see if mlogger is receiving events from the SYSTEM buffer, run:
mlogger -v ### mlogger should report all events, history and data

To see if mlogger writes events to disk, examine the disk file (in this case, you already did, data is not there).

I would guess that your data does not make it out from the frontend (mdump shows "nothing"),
if data were to arrive into the SYSTEM buffer, it would make it to disk, unless
mlogger is misconfigured (but you already checked that).

If you have trouble with the frontend framework code, you can try to switch from the mfe.c frontend
to the newer c++ tmfe frontend (see progs/fetest_tmfe.cxx and progs/fetest_tmfe_thread.cxx).

K.O.
                                  Reply  11 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> It would be good to pin point there the data is lost. This is the sequence:
> 
> frontend user code -> mfe.c code -> SYSTEM buffer -> mlogger -> disk
> 
> To see if correct data arrives to the SYSTEM buffer, run:
> mdump -z SYSTEM
> 
> To see if mlogger is receiving events from the SYSTEM buffer, run:
> mlogger -v ### mlogger should report all events, history and data
> 
> To see if mlogger writes events to disk, examine the disk file (in this case, you already did, data is not there).
> 
> I would guess that your data does not make it out from the frontend (mdump shows "nothing"),
> if data were to arrive into the SYSTEM buffer, it would make it to disk, unless
> mlogger is misconfigured (but you already checked that).
> 
> If you have trouble with the frontend framework code, you can try to switch from the mfe.c frontend
> to the newer c++ tmfe frontend (see progs/fetest_tmfe.cxx and progs/fetest_tmfe_thread.cxx).
> 
> K.O.

Good evening

I tried to reproduce the behavior in a very simple FE but it did not work out. The next thing for me would be to take the FE that is producing this behavior, replace all the device communication and data with dummies. If the problem is still there I would start to simplify as much as possible. 

Following the inputs of KO, I pin-pointed the data loss. The system buffer still gets the data but the mlogger does not write the data event. Then of course the data is also not anymore present in the data file. Therefore, I checked the logger settings again, Event ID and Trigger Mask still -1. Nothing else, at least from my point of view, that is misconfigured. Nevertheless, if it helps I can send my ODB settings. 

When doing the tests just before I found something else that probably can give a hint to the problem. The data is only lost if the time between two runs is long (a few seconds). As an example: If I run a sequence with a loop and after the FE stops the run the loop ends and the next run is started automatically, then only the first run has no data, which is the one after a longer time of no data taking. When I add a "WAIT Seconds 5" after the run before starting the next, not data is written to the disk for any run. I also found this once when adding a sleep(1) at the end of the FE readout function but back then did not think about it any further. 

Best, Ivo
                                     Reply  24 Mar 2022, Konstantin Olchanski, Bug Report, data missing in runXXXXXX.mid 
> > It would be good to pin point there the data is lost. This is the sequence:
> > 
> > frontend user code -> mfe.c code -> SYSTEM buffer -> mlogger -> disk
> > 
> > To see if correct data arrives to the SYSTEM buffer, run:
> > mdump -z SYSTEM
> > 
> > To see if mlogger is receiving events from the SYSTEM buffer, run:
> > mlogger -v ### mlogger should report all events, history and data
> > 
> > To see if mlogger writes events to disk, examine the disk file (in this case, you already did, data is not there).
> > 
> > I would guess that your data does not make it out from the frontend (mdump shows "nothing"),
> > if data were to arrive into the SYSTEM buffer, it would make it to disk, unless
> > mlogger is misconfigured (but you already checked that).
> > 
> > If you have trouble with the frontend framework code, you can try to switch from the mfe.c frontend
> > to the newer c++ tmfe frontend (see progs/fetest_tmfe.cxx and progs/fetest_tmfe_thread.cxx).
> > 
> > K.O.
> 
> Good evening
> 
> I tried to reproduce the behavior in a very simple FE but it did not work out.
> The next thing for me would be to take the FE that is producing this behavior,
> replace all the device communication and data with dummies. If the problem is still
> there I would start to simplify as much as possible. 
> 
> Following the inputs of KO, I pin-pointed the data loss. The system buffer still
> gets the data but the mlogger does not write the data event. Then of course the data
> is also not anymore present in the data file. Therefore, I checked the logger
> settings again, Event ID and Trigger Mask still -1. Nothing else, at least from my point of view,
> that is misconfigured. Nevertheless, if it helps I can send my ODB settings. 
> 
> When doing the tests just before I found something else that probably
> can give a hint to the problem. The data is only lost if the time between
> two runs is long (a few seconds). As an example: If I run a sequence with a loop
> and after the FE stops the run the loop ends and the next run is started automatically,
> then only the first run has no data, which is the one after a longer time of
> no data taking. When I add a "WAIT Seconds 5" after the run before starting
> the next, not data is written to the disk for any run. I also found this
> once when adding a sleep(1) at the end of the FE readout function
> but back then did not think about it any further. 
> 

Looks like this problem fell into the covid crack.

As far as I know, MIDAS does not lose any events between bm_send_event() and the shared memory
buffer. It does not lose any events in the mlogger (unless the "event request" is misconfigured).
(there is lots of opportunity to lose events in complicated frontends).

If you have some evidence otherwise, I would very much like to hear about
it and I want to fix all problems that cause it.

In your previous report I was under the impression that you lose random events here and there,
but your latest report is about mlogger not writing anything at all.

Which case is it?

If you can definitely say that all your events make it to the SYSTEM buffer
but mlogger sometimes does not see some of them and sometimes does not see all of them,
we should look very closely at bm_receive_event() and mlogger itself.

In the case where mlogger is not seeing any events at all (output file is empty), as this is
happening, I would like to see the output of mdump (to confirm events are written to SYSTEM
buffer with correct event_id and trigger_mask) and the output of (say)
"manalyzer_test.exe --dump run01161.mid.lz4" on your output file.

If the output is very long, you can email it to me directly instead of posting it here.

K.O.
                                        Reply  24 Mar 2022, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid 
One idea: we should have a look at mlogger::close_channels(). There the SYSTEM buffer is emptied through the cm_yield() call. Instrumenting this with some debugging code will enlighten us. 

Another possible problem: If the frontend requested to be notified for a run stop AFTER the logger, then the problem might happen: Logger closes file, and THEN the frontend flushes events ending up in the SYSTEM buffer and being logged at the beginning of the next run. The mfe.cxx framework takes care of this by calling 

cm_register_transition(TR_SOP, 500);

while the mlogger does 

cm_register_transition(TR_STOP, tr_stop, 800);

and since 800 > 500 the logger will be called AFTER the frontend. If one use a framework different from mfe.cxx, this could however be different.

Stefan
                                           Reply  24 Mar 2022, Konstantin Olchanski, Bug Report, data missing in runXXXXXX.mid 
> One idea: we should have a look at mlogger::close_channels().
> There the SYSTEM buffer is emptied through the cm_yield() call.
> Instrumenting this with some debugging code will enlighten us.

right. this will "last few events are lost at the end of run".

but that code in the mlogger was not touched in years, if there is a problem there,
we would have seen it by now, most experiments check that the number
of events in the data file is same as number of triggers generated, both
numbers are shown on the midas status page.

> Another possible problem: If the frontend requested to be notified for a run stop AFTER the logger, then the problem might happen: Logger closes file, and THEN the frontend flushes events ending up in the SYSTEM buffer and being logged at the beginning of the next run. The mfe.cxx framework takes care of this by calling 
> cm_register_transition(TR_STOP, 500);

default sequence, both mfe.c frontend and c++ tmfe frontend:

start of run:
- mlogger first (configure history, open data file)
- frontends last
- (if any frontend fails, TR_STARTABORT is sent to mlogger to close the output file and "undo" the run start)

end of run:
- frontends first (must not send any events after after processing the TR_STOP RPC call, inside the TR_STOP handler, bm_flush_cache() takes care of the write cache)
- mlogger last
- (if any frontend fails, failure is ignored, run stops regardless)

wrong order will be only if they manually change it, and whatever order
they set, you see it on the midas transition page (and mtransition -v and odbedit stop now -v, etc).

K.O.
Entry  22 Mar 2022, Konstantin Olchanski, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
multithreaded frontends have an unusual event buffer corruption if the write 
cache is enabled. For a long time now I had to disable the write cache on
all multithreaded frontends in alpha-g, I was hitting this bug quite often.
(somehow I do not see this problem reported on bitbucket!)

last week I reworked the multithread locking of event buffers, in hope
that this bug will turn up, but nope, all mutexes and locking look okey,
except for a number of unrelated problems (races against bm_close_buffer()
were the most troublesome to fix).

but finally found the trouble.

first, some background.

because multiprocess locking is expensive, frontends that generate
a large number of small events can use the write cache to reduce
this overhead. instead of locking the shared memory event buffer for
each event, events are accumulated in the write cache, and periodic
calls to bm_flush_buffer() flush them to shared memory. For best effect,
one should increase the size of the write cache until lock rate is around
10/second.

it turns out introduction of multithreading broke bm_flush_cache().

it does this:

- int ask_free = pbuf->wp; // how much data we have in the write cache now
- call bm_wait_for_free_space(ask_free); // ensure we have this much free shared 
memory space
- copy pbuf->wp worth if events to shared memory

looks okey at first sight. this is what happens to trigger the bug:

- int ask_free = pbuf->wp; // ok
- call bm_wait_for_free_space(ask_free); // ok, but if shared memory is full, it 
will go to sleep waiting for free space
- in the mean time, another thread calls bm_send_event(), this adds more data to 
the write cache, moves pbuf->wp
- bm_wait_for_free_space() eventually returns
- copy pbuf->wp worth of data to shared memo KABOOM! shared memory corruption!

we just overwrote some unlucky event in shared memory: we only have "ask_free"
free bytes available, but pbuf->wp moved and now has more data,
and it does not fit, and there is no check against it.

of course in the single threaded world this bug did not exist, there was no 
other thread to call bm_send_event() while bm_flush_cache() is sleeping.

the obvious fix is to ask for more free space if cached data does not fit.

this is now implemented on the branch feature/buffer_mutex. after a bit more 
tested I will merge it into develop.

so that's it?

not so fast. there was more going on. as described, the bug will only happen
when shared memory event buffer is full. (i.e. rarely or never). It turns
out the old version of thread locking code was defective and permitted
a race between bm_send_event() and bm_send_event() in another thread:

thread 1: while (1) { bm_send_event(very small event); }
thread 2:
-> bm_send_event(very big event)
-> no space in the cache for the very big event, call bm_flush_cache()
-> bm_flush_cache() asks bm_wait_for_free_space() to make space for cached data
-> this was done with write cache mutex released (mistake!)
-> at the same time bm_send_event(very small event) added 1 more small event to 
the cache
-> back in bm_flush_cache() write cache mutex is locked correctly, we copy 
cached data to shared memory and again KABOOM because we now have more data than 
we asked free space for.

So in the original implementation, corruption was possible even when share 
memory event buffer was pretty much empty.

The reworked locking code closed that loop hole - bm_flush_cache() is now
called with write cache locked, and bm_send_event() from another thread
cannot confuse things, unless shared memory buffer is full and we go to
sleep inside bm_wait_for_free_space(). And this is now fixed, too.

K.O.
    Reply  22 Mar 2022, Stefan Ritt, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
Thanks Konstantin for your detailed description. 

I wonder why we never saw this problem at PSI. Here is the reason: In multil-threaded environments, we never call bm_send_event() directly 
from all threads (since in the old days nothing was thread safe in midas). Instead, we use a collector thread which gets all events via the 
rb_xxx functions from the individual readout threads. This is well integrated into the mfe.cxx framework. Look at examples/mtfe/mfte.cxx. 
Each thread does (simplified):

while (true) {
  do {
     status = rb_get_wp(&pevent);
  } while (status == DB_TIMEOUT)

  bm_compose_event_threadsafe(pevent, ..., &serial_number);
  bk_init32(pevent+1);
  ... fill event ...
  bk_close(pevent)

  rb_increment_wp(sizeof(EVENT_HEADER) + pevent->data_size);
}

The framework now collects all these events in receive_trigger_event() which runs in the main thread:

  for (i=0 ; i<n_thread ; i++) {
     rb_get_rp(i, pevent);
     if (pevent->serial_number == prev_serial+1)
        break;
  }
  
  prev_serial = pevent->serial_number;
  rpc_send_event(pevent);
  rb_increment_rp(sizeof(EVENT_HEADER) + pevent->data_size);

This code ensures that all events are in the right sequence (before the serial numbers where mixed up) and that all events are sent only 
from a single thread, so the write buffer can be used effectively without complicated multi-thread locks.

This solution works nicely at PSI since many years, maybe you should put some thought to use it in your tmfe framework in Alpha-g as well 
instead of struggling with all your locks.

Stefan
       Reply  23 Mar 2022, Konstantin Olchanski, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
I confirm, there is no problem in single-threaded programs, and
there is no problem if all bm_send_event() and bm_flush_cache() are called
from the same thread.

> ... instead of struggling with all your locks.

it is better to have midas fully thread safe. ODB has been so for a long time,
event buffer partially (except for this bug), now fully.

without that the problem still exists, because in many frontends,
bm_flush_buffer() is called from the main thread, and will race
against the "bm_send_event() thread". Of course you can do
everything on the main thread, but this opens you to RPC timeouts
during run transitions (if you sleep in bm_wait_for_free_space()).

also the SYSMSG buffer is subject to the same bug. cm_msg() is of course
safe to call from anywhere, but cm_msg_flush_buffer() and cm_periodic_tasks()
can be called from any thread, and they issue bm_send_event(SYSMSG),
and there will be mysterious crashes and SYSMSG corruptions, probably
only during message storms, but still!

K.O.
          Reply  24 Mar 2022, Stefan Ritt, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
> > ... instead of struggling with all your locks.
> 
> it is better to have midas fully thread safe. ODB has been so for a long time,
> event buffer partially (except for this bug), now fully.
> 
> without that the problem still exists, because in many frontends,
> bm_flush_buffer() is called from the main thread, and will race
> against the "bm_send_event() thread". Of course you can do
> everything on the main thread, but this opens you to RPC timeouts
> during run transitions (if you sleep in bm_wait_for_free_space()).

Just for the record: in the mfe.cxx framework both bm_send_event() and 
bm_flush_buffer() are called from the main thread, as can be seen in the 
midas/examples/mtfe/mtfe.cxx example.

But I agree that having all buffer operations thread safe is a clear benefit.

Stefan
    Reply  23 Mar 2022, Ivo Schulthess, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
Thanks for the investigation. Back in 2020, we had some issues of losing data between the system buffer and the logger writing them to disk (https://daq00.triumf.ca/elog-midas/Midas/1966). This was polled equipment but we had a multithreaded FE running at the same time. Could this be related to the same problem?

Best, Ivo
       Reply  24 Mar 2022, Konstantin Olchanski, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
> Thanks for the investigation. Back in 2020, we had some issues
> of losing data between the system buffer and the logger writing them
> to disk (https://daq00.triumf.ca/elog-midas/Midas/1966). This was polled equipment
> but we had a multithreaded FE running at the same time. Could this be related to the same problem?

I think we will have to follow up on your problem 1966 separately.

I think this bug cannot lose events. Writing events to the write cache has correct
locking, no loss here. writing write cache to shared memory has correct locking,
no loss there. the bug will cause the *next* event in the event buffer to be overwritten,
this will be detected by most programs as shared memory corruption and everybody
will quit. (mhttpd, mserver, odbedit will probably survive).

I guess there could be unlucky corruption that looks like nothing was corrupted,
but this will affect only a few events right at the shared memory read/write
pointer, it so happens that they are the oldest events in the buffer and likely
mlogger already wrote them to disk. mlogger read pointer will likely follow
the shared memory write pointer closely, well ahead of the shared memory
read pointer which always pointe to the older event and where this bug's corruption
will happen.

So no, I do not think this bug can cause event loss between frontend and mlogger.

K.O.
Entry  23 Mar 2022, Hunter Lowe, Forum, ODB has issue with example analyzer 
Trying to play with midas file but I get error:

[Analyzer,ERROR] [odb.cxx:845:db_validate_name,ERROR] Invalid name "/Analyzer/Tests/low_sum/Rate [Hz]" passed to db_create_key_wlocked: should not contain "["

I'm not sure what sets the name so I'm not sure how to fix this.

Thanks
Entry  15 Mar 2022, Konstantin Olchanski, Bug Fix, mhttpd ipv6 bind should be fixed now 
Something changed after my initial implementation of ipv6 in mhttpd
and listening to ipv6 http/https connections was broken.

It turns out, I do not need to listen to both ipv4 and ipv6 sockets,
it is sufficient to listen to just ipv6. ipv4 connections will also
magically work. see linux kernel "bindv6only" sysctl setting: https://sysctl-
explorer.net/net/ipv6/bindv6only/

The specific bug in mhttpd was to bind to ipv4 socket first, subsequent bind() to ipv6 socket 
fails with error "Address already in use", which is silent, not reported by the mongoose library. 
For reasons unknown, this does not happen to bind() to "localhost" aka ipv6 "::1".

Apparently other web servers (apache, nginx) are/were also affected by this problem. 
https://chrisjean.com/fix-nginx-emerg-bind-to-80-failed-98-address-already-in-use/

First fix was to bind to ipv6 first (success) and to ipv4 second (fails). Second fix
committed to git is to only listen to ipv6.

This works both on MacOS and on Linux. Linux reports the listener socket is "tcp6", MacOS reports 
the listener socket as "tcp46":

4ed0:javascript1 olchansk$ netstat -an | grep 808 | grep LISTEN
tcp46      0      0  *.8081                 *.*                    LISTEN     
tcp6       0      0  ::1.8080               *.*                    LISTEN     
tcp4       0      0  127.0.0.1.8080         *.*                    LISTEN     
4ed0:javascript1 olchansk$ 

K.O.
    Reply  23 Mar 2022, Konstantin Olchanski, Bug Fix, mhttpd ipv6 bind should be fixed now 
> Something changed after my initial implementation of ipv6 in mhttpd
> and listening to ipv6 http/https connections was broken.

Reporting that mhttpd ipv6 works at CERN. The hostnames for ipv6 connections
come back as alphacpc05.ipv6.cern.ch instead of alphacpc05.cern.ch
so both are added to the http "insecure port" whitelist.

K.O.
Entry  10 Mar 2022, Gennaro Tortone, Bug Report, Python ODB watch 
Hi,

I have an issue with ODB watch on MIDAS Python library;

I wrote a simple frontend that read/write FPGA registers through 
ODB keys (simplified version at link below):

https://gist.github.com/gtortone/cd035a9ac4ea7a78ea9cd931e80e2c75

Everything works fine but there is a boolean array
in Settings (Enable ADC sampling) that I need to "toggle" 
(19 bit to 0 and 19 bit to 1). This operation is handled by
detailed_settings_changed_func that write the value of
toggled bit to FPGA.

The issue is that if I quickly toggle the boolean array by
odbedit:

set "/Equipment/odbtest/Settings/Enable ADC sampling[0-18]" 0
set "/Equipment/odbtest/Settings/Enable ADC sampling[0-18]" 1

I see in the Python script the following list of callbacks:

detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[0] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[1] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[2] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[3] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[4] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[5] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[6] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[7] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[8] - new value 1    ***
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[9] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[10] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[11] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[12] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[13] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[14] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[15] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[16] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[17] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[18] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[0] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[1] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[2] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[3] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[4] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[5] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[6] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[7] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[8] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[9] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[10] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[11] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[12] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[13] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[14] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[15] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[16] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[17] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[18] - new value 1

It seems that the second write operation "overlaps" the first one...

The same behavior is not observed using a 'watch' in odbedit...

I can overcame this problem using the value of register as ODB key to avoid 
array of boolean... but I report this issue as "possible" bug/limitation on Python ODB watch;

Cheers,
Gennaro
    Reply  16 Mar 2022, Ben Smith, Bug Report, Python ODB watch 
> It seems that the second write operation "overlaps" the first one...

Hi Gennaro,

In principle the same issue can happen in C++ code, but is much less likely as the callbacks get executed  more quickly (partly due to C++/python in general, and partly because the python code does some extra work to make the interface more user-friendly). The C++ code at the end of this message adds a 100ms sleep to the callback and can result in output like this when you do quick edits of "Test[0-19]" in odbedit.

Element 1 is 0
Element 2 is 0
Element 3 is 0
Element 4 is 0
Element 5 is 0
Element 6 is 0
Element 7 is 1
Element 8 is 1
Element 9 is 1
etc...


I agree that this can be a really nasty source of bugs if you need to react to every change. I'll add a warning to the python docstrings, but I can't think of a way to make this more robust at the midas level - I think we'd need some sort of ODB "snapshot" system...









#include "midas.h"

void watch_fn(HNDLE hDB, HNDLE hKey, int index, void *info) {
  DWORD data = 0;
  INT buf_size = sizeof(data);
  db_get_data_index(hDB, hKey, &data, &buf_size, index, TID_DWORD);
  printf("Element %d is %u\n", index, data);
  ss_sleep(100);
}

int main() {
  HNDLE hDB, hClient, hTestKey;

  std::string host, expt;
  cm_get_environment(&host, &expt);
  cm_connect_experiment(host.c_str(), expt.c_str(), "test_odb", nullptr);
  cm_get_experiment_database(&hDB, &hClient);

  static const DWORD numValues = 20;
  DWORD data[numValues] = {};
  db_set_value(hDB, 0, "Test", data, sizeof(DWORD) * numValues, numValues, TID_DWORD);
  db_find_key(hDB, 0, "Test", &hTestKey);
  db_watch(hDB, hTestKey, watch_fn, nullptr);

  printf("Press any key to exit loop...\n");

  while (!ss_kbhit()) {
    cm_yield(1);
  }

  db_unwatch_all();
  db_delete_key(hDB, hTestKey, FALSE);
  cm_disconnect_experiment();

  return 0;
}
       Reply  21 Mar 2022, Stefan Ritt, Bug Report, Python ODB watch 
What you describe is a well-known problem with the ODB. At PSI we have similar issues. There are
two approaches to solve it:

1) Write values one-by-one to the ODB, but do not trigger a watch update. In the sequencer, this
can be achieved with the ODBSET command (see https://daq00.triumf.ca/MidasWiki/index.php/Sequencer 
and the last paragraph right of the ODBSET command). You use notify=0 for all set commands except
the last one where you use notify=1. On the C++ API, you can use db_set_data_index1() which has
this notify flag as the last parameter.

2) You add intelligence to your front-end. If you get a watchdog update, you do not apply this
directly to the hardware, but put it into a FIFO. Once you do not get any more update for a certain
period (like 1s is a good value), you empty the FIFO and apply all setting immediately.

Both methods have been used at PSI successfully, although 1) is much easier to implement, especially
if you use the midas sequencer.

Stefan
Entry  03 Mar 2022, Konstantin Olchanski, Info, manalyzer updated 
manalyzer was updated to latest version. mostly multi-threading improvements from 
Joseph and myself. K.O.
Entry  03 Mar 2022, Konstantin Olchanski, Info, zlib required, lz4 internal 
as of commit 8eb18e4ae9c57a8a802219b90d4dc218eb8fdefb, the gzip compression
library is required, not optional.

this fixes midas and manalyzer mis-build if the system gzip library
is accidentally not installed. (is there any situation where
gzip library is not installed on purpose?)

midas internal lz4 compression library was renamed to mlz4 to avoid collision
against system lz4 library (where present). lz4 files from midasio are now
used, lz4 files in midas/include and midas/src are removed.

I see that on recent versions of ubuntu we could switch to the system version 
of the lz4 library. however, on centos-7 systems it is usually not present
and it still is a supported and widely used platform, so we stay
with the midas-internal library for now.

K.O.
Entry  23 Feb 2022, Stefan Ritt, Info, Midas slow control event generation switched to 32-bit banks 
The midas slow control system class drivers automatically read their equipment and generate events containing midas banks. So far these have been 16-bit banks using bk_init(). But now more and more experiments use large amount of channels, so the 16-bit address space is exceeded. Until last week, there was even no check that this happens, leading to unpredictable crashes.

Therefore I switched the bank generation in the drivers generic.cxx, hv.cxx and multi.cxx to 32-bit banks via bk_init32(). This should be in principle transparent, since the midas bank functions automatically detect the bank type during reading. But I thought I let everybody know just in case.

Stefan
Entry  08 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue mslerror.PNG
Hi all,

I am having some issues getting the ODBINC command to work within the MIDAS 
sequencer. I am trying to increment one of the ODB values but it is returning a 
mismatch data-type size error (see attached image).

All the ODB variables are MIDAS data-type FLOAT and should all be 32-bit values, 
but for some reason MIDAS is thinking they are 4-bit values. I have tried creating 
new ODB keys of type INT32, UINT32 and DOUBLE but they all give the same error.

If anybody has any suggestions I would really appreciate some help:)

Thanks



 
    Reply  08 Feb 2022, Konstantin Olchanski, Bug Fix, ODBINC/Sequencer Issue 
Please post the output of odbedit "ls -l" for /eq/ar.../variables. (you posted the 
variable name as an image, and I cannot cut-and-paste the odb path!). BTW data size 4 is 
correct, 4 bytes for INT32/UINT32/FLOAT. For DOUBLE it should be 8. For you it prints 32 
and this is wrong, we need to see the output of "ls -l".
K.O.
       Reply  09 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> Please post the output of odbedit "ls -l" for /eq/ar.../variables. (you posted the 
> variable name as an image, and I cannot cut-and-paste the odb path!). BTW data size 4 is 
> correct, 4 bytes for INT32/UINT32/FLOAT. For DOUBLE it should be 8. For you it prints 32 
> and this is wrong, we need to see the output of "ls -l".
> K.O.

Hi,

Thanks for getting back to me regarding this. The output of "ls -l" is:

[local:mu3eMSci:S]/>cd Equipment/ArduinoTestStation/Variables
[local:mu3eMSci:S]Variables>ls -l
Key name                        Type    #Val  Size  Last Opn Mode Value
---------------------------------------------------------------------------
_T_                             FLOAT   1     4     1h   0   RWD  20.93
_F_                             FLOAT   1     4     1h   0   RWD  12.8
_P_                             FLOAT   1     4     1h   0   RWD  56
_S_                             FLOAT   1     4     1h   0   RWD  5
_H_                             FLOAT   1     4     60h  0   RWD  44.74
_B_                             FLOAT   1     4     60h  0   RWD  18.54
_A_                             FLOAT   1     4     1h   0   RWD  14.41
_RH_                            FLOAT   1     4     1h   0   RWD  41.81
_AT_                            FLOAT   1     4     1h   0   RWD  20.46
SP                              INT16   1     2     1h   0   RWD  10

Many Thanks
Jago
          Reply  09 Feb 2022, Konstantin Olchanski, Bug Fix, ODBINC/Sequencer Issue 
> 
> [local:mu3eMSci:S]/>cd Equipment/ArduinoTestStation/Variables
> [local:mu3eMSci:S]Variables>ls -l
> Key name                        Type    #Val  Size  Last Opn Mode Value
> ---------------------------------------------------------------------------
> _T_                             FLOAT   1     4     1h   0   RWD  20.93
> _F_                             FLOAT   1     4     1h   0   RWD  12.8
> _P_                             FLOAT   1     4     1h   0   RWD  56
> _S_                             FLOAT   1     4     1h   0   RWD  5
> _H_                             FLOAT   1     4     60h  0   RWD  44.74
> _B_                             FLOAT   1     4     60h  0   RWD  18.54
> _A_                             FLOAT   1     4     1h   0   RWD  14.41
> _RH_                            FLOAT   1     4     1h   0   RWD  41.81
> _AT_                            FLOAT   1     4     1h   0   RWD  20.46
> SP                              INT16   1     2     1h   0   RWD  10
> 

This looks okey, so we still have no explanation for your error. Please post your sequencer 
script?

K.O.
             Reply  10 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue 
I tried following script:

ODBSET /Equipment/ArduinoTestStation/Variables/_S_, 10

LOOP 10
  WAIT seconds, 3
  ODBINC /Equipment/ArduinoTestStation/Variables/_S_
ENDLOOP

and it worked as expected. So I conclude the problem must be in your script. Probably a typo in 
the ODB path pointing to a 32-byte string instead to a 4-byte float.

Stefan  
                Reply  14 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> I tried following script:
> 
> ODBSET /Equipment/ArduinoTestStation/Variables/_S_, 10
> 
> LOOP 10
>   WAIT seconds, 3
>   ODBINC /Equipment/ArduinoTestStation/Variables/_S_
> ENDLOOP
> 
> and it worked as expected. So I conclude the problem must be in your script. Probably a typo in 
> the ODB path pointing to a 32-byte string instead to a 4-byte float.
> 
> Stefan  

Hi Stefan,

Cheers for the reply. I believe the syntax we are using is correct. I have tried copying the script 
you used above and it results in the same error as before. Perhaps something is going wrong 
elsewhere - I will take another look today.

Jago
                   Reply  14 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue 
Just post here a minimal script which produces the error, so that I can try myself.

... and make sure that you have the latest develop version of midas.

Stefan
                      Reply  14 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> Just post here a minimal script which produces the error, so that I can try myself.
> 
> ... and make sure that you have the latest develop version of midas.
> 
> Stefan

Here is the simplest script which produces the error:

WAIT seconds, 3
ODBINC /Equipment/ArduinoTestStation/Variables/_S_

I noticed that "Jacob Thorne"  in the forum had the same issue as us in Novemeber last 
year. Indeed we have not installed any later versions of MIDAS since then so we will 
double check we have the latest version.

Jago
                         Reply  14 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue 
> I noticed that "Jacob Thorne"  in the forum had the same issue as us in Novemeber last 
> year. Indeed we have not installed any later versions of MIDAS since then so we will 
> double check we have the latest version.

As you see from my reply to Jacob, the bug has been fixed in midas since then, so just 
update.

Stefan
                            Reply  14 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> > I noticed that "Jacob Thorne"  in the forum had the same issue as us in Novemeber last 
> > year. Indeed we have not installed any later versions of MIDAS since then so we will 
> > double check we have the latest version.
> 
> As you see from my reply to Jacob, the bug has been fixed in midas since then, so just 
> update.
> 
> Stefan

We have tried updating using both:

git submodule update --init --recursive

and:

git pull --recurse-submodules

But the error still persists. Is there another way to update which we are missing?

Cheers
Jago
                               Reply  15 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue 
> But the error still persists. Is there another way to update which we are missing?

The bug was definitively fixed in this modification:

https://bitbucket.org/tmidas/midas/commits/5f33f9f7f21bcaa474455ab72b15abc424bbebf2

You probably forgot to compile/install correctly after your pull. Of you start "odbedit" and do 
a "ver" you see which git revision you are currently running. Make sure to get this output:

MIDAS version:      2.1
GIT revision:       Fri Feb 11 08:56:02 2022 +0100 - midas-2020-08-a-509-g585faa96 on branch 
develop
ODB version:        3


Stefan
                                  Reply  16 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> > But the error still persists. Is there another way to update which we are missing?
> 
> The bug was definitively fixed in this modification:
> 
> https://bitbucket.org/tmidas/midas/commits/5f33f9f7f21bcaa474455ab72b15abc424bbebf2
> 
> You probably forgot to compile/install correctly after your pull. Of you start "odbedit" and do 
> a "ver" you see which git revision you are currently running. Make sure to get this output:
> 
> MIDAS version:      2.1
> GIT revision:       Fri Feb 11 08:56:02 2022 +0100 - midas-2020-08-a-509-g585faa96 on branch 
> develop
> ODB version:        3
> 
> 
> Stefan

We we're having some problems compiling but have got it sorted now - thanks for your help:)

Jago
             Reply  14 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> > 
> > [local:mu3eMSci:S]/>cd Equipment/ArduinoTestStation/Variables
> > [local:mu3eMSci:S]Variables>ls -l
> > Key name                        Type    #Val  Size  Last Opn Mode Value
> > ---------------------------------------------------------------------------
> > _T_                             FLOAT   1     4     1h   0   RWD  20.93
> > _F_                             FLOAT   1     4     1h   0   RWD  12.8
> > _P_                             FLOAT   1     4     1h   0   RWD  56
> > _S_                             FLOAT   1     4     1h   0   RWD  5
> > _H_                             FLOAT   1     4     60h  0   RWD  44.74
> > _B_                             FLOAT   1     4     60h  0   RWD  18.54
> > _A_                             FLOAT   1     4     1h   0   RWD  14.41
> > _RH_                            FLOAT   1     4     1h   0   RWD  41.81
> > _AT_                            FLOAT   1     4     1h   0   RWD  20.46
> > SP                              INT16   1     2     1h   0   RWD  10
> > 
> 
> This looks okey, so we still have no explanation for your error. Please post your sequencer 
> script?
> 
> K.O.

Hey, thanks for getting back to me

We are fairly confident the syntax is correct. Having tried the test script posted by Stefan:

> ODBSET /Equipment/ArduinoTestStation/Variables/_S_, 10
> 
> LOOP 10
>   WAIT seconds, 3
>   ODBINC 

The same error is returned:/

We will take another look today.

Jago
Entry  02 Dec 2021, Alexey Kalinin, Bug Report, some frontend kicked by cm_periodic_tasks 
Hello,
We have a small experiment with MIDAS based DAQ.
Status page shows :
ES	ESFrontend@192.168.0.37	207	0.2	0.000
Trigger06	Sample Frontend06@192.168.0.37	1.297M	0.3	0.000
Trigger01	Sample Frontend01@192.168.0.37	1.297M	0.3	0.000
Trigger16	Sample Frontend16@192.168.0.37	1.297M	0.3	0.000
Trigger38	Sample Frontend38@192.168.0.37	1.297M	0.3	0.000
Trigger37	Sample Frontend37@192.168.0.37	1.297M	0.3	0.000
Trigger03	Sample Frontend03@192.168.0.38	1.297M	0.3	0.000
Trigger07	Sample Frontend07@192.168.0.38	1.297M	0.3	0.000
Trigger04	Sample Frontend04@192.168.0.38	59898	0.0	0.000
Trigger08	Sample Frontend08@192.168.0.38	59898	0.0	0.000
Trigger17	Sample Frontend17@192.168.0.38	59898	0.0	0.000


And SYSTEM buffers page shows:
ESFrontend	1968	198	47520	0	0x00000000	0		
193 ms
Sample Frontend06	1332547	1330826	379729872	0	0x00000000	
0		1.1 sec
Sample Frontend16	1332542	1330839	361988208	0	0x00000000	
0		94 ms
Sample Frontend37	1332530	1330841	337798408	0	0x00000000	
0		1.1 sec
Sample Frontend01	1332543	1330829	467136688	0	0x00000000	
0		34 ms
Sample Frontend38	1332528	1330830	291453608	0	0x00000000	
0		1.1 sec
Sample Frontend04	63254	61467	20882584	0	0x00000000	
0		208 ms
Sample Frontend08	63262	61476	27904056	0	0x00000000	
0		205 ms
Sample Frontend17	63271	61473	20433840	0	0x00000000	
0		213 ms
Sample Frontend03	1332549	1330818	386821728	0	0x00000000	
0		82 ms
Sample Frontend07	1332554	1330821	462210896	0	0x00000000	
0		37 ms
Logger	968742	0w+9500418r	0w+2718405736r	0	0x00000000	0	
GET_ALL Used 0 bytes 0.0%	303 ms
rootana	254561	0w+29856958r	0w+8718288352r	0	0x00000000	0		
762 ms


The problem is that eventually some of frontend closed with message 
:19:22:31.834 2021/12/02 [rootana,INFO] Client 'Sample Frontend38' on buffer 
'SYSMSG' removed by cm_periodic_tasks because process pid 9789 does not exist

in the meantime mserver loggging :
mserver started interactively
mserver will listen on TCP port 1175
double free or corruption (!prev)
double free or corruption (!prev)
free(): invalid next size (normal)
double free or corruption (!prev)


I can find some correlation between number of events/event size produced by 
frontend, cause its failed when its become big enough. 

frontend scheme is like this:

poll event time set to 0;

poll_event{
//if buffer not transferred return (continue cutting the main buffer)
//read main buffer from hardware
//buffer not transfered
}

read event{
// cut the main buffer to subevents (cut one event from main buffer) return;
//if (last subevent) {buffer transfered ;return}
}

What is strange to me that 2 frontends (1 per remote pc) causing this.

Also, I'm executing one FEcode with -i # flag , put setting eventid in 
frontend_init , and using SYSTEM buffer for all.

Is there something I'm missing?
Thanks. 
A.
    Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, some frontend kicked by cm_periodic_tasks 
> The problem is that eventually some of frontend closed with message 
> :19:22:31.834 2021/12/02 [rootana,INFO] Client 'Sample Frontend38' on buffer 
> 'SYSMSG' removed by cm_periodic_tasks because process pid 9789 does not exist

This messages means what it says. A client was registered with the SYSMSG buffer and this 
client had pid 9789. At some point some other client (rootana, in this case) checked it and 
process pid 9789 was no longer running. (it then proceeded to remove the registration).

There is 2 possibilities:
- simplest: your frontend has crashed. best to debug this by running it inside gdb, wait for 
the crash.
- unlikely: reported pid is bogus, real pid of your frontend is different, the client 
registration in SYSMSG is corrupted. this would indicate massive corruption of midas shared 
memory buffers, not impossible if your frontend misbehaves and writes to random memory 
addresses. ODB has protection against this (normally turned off, easy to enable, set ODB 
"/experiment/protect odb" to yes), shared memory buffers do not have protection against this 
(should be added?).

Do this. When you start your frontend, write down it's pid, when you see the crash message, 
confirm pid number printed is the same. As additional test, run your frontend inside gdb, 
after it crashes, you can print the stack trace, etc.

> 
> in the meantime mserver loggging :
> mserver started interactively
> mserver will listen on TCP port 1175
> double free or corruption (!prev)
> double free or corruption (!prev)
> free(): invalid next size (normal)
> double free or corruption (!prev)
> 

Are these "double free" messages coming from the mserver or from your frontend? (i.e. you run 
them in different terminals, not all in the same terminal?).

If messages are coming from the mserver, this confirms possibility (1),
except that for frontends connected remotely, the pid is the pid of the mserver,
and what we see are crashes of mserver, not crashes of your frontend. These are much harder to 
debug.

You will need to enable core dumps (ODB /Experiment/Enable core dumps set to "y"),
confirm that core dumps work (i.e. "killall -SEGV mserver", observe core files are created
in the directory where you started the mserver), reproduce the crash, run "gdb mserver 
core.NNNN", run "bt" to print the stack trace, post the stack trace here (or email to me 
directly).

>
> I can find some correlation between number of events/event size produced by 
> frontend, cause its failed when its become big enough. 
> 

There is no limit on event size or event rate in midas, you should not see any crash
regardless of what you do. (there is a limit of event size, because an event has
to fit inside an event buffer and event buffer size is limited to 2 GB).

Obviously you hit a bug in mserver that makes it crash. Let's debug it.

One thing to try is set the write cache size to zero and see if your crash goes away. I see
some indication of something rotten in the event buffer code if write cache is enabled. This
is set in ODB "/Eq/XXX/Common/Write Cache Size", set it to zero. (beware recent confusion
where odb settings have no effect depending on value of "equipment_common_overwrite").

>
> frontend scheme is like this:
> 

Best if you use the tmfe c++ frontend, event data handling is much simpler and we do not
have to debug the convoluted old code in mfe.c.

K.O.

>
> poll event time set to 0;
> 
> poll_event{
> //if buffer not transferred return (continue cutting the main buffer)
> //read main buffer from hardware
> //buffer not transfered
> }
> 
> read event{
> // cut the main buffer to subevents (cut one event from main buffer) return;
> //if (last subevent) {buffer transfered ;return}
> }
> 
> What is strange to me that 2 frontends (1 per remote pc) causing this.
> 
> Also, I'm executing one FEcode with -i # flag , put setting eventid in 
> frontend_init , and using SYSTEM buffer for all.
> 
> Is there something I'm missing?
> Thanks. 
> A.
       Reply  11 Feb 2022, Alexey Kalinin, Bug Report, some frontend kicked by cm_periodic_tasks 
Thanks for the answer.
As soon as I can(possible in a month) I'll try suggestion below:

> One thing to try is set the write cache size to zero and see if your crash goes away. I see
> some indication of something rotten in the event buffer code if write cache is enabled. This
> is set in ODB "/Eq/XXX/Common/Write Cache Size", set it to zero. (beware recent confusion
> where odb settings have no effect depending on value of "equipment_common_overwrite").

I tried to change this ODB for one of the frontend via mhttpd/browser, and eventually it goes back 
to default value (1000 as I remember). but this frontend has the minimum rate 50DWORD/~10sec. and 
depending on cashe size it appears in mdump once per 31 events but all aff them . SO its different 
story, but m.b. it has the same solution to play with Write Cashe Size.    


double free message goes from mserver terminal. 
all of the frontends are remote.
I can't exclude crashes of frontend , but when I run ./frontend -i 1(2,3 etc) thet means that I run 
one code for all, and only several causes crash.also I found that crash in frontend happened while 
it do nothing with collected data (last event reached and new data is not ready), but it tries to 
watch for the ODB changes.I mean it crashes iside (while {odb_changes(value in watchdog)}),and I don't 
know what else happenned meanwhile with cahed buffer.

Future plans is to use event buider for frontends when data/signals will be perfectly reasonable 
i/e/ without broken events. for now i kinda worry about if one of frontends will skip one of the 
event inside its buffer.


Thanks for the way to dig into.
A.    

> > The problem is that eventually some of frontend closed with message 
> > :19:22:31.834 2021/12/02 [rootana,INFO] Client 'Sample Frontend38' on buffer 
> > 'SYSMSG' removed by cm_periodic_tasks because process pid 9789 does not exist
> 
> This messages means what it says. A client was registered with the SYSMSG buffer and this 
> client had pid 9789. At some point some other client (rootana, in this case) checked it and 
> process pid 9789 was no longer running. (it then proceeded to remove the registration).
> 
> There is 2 possibilities:
> - simplest: your frontend has crashed. best to debug this by running it inside gdb, wait for 
> the crash.
> - unlikely: reported pid is bogus, real pid of your frontend is different, the client 
> registration in SYSMSG is corrupted. this would indicate massive corruption of midas shared 
> memory buffers, not impossible if your frontend misbehaves and writes to random memory 
> addresses. ODB has protection against this (normally turned off, easy to enable, set ODB 
> "/experiment/protect odb" to yes), shared memory buffers do not have protection against this 
> (should be added?).
> 
> Do this. When you start your frontend, write down it's pid, when you see the crash message, 
> confirm pid number printed is the same. As additional test, run your frontend inside gdb, 
> after it crashes, you can print the stack trace, etc.
> 
> > 
> > in the meantime mserver loggging :
> > mserver started interactively
> > mserver will listen on TCP port 1175
> > double free or corruption (!prev)
> > double free or corruption (!prev)
> > free(): invalid next size (normal)
> > double free or corruption (!prev)
> > 
> 
> Are these "double free" messages coming from the mserver or from your frontend? (i.e. you run 
> them in different terminals, not all in the same terminal?).
> 
> If messages are coming from the mserver, this confirms possibility (1),
> except that for frontends connected remotely, the pid is the pid of the mserver,
> and what we see are crashes of mserver, not crashes of your frontend. These are much harder to 
> debug.
> 
> You will need to enable core dumps (ODB /Experiment/Enable core dumps set to "y"),
> confirm that core dumps work (i.e. "killall -SEGV mserver", observe core files are created
> in the directory where you started the mserver), reproduce the crash, run "gdb mserver 
> core.NNNN", run "bt" to print the stack trace, post the stack trace here (or email to me 
> directly).
> 
> >
> > I can find some correlation between number of events/event size produced by 
> > frontend, cause its failed when its become big enough. 
> > 
> 
> There is no limit on event size or event rate in midas, you should not see any crash
> regardless of what you do. (there is a limit of event size, because an event has
> to fit inside an event buffer and event buffer size is limited to 2 GB).
> 
> Obviously you hit a bug in mserver that makes it crash. Let's debug it.
> 
> One thing to try is set the write cache size to zero and see if your crash goes away. I see
> some indication of something rotten in the event buffer code if write cache is enabled. This
> is set in ODB "/Eq/XXX/Common/Write Cache Size", set it to zero. (beware recent confusion
> where odb settings have no effect depending on value of "equipment_common_overwrite").
> 
> >
> > frontend scheme is like this:
> > 
> 
> Best if you use the tmfe c++ frontend, event data handling is much simpler and we do not
> have to debug the convoluted old code in mfe.c.
> 
> K.O.
> 
> >
> > poll event time set to 0;
> > 
> > poll_event{
> > //if buffer not transferred return (continue cutting the main buffer)
> > //read main buffer from hardware
> > //buffer not transfered
> > }
> > 
> > read event{
> > // cut the main buffer to subevents (cut one event from main buffer) return;
> > //if (last subevent) {buffer transfered ;return}
> > }
> > 
> > What is strange to me that 2 frontends (1 per remote pc) causing this.
> > 
> > Also, I'm executing one FEcode with -i # flag , put setting eventid in 
> > frontend_init , and using SYSTEM buffer for all.
> > 
> > Is there something I'm missing?
> > Thanks. 
> > A.
Entry  28 May 2021, Joseph McKenna, Bug Report, History plots deceiving users into thinking data is still logging flatline.pngflatline.png
I have been trying to fix this myself but my javascript isn't strong... The 'new' history plot render fills in missing data with the last ODB value (even when this value is very old... elog:2180/1 shows this... The data logging stopped, but the history plot can fool users into thinking data is logging (The export button generates CSVs with entires every 10 seconds also). Grepping through the history files behind the scenes, I found only one match for an example variable from this plot, so it looks like there are no entries after March 24th (although I may be mistaken, I've not studied the history files data structure in detail), ie this is a artifact from the mhistory.js rather than the mlogger... Have I missed something simple? Would it be possible to not draw the line if there are no datapoints in a significant time? Or maybe render a dashed line that doesn't export to CSV? Thanks in advance Edit, I see certificate errors this forum and I think its preventing my upload an image... inlining it into the text here:
    Reply  28 May 2021, Stefan Ritt, Bug Report, History plots deceiving users into thinking data is still logging 

This is a known problem and I'm working on. See the discussion at: 

https://bitbucket.org/tmidas/midas/issues/305/log_history_periodic-doesnt-account-for

Stefan

       Reply  02 Jun 2021, Konstantin Olchanski, Bug Report, History plots deceiving users into thinking data is still logging 
https://bitbucket.org/tmidas/midas/issues/305/log_history_periodic-doesnt-account-for

this problem is a blocker for the next midas release.

the best I can tell, current development version of midas writes history data incorrectly,
but I do not have time to look at it at this moment.

I recommend that people use the latest released version, midas-2020-12. (this is what we have on alphag and 
should have in alpha2).

midas-2020-12 uses mlogger from midas-2020-08.

If I cannot find time to figure out what is going on in the mlogger,
the next release may have to be done the same way (with mlogger from midas-2020-08).

K.O.
          Reply  10 Feb 2022, Stefan Ritt, Bug Report, History plots deceiving users into thinking data is still logging 
The problem has been fixed on commit 825935dc on Oct. 2021 and runs fine since then at PSI. If TRIUMF people 
agree, we can close that issue and proceed.

Stefan
Entry  30 Sep 2021, Francesco Renga, Forum, OPC client within MIDAS 
Dear all,
     I need to integrate in my MIDAS project the communication with an OPC UA 
server. My plan is to develop an OPC UA client as a "device" in 
midas/drivers/device.

Two questions:

1) Is anybody aware of some similar effort for some other project, so that I can 
get some example?

2) What could be the more appropriate driver's class to be used? generic.cxx? 
multi.cxx?

Thank you for your help,
           Francesco
    Reply  10 Feb 2022, Francesco Renga, Forum, OPC client within MIDAS opc.cxxopc.h
Dear all,
        I finally succeeded to get a working driver for the communication with an OPC 
UA server. It is based on the open62541 library and I use it in combination with the 
generic.h driver class. This is still a crude implementation, but let me post it here, 
maybe it can be useful to somebody else.

BTW, if there is somebody more skilled than me with OPC UA and MIDAS drivers, who is 
willing to give suggestions for improving the implementation, it would be extremely 
appreciated.

Best Regards,
      Francesco



> Dear all,
>      I need to integrate in my MIDAS project the communication with an OPC UA 
> server. My plan is to develop an OPC UA client as a "device" in 
> midas/drivers/device.
> 
> Two questions:
> 
> 1) Is anybody aware of some similar effort for some other project, so that I can 
> get some example?
> 
> 2) What could be the more appropriate driver's class to be used? generic.cxx? 
> multi.cxx?
> 
> Thank you for your help,
>            Francesco
Entry  07 Feb 2022, Konstantin Olchanski, Forum, MidasWiki moved from ladd00 to daq00.triumf.ca and updated to MediaWiki 1.35 
MidasWiki moved from ladd00 (obsolete SL6) to daq00.triumf.ca (Ubuntu LTS 20.04) 
and updated from obsolete MediaWiki LTS 1.27.7 to MediaWiki LTS 1.35, supported 
until mid-2023, see https://www.mediawiki.org/wiki/Version_lifecycle

Old URL https://midas.triumf.ca and https://midas.triumf.ca/MidasWiki/...
redirect to new URL https://daq00.triumf.ca/MidasWiki/index.php/Main_Page

All old links and bookmarks should continue to work (via redirect).

To report problems with this MediaWiki instance and to request
any changes in configuration or installed extensions, please reply to this
message here.

K.O.
Entry  26 Jan 2022, Frederik Wauters, Forum, .gz files 
I adapted our analyzer to compile against the manalyzer included in the midas repo.

All our data files are .mid.gz, which now fail to process :(

frederik@frederik-ThinkPad-T550:~/new_daq/build/analyzer$ ./analyzer -e100 -s100 ../../run_backup_11783.mid.gz 
...
...n
Registered modules: 1
file[0]: ../../run_backup_11783.mid.gz
Setting up the analysis!
TMReadEvent: error: short read 0 instead of -1193512213

Which is in the TMEvent* TMReadEvent(TMReaderInterface* reader) class in the midasio.cxx file

Reading the unzipped files works. But we have always processed our .gz files directly, for the unzipping we would need ~2x disk space.

Am I doing something wrong? I see that there is some activity on lz4 in the midasio repo, is gunzip next?
    Reply  26 Jan 2022, Konstantin Olchanski, Forum, .gz files 
> I adapted our analyzer to compile against the manalyzer included in the midas repo.
> TMReadEvent: error: short read 0 instead of -1193512213

I think this problem is fixed in the latest version of midasio and manalyzer, but this update
was not pulled into midas yet. (Canada is in the middle of a covid wave since December).

What happens is you do not have the gzip library installed on your computer and
your analyzer is built without support for gzip.

The fix is done the hard way, the gzip library is no longer optional, but required.

You do not say what linux you use, so I cannot give exact instructions, but for:
ubuntu: apt -y install libz-dev
centos7: installed by default
centos8: installed by default
debian11/raspbian: same as ubuntu

K.O.
       Reply  31 Jan 2022, Frederik Wauters, Forum, .gz files 
> > I adapted our analyzer to compile against the manalyzer included in the midas repo.
> > TMReadEvent: error: short read 0 instead of -1193512213
> 
> I think this problem is fixed in the latest version of midasio and manalyzer, but this update
> was not pulled into midas yet. (Canada is in the middle of a covid wave since December).
> 
> What happens is you do not have the gzip library installed on your computer and
> your analyzer is built without support for gzip.
> 
> The fix is done the hard way, the gzip library is no longer optional, but required.
> 
> You do not say what linux you use, so I cannot give exact instructions, but for:
> ubuntu: apt -y install libz-dev
> centos7: installed by default
> centos8: installed by default
> debian11/raspbian: same as ubuntu
> 
> K.O.

My libz under ubuntu

-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11") 
-- MIDAS: Found ZLIB version 1.2.11

I got both the manalyzer example and mine going with
* the latest midas dev
* + the latest manalyzer (cf6c233)
* + almost latest midasio (568a617, otherwise I get an linking error 

./libmidas.a(midasio.cxx.o): In function `Lz4Error(int)':
midasio.cxx:(.text+0x359): undefined reference to `MLZ4F_getErrorName(unsigned long)'

So this works, I will assume that in the near future this all will come together in the standard midas release.

thanks  
Entry  29 Jan 2022, Isaac Labrie Boulay, Forum, MIDAS and GRIF-16 digitizer (Standalone Mode). 
Hi all,

I was sent a version of the frontend for the TIGRESS Detector lab setup so that 
I can test detectors using a GRIF-16 digitizer in standalone mode.

I followed the GRIF-16 wiki (https://grsi.wiki.triumf.ca/wiki/GRIF-16#One-
level_operation) to setup the GRIF-16 through the webpage. The digitized data is 
supposed to come into my UDP port 8800 but it is never retrieved in the 
frontend.

Here's the readout scheme:
// readout sequence ...
// poll_event() true (if still have data in buffer or testmsg() true)
// -> read_trigger_event() -> read_grifc_event() - re-buffers into midas events
//                         -> grifc_eventread()  - returns single grif fragment
//                         -> grifc_dataread()   - returns single net-pkt 


Here's poll_event():
INT poll_event(INT source, INT count, BOOL test)
{
   int i, have_data=0;

   for(i=0; i<count; i++){
      if( data_available ){ break; }
      have_data = ( testmsg(data_socket, 0) > 0 );
      if( have_data && !test ){ break; }
   }
   return( (have_data || data_available) && !test );
}

This being said, testmsg() always returns empty and "data_available" is only set 
to TRUE when there's leftover data after a GRIF-C reading (I'm obviously not 
using a GRIF-C).

I know that when GRIF-16 is in standalone mode, MIDAS does not change GRIF-16s 
settings based on the ODB, it has to be done through the GRIF-16 webpage. Is the 
user frontend code even responsible for the GRIF-16 data readout in standalone 
mode? If not, could it just be that my UDP offloader is incorrectly setup?

Here are its current settings:

SETTINGS/UDP
- Offloader: ON
- Dst IP: my IP
- Dst Port: 8800 (DATA_PORT)

SETTINGS/MIDAS
- Use MIDAS: OFF
- MIDAS Hostname: my hostname
- MIDAS IP: same as Dst IP from UDP settings
- Dst Port: 8080 (I'm assuming that this is the mhttpd port)

Again, the frontend runs but I get 0 events. What might I be missing?

Thanks for helping me out!

Isaac
Entry  22 Oct 2021, Francesco Renga, Forum, mhttpd error 
Dear all,
      I am trying to make the MIDAS web server for my DAQ project accessible from other machines. In the ODB, I activated the necessary flags:

[local:CYGNUS_RD:S]/WebServer>ls
Enable localhost port           y
localhost port                  8080
localhost port passwords        n
Enable insecure port            y
insecure port                   8081
insecure port passwords         y
insecure port host list         y
Enable https port               y
https port                      8443
https port passwords            y
https port host list            n
Host list
                                localhost
Enable IPv6                     y
Proxy                           
mime.types

Following the instructions on the Wiki I enabled the SSL support. When running mhttpd, I get these messages:

Mongoose web server will use HTTP Digest authentication with realm "CYGNUS_RD" and password file "/home/cygno/DAQ/online/htpasswd.txt"
Mongoose web server will use the hostlist, connections will be accepted only from: localhost
Mongoose web server listening on http address "localhost:8080", passwords OFF, hostlist OFF
Mongoose web server listening on http address "[::1]:8080", passwords OFF, hostlist OFF
Mongoose web server listening on http address "8081", passwords enabled, hostlist enabled
[mhttpd,ERROR] [mhttpd.cxx:19166:mongoose_listen,ERROR] Cannot mg_bind address "[::0]:8081"
Mongoose web server will use https certificate file "/home/cygno/DAQ/online/ssl_cert.pem"
Mongoose web server listening on https address "8443", passwords enabled, hostlist OFF
[mhttpd,ERROR] [mhttpd.cxx:19166:mongoose_listen,ERROR] Cannot mg_bind address "[::0]:8443"

and the server is not accessible from other machines. Any suggestion to solve or better investigate this problem?

Thank you very much,
         Francesco
    Reply  22 Oct 2021, Stefan Ritt, Forum, mhttpd error 
> Enable IPv6                     y

Probably the IPv6 problem, see here elog:2269

I asked to turn off IPv6 by default, or at least mention this in the documentation,
but unfortunately nothing happened.

Stefan
       Reply  25 Oct 2021, Francesco Renga, Forum, mhttpd error 
It worked, thank you very much!

Francesco

> > Enable IPv6                     y
> 
> Probably the IPv6 problem, see here elog:2269
> 
> I asked to turn off IPv6 by default, or at least mention this in the documentation,
> but unfortunately nothing happened.
> 
> Stefan
       Reply  26 Jan 2022, Konstantin Olchanski, Forum, mhttpd error 
> > Enable IPv6                     y
> 
> Probably the IPv6 problem, see here elog:2269
> 
> I asked to turn off IPv6 by default, or at least mention this in the documentation,
> but unfortunately nothing happened.

But IPv4 and IPv6 code is completely separate, if IPv6 bind fails, IPv4 should still 
work.

This is all very strange.

It does not help that the OP does not say in which way things do not work,
"the server is not accessible from other machines" is not an error message
reported by any browser, and we do not know what URL he is using
to access mhttpd - http: or https:

Also he is enabling the "insecure" port 8081, I am pretty sure the documentation
is pretty clear, either use the secure https port or the insecure port,
but not both at the same time.

In any case, I see current version of mongoose have removed support
for password files, so all this stuff will likely become reworked
and at the end mhttpd will only listen to localhost ports. To make it "accessible
to other machines", one will have to use the apache https proxy. (or mtpcproxy from 
midas).

K.O.
Entry  29 Oct 2021, Kushal Kapoor, Bug Report, Unknown Error 319 from client Screenshot_2021-10-26_114015.png
I’m trying to run MIDAS using a frontend code/client named “fetiglab”. Run stops 
after 2/3sec with an error saying “Unknown error 319 from client “fetiglab” on 
localhost.

Frontend code compiled without any errors and MIDAS reads the frontend 
successfully, this only comes when I start the new run on MIDAS, here are a few 
more details from the terminal-

11:46:32 [fetiglab,ERROR] [odb.cxx:11268:db_get_record,ERROR] struct size 
mismatch for "/" (expected size: 1, size in ODB: 41920)

11:46:32 [Logger,INFO] Deleting previous file 
"/home/rcmp/online3/run00621_000.root"

11:46:32 [ODBEdit,ERROR] [midas.cxx:5073:cm_transition,ERROR] transition START 
aborted: client "fetiglab" returned status 319

11:46:32 [ODBEdit,ERROR] [midas.cxx:5246:cm_transition,ERROR] Could not start a 
run: cm_transition() status 319, message 'Unknown error 319 from client 
'fetiglab' on host "localhost"'

TR_STARTABORT transition: cleanup after failure to start a run

&#8204;

I’ve also enclosed a screenshot for the same, any suggestions would be highly 
appreciated. thanks
    Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Unknown Error 319 from client 
> I’m trying to run MIDAS using a frontend code/client named “fetiglab”. Run stops 
> after 2/3sec with an error saying “Unknown error 319 from client “fetiglab” on 
> localhost.

actually run never starts.

> 11:46:32 [fetiglab,ERROR] [odb.cxx:11268:db_get_record,ERROR] struct size 
> mismatch for "/" (expected size: 1, size in ODB: 41920)

this is the error that causes run start to fail. for reasons unknown
your frontend is trying to do a db_get_record() from "/" (ODB root top directory).

if this is an mfe.c frontend, I do not think I have ever seen it do something
like this.

so, a puzzle.

K.O.
Entry  01 Dec 2021, Lars Martin, Bug Report, Off-by-one in sequencer documentation 
The documentation for the sequencer loop says:

<quote>
LOOP [name ,] n ... ENDLOOP	To execute a loop n times. For infinite loops, "infinite" 
can be specified as n. Optionally, the loop variable running from 0...(n-1) can be accessed 
inside the loop via $name.
</quote>

In fact the loop variable runs from 1...n, as can be seen by running this exciting 
sequencer code:

1 COMMENT "Figuring out MSL"
2 
3 LOOP n,4
4   MESSAGE $n,1
5 ENDLOOP
    Reply  02 Dec 2021, Stefan Ritt, Bug Report, Off-by-one in sequencer documentation 
> The documentation for the sequencer loop says:
> 
> <quote>
> LOOP [name ,] n ... ENDLOOP	To execute a loop n times. For infinite loops, "infinite" 
> can be specified as n. Optionally, the loop variable running from 0...(n-1) can be accessed 
> inside the loop via $name.
> </quote>
> 
> In fact the loop variable runs from 1...n, as can be seen by running this exciting 
> sequencer code:
> 
> 1 COMMENT "Figuring out MSL"
> 2 
> 3 LOOP n,4
> 4   MESSAGE $n,1
> 5 ENDLOOP

Indeed you're right. The loop variable runs from 1...n. I fixed that in the documentation.

Stefan
       Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Off-by-one in sequencer documentation 
> > 3 LOOP n,4
> > 4   MESSAGE $n,1
> > 5 ENDLOOP
> 
> Indeed you're right. The loop variable runs from 1...n. I fixed that in the documentation.

Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.

K.O.
          Reply  26 Jan 2022, Stefan Ritt, Bug Report, Off-by-one in sequencer documentation 
> Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.

   for (i=1 ; i<=10 ; i++);     ;-)
             Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Off-by-one in sequencer documentation 
> > Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.
> 
>    for (i=1 ; i<=10 ; i++);     ;-)

Similar code made big news just recently: (scroll down to the example main() program)

https://blog.qualys.com/vulnerabilities-threat-research/2022/01/25/pwnkit-local-privilege-escalation-
vulnerability-discovered-in-polkits-pkexec-cve-2021-4034

I forget if the FORTRAN rules were "loop once" or "never loop" or if it was different
between Fortran-4, fortran-77, DEC extensions and IBM extension, or if it was a compiler switch.

We should check that we do something reasonable with such loops to zero:

LOOP n,0
   MESSAGE $n,1
ENDLOOP

P.S. Yup. "man g77" option "-fonetrip".

K.O.
Entry  09 Nov 2021, Francesco Renga, Forum, Issue in data writing speed 
Dear all,
       I've a frontend writing a quite big bunch of data into a MIDAS bank (16bit output from a 4MP photo camera). 
I'm experiencing a writing speed problem that I don't understand. When the photo camera is triggered at a low rate (< 2 Hz) 
writing into the bank takes a very short time for each event (indeed, what I measure is the time to write and go back 
into the polling function). If I increase the rate to 4 Hz, I see that writing the first two events takes a sort time, 
but the third event takes a very long time (hundreds of ms), then again the fourth and fifth events are very fast, and 
the sixth is very slow. If I further increase the rate, every other event is very slow. The problem is not in the readout 
of the camera, because if I just remove the bank writing and keep the camera readout, the problem disappears. Can you 
explain this behavior? Is there any way to improve it?

Below you can also find the code I use to copy the data from the camera buffer into the bank. If you have any suggestion 
to improve it, it would be really appreciated.

Thank you very much,
          Francesco



  const char* pSrc = (const char*)bufframe.buf;

  for(int y = 0; y < bufframe.height; y++ ){

    //Copy one row
    const unsigned short* pDst = (const unsigned short*)pSrc;

    //go through the row
    for(int x = 0; x < bufframe.width; x++ ){

      WORD tmpData = *pDst++; 

      *pdata++ = tmpData;

    }

    pSrc += bufframe.rowbytes;

  }
 
    Reply  10 Nov 2021, Stefan Ritt, Forum, Issue in data writing speed 
Midas uses various buffers (in the frontend, at the server side before the SYSTEM buffer, the SYSTEM buffer itself, on the 
logger before writing to disk. All these buffers are in RAM and have fast access, so you can fill them pretty quickly. When
they are full, the logger writes to disk, which is slower. So I believe at 2 Hz your disk can keep up with your writing 
speed, but at 4 Hz (2x8MBx4=32 MB/sec) your disk starts slowing down the writing process. Now 32MB/s is pretty slow for
a disk, so I presume you have turned compression on which takes quite some time.

To verify this, disable logging. The disable compression and keep logging. Then report back here again.

> Dear all,
>        I've a frontend writing a quite big bunch of data into a MIDAS bank (16bit output from a 4MP photo camera). 
> I'm experiencing a writing speed problem that I don't understand. When the photo camera is triggered at a low rate (< 2 Hz) 
> writing into the bank takes a very short time for each event (indeed, what I measure is the time to write and go back 
> into the polling function). If I increase the rate to 4 Hz, I see that writing the first two events takes a sort time, 
> but the third event takes a very long time (hundreds of ms), then again the fourth and fifth events are very fast, and 
> the sixth is very slow. If I further increase the rate, every other event is very slow. The problem is not in the readout 
> of the camera, because if I just remove the bank writing and keep the camera readout, the problem disappears. Can you 
> explain this behavior? Is there any way to improve it?
> 
> Below you can also find the code I use to copy the data from the camera buffer into the bank. If you have any suggestion 
> to improve it, it would be really appreciated.
> 
> Thank you very much,
>           Francesco
> 
> 
> 
>   const char* pSrc = (const char*)bufframe.buf;
> 
>   for(int y = 0; y < bufframe.height; y++ ){
> 
>     //Copy one row
>     const unsigned short* pDst = (const unsigned short*)pSrc;
> 
>     //go through the row
>     for(int x = 0; x < bufframe.width; x++ ){
> 
>       WORD tmpData = *pDst++; 
> 
>       *pdata++ = tmpData;
> 
>     }
> 
>     pSrc += bufframe.rowbytes;
> 
>   }
>  
    Reply  26 Jan 2022, Konstantin Olchanski, Forum, Issue in data writing speed 
Francesco, when you say "writing an event is slow", do you mean it in the frontend
or in the output data file?

Stefan is quite right about the data file, it can take seconds between generating
an event in the frontend and seeing it written to the data file. (if compression
buffers are too big, an event can sit there forever, until pushed out by next events
or by run stop).

But maybe you see this on the frontend side.

What you are looking at is "real time" performance of the frontend and of the linux kernel.

The mfe.c frontend has many problems with real time performance, it can stall and take a long
time between calls to read_event(), for many reasons.

There are ways around that, but it is simpler to switch to the tmfe c++ frontend
that was designed for good real time performance.

In the tmfe frontend, if you use the polled equipment and enable the poll thread,
your frontend will be limited only by the linux kernel real time performance (i.e.
on a single-core CPU, other programs will delay execution of your frontend
and you will see it as long delays (usec, millisec) between calls to your read_event().

Next limit to real time performance (common to mfe.c and tmfe frontends) is the writing
of event data to the midas shared event buffer. One has to lock the shared memory semaphore
and this has to wait until other users of the event buffer finish their reading
or writing and unlock it. Arbitrary amount of time (usec, millisec, sec) can pass.
(there is also problems with "fairness" of the linux semaphores, a different story, again).

Making things more interesting, midas event buffers implement a write cache (default size 100 kbytes),
events smaller than the cache are quickly accumulated (no need to lock the shared memory semaphore),
them flushed to shared memory when cache is full. This is done to reduce the number
of shared memory semaphore locks per event, in the case of very high rate of very small events.

Solution to all this is to use 2 threads: read the data from hardware in one thread and write the data to midas
in a different thread. Between the threads would be an event fifo (circular buffer in mfe.c,
std::deque<EVENT> in tmfe c++ frontends).

For remote connected frontends, things are a bit different. Event data is written directly
into the TCP socket and as long as socket buffers are big enough, there is no real-time delays,
unless SYSTEM buffer is very congested and mserver does not read the TCP socket quickly enough.
So depending on event size, data rate and tcp socket buffer size, the extra 2nd thread
may not be necessary and poll thread real time performance may be good enough.

I hope this clarifies the situation somewhat.

K.O.

> Dear all,
>        I've a frontend writing a quite big bunch of data into a MIDAS bank (16bit output from a 4MP photo camera). 
> I'm experiencing a writing speed problem that I don't understand. When the photo camera is triggered at a low rate (< 2 Hz) 
> writing into the bank takes a very short time for each event (indeed, what I measure is the time to write and go back 
> into the polling function). If I increase the rate to 4 Hz, I see that writing the first two events takes a sort time, 
> but the third event takes a very long time (hundreds of ms), then again the fourth and fifth events are very fast, and 
> the sixth is very slow. If I further increase the rate, every other event is very slow. The problem is not in the readout 
> of the camera, because if I just remove the bank writing and keep the camera readout, the problem disappears. Can you 
> explain this behavior? Is there any way to improve it?
> 
> Below you can also find the code I use to copy the data from the camera buffer into the bank. If you have any suggestion 
> to improve it, it would be really appreciated.
> 
> Thank you very much,
>           Francesco
> 
> 
> 
>   const char* pSrc = (const char*)bufframe.buf;
> 
>   for(int y = 0; y < bufframe.height; y++ ){
> 
>     //Copy one row
>     const unsigned short* pDst = (const unsigned short*)pSrc;
> 
>     //go through the row
>     for(int x = 0; x < bufframe.width; x++ ){
> 
>       WORD tmpData = *pDst++; 
> 
>       *pdata++ = tmpData;
> 
>     }
> 
>     pSrc += bufframe.rowbytes;
> 
>   }
>  
       Reply  26 Jan 2022, Konstantin Olchanski, Forum, Issue in data writing speed 
> Francesco, when you say "writing an event is slow", do you mean it in the frontend
> or in the output data file?

Another explanation just occurred to me. We do not know your event size and we do not
know the size of your SYSTEM buffer. But if you have an unlucky combination,
this can happen:

Consider event size is 6 Mbytes, buffer size is 8 Mbytes, enough space for only 1 event.

First event is written quickly (buffer is empty).
Second event will be delayed, there is not enough free space in the buffer, we have
to wait for mlogger to finish reading the first event.

Same thing happens if event size is 3 Mbytes, the first 2 events will write quickly,
writing the 3rd event will be delayed until mlogger does it's thing.

The mlogger reads the SYSTEM buffer "fast" and "quickly", but it can be delayed
for a number of reasons, i.e. handling a history event, a delay writing to disk,
a delay writing to network connected storage, etc.

In general, it is best to size the SYSTEM buffer to hold about 1 second worth
of data (of average size, average rate). If your event size is 4 Mbytes, and you
record them at 10/sec, SYSTEM buffer should be at least 40 Mbytes big. (this is
set in ODB /Experiment/Buffer Sizes). (MIDAS event buffer size is limited to 2 GBytes).

K.O.
Entry  09 Nov 2021, Hunter Lowe, Forum, MityCAMAC Login  
Hello all,

I've recently acquired a MityCAMAC system that was built at TRIUMF and I'm 
having issues accessing it over ethernet.

The system: Ubuntu VM inside Windows 10 machine.

I've tried reconfiguring the network settings for the VM but nmap and arp/ip 
commands have yielded me no results in finding the crate controller. 

I was getting help from Pierre Amaudruz but I think he is now busy for some 
time. I have the mac address of the crate controller and its name. The 
controller seems to initialize fine inside of the CAMAC crate. The windows side 
of the workstation also tells me that an unknown network is in fact connected.

I suspect I either need to do something with an ssh key (which I thought we 
accomplished but maybe not), or perhaps the domain name in the controller needs 
to be changed. 

If anybody has experience working with MityARM I would appreciate any advice I 
could get.

Best,
Hunter Lowe
UNBC Graduate Physics
    Reply  11 Nov 2021, Thomas Lindner, Forum, MityCAMAC Login  
Hi Hunter 

This sounds like a Triumf specific problem; 
not a MIDAS problem.  Please email me directly 
and we can try to solve this problem.

Thomas Lindner
TRIUMF DAQ


 Hello all,
> 
> I've recently acquired a MityCAMAC system 
that was built at TRIUMF and I'm 
> having issues accessing it over ethernet.
> 
> The system: Ubuntu VM inside Windows 10 
machine.
> 
> I've tried reconfiguring the network 
settings for the VM but nmap and arp/ip 
> commands have yielded me no results in 
finding the crate controller. 
> 
> I was getting help from Pierre Amaudruz but 
I think he is now busy for some 
> time. I have the mac address of the crate 
controller and its name. The 
> controller seems to initialize fine inside 
of the CAMAC crate. The windows side 
> of the workstation also tells me that an 
unknown network is in fact connected.
> 
> I suspect I either need to do something with 
an ssh key (which I thought we 
> accomplished but maybe not), or perhaps the 
domain name in the controller needs 
> to be changed. 
> 
> If anybody has experience working with 
MityARM I would appreciate any advice I 
> could get.
> 
> Best,
> Hunter Lowe
> UNBC Graduate Physics
       Reply  26 Jan 2022, Konstantin Olchanski, Info, MityCAMAC Login  
For those curious about CAMAC controllers, this one was built around 2014 to 
replace the aging CAMAC A1/A2 controllers (parallel and serial) in the TRIUMF 
cyclotron controls system (around 50 CAMAC crates). It implements the main
and the auxiliary controller mode (single width and double width modules).

The design predates Altera Cyclone-5 SoC and has separate
ARM processor (TI 335x) and Cyclone-4 FPGA connected by GPMC bus.

ARM processor boots Linux kernel and CentOS-7 userland from an SD card,
FPGA boots from it's own EPCS flash.

User program running on the ARM processor (i.e. a MIDAS frontend)
initiates CAMAC operations, FPGA executes them. Quite simple.

K.O.
Entry  16 Dec 2021, Zaher Salman, Forum, Device driver for modbus 
Dear all, does anyone have an example of for a device driver using modbus or modbus tcp to communicate with a device and willing to share it? Thanks.
    Reply  26 Jan 2022, Konstantin Olchanski, Forum, Device driver for modbus 
> Dear all, does anyone have an example of for a device driver using modbus or modbus tcp to communicate with a device and willing to share it? Thanks.

I have not seen any modbus devices recently, so all my code and examples are quite old.

Basic modbus/tcp communication driver is in the midas repo:

daq00:midas$ find . | grep -i modbus
./drivers/divers/ModbusTcp.cxx
./drivers/divers/ModbusTcp.h
daq00:midas$ 

This driver worked for communication to a modbus PLC (T2K/ND280/TPC experiment in Japan).

An example program to use this driver and test modbus communication is here:
https://bitbucket.org/expalpha/agdaq/src/master/src/modbus.cxx

Because at the end, we do not have any modbus devices in any recent experiment,
I do not have any example of using this driver in the midas frontend. Sorry.

K.O.
Entry  19 Nov 2021, Jacob Thorne, Forum, Sequencer error with ODB Inc 
Hi,

I am having problems with the midas sequencer, here is my code:

1  COMMENT "Example to move a Standa stage"
2  RUNDESCRIPTION "Example movement sequence - each run is one position of a single stage
3  
4  PARAM numRuns
5  PARAM sequenceNumber
6  PARAM RunNum
7  
8  PARAM positionT2
9  PARAM deltapositionT2
10 
11 ODBSet "/Runinfo/Run number", $RunNum
12 ODBSet "/Runinfo/Sequence number", $sequenceNumber
13 
14 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Type of Measurement", 2
15 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Number of Time Bins", 10
16 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Number of Sweeps", 1
17 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Dwell Time", 100000
18 
19 ODBSet "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Device Driver/Set Position", $positionT2
20 
21 LOOP $numRuns
22  WAIT ODBvalue, "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Ready", ==, 1
23  TRANSITION START
24  WAIT ODBvalue, "/Equipment/Neutron Detector/Statistics/Events sent", >=, 1
25  WAIT ODBvalue, "/Runinfo/State", ==, 1
26  WAIT ODBvalue, "/Runinfo/Transition in progress", ==, 0
27  TRANSITION STOP
28  ODBInc "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Device Driver/Set Position", $deltapositionT2
29 
30 ENDLOOP
31 
32 ODBSet "/Runinfo/Sequence number", 0

The issue comes with line 28, the ODBInc does not work, regardless of what number I put I get the following error:

[Sequencer,ERROR] [odb.cxx:7046:db_set_data_index1,ERROR] "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Device Driver/Set Position" invalid element data size 32, expected 4

I don't see why this should happen, the format is correct and the number that I input is an int.

Sorry if this is a basic question.

Jacob
    Reply  02 Dec 2021, Stefan Ritt, Forum, Sequencer error with ODB Inc 
Thanks for reporting that bug. Indeed there was a problem in the sequencer code which I fixed now. Please try the updated develop branch.

Stefan
Entry  29 Oct 2021, Frederik Wauters, Bug Report, midas::odb::iterator + operator 
I have 16 array odb key

{"FIR Energy", {
            {"Energy Gap Value", std::array<uint32_t,16>(10) },

I can get the maximum of this array like 


uint32_t max_value = *std::max_element(values.begin(),values.end());

but when I need the maximum of a sub range

uint32_t max_value = *std::max_element(values.begin(),values.begin()+4);

I get

/home/labor/new_daq/frontends/SIS3316Module.cpp:584:62: error: no match for ‘operator+’ (operand types are ‘midas::odb::iterator’ and ‘int’)
  584 |   max_value = *std::max_element(values.begin(),values.begin()+4);
      |                                                ~~~~~~~~~~~~~~^~
      |                                                            |  |
      |                                                            |  int
      |               

As the + operator is overloaded for midas::odb::iterator, I was expected this to work.

(and yes, I can find the max element by accessing the elements on by one)
    Reply  29 Oct 2021, Frederik Wauters, Bug Report, midas::odb::iterator + operator | work around 
ok, so retrieving as a std::array (as it was defined) does not work

    std::array<uint32_t,16> avalues = settings["FIR Energy"]["Energy Gap Value"];

but retrieving as an std::vector does, and then I have a standard c++ iterator which I can use in std stuff

    std::vector<uint32_t> values = settings["FIR Energy"]["Energy Gap Value"];

> I have 16 array odb key
> 
> {"FIR Energy", {
>             {"Energy Gap Value", std::array<uint32_t,16>(10) },
> 
> I can get the maximum of this array like 
> 
> 
> uint32_t max_value = *std::max_element(values.begin(),values.end());
> 
> but when I need the maximum of a sub range
> 
> uint32_t max_value = *std::max_element(values.begin(),values.begin()+4);
> 
> I get
> 
> /home/labor/new_daq/frontends/SIS3316Module.cpp:584:62: error: no match for ‘operator+’ (operand types are ‘midas::odb::iterator’ and ‘int’)
>   584 |   max_value = *std::max_element(values.begin(),values.begin()+4);
>       |                                                ~~~~~~~~~~~~~~^~
>       |                                                            |  |
>       |                                                            |  int
>       |               
> 
> As the + operator is overloaded for midas::odb::iterator, I was expected this to work.
> 
> (and yes, I can find the max element by accessing the elements on by one)
Entry  25 Oct 2021, Francesco Renga, Forum, Logger crash 
Hello,
     I'm experiencing crashes of the mlogger program on the time scale of a couple 
of days. The only messages from MIDAS are:

05:34:47.336 2021/10/24 [mhttpd,INFO] Client 'Logger' (PID 14281) on database 
'ODB' removed by db_cleanup called by cm_periodic_tasks (idle 10.2s,TO 10s)
05:34:47.335 2021/10/24 [mhttpd,INFO] Client 'Logger' on buffer 'SYSMSG' removed 
by cm_periodic_tasks (idle 10.2s, timeout 10s)

Any suggestion to further investigate this issue?

Thank you very much,
         Francesco
    Reply  25 Oct 2021, Stefan Ritt, Forum, Logger crash 
The short term solution would be to increase the logger timeout in the ODB under

/Programs/Logger/Watchdog timeout

and set it to 6000 (one minute). But that is curing just the symptoms. It would be 
interesting to understand the cause of this error. Probably the logger takes more than 10 
seconds to start or stop the run. The reason could be that the history grow too big (what 
we have right now in MEG II), or some disk problems. But that needs detailed debugging on 
the logger side.

Stefan
Entry  14 Oct 2021, Amy Roberts, Suggestion, Adding (or improving discoverability) of TID for odbset 
Creating an ODB key requires users to know the Type ID that are defined in 
https://bitbucket.org/tmidas/midas/src/develop/include/midas.h starting at line 320.

I can't find any information on the Midas Wiki about these values or how to find 
them.

Am I missing something obvious?  Is there a way to improve how to find these values?  
Or is this not the best way to interact with the ODB?
    Reply  15 Oct 2021, Stefan Ritt, Suggestion, Adding (or improving discoverability) of TID for odbset 
> Creating an ODB key requires users to know the Type ID that are defined in 
> https://bitbucket.org/tmidas/midas/src/develop/include/midas.h starting at line 320.
> 
> I can't find any information on the Midas Wiki about these values or how to find 
> them.
> 
> Am I missing something obvious?  Is there a way to improve how to find these values?  
> Or is this not the best way to interact with the ODB?

Well, you found them in midas.h, so where is the problem?

If you want a more detailed description, just look in the midas documentation (RTFM):

https://midas.triumf.ca/MidasWiki/index.php/Midas_Data_Types

If you want a more modern interface to the ODB without these data types, look here:

https://midas.triumf.ca/MidasWiki/index.php/Odbxx

Best regards,
Stefan
Entry  11 Oct 2021, Konstantin Olchanski, Forum, midas forum updated, moved 
The midas forum software (elogd) was updated to latest version and moved from our old server 
(ladd00.triumf.ca) to our new server (daq00.triumf.ca).

The following URLs should work:

https://daq00.triumf.ca/elog-midas/Midas/ (new URL)
https://midas.triumf.ca/elog/Midas/ (old URL, redirects to daq00)
https://midas.triumf.ca/forum (link from midas wiki)

The configuration on the old server ladd00.triumf.ca is quite tangled between
several virtual hosts and several DNS CNAMEs. I think I got all the redirects
correct and all old URLs and links in old emails & etc still work.

If you see something wrong, please reply to this message here or email me directly.

K.O.
Entry  11 Oct 2021, Konstantin Olchanski, Forum, test 
test, no email. K.O.
    Reply  11 Oct 2021, Konstantin Olchanski, Forum, test 
> test, no email. K.O.

test reply, no email. K.O.
       Reply  11 Oct 2021, Konstantin Olchanski, Forum, test image.png
> > test, no email. K.O.
> 
> test reply, no email. K.O.

test attachment, no email. K.O.
          Reply  11 Oct 2021, Konstantin Olchanski, Forum, test 
> > > test, no email. K.O.
> > 
> > test reply, no email. K.O.
> 
> test attachment, no email. K.O.

test email. K.O.
Entry  11 Oct 2021, Stefan Ritt, Info, Modification in the history logging system 
A requested change in the history logging system has been made today. Previously, history values were
logged with a maximum frequency (usually once per second) but also with a minimum frequency, meaning
that values were logged for example every 60 seconds, even if they did not change. This causes a problem.
If a frontend is inactive or crashed which produces variables to be logged, one cannot distinguish between
a crashed or inactive frontend program or a history value which simply did not change much over time.
The history system was designed from the beginning in a way that values are only logged when they actually
change. This design pattern was broken since about spring 2021, see for example this issue:

https://bitbucket.org/tmidas/midas/issues/305/log_history_periodic-doesnt-account-for

Today I modified the history code to fix this issue. History logging is now controlled by the value of 
common/Log history in the following way:

* Common/Log history = 0 means no history logging
* Common/Log history = 1 means log whenever the value changes in the ODB
* Common/Log history = N means log whenever the value changes in the ODB and 
  the previous write was more than N seconds ago

So most experiments should be happy with 0 or 1. Only experiments which have fluctuating values due to noisy 
sensors might benefit from a value larger than 1 to limit the history logging. Anyhow this is not the preferred 
way to limit history logging. This should be done by the front-end limiting the updates to the ODB. Most of the 
midas slow control drivers have a “threshold” value. Only if the input changes by more then the threshold are 
written to the ODB. This allows a per-channel “dead band” and not a per-event limit on history logging 
as ‘log history’ would do. In addition, the threshold reduces the write accesses to the ODB, although that is
only important for very large experiments.

Stefan
Entry  29 Sep 2021, Richard Longland, Bug Report, nstall clash between MIDAS 2020-08 and mscb 
Thank you, Stefan.

I found these instructions under
1) The changelog: https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12
2) Konstantin's elog announcements (e.g. https://midas.triumf.ca/elog/Midas/2089)

I do see reference to updating the submodules under the TRIUMF install 
instructions 
(https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF#Inst
all_MIDAS) although perhaps it can be clarified.

Cheers,
Richard
    Reply  29 Sep 2021, Stefan Ritt, Bug Report, nstall clash between MIDAS 2020-08 and mscb 
> Thank you, Stefan.
> 
> I found these instructions under
> 1) The changelog: https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12
> 2) Konstantin's elog announcements (e.g. https://midas.triumf.ca/elog/Midas/2089)
> 
> I do see reference to updating the submodules under the TRIUMF install 
> instructions 
> (https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF#Inst
> all_MIDAS) although perhaps it can be clarified.
> 
> Cheers,
> Richard

Hi Richard,

I updated the documentation at

https://midas.triumf.ca/MidasWiki/index.php/Changelog#Updating_midas

by putting the submodule update command everywhere.

Best,
Stefan
Entry  28 Sep 2021, Richard Longland, Bug Report, Install clash between MIDAS 2020-08 and mscb 
All,

I am performing a fresh install of MIDAS on an Ubuntu linux box. I follow the 
usual installation procedure:

1) git clone https://bitbucket.org/tmidas/midas --recursive
2) cd midas
3) git checkout release/midas-2020-08
4) mkdir build
5) cd build
6) cmake ..
7) make

Step 3 warns me that 
"warning: unable to rmdir 'manalyzer': Directory not empty" and 
"warning: unable to rmdir 'midasio': Directory not empty"

Step 7 fails.
Compilation fails with an mhttp error related to mscb:
mhttpd.cxx:8224:59: error: too few arguments to function 'int mscb_ping(int, 
short unsigned int, int, int)'
 8224 |             status = mscb_ping(fd, (unsigned short) ind, 1);


I was able to get around this by rolling mscb back to some old version (commit 
74468dd), but am extremely nervous about mix-and-matching the code this way.

Any advice would be greatly appreciated.

Cheers,
Richard 
    Reply  28 Sep 2021, Stefan Ritt, Bug Report, Install clash between MIDAS 2020-08 and mscb 
> 1) git clone https://bitbucket.org/tmidas/midas --recursive
> 2) cd midas
> 3) git checkout release/midas-2020-08
> 4) mkdir build
> 5) cd build
> 6) cmake ..
> 7) make

When you do step 3), you get

~/tmp/midas$ git checkout release/midas-2020-08
warning: unable to rmdir 'manalyzer': Directory not empty
warning: unable to rmdir 'midasio': Directory not empty
M	mjson
M	mscb
M	mvodb
M	mxml

The 'M' in front of the submodules like mscb tell you that you
have an older version of midas (namely midas-2020-08), but the 
*current* submodules, which won't match. So you have to roll back
also the submodules with:

3.5) git submodule update --recursive

This fetched those versions of the submodules which match the
midas version 2020-08. See here for details: 

https://git-scm.com/book/en/v2/Git-Tools-Submodules

From where did you get the command

git checkout release/xxxx ???

If you tell me the location of that documentation, I will take
care that it will be amended with the command

git submodule update --recursive

Best,
Stefan
Entry  19 Sep 2021, Stefan Ritt, Bug Fix, Chat working again Screenshot_2021-09-19_at_21.27.19_.png
Not sure how many people are using it, but the Chat facility in midas was broken 
for some time now and got fixed today again.

Just for your information: Chat can be used like WhatsApp & Co, and connects all 
people who access a midas experiment through their browser. It's good to 
communicate between shift crew members located at different places. One advantage 
is that the chat messages can get 'spoken' by the text-to-speech engine of your 
browser, so it can be used to "wake up" shifters. Can be configured through the 
"Config" page.

Stefan
Entry  06 Sep 2021, Andreas Suter, Forum, mhttpd crash 
midas version used: midas-2019-05-cxx-1461-g906be8b

I find in the systemd log every couple of days/weeks the following error message related to the mhttpd:

[mhttpd,ERROR] [mhttpd.cxx:18886:on_work_complete,ERROR] Should not send response to request from socket 28 to socket 26, abort!

with various socket numbers of course.

Can anybody hint me what is going wrong here?

The bad thing on the crash is, that sometimes it is leading to a "chain-reaction" killing multiple midas frontends, which essentially stop the experiment.

Help would be very much appreciated!

Andreas 
    Reply  06 Sep 2021, Konstantin Olchanski, Forum, mhttpd crash 
> [mhttpd,ERROR] [mhttpd.cxx:18886:on_work_complete,ERROR] Should not send response to request from socket 28 to socket 26, abort!
> Can anybody hint me what is going wrong here?
> The bad thing on the crash is, that sometimes it is leading to a "chain-reaction" killing multiple midas frontends, which essentially stop the experiment.

This is my code. I am the culprit. I had a bit of discussion about this with Stefan.

Bottom line is something is rotten in the multithreading code inside mhttpd and under conditions unknown,
it sends the wrong data into the wrong socket. This causes midas web pages to be really confused (RPC replies
processed as CSS file, HTML code processed at RPC replies, a mess), this wrong data is cached by the browser,
so restarting mhttpd does not fix the web pages. So a mess.

I find this is impossible to replicate, and so cannot debug it, cannot fix it. Best I was able to do
is to add a check for socket numbers, and thankfully it catches the condition before web browser caches
become poisoned. So, broken web pages replaced by mhttpd crash.

This situation reinforces my opinion that multi-threading and C++ classes "do not mix" (like H2 and O2 do not mix).
If you write a multithreaded C++ program and it works, good for you, if there is a malfunction, good luck with it,
C++ just does not have any built-in support for debugging typical multithreading problems. I think others have come
to the same conclusion and invented all these new "safe" programming languages, like Rust and Go.

Back to your troubles.

1) If you see a way to replicate this crash, or some way to reliably cause
the crash within 5-10 minutes after starting mhttpd, please let me know. I can work with that
and I wish to fix this problem very much.

2) My "wrong socket" check calls abort() to produce a core dump. In my experience these core dumps
are useless for debugging the present problem. There is just no way to examine the state of each
thread and of each http request using gdb by hand.

3) this abort() causes linux to write a core dump, this takes a long time and I think it causes
other MIDAS program to stop, timeout and die. You can try to fix this by disabling core dumps (set "enable core dumps"
to "false" in ODB and set core dump size limit to 0), or change abort() to exit(). (You can also disable
the "wrong socket" check, but most likely you will not like the result).

4) run mhttpd inside a script: "while (1) { start mhttpd; sleep 1 sec; rinse, repeat; }" (run mhttpd without "-D", yes?)

In other news, the mongoose web server library have a new version available, they again changed their
multithreading scheme (I think it is an improvement). If I update mhttpd to this new version, it is very
likely the code with the "wrong socket" bug will be deleted. (with new bugs added to replace old bugs, of course).

K.O.
       Reply  07 Sep 2021, Andreas Suter, Forum, mhttpd crash 
Dear Konstantin,

thanks for the prompt response, this helps a lot!

> 1) If you see a way to replicate this crash, or some way to reliably cause
> the crash within 5-10 minutes after starting mhttpd, please let me know. I can work with that
> and I wish to fix this problem very much.

I wished I could! This happens 3-4 times per year only, so close to impossible to trigger.

> 2) My "wrong socket" check calls abort() to produce a core dump. In my experience these core dumps
> are useless for debugging the present problem. There is just no way to examine the state of each
> thread and of each http request using gdb by hand.
> 
> 3) this abort() causes linux to write a core dump, this takes a long time and I think it causes
> other MIDAS program to stop, timeout and die. You can try to fix this by disabling core dumps (set "enable core dumps"
> to "false" in ODB and set core dump size limit to 0), or change abort() to exit(). (You can also disable
> the "wrong socket" check, but most likely you will not like the result).
> 

I changed now to exit() rather than abort on the production machine. Perhaps this should be the default?

Andreas
          Reply  17 Sep 2021, Stefan Ritt, Forum, mhttpd crash mhttpdScreenshot_2021-09-17_at_21.11.15_.png
To limit the impact of the numerous crashes of mhttpd, I installed the monit tool at MEG at PSI 
(https://en.wikipedia.org/wiki/Monit). It monitors mhttpd, and if it cannot connect to it for a certain
time, it kills the process and restarts it. This covers endless loops, simple crashes (caused by the
known multi-threading issue in mongoose), and also cases where mhttpd develops a memory leak and becomes
unresponsive. 

To configure monit for mhttpd, first install the package, make sure the daemon gets started automatically
after reboot (typically "sysemctl enable monit"), and put the attached file into

/etc/monit.d/mhttpd

You have to adjust the <path-to-midas> according to your midas installation, and probably also the port
under which mhttpd is listening (8082 in my case). Put 

set daemon 10

into /etc/monitrc if you want monit to check mhttpd every 10 seconds (default is 30 seconds). Then, every
10 seconds monit request "midas.css" from mhttpd, and if it cannot obtain it after 30 seconds, it kills
mhttpd and restarts it.

Loading long history plots taking more than 30 seconds should probably not be an issue since mhttpd is 
multi-threaded, but I haven't tested this in detail.

Attached below is a typical status page produced by monit, which has its own built-in web server (normally
listening at port 2812, accessible only from localhost by default).

I hope this helps some of you.

Stefan
Entry  24 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
I am updating the history plots. Main changes:

- the old history display code should again be easily usable (use the "open in old history display" checkbox)
- the history plot editor has an "edit in ODB" button that takes as to the plot definition in ODB (sometimes it is 
easier to editing things in the ODB editor)
- error in history plot editor that created "formula" entry of incorrect size should be fixed
- "reorder" (and "delete entry") functions in the history plot editor should work again (plus added explanation text)
- "factor" and "offset" restored in the history plot editor
- added the long desired "voffset" to simplify plot scaling and positioning
- (factor, offset and voffset do not yet work in the new history plots, TBI ASAP)
- history plot editor and generate_hist_graph() now use the same code to read plot definitions from ODB. There should 
be no more confusion about content of history plot entries in ODB and what each entry is supposed to do.

These changes have been precipitated by our inability to plot high voltage voltage and current on the same plot,
see bug https://bitbucket.org/tmidas/midas/issues/308/history-plot-formula-cannot-be-used-to

Voltage is in the range 0..1000 (volts) and current is in the range 0..50 and 0..0.100, autoscaling on voltage
makes the currents invisible at the zero line. In the past, we used the "factor" setting to scale
the graphs so we can see both voltage and currents at the same time (currents scaled up by factor 25 and 600,
as example).

The new "formula" feature was supposed to replace (and improve upon) the "factor" and "offset". But if I use
the formula "x*25", suddenly the plot is telling us that current values are not 50 uA, but 1250 uA (50*25),
and this is just wrong. We do not want to scale the micro-amps, we want to better position the plot on the graph,
like the old "factor" and "offset" allowed us to do.

So the idea is to use this computation:

y_position_on_plot = offset + factor*(formula(history_value) - voffset)

- "formula" is to transform history values into physical values (i.e. pressure meter reports bars, but we want atm, or 
voltmeter is reading in discrete units of 0.125V, we want to see volts)
- "factor" and "offset" is to position the graphs on the plot for best visual presentation of data
- I also added is the much desired "voffset", you only know it is needed if you have a non-zero "offset" and you need 
to change the "factor", surprise, "offset" has ot be changed, too, and good luck recalculating it correctly in one 
try.

The way to use this stuff:
- adjust "voffset" to bring the graph to around y=0
- increase the "factor" to zoom-in on features and stuff
- adjust "offset" to move the graph up and down relative to all the other graphs on the plot
- now one can zoom in and out as needed by changing the "factor" and the plot will stay roughly in the right place 
without having to readjust the offsets.

K.O.
    Reply  24 Jun 2021, Stefan Ritt, Bug Fix, changes in history plots 
I disagree with the proposed change to scale the HV current for a "nice" display. If values are scaled, the axis should be 
scaled in the same way. Otherwise people might read the current from the plot, look at the axis, and again get the wrong 
value (the factor of 25x you mention). Sure you can hover with the cursor over the graph, and see the right value, but think 
of taking a screen shot, putting this into a publication, and get complaints from the reviewer.

The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a 
new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For 
the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.

Stefan
       Reply  25 Jun 2021, Marco Francesconi, Bug Fix, changes in history plots 
We are using the new history formula as a quick way to convert signals from sensors to actual physical values (for example Voltage->Temperature, Voltage->relative humidity 
...), so it is great that the shown voltage is the calculated one.

I would like to add a point to this discussion.
In our collaboration people attach images of history plots to elogs, meeting presentation and/or physical logbooks.
The proposed scaling formula may work fine online using the cursors, but, once an image is created, I do not understand how it is possible to extract the value for a scaled 
variables.
Suppose you see a graph in a presentation with a current increase by some PSU and the current was scaled to be in the same plot of the voltage.
Looking at the delta in the image, how can you judge the current increase without any axis/grid to refer to?

So I support Stefan proposal for a secondary axis, as long as it is clear which value belong to which axis.
Maybe marking the channels in the description or using different line styles/thickness?

Best,
Marco
          Reply  25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
I will have to post an example of a scaled plot. I figure everybody forgot how they look like.

K.O.


> We are using the new history formula as a quick way to convert signals from sensors to actual physical values (for example Voltage->Temperature, Voltage->relative humidity 
> ...), so it is great that the shown voltage is the calculated one.
> 
> I would like to add a point to this discussion.
> In our collaboration people attach images of history plots to elogs, meeting presentation and/or physical logbooks.
> The proposed scaling formula may work fine online using the cursors, but, once an image is created, I do not understand how it is possible to extract the value for a scaled 
> variables.
> Suppose you see a graph in a presentation with a current increase by some PSU and the current was scaled to be in the same plot of the voltage.
> Looking at the delta in the image, how can you judge the current increase without any axis/grid to refer to?
> 
> So I support Stefan proposal for a secondary axis, as long as it is clear which value belong to which axis.
> Maybe marking the channels in the description or using different line styles/thickness?
> 
> Best,
> Marco
       Reply  25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> I disagree ...

I am happy with disagreement and differences of opinions. Zest of life, driver of progress and improvements, etc.

I am even more happy with solutions to problems. The current problem is that the offset and factor feature
of history plots has been removed without much discussion.

I stress, we have been using this feature to run experiments for the last 20 years.

I do not understand objections to it being restored. If you do not want to use it, do not use it.

K.O.

> with the proposed change to scale the HV current for a "nice" display. If values are scaled, the axis should be 
> scaled in the same way. Otherwise people might read the current from the plot, look at the axis, and again get the wrong 
> value (the factor of 25x you mention). Sure you can hover with the cursor over the graph, and see the right value, but think 
> of taking a screen shot, putting this into a publication, and get complaints from the reviewer.
> 
> The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a 
> new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For 
> the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.
> 
> Stefan
          Reply  25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> > The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a 
> > new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For 
> > the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.

In the past, we have done some useful plots with maybe 10 variables plotted
at the same time with different scaling and positioning on the graph.

Having 2 vertical axis is maybe useful for the specific case of plotting high voltages,
but not in the general case.

Actually, just 2 vertical axis will not work to plot high voltages in ALPHA-g, because
we have anode currents on the scale 0..0.1 uA and cathode currents on the scale 50..60 uA.

K.O.
    Reply  25 Jun 2021, Stefan Ritt, Bug Fix, changes in history plots 
A general warning: With the recent history changes implemented in the develop branch, starting from a fresh ODB and editing 
any history panel, on gets tons of errors and debug output from mhttpd:

MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Log axis" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
Load from ODB History/Display/Default/Trigger rate: hist plot: 2 variables
timescale: 1h, minimum: 0.000000, maximum: 0.000000, zero_ylow: 0, log_axis: 0, show_run_markers: 1, show_values: 1, 
show_fill: 1
var[0] event [System][Trigger per sec.] formula [], colour [#00AAFF] label [] factor 1.000000 offset 0.000000 voffset 
0.000000 order 10
var[1] event [System][Trigger kB per sec.] formula [], colour [#FF9000] label [] factor 1.000000 offset 0.000000 voffset 
0.000000 order 20



This has to be fixed by the original author. I strongly recommend to make such modifications on a separate branch not to 
break running experiments.

Stefan
       Reply  25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> A general warning: With the recent history changes implemented in the develop branch, starting from a fresh ODB and editing 
> any history panel, on gets tons of errors and debug output from mhttpd: ...

This is the reason most projects have separate development and production branches.

I recommend everybody to use the released tagged versions of midas for production.

> I strongly recommend to make such modifications on a separate branch not to 
> break running experiments.

Is there something that does not work anymore? Did I break something? The debug messages I am still
tuning.

K.O.


> 
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Log axis" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
> Load from ODB History/Display/Default/Trigger rate: hist plot: 2 variables
> timescale: 1h, minimum: 0.000000, maximum: 0.000000, zero_ylow: 0, log_axis: 0, show_run_markers: 1, show_values: 1, 
> show_fill: 1
> var[0] event [System][Trigger per sec.] formula [], colour [#00AAFF] label [] factor 1.000000 offset 0.000000 voffset 
> 0.000000 order 10
> var[1] event [System][Trigger kB per sec.] formula [], colour [#FF9000] label [] factor 1.000000 offset 0.000000 voffset 
> 0.000000 order 20
> 
> 
> 
> This has to be fixed by the original author. I strongly recommend to make such modifications on a separate branch not to 
> break running experiments.
> 
> Stefan
    Reply  30 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> I am updating the history plots.
> So the idea is to use this computation:
> y_position_on_plot = offset + factor*(formula(history_value) - voffset)

Stefan and myself did some brain storming on zoom. Writing it down the way I remember it.

- we distilled the gist of the problem - the numerical values we show in the plot labels and in hover-over-the-graph
are before formula is applied or after the formula is applied?

- I suggested a universal solution using a double formula: use formula1 for one case;
  use formula2 for the other case;
  use formula1 for "physics calibration", use formula2 for factor and offset for composite plots:
     numeric_value = formula1(history_value)
     plotted_value = formula2(numeric_value)

- we agree that this is way too complicated, difficult to explain and difficult to coherently present in the history editor

- Stefan suggested a simple solution, a checkbox labeled "show raw value" next to each history variable. by default, the 
value after the formula is plotted and displayed. if checked, the raw value (before the formula) is displayed, and the 
value after the formula is plotted. (so this works the same as the factor and offset on the old history plots).

- if "show raw value" is enabled, the numerical values shown will be inconsistent against the labels on the vertical axis. 
Our solution it to turn the axis labels off. (for composite plots, like oscillator frequency in Hz vs oscillator 
temperature in degC, both scaled to see their correlation, the vertical axis is unit-less "arbitrary units", of course)

- to simplify migration of old history plots that use custom factor and offset settings, we think in the direction of 
automatically moving them to the "formula". (factor=2, offset=10 automatically populates formula with "2*x+10", "show raw 
value" checked/enabled). Thus we can avoid implementing factor and offset in the new history code (an unwelcome 
complication).

- I think this covers all the use cases I have seen in the past, so we will move in this direction.

K.O.
       Reply  14 Jul 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
Moving in the direction of this proposal. History plot editor is updated according to it. Remaining missing piece is the "show 
raw value" buttons and code behind them.

Changes:

- "show factor and offset" moved to the top of the page, "off" by default
- factor and offset (if not zero) are automatically migrated to the formula field (if it is empty), one needs to save the panel 
for this to take effect.

K.O.


> > I am updating the history plots.
> > So the idea is to use this computation:
> > y_position_on_plot = offset + factor*(formula(history_value) - voffset)
> 
> Stefan and myself did some brain storming on zoom. Writing it down the way I remember it.
> 
> - we distilled the gist of the problem - the numerical values we show in the plot labels and in hover-over-the-graph
> are before formula is applied or after the formula is applied?
> 
> - I suggested a universal solution using a double formula: use formula1 for one case;
>   use formula2 for the other case;
>   use formula1 for "physics calibration", use formula2 for factor and offset for composite plots:
>      numeric_value = formula1(history_value)
>      plotted_value = formula2(numeric_value)
> 
> - we agree that this is way too complicated, difficult to explain and difficult to coherently present in the history editor
> 
> - Stefan suggested a simple solution, a checkbox labeled "show raw value" next to each history variable. by default, the 
> value after the formula is plotted and displayed. if checked, the raw value (before the formula) is displayed, and the 
> value after the formula is plotted. (so this works the same as the factor and offset on the old history plots).
> 
> - if "show raw value" is enabled, the numerical values shown will be inconsistent against the labels on the vertical axis. 
> Our solution it to turn the axis labels off. (for composite plots, like oscillator frequency in Hz vs oscillator 
> temperature in degC, both scaled to see their correlation, the vertical axis is unit-less "arbitrary units", of course)
> 
> - to simplify migration of old history plots that use custom factor and offset settings, we think in the direction of 
> automatically moving them to the "formula". (factor=2, offset=10 automatically populates formula with "2*x+10", "show raw 
> value" checked/enabled). Thus we can avoid implementing factor and offset in the new history code (an unwelcome 
> complication).
> 
> - I think this covers all the use cases I have seen in the past, so we will move in this direction.
> 
> K.O.
          Reply  14 Jul 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> Moving in the direction of this proposal. Remaining missing piece is the "show 
> raw value" buttons and code behind them.

added "show raw value" button, updated on-page instructions.

I think this is the final layout of the history panel editor, conversion
to html+javascript will be done "as is". If you have suggestions to improve
the layout (add/remove/move things around, etc), please shoult out (on the elog
here or by direct email to me).

I am thinking in the direction of changing the control flow of the history editor:

- midas "history" manu button click redirects to
- current history panel selection (with checkbox to open old history plots), click on "new plot" button redirects to
- new page for creating new plots. this will present a list of all history variables, click on variable name creates a new history 
panel containing just this one variable and redirects to it.

In other words, to see the history for any history variable:
- click on "history" menu button
- click on "new"
- click on desired history variable
- see this history plot

From here, click on the "wheel" button to open the existing history panel editor and add any additional variables, change settings, 
etc.

In the history panel editor, I am thinking in the direction of replacing the existing drop-down selection of history variables (now 
very workable for large experiments) with an overlay dialog to show all history variables, with checkboxes to select them, basically 
the same history variable select page as described above. Not sure yet how this will work visually.

K.O.
             Reply  24 Aug 2021, Stefan Ritt, Bug Fix, changes in history plots 
One addition I would be in favour of is to remove the "Order" and replace it with drag&drop handles, because this is what people are more 
used to today. Only the old guys like us remember the /etc/init.d/xx_yy scheme where one uses an integer number in the file name to 
determine an order. 

See for example: https://jsbin.com/hijetos/edit?js,output

But instead of relying on a foreign library, I would rather implement that myself, since I need the same thing later for the to-be-
implemented ODB editor (next year? next lockdown?)

Stefan
Entry  19 Aug 2021, Konstantin Olchanski, Bug Report, select() FD_SETSIZE overrun 
I am looking at the mlogger in the ALPHA anti-hydrogen experiment at CERN. It is 
mysteriously misbehaving during run start and stop.

The problem turns out to be with the select() system call.

The corresponding FD_SET(), FD_ISSET() & co operate on a an array of fixed size 
FD_SETSIZE, value 1024, in my case. But the socket number is 1409, so we overrun 
the FD_SET() array. Ouch.

I see that all uses of select() in midas have no protection against this.

(we should probably move away from select() to newer poll() or whatever it is)

Why does mlogger open so many file descriptors? The usual, scaling problems in the 
history. The old midas history does not reuse file descriptors, so opens the same 
3 history files (.hst, .idx, etc) for each history event. The new FILE history 
opens just one file per history event. But if the number of events is bigger than 
1024, we run into same trouble.

(BTW, the system limit on file descriptors is 4096 on the affected machine, 1024 
on some other machines, see "limit" or "ulimit -a").

K.O.
    Reply  20 Aug 2021, Stefan Ritt, Bug Report, select() FD_SETSIZE overrun 
> I am looking at the mlogger in the ALPHA anti-hydrogen experiment at CERN. It is 
> mysteriously misbehaving during run start and stop.
> 
> The problem turns out to be with the select() system call.
> 
> The corresponding FD_SET(), FD_ISSET() & co operate on a an array of fixed size 
> FD_SETSIZE, value 1024, in my case. But the socket number is 1409, so we overrun 
> the FD_SET() array. Ouch.
> 
> I see that all uses of select() in midas have no protection against this.
> 
> (we should probably move away from select() to newer poll() or whatever it is)
> 
> Why does mlogger open so many file descriptors? The usual, scaling problems in the 
> history. The old midas history does not reuse file descriptors, so opens the same 
> 3 history files (.hst, .idx, etc) for each history event. The new FILE history 
> opens just one file per history event. But if the number of events is bigger than 
> 1024, we run into same trouble.
> 
> (BTW, the system limit on file descriptors is 4096 on the affected machine, 1024 
> on some other machines, see "limit" or "ulimit -a").
> 
> K.O.

I cannot imagine that you have more than 1024 different events in ALPHA. That wouldn't 
fit on your status page. 

I have some other suspicion: The logger opens a history file on access, then closes it 
again after writing to it. In the old days we had a case where we had a return from the 
write function BEFORE the file has been closed. This is kind of a memory leak, but with 
file descriptors. After some time of course you run out of file descriptors and crash. 
Now that bug has been fixed many years ago, but it sounds to me like there is another 
"fd leak" somewhere. You should add some debugging in the history code to print the 
file descriptors when you open a file and when you leave that routine. The leak could 
however also be somewhere else, like writing to the message file, ODB dump, ...

The right thing of course would be to rewrite everything with std::ofstream which 
closes automatically the file when the object gets out of scope.

Stefan
Entry  12 May 2021, Mathieu Guigue, Bug Report, mhttpd WebServer ODBTree initialization 
Hi,

Using midas version 12-2020,  I am trying to run mhttpd from within a docker container using docker-compose.
Starting from an empty ODB, I simply run `mhttpd` and this is the output I have:
midas_hatfe_1  | <Warning> Starting mhttpd...
midas_hatfe_1  | [mhttpd,INFO] ODB subtree /Runinfo corrected successfully
midas_hatfe_1  | MVOdb::SetMidasStatus: Error: MIDAS db_find_key() at ODB path "/WebServer/Host list" returned status 312
midas_hatfe_1  | Mongoose web server will not use password protection
midas_hatfe_1  | Mongoose web server will not use the hostlist, connections from anywhere will be accepted
midas_hatfe_1  | Mongoose web server listening on http address "localhost:8080", passwords OFF, hostlist OFF
midas_hatfe_1  | [mhttpd,ERROR] [mhttpd.cxx:19160:mongoose_listen,ERROR] Cannot mg_bind address "[::1]:8080"

According to the documentation, the WebServer tree should be created automatically when starting the mhttpd; but it seems not as it doesn't find the entry "/WebServer/Host list".
If I create it by end (using "create STRING /WebServer/Host list"), I still get the error message that mhttpd didn't bind properly to the local port 8080.
I am not sure what it wrong, as mhttpd is working perfectly well in this exact container for midas 03-2020.

Any idea what difference makes it not possible anymore to run into these container?

Thanks very much for your help.
Cheers
Mathieu
    Reply  12 May 2021, Ben Smith, Bug Report, mhttpd WebServer ODBTree initialization 
> midas_hatfe_1  | Mongoose web server listening on http address "localhost:8080", passwords OFF, hostlist OFF
> midas_hatfe_1  | [mhttpd,ERROR] [mhttpd.cxx:19160:mongoose_listen,ERROR] Cannot mg_bind address "[::1]:8080"

It looks like mhttpd managed to bind to the IPv4 address (localhost), but not the IPv6 address (::1). If you don't need it, try setting "/Webserver/Enable IPv6" to false.
       Reply  12 May 2021, Stefan Ritt, Bug Report, mhttpd WebServer ODBTree initialization 
> It looks like mhttpd managed to bind to the IPv4 address (localhost), but not the IPv6 address (::1). If you don't need it, try setting "/Webserver/Enable IPv6" to false.

We had this issue already several times. This info should be put into the documentation at a prominent location.

Stefan
          Reply  13 May 2021, Mathieu Guigue, Bug Report, mhttpd WebServer ODBTree initialization 
> > It looks like mhttpd managed to bind to the IPv4 address (localhost), but not the IPv6 address (::1). If you don't need it, try setting "/Webserver/Enable IPv6" to false.
> 
> We had this issue already several times. This info should be put into the documentation at a prominent location.
> 
> Stefan

Thanks a lot, this solved my issue!
             Reply  14 May 2021, Stefan Ritt, Bug Report, mhttpd WebServer ODBTree initialization 
> Thanks a lot, this solved my issue!

... or we should turn IPv6 off by default, since not many people use this right now.
                Reply  02 Jun 2021, Konstantin Olchanski, Bug Report, mhttpd WebServer ODBTree initialization 
> > Thanks a lot, this solved my issue!
> 
> ... or we should turn IPv6 off by default, since not many people use this right now.

IPv6 certainly works and is used at CERN.

But I am not sure why people see this message. I do not see it on any machines at 
TRIUMF, even those with IPv6 turned off.

K.O.
                   Reply  05 Aug 2021, Stefan Ritt, Bug Report, mhttpd WebServer ODBTree initialization 
Well, we all see it here at PSI, so this is enough reason to turn this off by default. Shall 
I do it?
Entry  04 Jun 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
Hi,

if I check out midas and try to configure it with 

cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas

I do get the error messages:

  Target "midas" INTERFACE_INCLUDE_DIRECTORIES property contains path:

    "<path>/tmidas/midas/include"

  which is prefixed in the source directory.

Is the cmake setup not relocatable? This is new and was working until recently:

MIDAS version:      2.1
GIT revision:       Thu May 27 12:56:06 2021 +0000 - midas-2020-08-a-295-gfd314ca8-dirty on branch HEAD
ODB version:        3
    Reply  04 Jun 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas

good timing, I am working on cmake for manalyzer and rootana and I have not tested
the install prefix business.

now I know to test it for all 3 packages.

I will also change find_package(Midas) slightly, (see my other message here),
I hope you can confirm that I do not break it for you.

K.O.
    Reply  04 Jun 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> Is the cmake setup not relocatable? This is new and was working until recently:

Indeed. Not relocatable. This is because we do not install the header files.

When you use the CMAKE_INSTALL_PREFIX, you get MIDAS "installed" in:

prefix/lib
prefix/bin
$MIDASSYS/include <-- this is the source tree and so not "relocatable"!

Before, this was kludged and cmake did not complain about it.

Now I changed cmake to handle the include path "the cmake way", and now it knows to complain about it.

I am not sure how to fix this: we have a conflict between:

- our normal way of using midas (include $MIDASSYS/include, link $MIDASSYS/lib, run $MIDASSYS/bin)
- the cmake way (packages *must be installed* or else! but I do like install(EXPORT)!)
- and your way (midas include files are in $MIDASSYS/include, everything else is in your special location)

I think your case is strange. I am curious why you want midas libraries to be in prefix/lib instead of in 
$MIDASSYS/lib (in the source tree), but are happy with header files remaining in the source tree.

K.O.
       Reply  04 Jun 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > Is the cmake setup not relocatable? This is new and was working until recently:
> 
> Indeed. Not relocatable. This is because we do not install the header files.
> 
> When you use the CMAKE_INSTALL_PREFIX, you get MIDAS "installed" in:
> 
> prefix/lib
> prefix/bin
> $MIDASSYS/include <-- this is the source tree and so not "relocatable"!
> 
> Before, this was kludged and cmake did not complain about it.
> 
> Now I changed cmake to handle the include path "the cmake way", and now it knows to complain about it.
> 
> I am not sure how to fix this: we have a conflict between:
> 
> - our normal way of using midas (include $MIDASSYS/include, link $MIDASSYS/lib, run $MIDASSYS/bin)
> - the cmake way (packages *must be installed* or else! but I do like install(EXPORT)!)
> - and your way (midas include files are in $MIDASSYS/include, everything else is in your special location)
> 
> I think your case is strange. I am curious why you want midas libraries to be in prefix/lib instead of in 
> $MIDASSYS/lib (in the source tree), but are happy with header files remaining in the source tree.
> 
> K.O.

We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 
If I think an all other packages I am working with, e.g. ROOT, the includes are also installed under CMAKE_INSTALL_PREFIX. 
Up until recently there was no issue to work with CMAKE_INSTALL_PREFIX, accepting that the includes stay under 
$MIDASSYS/include, even though this is not quite the standard way, but no problem here. Anyway, since CMAKE_INSTALL_PREFIX 
is a standard option from cmake, I think things should not "break" if you want to use it.

A.S.
          Reply  08 Jun 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > Is the cmake setup not relocatable? This is new and was working until recently:
> > Not relocatable. This is because we do not install the header files.
> 
> We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 

hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
mean by this? you install midas in a secret location to prevent somebody from linking to it?

> If I think an all other packages I am working with, e.g. ROOT, the includes are also installed under CMAKE_INSTALL_PREFIX.

cmake and other frameworks tend to be like procrustean beds (https://en.wikipedia.org/wiki/Procrustes),
pre-cmake packages never quite fit perfectly, and either the legs or the heads get cut off. post-cmake packages
are constructed to fit the bed, whether it makes sense or not.

given how this situation is known since antiquity, I doubt we will solve it today here.

(I exercise my freedom of speech rights to state that I object being put into
such situations. And I would like to have it clear that I hate cmake (ask me why)).

>
> Up until recently there was no issue to work with CMAKE_INSTALL_PREFIX, accepting that the includes stay under 
> $MIDASSYS/include, even though this is not quite the standard way, but no problem here.
>

I think a solution would be to add install rules for include files. There will be a bit of trouble,
normal include path is $MIDASSYS/include,$MIDASSYS/mxml,$MIDASSYS/mjson,etc, after installing
it will be $CMAKE_INSTALL_PREFIX/include (all header files from different git submodules all
dumped into one directory). I do not know what problems will show up from that.

I think if midas is used as a subproject of a bigger project, this is pretty much required
(and I have seen big experiments, like STAR and ND280, do this type of stuff with CMT,
another horror and the historical precursor of cmake)

The problem is that we do not have any super-project like this here, so I cannot ever
be sure that I have done everything correctly. cmake itself can be helpful, like
in the current situation where it told us about a problem. but I will never trust
cmake completely, I see cmake do crazy and unreasonable things way too often.

One solution would be for you or somebody else to contribute such a cmake super-project,
that would build midas as a subproject, install it with a CMAKE_INSTALL_PREFIX and
try to link some trivial frontend or analyzer to check that everything is installed
correctly. It would become an example for "how to use midas as a subproject").
Ideally, it should be usable in a bitbucket automatic build (assuming bitbucket
has correct versions of cmake, which it does not half the time).

P.S. I already spent half-a-week tinkering with cmake rules, only to discover
that I broke a kludge that allows you to do something strange (if I have it right,
the CMAKE_PREFIX_INSTALL code is your contribution). This does not encourage
 me to tinker with cmake even more. who knows against what other
kludge I bump into. (oh, yes, I know, I already bumped into the nonsense
find_package(Midas) implementation).

K.O.
             Reply  09 Jun 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > > Is the cmake setup not relocatable? This is new and was working until recently:
> > > Not relocatable. This is because we do not install the header files.
> > 
> > We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 
> 
> hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
> mean by this? you install midas in a secret location to prevent somebody from linking to it?
> 

This was a wrong wording from my side. We do not want the the users have write access to the midas installation libs and bins.
I have submitted the pull request which should resolve this without interfere with your usage.
Hope this will resolve the issue.
                Reply  10 Jun 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > > > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > > > Is the cmake setup not relocatable? This is new and was working until recently:
> > > > Not relocatable. This is because we do not install the header files.
> > > 
> > > We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 
> > 
> > hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
> > mean by this? you install midas in a secret location to prevent somebody from linking to it?
> > 
> 
> This was a wrong wording from my side. We do not want the the users have write access to the midas installation libs and bins.
> I have submitted the pull request which should resolve this without interfere with your usage.
> Hope this will resolve the issue.

Excellent. I think it is good to have midas "install" in a sane manner.

But I still struggle to understand what you do. Presumably you can "install" midas
in the "midas account", which is not writable by the experiment and user accounts.
Then it does not matter if you "install" it in it's build directory (like we do)
or in some other location (like you do now).

This does not work of course if you only have one account, so do you build midas
as root? or install it as root?

I do ask because in the current computing world, doing things as root requires
a certain amount of trust, which may not be there anymore, see the recent "supply side" attacks
against python packages, solar winds hack, linux kernel malicious patches from umn, etc.

Personally, I do not want to answer questions "is midas safe to run as root?",
"can I trust the midas install scripts to run as root?" and certainly I do not want to hear
about "I installed midas and 100 other packages as root and got hacked 7 days later".

(and running midas as root was never safe. neither mhttpd nor mserver will pass
a security audit).

Anyhow, looks like I will look at cmake again next week. Right now I have a major
breakthrough in the ALPHA-g experiment, my big 96-port Juniper switch suddenly
has working ethernet flow control and I can record data at 600 Mbytes/sec without
any UDP packet loss. Above that, my event builder explodes. I want to fix it and get
it up to 1000 Mbytes/sec, the limit of my 10gige network link. (In this system I do not
have the disk subsystem to record data at this rate, but I have build 8-disk ZFS arrays
that would sink it, no problem). And the day has come when I ran out of CPU cores.
The UDP packet receivers are multithreaded, the event builder is multithreaded and I am using
all 4 of the available cores (intel cpu). As soon as I can get a rackmounted AMD Ryzen
or Threadripper machine, we will likely upgrade. (need at least one more CPU core to run
the online analyzer!). Exciting.

K.O.
                   Reply  10 Jun 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > > > > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > > > > Is the cmake setup not relocatable? This is new and was working until recently:
> > > > > Not relocatable. This is because we do not install the header files.
> > > > 
> > > > We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 
> > > 
> > > hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
> > > mean by this? you install midas in a secret location to prevent somebody from linking to it?
> > > 
> > 
> > This was a wrong wording from my side. We do not want the the users have write access to the midas installation libs and bins.
> > I have submitted the pull request which should resolve this without interfere with your usage.
> > Hope this will resolve the issue.
> 
> Excellent. I think it is good to have midas "install" in a sane manner.
> 
> But I still struggle to understand what you do. Presumably you can "install" midas
> in the "midas account", which is not writable by the experiment and user accounts.
> Then it does not matter if you "install" it in it's build directory (like we do)
> or in some other location (like you do now).
> 
> This does not work of course if you only have one account, so do you build midas
> as root? or install it as root?
> 

We work the following way: there is a production Midas under let's say /usr/local/midas (make install as sudo/root). This is for the running experiment. Since we are doing muSR, we 
have experiments on a daily base, rather than month and years as it is the case for a particle physics experiment. Now, still we would like to test updates, new features of Midas on 
the same machine. For this we us the repo directly. If we are happy with the new feature, and fixes, we again do a 'make install' and hence freeze for the production a specific 
snapshot. Of course we could use various local copies of the Midas repo, but over the last years this approach was very convenient and productive. Hope this explains a bit better 
why we want to work with a CMAKE_INSTALL_PREFIX.

AS
                      Reply  11 Jul 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
big thanks to Andreas S. for getting most of this figured out. I now understand
much better how cmake installs things and how it generates config files, both
find_package(midas) style and install(export) style.

with the latest updates, CMAKE_INSTALL_PREFIX should work correctly. I now understand how it works,
how to use it and how to test it, it should not break again.

for posterity, my commends to Andreas's pull request:

thank you for providing this code, it was very helpful. at the end I implemented things slightly differently. It took me a while to understand that I have to provide 2 “install” modes, for your case, I need to 
“install” the header files and everything works “the cmake way”, for our normal case, we use include files in-place and have to include all the git submodules to the include path. I am quite happy with the 
result. K.O.

K.O.
                         Reply  02 Aug 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
Dear Konstantin,

I have tried your adopted version. You did already quite a job which is more consistent than what I was suggesting.
Yet, I still have a problem (git sha2 2d3872dfd31) when starting on a clean system (i.e. no midas present yet): 
Without CMAKE_INSTALL_PREFIX set, everything is fine. 
However, when setting CMAKE_INSTALL_PREFIX, I get the following error message on the build level (cmake --build ./ -- VERBOSE=1) from the manalyzer:

[ 32%] Building CXX object manalyzer/CMakeFiles/manalyzer.dir/manalyzer.cxx.o
cd /home/l_musr_tst/Tmp/midas/build/manalyzer && /usr/bin/c++  -DHAVE_FTPLIB -DHAVE_MIDAS -DHAVE_ROOT_HTTP -DHAVE_THTTP_SERVER -DHAVE_TMFE -DHAVE_ZLIB -D_LARGEFILE64_SOURCE -I/home/l_musr_tst/Tmp/midas/manalyzer -I/usr/local/root/include  -O2 -g -Wall -Wformat=2 -Wno-format-nonliteral -Wno-strict-aliasing -Wuninitialized -Wno-unused-function -std=c++11 -pipe -fsigned-char -pthread -DHAVE_ROOT -std=gnu++11 -o CMakeFiles/manalyzer.dir/manalyzer.cxx.o -c /home/l_musr_tst/Tmp/midas/manalyzer/manalyzer.cxx
In file included from /home/l_musr_tst/Tmp/midas/manalyzer/manalyzer.cxx:14:0:
/home/l_musr_tst/Tmp/midas/manalyzer/manalyzer.h:13:21: fatal error: midasio.h: No such file or directory
 #include "midasio.h"
                     ^
compilation terminated.

Obviously, still some include paths are missing. I tried quickly to see if an easy fix is possible, but I failed.

Question: is it possible to use manalyzer without midas? I am asking since the MIDAS_FOUND flag is confusing me.

> big thanks to Andreas S. for getting most of this figured out. I now understand
> much better how cmake installs things and how it generates config files, both
> find_package(midas) style and install(export) style.
> 
> with the latest updates, CMAKE_INSTALL_PREFIX should work correctly. I now understand how it works,
> how to use it and how to test it, it should not break again.
> 
> for posterity, my commends to Andreas's pull request:
> 
> thank you for providing this code, it was very helpful. at the end I implemented things slightly differently. It took me a while to understand that I have to provide 2 “install” modes, for your case, I need to 
> “install” the header files and everything works “the cmake way”, for our normal case, we use include files in-place and have to include all the git submodules to the include path. I am quite happy with the 
> result. K.O.
> 
> K.O.
Entry  31 Jul 2021, Peter Kunz, Bug Report, ss_shm_name: unsupported shared memory type, bye! 
I ran into a problem trying to compile the latest MIDAS version on a Fedora 
system.

mhttpd and odbedit return:
ss_shm_name: unsupported shared memory type, bye!

check_shm_type: preferred POSIXv4_SHM got SYSV_SHM

The check returns SYSV_SHM which doesn't seem to be supported in ss_shm_name.

Is there an easy solution for this?

Thanks.
Entry  09 Jul 2021, Konstantin Olchanski, Bug Report, cmake question 
cmake check and mate in 1 move. please help.

the midas cmake file has a typo in the ROOT_CXX_FLAGS, I fixed it and now I am dead in the 
water, need help from cmake experts and pushers.

On Ubuntu:
ROOT_CXX_FLAGS has -std=c++14
midas cmake defines -std=gnu++11 (never mind that I asked for c++11, not "c++11 with GNU 
extensions")

the two compiler flags collide and the build explodes, the best I can tell c++11 prevails 
and ROOT header files blow up because they expect c++14.

if I remove the midas cmake request for c++11, -std=gnu++11 is gone, there is no conflict 
with ROOT C++14 request and the build works just fine.

but now it explodes on CentOS-7 because by default, c++11 is not enabled. (include <mutex> 
blows up).

what a mess.

K.O.
    Reply  13 Jul 2021, Konstantin Olchanski, Bug Report, cmake question 
> cmake check and mate in 1 move. please help.
> -std=c++11 and -std=c++14 collision...

I have a solution implemented for this, I am not happy with it, Stefan is not happy with it. See 
discussion: https://bitbucket.org/tmidas/midas/commits/50a15aa70a4fe3927764605e8964b55a3bb1732b

K.O.
       Reply  14 Jul 2021, Konstantin Olchanski, Bug Report, cmake question 
> > cmake check and mate in 1 move. please help.
> > -std=c++11 and -std=c++14 collision...
> 
> I have a solution implemented for this, I am not happy with it, Stefan is not happy with it. See 
> discussion: https://bitbucket.org/tmidas/midas/commits/50a15aa70a4fe3927764605e8964b55a3bb1732b
>

I figured it out, solution is to use:

target_compile_features(midas PUBLIC cxx_std_11)

this is how it works:

- centos-7 (g++ has c++11 off by default): -std=gnu++11 is added automatically (not -std=c++11, but 
probably correct, as some c++11 functions were available as gnu extensions)
- ubuntu-20.04 LTS without ROOT: nothing added (I guess correct, g++ has c++11 is enabled by default)
- ubuntu-20.04 LTS with -std=c++14 from ROOT: nothing added, c++14 as requested by ROOT is in affect.
- macos without ROOT: -std=gnu++11 is added automatically
- macos with -std=c++11 from ROOT: ditto, so both -std=c++11 and -std=gnu++11 are present in this order, 
wrong-ish, but works.

and good luck figuring this out just from cmake documentation:
https://cmake.org/cmake/help/latest/command/target_compile_features.html

K.O.
Entry  10 Aug 2020, Mathieu Guigue, Info, MidasConfig.cmake usage 
As the Midas software is installed using CMake, it can be easily integrated into 
other CMake projects using the MidasConfig.cmake file produced during the Midas 
installation.

This file points to the location of the include and libraries of Midas using three 
variables:
- MIDAS_INCLUDE_DIRS
- MIDAS_LIBRARY_DIRS
- MIDAS_LIBRARIES

Then the CMakeLists file of the new project can use the CMake find_package 
functionalities like:
```
find_package (Midas REQUIRED)
if (MIDAS_FOUND)
    MESSAGE(STATUS "Found midas: libraries ${MIDAS_LIBRARIES}")
    pbuilder_add_ext_libraries (${MIDAS_LIBRARIES})
else (MIDAS_FOUND)
    message(FATAL "Unable to find midas")
endif (MIDAS_FOUND)
include_directories (${MIDAS_INCLUDE_DIR})
```
pbuilder_add_ext_libraries is a CMake macro allowing to automatically add the 
libraries into the project: this macro can be found here: 
https://github.com/project8/scarab/blob/master/cmake/PackageBuilder.cmake
If such macro doesn't exist, the linkage to each executable/library can be done 
similarly to https://midas.triumf.ca/elog/Midas/1964 using: 

```
target_link_libraries(crfe ${MIDAS_LIBARIES} ${LIBS})
```

The current version of the MidasConfig.cmake is minimal and could for example 
include a version number: this would allow to define a e.g. minimal version of 
Midas needed by the new project.
    Reply  28 May 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
How does "find_package (Midas REQUIRED)" find the location of MIDAS?

The best I can tell from the current code, the package config files are installed
inside $MIDASSYS somewhere and I see "find_package MIDAS" never find them (indeed,
find_package() does not know about $MIDASSYS, so it has to use telepathy or something).

Does anybody actually use "find_package(midas)", does it actually work for anybody?

Also it appears that "the cmake way" of importing packages is to use
the install(EXPORT) method.

In this scheme, the user package does this:

include(${MIDASSYS}/lib/midas-targets.cmake)
target_link_libraries(myprogram PUBLIC midas)

this causes all the midas include directories (including mxml, etc)
and dependancy libraries (-lutil, -lpthread, etc) to be automatically
added to "myprogram" compilation and linking.

of course MIDAS has to generate a sensible targets export file,
working on it now.

K.O.
       Reply  28 May 2021, Marius Koeppel, Info, MidasConfig.cmake usage 
> Does anybody actually use "find_package(midas)", does it actually work for anybody?

What we do is to include midas as a submodule and than we call find_package:

    add_subdirectory(midas)
    list(APPEND CMAKE_PREFIX_PATH ${CMAKE_CURRENT_SOURCE_DIR}/midas)
    find_package(Midas REQUIRED)

For us it works fine like this but we kind of always compile Midas fresh and don't use a version on our system (keeping the newest version). 

Without the find_package the build does not work for us.
          Reply  28 May 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> > Does anybody actually use "find_package(midas)", does it actually work for anybody?
> 
> What we do is to include midas as a submodule and than we call find_package:
> 
>     add_subdirectory(midas)
>     list(APPEND CMAKE_PREFIX_PATH ${CMAKE_CURRENT_SOURCE_DIR}/midas)
>     find_package(Midas REQUIRED)
> 
> For us it works fine like this but we kind of always compile Midas fresh and don't use a version on our system (keeping the newest version). 
> 
> Without the find_package the build does not work for us.

Ok, I see. I now think that for us, this "find_package" business an unnecessary complication:

since one has to know where midas is in order to add it to CMAKE_PREFIX_PATH,
one might as well import the midas targets directly by include(.../midas/lib/midas-targets.cmake).

From what I see now, the cmake file is much simplifed by converting
it from "find_package(midas)" style MIDAS_INCLUDES & co to more cmake-ish
target_link_libraries(myexe midas) - all the compiler switches, include paths,
dependant libraires and gunk are handled by cmake automatically.

I am not touching the "find_package(midas)" business, so it should continue to work, then.

K.O.
             Reply  31 May 2021, Stefan Ritt, Info, MidasConfig.cmake usage 
MidasConfig.cmake might at some point get included in the standard Cmake installation (or some add-on). It will then reside in the Cmake system path 
and you don't have to explicitly know where this is. Just the find_package(Midas) will then be enough. 

Even if it's not there, the find_package() is the "traditional" way CMake discovers external packages and users are used to that (like ROOT does the 
same). In comparison, your "midas-targets.cmake" way of doing things, although this works certainly fine, is not the "standard" way, but a midas-
specific solution, other people have to learn extra.

Stefan
                Reply  02 Jun 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> MidasConfig.cmake might at some point get included in the standard Cmake installation (or some add-on). It will then reside in the Cmake system path 
> and you don't have to explicitly know where this is. Just the find_package(Midas) will then be enough.

Hi, Stefan, can you say more about this? If MidasConfig.cmake is part of the cmake distribution,
(did I understand you right here?) and is installed into a system-wide directory,
how can it know to use midas from /home/agmini/packages/midas or from /home/olchansk/git/midas?

Certainly we do not do system wide install of midas (into /usr/local/bin or whatever) because
typically different experiments running on the same computer use different versions of midas.

For ROOT, it looks as if for find_package(ROOT) to work, one has to add $ROOTSYS to the Cmake package
search path. This is what we do in our cmake build.

As for find_package() vs install(EXPORT), we may have the same situation as with my "make cmake",
where my one line solution is no good for people who prefer to type 3 lines of commands.

Specifically, the install(EXPORT) method defines the "midas" target which brings with it
all it's dependent include paths, libraries and compile flags. So to link midas you need
two lines:

include(.../midas/lib/midas-targets.cmake)
target_link_libraries(myexe midas)
target_link_libraries(myfrontend mfe)

whereas find_package() defines a bunch of variables (the best I can tell) and one has
to add them to the include paths and library paths and compile flags "by hand".

I do not know how find_package() handles the separate libmidas, libmfe and librmana. (and
the separate libmanalyzer and libmanalyzer_main).

K.O.
                   Reply  04 Jun 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> find_package(Midas)

I am testing find_package(Midas). There is a number of problems:

1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".

This seem to be an incomplete list of all libraries build by midas (rmana is missing).

This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):

- we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
- midas-c-compat is for building python interfaces, not for linking midas programs
- mfe contains a main() function, it will collide with the user main() function

So I think this should be changed to just "midas" and midas linking dependancy
libraries (-lutil, -lrt, -lpthread) should also be added to this list.

Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)

2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories

Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.

Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.

I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
this second method seems to be much simpler, everything is exported automatically into one file,
and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).

So how much time should I spend in fixing find_package(Midas) to make it generally usable?

- include path is incomplete
- library list is nonsense
- compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
- dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)

K.O.
                      Reply  04 Jun 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> > find_package(Midas)
> 
> So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> 
> - include path is incomplete
> - library list is nonsense
> - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> 

I think I give up on find_package(Midas). It seems like a lot of work to straighten
all this out, when install(EXPORT) does it all automatically and is easier to use
for building user frontends and analyzers.

K.O.
                      Reply  20 Jun 2021, Lukas Gerritzen, Suggestion, MidasConfig.cmake usage 
I agree that those two things are problems, but I don't see why it is preferable to leave the MidasConfig.cmake in this "broken" state. For us 
problem 1 is less of an issue, becaues we run "link_directories(${MIDAS_LIBRARY_DIRS})" in the top CMakeLists.txt and then just link against "midas", 
not "${MIDAS_LIBRARIES}". However, number 2 would be nice, to not manually hack in target_include_directories(target ${MIDASSYS}/mscb/include), 
especially because ${MIDASSYS} is not set in cmake. 

I see two solutions for problem 2: Treat mscb as a submodule and compile and install it together with midas, or add the include directory to 
${MIDAS_INCLUDE_DIRS} (same applies to the other submodules, mscb is the one that made me open this elog just now)

Cheers
Lukas
 
> > find_package(Midas)
> 
> I am testing find_package(Midas). There is a number of problems:
> 
> 1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".
> 
> This seem to be an incomplete list of all libraries build by midas (rmana is missing).
> 
> This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):
> 
> - we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
> - midas-c-compat is for building python interfaces, not for linking midas programs
> - mfe contains a main() function, it will collide with the user main() function
> 
> So I think this should be changed to just "midas" and midas linking dependancy
> libraries (-lutil, -lrt, -lpthread) should also be added to this list.
> 
> Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)
> 
> 2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories
> 
> Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.
> 
> Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
> of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.
> 
> I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
> this second method seems to be much simpler, everything is exported automatically into one file,
> and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).
> 
> So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> 
> - include path is incomplete
> - library list is nonsense
> - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> 
> K.O.
                         Reply  20 Jun 2021, Konstantin Olchanski, Suggestion, MidasConfig.cmake usage 
> I agree that those two things are problems, but I don't see why it is preferable to leave the MidasConfig.cmake in this "broken" state. For us 
> problem 1 is less of an issue, becaues we run "link_directories(${MIDAS_LIBRARY_DIRS})" in the top CMakeLists.txt and then just link against "midas", 
> not "${MIDAS_LIBRARIES}". However, number 2 would be nice, to not manually hack in target_include_directories(target ${MIDASSYS}/mscb/include), 
> especially because ${MIDASSYS} is not set in cmake.

So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.

Problem still remains with required auxiliary libraries for linking "-lmidas". Sometimes you
need "-lutil" and "-lrt" and "-lpthread", sometimes not. Some way to pass this information
automatically would be nice.

Problem still remains that I cannot do these changes because I have no test harness
for any of this. Would be great if you could contribute this and post the documentation
blurb that we can paste into the midas wiki documentation.

And I still do not understand why we have to do all this work when cmake "import(EXPORT)"
already does all of this automatically. What am I missing?

K.O.

> 
> I see two solutions for problem 2: Treat mscb as a submodule and compile and install it together with midas, or add the include directory to 
> ${MIDAS_INCLUDE_DIRS} (same applies to the other submodules, mscb is the one that made me open this elog just now)
> 
> Cheers
> Lukas
>  
> > > find_package(Midas)
> > 
> > I am testing find_package(Midas). There is a number of problems:
> > 
> > 1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".
> > 
> > This seem to be an incomplete list of all libraries build by midas (rmana is missing).
> > 
> > This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):
> > 
> > - we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
> > - midas-c-compat is for building python interfaces, not for linking midas programs
> > - mfe contains a main() function, it will collide with the user main() function
> > 
> > So I think this should be changed to just "midas" and midas linking dependancy
> > libraries (-lutil, -lrt, -lpthread) should also be added to this list.
> > 
> > Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)
> > 
> > 2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories
> > 
> > Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.
> > 
> > Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
> > of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.
> > 
> > I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
> > this second method seems to be much simpler, everything is exported automatically into one file,
> > and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).
> > 
> > So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> > 
> > - include path is incomplete
> > - library list is nonsense
> > - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> > - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> > 
> > K.O.
                            Reply  22 Jun 2021, Lukas Gerritzen, Suggestion, MidasConfig.cmake usage 
> So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.

A more moderate option would be to remove mfe from ${MIDAS_LIBRARIES}, but as far as I understand mfe is not the only problem, so nuking might be the 
better option after all. In addition, setting ${MIDASSYS} in MidasConfig.cmake would probably improve compatibility.

>Sometimes you need "-lutil" and "-lrt" and "-lpthread", sometimes not. 
>Some way to pass this information automatically would be nice.

I do not properly understand when you need this and when not, but can't this be communicated with the PUBLIC keyword of target_link_libraries()? If I 
understand if we can use PUBLIC for -lutil, -lrt and -lpthread, I can write something, test it here and create a pull request.

> And I still do not understand why we have to do all this work when cmake "import(EXPORT)"
> already does all of this automatically. What am I missing?

Does this not require midas to be built every time you import it? I know, it's a bit the "billions of flies can't be wrong" argument, but I've never seen 
any package that uses import(EXPORT) over find_package().

> > I agree that those two things are problems, but I don't see why it is preferable to leave the MidasConfig.cmake in this "broken" state. For us 
> > problem 1 is less of an issue, becaues we run "link_directories(${MIDAS_LIBRARY_DIRS})" in the top CMakeLists.txt and then just link against "midas", 
> > not "${MIDAS_LIBRARIES}". However, number 2 would be nice, to not manually hack in target_include_directories(target ${MIDASSYS}/mscb/include), 
> > especially because ${MIDASSYS} is not set in cmake.
> 
> So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.
> 
> Problem still remains with required auxiliary libraries for linking "-lmidas". Sometimes you
> need "-lutil" and "-lrt" and "-lpthread", sometimes not. Some way to pass this information
> automatically would be nice.
> 
> Problem still remains that I cannot do these changes because I have no test harness
> for any of this. Would be great if you could contribute this and post the documentation
> blurb that we can paste into the midas wiki documentation.
> 
> And I still do not understand why we have to do all this work when cmake "import(EXPORT)"
> already does all of this automatically. What am I missing?
> 
> K.O.
> 
> > 
> > I see two solutions for problem 2: Treat mscb as a submodule and compile and install it together with midas, or add the include directory to 
> > ${MIDAS_INCLUDE_DIRS} (same applies to the other submodules, mscb is the one that made me open this elog just now)
> > 
> > Cheers
> > Lukas
> >  
> > > > find_package(Midas)
> > > 
> > > I am testing find_package(Midas). There is a number of problems:
> > > 
> > > 1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".
> > > 
> > > This seem to be an incomplete list of all libraries build by midas (rmana is missing).
> > > 
> > > This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):
> > > 
> > > - we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
> > > - midas-c-compat is for building python interfaces, not for linking midas programs
> > > - mfe contains a main() function, it will collide with the user main() function
> > > 
> > > So I think this should be changed to just "midas" and midas linking dependancy
> > > libraries (-lutil, -lrt, -lpthread) should also be added to this list.
> > > 
> > > Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)
> > > 
> > > 2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories
> > > 
> > > Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.
> > > 
> > > Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
> > > of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.
> > > 
> > > I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
> > > this second method seems to be much simpler, everything is exported automatically into one file,
> > > and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).
> > > 
> > > So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> > > 
> > > - include path is incomplete
> > > - library list is nonsense
> > > - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> > > - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> > > 
> > > K.O.
                               Reply  24 Jun 2021, Konstantin Olchanski, Suggestion, MidasConfig.cmake usage 
> > So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.
> A more moderate option ...

For the record, I did not disappear. I have a very short time window
to complete commissioning the alpha-g daq (now that the network
and the event builder are cooperating). To add to the fun, our high voltage
power supply turned into a pumpkin, so plotting voltages and currents 
on the same history plot at the same time (like we used to be able to do)
went up in priority. 

K.O.
                                  Reply  11 Jul 2021, Konstantin Olchanski, Suggestion, MidasConfig.cmake usage 
> > > So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.
> > A more moderate option ...
> 
> For the record, I did not disappear. I have a very short time window
> to complete commissioning the alpha-g daq (now that the network
> and the event builder are cooperating). To add to the fun, our high voltage
> power supply turned into a pumpkin, so plotting voltages and currents 
> on the same history plot at the same time (like we used to be able to do)
> went up in priority. 
> 

in the latest update, find_package(midas) should work correctly, the include path is right, 
the library list is right.

please test.

I find that the cmake install(export) method is simpler on the user side (just one line of 
code) and is easier to support on the midas side (config file is auto-generated).

I request that proponents of the find_package(midas) method contribute the documentation and 
example on how to use it. (see my other message).

K.O.
    Reply  13 Jul 2021, Stefan Ritt, Info, MidasConfig.cmake usage 
Thanks for the contribution of MidasConfig.cmake. May I kindly ask for one extension:

Many of our frontends require inclusion of some midas-supplied drivers and libraries 
residing under

$MIDASSYS/drivers/class/
$MIDASSYS/drivers/device
$MIDASSYS/mscb/src/
$MIDASSYS/src/mfe.cxx

I guess this can be easily added by defining a MIDAS_SOURCES in MidasConfig.cmake, so 
that I can do things like:

add_executable(my_fe
  myfe.cxx
  $(MIDAS_SOURCES}/src/mfe.cxx
  ${MIDAS_SOURCES}/drivers/class/hv.cxx
  ...)

Does this make sense or is there a more elegant way for that?

Stefan
       Reply  13 Jul 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> $MIDASSYS/drivers/class/
> $MIDASSYS/drivers/device
> $MIDASSYS/mscb/src/
> $MIDASSYS/src/mfe.cxx
> 
> I guess this can be easily added by defining a MIDAS_SOURCES in MidasConfig.cmake, so 
> that I can do things like:
> 
> add_executable(my_fe
>   myfe.cxx
>   $(MIDAS_SOURCES}/src/mfe.cxx
>   ${MIDAS_SOURCES}/drivers/class/hv.cxx
>   ...)

1) remove $(MIDAS_SOURCES}/src/mfe.cxx from "add_executable", add "mfe" to 
target_link_libraries() as in examples/experiment/frontend:

add_executable(frontend frontend.cxx)
target_link_libraries(frontend mfe midas)

2) ${MIDAS_SOURCES}/drivers/class/hv.cxx surely is ${MIDASSYS}/drivers/...

If MIDAS is built with non-default CMAKE_INSTALL_PREFIX, "drivers" and co are not 
available, as we do not "install" them. Where MIDASSYS should point in this case is
anybody's guess. To run MIDAS, $MIDASSYS/resources is needed, but we do not install
them either, so they are not available under CMAKE_INSTALL_PREFIX and setting
MIDASSYS to same place as CMAKE_INSTALL_PREFIX would not work.

I still think this whole business of installing into non-default CMAKE_INSTALL_PREFIX
location has not been thought through well enough. Too much thinking about how cmake works
and not enough thinking about how MIDAS works and how MIDAS is used. Good example
of "my tool is a hammer, everything else must have the shape of a nail".

K.O.
Entry  11 Jul 2021, Konstantin Olchanski, Info, midas cmake update 
I reworked the midas cmake files:
- install via CMAKE_INSTALL_PREFIX should work correctly now:
- installed are bin, lib and include - everything needed to build against the midas library
- if built without CMAKE_INSTALL_PREFIX, a special mode "MIDAS_NO_INSTALL_INCLUDE_FILES" is activated, and the include path 
contains all the subdirectories need for compilation
- -I$MIDASSYS/include and -L$MIDASSYS/lib -lmidas work in both cases
- to "use" midas, I recommend: include($ENV{MIDASSYS}/lib/midas-targets.cmake)
- config files generated for find_package(midas) now have correct information (a manually constructed subset of information 
automatically exported by cmake's install(export))
- people who want to use "find_package(midas)" will have to contribute documentation on how to use it (explain the magic used to 
find the "right midas" in /usr/local/midas or in /midas or in ~/packages/midas or in ~/pacjages/new-midas) and contribute an 
example superproject that shows how to use it and that can be run from the bitpucket automatic build. (features that are not part 
of the automatic build we cannot insure against breakage).

On my side, here is an example of using include($ENV{MIDASSYS}/lib/midas-targets.cmake). I posted this before, it is used in 
midas/examples/experiment and I will ask ben to include it into the midas wiki documentation.

Below is the complete cmake file for building the alpha-g event bnuilder and main control frontend. When presented like this, I 
have to agree that cmake does provide positive value to the user. (the jury is still out whether it balances out against the 
negative value in the extra work to "just support find_package(midas) already!").

#
# CMakeLists.txt for alpha-g frontends
#

cmake_minimum_required(VERSION 3.12)
project(agdaq_frontends)

include($ENV{MIDASSYS}/lib/midas-targets.cmake)

add_compile_options("-O2")
add_compile_options("-g")
#add_compile_options("-std=c++11")
add_compile_options(-Wall -Wformat=2 -Wno-format-nonliteral -Wno-strict-aliasing -Wuninitialized -Wno-unused-function)
add_compile_options("-DTMFE_REV0")
add_compile_options("-DOS_LINUX")

add_executable(feevb feevb.cxx TsSync.cxx)
target_link_libraries(feevb midas)

add_executable(fectrl fectrl.cxx GrifComm.cxx EsperComm.cxx JsonTo.cxx KOtcp.cxx $ENV{MIDASSYS}/src/tmfe_rev0.cxx)
target_link_libraries(fectrl midas)

#end
Entry  09 Jul 2021, Konstantin Olchanski, Info, cannot push to bitbucket 
the day has arrived when I cannot git push to bitbucket. cloud computing rules!

I have never seen this error before and I do not think we have any hooks installed,
so it must be some bitbucket stuff. their status page says some kind of maintenance
is happening, but the promised error message is "repository is read only" or something 
similar.

I hope this clears out automatically. I am updating all the cmake crud and I have no idea 
which changes I already pushed and which I did not, so no idea if anything will work for 
people who pull from midas until this problem is cleared out.

daq00:mvodb$ git push
X11 forwarding request failed on channel 0
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Delta compression using up to 12 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (2/2), 247 bytes | 247.00 KiB/s, done.
Total 2 (delta 1), reused 0 (delta 0)
remote: null value in column "attempts" violates not-null constraint
remote: DETAIL:  Failing row contains (13586899, 2021-07-10 01:13:28.812076+00, 1970-01-01 
00:00:00+00, 1970-01-01 00:00:00+00, 65975727, null).
To bitbucket.org:tmidas/mvodb.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'git@bitbucket.org:tmidas/mvodb.git'
daq00:mvodb$

K.O.
Entry  08 Jul 2021, Francesco Renga, Forum, Problem with python file reader 
Dear experts,
       while trying to readout a MIDAS file from a python script. I get the error below at the very first event. Any hint?

Thank you very much,
            Francesco

  File "/home/cygno/DAQ/offline/file_reader.py", line 9, in <module>
    for event in mfile:
  File "/home/cygno/DAQ/python/midas/file_reader.py", line 159, in __next__
    ev = self.read_next_event()
  File "/home/cygno/DAQ/python/midas/file_reader.py", line 264, in read_next_event
    return self.read_this_event_body()
  File "/home/cygno/DAQ/python/midas/file_reader.py", line 307, in read_this_event_body
    self.event.unpack_body(body_data, 0, self.use_numpy)
  File "/home/cygno/DAQ/python/midas/event.py", line 648, in unpack_body
    bank.fill_header_from_bytes(bank_header_data, self.is_bank_32(), self.is_bank_data_64bit_aligned())
  File "/home/cygno/DAQ/python/midas/event.py", line 298, in fill_header_from_bytes
    self.name = "".join(x.decode('ascii') for x in unpacked[:4])
  File "/home/cygno/DAQ/python/midas/event.py", line 298, in <genexpr>
    self.name = "".join(x.decode('ascii') for x in unpacked[:4])
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc8 in position 0: ordinal not in range(128)
    Reply  09 Jul 2021, Ben Smith, Forum, Problem with python file reader 
Hi Francesco,

Can you send me an example file to look at please? Either attached to the elog or sent directly to bsmith@triumf.ca

Thanks,
Ben
Entry  29 Jun 2021, Lukas Gerritzen, Bug Report, modbcheckbox behaves erroneous with UINT32 variables 
For boolean and INT32 variables, modbcheckbox works as expected. You click, it 
sets the variable to true or 1, the checkbox stays checked until you click again 
and it's being set back to 0.

For UINT32 variables, you can turn the variable "on", but the checkbox visually 
becomes unchecked immediately. Clicking again does not set the variable to 
0/false and the tick visually appears for a fraction of a second, but vanishes 
again.
    Reply  30 Jun 2021, Stefan Ritt, Bug Report, modbcheckbox behaves erroneous with UINT32 variables 
> For boolean and INT32 variables, modbcheckbox works as expected. You click, it 
> sets the variable to true or 1, the checkbox stays checked until you click again 
> and it's being set back to 0.
> 
> For UINT32 variables, you can turn the variable "on", but the checkbox visually 
> becomes unchecked immediately. Clicking again does not set the variable to 
> 0/false and the tick visually appears for a fraction of a second, but vanishes 
> again.

Thanks for reporting that bug. Fixed in

https://bitbucket.org/tmidas/midas/commits/4ef26bdc5a32716efe8e8f0e9ce328bafad6a7bf

Stefan
       Reply  30 Jun 2021, Lukas Gerritzen, Bug Report, modbcheckbox behaves erroneous with UINT32 variables 
Thanks for the quick fix.
Entry  28 Jun 2021, Marco Francesconi, Suggestion, ODB Load in Sequencer 
Hi all,
for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
Let me know if you think this is a good approach.

Marco F
    Reply  28 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer 
> Hi all,
> for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> Let me know if you think this is a good approach.
> 
> Marco F

How can people judge your modification if they cannot see it? Why don't you make a pull request, so it can be properly reviewed.

Stefan
    Reply  28 Jun 2021, Konstantin Olchanski, Suggestion, ODB Load in Sequencer 
> Hi all,
> for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> Let me know if you think this is a good approach.
> 

Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
should be named "ODBLoadJSON" to be clear about this.

(JSON is preferred over .odb and .xml for many reasons (ask me))

K.O.
       Reply  28 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer 
> > Hi all,
> > for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> > The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> > I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> > Let me know if you think this is a good approach.
> > 
> 
> Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
> should be named "ODBLoadJSON" to be clear about this.
> 
> (JSON is preferred over .odb and .xml for many reasons (ask me))

What if some experiment keep some files in .xml format (ask me!). The routine should check for the extension and support all three formats.

Stefan
          Reply  28 Jun 2021, Konstantin Olchanski, Suggestion, ODB Load in Sequencer 
> > > Hi all,
> > > for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> > > The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> > > I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> > > Let me know if you think this is a good approach.
> > > 
> > 
> > Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
> > should be named "ODBLoadJSON" to be clear about this.
> > 
> > (JSON is preferred over .odb and .xml for many reasons (ask me))
> 
> What if some experiment keep some files in .xml format (ask me!). The routine should check for the extension and support all three formats.
> 

Yes, hard to tell without seeing his full proposal, including the code. If it is load from file,
sure we look at the file extension, I think the existing code already would do this and support all 3 formats.

But if he wants to load ODB data from a text literal or from a string,
we might as well stick to json. I guess we could support the other formats, but I do not see anybody
using anything other than json for new code like this.

ODBPasteJSON("/foo/bar/baz", '{"var1":1, "var2":"somestr"}');

K.O.
             Reply  28 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer 
> > > > Hi all,
> > > > for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> > > > The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> > > > I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> > > > Let me know if you think this is a good approach.
> > > > 
> > > 
> > > Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
> > > should be named "ODBLoadJSON" to be clear about this.
> > > 
> > > (JSON is preferred over .odb and .xml for many reasons (ask me))
> > 
> > What if some experiment keep some files in .xml format (ask me!). The routine should check for the extension and support all three formats.
> > 
> 
> Yes, hard to tell without seeing his full proposal, including the code. If it is load from file,
> sure we look at the file extension, I think the existing code already would do this and support all 3 formats.
> 
> But if he wants to load ODB data from a text literal or from a string,
> we might as well stick to json. I guess we could support the other formats, but I do not see anybody
> using anything other than json for new code like this.
> 
> ODBPasteJSON("/foo/bar/baz", '{"var1":1, "var2":"somestr"}');

I agree that if one would paste a string to the ODB, then JSON would be best.

But at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.

Stefan
                Reply  28 Jun 2021, Konstantin Olchanski, Suggestion, ODB Load in Sequencer 
> ... at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.

same here, lots of historical .odb and .xml files.

I think the .odb and .xml support is here to stay. Best I remember, latest things I fixed in both
was support for unlimited string length (and removal of associated buffer overflows). Right now,
I am not sure if both are UTF-8 clean and if they properly escape all control characters,
something to fix as we go or as we bump into problems.

K.O.
                   Reply  28 Jun 2021, Marco Francesconi, Suggestion, ODB Load in Sequencer 
My idea was to collect some feedback instead of blindly submitting code for a pull request.

Currently I'm just calling db_load() with a given file, so it is only supporting .odb formatting.
It is pretty easy to extend to json by calling the db_load_json() depending on the file extension.
I do not see a similar call for the .xml format, maybe I can study tomorrow how it is implemented in odbedit and port it to the sequencer.

I guess that the ODBPasteJSON can be a solution as well but I find it a bit too technical.
Anyway it is easy to implement just by calling db_paste_json(), I will keep this in mind.

I'll try to sort this out and make a commit soon.
Best,

Marco



> > ... at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.
> 
> same here, lots of historical .odb and .xml files.
> 
> I think the .odb and .xml support is here to stay. Best I remember, latest things I fixed in both
> was support for unlimited string length (and removal of associated buffer overflows). Right now,
> I am not sure if both are UTF-8 clean and if they properly escape all control characters,
> something to fix as we go or as we bump into problems.
> 
> K.O.
                      Reply  29 Jun 2021, Marco Francesconi, Suggestion, ODB Load in Sequencer 
I just submitted a pull request for this feature, I did quite a lot of testing and it looks good to me.
Let me know if something is not clear.

I'll take care of adding the relevant informations to the wiki once it is merged.
Best,

Marco


> My idea was to collect some feedback instead of blindly submitting code for a pull request.
> 
> Currently I'm just calling db_load() with a given file, so it is only supporting .odb formatting.
> It is pretty easy to extend to json by calling the db_load_json() depending on the file extension.
> I do not see a similar call for the .xml format, maybe I can study tomorrow how it is implemented in odbedit and port it to the sequencer.
> 
> I guess that the ODBPasteJSON can be a solution as well but I find it a bit too technical.
> Anyway it is easy to implement just by calling db_paste_json(), I will keep this in mind.
> 
> I'll try to sort this out and make a commit soon.
> Best,
> 
> Marco
> 
> 
> 
> > > ... at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.
> > 
> > same here, lots of historical .odb and .xml files.
> > 
> > I think the .odb and .xml support is here to stay. Best I remember, latest things I fixed in both
> > was support for unlimited string length (and removal of associated buffer overflows). Right now,
> > I am not sure if both are UTF-8 clean and if they properly escape all control characters,
> > something to fix as we go or as we bump into problems.
> > 
> > K.O.
                         Reply  30 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer 
I quickly checked the pull request and could not find any obvious problem, so I merged it.
Entry  18 Jun 2021, Konstantin Olchanski, Bug Report, my html modbvalue thing is not working? 
I have a web page and I try to use modbvalue, but nothing happens. The best I can tell, I follow the documentation 
(https://midas.triumf.ca/MidasWiki/index.php/Custom_Page#modbvalue).

<td id=setv0><div class="modbvalue" data-odb-path="/Equipment/CAEN_hvps01/Settings/VSET[0]" data-odb-editable="1">(ch0)</div></td>

I suppose I could add debug logging to the javascript framework for modbvalue to find out why it is not seeing
or how it is not liking my web page.

But how would a non-expert user (or an expert user in a hurry) would debug this?

Should the modbvalue framework log more error messages to the javascrpt console ("I am ignoring your modbvalue entry because...")?

Should it have a debug mode where it reports to the javascript console all the tags it scanned, all the tags it found, etc
to give me some clue why it does not find my modbvalue tag?

Right now I am not even sure if this framework is activated, perhaps I did something wrong in how I load the page
and the modbvalue framework is not loaded. The documentation gives some magic incantations but does not explain
where and how this framework is loaded and activated. (But I do not see any differences between my page and
the example in the documentation. Except that I do not load control.js, I do not need all the thermometer bars, etc.
If I do load it, still my modbvalue does not work).

K.O.
    Reply  25 Jun 2021, Stefan Ritt, Bug Report, my html modbvalue thing is not working? 
Can you post your complete page here so that I can have a look?

Stefan
Entry  21 Jun 2021, Lars Martin, Bug Report, ELog documentation inconsistency 
The documentation fro the Elog ODB tree here:
https://midas.triumf.ca/MidasWiki/index.php//Elog_ODB_tree#Url

says:

The Built-in elog will ignore this key.

If using an Built-in Elog, this key must NOT be present.

I assume this is an artifact from amending the documentation, but it's unclear if 
the key has to be removed or not. I.e. if the key exists and is empty, will the 
built-in elog work? In what way will it break?
Entry  17 Jun 2021, Joseph McKenna, Info, Add support for rtsp camera streams in mlogger (history_image.cxx) unnamed.png
mlogger (history_image) now supports rtsp cameras, in ALPHA we have 
acquisitioned several new network connected cameras. Unfortunately they dont 
have a way of just capturing a single frame using libcurl


========================================
Motivation to link to OpenCV libraries
========================================

After looking at the ffmpeg libraries, it seemed non trivial to use them to 
listen to a rtsp stream and write a series of jpgs.

OpenCV became an obvious choice (it is itself linked to ffmpeg and 
gstreamer), its a popular, multiplatform, open source library that's easy to 
use. It is available in the default package managers in centos 7 and ubuntu 
(an is installed by default on lxplus).

========================================
How it works:
========================================

The framework laid out in history_image.cxx is great. A separate thread is 
dedicated for each camera. This is continued with the rtsp support, using 
the same periodicity:

if (ss_time() >= o["Last fetch"] + o["Period"]) {
An rtsp camera is detected by its URL, if the URL starts with ‘rtsp://’ its 
obvious its using the rtsp protocol and the cv::VideoCapture object is 
created (line 147).

If the connection fails, it will continue to retry, but only send an error 
message on the first 10 attempts (line 150). This counter is reset on 
successful connection
If MIDAS has been built without OpenCV, mlogger will send an error message 
that OpenCV is required if a rtsp URL is given (line 166)
The VideoCapture ‘stays live' and will grab frames from the camera based on 
the sleep, saving to file based on the Period set in the ODB.

If the VideoCapture object is unable to grab a frame, it will release() the 
camera, send an error message to MIDAS, then destroy itself, and create a 
new version (this destroy and create fully resets the connection to a 
camera, required if its on flaky wifi)
If the VideoCapture gets an empty frame, it also follows the same reset 
steps.
If the VideoCaption fills a cv::Frame object successfully, the image is 
saved to disk in the same way as the curl tools.

========================================
Concerns for the future:
========================================

VideoCapture is decoding the video stream in the background, allowing us to 
grab frames at will. This is nice as we can be pretty agnostic to the video 
format in the stream (I tested with h264 from a TP-LINK TAPO C100, but the 
CPU usage is not negligible.

I noticed that this used ~2% of the CPU time on an intel i7-4770 CPU, given 
enough cameras this is considerable. In ALPHA, I have been testing with 10 
cameras:
elog:2220/1 

My suggestion / request would be to move the camera management out of 
mlogger and into a new program (mcamera?), so that users can choose to off 
load the CPU load to another system (I understand the OpenCV will use GPU 
decoders if available also, which can also lighten the CPU load).
    Reply  18 Jun 2021, Konstantin Olchanski, Info, Add support for rtsp camera streams in mlogger (history_image.cxx) 
> mlogger (history_image) now supports rtsp cameras

my goodness, we will drive the video surveillance industry out of business.

> My suggestion / request would be to move the camera management out of 
> mlogger and into a new program (mcamera?), so that users can choose to off 
> load the CPU load to another system (I understand the OpenCV will use GPU 
> decoders if available also, which can also lighten the CPU load).

every 2 years I itch to separate mlogger into two parts - data logger
and history logger.

but then I remember that the "I" in MIDAS stands for "integrated",
and "M" stands for "maximum" and I say, "nah..."

(I guess we are not maximum integrated enough to have mhttpd, mserver
and mlogger to be one monolithic executable).

There is also a line of thinking that mlogger should remain single-threaded
for maximum reliability and ease of debugging. So if we keep adding multithreaded
stuff to it, perhaps it should be split-apart after all. (anything that makes
the size of mlogger.cxx smaller is a good thing, imo).

K.O.
Entry  15 Jun 2021, Konstantin Olchanski, Info, 1000 Mbytes/sec through midas achieved! 
I am sure everybody else has 10gige and 40gige networks and are sending terabytes of data before breakfast.

Myself, I only have one computer with a 10gige network link and sufficient number of daq boards to fill
it with data. Here is my success story of getting all this data through MIDAS.

This is the anti-matter experiment ALPHA-g now under final assembly at CERN. The main particle detector is a long but 
thin cylindrical TPC. It surrounds the magnetic bottle (particle trap) where we make and study anti-hydrogen. There are 
64 daq boards to read the TPC cathode pads and 8 daq boards to read the anode wires and to form the trigger. Each daq 
board can produce data at 80-90 Mbytes/sec (1gige links). Data is sent as UDP packets (no jumbo frames). Altera FPGA 
firmware was done here at TRIUMF by Bryerton Shaw, Chris Pearson, Yair Lynn and myself.

Network interconnect is a 96-port Juniper switch with a 10gige uplink to the main daq computer (quad core Intel(R) 
Xeon(R) CPU E3-1245 v6 @ 3.70GHz, 64 GBytes of DDR4 memory).

MIDAS data path is: UDP packet receiver frontend -> event builder -> mlogger -> disk -> lazylogger -> CERN EOS cloud 
storage.

First chore was to get all the UDP packets into the main computer. "U" in UDP stands for "unreliable", and at first, UDP 
packets have been disappearing pretty much anywhere they could. To fix this, in order:

- reading from the udp socket must be done in a dedicated thread (in the midas context, pauses to write statistics or 
check alarms result in lost udp packets)
- udp socket buffer has to be very big
- maximum queue sizes must be enabled in the 10gige NIC
- ethernet flow control must be enabled on the 10gige link
- ethernet flow control must be enabled in the switch (to my surprise many switches do not have working end-to-end 
ethernet flow control and lose UDP packets, ask me about this. our big juniper switch balked at first, but I got it 
working eventually).
- ethernet flow control must be enabled on the 1gige links to each daq module
- ethernet flow control must be enabled in the FPGA firmware (it's a checkbox in qsys)
- FPGA firmware internally must have working back pressure and flow control (avalon and axi buses)
- ideally, this back-pressure should feed back to the trigger. ALPHA-g does not have this (it does not need it).

Next chore was to multithread the UDP receiver frontend and to multithread the event builder. Stock single-threaded 
programs quickly max out with 100% CPU use and reach nowhere near 10gige data speeds.

Naive multithreading, with two threads, reader (read UDP packet, lock a mutex, put it into a deque, unlock, repeat) and 
sender (lock a mutex, get a packet from deque, unlock, bm_send_event(), repeat) spends all it's time locking and 
unlocking the mutex and goes nowhere fast (with 1500 byte packets, about 600 kHz of lock/unlock at 10gige speed).

So one has to do everything in batches: reader thread: accumulate 1000 udp packets in an std::vector, lock the mutex, 
dump this batch into a deque, unlock, repeat; sender thread: lock mutex, get 1000 packets from the deque, unlock, stuff 
the 1000 packets into 1 midas event, bm_send_event(), repeat.

It takes me 5 of these multithreaded udp reader frontends to keep up with a 10gige link without dropping any UDP packets. 
My first implementation chewed up 500% CPU, that's all of it, there is only 4 CPU cores available, leaving nothing
for the event builder (and mlogger, and ...)

I had to:
a) switch from plain socket read() to socket recvmmsg() - 100000 udp packets per syscall vs 1 packet per syscall, and
b) switch from plain bm_send_event() to bm_send_event_sg() - using a scatter-gather list to avoid a memcpy() of each udp 
packet into one big midas event.

Next is the event builder.

The event builder needs to read data from the 5 midas event buffers (one buffer per udp reader frontend, each midas event 
contains 1000 udp packets as indovidual data banks), examine trigger timestamps inside each udp packet, collect udp 
packets with matching timestamps into a physics event, bm_send_event() it to the SYSTEM buffer. rinse and repeat.

Initial single threaded implementation maxed out at about 100-200 Mbytes/sec with 100% busy CPU.

After trying several threading schemes, the final implementation has these threads:
- 5 threads to read the 5 event buffers, these threads also examine the udp packets, extract timestamps, etc
- 1 thread to sort udp packets by timestamp and to collect them into physics events
- 1 thread to bm_send_event() physics events to the SYSTEM buffer
- main thread and rpc handler thread (tmfe frontend)

(Again, to reduce lock contention, all data is passed between threads in large batches)

This got me up to about 800 Mbytes/sec. To get more, I had to switch the event builder from old plain bm_send_event() to 
the scatter-gather bm_send_event_sg(), *and* I had to reduce CPU use by other programs, see steps (a) and (b) above.

So, at the end, success, full 10gige data rate from daq boards to the MIDAS SYSTEM buffer.

(But wait, what about the mlogger? In this experiment, we do not have a disk storage array to sink this
much data. But it is an already-solved problem. On the data storage machines I built for GRIFFIN - 8 SATA NAS HDDs using 
raidz2 ZFS - the stock MIDAS mlogger can easily sink 1000 Mbytes/sec from SYSTEM buffer to disk).

Lessons learned:

- do not use UDP. dealing with packet loss will cost you a fortune in headache medicines and hair restorations.
- use jumbo frames. difference in per-packet overhead between 1500 byte and 9000 byte packets is almost a factor of 10.
- everything has to be done in bulk to reduce per-packet overheads. recvmmsg(), batched queue push/pop, etc
- avoid memory allocations (I has a per-packet std::string, replaced it with char[5])
- avoid memcpy(), use writev(), bm_send_event_sg() & co

K.O.

P.S. Let's counting the number of data copies in this system:

x udp reader frontend:
- ethernet NIC DMA into linux network buffers
- recvmmsg() memcpy() from linux network buffer to my memory
- bm_send_event_sg() memcpy() from my memory to the MIDAS shared memory event buffer

x event builder:
- bm_receive_event() memcpy() from MIDAS shared memory event buffer to my event buffer
- my memcpy() from my event buffer to my per-udp-packet buffers
- bm_send_event_sg() memcpy() from my per-udp-packet buffers to the MIDAS shared memory event buffer (SYSTEM)

x mlogger:
- bm_receive_event() memcpy() from MIDAS SYSTEM buffer
- memcpy() in the LZ4 data compressor
- write() syscall memcpy() to linux system disk buffer
- SATA interface DMA from linux system disk buffer to disk.

Would a monolithic massively multithreaded daq application be more efficient?
("udp receiver + event builder + logger"). Yes, about 4 memcpy() out of about 10 will go away.

Would I be able to write such a monolithic daq application?

I think not. Already, at 10gige data rates, for all practical purposes, it is impossible
to debug most problems, especially subtle trouble in multithreading (race conditions)
and in memory allocations. At best, I can sprinkle assert()s and look at core dumps.

So the good old divide-and-conquer approach is still required, MIDAS still rules.

K.O.
    Reply  15 Jun 2021, Stefan Ritt, Info, 1000 Mbytes/sec through midas achieved! frontend.cxx
In MEG II we also kind of achieved this rate. Marco F. will post an entry soon to describe the details. There is only one thing 
I want to mention, which is our network switch. Instead of an expensive high-grade switch, we chose a cheap "Chinese" high-grade 
switch. We have "rack switches", which are collector switch for each rack receiving up to 10 x 1GBit inputs, and outputting 1 x 
10 GBit to an "aggregation switch", which collects all 10 GBit lines form rack switches and forwards it with (currently a single 
) 10 GBit line. For the rack switch we use a 

MikroTik CRS354-48G-4S+2Q+RM 54 port

and for the aggregation switch

MikroTik CRS326-24S-2Q+RM 26 Port

both cost in the order of 500 US$. We were astonished that they don't loose UDP packets when all inputs send a packet at the 
same time, and they have to pipe them to the single output one after the other, but apparently the switch have enough buffers 
(which is usually NOT written in the data sheets). 

To avoid UDP packet loss for several events, we do traffic shaping by arming the trigger only when the previous event is 
completely received by the frontend. This eliminates all flow control and other complicated methods. Marco can tell you the 
details.

Another interesting aspect: While we get the data into the frontend, we have problems in getting it through midas. Your 
bm_send_event_sg() is maybe a good approach which we should try. To benchmark the out-of-the-box midas, I run the dummy frontend 
attached on my MacBook Pro 2.4 GHz, 4 cores, 16 GB RAM, 1 TB SSD disk. I got

Event size: 7 MB

No logging: 900 events/s = 6.7 GBytes/s

Logging with LZ4 compression: 155 events/s = 1.2 GBytes/s

Logging without compression: 170 events/s = 1.3 GBytes/s

So with this simple approach I got already more than 1 GByte of "dummy data" through midas, indicating that the buffer 
management is not so bad. I did use the plain mfe.c frontend framework, no bm_send_event_sg() (but mfe.c uses rpc_send_event() which is an 
optimized version of bm_send_event()).

Best,
Stefan
       Reply  16 Jun 2021, Marco Francesconi, Info, 1000 Mbytes/sec through midas achieved! 
As reported by Stefan, in MEG II we have very similar ethernet throughputs.
In total, we have 34 crates each with 32 DRS4 digitiser chips and a single 1 Gbps readout link through a Xilinx Zynq SoC.
The data arrives in push mode without any external intervention, the only throttling being an optional prescaling on the trigger rate.
We discovered the hard way that 1 Gbps throughput on Zynq is not trivial at all: the embedded ethernet MAC does not support jumbo frames (always read the fine prints in the manuals!) and the embedded Linux ethernet stack seems to struggle when we go beyond 250 Mbps of UDP traffic.

Anyhow, even with the reduced speed, the maximum throughput at network input is around 8.5 Gbps which passes through the Mikrotik switches mentioned by Stefan.
We had very bad experiences in the past with similar price-point switches, observing huge packet drops when the instantaneous switching capacity cannot cope with the traffic, but so far we are happy with the Mikrotik ones.

On the receiver side, we have the DAQ server with an Intel E5-2630 v4 CPU and a 10 Gbit connection to the network using an Intel X710 Network card.
In the past, we used also a "cheap" 10 Gbit card from Tehuti but the driver performance was so bad that it could not digest more than 5 Gbps of data.

The current frontend is based on the mfe.c scheme for historical reasons (the very first version dates back to 2015).
We opted for a monolithic multithread solution so we can reuse the underlying DAQ code for other experiments which may not have the complete Midas backend.
Just to mention them: one is the FOOT experiment (which afaik uses an adapted version of Altas DAQ) and the other is the LOLX experiment (for which we are going to ship to Canada soon a small 32 channel system using Midas).
A major modification to Konstantin scheme is that we need to calibrate all WFMs online so that a software zero suppression can be applied to reduce the final data size (that part is still to be implemented).
This requirement results in additional resource usage to parse the UDP content into floats and calibrate them.
Currently, we have 7 packet collector threads to digest the full packet flow (using recvmmsg), followed by an event building stage that uses 4 threads and 3 other threads for WFM calibration.
We have progressive packet numbers on each packet generated by the hardware and a set of flags marking the start and end of the event; combining the packet number difference between the start and end of the event and the total received packets for that event it is really easy to understand if packet drops are happening.

All the thread infrastructure was tested and we could digest the complete throughput, we still have to finalise the full 10 Gbit connection to Midas because the final system has been installed only recently (April).
We are using EQ_USER flag to push events into mfe.c buffers with up to 4 threads, but I was observing that above ~1.5 Gbps the rb_get_wp() returns almost always DB_TIMEOUT and I'm forced to drop the event.
This conflicts with the measurements reported by Stefan (we were discussing this yesterday), so we are still investigating the possible cause.

It is difficult to report three years of development in a single Elog, I hope I put all the relevant point here.
It looks to me that we opted for very complementary approaches for high throughput ethernet with Midas, and I think there are still a lot of details that could be worth reporting.
In case someone organises some kind of "virtual workshop" on this, I'm willing to participate.
Best,

Marco


> In MEG II we also kind of achieved this rate. Marco F. will post an entry soon to describe the details. There is only one thing 
> I want to mention, which is our network switch. Instead of an expensive high-grade switch, we chose a cheap "Chinese" high-grade 
> switch. We have "rack switches", which are collector switch for each rack receiving up to 10 x 1GBit inputs, and outputting 1 x 
> 10 GBit to an "aggregation switch", which collects all 10 GBit lines form rack switches and forwards it with (currently a single 
> ) 10 GBit line. For the rack switch we use a 
> 
> MikroTik CRS354-48G-4S+2Q+RM 54 port
> 
> and for the aggregation switch
> 
> MikroTik CRS326-24S-2Q+RM 26 Port
> 
> both cost in the order of 500 US$. We were astonished that they don't loose UDP packets when all inputs send a packet at the 
> same time, and they have to pipe them to the single output one after the other, but apparently the switch have enough buffers 
> (which is usually NOT written in the data sheets). 
> 
> To avoid UDP packet loss for several events, we do traffic shaping by arming the trigger only when the previous event is 
> completely received by the frontend. This eliminates all flow control and other complicated methods. Marco can tell you the 
> details.
> 
> Another interesting aspect: While we get the data into the frontend, we have problems in getting it through midas. Your 
> bm_send_event_sg() is maybe a good approach which we should try. To benchmark the out-of-the-box midas, I run the dummy frontend 
> attached on my MacBook Pro 2.4 GHz, 4 cores, 16 GB RAM, 1 TB SSD disk. I got
> 
> Event size: 7 MB
> 
> No logging: 900 events/s = 6.7 GBytes/s
> 
> Logging with LZ4 compression: 155 events/s = 1.2 GBytes/s
> 
> Logging without compression: 170 events/s = 1.3 GBytes/s
> 
> So with this simple approach I got already more than 1 GByte of "dummy data" through midas, indicating that the buffer 
> management is not so bad. I did use the plain mfe.c frontend framework, no bm_send_event_sg() (but mfe.c uses rpc_send_event() which is an 
> optimized version of bm_send_event()).
> 
> Best,
> Stefan
          Reply  18 Jun 2021, Konstantin Olchanski, Info, 1000 Mbytes/sec through midas achieved! 
> ... MEG II ... 34 crates each with 32 DRS4 digitiser chips and a single 1 Gbps readout link through a Xilinx Zynq SoC.
>
> Zynq ... embedded ethernet MAC does not support jumbo frames (always read the fine prints in the manuals!)
> and the embedded Linux ethernet stack seems to struggle when we go beyond 250 Mbps of UDP traffic.

that's an ouch. we use the altera ethernet mac, and jumbo frames are supported, but the firmware data path
was originally written assuming 1500-byte packets and it is too much work to rewrite it for jumbo frames.

we send the data directly from the FPGA fabric to the ethernet, there is an avalon/axi bus multiplexer
to split the ethernet packets to the NIOS slow control CPU. not sure if such scheme is possible
for SoC FPGAs with embedded ARM CPUs.

and yes, a 1 GHz ARM CPU will not do 10gige. You see it yourself, measure your memcpy() speed. Where
typical PC will have dual-channel 128-bit wide memory (and the famous for it's low latency
Intel memory controller), ARM SoC will have at best 64-bit wide memory (some boards are only 32-bit wide!),
with DDR3 (not DDR4) severely under-clocked (i.e. DDR3-900, etc). This is why the new Apple ARM chips
are so interesting - can Apple ARM memory controller beat the Intel x86 memory controller?

> On the receiver side, we have the DAQ server with an Intel E5-2630 v4 CPU

that's the right gear for the job. quad-channel memory with nominal "Max Memory Bandwidth 68.3 GB/s",
10 CPU cores. My benchmark of memcpy() for the much older duad-channel memory i7-4820 with DDR3-1600 DIMMs
is 20 Gbytes/sec. waiting for ARM CPU with similar specs.

> and a 10 Gbit connection to the network using an Intel X710 Network card.
> In the past, we used also a "cheap" 10 Gbit card from Tehuti but the driver performance was so bad that it could not digest more than 5 Gbps of data.

yup, same here. use Intel ethernet exclusively, even for 1gige links.

> A major modification to Konstantin scheme is that we need to calibrate all WFMs online so that a software zero suppression

I implemented hardware zero suppression in the FPGA code. I think 1 GHz ARM CPU does not have the oomph for this.

> rb_get_wp() returns almost always DB_TIMEOUT

replace rb_xxx() with std::deque<std::vector<char>> (protected by a mutex, of course). lots of stuff in the mfe.c frontend
is obsolete in the same way. check out the newer tmfe frontends (tmfe.md, tmfe.h and tmfe examples).

> It is difficult to report three years of development in a single Elog

but quite successful at it. big thanks for your write-up. I think our info is quite useful for the next people.

K.O.
       Reply  18 Jun 2021, Konstantin Olchanski, Info, 1000 Mbytes/sec through midas achieved! 
> In MEG II we also kind of achieved this rate.
>
> Instead of an expensive high-grade switch, we chose a cheap "Chinese" high-grade switch.

Right. We built this DAQ system about 3 years ago and the cheep Chineese switches arrived
on the market about 1 year after we purchased the big 96 port juniper switch. Bad timing/good timing.

Actually I have a very nice 24-port 1gige switch ($2000 about 3 years ago), I could have
used 4 of them in parallel, but they were discontinued and replaced with a $5000 switch
(+$3000 for a 10gige uplink. I think I got the last very last one cheap switch).

But not all Chineese switches are equal. We have an Ubiquity 10gige switch, and it does
not have working end-to-end ethernet flow control. (yikes!).

BTW, for this project we could not use just any cheap switch, we must have 64 fiber SFP ports
for connecting on-TPC electronics. This narrows the market significantly and it does
not match the industry standard port counts 8-16-24-48-96.

> MikroTik CRS354-48G-4S+2Q+RM 54 port
> MikroTik CRS326-24S-2Q+RM 26 Port

We have a hard time buying this stuff in Vancouver BC, Canada. Most of our regular suppliers
are US based and there is a technology trade war still going on between the US and China.
I guess we could buy direct on alibaba, but for the risk of scammers, scalpers and iffy shipping.

> both cost in the order of 500 US$

tell one how much we overpay for US based stuff. not surprising, with how Cisco & co can afford
to buy sports arenas, etc.

> We were astonished that they don't loose UDP packets when all inputs send a packet at the 
> same time, and they have to pipe them to the single output one after the other,
> but apparently the switch have enough buffers.

You probably see ethernet flow control in action. Look at the counters for ethernet pause frames
in your daq boards and in your main computer.

> (which is usually NOT written in the data sheets).

True, when I looked into this, I found a paper by somebody in Berkley for special
technique to measure the size of such buffers.

(The big Juniper switch has only 8 Mbytes of buffer. The current wisdom for backbone networks
is to have as little buffering as possible).

> To avoid UDP packet loss for several events, we do traffic shaping by arming the trigger only when the previous event is 
> completely received by the frontend. This eliminates all flow control and other complicated methods. Marco can tell you the 
> details.

We do not do this. (very bad!). When each trigger arrives, all 64+8 DAQ boards send a train of UDP packets
at maximum line speed (64+8 at 1 gige) all funneled into one 10 gige ((64+8)/10 oversubscription).

Before we got ethernet flow control to work properly, we had to throttle all the 1gige links by about 60%
to get any complete events at all. This would not have been acceptable for physics data taking.

> Another interesting aspect: While we get the data into the frontend, we have problems in getting it through midas. Your 
> bm_send_event_sg() is maybe a good approach which we should try. To benchmark the out-of-the-box midas, I run the dummy frontend 
> attached on my MacBook Pro 2.4 GHz, 4 cores, 16 GB RAM, 1 TB SSD disk.

Dummy frontend is not very representative, because limitation is the memory bandwidth
and CPU load, and a real ethernet receiver has quite a bit of both (interrupt processing,
DMA into memory, implicit memcpy() inside the socket read()).

For example, typical memcpy() speeds are between 22 and 10 Gbytes/sec for current
generation CPUs and DRAM. This translates for a total budget of 22 and 10 memcpy()
at 10gige speeds. Subtract from this 1 memcpy() to DMA data from ethernet into memory
and 1 memcpy() to DMA data from memory to storage. Subtract from this 2 implicit
memcpy() for read() in the frontend and write() in mlogger. (the Linux sendfile() syscall
was invented to cut them out). Subtract from this 1 memcpy() for instruction and incidental
data fetch (no interesting program fits into cache). Subtract from this memory bandwidth
for running the rest of linux (systemd, ssh, cron jobs, NFS, etc). Hardly anything
left when all is said and done. (Found it, the alphagdaq memcpy() runs at 14 Gbytes/sec,
so total budget of 14 memcpy() at 10gige speeds).

And the event builder eats up 2 CPU cores to process the UDP packets at 10gige rate,
there is quite a bit of CPU-expensive data unpacking, inspection and processing
going on that cannot be cut out. (alphagdaq has 4 cores, "8 threads").

K.O.

P.S. Waiting for rack-mounted machines with AMD "X" series processors... K.O.
Entry  28 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
There have been times in ALPHA that an alarm is triggered and the shift crew 
are unclear who to contact if they aren't trained to fix the specific 
failure mode.

I wish to add the property 'Users responsible' to the ODB for Alarms and 
Programs.

I have drafted what this might look like in a new pull request:
https://bitbucket.org/tmidas/midas/pull-requests/22/add-users-responsible-
field-for-specific

It requires changing of several data structures, I think I have found all 
instances of the definitions so the ODB should 'repair' any of the old 
structures adding in users responsible.

If 'Users responsible' is set, MIDAS messages append them after the message 
in brackets '()'. If used in conjunction with the MIDAS messenger 
(mmessenger), the users responsible can be 'tagged' directly.

I.e, for slack, simply set the 'users responsible' to <@UserID|Nickname>, 
for mattermost '@username', for discord '<@userid>'. Note that discord 
doesn't allow you to tag by username, but numeric userid



I have expanded char array in 'al_trigger_class' to handle the potentially 
longer MIDAS messages. Perhaps since I'm touching these lines I should 
change these temporary containers to std::string (line 383 and 386 of 
alarm.cxx)?

I have tested this quite a bit for my system, I am not sure how I can test 
mjsonrpc.
    Reply  28 May 2021, Stefan Ritt, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
I think this is a good idea and I support it. We have a similar problem in MEG and 
we solved that with external (bash) scripts called in case of alarms. One feature 
there we have is that for some alarms, several people want to get notified. So 
people can "subscribe" to certain alarms. The subscription are now handled inside 
Slack which I like better, but maybe it would be good to have more than one "user 
responsible". Like if one person is sleeping/traveling, it's good to have a 
substitute. Can you make an array out of that? Or a comma-separated list?

Best,
Stefan
       Reply  28 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> I think this is a good idea and I support it. We have a similar problem in MEG and 
> we solved that with external (bash) scripts called in case of alarms. One feature 
> there we have is that for some alarms, several people want to get notified. So 
> people can "subscribe" to certain alarms. The subscription are now handled inside 
> Slack which I like better, but maybe it would be good to have more than one "user 
> responsible". Like if one person is sleeping/traveling, it's good to have a 
> substitute. Can you make an array out of that? Or a comma-separated list?
> 
> Best,
> Stefan

Presently there are 256 characters in the 'users responsible' field, so you can just 
list many users (no space, space or comma whatever). Discord, slack and mattermost 
don't care, they just parse the user tags.

I can still make this an array and pass a std::vector<std::string> into 
al_trigger_class function?
          Reply  28 May 2021, Stefan Ritt, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> I can still make this an array and pass a std::vector<std::string> into 
> al_trigger_class function?

Maybe 256 chars are enough at the moment. If other people complain in the future, we can 
re-visit.

Stefan
             Reply  28 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> > I can still make this an array and pass a std::vector<std::string> into 
> > al_trigger_class function?
> 
> Maybe 256 chars are enough at the moment. If other people complain in the future, we can 
> re-visit.
> 
> Stefan

Thinking about it, an array of maybe 80 character would give enough space for a name, a tag 
and phone number. Do I need to budget memory very strictly? Would 32 entries of 80 
characters be too much? 
                Reply  28 May 2021, Stefan Ritt, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> > > I can still make this an array and pass a std::vector<std::string> into 
> > > al_trigger_class function?
> > 
> > Maybe 256 chars are enough at the moment. If other people complain in the future, we can 
> > re-visit.
> > 
> > Stefan
> 
> Thinking about it, an array of maybe 80 character would give enough space for a name, a tag 
> and phone number. Do I need to budget memory very strictly? Would 32 entries of 80 
> characters be too much? 

On that level memory is cheap.

Stefan
                   Reply  28 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 

I've updated the branch / pull request to use an array of 10 entries (80 chars each). 32 felt a 
little overkill when I saw it on screen, but absolutely happy to set it to any number you 
recommend.

The array gets flattened out when an alarm is triggered, currently the formatting produces

AlarmClass : AlarmMessage (Flattened List Of Users Responsible Array With Space Separators)

If experiments want to use Discord / Slack / Mattermost tags and or add phone numbers, that 
should fit in 80 characters
                      Reply  31 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
This list of responsible being attached to alarm message strings will be great for the 
mmessenger, however, perhaps its going to generate very long messages for the speaker programs 
(web interface and mlxspeaker ):

AlarmClass : AlarmMessage (ResponsibleUser1 ResponsibleUser2 ResponsibleUser3 ResponsibleUser4 
... ResponsibleUser4)

especially if people put in user tags or emergency contact details...

Should we add a key word or character for the programs that create audio to parse that silence 
the list of responsible users? I'd be tempted to use a single character but there is a risk 
users might have that in a custom alarm message. Maybe something usual like the 'bel' 
character? '|'? 

Perhaps use the string 'Responsible:' or 'Users:' to trim out the Users Responsible list from 
the message string?

AlarmClass : AlarmMessage Responsible:(ResponsibleUser1 ResponsibleUser2 ResponsibleUser3 
ResponsibleUser4 ... ResponsibleUser4)

AlarmClass : AlarmMessage Users:(ResponsibleUser1 ResponsibleUser2 ResponsibleUser3 
ResponsibleUser4 ... ResponsibleUser4)
                         Reply  02 Jun 2021, Konstantin Olchanski, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> This list of responsible being attached to alarm message strings ...

This is a great idea. But I think we do not need to artificially limit ourselves
to string and array lengths.

The code in alarm.c should be changes to use std::string and std::vector<std::string> (STRING_LIST 
#define), db_get_record() should be replaced with individual ODB reads (that's what it does behind 
the scenes, but in a non-type and -size safe way).

I think the web page code will work correctly, it does not care about string lengths.

K.O.
                            Reply  09 Jun 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> > This list of responsible being attached to alarm message strings ...
> 
> This is a great idea. But I think we do not need to artificially limit ourselves
> to string and array lengths.
> 
> The code in alarm.c should be changes to use std::string and std::vector<std::string> (STRING_LIST 
> #define), db_get_record() should be replaced with individual ODB reads (that's what it does behind 
> the scenes, but in a non-type and -size safe way).
> 
> I think the web page code will work correctly, it does not care about string lengths.
> 
> K.O.

Auto growing lists is an excellent plan. I am making decent progress and should have something to 
report soon
Entry  05 Apr 2021, Konstantin Olchanski, Info, blog - convert mfe frontend to tmfe c++ framework 
notes from converting ALPHA-g chronobox frontend fechrono to tmfe c++ framework.

the chronobox device is a timestamp/low resolution tdc/scaler/generic TTL and ECL io
mainboard with an altera DE10_NANO plugin board. it has a cyclone-5 FPGA SOC running Raspbian linux.
FPGA communication is done by avalon-bus memory mapped registers, main data readout
is PIO from an FPGA 32-bit wide FIFO (no DMA yet).

- login to main computer (daq16)
- cd packages
- git clone https://bitbucket.org/tmidas/midas midas-develop
- cd midas-develop
- make mini ### creates linux-x86_64/{bin,lib}
- ssh agdaq@cb02 ### private network
- cd ~/packages/midas-develop
- make mini ### creates linux-linux-armv7l/{bin/lib}
- cd ~/online/chronobox_software
- cat fechrono.cxx ~/packages/midas-develop/progs/tmfe_example_everything.cxx > fechrono_tmfe.cxx
- edit fechrono_tmfe.cxx:

- rename "FeEverything" to "FeChrono"
- copy contents of frontend_init() to HandleFrontendInit()
- copy contents of frontend_exit() to HandleFrontendExit()
- replace get_frontend_index() with fFe->fFeIndex
- replace "return SUCCESS" with return TMFeOk()
- replace "return !SUCCESS" with return TMFeErrorMessage("boo!!!")
- this frontend has 3 indexed equipments, copy EqEverything 3 times, rename EqEverything to EqCbHist, EqCbms, EqCbFlow
- copy contents of begin_of_run() to EqCbHist::HandleBeginRun()
- copy contents of end_of_run() to EqCbHist::HandleEndRun()
- pause_run(), resume_run() are empty, delete all HandlePauseRun() and all HandleResumeRun()
- frontend_loop() is empty, delete
- poll_event() and interrupt_configure() are empty, delete
- delete all HandleStartAbortRun(), delete all calls to RegisterTransitionStartAbort();
- examine equipment[]:
- "cbhist%02d" - periodic, copy contents of read_cbhist() to EqCbHist::HandlePeriodic()
- "cbms%02d" - polled, copy contents of read_cbms_fifo() to EqCbms::HandlePollRead()
- "cbflow%02d" - periodic, copy contents of read_flow() to EqCbFlow::HandlePeriodic()
- delete unused HandlePoll(), HandlePollRead() and HandlePeriodic()
- replace bk_init32() with "size_t event_size = 100*1024; char* event = (char*)malloc(event_size); ComposeEvent(event, 
event_size); BkInit(event, event_size);"
- replace bk_create(pevent) with BkOpen(event)
- replace bk_close(pevent, ...) with BkClose(event, ...)
- replace "return bk_size(pevent)" with "EqSendEvent(event); free(event);"
- remove unused example SendData()
- if there linker complains about references to "hDB", add "HNDLE hDB" is global scope, add "hDB = fMfe->fDB"
- replace set_equipment_status() with EqSetStatus()
- move equipment configuration from the equipment[] array to the equipment constructors
- remove unused HandleRpc()
- remove unused HandleBeginRun() and unused HandleEndRun()
- remove all example code from HandleInit(), breakup frontend_init() code into per-equipment HandleInit() functions
- EqCbms::HandlePoll() replace all example code with "return true"
- if desired, replace ODB functions from utils.cxx with MVOdb RI(), RD(), etc
- if desired, replace cm_msg() with Msg() and delete "const char* frontend_name"
- update FeChrono() constructor:
      FeSetName("fechrono%02d");
      FeAddEquipment(new EqCbHist("cbhist%02d", __FILE__));
      FeAddEquipment(new EqCbms("cbms%02d", __FILE__));
      FeAddEquipment(new EqCbFlow("cbflow%02d", __FILE__));
- build:
g++ -std=c++11 -Wall -Wuninitialized -g -Ialtera -Dsoc_cv_av -I/home/agdaq/packages/midas-develop/include -
I/home/agdaq/packages/midas-develop/mvodb -c fechrono_tmfe.cxx
g++ -o fechrono_tmfe.exe -std=c++11 -Wall -Wuninitialized -g -Ialtera -Dsoc_cv_av -I/home/agdaq/packages/midas-develop/include 
-I/home/agdaq/packages/midas-develop/mvodb fechrono_tmfe.o utils.o cb.o /home/agdaq/packages/midas-develop/linux-
armv7l/lib/libmidas.a -lm -lz -lutil -lnsl -lpthread -lrt
- run:
- bombs on bm_set_cache_size(), reduce default cache size, old mserver cannot deal with the new default size, set 
fEqConfWriteCacheSize = 100*1024;
- run:
- prints too many messages, comment out print "HandlePollRead!"
- run:
- good now!

success, was not too bad.

also:
- replace gHaveRun with fMfe->fStateRunning
- replace gRunNumber with fMfe->fRunNumber

see tmfe.md section "variables provided by the framework"

K.O.
    Reply  05 Apr 2021, Konstantin Olchanski, Info, blog - convert mfe frontend to tmfe c++ framework 
Result is here:
https://bitbucket.org/expalpha/chronobox_software/src/master/fechrono_tmfe.cxx

Original code is in fechrono.cxx. Not super pretty, but representative of most mfe-based frontends
we see around here. A good example of why the old mfe.c structure no longer works so well for us.

After conversion to tmfe, we do not win a beauty contest yet, but the path for further
clean up and refactoring into better c++ is quite clear. (And it is very obvious where
the missing "event object" wants to be here)

K.O.
       Reply  15 Jun 2021, Konstantin Olchanski, Info, blog - convert tmfe_rev0 event builder to develop-branch tmfe c++ framework 
Now we are converting the alpha-g event builder from rev0 tmfe (midas-2020-xx) to the new tmfe c++
framework in midas-develop. Earlier, I followed the steps outlined in this blog
to convert this event builder from mfe.c framework to rev0 tmfe.

- get latest midas-develop
- examine progs/tmfe_example_everything.cxx
- open feevb.cxx
- comment-out existing main() function
- from tmfe_example_everything.cxx, copy class FeEverything and main() to the bottom of feevb.cxx
- comment-out old main()
- make sure we include the correct #include "tmfe.h"
- rename example frontend class FeEverything to FeEvb
- rename feevb's "rpc handler" and "periodic handler" class EvbEq to EqEvb
- update class declaration and constructor of EqEvb from EqEverything in example_everything: EqEvb extends TMFeEquipment, 
EqEvb constructor calls constructor of base class (c++ bogosity), keep the bits of the example that initialize the 
equipment "common"
- in EqEvb, remove data members fMfe and fEq: fMfe is now inherited from the base class, fEq is now "this"
- in FeEvb constructor, wire-in the EqEvb constructor: FeSetName("feevb") and FeAddEquipment(new EqEvb("EVB",__FILE__))
- migrate function names:
- fEq->SendEvent() with EqSendEvent()
- fEq->SetStatus() with EqSetStatus()
- fEq->ZeroStatistics() with EqZeroStatistics() -- can be removed, taken care of in the framework
- fEq->WriteStatistics() with EqWriteStatistics() -- can be removed, taken care of in the framework
- (my feevb.o now compiles, but will not work, yet, keep going:)
- EqEvb - update prototypes of all HandleFoo() methods per example_everything.cxx or per tmfe.h: otherwise the framework 
will not call them. c++ compiler will not warn about this!
- migrate old main():
- restore initialization of "common" and other things done in the old main():
- TMFeCommon was merged into TMFeEquipment, move common->Foo = ... to the EqEvb constructor, consult tmfe.h and tmfe.md 
for current variable names.
- consider adding "fEqConfReadConfigFromOdb = false;" (see tmfe.md)
- if EqEvb has a method Init() called from old main(), change it's name to HandleInit() with correct arguments.
- split EqEvb constructor: leave initialization of "common" in the constructor, move all functions, etc into HandleInit()
- move fMfe->SetTransitionSequenceFoo() calls to HandleFrontendInit()
- move fMfe->DeregisterTransition{Pause,Resume}() to HandleFrontendInit()
- old main should be empty now
- remove linking tmfe_rev0.o from feevb Makefile, now it builds!
- try to run it!
- it works!
- done.

K.O.
Entry  02 Jun 2021, Konstantin Olchanski, Info, bitbucket build truncated 
I truncated the bitbucket build to only build on ubuntu LTS 20.04.

Somehow all the other build targets - centos-7, centos-8, ubuntu-18 - have
an obsolete version of cmake. I do not know where the bitbucket os images
get these obsolete versions of cmake - my centos-7 and centos-8 have much
more recent versions of cmake.

If somebody has time to figure it out, please go at it, I would like very
much to have centos-7 and centos-8 builds restored (with ROOT), also
to have a ubuntu LTS 20.04 build with ROOT. (For me, debugging bitbucket
builds is extremely time consuming).

Right now many midas cmake files require cmake 3.12 (released in late 2018).

I do not know why that particular version of cmake (I took the number
from the tutorials I used).

I do not know what is the actual version of cmake that MIDAS (and ROOTANA) 
require/depend on.

I wish there were a tool that would look at a cmake file, examine all the 
features it uses and report the lowest version of cmake that supports them.

K.O.
Entry  12 May 2021, Pierre Gorel, Bug Report, History formula not correctly managed 
OS: OSX 10.14.6 Mojave
MIDAS: Downloaded from repo on April 2021.

I have a slow control frontend doing the command/readout of a MPOD HV/LV. Since I am reading out the current that are in nA (after updating snmp), I wanted to multiply the number by 1e9.

I noticed the new "Formula" field (introduced in 2019 it seems) instead of the "Factor/Offset" I was used to. None of my entries seems to be accepted (after hitting save, when coming back thee field is empty).

Looking in ODB in "/History/Display/MPOD/HV (Current)/", the field "Formula" is a string of size 32 (even if I have multiple plots in that display). I noticed that the fields "Factor" and "Offset" are still existing and they are arrays with the correct size. However, changing the values does not seem to do anything.

Deleting "Formula" by hand and creating a new field as an array of string (of correct length) seems to do the trick: the formula is displayed in the History display config, and correctly used.
    Reply  02 Jun 2021, Konstantin Olchanski, Bug Report, History formula not correctly managed 
> OS: OSX 10.14.6 Mojave
> MIDAS: Downloaded from repo on April 2021.
> 
> I have a slow control frontend doing the command/readout of a MPOD HV/LV. Since I am reading out the current that are in nA (after updating snmp), I wanted to multiply the number by 1e9.
> 
> I noticed the new "Formula" field (introduced in 2019 it seems) instead of the "Factor/Offset" I was used to. None of my entries seems to be accepted (after hitting save, when coming back thee field is empty).
> 
> Looking in ODB in "/History/Display/MPOD/HV (Current)/", the field "Formula" is a string of size 32 (even if I have multiple plots in that display). I noticed that the fields "Factor" and "Offset" are still existing and they are arrays with the correct size. However, changing the values does not seem to do anything.
> 
> Deleting "Formula" by hand and creating a new field as an array of string (of correct length) seems to do the trick: the formula is displayed in the History display config, and correctly used.

I see this, too. Problem is that the history plot code must be compatible with both
the old scheme (factor/offset) and the new scheme (formula). But something goes wrong somewhere.

https://bitbucket.org/tmidas/midas/issues/307/history-plot-config-incorrect-in-odb

Why?

- new code cannot to "3 year" plots, old code has no problem with it
- old experiments (alpha1, etc) have only the old-style history plot definitions,
and both old and new plotting code should be able to show them (there is nobody
to convert this old stuff to the "new way", but we still desire to be able to look at it!)

K.O.
Entry  24 May 2021, Mathieu Guigue, Bug Report, Bug "is of type" 
Hi,

I am running a simple FE executable that is supposed to define a PRAW DWORD bank.
The issue is that, right after the start of the run, the logger crashes without messages.
Then the FE reports this error, which is rather confusing.
```
12:59:29.140 2021/05/24 [feTestDatastruct,ERROR] [odb.cxx:6986:db_set_data1,ERROR] "/Equipment/Trigger/Variables/PRAW" is of type UINT32, not UINT32
```
    Reply  02 Jun 2021, Konstantin Olchanski, Bug Report, Bug "is of type" 
> Hi,
> 
> I am running a simple FE executable that is supposed to define a PRAW DWORD bank.
> The issue is that, right after the start of the run, the logger crashes without messages.
> Then the FE reports this error, which is rather confusing.
> ```
> 12:59:29.140 2021/05/24 [feTestDatastruct,ERROR] [odb.cxx:6986:db_set_data1,ERROR] "/Equipment/Trigger/Variables/PRAW" is of type UINT32, not UINT32
> ```

I think this is fixed in latest midas. There was a typo in this message, the same tid was printed twice,
with result you report "mismatch UINT32 and UINT32", instead of "mismatch of UINT32 vs what is actually there".

This fixes the message, after that you have to manually fix the mismatch in the data type in ODB (delete old one, I guess).

K.O.
Entry  26 May 2021, Marco Chiappini, Info, label ordering in history plot 
Dear all,
is there any way to order the labels in the history plot legend? In the old 
system there was the “order” column in the config panel, but I can not find it 
in the new system. Thanks in advance for the support.

Best regards,
Marco Chiappini
    Reply  02 Jun 2021, Konstantin Olchanski, Info, label ordering in history plot 
> is there any way to order the labels in the history plot legend? In the old 
> system there was the “order” column in the config panel, but I can not find it 
> in the new system. Thanks in advance for the support.

correct, for reasons unknown, the function to reorder and to delete individual 
entries was removed from the history panel editor.

K.O.
       Reply  02 Jun 2021, Konstantin Olchanski, Info, label ordering in history plot 
> > is there any way to order the labels in the history plot legend? In the old 
> > system there was the “order” column in the config panel, but I can not find it 
> > in the new system. Thanks in advance for the support.
> 
> correct, for reasons unknown, the function to reorder and to delete individual 
> entries was removed from the history panel editor.
> 
> K.O.

https://bitbucket.org/tmidas/midas/issues/284/history-panel-editor-reordering-of

K.O.
Entry  27 May 2021, Lukas Gerritzen, Bug Report, Wrong location for mysql.h on our Linux systems 
Hi,
with the recent fix of the CMakeLists.txt, it seems like another bug surfaced. 
In midas/progs/mlogger.cxx:48/49, the mysql header files are included without a 
prefix. However, mysql.h and mysqld_error.h are in a subdirectory, so for our 
systems, the lines should be
  48 #include <mysql/mysql.h>
  49 #include <mysql/mysqld_error.h>
This is the case with MariaDB 10.5.5 on OpenSuse Leap 15.2, MariaDB 10.5.5 on 
Fedora Workstation 34 and MySQL 5.5.60 on Raspbian 10.

If this problem occurs for other Linux/MySQL versions as well, it should be 
fixed in mlogger.cxx and midas/src/history_schema.cxx.
If this problem only occurs on some distributions or MySQL versions, it needs 
some more differentiation than #ifdef OS_UNIX. 

Also, this somehow seems familiar, wasn't there such a problem in the past?
    Reply  27 May 2021, Nick Hastings, Bug Report, Wrong location for mysql.h on our Linux systems 
Hi,

> with the recent fix of the CMakeLists.txt, it seems like another bug 
surfaced. 
> In midas/progs/mlogger.cxx:48/49, the mysql header files are included without 
a 
> prefix. However, mysql.h and mysqld_error.h are in a subdirectory, so for our 
> systems, the lines should be
>   48 #include <mysql/mysql.h>
>   49 #include <mysql/mysqld_error.h>
> This is the case with MariaDB 10.5.5 on OpenSuse Leap 15.2, MariaDB 10.5.5 on 
> Fedora Workstation 34 and MySQL 5.5.60 on Raspbian 10.
> 
> If this problem occurs for other Linux/MySQL versions as well, it should be 
> fixed in mlogger.cxx and midas/src/history_schema.cxx.
> If this problem only occurs on some distributions or MySQL versions, it needs 
> some more differentiation than #ifdef OS_UNIX.

What does "mariadb_config --cflags" or "mysql_config --cflags" return on 
these systems? For mariadb 10.3.27 on Debian 10 it returns both paths:

% mariadb_config --cflags
-I/usr/include/mariadb -I/usr/include/mariadb/mysql

Note also that mysql.h and mysqld_error.h reside in /usr/include/mariadb *not* 
/usr/include/mariadb/mysql so using "#include <mysql/mysql.h>" would not work.

On CentOS 7 with mariadb  5.5.68:

%  mysql_config --include
-I/usr/include/mysql
% ls -l /usr/include/mysql/mysql*.h
-rw-r--r--. 1 root root 38516 May  6  2020 /usr/include/mysql/mysql.h
-r--r--r--. 1 root root 76949 Oct  2  2020 /usr/include/mysql/mysqld_ername.h
-r--r--r--. 1 root root 28805 Oct  2  2020 /usr/include/mysql/mysqld_error.h
-rw-r--r--. 1 root root 24717 May  6  2020 /usr/include/mysql/mysql_com.h
-rw-r--r--. 1 root root  1167 May  6  2020 /usr/include/mysql/mysql_embed.h
-rw-r--r--. 1 root root  2143 May  6  2020 /usr/include/mysql/mysql_time.h
-r--r--r--. 1 root root   938 Oct  2  2020 /usr/include/mysql/mysql_version.h

So this seems to be the correct setup for both Debian and RHEL. If this is to 
be worked around in Midas I would think it would be better to do it at the 
cmake level than by putting another #ifdef in the code.

Cheers,

Nick.
       Reply  02 Jun 2021, Konstantin Olchanski, Bug Report, Wrong location for mysql.h on our Linux systems 
> % mariadb_config --cflags
> -I/usr/include/mariadb -I/usr/include/mariadb/mysql

I get similar, both .../include and .../include/mysql are in my include path,
so both #include "mysql/mysql.h" and #include "mysql.h" work.

I added a message to cmake to report the MySQL CFLAGS and libraries, so next time
this is a problem, we can see what happened from the cmake output:

4ed0:midas olchansk$ make cmake | grep MySQL
...
-- MIDAS: Found MySQL version 10.4.16
-- MIDAS: MySQL CFLAGS: -I/opt/local/include/mariadb-10.4/mysql;-I/opt/local/include/mariadb-
10.4/mysql/mysql and libs: -L/opt/local/lib/mariadb-10.4/mysql/ -lmariadb

K.O.
Entry  27 May 2021, Joseph McKenna, Info, MIDAS Messenger - A program to send MIDAS messages to Discord, Slack and or Mattermost 

I have created a simple program that parses the message buffer in MIDAS and 
sends notifications by webhook to Discord, Slack and or Mattermost.

Active pull request can be found here:

https://bitbucket.org/tmidas/midas/pull-requests/21


Its written in python and CMake will install it in bin (if the Python3 binary 
is found by cmake). The only dependency outside of the MIDAS python library is 
'requests', full documentation are in the mmessenger.md 
    Reply  28 May 2021, Joseph McKenna, Info, MIDAS Messenger - A program to forward MIDAS messages to Discord, Slack and or Mattermost merged 
A simple program to forward MIDAS messages to Discord, Slack and or Mattermost

(Python 3 required)

Pull request accepted! Documentation can be found on the wiki

https://midas.triumf.ca/MidasWiki/index.php/Mmessenger
       Reply  02 Jun 2021, Konstantin Olchanski, Info, MIDAS Messenger - A program to forward MIDAS messages to Discord, Slack and or Mattermost merged 
> A simple program to forward MIDAS messages to Discord, Slack and or Mattermost
> 
> (Python 3 required)
> 
> Pull request accepted! Documentation can be found on the wiki
> 
> https://midas.triumf.ca/MidasWiki/index.php/Mmessenger

This sounds like a very useful and welcome addition to MIDAS.

But from documentation provided, I have clue how to activate it.

Perhaps it would help if you could write up the basic steps on how to go about it, i.e.
- go to discord
- push these buttons
- cut and paste this thingy from the web page to ODB

K.O.
Entry  19 May 2021, Francesco Renga, Suggestion, MYSQL logger 
Dear all,
      I'm trying to use the logging on a mysql DB. Following the instructions on 
the Wiki, I recompiled MIDAS after installing mysql, and cmake with NEED_MYSQL=1 
can find it:

-- MIDAS: Found MySQL version 8.0.23

Then, I compiled my frontend (cmake with no options + make) and run it, but in the 
ODB I cannot find the tree for mySQL. I have only:

Logger/Runlog/ASCII

while I would expect also:

Logger/Runlog/SQL

What could be missing? Maybe should I add something in the CMakeList file or run 
cmake with some option?

Thank you,
      Francesco
    Reply  21 May 2021, Francesco Renga, Suggestion, MYSQL logger 
I solved this, it was a failed "make clean" before recompiling. Now it works.

Sorry for the noise.

Francesco

> Dear all,
>       I'm trying to use the logging on a mysql DB. Following the instructions on 
> the Wiki, I recompiled MIDAS after installing mysql, and cmake with NEED_MYSQL=1 
> can find it:
> 
> -- MIDAS: Found MySQL version 8.0.23
> 
> Then, I compiled my frontend (cmake with no options + make) and run it, but in the 
> ODB I cannot find the tree for mySQL. I have only:
> 
> Logger/Runlog/ASCII
> 
> while I would expect also:
> 
> Logger/Runlog/SQL
> 
> What could be missing? Maybe should I add something in the CMakeList file or run 
> cmake with some option?
> 
> Thank you,
>       Francesco
Entry  19 May 2021, Konstantin Olchanski, Info, update of event buffer code 
a big update to the event buffer code was merged today.

two important bug fixes:

- a logic error in bm_receive_event() (actually bm_fill_read_cache_locked()) 
caused use of uninitialized variable to increment the read pointer and crash 
with error "read pointed points to an invalid event")
- missing bm_unlock() in bm_flush_cache() caused double-locking of event buffer 
caused a hang and a subsequent crash via the watchdog timeout.

several improvements:

- bm_receive_event_vec(std::vector<char>) with automatic memory allocation, one 
does not need to worry about providing a large event buffer to receive event 
data. For local connections MAX_EVENT_SIZE is no longer used, for remote 
connections, a buffer of MAX_EVENT_SIZE is allocated automatically, this is a 
limitation of the MIDAS RPC layer (it does not know how to allocate memory to 
receive arbitrary large data)

(MAX_EVENT_SIZE is now only used in bm_receive_event_rpc()).

- rpc_send_event_sg() - thread safe method to send events to the mserver. it 
takes an array of scatter-gather buffers, so a midas event does not have to be 
in one continious buffer.

- bm_send_event_sg() - same for local connections.

- on top of bm_send_event_sg() we now have bm_send_event_vec(std::vector<char>) 
and bm_send_event_vec(std::vector<vector<char>>). now we can move forward with 
implementing a new "event object" (the TMEvent event object from midasio.h 
already works with these new methods).

- remote connected bm_send_event() & co now always send events to the mserver 
using the event socket. (before, bm_send_event() used RPC_BM_SEND_EVENT and 
suffered from the RPC layer encoding/decoding overhead. mfe.c used 
rpc_send_event() for remote connections)

- bm_send_event(), bm_receive_event() & co now take a timeout value (in 
milliseconds) instead of an async_flag. The old async_flag values BM_WAIT and 
BM_NO_WAIT continue working as expected (wait forever and do not wait at all, 
respectively).

- following improvements are only for remote connections:

- in the case of event buffer congestion (event readers are slow, event buffers 
are close to 100% full), the bm_flush_cache() RPC will no longer timeout due to 
mserver being stuck waiting for free buffer space. (RPC is called with a 1000 
msec timeout, infinite loop waiting for flush is done on the frontend side, the 
RPC timeout will never fire)

- in the case of event buffer congestion, ODB RPC will no longer timeout. 
(previously mserver was stuck waiting for free buffer space and did not process 
any RPCs).

- at the end of run, last few events could be stuck in the event socket. now, 
frontends can flush it using bm_flush_cache(0,BM_WAIT) (use zero for the buffer 
handle). correct run transition should stop the trigger, stop generating new 
events, call bm_flush_cache(0,BM_WAIT), call bm_flush_cache("SYSTEM",BM_WAIT) 
and return success. (TMFE frontend already does this). Note that 
bm_flush_cache(BM_WAIT) can be stuck for very long time waiting for the event 
buffers to empty-out, so run transition RPC timeout is still possible.

K.O.
Entry  07 May 2021, Zaher Salman, Bug Report, modbselect trigget hotlink 
It seems that a modbselect triggers a "change" in an ODB which has a hot link. This happens onload (or whenever the custom page is reloaded) and otherwise it behaves as expected, i.e. no change unless the modbselect is actually changed. Is this the intended behaviour? can this be modified?
    Reply  10 May 2021, Stefan Ritt, Bug Report, modbselect trigget hotlink 
Thanks for reporting that bug, I fixed it in the last commit.

Stefan
Entry  06 May 2021, Ben Smith, Info, New feature in odbxx that works like db_check_record() 
For those unfamiliar, odbxx is the interface that looks like a C++ map, but automatically syncs with the ODB - https://midas.triumf.ca/MidasWiki/index.php/Odbxx.

I've added a new feature that is similar to the existing odb::connect() function, but works like the old db_check_record(). The new odb::connect_and_fix_structure() function:
- keeps the existing value of any keys that are in both the ODB and your code
- creates any keys that are in your code but not yet in the ODB
- deletes any keys that are in the ODB but not your code
- updates the order of keys in the ODB to match your code

This will hopefully make it easier to automate ODB structure changes when you add/remove keys from a frontend.

The new feature is currently in the develop branch, and should be included in the next release.
Entry  16 Feb 2021, Ruslan Podviianiuk, Forum, m is not defined error m_is_not_defined.png
Hello,

I see this mhttpd error starting MSL-script: 
Uncaught (in promise) ReferenceError: m is not defined
at mhttpd_message (VM2848 mhttpd.js:2304)
at VM2848 mhttpd.js:2122

As I can see it does not affect work of MSL script but shows ReferenceError in 
Midas sequencer (see picture).

Could please point me how to fix this error?

Thanks.
Ruslan
    Reply  25 Feb 2021, Konstantin Olchanski, Forum, m is not defined error 
> I see this mhttpd error starting MSL-script: 
> Uncaught (in promise) ReferenceError: m is not defined
> at mhttpd_message (VM2848 mhttpd.js:2304)
> at VM2848 mhttpd.js:2122

your line numbers do not line up with my copy of mhttpd.js. what version of midas 
do you run?

please give me the output of odbedit "ver" command (GIT revision, looks like this: 
IT revision:       Wed Feb 3 11:47:02 2021 -0800 - midas-2020-08-a-84-g78d18b1c on 
branch feature/midas-2020-12).

same info is in the midas "help" page (GIT revision).

to decipher the git revision string:

midas-2020-08-a-84-g78d18b1c means:
is commit 78d18b1c
which is 84 commits after git tag midas-2020-08-a

"on branch feature/midas-2020-12" confirms that I have the midas-2020-12 pre-
release version without having to do all the decoding above.

if you also have "-dirty" it means you changed something in the source code 
 and warranty is voided. (just joking! we can debug even modified midas source 
code)

K.O.
       Reply  05 May 2021, Zaher Salman, Forum, m is not defined error 
We had the same issue here, which comes from mhttpd.js line 2395 on the current git version. This seems to happen mostly when there is an alarm triggered or when there is an error message.

Anyway, the easiest solution for us was to define m at the beginning of mhttpd_message function 

let m;

and replace line 2395 with

if (m !== undefined) {


> > I see this mhttpd error starting MSL-script: 
> > Uncaught (in promise) ReferenceError: m is not defined
> > at mhttpd_message (VM2848 mhttpd.js:2304)
> > at VM2848 mhttpd.js:2122
> 
> your line numbers do not line up with my copy of mhttpd.js. what version of midas 
> do you run?
> 
> please give me the output of odbedit "ver" command (GIT revision, looks like this: 
> IT revision:       Wed Feb 3 11:47:02 2021 -0800 - midas-2020-08-a-84-g78d18b1c on 
> branch feature/midas-2020-12).
> 
> same info is in the midas "help" page (GIT revision).
> 
> to decipher the git revision string:
> 
> midas-2020-08-a-84-g78d18b1c means:
> is commit 78d18b1c
> which is 84 commits after git tag midas-2020-08-a
> 
> "on branch feature/midas-2020-12" confirms that I have the midas-2020-12 pre-
> release version without having to do all the decoding above.
> 
> if you also have "-dirty" it means you changed something in the source code 
>  and warranty is voided. (just joking! we can debug even modified midas source 
> code)
> 
> K.O.
          Reply  06 May 2021, Stefan Ritt, Forum, m is not defined error 
Thanks for reporting and pointing to the right location.

I fixed and committed it.

Best,
Stefan
Entry  09 Apr 2021, Lars Martin, Suggestion, Time zone selection for web page 
The new history as well as the clock in the web page header show the local time 
of the user's computer running the browser.
Would it be possible to make it either always use the time zone of the Midas 
server, or make it selectable from the config page?
It's not ideal trying to relate error messages from the midas.log to history 
plots if the time stamps don't match.
    Reply  14 Apr 2021, Stefan Ritt, Suggestion, Time zone selection for web page Screenshot_2021-04-14_at_16.54.12_.png
> The new history as well as the clock in the web page header show the local time 
> of the user's computer running the browser.
> Would it be possible to make it either always use the time zone of the Midas 
> server, or make it selectable from the config page?
> It's not ideal trying to relate error messages from the midas.log to history 
> plots if the time stamps don't match.

I implemented a new row in the config page to select the time zone. 

"Local": Time zone where the browser runs
"Server": Time zone where the midas server runs (you have to update mhttpd for that)
"UTC+X": Any other time zone

The setting affects both the status header and the history display.

I spent quite some time with "named" time zones like "PST" "EST" "CEST", but the 
support for that is not that great in JavaScript, so I decided to go with simple 
UTC+X. Hope that's ok.

Please give it a try and let me know if it's working for you.

Best,
Stefan
       Reply  29 Apr 2021, Pierre-Andre Amaudruz, Suggestion, Time zone selection for web page 
> > The new history as well as the clock in the web page header show the local time 
> > of the user's computer running the browser.
> > Would it be possible to make it either always use the time zone of the Midas 
> > server, or make it selectable from the config page?
> > It's not ideal trying to relate error messages from the midas.log to history 
> > plots if the time stamps don't match.
> 
> I implemented a new row in the config page to select the time zone. 
> 
> "Local": Time zone where the browser runs
> "Server": Time zone where the midas server runs (you have to update mhttpd for that)
> "UTC+X": Any other time zone
> 
> The setting affects both the status header and the history display.
> 
> I spent quite some time with "named" time zones like "PST" "EST" "CEST", but the 
> support for that is not that great in JavaScript, so I decided to go with simple 
> UTC+X. Hope that's ok.
> 
> Please give it a try and let me know if it's working for you.
> 
> Best,
> Stefan

Hi Stefan,

This is great, the UTC+x is perfect, thank you.
PAA
Entry  10 Mar 2021, Zaher Salman, Suggestion, embed modbvalue in SVG 
Is it possible to embed modbvalue in an SVG for use within a custom page?

thanks.
    Reply  10 Mar 2021, Stefan Ritt, Suggestion, embed modbvalue in SVG 
You can't really embed it, but you can overlay it. You tag the SVG with a 
"relative" position and then move the modbvalue with an "absolute" position over 
it:

<svg style="position:relative" width="400" height="100">
  <rect width="300" height="100" style="fill:rgb(255,0,0);stroke-width:3;stroke:rgb(0,0,0)" />
  <div class="modbvalue" style="position:absolute;top:50px;left:50px" data-odb-path="/Runinfo/Run number"></div>
</svg>
       Reply  26 Apr 2021, Zaher Salman, Suggestion, embed modbvalue in SVG 
I found a way to embed modbvalue into a SVG:

<text x="100" y="100" font-size="30rem">
Run=<tspan class="modbvalue" data-odb-path="/Runinfo/Run number"></tspan>
</text>

This seems to behave better that the suggestion below.

> You can't really embed it, but you can overlay it. You tag the SVG with a 
> "relative" position and then move the modbvalue with an "absolute" position over 
> it:
> 
> <svg style="position:relative" width="400" height="100">
>   <rect width="300" height="100" style="fill:rgb(255,0,0);stroke-width:3;stroke:rgb(0,0,0)" />
>   <div class="modbvalue" style="position:absolute;top:50px;left:50px" data-odb-path="/Runinfo/Run number"></div>
> </svg>
Entry  25 Mar 2021, Lars Martin, Bug Report, Minor bug: Change all time axes together doesn't work with +- buttons 
Version: release/midas-2020-12

In the new history display, the checkbox "Change all time axes together" works 
well with the mouse-based zoom, but does not apply to the +- buttons.
    Reply  14 Apr 2021, Stefan Ritt, Bug Report, Minor bug: Change all time axes together doesn't work with +- buttons 
> Version: release/midas-2020-12
> 
> In the new history display, the checkbox "Change all time axes together" works 
> well with the mouse-based zoom, but does not apply to the +- buttons.

Fixed in current commit.

Stefan
Entry  23 Mar 2021, Lars Martin, Bug Report, Time shift in history CSV export Cooling-MoxaCalib-20212118-190450-20212119-102151.pngScreenshot_from_2021-03-23_12-29-21.png
Version: release/midas-2020-12

I'm exporting the history data shown in elog:2132/1 to CSV, but when I look at the 
CSV data, the step no longer occurs at the same time in both data sets (elog:2132/2)
    Reply  23 Mar 2021, Lars Martin, Bug Report, Time shift in history CSV export 
History is from two separate equipments/frontends, but both have "Log history" set to 1.
       Reply  23 Mar 2021, Lars Martin, Bug Report, Time shift in history CSV export 
Tried with export of two different time ranges, and the shift appears to remain the same, 
about 4040 rows.
          Reply  24 Mar 2021, Stefan Ritt, Bug Report, Time shift in history CSV export 
I confirm there is a problem. If variables are from the same equipment, they have the same 
time stamps, like

t1 v1(t1) v2(t1)
t2 v1(t2) v2(t2)
t3 v1(t3) v2(t3)

when they are from different equipments, they have however different time stamps

t1 v1(t1)
t2    v2(t2)
t3 v1(t3)
t4    v2(t4)

The bug in the current code is that all variables use the time stamps of the first variable, 
which is wrong in the case of different equipments, like

t1 v1(t1) v2(*t2*)
t3 v1(t3) v2(*t4*)

So I can change the code, but I'm not sure what would be the bast way. The easiest would be to 
export one array per variable, like

t1 v1(t1)
t2 v1(t2)
...
t3 v2(t3)
t4 v2(t4)
...

Putting that into a single array would leave gaps, like

t1 v1(t1) [gap]
t2 [gap]  v2(t2)
t3 v1(t3) [gap]
t4 [ga]]  v2(t4)

plus this is programmatically more complicated, since I have to merge two arrays. So which 
export format would you prefer?

Stefan
             Reply  24 Mar 2021, Lars Martin, Bug Report, Time shift in history CSV export 
I think from my perspective the separate files are fine. I personally don't really like the format 
with the gaps, so don't see an advantage in putting in the extra work.
I'm surprised the shift is this big, though, it was more than a whole hour in my case, is it the 
time difference between when the frontends were started?
                Reply  14 Apr 2021, Stefan Ritt, Bug Report, Time shift in history CSV export 
I finally found some time to fix this issue in the latest commit. Please update and check if it's 
working for you.

Stefan
Entry  03 Apr 2020, Stefan Ritt, Info, Change of TID_xxx data types 
We have to request of a 64-bit integer data type to be included in MIDAS banks.
Since 64-bit integers are on some systems "long" and on other systems "long long",
I decided to create the two new data types

TID_INT64
TID_UINT64

which follows more the standard C++ tradition:

https://en.cppreference.com/w/cpp/types/integer

To be consistent, I renamed the old types:

TID_BYTE       -> TID_UINT8
TID_SBYTE      -> TID_INT8
TID_WORD       -> TID_UINT16
TID_SHORT      -> TID_INT16
TID_DWORD      -> TID_UINT32
TID_INT        -> TID_INT32

I left the old definitions in midas.h, so old code will still compile fine and be binary
compatible. But if you write new code, the new types are recommended.

If you save the ODB in ASCII format, the new types are used as stings as well, like

[/Experiment]
ODB timeout = INT32 : 10000

but the old types are still understood when you load an old ODB file.

I hope I didn't break anything, please report if you have any issue.

Stefan
    Reply  30 Mar 2021, Konstantin Olchanski, Info, INT64/UINT64/QWORD not permitted in ODB and history... Change of TID_xxx data types 
> We have to request of a 64-bit integer data type to be included in MIDAS banks.
> Since 64-bit integers are on some systems "long" and on other systems "long long",
> I decided to create the two new data types
> 
> TID_INT64
> TID_UINT64
> 

These 64-bit data types do not work with ODB and they do not work with the MIDAS history.

As of commits on 30 March 2021, mlogger will refuse to write them to the history and 
db_create_key() will refuse to create them in ODB.

Why these limitations:

a1) all reading of history is done using the "double" data type, IEEE-754 double precision 
floating point numbers have around 53 bits of precision and are too small to represent all 
possible values of 64-bit integers.
a2) SQL, SQLite and FILE history know nothing about reading and writing 64-bit integer data 
types (this should be easy to fix, as long as MySQL/MariaDB and PostgresQL support it)

b1) in ODB, odbedit and mhttd web pages do not display INT64/UINT64/QWORD data
b2) ODB save and restore from odb, xml and json formats most likely does not work for these 
data types

Fixing all this is possible, with a medium amount of work. As long as somebody needs it. 
Display of INT64/UINT64/QWORD on history plots will probably forever be truncated to 
"double" precision.

K.O.
       Reply  14 Apr 2021, Stefan Ritt, Info, INT64/UINT64/QWORD not permitted in ODB and history... Change of TID_xxx data types 
> These 64-bit data types do not work with ODB and they do not work with the MIDAS history.

They were never meant to work with the history. They were primarily implemented to put large 64-
bit data words into midas banks. We did not yet have a request to put these values into the ODB. 
Once such a request comes, we can address this.

Stefan
    Reply  04 Apr 2021, Konstantin Olchanski, Info, Change of TID_xxx data types 
> 
> To be consistent, I renamed the old types:
> 
> TID_DWORD      -> TID_UINT32
> TID_INT        -> TID_INT32
> 

this created an incompatibility with old XML save files,
old versions of midas cannot load new XML save files,
variable types have changed i.e. from "INT" to "INT32".

it would have been better if XML save files kept using the old names.

now packages that read midas XML files also need updating.

specifically, in ROOTANA:
- the old TVirtualOdb/XmlOdb.cxx (no longer used, deleted),
- mvodb/mxmlodb.cxx

K.O.
Entry  12 Apr 2021, Isaac Labrie Boulay, Forum, Client gets immediately removed when using a script button. logicCtrl.cppstart_daq.PNG
Hi all,

I'm running into a curious problem when I try to run a program using my custom 
script button. I have been using a script button to start my DAQ, this button 
has always worked. It starts by exporting an absolute path to scripts and then 
runs scripts, my frontend, my analyzer, and mlogger relative to this path.

I recently added a line of code to run a new script "logic_controller". If I run 
the script_daq from my terminal (./start_daq), mhttpd accepts the client and the 
program works as intended. But, if I use the script button, the logic_controller 
program is immediately deleted by MIDAS. It can be seen appearing in the status 
page clients list and then immediately gets deleted. This is a client that runs 
on the local experiment host.

What might be the issue? What is the difference between running the script 
through the terminal as opposed to running it through the mhttpd button?

I have added a picture of my simple script and the logic_controller code.

Any help would be greatly appreciated.

Cheers.

Isaac
    Reply  12 Apr 2021, Ben Smith, Forum, Client gets immediately removed when using a script button. 
> if I use the script button, the logic_controller program is immediately deleted by MIDAS.

This is indeed very curious, and I can't reproduce it on my test experiment. Can you redirect stdout and stderr from the logic_controller program into a file, to see how far the program gets? If it gets to the while loop at the end, then it would be useful to add some debug statements to see what condition causes it to exit the loop.

Are there any relevant messages in the midas message log about the program being killed? What's the value of "/Programs/logic_controller/Watchdog timeout"? 
       Reply  12 Apr 2021, Isaac Labrie Boulay, Forum, Client gets immediately removed when using a script button. debug_logic_controller.txt
> > if I use the script button, the logic_controller program is immediately deleted by MIDAS.
> 
> This is indeed very curious, and I can't reproduce it on my test experiment. Can you redirect stdout and stderr from the logic_controller program into a file, to see how far the program gets? If it gets to the while loop at the end, then it would be useful to add some debug statements to see what condition causes it to exit the loop.

I have redirected stdout and stderr into a text file and I have attached it to this entry. From what the stdout says, it seems that the lambda
function gets called 4 times before the program disconnects from the experiment. Somehow the status must become SS_ABORT or RPC_SHUTDOWN.

> Are there any relevant messages in the midas message log about the program being killed? What's the value of "/Programs/logic_controller/Watchdog timeout"? 

There are no interesting messages in the midas.log and "/Programs/logic_controller/Watchdog timeout" is 10000 when I run the command from the terminal window.
What happens when you run it on your test experiment?

I'll try some more debugging.

Thanks for helping me out! Cheers.

Isaac
          Reply  12 Apr 2021, Ben Smith, Forum, Client gets immediately removed when using a script button. 
I think it would be useful to find the minimal example that exhibits this behaviour.

What happens if your logic controller code is simply the 17 lines below? What happens if you create another script button that only starts the logic controller, not any of the other programs? etc. Gradually re-add features until you hit the problem (or scream in horror if it breaks with 17 lines of C++ and a 1 line shell script).



#include "midas.h"
#include "stdio.h"

int main() {
   cm_connect_experiment("", "", "logic_controller", NULL);

   do {
     int status = cm_yield(100);
     printf("cm_yield returned %d\n", status);
     if (status == SS_ABORT || status == RPC_SHUTDOWN)
       break;
   } while (!ss_kbhit());

   cm_disconnect_experiment();

   return 0;
}
             Reply  13 Apr 2021, Isaac Labrie Boulay, Forum, Client gets immediately removed when using a script button. 
> I think it would be useful to find the minimal example that exhibits this behaviour.
> 
> What happens if your logic controller code is simply the 17 lines below? What happens if you create another script button that only starts the logic controller, not any of the other programs? etc. Gradually re-add features until you hit the problem (or scream in horror if it breaks with 17 lines of C++ and a 1 line shell script).
> 

Hi Ben,

I have followed your suggestions and the program still stops immediately. My status as returned from "cm_yield(100)" is always 412 (SS_TIMEOUT) which is fine. 
The issue is that, when run with the script button, the do-wile loop stops immediately because the !ss_kbhit() always evaluates to FALSE.

My temporary solution has been to let the loop run forever :)

Let me know what think. Thanks again!

Isaac

> 
> 
> #include "midas.h"
> #include "stdio.h"
> 
> int main() {
>    cm_connect_experiment("", "", "logic_controller", NULL);
> 
>    do {
>      int status = cm_yield(100);
>      printf("cm_yield returned %d\n", status);
>      if (status == SS_ABORT || status == RPC_SHUTDOWN)
>        break;
>    } while (!ss_kbhit());
> 
>    cm_disconnect_experiment();
> 
>    return 0;
> }
                Reply  13 Apr 2021, Stefan Ritt, Forum, Client gets immediately removed when using a script button. 
> I have followed your suggestions and the program still stops immediately. My status as returned from "cm_yield(100)" is always 412 (SS_TIMEOUT) which is fine. 
> The issue is that, when run with the script button, the do-wile loop stops immediately because the !ss_kbhit() always evaluates to FALSE.
> 
> My temporary solution has been to let the loop run forever :)

Ahh, could be that ss_kbhit() misbehaves if there is no keyboard, meaning that it is started in the background as a script. 
We never had the issue before, since all "standard" midas programs like mlogger, mhttpd etc. also use ss_kbhit() and they 
can be started in the background via the "-D" flag, but maybe the stdin is then handled differentlhy. 

So just remove the ss_kbhit(), but keep the break, so that you can stop your program via the web page, like

#include "midas.h"
#include "stdio.h"

int main() {
  cm_connect_experiment("", "", "logic_controller", NULL);

  do {
    int status = cm_yield(100);
    printf("cm_yield returned %d\n", status);
    if (status == SS_ABORT || status == RPC_SHUTDOWN)
      break;
  } while (TRUE);

  cm_disconnect_experiment();

  return 0;
}
Entry  04 Apr 2021, Konstantin Olchanski, Info, bk_init32a data format 
In April 4th 2020 Stefan added a new data format that fixes the well known problem with alternating banks being 
misaligned against 64-bit addresses. (cannot find announcement on this forum. midas commit 
https://bitbucket.org/tmidas/midas/commits/541732ea265edba63f18367c7c9b8c02abbfc96e)

This brings the number of midas data formats to 3:

bk_init: bank_header_flags set to 0x0000001 (BANK_FORMAT_VERSION)
bk_init32: bank_header_flags set to 0x0000011 (BANK_FORMAT_VERSION | BANK_FORMAT_32BIT)
bk_init32a: bank_header_flags set to 0x0000031 (BANK_FORMAT_VERSION | BANK_FORMAT_32BIT | BANK_FORMAT_64BIT_ALIGNED;

TMEvent (midasio and manalyzer) support for "bk_init32a" format added today (commit 
https://bitbucket.org/tmidas/midasio/commits/61b7f07bc412ea45ed974bead8b6f1a9f2f90868)

TMidasEvent (rootana) support for "bk_init32a" format added today (commit 
https://bitbucket.org/tmidas/rootana/commits/3f43e6d30daf3323106a707f6a7ca2c8efb8859f)

ROOTANA should be able to handle bk_init32a() data now.

TMFE MIDAS c++ frontend switched from bk_init32() to bk_init32a() format (midas commit 
https://bitbucket.org/tmidas/midas/commits/982c9c2f8b1e329891a782bcc061d4c819266fcc)

K.O.
    Reply  13 Apr 2021, Konstantin Olchanski, Info, bk_init32a data format 
Until commit a4043ceacdf241a2a98aeca5edf40613a6c0f575 today, mdump mostly did not work with bank32a data.
K.O.


> In April 4th 2020 Stefan added a new data format that fixes the well known problem with alternating banks being 
> misaligned against 64-bit addresses. (cannot find announcement on this forum. midas commit 
> https://bitbucket.org/tmidas/midas/commits/541732ea265edba63f18367c7c9b8c02abbfc96e)
> 
> This brings the number of midas data formats to 3:
> 
> bk_init: bank_header_flags set to 0x0000001 (BANK_FORMAT_VERSION)
> bk_init32: bank_header_flags set to 0x0000011 (BANK_FORMAT_VERSION | BANK_FORMAT_32BIT)
> bk_init32a: bank_header_flags set to 0x0000031 (BANK_FORMAT_VERSION | BANK_FORMAT_32BIT | BANK_FORMAT_64BIT_ALIGNED;
> 
> TMEvent (midasio and manalyzer) support for "bk_init32a" format added today (commit 
> https://bitbucket.org/tmidas/midasio/commits/61b7f07bc412ea45ed974bead8b6f1a9f2f90868)
> 
> TMidasEvent (rootana) support for "bk_init32a" format added today (commit 
> https://bitbucket.org/tmidas/rootana/commits/3f43e6d30daf3323106a707f6a7ca2c8efb8859f)
> 
> ROOTANA should be able to handle bk_init32a() data now.
> 
> TMFE MIDAS c++ frontend switched from bk_init32() to bk_init32a() format (midas commit 
> https://bitbucket.org/tmidas/midas/commits/982c9c2f8b1e329891a782bcc061d4c819266fcc)
> 
> K.O.
Entry  22 Sep 2020, Frederik Wauters, Forum, INT INT32 in experim.h 
For my analyzer I generate the experim.h file from the odb.

Before midas commit 13c3b2b this generates structs with INT data types. compiles fine with my analysis code (using the old mana.cpp)

newer midas versions generate INT32, ... types. I get a 

‘INT32’ does not name a type   

although I include midas.h 

how to fix this?
    Reply  22 Sep 2020, Konstantin Olchanski, Forum, INT INT32 in experim.h 
> For my analyzer I generate the experim.h file from the odb.
> 
> Before midas commit 13c3b2b this generates structs with INT data types. compiles fine with my analysis code (using the old mana.cpp)
> 
> newer midas versions generate INT32, ... types. I get a 
> 
> ‘INT32’ does not name a type   
> 
> although I include midas.h 
> 
> how to fix this?

You could run experim.h through "sed" to replace the "wrong" data types with the correct data types.

You can also #define the "wrong" data types before doing #include experim.h.

I put your bug report into our bug tracker, but for myself I am very busy
with the alpha-g experiment and cannot promise to fix this quickly.

https://bitbucket.org/tmidas/midas/issues/289/int32-types-in-experimh

Here is an example to substitute things using "sed" (it can also do "in-place" editing, "man sed" and google sed examples)
sed "sZshm_unlink(.*)Zshm_unlink(SHM)Zg"

K.O.
       Reply  09 Mar 2021, Andreas Suter, Forum, INT INT32 in experim.h 
> > For my analyzer I generate the experim.h file from the odb.
This issue is still open. Shouldn't midas.h provide the 'new' data types as typedefs like  

typedef int INT32;

etc. Of course you would need to deal with all the supported targets and wrap it accordingly.

A.S.

> > 
> > Before midas commit 13c3b2b this generates structs with INT data types. compiles fine with my analysis code (using the old mana.cpp)
> > 
> > newer midas versions generate INT32, ... types. I get a 
> > 
> > ‘INT32’ does not name a type   
> > 
> > although I include midas.h 
> > 
> > how to fix this?
> 
> You could run experim.h through "sed" to replace the "wrong" data types with the correct data types.
> 
> You can also #define the "wrong" data types before doing #include experim.h.
> 
> I put your bug report into our bug tracker, but for myself I am very busy
> with the alpha-g experiment and cannot promise to fix this quickly.
> 
> https://bitbucket.org/tmidas/midas/issues/289/int32-types-in-experimh
> 
> Here is an example to substitute things using "sed" (it can also do "in-place" editing, "man sed" and google sed examples)
> sed "sZshm_unlink(.*)Zshm_unlink(SHM)Zg"
> 
> K.O.
          Reply  10 Mar 2021, Stefan Ritt, Forum, INT INT32 in experim.h 
Ok, I added

/* define integer types with explicit widths */
#ifndef NO_INT_TYPES_DEFINE
typedef unsigned char      UINT8;
typedef char               INT8;
typedef unsigned short     UINT16;
typedef short              INT16;
typedef unsigned int       UINT32;
typedef int                INT32;
typedef unsigned long long UINT64;
typedef long long          INT64;
#endif

to cover all new types. If there is a collision with user defined types, compile your program with -DNO_INT_TYPES_DEFINE and you remove the 
above definition. I hope there are no other conflicts.

Stefan
             Reply  15 Mar 2021, Frederik Wauters, Forum, INT INT32 in experim.h 
works!

> Ok, I added
> 
> /* define integer types with explicit widths */
> #ifndef NO_INT_TYPES_DEFINE
> typedef unsigned char      UINT8;
> typedef char               INT8;
> typedef unsigned short     UINT16;
> typedef short              INT16;
> typedef unsigned int       UINT32;
> typedef int                INT32;
> typedef unsigned long long UINT64;
> typedef long long          INT64;
> #endif
> 
> to cover all new types. If there is a collision with user defined types, compile your program with -DNO_INT_TYPES_DEFINE and you remove the 
> above definition. I hope there are no other conflicts.
> 
> Stefan
                Reply  30 Mar 2021, Konstantin Olchanski, Forum, INT INT32 in experim.h 
> > 
> > /* define integer types with explicit widths */
> > #ifndef NO_INT_TYPES_DEFINE
> > typedef unsigned char      UINT8;
> > typedef char               INT8;
> > typedef unsigned short     UINT16;
> > typedef short              INT16;
> > typedef unsigned int       UINT32;
> > typedef int                INT32;
> > typedef unsigned long long UINT64;
> > typedef long long          INT64;
> > #endif
> > 

NIH at work. In C and C++ the standard fixed bit length data types are available
in #include <stdint.h> as uint8_t, uint16_t, uint32_t, uint64_t & co.

BTW, the definition of UINT32 as "unsigned int" is technically incorrect, on 16-bit machines
"int" is 16-bit wide and on some 64-bit machines "int" is 64-bit wide.

K.O.
Entry  05 Mar 2021, Svetlana Chesnevskaya, Bug Report, New MIDAS old frontend incompatibility error.log
Hello!

Could you help me solve the problem of compatibility between our frontend (created in 2017) and the fresh MIDAS? The old MIDAS (2017) worked well, then we did not use it.
While compiling the frontend, I get a lot of warnings and a few compilation errors.

Any help will be greatly appreciated.

Thanks in advance.
With the best regards,
Svetlana
Entry  01 Mar 2021, Marius Koeppel, Forum, Using JSROOT.openFile with Midas 
Hi everyone,

I am currently trying to access a ROOT file produced by manalyzer. By calling JSROOT.openFile("MIDAS_DOMAIN/outputRUN.root"). I can download the rootfile via MIDAS_DOMAIN/outputRUN.root. Using JSROOT.openFile results in an 501 error, 
since the request feature is not provided. Using a simple API and uploading outputRUN.root there worked fine (when the run finised). 

Is there a way to use JSROOT.openFile with the current analyzed root file in Midas (so during the run)? I know that one can access histograms of the THttpServer via JSON but I need to get the full root tree.

Cheers,
Marius
    Reply  03 Mar 2021, Konstantin Olchanski, Forum, Using JSROOT.openFile with Midas 
> 
> I am currently trying to access a ROOT file produced by manalyzer. By calling JSROOT.openFile("MIDAS_DOMAIN/outputRUN.root"). I can download the rootfile via MIDAS_DOMAIN/outputRUN.root. Using JSROOT.openFile results in an 501 error, 
> since the request feature is not provided. Using a simple API and uploading outputRUN.root there worked fine (when the run finised). 
> 
> Is there a way to use JSROOT.openFile with the current analyzed root file in Midas (so during the run)? I know that one can access histograms of the THttpServer via JSON but I need to get the full root tree.
> 

Good questions. Right now in manalyzer I do not do anything more than starting the ROOT web server (so whatever
they support should work) and providing two "standard" location: one in the output file for histograms
and other permanent output and one in memory for transient objects, such as waveform plots, etc.

At some point I would like to provide a function to "get" TAFlowEvent objects so you can do things
like event displays in javascript. But I need a c++ to json serializer and standard c++ does not have it.
So I will have to use the clang serializer (also used by ROOT) and it will take me a few days
to figure it out.

Back to "openFile".

If you figure out the missing bits that need to be added to our code,
please post them here or submit them as a pull request or a bug report in bitbucket.

Also it would be good if you can provide a code example of "openFile" working elsewhere
but not with manalyzer, if I can run it, maybe I can figure out what's missing. But lacking
some example code, there is nothing for me to hack at.

K.O.
       Reply  04 Mar 2021, Marius Koeppel, Forum, Using JSROOT.openFile with Midas test.htmlexample_root.root
Thank you for the answer :)

> At some point I would like to provide a function to "get" TAFlowEvent objects so you can do things
> like event displays in javascript. But I need a c++ to json serializer and standard c++ does not have it.
> So I will have to use the clang serializer (also used by ROOT) and it will take me a few days
> to figure it out.

That sounds exactly what I was searching for. Because I wanted to create an interface between rootana and 
an event display build in javascript. Since all I tried did not really worked with the current rootana
I have now a "solution". I use the MIDAS python client, read directly events from the MIDAS buffer and provided
the events in a JSON format with the python flask API. Since the rendering of the event display is the bottleneck 
and I only need a view events to display this solution worked really well for me. Maybe having such a JSON API of 
the event buffer in MIDAS directly would also work for most of the event display applications or other simple javescript 
applications (my opinion).

> If you figure out the missing bits that need to be added to our code,
> please post them here or submit them as a pull request or a bug report in bitbucket.

One of the problems I had was the CORS domain of the THttpServer. In manalyzer:1894 you do 
sprintf(str, "http:127.0.0.1:%d", httpPort); but there are additional options for the THttpServer (like "?cors=DOMAIN").
So maybe a flag while starting manalyzer passing such options would be nice. I will create a pull request passing them later.

> Also it would be good if you can provide a code example of "openFile" working elsewhere
> but not with manalyzer, if I can run it, maybe I can figure out what's missing. But lacking
> some example code, there is nothing for me to hack at.

The problem is that it was not even running on simple THttpServer using interactive root:

    serv = new THttpServer("http:8088?cors=*");
    TFile *_file0 = TFile::Open("example_root.root")
    serv->Register("File", _file0);

So I tried just saving the file in the $MIDAS_DIR and tried to use mserver with JSROOT.openFile. I attached the html file and 
a test root file.

Cheers,
Marius
          Reply  04 Mar 2021, Konstantin Olchanski, Forum, Using JSROOT.openFile with Midas 
well, if this is something in ROOT, perhaps you can pursue it with the ROOT crowd,
they are quite friendly.

on my side, if all you need is to pull event data banks, this is easy to add
in mhttpd.

the jsonrpc request will look something like this:

get_event {
"buffer":"system",
"get_type":"GET_LATEST", (or whatever bm_receive_event() can do)
"include_banks":["AAAA","BBBB"],
"exclude_banks":["CCCC","DDDD"]
}

and return something like this:

event {
"header":{"event_id":1,...},
"banks":{
"AAAA":[1,2,3,4],
"BBBB":NULL (you asked for it, so you always get it, but it is NULL if bank does not exist)
}
}

would this work for what you are doing?

(this is not good enough if data has to be pre-digested by c++ analysis in rootana)

K.O.
             Reply  04 Mar 2021, Stefan Ritt, Forum, Using JSROOT.openFile with Midas 
I also need midas events going back to the browser for single event display, so put +1 for me.

Please also consider to use JavaScript typed arrays instead of JSON. For large midas banks, type 
arrays are 5-10 times faster than JSON encoding/decoding.

Best,
Stefan
             Reply  04 Mar 2021, Marius Koeppel, Forum, Using JSROOT.openFile with Midas 
> would this work for what you are doing?

Yes, having such a function would be perfect for the applications I have a the moment.

> (this is not good enough if data has to be pre-digested by c++ analysis in rootana)

Also agree, if one wants to have a more sophisticated applications it is definitely needed to preprocess the data.

Cheers,
Marius
Entry  25 Feb 2021, Isaac Labrie Boulay, Bug Report, Undefined client causing issues in transition. error_message.PNGundefined_client.PNG
Hi all,

I'm currently experiencing an issue during run transitions. It comes in the form 
of an alert saying "TypeError: Cannot read property 'length' of undefined" 
whenever I'm in the "transition" window on mhttpd. I have attached an image of 
what the transition window looks like when this happens. 

By the looks of it and by peering at the lines in transition.html where the 
error occurs, it's pretty obvious that there is some strange undefined client 
that the web page tries to access.

I don't know how to find what this client is. Is there a way to see it in the 
ODB? 

The issues happens in show_client() of transition.html (called by callback()). 
Here's the trace:

Uncaught (in promise) TypeError: Cannot read property 'length' of undefined
    at show_client (?cmd=Transition:227)
    at callback (?cmd=Transition:420)
    at ?cmd=Transition:430

Any help would be very appreciated!

Thanks so much.

Isaac
    Reply  25 Feb 2021, Konstantin Olchanski, Bug Report, Undefined client causing issues in transition. 
Clearly something goes wrong with the STARTABORT transition. Actually from your 
sceenshot, it is not clear why the STARTABORT transition was initiated.

Usually it is called after some client fails the "start run" transition to inform 
other clients that the run did not start after all. (mlogger uses this to close the 
output file, etc).

But in the screenshot, we do not see any client fail the transition (only rootana1 
was called, and it returned "green").

So, a puzzle. One possibility is that the transition code gets so confused
that it does not record correct transition data to ODB, then the web page
gets even more confused.

One way to see what happens, is to run the odbedit command "start now -v".

Can you try that? And attach all it's output here?

K.O.
       Reply  26 Feb 2021, Isaac Labrie Boulay, Bug Report, Undefined client causing issues in transition. start_now_-v_(1).PNGstop.PNG
> Clearly something goes wrong with the STARTABORT transition. Actually from your 
> sceenshot, it is not clear why the STARTABORT transition was initiated.
> 
> Usually it is called after some client fails the "start run" transition to inform 
> other clients that the run did not start after all. (mlogger uses this to close the 
> output file, etc).
> 
> But in the screenshot, we do not see any client fail the transition (only rootana1 
> was called, and it returned "green").
> 
> So, a puzzle. One possibility is that the transition code gets so confused
> that it does not record correct transition data to ODB, then the web page
> gets even more confused.
> 
> One way to see what happens, is to run the odbedit command "start now -v".
> 
> Can you try that? And attach all its output here?
> 
> K.O.

Thanks for getting back to me right away. I've attached two screenshots. The first one 
is the output after running "start now -v" (everything seemed to work nicely there), the 
second output is after using odbedit to stop the run with "stop". Notice that the DAQ 
never stops because it gets stuck in between transitions (You can see the run status 
being "stopping run" with the cancel transition button).

Thanks.

Isaac
          Reply  26 Feb 2021, Konstantin Olchanski, Bug Report, Undefined client causing issues in transition. 
So there is no error on run start anymore? To debug the stuck run stop, please use "stop -v" 
to see where it got stuck. You can also play with the RPC timeouts (the connect timeout and 
the response timeout), to make it get "unstuck" quicker. Definitely it should not be stuck 
forever, it should timeout at maximum of "rpc timeout * number of clients". K.O.
             Reply  26 Feb 2021, Isaac Labrie Boulay, Bug Report, Undefined client causing issues in transition. 
> So there is no error on run start anymore? To debug the stuck run stop, please use "stop -v" 
> to see where it got stuck. You can also play with the RPC timeouts (the connect timeout and 
> the response timeout), to make it get "unstuck" quicker. Definitely it should not be stuck 
> forever, it should timeout at maximum of "rpc timeout * number of clients". K.O.

You're right it does not stay stuck forever, it eventually gets unstuck. I forgot to mention this. 
I will try to play with these timeout parameters. It does not get stuck if I run my DAQ using the 
odbedit commands (start/stop). I don't know if this is relevant information that could help us 
identify the problem.

Thanks for all your help as always!

Isaac
                Reply  03 Mar 2021, Konstantin Olchanski, Bug Report, Undefined client causing issues in transition. 
> It does not get stuck if I run my DAQ using the odbedit commands (start/stop).

Interesting. Run start/stop from odbedit works but from mhttpd gets stuck.

I think they do not run the transition quite the same way. mhttpd uses the multithreaded transition.

So we can debug this using the "mtransition" program. Try:

- mtransition -v -d 1 START/STOP -- this should be same as odbedit
- mtransition -m -v -d 1 START/STOP -- this should be same as mhttpd

The "-v" and "-d 1" flags should cause lots of output, for failed transitions,
cut-and-paste it all into this elog here, should give us plenty of meat to debug.

K.O.
Entry  02 Mar 2021, Konstantin Olchanski, Info, shortest possible sleep 
since I am implementing a polled equipment, I was curious what is the smallest possible sleep time on current computers.

in current UNIX, there are 2 system calls available for sleeping: select() (with microsecond granularity) and nanosleep() (with nanosecond granularity).

So I wrote a little test program to check it out (progs/test_sleep).

First, Linux result using select(). Typical run on AMD 3700X CPU (4.1 GHz turbo boost) with Ubuntu LTS 20, linux kernel 5.8:

daq13:midas$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1003368.855 usec actual, 100336.885 usec actual per loop, oversleep 336.885 usec, 0.3%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1008512.020 usec actual, 10085.120 usec actual per loop, oversleep 85.120 usec, 0.9%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1062137.842 usec actual, 1062.138 usec actual per loop, oversleep 62.138 usec, 6.2%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1528650.999 usec actual, 152.865 usec actual per loop, oversleep 52.865 usec, 52.9%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  6250898.123 usec actual, 62.510 usec actual per loop, oversleep 52.510 usec, 525.1%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total, 54056918.144 usec actual, 54.057 usec actual per loop, oversleep 53.057 usec, 5305.7%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total,   210875.988 usec actual, 0.211 usec actual per loop, oversleep 0.111 usec, 110.9%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total,   204804.897 usec actual, 0.205 usec actual per loop, oversleep 0.195 usec, 1948.0%
daq13:midas$ 

How to read this:

First line is 10 sleeps of 100 ms, for a total of 1 sec. this actually sleeps for a bit longer,
average over-sleep is 300 usec out of 100 ms is 0.3%.

Next few lines use progressively shorter sleep, 10 ms, 1 ms and 0.1 ms. over-sleep is consistently around 50-60 usec,
which I conclude to be this linux sleep granularity.

Last two lines try sleep for 0.1 usec and 0.01 usec, resulting in a zero-time sleep of select(),
so we just measure the average time cost of a linux syscall, around 200 ns in this machine.

Going to different machines:

Intel E-2236 (4.8 GHz tutboboost), Ubuntu LTS 20, linux kernel 5.8: over-sleep is 60 usec, zero-sleep is 400 ns.
Intel E-2226G (same, see arc.intel.com), CentOS-7, linux kernel 3.10: over-sleep is 60 usec, zero-sleep is 600 ns.
VME processor (2 GHz Intel T7400), Ubuntu 20, linux kernel 5.8: over-sleep is 60 usec, zero-sleep is 1700 ns.

This is pretty consistent, select() over-sleep is 60 usec on all hardware, zero-sleep tracks CPU GHz ratings.

Next, MacOS result, MacBookAir2020, MacOS 10.15.7, CPU 1.2 GHz i7-1060G7:

4ed0:midas olchansk$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1031108.856 usec actual, 103110.886 usec actual per loop, oversleep 3110.886 usec, 3.1%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1091104.984 usec actual, 10911.050 usec actual per loop, oversleep 911.050 usec, 9.1%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1270800.829 usec actual, 1270.801 usec actual per loop, oversleep 270.801 usec, 27.1%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1370345.116 usec actual, 137.035 usec actual per loop, oversleep 37.035 usec, 37.0%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  1706473.112 usec actual, 17.065 usec actual per loop, oversleep 7.065 usec, 70.6%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total,  5150341.034 usec actual, 5.150 usec actual per loop, oversleep 4.150 usec, 415.0%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total,   595654.011 usec actual, 0.596 usec actual per loop, oversleep 0.496 usec, 495.7%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total,   591560.125 usec actual, 0.592 usec actual per loop, oversleep 0.582 usec, 5815.6%
4ed0:midas olchansk$ 

things are quite different here, OS is Mach microkernel with an oldish FreeBSD UNIX single-server (from NextSTEP),
so the sleep granularity is different, better than linux. zero-sleep still measures the syscall time, 600 ns on this machine.

Next we measure the same using the nansleep() syscall.

daq13:midas$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1004133.940 usec actual, 100413.394 usec actual per loop, oversleep 413.394 usec, 0.4%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1046117.067 usec actual, 10461.171 usec actual per loop, oversleep 461.171 usec, 4.6%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1096894.979 usec actual, 1096.895 usec actual per loop, oversleep 96.895 usec, 9.7%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1526744.843 usec actual, 152.674 usec actual per loop, oversleep 52.674 usec, 52.7%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  6250154.018 usec actual, 62.502 usec actual per loop, oversleep 52.502 usec, 525.0%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total, 53344123.125 usec actual, 53.344 usec actual per loop, oversleep 52.344 usec, 5234.4%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total, 52641665.936 usec actual, 52.642 usec actual per loop, oversleep 52.542 usec, 52541.7%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total, 52637501.001 usec actual, 52.638 usec actual per loop, oversleep 52.628 usec, 526275.0%
daq13:midas$ 

Here everything is simple. sleep longer than 1000 usec works the same as select(), sleep for shorter than 100 usec sleeps for 52 usec, regardless of what 
we ask for.

MacOS does no better, long sleeps are same as select(), sleeps is 1 usec or less sleep for too long. no improvement over select().

4ed0:midas olchansk$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1023327.827 usec actual, 102332.783 usec actual per loop, oversleep 2332.783 usec, 2.3%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1130330.086 usec actual, 11303.301 usec actual per loop, oversleep 1303.301 usec, 13.0%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1333846.807 usec actual, 1333.847 usec actual per loop, oversleep 333.847 usec, 33.4%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1402330.160 usec actual, 140.233 usec actual per loop, oversleep 40.233 usec, 40.2%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  2034706.831 usec actual, 20.347 usec actual per loop, oversleep 10.347 usec, 103.5%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total,  6646192.074 usec actual, 6.646 usec actual per loop, oversleep 5.646 usec, 564.6%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total,  7556284.189 usec actual, 7.556 usec actual per loop, oversleep 7.456 usec, 7456.3%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total, 15720005.035 usec actual, 15.720 usec actual per loop, oversleep 15.710 usec, 157100.1%
4ed0:midas olchansk$ 

On Linux, strace tells us that the actual syscall behind nanosleep() is this:
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=10000}, 0x7fffc159e200) = 0

Let's try it directly... result is the same.
Let's try it with CLOCK_MONOTONIC... result is the same.

The man page of clock_nanosleep() specifies that this syscall always suspends the calling thread,
so what we see here is the Linux scheduler tick size.

Bottom line.

On current linux, shortest sleep is around 100 usec both select() and nanosleep().
On MacOS, shortest sleep is down to 5 usec using select(), but I cannot tell if CPU sleeps or busy-loops.

select() is still the best syscall for sleeping.

K.O.
    Reply  02 Mar 2021, Stefan Ritt, Info, shortest possible sleep 
Why do you need that? Periodic equipment typically runs ever ten seconds or so, meaning one can do this easily in a scheduler.

For polled equipment, you don't want to sleep at all. Because if you sleep, you might miss an event. That's why I put my poll in mfe.c into a for() loop. No 
sleep, maximum polling rate. I just double checked on my macbook air. 

- If poll is always false (no event available), the loop executes 50M times in 100ms (calibrated during startup of the frontend). That means one iteration 
takes 2ns (!). So if an event occurs, the readout is started with a 2ns overhead. No sleep can beat that. In a real world application, one has to add of course 
the VME access or so to poll for the event.

- If poll is always true, the framework generates about 700k events each second (returning jus a few bytes of event data).

So if one adds any sleep here, things can get only worse, so I don't see the point for that. Of course polling eats one kernel at 100%, but these days every 
CPU has more than one, even my 800 MHz Xilinx embedded ARM CPU (Zynq).

Best,
Stefan
       Reply  03 Mar 2021, Konstantin Olchanski, Info, shortest possible sleep 
> Why do you need that?

UNIX/POSIX advertises functions for sleeping in microseconds and nanoseconds,
for sure it is interesting to know what they actually do and what happens
when you ask them to sleep for 1 microsecond or 1 nanosecond.

To sleep or not to sleep that is a question.

But if I do decide to sleep, and I call the sleep function, I want to know what actually happens.

Now I do and I share it with all.

On current Linux, shortest sleep is around 60 usec. select() with sleep
shorter than that will not sleep at all, nanosleep() will always sleep for
the shortest amount.

P.S. For fans of interrupts ("because they are fast"), sleep waiting for interrupt
probably has same latency/granularity as above (60 usec), so if I drive a DMA engine
and I except the DMA transfer to complete under 60 usec, I should use a busy loop
to poll the "DMA done" bit instead of going to sleep and wait for the DMA interrupt.

K.O.
Entry  25 Feb 2021, Lars Martin, Bug Report, tmfe_main.cxx missing include <signal.h> 
The most recent commit (b43aef648c2f8a7e710a327d0b322751ae44afea) throws this 
compiler error:
src/tmfe_main.cxx:39:11: error: 'SIGPIPE' was not declared in this scope
    signal(SIGPIPE, SIG_IGN);

It's fixed by adding #include <signal.h> to that file.
    Reply  25 Feb 2021, Konstantin Olchanski, Bug Report, tmfe_main.cxx missing include <signal.h> 
> The most recent commit (b43aef648c2f8a7e710a327d0b322751ae44afea) throws this 
> compiler error:
> src/tmfe_main.cxx:39:11: error: 'SIGPIPE' was not declared in this scope
>     signal(SIGPIPE, SIG_IGN);
> 
> It's fixed by adding #include <signal.h> to that file.

"but it works just fine on my mac!"

anyhow, thank you for reporting this problem, it already fixed. the bitbucket auto-
build also caught it. I also boogered up "make remoteonly", also fixed now.

BTW, for production use I recommend midas from the "release" branches, unless one 
needs a bug fix or new feature from the development branch.

K.O.
       Reply  26 Feb 2021, Lars Martin, Bug Report, tmfe_main.cxx missing include <signal.h> 
> BTW, for production use I recommend midas from the "release" branches, unless one 
> needs a bug fix or new feature from the development branch.

Fair point. I would suggest adding that recommendation to the wiki instructions. I 
forget to add that step otherwise.
Entry  10 Feb 2021, Isaac Labrie Boulay, Forum, Javascript error during run transitions. 
Hi all,

I am encountering a Javascript error (TypeError: client.error is undefined) when 
I transition between run states. Does anybody have an idea of what my problem 
might be? I have pasted an example of what MIDAS logs during such sequences.

Thanks for all the help!

Isaac


09:24:08.611 2021/02/10 [mhttpd,INFO] Executing script 
"~/ANIS_20210106/scripts/start_daq.sh" from ODB "/Script/Start DAQ"

09:24:13.833 2021/02/10 [Logger,LOG] Program Logger on host localhost started

09:24:28.598 2021/02/10 [fevme,LOG] Program fevme on host localhost started

09:24:33.951 2021/02/10 [mhttpd,INFO] Run #234 started

09:26:30.970 2021/02/10 [mhttpd,ERROR] [midas.cxx:4260:cm_transition_call,ERROR] 
Client "Logger" transition 2 aborted while waiting for client "fevme": 
"/Runinfo/Transition in progress" was cleared

09:26:31.015 2021/02/10 [mhttpd,ERROR] [midas.cxx:5120:cm_transition,ERROR] 
transition STOP aborted: "/Runinfo/Transition in progress" was cleared

09:27:27.270 2021/02/10 [mhttpd,ERROR] 
[system.cxx:4937:ss_recv_net_command,ERROR] timeout receiving network command 
header

09:27:27.270 2021/02/10 [mhttpd,ERROR] [midas.cxx:12262:rpc_client_call,ERROR] 
call to "fevme" on "localhost" RPC "rc_transition": timeout waiting for reply
    Reply  10 Feb 2021, Konstantin Olchanski, Forum, Javascript error during run transitions. 
> I am encountering a Javascript error (TypeError: client.error is undefined) when 
> I transition between run states. Does anybody have an idea of what my problem 
> might be? I have pasted an example of what MIDAS logs during such sequences.


Not enough information. Can you do this:

a) for the javascript error, if you get it every time, open the javascript debugger 
and capture the stack trace? or at least the file name, function name and line number 
where the javascript exception is thrown?

b) for the run start failure, start the run from odbedit "start now -v" or from 
"mtransition -v -d 1 START" (or "stop" as the case may be). capture the output, email 
to me directly or put in this elog here.

K.O.


> 
> Thanks for all the help!
> 
> Isaac
> 
> 
> 09:24:08.611 2021/02/10 [mhttpd,INFO] Executing script 
> "~/ANIS_20210106/scripts/start_daq.sh" from ODB "/Script/Start DAQ"
> 
> 09:24:13.833 2021/02/10 [Logger,LOG] Program Logger on host localhost started
> 
> 09:24:28.598 2021/02/10 [fevme,LOG] Program fevme on host localhost started
> 
> 09:24:33.951 2021/02/10 [mhttpd,INFO] Run #234 started
> 
> 09:26:30.970 2021/02/10 [mhttpd,ERROR] [midas.cxx:4260:cm_transition_call,ERROR] 
> Client "Logger" transition 2 aborted while waiting for client "fevme": 
> "/Runinfo/Transition in progress" was cleared
> 
> 09:26:31.015 2021/02/10 [mhttpd,ERROR] [midas.cxx:5120:cm_transition,ERROR] 
> transition STOP aborted: "/Runinfo/Transition in progress" was cleared
> 
> 09:27:27.270 2021/02/10 [mhttpd,ERROR] 
> [system.cxx:4937:ss_recv_net_command,ERROR] timeout receiving network command 
> header
> 
> 09:27:27.270 2021/02/10 [mhttpd,ERROR] [midas.cxx:12262:rpc_client_call,ERROR] 
> call to "fevme" on "localhost" RPC "rc_transition": timeout waiting for reply
       Reply  11 Feb 2021, Isaac Labrie Boulay, Forum, Javascript error during run transitions. start_now_-v.PNGCall_Stack_for_JavaScript_Error.PNG
> > I am encountering a Javascript error (TypeError: client.error is undefined) when 
> > I transition between run states. Does anybody have an idea of what my problem 
> > might be? I have pasted an example of what MIDAS logs during such sequences.
> 
> 
> Not enough information. Can you do this:
> 
> a) for the javascript error, if you get it every time, open the javascript debugger 
> and capture the stack trace? or at least the file name, function name and line number 
> where the javascript exception is thrown?

I've attached a screenshot of the call stack showing the file names and line numbers.

> b) for the run start failure, start the run from odbedit "start now -v" or from 
> "mtransition -v -d 1 START" (or "stop" as the case may be). capture the output, email 
> to me directly or put in this elog here.

I have also attached a screen capture of the output.

Thanks for your help as always.

Isaac

> K.O.
> 
> 
> > 
> > Thanks for all the help!
> > 
> > Isaac
> > 
> > 
> > 09:24:08.611 2021/02/10 [mhttpd,INFO] Executing script 
> > "~/ANIS_20210106/scripts/start_daq.sh" from ODB "/Script/Start DAQ"
> > 
> > 09:24:13.833 2021/02/10 [Logger,LOG] Program Logger on host localhost started
> > 
> > 09:24:28.598 2021/02/10 [fevme,LOG] Program fevme on host localhost started
> > 
> > 09:24:33.951 2021/02/10 [mhttpd,INFO] Run #234 started
> > 
> > 09:26:30.970 2021/02/10 [mhttpd,ERROR] [midas.cxx:4260:cm_transition_call,ERROR] 
> > Client "Logger" transition 2 aborted while waiting for client "fevme": 
> > "/Runinfo/Transition in progress" was cleared
> > 
> > 09:26:31.015 2021/02/10 [mhttpd,ERROR] [midas.cxx:5120:cm_transition,ERROR] 
> > transition STOP aborted: "/Runinfo/Transition in progress" was cleared
> > 
> > 09:27:27.270 2021/02/10 [mhttpd,ERROR] 
> > [system.cxx:4937:ss_recv_net_command,ERROR] timeout receiving network command 
> > header
> > 
> > 09:27:27.270 2021/02/10 [mhttpd,ERROR] [midas.cxx:12262:rpc_client_call,ERROR] 
> > call to "fevme" on "localhost" RPC "rc_transition": timeout waiting for reply
          Reply  25 Feb 2021, Konstantin Olchanski, Forum, Javascript error during run transitions. 
> 
> I have also attached a screen capture of the output.
> 

so the error is gone?

K.O.
Entry  18 Feb 2021, Pintaudi Giorgio, Bug Report, Unexpected end-of-file 
Hello!
Sometimes when I mess around with the history plots I get the following error:

[mhttpd,ERROR] [history.cxx:97:xread,ERROR] Error: Unexpected end-of-file when 
reading file "/home/wagasci-ana/Data/online/210219.hst"

I have tried the following without success:

- Remove the MIDAS history files
- Restart mhttpd and mlogger

I do not know what triggers the error but when it triggers the above message is 
printed hundres of times a second, completely spamming the message log.

It happened again today after I set the label of a frontend too long making 
mlogger crash. After fixing the label length, the above message appeared and it 
does not seem to go away.
    Reply  18 Feb 2021, Pintaudi Giorgio, Bug Report, Unexpected end-of-file Screenshot_from_2021-02-19_15-41-23.png
It appears that the issue is trigger by a nonexisting Event and Variable as shown 
in the attached picture. This issue can arise when restoring the ODB from a 
previous version or importing ODB values from other MIDAS instances.
It might be useful if the error message were more clear about the source of the 
problem.

> Hello!
> Sometimes when I mess around with the history plots I get the following error:
> 
> [mhttpd,ERROR] [history.cxx:97:xread,ERROR] Error: Unexpected end-of-file when 
> reading file "/home/wagasci-ana/Data/online/210219.hst"
> 
> I have tried the following without success:
> 
> - Remove the MIDAS history files
> - Restart mhttpd and mlogger
> 
> I do not know what triggers the error but when it triggers the above message is 
> printed hundres of times a second, completely spamming the message log.
> 
> It happened again today after I set the label of a frontend too long making 
> mlogger crash. After fixing the label length, the above message appeared and it 
> does not seem to go away.
       Reply  25 Feb 2021, Konstantin Olchanski, Bug Report, Unexpected end-of-file 
> > [mhttpd,ERROR] [history.cxx:97:xread,ERROR] Error: Unexpected end-of-file when 
> > reading file "/home/wagasci-ana/Data/online/210219.hst"

I am puzzled. We can try two things:

a) look inside the "bad" hst file, maybe we can see something. run "mhdump -L 
/home/wagasci-ana/Data/online/210219.hst". If there is anything wrong with the file, it 
will be probably at the end. You can also try to run it without "-L".

b) switch from "midas" history (.hst files) to "FILE" history (mh*.dat files), the 
"FILE" history code is newer and the file format is more robust, with luck it may 
survive whatever trouble is happening in your experiment. This is controlled in ODB 
/Logger/History/XXX/Active (set to "y/n").

c) the output of "mlogger -v" may give us some clue, it usually complains if something 
is not right with definitions of history data.

K.O.
Entry  25 Feb 2021, Lars Martin, Forum, TMFePollHandlerInterface timing 
Am I right in thinking that the TMFE HandlePoll function is calle once per 
PollMidas()? And what is the difference to HandleRead()?
    Reply  25 Feb 2021, Konstantin Olchanski, Forum, TMFePollHandlerInterface timing 
> Am I right in thinking that the TMFE HandlePoll function is calle once per 
> PollMidas()? And what is the difference to HandleRead()?

Actually, polled equipment is not implemented yet in TMFE, as you noted, the 
internal scheduler needs to be reworked.

Anyhow, I think with modern c++ and with threads, both "periodic" and "polled" 
equipments are not strictly necessary.

Periodic equipment is effectively this:

in a thread:
while (1) {
do stuff, read data, send events
sleep
}

Polled equipment is effectively this:

in a thread:
while (1) {
if (poll()) { read data, send events }
else { sleep for a little bit }
}

Example of such code is the "bulk" equipment in progs/fetest.cxx.

But to implement the same in a single threaded environment (eliminates
problems with data locking, race conditions, etc) and to provide additional
structure to the user code, the plan is to implement polled equipment in TMFE
frontends. (periodic equipment is already implemented).

K.O.
Entry  24 Feb 2021, Zaher Salman, Bug Report, history reload 
I have a history that is embedded in a custom page using

<div class="mjshistory" data-group="SampleCryo" data-panel="SampleTemp" data-scale="30m" style="'+size+' position: relative;left: 640px;top: -205px;"></div>

this works fine when I load the page but seems to cause a timeout when reloading (F5) the page. It used to work fine last year but since a midas update this year it does not work. 

When I manually stop the script when firefox reports that it is slowing down the browse I get the following in the console:

Script terminated by timeout at:
binarySearch@http://xxx.psi.ch:8081/mhistory.js:1051:11
MhistoryGraph.prototype.findMinMax@http://xxx.psi.ch:8081/mhistory.js:1583:28
MhistoryGraph.prototype.loadInitialData/<@http://xxx.psi.ch:8081/mhistory.js:780:15

any ideas what may be causing this?

thanks.

Another hint to the problem, the custom page is accessible via 
http://xxx.psi.ch:8081/?cmd=custom&page=SampleCryo&
once loaded the address changes to where A and B change values as time passes (I guess B-A=30m).
http://xxx.psi.ch:8081/?cmd=custom&page=SampleCryo&&A=1614173369&B=1614175169
    Reply  25 Feb 2021, Stefan Ritt, Bug Report, history reload 
I have to reproduce the problem. Can you please send me the full link by direct email. As you know, I'm also at PSI.

Stefan
Entry  12 Feb 2021, Konstantin Olchanski, Bug Report, mlogger history snafu 
there is a problem with mlogger between commits xxx (17 Nov 2020) and a762bb8 (12 feb 2021). because of 
confusion between seconds and milliseconds, FILE (mhf*.dat files) and SQL history are recording with 
incorrect timestamps.

- traditional MIDAS history (*.hst files) does not have this problem (because of a buglet)
- midas-2020-12 release does not have this problem (it has mlogger from midas-2020-08 release)

there are some additional changes in mlogger that we are sorting out, when ready, we will make a new 
release of midas.

K.O.
Entry  10 Feb 2021, Konstantin Olchanski, Release, midas-2020-12-a 
midas-2020-12-a is here, see https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12

notable change from previous midas releases:

Use of ODB "Common" by mfe.c frontends has changed. New preferred behaviour
is to have the values defined in the equipment structure in the source code
to always overwrite values in ODB /Equipment/Foo/Common, except for the value
of "Common/enabled" (equipment_common_overwrite set to TRUE).

All mfe.c frontends will need to be modified for this change:

- for old behaviour (use ODB "Common"), add: BOOL equipment_common_overwrite = false;
- for new behaviour (use equipment values in the source code), add: BOOL equipment_common_overwrite = true;

The TMFE C++ frontend does not implement this change yet, it uses all "Common" values from ODB
and there is no way to overwrite things like the MIDAS event buffer name from C++ code. This may
change with the next version.

notable updates since midas-2020-08:

- new ODB variable /Experiment/Enable sound can be used to globally prevent mhttpd from playing sounds.
- Lazylogger now supports writing data over SFTP.
- odbvalue elements on custom pages now support an onload() callback as well as onchange(). Most elements now also 
support a data-validate callback.
- custom pages can now tie a select drop-down box to an ODB value using modbselect.
- ability to choose whether the code or the current ODB values take precendence for the "Common" settings of an 
equipment when starting a frontend. See elog thread 2014 for more details, and the "Upgrade guide" below for 
instructions.
- minor improvements to mdump program - support for 64-bit data types and ability to load larger events if needed.
- minor improvements to History plots and Buffers webpage.
- bug fixes

plans for next development: major update of mlogger to simplify channel 
configuration in odb, improvements to mhttpd multithreading, new history plot 
configuration page, more c++ification.

To obtain this release, either checkout the top of branch release/midas-2020-12 
(recommended) or checkout the tag midas-2020-12-a.

K.O.
Entry  25 Jan 2021, Thomas Lindner, Suggestion, mhttpd browser caching 
I have a more subtle point about the new ODB key for using an external elog I mentioned in [1].  I was very confused after changing the ODB "External Elog" because mhttpd still wasn't using my external elog URL.  I started trying to debug mhttpd.cxx, but found a lot of bits of mhttpd didn't seem to be getting called.  I eventually realized that my browser had been caching the responses for some (though not all) of the MIDAS navigation buttons.  Clearing my browser cache fixed the problem and allowed me to use the MIDAS button to the external ELOG.  This caching happens on my macbook for both Firefox 84.0.2 and Safari 13.1.

Many of the requests to mhttpd end up going to send_fp(), where we explicitly set the cache time to 24 hours.

   // send HTTP cache control headers                                                                                               
   time_t now = time(NULL);
   now += (int) (3600 * 24);
   struct tm* gmt = gmtime(&now);
   const char* format = "%A, %d-%b-%y %H:%M:%S GMT";
   char str[256];
   strftime(str, sizeof(str), format, gmt);
   r->rsprintf("Expires: %s\r\n", str);

Some other MIDAS buttons don't seem to be cached by the browser; for instance the response for the 'OldHistory' button doesn't get cached.

Should we remove the cache instruction for at least some of the buttons?  At least for the elog button where we want the link direction to get switched by an ODB key the caching seems a bad idea.

[1] https://midas.triumf.ca/elog/Midas/2078
    Reply  25 Jan 2021, Stefan Ritt, Suggestion, mhttpd browser caching 
Let me first explain a bit why caching is there. Once we had the case that someone from 
TRIUMF opened a midas custom page at T2K. It took about one minute (!) to load the page. 

When we looked at it, we found that the custom page pulled about 100 items with individual
HTTP requests from Japan, each taking about one second for the roundtrip. Then we redesigned
the custom page communication so that many ODB entries could be retrieved in one operation,
which improved the loading time from 100s to about 2s.

With the buttons we will have to make the same compromise. If we do not cache anything,
loading the midas status page over the Pacific takes many seconds. If we cache all, any
change on the midas side will not be reflected on the web page. So there is a compromise
to be made. I thought I designed it such that the side menu is cached locally, but when
the user presses "reload", then the full menu is fetched from the server. Of course one
has to remember this, so changing the ELOG URL or other things on the menu require a
reload (or wait a certain time for the cache to expire). So try again if that's working
for you. If not, I can visit it again and check if there is any bug.

If we go the route to disable the cache, better try this to T2K and see what you get before
we commit ourselves to that. Last time TRIUMF people were complaining a lot about long
load times.

Best,
Stefan
       Reply  25 Jan 2021, Thomas Lindner, Suggestion, mhttpd browser caching 
I tried reloading the pages.  If I reloaded the actual elog page 

https://server.triumf.ca/?cmd=Elog

then it bypassed the cache and got the correct updated page from mhttpd.

However, if when I reloaded the status page

https://server.triumf.ca/?cmd=Status

and then clicked the Elog button then I just got the cached (old) page.  Admittedly reloading the status page doesn't make so much sense (once I thought about it), but it is what I tried first (I'm good at modelling unexpected user behaviour); so there is some risk that the user will try reloading the wrong page and will be stuck not getting the external elog page (until 24 hours runs out).

Anyway, I will update the documentation to note that you need to reload the elog page after changing this variable.  That's probably an adequate solution.

I certainly don't suggest getting rid of caching entirely.  I was trying to think whether there was a set of pages where it would make sense to disable the cache (like the elog page).  But maybe that will just cause more problems.


> Let me first explain a bit why caching is there. Once we had the case that someone from 
> TRIUMF opened a midas custom page at T2K. It took about one minute (!) to load the page. 
> 
> When we looked at it, we found that the custom page pulled about 100 items with individual
> HTTP requests from Japan, each taking about one second for the roundtrip. Then we redesigned
> the custom page communication so that many ODB entries could be retrieved in one operation,
> which improved the loading time from 100s to about 2s.
> 
> With the buttons we will have to make the same compromise. If we do not cache anything,
> loading the midas status page over the Pacific takes many seconds. If we cache all, any
> change on the midas side will not be reflected on the web page. So there is a compromise
> to be made. I thought I designed it such that the side menu is cached locally, but when
> the user presses "reload", then the full menu is fetched from the server. Of course one
> has to remember this, so changing the ELOG URL or other things on the menu require a
> reload (or wait a certain time for the cache to expire). So try again if that's working
> for you. If not, I can visit it again and check if there is any bug.
> 
> If we go the route to disable the cache, better try this to T2K and see what you get before
> we commit ourselves to that. Last time TRIUMF people were complaining a lot about long
> load times.
> 
> Best,
> Stefan
    Reply  08 Feb 2021, Konstantin Olchanski, Suggestion, mhttpd browser caching 
>    r->rsprintf("Expires: %s\r\n", str);

The best I can tell, none of this works in current browsers. with google-chrome,
I see it cache pretty much everything regardless of "expires", "no cache", etc
and anything else I tried.

Things like shift-<reload>, etc used to work to refresh the cache, but not any more.

So, I too, see confusing side-effects of caching, where I change something in ODB,
but "nothing happens". Then I scratch my head for 30 minutes until I remember
to open the javascript debugger where shift-<reload> (or is it ctrl-<reload>) actually works.

It seems that the only reliable way to bypass the browser cache is to add
a tag with a random number to the URL ("&ts=currenttime").

This is for HTTP GET requests. HTTP POST does not seem to be cached, so I do not worry
about this nonsense for json-rpc requests.

Perhaps we should do this random number trick for all user actions. User can
press buttons only so fast, we should be able to sustain the rate. Anything
loaded automatically or from a timer, we should allow caching.

BTW, things like midas.js are also cached, and it is common to see problems
after updating midas, where status.html is newly loaded, but midas.js is an old
stale version from cache.

Messy.

K.O.
       Reply  08 Feb 2021, Stefan Ritt, Suggestion, mhttpd browser caching 
> It seems that the only reliable way to bypass the browser cache is to add
> a tag with a random number to the URL ("&ts=currenttime").

Indeed that's the only reliable way to avoid caching across browsers. An alternative is

("&r=" + Math.random())

to add a random number.


> BTW, things like midas.js are also cached, and it is common to see problems
> after updating midas, where status.html is newly loaded, but midas.js is an old
> stale version from cache.

Reloading JavaScript file NOT from the cache is really tricky these days. I added a
special Google Chrome extension to clear my browser cache, which works reliably:

https://chrome.google.com/webstore/detail/clear-cache/cppjkneekbjaeellbfkmgnhonkkjfpdn

Stefan
Entry  13 Jan 2021, Isaac Labrie Boulay, Forum, poll_event() is very slow. 
Hi all,

I'm currently trying to see if I can speed up polling in a frontend I'm testing. 
Currently it seems like I can't get 'lam's to happen faster than 120 times/second. 
There must be a way to make this faster. From what I understand, changing the poll 
time (500ms by default) won't affect the frequency of polling just the 'lam' 
period.

Any suggestions?

Thanks for your help!

Isaac

Hi,

What is the actual readout time, event size?
Do you have multiple equipment and of what type if any?

PAA
    Reply  13 Jan 2021, Konstantin Olchanski, Forum, poll_event() is very slow. 
> 
> I'm currently trying to see if I can speed up polling in a frontend I'm testing. 
> Currently it seems like I can't get 'lam's to happen faster than 120 times/second. 
> There must be a way to make this faster. From what I understand, changing the poll 
> time (500ms by default) won't affect the frequency of polling just the 'lam' 
> period.
> 
> Any suggestions?
> 

You could switch from the traditional midas mfe.c frontend to the C++ TMFE frontend,
where all this "lam" and "poll" business is removed.

At the moment, there are two example programs using the C++ TMFE frontend,
single threaded (progs/fetest_tmfe.cxx) and multithreaed (progs/fetest_tmfe_thread.cxx).

K.O.
       Reply  15 Jan 2021, Isaac Labrie Boulay, Forum, poll_event() is very slow. 
> > 
> > I'm currently trying to see if I can speed up polling in a frontend I'm testing. 
> > Currently it seems like I can't get 'lam's to happen faster than 120 times/second. 
> > There must be a way to make this faster. From what I understand, changing the poll 
> > time (500ms by default) won't affect the frequency of polling just the 'lam' 
> > period.
> > 
> > Any suggestions?
> > 
> 
> You could switch from the traditional midas mfe.c frontend to the C++ TMFE frontend,
> where all this "lam" and "poll" business is removed.
> 
> At the moment, there are two example programs using the C++ TMFE frontend,
> single threaded (progs/fetest_tmfe.cxx) and multithreaed (progs/fetest_tmfe_thread.cxx).
> 
> K.O.

Ok. I did not know that there was a C++ OOD frontend example in MIDAS. I'll take a look at 
it. Is there any documentation on it works?

Thanks for the support!

Isaac
    Reply  13 Jan 2021, Stefan Ritt, Forum, poll_event() is very slow. 
Something must be wrong on your side. If you take the example frontend under

midas/examples/experiment/frontend.cxx

and let it run to produce dummy events, you get about 90 Hz. This is because we have a

  ss_sleep(10);

in the read_trigger_event() routine to throttle things down. If you remove that sleep, 
you get an event rate of about 500'000 Hz. So the framework is really quick.

Probably your routine which looks for a 'lam' takes really long and should be fixed.

Stefan
       Reply  14 Jan 2021, Pintaudi Giorgio, Forum, poll_event() is very slow. 
> Something must be wrong on your side. If you take the example frontend under
> 
> midas/examples/experiment/frontend.cxx
> 
> and let it run to produce dummy events, you get about 90 Hz. This is because we have a
> 
>   ss_sleep(10);
> 
> in the read_trigger_event() routine to throttle things down. If you remove that sleep, 
> you get an event rate of about 500'000 Hz. So the framework is really quick.
> 
> Probably your routine which looks for a 'lam' takes really long and should be fixed.
> 
> Stefan

Sorry if I am going off-topic but, because the ss_sleep function was mentioned here, I 
would like to take the chance and report an issue that I am having.

In all my slow control frontends, the CPU usage for each frontend is close to 100%. This 
means that each frontend is monopolizing a single core. When I did some profiling, I 
noticed that 99% of the time is spent inside the ss_sleep function. Now, I would expect 
that the ss_sleep function should not require any CPU usage at all or very little.

So my two questions are:
Is this a bug or a feature?
Would you able to check/reproduce this behavior or do you need additional info from my 
side?
       Reply  14 Jan 2021, Isaac Labrie Boulay, Forum, poll_event() is very slow. 
> Something must be wrong on your side. If you take the example frontend under
> 
> midas/examples/experiment/frontend.cxx
> 
> and let it run to produce dummy events, you get about 90 Hz. This is because we have a
> 
>   ss_sleep(10);
> 
> in the read_trigger_event() routine to throttle things down. If you remove that sleep, 
> you get an event rate of about 500'000 Hz. So the framework is really quick.
> 
> Probably your routine which looks for a 'lam' takes really long and should be fixed.
> 
> Stefan

Hi Stefan,

I should mention that I was using midas/examples/Triumf/c++/fevme.cxx. I was trying to see 
the max speed so I had the 'lam' always = 1 with nothing else to add overhead in the 
poll_event(). I was getting <200 Hz. I am assuming that this is a bug. There is no 
ss_sleep() in that function.

Thanks for your quick response!

Isaac
          Reply  08 Feb 2021, Konstantin Olchanski, Forum, poll_event() is very slow. 
> I should mention that I was using midas/examples/Triumf/c++/fevme.cxx

this is correct, the fevme frontend is written to do 100% CPU-busy polling.

there is several reasons for this:
- on our VME processors, we have 2 core CPUs, 1st core can poll the VME bus, 2nd core can run 
mfe.c and the ethernet transmitter.
- interrupts are expensive to use (in latency and in cpu use) because kernel handler has to call 
use handler, return back etc
- sub-millisecond sleep used to be expensive and unreliable (on 1-2GHz "core 1" and "core 2" 
CPUs running SL6 and SL7 era linux). As I understand, current linux and current 3+GHz CPUs can 
do reliable microsecond sleep.

K.O.
Entry  21 Jan 2021, Thomas Lindner, Info, Using external ELOG with newer mhttpd 
A warning, in case others have the same problem I had.

In the past you could configure mhttpd so that the 'Elog' button would redirect to an external ELOG server; to do this you only needed to create and set the ODB variable '/Elog/URL' to the URL of your external ELOG server.

But with the newer MIDAS you need to set two ODB variables:

* "/Elog/URL" needs to be set to the URL of the external ELOG.
* "/Elog/External Elog" needs to be set to 'y'

I hadn't noticed this and was confused why my Elog button wasn't working after upgrading MIDAS.

MIDAS documentation was updated to reflect this change:

https://midas.triumf.ca/MidasWiki/index.php/Electronic_Logbook_(ELOG)
Entry  09 Dec 2020, Frederik Wauters, Forum, history and variables confusion 
I have a fe, with 2 "equipments" (2 different types of LV supplies).

Equipment/../Setting has a "Names" key, with the actual channel names (ch1, ch2, ...) of the devices.

Equipment/../Variables has channel states, voltage, etc. 

I also write a separate midas bank for each supply.

When I turn the "Log History" on, 2 things happen which cause troubles:
1. It writes the data of both bank to both the /Equipment/(Device1/Device2)/Variables .
2. I have e.g. 4 channels. In the banks I write current and voltages, so 8 numbers. When I turn on the logger I get an "Array size mismatch" between names and the midas bank size.

The only way around this is to disable the history logging in the equipment, and set "virtual" History events?
    Reply  09 Dec 2020, Stefan Ritt, Forum, history and variables confusion 
First, the writing of banks is completely independent of the history system. Banks go to the log file only, 
while the history is only linked to the "Variables" section in the ODB.

Second, it's advisable to group similar equipment into one. Like if you have five power supplies powering
and experiment, you don't want to have five equipments Supply1, Supply2, ..., but only one equipment
"Power Supplies". In the frontend belonging to that equipment, you define a DEVICE_DRIVER list with
one entry for each power supply. If you interact with an mscb device, there are some helper functions
which simplify the definition of the equipment and which I can send you privately. So your device
driver looks a bit like the one attached.

If you cannot do that and absolutely want separate equipments, please post a complete ODB subtree of your
settings, and I can try to reproduce your problem.

Stefan

======================

DEVICE_DRIVER power_driver[] = {
   {"Power Supply 1", mscbdev, 0, NULL, DF_INPUT | DF_MULTITHREAD},
   {"Power Supply 2", mscbdev, 0, NULL, DF_INPUT | DF_MULTITHREAD},
   {"Power Supply 3", mscbdev, 0, NULL, DF_INPUT | DF_MULTITHREAD},
   {""}
};

...

INT frontend_init()
{
   mscb_define("mscbxxx.psi.ch", "Power Supplies", "Power Supply 1", power_driver, 1, 0, "Output 1", 0.1);
   mscb_define("mscbxxx.psi.ch", "Power Supplies", "Power Supply 1", power_driver, 1, 1, "Output 2", 0.1);
   ...
}

/*-- Function to define MSCB variables in a convenient way ---------*/

void mscb_define(const char *submaster, const char *equipment, const char *devname, 
                 DEVICE_DRIVER *driver, int address, unsigned char var_index, 
                 const char *name, double threshold)
{
   int i, dev_index, chn_index, chn_total;
   char str[256];
   float f_threshold;
   HNDLE hDB;

   cm_get_experiment_database(&hDB, NULL);

   if (submaster && submaster[0]) {
      sprintf(str, "/Equipment/%s/Settings/Devices/%s/Device", equipment, devname);
      db_set_value(hDB, 0, str, submaster, 32, 1, TID_STRING);
      sprintf(str, "/Equipment/%s/Settings/Devices/%s/Pwd", equipment, devname);
      db_set_value(hDB, 0, str, "meg", 32, 1, TID_STRING);
   }

   /* find device in device driver */
   for (dev_index=0 ; driver[dev_index].name[0] ; dev_index++)
      if (equal_ustring(driver[dev_index].name, devname))
         break;

   if (!driver[dev_index].name[0]) {
      cm_msg(MERROR, "mscb_define", "Device \"%s\" not present in device driver list", devname);
      return;
   }

   /* count total number of channels */
   for (i=chn_total=0 ; i<=dev_index ; i++)
      if (((driver[dev_index].flags & DF_INPUT) > 0 && (driver[i].flags & DF_INPUT)) ||
          ((driver[dev_index].flags & DF_OUTPUT) > 0 && (driver[i].flags & DF_OUTPUT)))
      chn_total += driver[i].channels;

   chn_index = driver[dev_index].channels;
   sprintf(str, "/Equipment/%s/Settings/Devices/%s/MSCB Address", equipment, devname);
   db_set_value_index(hDB, 0, str, &address, sizeof(int), chn_index, TID_INT, TRUE);
   sprintf(str, "/Equipment/%s/Settings/Devices/%s/MSCB Index", equipment, devname);
   db_set_value_index(hDB, 0, str, &var_index, sizeof(char), chn_index, TID_BYTE, TRUE);

   if (threshold != -1 && (driver[dev_index].flags & DF_INPUT) > 0) {
     sprintf(str, "/Equipment/%s/Settings/Update Threshold", equipment);
     f_threshold = (float) threshold;
     db_set_value_index(hDB, 0, str, &f_threshold, sizeof(float), chn_total, TID_FLOAT, TRUE);
   }

   if (name && name[0]) {
      sprintf(str, "/Equipment/%s/Settings/Names %s", equipment, devname);
      db_set_value_index(hDB, 0, str, name, 32, chn_total, TID_STRING, TRUE);
   }

   /* increment number of channels for this driver */
   driver[dev_index].channels++;
}
       Reply  10 Dec 2020, Frederik Wauters, Forum, history and variables confusion genesys.odb
I wanted to have a c++ style driver, e.g. a instance of a "PowerSupply" class. This was not compatible with the list of DEVICE_DRIVER structs, with needs a C function entry point with variable arguments. 

Anyways, I attach my odb. I believe the issue stands regardless of the specific design choice here. Setting the History Log flag copies the banks created to the "Variables" of every equipments initialized, leading to a mismatch between the names array, and the variables. Can be solved by not using FE history events, but Virtual, but the flag in the Equipment is confusing.  

Bank creation in readout function:

for(const auto& d: drivers)
{
...
...
  bk_create(pevent,bk_name, TID_FLOAT, (void **)&pdata);
...			
  std::vector<float> voltage = d->GetVoltage();
  std::vector<float> current = d->GetCurrent();
  for(channels)
  {
    *pdata++ = voltage.at(iChannel);
    *pdata++ = current.at(iChannel);
  }
  bk_close(pevent, pdata);
}
          Reply  11 Dec 2020, Frederik Wauters, Forum, history and variables confusion 
1. ok, so calling the same readout functions from different equipments is just a bad idea, my bad, no blame for Midas to write data from both bank to both odb trees ...

2. One needs the same amount of bank entries as the size of settings/names[] . Otherwise the "History Log" flag does not work. So just don`t us "names" but "channel names" or something. 

" Second, it's advisable to group similar equipment into one. Like if you have five power supplies powering
and experiment, you don't want to have five equipments Supply1, Supply2, ..., but only one equipment
"Power Supplies".  "

It would be nice if this also works with c++ style drivers, i.e. a instance of a class. I don`t now how one would give an entry point to the "DEVICE_DRIVER" struct then.



> I wanted to have a c++ style driver, e.g. a instance of a "PowerSupply" class. This was not compatible with the list of DEVICE_DRIVER structs, with needs a C function entry point with variable arguments. 
> 
> Anyways, I attach my odb. I believe the issue stands regardless of the specific design choice here. Setting the History Log flag copies the banks created to the "Variables" of every equipments initialized, leading to a mismatch between the names array, and the variables. Can be solved by not using FE history events, but Virtual, but the flag in the Equipment is confusing.  
> 
> Bank creation in readout function:
> 
> for(const auto& d: drivers)
> {
> ...
> ...
>   bk_create(pevent,bk_name, TID_FLOAT, (void **)&pdata);
> ...			
>   std::vector<float> voltage = d->GetVoltage();
>   std::vector<float> current = d->GetCurrent();
>   for(channels)
>   {
>     *pdata++ = voltage.at(iChannel);
>     *pdata++ = current.at(iChannel);
>   }
>   bk_close(pevent, pdata);
> }
             Reply  15 Dec 2020, Konstantin Olchanski, Forum, history and variables confusion 
I think you are facing several problems:

a) mlogger does not clearly explain what history names will be used for which entries
in /eq/xxx/variables. "mlogger -v" almost does it, but we also need
"mlogger -v -n" to "show what you will do, but do not do it yet".

b) the mfe.c and the device class driver structure is very dated, tries to "do c++ in c". If it works for you,
certainly use it, but if it confuses you (as it confuses me), it probably only takes a few lines of c++
to replace the whole thing (minus the actual device drivers, which is the meat of it).

It think today, you face a choice:
- invest some time to understand the old device driver framework
- invest some time to do it "by hand" in c++, write your own device drivers (or use 3rd party drivers or snarf the "c" drivers from midas). Use the TMFE C++ frontend class if you go this route.

I would estimate that both choices are about the same amount of work.

K.O.


> 1. ok, so calling the same readout functions from different equipments is just a bad idea, my bad, no blame for Midas to write data from both bank to both odb trees ...
> 
> 2. One needs the same amount of bank entries as the size of settings/names[] . Otherwise the "History Log" flag does not work. So just don`t us "names" but "channel names" or something. 
> 
> " Second, it's advisable to group similar equipment into one. Like if you have five power supplies powering
> and experiment, you don't want to have five equipments Supply1, Supply2, ..., but only one equipment
> "Power Supplies".  "
> 
> It would be nice if this also works with c++ style drivers, i.e. a instance of a class. I don`t now how one would give an entry point to the "DEVICE_DRIVER" struct then.
> 
> 
> 
> > I wanted to have a c++ style driver, e.g. a instance of a "PowerSupply" class. This was not compatible with the list of DEVICE_DRIVER structs, with needs a C function entry point with variable arguments. 
> > 
> > Anyways, I attach my odb. I believe the issue stands regardless of the specific design choice here. Setting the History Log flag copies the banks created to the "Variables" of every equipments initialized, leading to a mismatch between the names array, and the variables. Can be solved by not using FE history events, but Virtual, but the flag in the Equipment is confusing.  
> > 
> > Bank creation in readout function:
> > 
> > for(const auto& d: drivers)
> > {
> > ...
> > ...
> >   bk_create(pevent,bk_name, TID_FLOAT, (void **)&pdata);
> > ...			
> >   std::vector<float> voltage = d->GetVoltage();
> >   std::vector<float> current = d->GetCurrent();
> >   for(channels)
> >   {
> >     *pdata++ = voltage.at(iChannel);
> >     *pdata++ = current.at(iChannel);
> >   }
> >   bk_close(pevent, pdata);
> > }
                Reply  08 Jan 2021, Stefan Ritt, Forum, history and variables confusion 
We kind of agreed to rewrite the slow control system in C++. Each device will have its own driver derived from a common base class implementing the general communication. The reason we need a "system" and not only a "hand-written" driver is because we want:

- glue many device drivers together for a single equipment
- have a dedicated readout thread for every device, in order not to block other devices
- have a common error reporting scheme working with several threads
- being able to disable/enable individual devices without changing the history system each time
- having a common naming scheme for all devices (like "enforce" /Equipment/<name>/Settings/Names xxx) which is needed by the history system
- ...

Will see when we have time for that.

Stefan
Entry  06 Jan 2021, Isaac Labrie Boulay, Info, Recovering a corrupted ODB using odbinit. 
Hi all,

I am currently trying to recover my corrupted ODB using odbinit and I am still 
getting issues after doing 'odbinit --cleanup' and trying to reload the saved 
ODB (last.json). Here is the output:


************************************************
(odbinit cleanup) Note* the ERROR in system.cxx
************************************************


[caendaq@cu332 ANIS]$ odbinit --cleanup
Checking environment... experiment name is "ANIS", remote hostname is ""
Checking command line... experiment "ANIS", cleanup 1, dry_run 0, create_exptab                                
0, create_env 0
Checking MIDASSYS....../home/caendaq/packages/midas
Checking exptab... experiments defined in exptab file "/home/caendaq/ANIS/exptab                               
":
0: "ANIS" <-- selected experiment
 

Checking exptab... selected experiment "ANIS", experiment directory "/home/caend                               
aq/ANIS/"
 

Checking experiment directory "/home/caendaq/ANIS/"
Found existing ODB save file: "/home/caendaq/ANIS/.ODB.SHM"
 

Checking shared memory...
Deleting old ODB shared memory...
[system.cxx:1052:ss_shm_delete,ERROR] shm_unlink(/1001_ANIS_ODB__home_caendaq_AN                               
IS_) errno 2 (No such file or directory)
Good: no ODB shared memory
Deleting old ODB semaphore...
Deleting old ODB semaphore... create status 1, delete status 1
Preserving old ODB save file /home/caendaq/ANIS/.ODB.SHM" to "/home/caendaq/ANIS                               
/.ODB.SHM.1609951022"
 

Checking ODB size...
Requested ODB size is 0 bytes (0.00B)
ODB size file is "/home/caendaq/ANIS//.ODB_SIZE.TXT"
Saved ODB size from "/home/caendaq/ANIS//.ODB_SIZE.TXT" is 1048576 bytes (1.05MB                               
)
We will initialize ODB for experiment "ANIS" on host "" with size 1048576 bytes                                
(1.05MB)

 
Creating ODB...
Creating ODB... db_open_database() status 302
Saving ODB...
Saving ODB... db_close_database() status 1
Connecting to experiment...
 

Connected to ODB for experiment "ANIS" on host "" with size 1048576 bytes (1.05M                               
B)
Checking experiment name... status 1, found "ANIS"
Disconnecting from experiment...
 

Done
 

****************************************
(Loading the last copy of my ODB)
*************************************



[caendaq@cu332 data]$ odbedit
[local:ANIS:S]/>load last.json
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
 


**********************************************
(Now trying to run my frontend and analyzer)
*********************************************
 


[caendaq@cu332 ANIS]$ ./start_daq.sh

mlogger: no process found
fevme: no process found
manalyzer.exe: no process found
manalyzer_example_cxx.exe: no process found
roody: no process found
[ODBEdit,ERROR] [midas.cxx:6616:bm_open_buffer,ERROR] Buffer "SYSTEM" is 
corrupted, mismatch of buffer name in shared memory ""
 

11:38:30 [ODBEdit,ERROR] [midas.cxx:6616:bm_open_buffer,ERROR] Buffer "SYSTEM" 
is corrupted, mismatch of buffer name in shared memory ""
Becoming a daemon...
Becoming a daemon...
Please point your web browser to http://localhost:8081
To look at live histograms, run: roody -Hlocalhost
Or run: mozilla http://localhost:8081
[caendaq@cu332 ANIS]$ Frontend name          :     fevme
Event buffer size      :     1048576
User max event size    :     204800
User max frag. size    :     1048576
# of events per buffer :     5
 

Connect to experiment ANIS...

OK

[fevme,ERROR] [midas.cxx:6616:bm_open_buffer,ERROR] Buffer "SYSTEM" is 
corrupted, mismatch of buffer name in shared memory ""
[fevme,ERROR] [mfe.cxx:596:register_equipment,ERROR] Cannot open event buffer 
"SYSTEM" size 33554432, bm_open_buffer() status 219

Has anyone ever encountered these issues?

Thanks for your time.

Isaac
Entry  05 Jan 2021, Isaac Labrie Boulay, Bug Report, Logger: Disk nearly full. 
Hi all,

I've ran into a problem where my experiment gets interrupted with a message from 
the logger saying that my disk is nearly full. This does not make sense to me 
because I have deleted almost all the data files from my data directory. I'm 
guessing that somewhere the ODB perceives that the directory is full when in 
reality its not.

Here is the exact message:

[ODBEdit,INFO] Run #252 stopped

09:22:19 [Logger,TALK] disk nearly full, stopping the run

09:22:19 [Logger,ERROR] [mlogger.cxx:4475:log_write,ERROR] Disk '/home/caendaq/A       
NIS/data/run00252.mid.lz4' is almost full: 81 MiBytes free out of 922497 MiBytes       
, stopping the run

Does any body have a solution for this? Thanks so much.

Isaac
    Reply  06 Jan 2021, Stefan Ritt, Bug Report, Logger: Disk nearly full. 
The logger simple requests the disk free space level from the operating system in the same 
way as the "df" command does. Can you do a "df" on your system? I have seen that some file 
systems free up space not immediately if you delete files, but some times later (like 24h).

Stefan
       Reply  06 Jan 2021, Isaac Labrie Boulay, Bug Report, Logger: Disk nearly full. 
> The logger simple requests the disk free space level from the operating system in the same 
> way as the "df" command does. Can you do a "df" on your system? I have seen that some file 
> systems free up space not immediately if you delete files, but some times later (like 24h).
> 
> Stefan

Thanks Stefan. Yes the files were still held open by some processes. It's solved now.

Cheers.

Isaac
Entry  17 Dec 2020, Amy Roberts, Suggestion, Improving variable functionality in Sequencer? 
We're using the sequencer to manage runs, and this typically looks something like:

1. save ODB keys to variables via ODBGET
2. set ODB keys to new values for a "pre-run" process
3. return ODB keys to values created in line 1
4. take data

The problem I'm running into is that the list of ODB keys to save is pretty 
unwieldy.  I'm wondering if there are sequencer features that exist or that I could 
request that might make this easier.

For example, having a way to list ODB keys, save ODB directories, and load ODB 
directories would be much more concise way for me to write my script.

Another option might be to have some version of the ODBSET wildcards for ODBGET.  
Although for this, setting the variable names might be tricky.  

In any case, even being able to ODBGET an array and set that to one variable name 
would be a big improvement.
    Reply  05 Jan 2021, Amy Roberts, Suggestion, Improving variable functionality in Sequencer? 
Hello, just wanted to re-ping on this question now that folks are starting to get back from 
the holidays.
       Reply  06 Jan 2021, Stefan Ritt, Suggestion, Improving variable functionality in Sequencer? laser.msl
I guess you use a wrong pattern here. There is no need to copy ODB values to local variables, 
then change them, then write them back. You can rather directly write values to the ODB. We run 
all our experiments in that way and we can do what we want. So most of our scripts have sections 
like

 ODBSUBDIR "/Equipment/Laser/Variables"
   ODBSET "Setting[*]", 0, 0
   ODBSET "Output[1]", 0, 0
   ODBSET "Output[2]", 1, 0
   ODBSET "Output[3]", 0, 0
   ODBSET "Output[4]", 1, 1
 ENDODBSUBDIR

Note that both the path and the indices can contain wild cards, making this pattern more 
flexible. Wildcards are however not (yet) supported for local variables, that's why we use 
directly the ODBSET directive.

I attach a larger example from the MEG experiment here for your reference.

Stefan
Entry  18 Dec 2020, Stefan Ritt, Suggestion, Code formatting .clang-formatcnaf_callback_llvm.cxxcnaf_callback_root.cxxcnaf_callback_gnu.cxxcnaf_callback_google.cxx
May I ask for your quick opinion on code formatting. MIDAS had a coding style 
which pretty much followed the ROOT coding style described at

https://root.cern/contribute/coding_conventions/

so we followed the "3 spaces indent" convention, braces according to Kernigham & 
Ritchie and a few other things. I see however that code written by different 
people still is formatted differently, like spaces before and after comparators 
etc. I wonder if it would make sense to keep a consistent code formatting through 
the whole midas repository.

Looking again at what the ROOT guys doe (see link above), they have a ClangFormat 
file, which I attached to this post. Putting this file into the root of midas 
ensures that all files are formatted in exactly the same way, which would increase 
readability largely.

The nice thing with ClangFormat is that can be integrated into my editor (Clion) as 
well as in emacs and vim:

https://clang.llvm.org/docs/ClangFormat.html

This would also make the emacs settings in our files obsolete:

/* emacs
 * Local Variables:
 * tab-width: 8
 * c-basic-offset: 3
 * indent-tabs-mode: nil
 * End:
 */

I don't like these because they are only for people using emacs. If everybody would 
put statements into the files with their favourite editor, all our source files 
would be cluttered quite a bit.

So the question is now how style to use? I attached different trials with a simple 
file from the distribution, so you can see the differences. They use the style from

- LLVM
- ROOT
- GNU
- Google

I consciously skipped the "Microsoft" style ;-)

Which one should we settle on? Any opinion? If I don't hear anything, I will pick a 
style at the end of this year 2020. I have a slight favour of the ROOT style, although 
I don't like that the "case" is not indented there under the opening brace of the 
switch statement which seems inconsistent to me. The only one doing that right is the 
Google format, but that one has an indentation of 2 chars instead our usual 3 chars. 
At the end of the day I think it's not so important on which style we agree, as long 
as we DO have a common style for all midas files.

Best,
Stefan
    Reply  04 Jan 2021, Stefan Ritt, Suggestion, Code formatting .clang-format
After pondering over the holidays, I decided to use the widely used LLVM code formatting, 
just adapted slightly for 3 spaces and "case" indentation in a "switch" statement. This 
formatting is now very close to our original one. Nevertheless, I did not reformat all 
existing code, since that would screw up the git repository, and you cannot see then anymore 
who wrote which line of code. But having the .clang-format file now in the midas root, all 
NEW files fill follow that standard. 

The CLion editor automatically picks up the .clang-format file if your enable ClangFomrat 
via Preferences -> Code Style -> General -> Enable ClangFormat.

EMACS can also use this file by adding following lines to your .emacs:

(load "<path-to-clang>/tools/clang-format/clang-format.el")
(global-set-key [C-M-tab] 'clang-format-region)

One problem left is if you check out midas on a new machine, you might not have there your 
personal .emacs file. If there is a way to ship a .emacs with midas, which gets 
automatically loaded, I would be happy to put this into the distribution.

Stefan
Entry  16 Dec 2020, Isaac Labrie Boulay, Forum, Issues building banks. 
Hi all,

I'm currently trying to build events through doing block transfers. The worry was 
that organizing and packaging bank data into an array would produce too much dead 
time causing too many missed events. Trying out that method, I'm running into all 
sorts of issues such as unaligned transfers where the QDC events are unaligned, or 
improperly aligned banks. Giving me a headache.

My question is, if I were to revert back to simple 32 bit read cycles and using 
the fevme.cxx template's method of organizing data before sending them to the 
buffer, what kind of deadtime should I expect? Am I wrong to assume that this 
would result in deadtime at all? I'm using a CAEN V792n 16 channel QDC and the hit 
frequency that I'm using to test is 20kHz.

Thanks.

Isaac
    Reply  16 Dec 2020, Konstantin Olchanski, Forum, Issues building banks. 
> I'm currently trying to build events through doing block transfers.

I am confused by your question. I assume you read a CAEN V792 ADC, but I do not know what VME master you 
use. The restrictions on data alignment come from the VME master.
I am mostly familiar with restrictions of UniverseII and tsi148 PCI-VME bridges.
I think there is no restriction for USB-VME bridges and similar.

Anyhow. Which block transfer do you use? 32-bit block transfer (BLT32)? 64-bit block transfer (MBLT64)? 
(no 128-bit 2eVME/2eSST transfers from the V792). Maybe the "simulated block transfer" (DMA engine uses 
single-word reads instead of block transfer)?

> The worry was that organizing and packaging bank data into an array would produce too much dead time 
causing too many missed events.

Valid concerns.

> I'm running into all sorts of issues such as unaligned transfers where the QDC events are unaligned, or 
improperly aligned banks.

You should not see any problems with unaligned transfers if you give the DMA engine
correct memory addresses as required by the hardware:

- always aligned to 32-bit (4 bytes, last two address bits set to 0)
- aligned to 64-bits for MBLT64 64-bit transfers, this would be the normal case for the V792 (8 bytes, 
last 3 address bits set to 0)
- aligned to 128-bits for 2eVME/2eSST transfers (16 bytes, last 4 bits of address are zero).

You also need to specify correct amount of data to read: number of bytes should be multiple of 4 for 32-
bit transfers, multiple to 8 for 64-bit transfers and multiple of 16 for 128-bit transfers (2eVME/2eSST).

Very often this requires reading "extra" data words. Most VME modules can generate extra pad words to 
align event length to DMA restrictions. Sometimes you need to
enable this in a control register (V792, V1190).

> Giving me a headache.

Me too. MIDAS recently introduced the QWORD 64-bit data type, banks of this type
should have correct alignment for 64-bit VME block transfers. But for 2eVME/2eSST
transfers, I still have to ensure alignment "by hand" (SIS3820, VF48, etc).

With QWORD banks, you need to use bk_init32a() instead of bk_init32().

> My question is, if I were to revert back to simple 32 bit read cycles

Yes, I always test with single-word reads first, with the 32-bit block transfer second and try the 64-bit 
block transfer last.

Sometimes there are unrelated problems (with the VME modules, VME bus, etc, or
with bugs in the frontend, etc) and this approach helps to identify the source
of trouble.

> and using 
> the fevme.cxx template's method of organizing data before sending them to the 
> buffer, what kind of deadtime should I expect? Am I wrong to assume that this 
> would result in deadtime at all? I'm using a CAEN V792n 16 channel QDC and the hit 
> frequency that I'm using to test is 20kHz.

Yes, with asynchronous read using 64-bit block transfer, 20 kHz should be achievable.

The old fevme frontend is based on the mfe.c framework and implementing
async readout requires special contortions. The structure of the new TMFE C++ frontend
class is supposed to make it easier, but I do not have an example TMFE based fevme yet.

P.S. Without using block transfer, your max rate is limited to:

16 channels, 1 word per channel, plus 1 header and 1 footer = 18 words (by luck, 64-bit aligned for 
correct BLT64 block read).

using VME single-word read at 1 us per transfer, 18 us per event = 55 kHz repetition rate.

(you do not say if you have any other VME modules you have to read)

K.O.
       Reply  16 Dec 2020, Isaac Labrie Boulay, Forum, Issues building banks. 
Thanks for the quick reply,

> > I'm currently trying to build events through doing block transfers.
> 
> I am confused by your question. I assume you read a CAEN V792 ADC, but I do not know what VME master you 
> use. The restrictions on data alignment come from the VME master.
> I am mostly familiar with restrictions of UniverseII and tsi148 PCI-VME bridges.
> I think there is no restriction for USB-VME bridges and similar.
> 
> Anyhow. Which block transfer do you use? 32-bit block transfer (BLT32)? 64-bit block transfer (MBLT64)? 
> (no 128-bit 2eVME/2eSST transfers from the V792). Maybe the "simulated block transfer" (DMA engine uses 
> single-word reads instead of block transfer)?

I read a single CAEN V792n QDC, 18 words, and a single CAEN V1190 TDC, 2 channels so 8 words. When I poll, I 
read on every poll_event() and read whatever data is in whatever module (TDC_dataready || QDC_dataready). The 
VME master that I'm using to talk to the modules is a CAEN V1718. I am trying to read data by BLT32. Sorry for 
the confusing question (Can you tell I'm an intern?).

> > The worry was that organizing and packaging bank data into an array would produce too much dead time 
> causing too many missed events.
> 
> Valid concerns.
> 
> > I'm running into all sorts of issues such as unaligned transfers where the QDC events are unaligned, or 
> improperly aligned banks.
> 
> You should not see any problems with unaligned transfers if you give the DMA engine
> correct memory addresses as required by the hardware:
> 
> - always aligned to 32-bit (4 bytes, last two address bits set to 0)
> - aligned to 64-bits for MBLT64 64-bit transfers, this would be the normal case for the V792 (8 bytes, 
> last 3 address bits set to 0)
> - aligned to 128-bits for 2eVME/2eSST transfers (16 bytes, last 4 bits of address are zero).
> 
> You also need to specify correct amount of data to read: number of bytes should be multiple of 4 for 32-
> bit transfers, multiple to 8 for 64-bit transfers and multiple of 16 for 128-bit transfers (2eVME/2eSST).

I am transferring 32-bit words. Transferring 32-bit words should always read multiples of 4 bytes so that's 
good.

> Very often this requires reading "extra" data words. Most VME modules can generate extra pad words to 
> align event length to DMA restrictions. Sometimes you need to
> enable this in a control register (V792, V1190).
> 
> > Giving me a headache.
> 
> Me too. MIDAS recently introduced the QWORD 64-bit data type, banks of this type
> should have correct alignment for 64-bit VME block transfers. But for 2eVME/2eSST
> transfers, I still have to ensure alignment "by hand" (SIS3820, VF48, etc).
> 
> With QWORD banks, you need to use bk_init32a() instead of bk_init32().
> 
> > My question is, if I were to revert back to simple 32 bit read cycles
> 
> Yes, I always test with single-word reads first, with the 32-bit block transfer second and try the 64-bit 
> block transfer last.
> 
> Sometimes there are unrelated problems (with the VME modules, VME bus, etc, or
> with bugs in the frontend, etc) and this approach helps to identify the source
> of trouble.
> 
> > and using 
> > the fevme.cxx template's method of organizing data before sending them to the 
> > buffer, what kind of deadtime should I expect? Am I wrong to assume that this 
> > would result in deadtime at all? I'm using a CAEN V792n 16 channel QDC and the hit 
> > frequency that I'm using to test is 20kHz.
> 
> Yes, with asynchronous read using 64-bit block transfer, 20 kHz should be achievable.
> 
> The old fevme frontend is based on the mfe.c framework and implementing
> async readout requires special contortions. The structure of the new TMFE C++ frontend
> class is supposed to make it easier, but I do not have an example TMFE based fevme yet.
> 
> P.S. Without using block transfer, your max rate is limited to:
> 
> 16 channels, 1 word per channel, plus 1 header and 1 footer = 18 words (by luck, 64-bit aligned for 
> correct BLT64 block read).
> 
> using VME single-word read at 1 us per transfer, 18 us per event = 55 kHz repetition rate.
> 
> (you do not say if you have any other VME modules you have to read)
> 

Okay so transferring 18 + 6 words should give me close to 40kHz repetition rate. That's good news. I will just 
stick to 1 word transfers.

The way that transfers are done in the fevme.cxx requires iterating through 16 word arrays a number of time (3 
times I believe if you include the iterations taking place in v792_EventRead()). Does that not pose a 
significant deadtime concern? 

> K.O.

Thanks again for taking the time to help me out!

Cheers.

Isaac
          Reply  16 Dec 2020, Konstantin Olchanski, Forum, Issues building banks. 
> > > I'm currently trying to build events through doing block transfers.
> > 
> > I am confused by your question. I assume you read a CAEN V792 ADC, but I do not know what VME master you 
> > use. The restrictions on data alignment come from the VME master.
> > I am mostly familiar with restrictions of UniverseII and tsi148 PCI-VME bridges.
> > I think there is no restriction for USB-VME bridges and similar.
> > 
> > Anyhow. Which block transfer do you use? 32-bit block transfer (BLT32)? 64-bit block transfer (MBLT64)? 
> > (no 128-bit 2eVME/2eSST transfers from the V792). Maybe the "simulated block transfer" (DMA engine uses 
> > single-word reads instead of block transfer)?
> 
> I read a single CAEN V792n QDC, 18 words, and a single CAEN V1190 TDC, 2 channels so 8 words. When I poll, I 
> read on every poll_event() and read whatever data is in whatever module (TDC_dataready || QDC_dataready). The 
> VME master that I'm using to talk to the modules is a CAEN V1718. I am trying to read data by BLT32. Sorry for 
> the confusing question (Can you tell I'm an intern?).
> 

Ok, I see. Using the normal mfe.c structure, you will not be able to read the VME modules
at maximum speed. This is because you must have two concurrent activities happening at the same time:

(1) tell the VME bridge to read data,
(2) package this data into midas banks and events and write it to the MIDAS event buffer.

If you do these tasks sequentially, obviously the VME bus will be idle during step (2),
and unless (2) takes 0 seconds (it does not) you will have a slow down.

So for maximum data rate, I prefer to have 3 threads:

thread 1: run the VME transfers, store data in circular buffer (today it would be std::deque<std::vector<char>>)
thread 2: encode the data into midas banks and midas events, store completed events in a circular buffer 
(std::deque<EVENT_HEADER*>).
thread 3: write data to midas event buffer (call bm_send_event(), etc)

This is very hard to do using the mfe.c frontend. (the main reason I wrote the TMFE C++ frontend class).

>
> Okay so transferring 18 + 6 words should give me close to 40kHz repetition rate. That's good news. I will just 
> stick to 1 word transfers.
>

I do not know the timing of CAEN V1718 single-word transfers. It may be significantly longer than 1 us:

V7865: DWORD read - CPU - PCI bus - tsi148 - VME
V1718: encode request as USB packet - CPU - PCI bus - USB hub - USB bus - USB asic - FPGA - VME (on the way back, 
"extract data from USB packet")

> 
> The way that transfers are done in the fevme.cxx requires iterating through 16 word arrays a number of time (3 
> times I believe if you include the iterations taking place in v792_EventRead()). Does that not pose a 
> significant deadtime concern? 
> 

Hmm... I am not sure what fevme you refer to. I guess I can find version of fevme.cxx where data is read at
maximum VME speed if you want it.

K.O.
             Reply  16 Dec 2020, Isaac Labrie Boulay, Forum, Issues building banks. 
> > > > I'm currently trying to build events through doing block transfers.
> > > 
> > > I am confused by your question. I assume you read a CAEN V792 ADC, but I do not know what VME master you 
> > > use. The restrictions on data alignment come from the VME master.
> > > I am mostly familiar with restrictions of UniverseII and tsi148 PCI-VME bridges.
> > > I think there is no restriction for USB-VME bridges and similar.
> > > 
> > > Anyhow. Which block transfer do you use? 32-bit block transfer (BLT32)? 64-bit block transfer (MBLT64)? 
> > > (no 128-bit 2eVME/2eSST transfers from the V792). Maybe the "simulated block transfer" (DMA engine uses 
> > > single-word reads instead of block transfer)?
> > 
> > I read a single CAEN V792n QDC, 18 words, and a single CAEN V1190 TDC, 2 channels so 8 words. When I poll, I 
> > read on every poll_event() and read whatever data is in whatever module (TDC_dataready || QDC_dataready). The 
> > VME master that I'm using to talk to the modules is a CAEN V1718. I am trying to read data by BLT32. Sorry for 
> > the confusing question (Can you tell I'm an intern?).
> > 
> 
> Ok, I see. Using the normal mfe.c structure, you will not be able to read the VME modules
> at maximum speed. This is because you must have two concurrent activities happening at the same time:
> 

I am using the mfe.cxx backend thread, I'm guessing that this is the file you are referring to.

> (1) tell the VME bridge to read data,
> (2) package this data into midas banks and events and write it to the MIDAS event buffer.
> 
> If you do these tasks sequentially, obviously the VME bus will be idle during step (2),
> and unless (2) takes 0 seconds (it does not) you will have a slow down.
> 

I see.


> So for maximum data rate, I prefer to have 3 threads:
> 
> thread 1: run the VME transfers, store data in circular buffer (today it would be std::deque<std::vector<char>>)
> thread 2: encode the data into midas banks and midas events, store completed events in a circular buffer 
> (std::deque<EVENT_HEADER*>).
> thread 3: write data to midas event buffer (call bm_send_event(), etc)
> 
> This is very hard to do using the mfe.c frontend. (the main reason I wrote the TMFE C++ frontend class).

Yes it seems like a bit of work

> >
> > Okay so transferring 18 + 6 words should give me close to 40kHz repetition rate. That's good news. I will just 
> > stick to 1 word transfers.
> >
> 
> I do not know the timing of CAEN V1718 single-word transfers. It may be significantly longer than 1 us:
> 
> V7865: DWORD read - CPU - PCI bus - tsi148 - VME
> V1718: encode request as USB packet - CPU - PCI bus - USB hub - USB bus - USB asic - FPGA - VME (on the way back, 
> "extract data from USB packet")

I found the following information in the CAEN V1718 manual:

"Transfer Rate = ~30MByte/s. Transfer rate supported in MBLT read cycles (block size = 32 kb), using a PC host with 
Windows XP or Linux and High Speed USB"

I'm guessing the sentence simply means that the rate increases with multiplexed block transfers. If the transfer rate 
is 30MBytes/s I should be able to write words at a transfer rate of 7500000 words per second.

> 
> > 
> > The way that transfers are done in the fevme.cxx requires iterating through 16 word arrays a number of time (3 
> > times I believe if you include the iterations taking place in v792_EventRead()). Does that not pose a 
> > significant deadtime concern? 
> > 
> 
> Hmm... I am not sure what fevme you refer to. I guess I can find version of fevme.cxx where data is read at
> maximum VME speed if you want it.

This is the VME C++ frontend example in the directory /midas/examples/Triumf/c++/

If you can find a faster version of this code I would definitely like to check it out!

> 
> K.O.


Thanks again.

Isaac
             Reply  16 Dec 2020, Stefan Ritt, Forum, Issues building banks. 
> This is very hard to do using the mfe.c frontend. (the main reason I wrote the TMFE C++ frontend class).

Actually that's not true. Just look at 

midas/examples/mtfe/mtfe.c

this is an example for a frontend with equipment with the EQ_USER flag, which allows you easily to run a separate 
thread (or more) for event collection and processing. Of course all old-fashioned C style (code is from 2007) but it 
works.

Stefan
                Reply  16 Dec 2020, Isaac Labrie Boulay, Forum, Issues building banks. 
> > This is very hard to do using the mfe.c frontend. (the main reason I wrote the TMFE C++ frontend class).
> 
> Actually that's not true. Just look at 
> 
> midas/examples/mtfe/mtfe.c
> 
> this is an example for a frontend with equipment with the EQ_USER flag, which allows you easily to run a separate 
> thread (or more) for event collection and processing. Of course all old-fashioned C style (code is from 2007) but it 
> works.
> 
> Stefan

Thank you sir I'll give it a look.

Cheers

Isaac
Entry  24 Nov 2020, Isaac Labrie Boulay, Forum, Invalid name "Analyzer/Tests" 
Hi everyone,

I've recently took the analyzer template from $MIDASSYS/examples/experiment and 
modified it to be able to use Roody on a very simple frontend setup. The 
analyzer works fine and I am able to view the online histograms but my console 
prints out this error:

[Analyzer,ERROR] [odb.cxx:919:db_validate_name,ERROR] Invalid name 
"/Analyzer/Tests/Always true/Rate [Hz]" passed to db_create_key_wlocked: should 
not contain "["                      
[Analyzer,ERROR] [odb.cxx:919:db_validate_name,ERROR] Invalid name 
"/Analyzer/Tests/low_sum/Rate [Hz]" passed to db_create_key_wlocked: should not 
contain "["
[Analyzer,ERROR] [odb.cxx:919:db_validate_name,ERROR] Invalid name 
"/Analyzer/Tests/high_sum/Rate [Hz]" passed to db_create_key_wlocked: should not 
contain "["

The error keeps getting printed even after stopping the run.

I do have these 3 keys under Analyzer/Tests/ in my ODB but I do not know where 
they come from. Any suggestions on what the root of the issue is?

Thanks for the help!

Isaac
    Reply  27 Nov 2020, Konstantin Olchanski, Forum, Invalid name "Analyzer/Tests" 
> I've recently took the analyzer template from $MIDASSYS/examples/experiment and 
> modified it to be able to use Roody on a very simple frontend setup.

Hmm... the old midas analyzer framework is very old and I do not recommend
to use it for new experiments.

A newer analyzer system is ROOTANA and an even newer is the "m" analyzer (manalyzer). These
analyzers progressively introduce improved c++-style programming environments amongst other
improvements. If starting from scratch, I recommend that you use manalyzer (currently from the rootana
git repository).

> The analyzer works fine and I am able to view the online histograms but my console 
> prints out this error:
> 
> [Analyzer,ERROR] [odb.cxx:919:db_validate_name,ERROR] Invalid name 
> "/Analyzer/Tests/Always true/Rate [Hz]" passed to db_create_key_wlocked: should 
> not contain "["

The error says what it means. "[" is not a permitted character in odb names. It is used
by many odb functions to access array elements.

The midas analyzer example should be updated to change "[Hz]" to "(Hz)" or something similar.

K.O.
       Reply  27 Nov 2020, Konstantin Olchanski, Forum, Invalid name "Analyzer/Tests" 
https://bitbucket.org/tmidas/midas/issues/298/invalid-odb-names-in-example-midas
K.O.
          Reply  07 Dec 2020, Isaac Labrie Boulay, Forum, Invalid name "Analyzer/Tests" 
> https://bitbucket.org/tmidas/midas/issues/298/invalid-odb-names-in-example-midas
> K.O.

Hi K.O.

Ok I see, I will use the most up to date analyzer.

Thanks a ton for your help.

Isaac
Entry  24 Sep 2020, Gennaro Tortone, Forum, subrun  
Hi,

I was wondering if there is a "mechanism" to run an executable
file after each subrun is closed...

I need to convert .mid.lz4 subrun files to ROOT (TTree) files;

Thanks,
Gennaro
    Reply  01 Dec 2020, Stefan Ritt, Forum, subrun  
There is no "mechanism" foreseen to be executed after each subrun. But you could 
run a shell script after each run which loops over all subruns and converts them 
one after the other.

Stefan

> Hi,
> 
> I was wondering if there is a "mechanism" to run an executable
> file after each subrun is closed...
> 
> I need to convert .mid.lz4 subrun files to ROOT (TTree) files;
> 
> Thanks,
> Gennaro
       Reply  01 Dec 2020, Ben Smith, Forum, subrun  
We use the lazylogger for something similar to this. You can specify the path to a custom script, and it will be run for each midas file that gets written:
https://midas.triumf.ca/MidasWiki/index.php/Lazylogger#Using_a_script
This means that you don't have to wait until the end of the run to start processing.

If the ROOT conversion is going to be slow, but you have a batch system available, you could use the lazylogger script to submit a job to the batch system for each file.

> 
> > Hi,
> > 
> > I was wondering if there is a "mechanism" to run an executable
> > file after each subrun is closed...
> > 
> > I need to convert .mid.lz4 subrun files to ROOT (TTree) files;
> > 
> > Thanks,
> > Gennaro
Entry  30 Nov 2020, Konstantin Olchanski, Info, more wisdom from linux kernel people 
As you may know, I am a big fan of two software projects - the linux kernel and ROOT. The linux kernel is one of 
the few software projects "done right". ROOT is where normal people try to "get it right" with real-world level 
of success. I use both softwares daily and I try to apply their ways and methods to MIDAS as much as I can.

So just in time for our discussion of array indexes, a talk by gregkh shows
up on slashdot. The title is "how to keep your users happy". (Nobody
ever wants to be nasty to their users, but do read his talk).

https://git.sr.ht/~gregkh/presentation-application_summit/tree/main/keep_users_happy.pdf

The talk refers to some older stuff, still relevant, of course, in case you miss the links
in the pdf file, here they are:

https://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html
https://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html
https://ozlabs.org/~rusty/ols-2003-keynote/img0.html (click on "continue" to see next page)

K.O.
Entry  24 Nov 2020, Amy Roberts, Suggestion, ODBSET wildcards with array keys in Sequencer files 
I'm interested in using the matching feature for ODBSET explained on 
https://midas.triumf.ca/MidasWiki/index.php/Sequencer for settings that are in an 
array, like:

COMMENT "Ground the detectors"
ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[?]" 0

Currently I get an error when I try to run this script.  Is this expected?  Would it 
be possible to implement matching for array values?

Thanks!
    Reply  25 Nov 2020, Marco Francesconi, Suggestion, ODBSET wildcards with array keys in Sequencer files 
Hi,
I guess the issue is in the "[?]" part of the command, the indexing is handled differently from the odb path and does not 
support "?".
Are you trying to set only the first 9 channels?
Could you try with "[*]" or "[0-9]" instead?

Marco

> I'm interested in using the matching feature for ODBSET explained on 
> https://midas.triumf.ca/MidasWiki/index.php/Sequencer for settings that are in an 
> array, like:
> 
> COMMENT "Ground the detectors"
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[?]" 0
> 
> Currently I get an error when I try to run this script.  Is this expected?  Would it 
> be possible to implement matching for array values?
> 
> Thanks!
       Reply  25 Nov 2020, Amy Roberts, Suggestion, ODBSET wildcards with array keys in Sequencer files 
The following all fail with "Cannot find ODB key "<key>""

ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[*]" 0
ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[0-9]" 0
ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[1]" 0
ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)*" 0
ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)" 0


> Hi,
> I guess the issue is in the "[?]" part of the command, the indexing is handled differently from the odb path and does not 
> support "?".
> Are you trying to set only the first 9 channels?
> Could you try with "[*]" or "[0-9]" instead?
> 
> Marco
> 
> > I'm interested in using the matching feature for ODBSET explained on 
> > https://midas.triumf.ca/MidasWiki/index.php/Sequencer for settings that are in an 
> > array, like:
> > 
> > COMMENT "Ground the detectors"
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[?]" 0
> > 
> > Currently I get an error when I try to run this script.  Is this expected?  Would it 
> > be possible to implement matching for array values?
> > 
> > Thanks!
          Reply  25 Nov 2020, Marco Francesconi, Suggestion, ODBSET wildcards with array keys in Sequencer files 
I created some keys in my ODB to try to match yours.
The ODBSET commands you wrote are all working fine (of course with different results), except only for the "/Detectors/Det*/Settings/Charge/Bias (V)*" which I will have to 
look into.
In any case the error message I'm getting is "could not match ay key" and not the one you are reporting.

Now I'm a bit puzzled:
Are you sure your ODB contains those keys?
Are you testing the ODBSET inside a more complex sequencer or on its own?

Maybe I can try to reproduce it using your ODB setup.
Could you send an ODB dump of the "/Detectors" folder using the "save" command of odbedit ("cd /Detectors" and then "save detector.odb")?

Best,

Marco


> The following all fail with "Cannot find ODB key "<key>""
> 
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[*]" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[0-9]" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[1]" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)*" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)" 0
> 
> 
> > Hi,
> > I guess the issue is in the "[?]" part of the command, the indexing is handled differently from the odb path and does not 
> > support "?".
> > Are you trying to set only the first 9 channels?
> > Could you try with "[*]" or "[0-9]" instead?
> > 
> > Marco
> > 
> > > I'm interested in using the matching feature for ODBSET explained on 
> > > https://midas.triumf.ca/MidasWiki/index.php/Sequencer for settings that are in an 
> > > array, like:
> > > 
> > > COMMENT "Ground the detectors"
> > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[?]" 0
> > > 
> > > Currently I get an error when I try to run this script.  Is this expected?  Would it 
> > > be possible to implement matching for array values?
> > > 
> > > Thanks!
             Reply  25 Nov 2020, Amy Roberts, Suggestion, ODBSET wildcards with array keys in Sequencer files 
I think the issue may be the version of MIDAS I'm using.  Mine is current as of February 4, 2020.  

But since then there have been changes to the sequencer code, specifically parts that handle indexing.

I'll try this out with an updated version of MIDAS and report back if there are still any issues after updating.

> I created some keys in my ODB to try to match yours.
> The ODBSET commands you wrote are all working fine (of course with different results), except only for the "/Detectors/Det*/Settings/Charge/Bias (V)*" which I will have to 
> look into.
> In any case the error message I'm getting is "could not match ay key" and not the one you are reporting.
> 
> Now I'm a bit puzzled:
> Are you sure your ODB contains those keys?
> Are you testing the ODBSET inside a more complex sequencer or on its own?
> 
> Maybe I can try to reproduce it using your ODB setup.
> Could you send an ODB dump of the "/Detectors" folder using the "save" command of odbedit ("cd /Detectors" and then "save detector.odb")?
> 
> Best,
> 
> Marco
> 
> 
> > The following all fail with "Cannot find ODB key "<key>""
> > 
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[*]" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[0-9]" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[1]" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)*" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)" 0
> > 
> > 
> > > Hi,
> > > I guess the issue is in the "[?]" part of the command, the indexing is handled differently from the odb path and does not 
> > > support "?".
> > > Are you trying to set only the first 9 channels?
> > > Could you try with "[*]" or "[0-9]" instead?
> > > 
> > > Marco
> > > 
> > > > I'm interested in using the matching feature for ODBSET explained on 
> > > > https://midas.triumf.ca/MidasWiki/index.php/Sequencer for settings that are in an 
> > > > array, like:
> > > > 
> > > > COMMENT "Ground the detectors"
> > > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[?]" 0
> > > > 
> > > > Currently I get an error when I try to run this script.  Is this expected?  Would it 
> > > > be possible to implement matching for array values?
> > > > 
> > > > Thanks!
          Reply  27 Nov 2020, Konstantin Olchanski, Suggestion, ODBSET wildcards with array keys in Sequencer files 
> The following all fail with "Cannot find ODB key "<key>""
> 
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[*]" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[0-9]" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[1]" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)*" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)" 0
> 

It would be cool if ODB pattern matching in the sequencer
were consistent with the ODB pattern matching in the json-rpc
interface for web pages:

https://midas.triumf.ca/MidasWiki/index.php/Mjsonrpc#Supported_array_index_syntax

K.O.
             Reply  30 Nov 2020, Marco Francesconi, Suggestion, ODBSET wildcards with array keys in Sequencer files 
I totally agree that we should have a consistent formatting for array index expansion.
I had a look to the mjsonrpc code and I found the function parse_array_index_list(...) which does this job.
I have a similar function (adapted form previous code) in odb.cxx called strarrayindex(...) that is designed for the same "consistency" purposes between odbedit and sequencer.

Let me put few points that I noticed:
- mjsonrpc has a very different way to write the full array (no indexes given) while currently sequencer requires "[*]" to do the same (otherwise it only changes the first value of the array)
- currently the sequencer and the underlying ODB calls use two indexes (that are the same if you want to write only one key) so we will need a serious rewriting to allow something like "ODBSET array[1,3,5]"
- if I correctly understood the code, mjsonrpc instead generates a list of indices and then calls an ODB write on each of them. That's not always a good thing, for example if you are writing an array of n parameters on a DAQ 
board you will call the hotlink on that key n times
- in addition to that the sequencer will also have to cope with variable-based indexes like "ODBSET array[$val]", but then how it should parse something like "[$a,1]" or "[$a*]"?

For the very first point I do not see a clean way to do this without breaking the compatibility of existing sequencers or having a difference between the two implementations.
For the others I guess we can find a way out, however that's a major modification so I will put it on my todo list when I can find some free time.
In any case I would propose to merge the two functions, so we have only to maintain a single implementation of the parsing.

I guess it's a good moment to brainstorm about that, let me know what you think

Marco


> > The following all fail with "Cannot find ODB key "<key>""
> > 
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[*]" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[0-9]" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[1]" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)*" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)" 0
> > 
> 
> It would be cool if ODB pattern matching in the sequencer
> were consistent with the ODB pattern matching in the json-rpc
> interface for web pages:
> 
> https://midas.triumf.ca/MidasWiki/index.php/Mjsonrpc#Supported_array_index_syntax
> 
> K.O.
                Reply  30 Nov 2020, Konstantin Olchanski, Suggestion, ODBSET wildcards with array keys in Sequencer files 
> I totally agree that we should have a consistent formatting for array index expansion.
> I had a look to the mjsonrpc code and I found the function parse_array_index_list(...) which does this job.

Yes, it is good to review this stuff. I think the json-rpc call should accept more array index patterns:

a[*] - whole array (even though it is unnatural use in javascript, we do not say "let a[*] = b[*]", we say "let a = b".
a[5-] - from 5th element to the end (in the case we do not know the length)
a[-10] - from first element to element 10 (this is same as a[0-10], but needed for consistency with previous case).

K.O.

> I have a similar function (adapted form previous code) in odb.cxx called strarrayindex(...) that is designed for the same "consistency" purposes between odbedit and sequencer.
> 
> Let me put few points that I noticed:
> - mjsonrpc has a very different way to write the full array (no indexes given) while currently sequencer requires "[*]" to do the same (otherwise it only changes the first value of the array)
> - currently the sequencer and the underlying ODB calls use two indexes (that are the same if you want to write only one key) so we will need a serious rewriting to allow something like "ODBSET array[1,3,5]"
> - if I correctly understood the code, mjsonrpc instead generates a list of indices and then calls an ODB write on each of them. That's not always a good thing, for example if you are writing an array of n parameters on a DAQ 
> board you will call the hotlink on that key n times
> - in addition to that the sequencer will also have to cope with variable-based indexes like "ODBSET array[$val]", but then how it should parse something like "[$a,1]" or "[$a*]"?
> 
> For the very first point I do not see a clean way to do this without breaking the compatibility of existing sequencers or having a difference between the two implementations.
> For the others I guess we can find a way out, however that's a major modification so I will put it on my todo list when I can find some free time.
> In any case I would propose to merge the two functions, so we have only to maintain a single implementation of the parsing.
> 
> I guess it's a good moment to brainstorm about that, let me know what you think
> 
> Marco
> 
> 
> > > The following all fail with "Cannot find ODB key "<key>""
> > > 
> > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[*]" 0
> > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[0-9]" 0
> > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[1]" 0
> > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)*" 0
> > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)" 0
> > > 
> > 
> > It would be cool if ODB pattern matching in the sequencer
> > were consistent with the ODB pattern matching in the json-rpc
> > interface for web pages:
> > 
> > https://midas.triumf.ca/MidasWiki/index.php/Mjsonrpc#Supported_array_index_syntax
> > 
> > K.O.
             Reply  30 Nov 2020, Stefan Ritt, Suggestion, ODBSET wildcards with array keys in Sequencer files 
Hi Konstantin,

we are considering to make the range selection uniform among json, sequencer and 
odbedit "set" command. Having multiple ranges like [1,4-5] will be quite some work, so 
my question is did you just implement it on the json side because it was easy, or are 
there experiments who really need it? Wouldn't it be enough to have

[*]
[n]
[n-m]

This way we always have only one db_set_data() value behind that. Any set of indices 
we have to split into several db_set_data(), which especially for the front-end 
configuration can cause trouble by triggering a hot link on each access.

Stefan
                Reply  30 Nov 2020, Konstantin Olchanski, Suggestion, ODBSET wildcards with array keys in Sequencer files 
> 
> we are considering to make the range selection uniform among json, sequencer and 
> odbedit "set" command. Having multiple ranges like [1,4-5] will be quite some work, so 
> my question is did you just implement it on the json side because it was easy, or are 
> there experiments who really need it? Wouldn't it be enough to have
> 
> [*]
> [n]
> [n-m]
> 

It has been a long time, but most likely I designed the interface to work this
way to permit maximum flexibility for writing into an array using just one
rpc operation.

The generalized form is:
[range,range,range...]

where range is:
array index or
index1-index2 or
index2-index1 (write in reverse order)

This is all documented here:
https://midas.triumf.ca/MidasWiki/index.php/Mjsonrpc#Supported_array_index_syntax

I think it is too late to change it.

>
> This way we always have only one db_set_data() value behind that. Any set of indices 
> we have to split into several db_set_data()
> 

Sounds good. I think it is easy to have a common implementation:

One would need following functions:
parse range selection from string into std::vector<int> of array indices (we already have it)
call db_set_data_range() (this is easy to add).

>
> trouble by triggering a hot link on each access.
>

There is no escaping this trouble. Half the experiments want notification
"per array", the other half, "per array element". We cannot choose one or the other for them,
we have to provide a way for the user to say how they want it.

P.S. With existing ODB calls, you cannot do [n-m] ranges. You can do whole array or you
can do one-element-at-a-time.

K.O.
Entry  17 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB 
Today I addressed a topic which bugged me since long time. The ODB contains 
settings under /Equipment/<name>/Common which are a "mirror" of the equipment[] 
setting in a frontend (using the mfe.cxx framework). If the "Common" entry in 
the ODB is not present (fresh experiment), the equipment[] settings from the 
frontend are copied to the ODB. But if it exists, it takes precedence over the 
equipment[] entries, which is wrong in my opinion. Like if you change some 
settings in equipment[] (like the logging period of the history), then recompile 
and restart the frontend, the old values in the ODB are kept and your 
modification in the frontend code has no effect.

Starting on commit c3017c6c on Nov. 17th 2020 I reversed the precedence: Now, on 
each start of the frontend program, the values from equipment[] are written to 
the ODB. They are still "live". If one changes them when the frontend is 
running, that change takes effect immediately. But on the next restart of the 
frontend, the old values from equipment[] is put back there.

I fell too many times into this trap, and I hope the modification helps 
everybody. If there are however experiments which rely on the fact that the 
common settings in the ODB are NOT overwritten by the frontend, please let me 
know and I can put a flag "EQUIPMENT_FE_PRECEDENCE = FALSE" somewhere to restore 
the old behaviour.

Stefan
    Reply  20 Nov 2020, Pierre-Andre Amaudruz, Info, Equipment "common" settings in ODB 
Indeed this "mirror" of the ODB in settings option can cause frustration in 
particular when we think the ODB is empty but is not.
In the other hand, over time the settings are adjusted to a particular 
configuration or touched or not by the individual run preset parameters. Later, if 
a bug or code correction requires multiple restart of the fe, for every start of 
the application, you loose the latest configuration. This can be frustrating as 
well until you force a post-setting or report the specifics parameters in the fe 
code.
BTW I believe, we originally went for the ODB priority for that specific reason.
 
I would be in favour for having a general flag (FALSE) in /experiment which would 
define this global behaviour.  
PAA

> Today I addressed a topic which bugged me since long time. The ODB contains 
> settings under /Equipment/<name>/Common which are a "mirror" of the equipment[] 
> setting in a frontend (using the mfe.cxx framework). If the "Common" entry in 
> the ODB is not present (fresh experiment), the equipment[] settings from the 
> frontend are copied to the ODB. But if it exists, it takes precedence over the 
> equipment[] entries, which is wrong in my opinion. Like if you change some 
> settings in equipment[] (like the logging period of the history), then recompile 
> and restart the frontend, the old values in the ODB are kept and your 
> modification in the frontend code has no effect.
> 
> Starting on commit c3017c6c on Nov. 17th 2020 I reversed the precedence: Now, on 
> each start of the frontend program, the values from equipment[] are written to 
> the ODB. They are still "live". If one changes them when the frontend is 
> running, that change takes effect immediately. But on the next restart of the 
> frontend, the old values from equipment[] is put back there.
> 
> I fell too many times into this trap, and I hope the modification helps 
> everybody. If there are however experiments which rely on the fact that the 
> common settings in the ODB are NOT overwritten by the frontend, please let me 
> know and I can put a flag "EQUIPMENT_FE_PRECEDENCE = FALSE" somewhere to restore 
> the old behaviour.
> 
> Stefan
    Reply  27 Nov 2020, Konstantin Olchanski, Info, Equipment "common" settings in ODB 
> Today I addressed a topic which bugged me since long time.

Right. No easy subject. For me, too, this has been a problem in MIDAS for a long time.

> Now, on each start of the frontend program, the values from equipment[] are written to 
> the ODB. They are still "live". If one changes them when the frontend is 
> running, that change takes effect immediately. But on the next restart of the 
> frontend, the old values from equipment[] is put back there.

There is a downside from this behaviour.

If some values in equipment/common are "live" and the user is expected to change them,
the user will be unpleasantly surprised when their changes magically disappear (after reboot,
after frontend crash, after run restart if experiment requires restarting some frontends
before starting a new run).

This change will also break some experiments that rely in things like specifying
event buffer names through ODB. But experiments can adapt and specify buffer names
through command line switch instead of ODB.

This new way also it makes the "live" Common/Period unusable. Sure I can speed up or slow
down a frontend even during the run, but if my change does not "stick", what good is it?

Personally, I think there is no easy solution for all these troubles.

I would advocate the following approach:

- think of MIDAS as a "mature" system,
- treasure backward compatibility
- (if we must break backward compatibility to introduce a new "must have" improvement, so be it)
- document how things work. if it is clearly written down what different fields in "common" do, fewer people 
"get burned" by unexpected or illogical things. (and any non-trivial system has plenty of those).

Going back to ODB equipment/common, my experience with midas and odb tells me
that one should avoid mixing together ODB entries set by user and ODB entries set by code.

For example, separating them as equipment/settings and equipment/variables works well. Mixing
them as in equipment/common and sequencer/state causes trouble.

So perhaps we should split Equipment/common into two pieces, user settable fields like
"Period" and "event buffer name" would move to equipment/settings or whatever.

This will open the discussion of which items in equipment/common should be user settable,
and some people would want event buffer specified in the code to prevail, while other
people would want the name from odb to prevail, and both are valid but conflicting preferences.

Or we could bite the bullet and say, equipment/common is controlled by the frontend code,
the user should not change it. (and mark it read-only in ODB).

For all the pain this may cause, at least this will make it self-consistent.

Per this proposal, in addition to Stefan's change, the hotlink on equipment/common goes away,
"period" is no longer "live" and the whole subdirectory is made "read-only".

K.O.
       Reply  27 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB 
Ok, so what about the following proposal:

- I change back the mfe.cxx code to behave like before (ODB has precedence and does not get overwritten when the 
front-end restarts)

- I add a global flag

BOOL equipment_common_overwrite;

and pre-set it to FALSE;

- So if nothing is changed the flag stays false and ODB keeps precedence

- If a frontend wants to overwrite equipment/common on each start, the user sets

BOOL equipment_common_overwrite = TRUE;

near the equipment[] structure in the front-end code. 

- If the flag is true, the mfe.cxx init code copies the equipment[] structure to the ODB on each frontend start

I believe this way we can keep backward compatibility, and add the new way with minimal effort. The only downside 
is that all frontends on this plane have to add at least "BOOL equipment_common_overwrite = FALSE;" in their 
code.

I know global variables are evil, but this way the user can just add the line above to the equipment[] array, so 
one sees this when one edits the equipment[] array, giving motivation to change as needed. So the code would be



BOOL equipment_common_overwrite = TRUE;

EQUIPMENT equipment[] = {
 ....
}



An alternative way would be to add a function

  set_equipment_common_overwrite(TRUE);

into the frontend_init() code. That's somehow cleaner (still needs an internal global variable), but it has to go 
into frontend_init() so won't be at the same place as the EQUIPMENT list in the frontend.

Thoughts?

Best,
Stefan
          Reply  27 Nov 2020, Konstantin Olchanski, Info, Equipment "common" settings in ODB 
Yes, I think this will work.

For old mfe.c frontends, global variable set to "do it the new way" should be okey,
new experiments will have it the new way. Old experiments, will be forced to add a one-line definition
of this global variable (otherwise mfe.o will not link), at that time they get to chose "new way" or "old way".

For the new TMFE c++ frontend, this will work naturally when they create the Equipment Common object,
in the object constructor, you can see how it explicitly honors or overwrites the ODB common entries.

The TMFE frontend does not do a live "period", so there should be no issue with that.

Should I open a bitbucket issue "update TMFE frontend to new Equipment/Common scheme", to make sure
I do not forget about it?

K.O.
             Reply  30 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB 
Ok, I implemented it the following way:

- Added a boolean flag "equipment_common_overwrite", which must be contained in EACH frontend, preferably just 
before the EQUIPMENT structure, such as:

BOOL equipment_common_overwrite = TRUE;

EQUIPMENT equipment[] = {
...
};

- If that flag is TRUE, then the contents of the "equipment" structure is copied to the ODB on each start of the 
front-end

- If the flag is FALSE, then the ODB values are kept on the start of the front-end

The setting of the flag depends now on the philosophy of the experiment. Some experiments say that everything 
needed should be in the front-end code, so when it starts everything gets set correctly. They don't change the 
values in the ODB, but in the frontend code, which then goes into their repository. Other experiments just need 
some default values from the frontend code, and the fine-tune things by changing values in the ODB. These 
experiments should set this flag to FALSE.

*****

Please note that EVERY frontend now needs this flag, so all of you have to add it to all of your front-ends, 
otherwise the front-end will not compile! I could not figure out how to this could be done without this 
requirement, since you can define a global variable only once.

*****


Stefan
                Reply  30 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB 
One more change: 

After using the new code for some hours, we realized that the "enabled" flag should not come from the frontend code, 
but always be defined by the ODB. So if you quickly have to disable some equipment because the associated hardware is 
off, you want to change this flag only in the ODB and not have to recompile the frontend. So we exclude that flag from 
being set by the frontend. It is anyhow special, because one sees all disable equipment in the main midas status page, 
so one knows what's on and what's off.

Please comment here if you think that change causes problem. Anyhow it's working now for the enabled flag as before 
all these changes.

Stefan
                   Reply  30 Nov 2020, Konstantin Olchanski, Info, Equipment "common" settings in ODB 
> One more change: 
> 
> After using the new code for some hours, we realized that the "enabled" flag should not come from the frontend code, 
> but always be defined by the ODB. So if you quickly have to disable some equipment because the associated hardware is 
> off, you want to change this flag only in the ODB and not have to recompile the frontend. So we exclude that flag from 
> being set by the frontend. It is anyhow special, because one sees all disable equipment in the main midas status page, 
> so one knows what's on and what's off.
> 
> Please comment here if you think that change causes problem. Anyhow it's working now for the enabled flag as before 
> all these changes.
> 

Good catch. I still think this is fundamentally impossible to "get right". But good, you
are now in the same boat with me. The documentation will read: "if flag is TRUE, these data fields
are read from ODB, if flag is FALSE, those other fields are read from ODB". I will have to check
how this will work out for the TMFE C++ frontend (I think both mfe.c and TMFE frontends should
work "the same").

I think we have at least one month to play with this, I do not think we can do the next release
of midas until January.

K.O.
Entry  06 Nov 2020, Alexandr Kozlinskiy, Suggestion, cmake build fixes 
hi,

there are several problems with current cmake build files in midas:
- not all systems have cuda libs in /usr/local/cuda
- not all cmake version like when redefining vars
  (i.e. redefining ROOT_CXX_FLAGS)
- c++ standard not matching the one used to build ROOT
- ROOTSYS is not needed to find ROOT (it is enough to have root in PATH)

I have posted pull request 'https://bitbucket.org/tmidas/midas/pull-requests/17'
which tries to fix some of the problems.
Tests and comments are welcome.
    Reply  27 Nov 2020, Konstantin Olchanski, Suggestion, cmake build fixes 
Hi, Alexandr, thank you for making improvements to MIDAS. I have some question
about your suggestions:

> there are several problems with current cmake build files in midas:
> - not all systems have cuda libs in /usr/local/cuda
> - not all cmake version like when redefining vars

we do not see these problems with the normal cmake on our current linux systems,
centos-7 and -8, Ubuntu LTS 18.04, 20.04.

so you have something different? can you be a bit more specific,
which version of cmake and which OS you have so see these troubles?

> - c++ standard not matching the one used to build ROOT
> - ROOTSYS is not needed to find ROOT (it is enough to have root in PATH)

Again ROOT tangles with the build of MIDAS.

MIDAS does not use ROOT. As a convenience to the users, we have a "ROOT output" driver
in mlogger and we build a special executable rmlogger with ROOT. Only this special
executable should be linked with ROOT and compiled with ROOT-specific flags.

The rest of the MIDAS build should not be affected by presence or absence of ROOT.

One would have to read old messages on this forum to understand this situation.

> 
> I have posted pull request 'https://bitbucket.org/tmidas/midas/pull-requests/17'
> which tries to fix some of the problems.
> Tests and comments are welcome.
>

I look at the diffs:

- CUDA detection is changed to "find_package(CUDA)". This code was added by Joseph and Ben, and there 
must be a reason why they did not use find_package(CUDA). They will have to sign-off on this change.

- ROOT related logic assumes that all of MIDAS will be built "the ROOT way". CFLAGS are changed, the C++ 
standard is changed, etc. this assumption is wrong. only rmlogger and rmana should be built "with ROOT".

If you want to follow through on this, I suggest that you split the pull request into two,
one pull request for the CUDA changes and one pull request for the ROOT changes. Also rework
your ROOT changes as I explained above (but also read all ROOT-related messages on this forum).

K.O.
Entry  05 Nov 2020, Isaac Labrie Boulay, Forum, Building an experiment using CAEN VME interface - unknown type name 'VARIANT_BOOL' 
Hi everyone,

I have been building an experiment using the v1718 CAEN interface to talk to my modules and I am using the CAENVMElib Linux Library (2.50). I've managed to deal with data type issues by including additional libraries to my driver code but there is one type error that persists:


In file included from /usr/include/CAENVMElib.h:27:0,
             	from include/v1718.h:25,
             	from v1718.c:26:
/usr/include/CAENVMEtypes.h:323:9: error: unknown type name ‘VARIANT_BOOL’
     	CAEN_BOOL cvDS0;  	/* Data Strobe 0 signal                     	*/


The header file used to defined the CAEN types (CAENVMEtypes.h) defines 'CAEN_BOOL' like this:


#ifdef LINUX
#define CAEN_BYTE   	unsigned char
#define CAEN_BOOL   	int
#else
#define CAEN_BYTE   	byte
#define CAEN_BOOL   	VARIANT_BOOL
#endif


Has anyone ever ran into that problem when setting up an experiment using the CAEN standard?

Thanks for your help.

Isaac
    Reply  05 Nov 2020, Pierre-Andre Amaudruz, Forum, Building an experiment using CAEN VME interface - unknown type name 'VARIANT_BOOL' 
Hi,

You're building under Linux like. You want to define the LINUX and skip the VARIANT_BOOL all together.
PAA

> Hi everyone,
> 
> I have been building an experiment using the v1718 CAEN interface to talk to my modules and I am using the CAENVMElib Linux Library (2.50). I've managed to deal with data type issues by including additional libraries to my driver code but there is one type error 
that persists:
> 
> 
> In file included from /usr/include/CAENVMElib.h:27:0,
>              	from include/v1718.h:25,
>              	from v1718.c:26:
> /usr/include/CAENVMEtypes.h:323:9: error: unknown type name ‘VARIANT_BOOL’
>      	CAEN_BOOL cvDS0;  	/* Data Strobe 0 signal                     	*/
> 
> 
> The header file used to defined the CAEN types (CAENVMEtypes.h) defines 'CAEN_BOOL' like this:
> 
> 
> #ifdef LINUX
> #define CAEN_BYTE   	unsigned char
> #define CAEN_BOOL   	int
> #else
> #define CAEN_BYTE   	byte
> #define CAEN_BOOL   	VARIANT_BOOL
> #endif
> 
> 
> Has anyone ever ran into that problem when setting up an experiment using the CAEN standard?
> 
> Thanks for your help.
> 
> Isaac
       Reply  06 Nov 2020, Isaac Labrie Boulay, Forum, Building an experiment using CAEN VME interface - unknown type name 'VARIANT_BOOL' 
Yes, you are right. That fixed it and my frontend is compiling.

Thanks Pierre-Andre.

Isaac


> Hi,
> 
> You're building under Linux like. You want to define the LINUX and skip the VARIANT_BOOL all together.
&g