Back Midas Rome Roody Rootana
  Midas DAQ System, Page 5 of 45  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
Entry  15 Mar 2023, Casey, Forum, Having trouble with MIDAS setup 
Hi

I'm not sure if this is the right forum for this query (if it is not, I would truly appreciate it if someone could point me at the right forum). I'm having a little bit of trouble with the setup of a Midas system.

Now, this is a system that I inherited after the previous guy who was looking after it went to other employment. There was a point at which it was working. And then there was an unrelated issue in the electrical system which, as a side effect, meant that the building lost power for a time, and the entire system had to be rebooted.

No problem, I thought. I'll just reset and restart all of the software...

...and I can't seem to get it to work. I keep getting the error message "mvme_read_value: Could not perform read!: Bad address". So far as I can tell, this seems to be related to the idea that the base address being used to read from the boards is incorrect. The base addresses are hardcoded in the software (not autogenerated) and, aside from the power going down and up again, the hardware hasn't been touched since the system was working.

I imagine that there is something that needs to be set, twiddled, tweaked, or turned on in the driver. The output of 'lsmod | grep vme' is:

vmedriver             117742  0

so presumably the driver is at least *present*, even if I have no idea how to twiddle anything on it.

Could anyone perhaps suggest a way forward? Is there some way to gather the information that I need, perhaps, or some way to twiddle anything twiddle-able on the driver?

Casey
    Reply  16 Mar 2023, Konstantin Olchanski, Forum, Having trouble with MIDAS setup 
> I'm not sure if this is the right forum for this query

this might be the right place, depending.

> I'm having a little bit of trouble with the setup of a Midas system ...
> inherited after the previous guy ...

a rather major problem, but a typical situation.

> There was a point at which it was working.

this is very good. if it worked before, there is good chance it will work again.

> And then there was an unrelated issue in the electrical system which, as a side effect,
> meant that the building lost power for a time, and the entire system had to be rebooted.
> No problem, I thought. I'll just reset and restart all of the software...
> ...and I can't seem to get it to work.

this happens often enough. several things are likely to happen:
- unexpected software updates, i.e. new linux kernel was installed but inactive, waiting for a reboot
- hardware failures, i.e. we usually see blown up power supplies. check that all VME crate voltages are okey. (ask me how).
- firmware corruption, i.e. we have seen VME modules lose their firmware after power outage, had to be reloaded by jtag

> I keep getting the error message "mvme_read_value: Could not perform read!: Bad address".

this is a generic error, it does not mean that software suddenly is trying to read from wrong address.

> I imagine that there is something that needs to be set, twiddled, tweaked, or turned on in the driver. The output of 'lsmod | grep vme' is:
> vmedriver             117742  0

this is not the vme driver we use at TRIUMF, so I am not familiar with it's errors. we use the vme_tsi148 driver and the vme_universe driver. (ask me about them).

> so presumably the driver is at least *present*, even if I have no idea how to twiddle anything on it.

could be the wrong version of the driver or the wrong version of the linux kernel. worth checking log files
to see if kernel and driver version numbers are the same.

> Could anyone perhaps suggest a way forward?

Yes. You will have to tell me much more about your system. You can do this publicly here or privately by email to olchansk@triumf.ca

To start, I need to know your VME setup, what is the crate, what is the VME processor, what OS you run, what VME driver you use, what VME modules you have installed.

K.O.
Entry  22 Feb 2023, Stefano Piacentini, Info, connection to a MySQL server: retry procedure in the Logger 
Dear all,

we are experiencing a connection problem to the MySQL server that we use to log informations. Is there an 
option to retry multiple times the I/O on the MySQL?

The error we are experiencing is the following (hiding the IP address):

[Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't 
connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)

Then the logger stops, and must be restarted. This eventually happens only during the BOR or the EOR.

Best,
Stefano.
    Reply  22 Feb 2023, Stefan Ritt, Info, connection to a MySQL server: retry procedure in the Logger 
> Dear all,
> 
> we are experiencing a connection problem to the MySQL server that we use to log informations. Is there an 
> option to retry multiple times the I/O on the MySQL?
> 
> The error we are experiencing is the following (hiding the IP address):
> 
> [Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't 
> connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)
> 
> Then the logger stops, and must be restarted. This eventually happens only during the BOR or the EOR.

What would you propose? If the connection does not work, most likely the server is down or busy. If we retry, 
the connection still might not work. If we retry many times, people will complain that the run start or stop 
takes very long. If we then just continue (without stopping the logger), the MySQL database will miss important 
information and the runs probably cannot be analyzed later. So I believe it's better to really stop the logger 
so that people get aware that there is a problem and fix the source, rather than curing the symptoms.

In the MEG experiment at PSI we run the logger with a MySQL database and we never see any connection issue, 
except when the MySQL server gets in maintenance (once a year), but usually we don't take data then. Since we 
use the same logger code, it cannot be a problem there. So I would try to fix the problem on the MySQL side.

Best,
Stefan
       Reply  07 Mar 2023, Stefano Piacentini, Info, connection to a MySQL server: retry procedure in the Logger 
> > Dear all,
> > 
> > we are experiencing a connection problem to the MySQL server that we use to log informations. Is there an 
> > option to retry multiple times the I/O on the MySQL?
> > 
> > The error we are experiencing is the following (hiding the IP address):
> > 
> > [Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't 
> > connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)
> > 
> > Then the logger stops, and must be restarted. This eventually happens only during the BOR or the EOR.
> 
> What would you propose? If the connection does not work, most likely the server is down or busy. If we retry, 
> the connection still might not work. If we retry many times, people will complain that the run start or stop 
> takes very long. If we then just continue (without stopping the logger), the MySQL database will miss important 
> information and the runs probably cannot be analyzed later. So I believe it's better to really stop the logger 
> so that people get aware that there is a problem and fix the source, rather than curing the symptoms.
> 
> In the MEG experiment at PSI we run the logger with a MySQL database and we never see any connection issue, 
> except when the MySQL server gets in maintenance (once a year), but usually we don't take data then. Since we 
> use the same logger code, it cannot be a problem there. So I would try to fix the problem on the MySQL side.
> 
> Best,
> Stefan


Dear Stefan,

a possible solution could be to define the number of times to retry as a parameter that is 0 by default, as well as a wait time between two subsequent tries. This 
would leave the decision on how to handle a possible failed connection to the user. In our case, for example, we would prefer to not stop the acquisition in case 
of a failed connection to the external SQL. In addition, we have other software that, with a retry procedure, doesn’t fail: with 1 re-try and a sleep time of 0.5 s 
we already recover 100% of the faults.

Anyway, we implemented a local database, which is a mirror of the external one, and the problems disappeared.

Thanks,
Stefano.
Entry  05 Nov 2022, Zaher Salman, Suggestion, histories capture 'ruy' 
The histories capture key events from 'r' 'u' 'y' and 'Escape' for various functions like rescaling etc. However, this also means that if we include an editable modbvalue and a history in the same custom page then changing the modbvalue to something that includes 'ruy' is not possible.

In mhistory.js we have

// Keyboard event handler (has to be on the window!)
window.addEventListener("keydown", this.keyDown.bind(this));

I am not sure why it "has to be on the window". For now, I am bypassing this issue by changing the event listener to "keyup" but maybe there is a more elegant solution for this. Adding the event listener to the div element that includes the history does not seem to work.
    Reply  08 Feb 2023, Stefan Ritt, Suggestion, histories capture 'ruy' 
> The histories capture key events from 'r' 'u' 'y' and 'Escape' for various functions like rescaling etc. However, this also means that if we include an editable modbvalue and a history in the same custom page then changing the modbvalue to something that includes 'ruy' is not possible.
> 
> In mhistory.js we have
> 
> // Keyboard event handler (has to be on the window!)
> window.addEventListener("keydown", this.keyDown.bind(this));
> 
> I am not sure why it "has to be on the window". For now, I am bypassing this issue by changing the event listener to "keyup" but maybe there is a more elegant solution for this. Adding the event listener to the div element that includes the history does not seem to work.

I could reproduce the problem. I see two options there:

1) We replace 'r' with 'Ctrl-r' etc. 

2) We change the history JS code not to process keyboard inputs if we are currently editing a value.

I added 2) to give it a try. It works fine for me. The additional code is

MhistoryGraph.prototype.keyDown = function (e) {
   // don't consume events if we are editing a value
   if (e.target.tagName === "INPUT")
      return;
   ...
}


Feedback is welcome.

Stefan
       Reply  09 Feb 2023, Zaher Salman, Suggestion, histories capture 'ruy' 
I agree with you, option 2 is better and works well. 
The only problem is that if you are showing multiple histograms in the same window the keyDown even will affect all of the histories in the window. 
This may be the intended behaviour, but I think that if we can find a way to have the event affecting only the intended history (focused element for example) it would be better.

Zaher 

> > The histories capture key events from 'r' 'u' 'y' and 'Escape' for various functions like rescaling etc. However, this also means that if we include an editable modbvalue and a history in the same custom page then changing the modbvalue to something that includes 'ruy' is not possible.
> > 
> > In mhistory.js we have
> > 
> > // Keyboard event handler (has to be on the window!)
> > window.addEventListener("keydown", this.keyDown.bind(this));
> > 
> > I am not sure why it "has to be on the window". For now, I am bypassing this issue by changing the event listener to "keyup" but maybe there is a more elegant solution for this. Adding the event listener to the div element that includes the history does not seem to work.
> 
> I could reproduce the problem. I see two options there:
> 
> 1) We replace 'r' with 'Ctrl-r' etc. 
> 
> 2) We change the history JS code not to process keyboard inputs if we are currently editing a value.
> 
> I added 2) to give it a try. It works fine for me. The additional code is
> 
> MhistoryGraph.prototype.keyDown = function (e) {
>    // don't consume events if we are editing a value
>    if (e.target.tagName === "INPUT")
>       return;
>    ...
> }
> 
> 
> Feedback is welcome.
> 
> Stefan
          Reply  09 Feb 2023, Stefan Ritt, Suggestion, histories capture 'ruy' 
> I agree with you, option 2 is better and works well. 
> The only problem is that if you are showing multiple histograms in the same window the keyDown even will affect all of the histories in the window. 
> This may be the intended behaviour, but I think that if we can find a way to have the event affecting only the intended history (focused element for example) it would be better.

The history panels are simple pictures from the perspective of the browser and have no concept of a focus. This would have to be tweaked somehow. If you want to reset a single history, 
just click on its reset button (circle arrow at the right top).

Stefan
Entry  31 Jan 2023, Lukas Gerritzen, Suggestion, "Soft interlock" possible? 
Is it possible to impose requirements on certain output variables in an interlock-like fashion? For example: "As long as the temperature exceeds a certain threshold, a light switched by a relay cannot be turned on."

A workaround would be to set an alarm on that variable and call a script which turns the light back off, but that might not be ideal in certain scenarios. For safety-critical situations, a PLC would be preferred, but I am missing an option between those two.
    Reply  31 Jan 2023, Stefan Ritt, Suggestion, "Soft interlock" possible? 
> Is it possible to impose requirements on certain output variables in an interlock-like fashion? For example: "As long as the temperature exceeds a certain threshold, a light switched by a relay cannot be turned on."
> 
> A workaround would be to set an alarm on that variable and call a script which turns the light back off, but that might not be ideal in certain scenarios. For safety-critical situations, a PLC would be preferred, but I am missing an option between those two.

No, interlocks of that kind are not implemented in midas. And that is for a good reason. Interlocks are critical, so one must be sure 100% that they are working. This cannot be done with complex software such as midas. You have to use dedicated hardware for that. 
Most people use a PLC controller which is made for that. Midas is then only used to read and display the status of the interlock controller.

Stefan
Entry  13 Jan 2023, Denis Calvet, Suggestion, Debug printf remaining in mhttpd.cxx 
Hi everyone,

It has been a long time since my last message. While porting Midas front-end 
programs developed for the T2K experiment in 2008 to a modern version of Midas, 
I noticed that some debug printf remain in mhttpd.cxx.

A number of debug messages are printed on stdout each time a graph is displayed 
in the OldHistory window (which is the style of history plots that will continue 
to be used in the upgraded T2K experiment for some technical reasons).

Here is an example of such debug messages:
Load from ODB History/Display/HA_EP_0/V33: hist plot: 8 variables
timescale: 1h, minimum: 0.000000, maximum: 0.000000, zero_ylow: 0, log_axis: 0, 
show_run_markers: 1, show_values: 1, show_fill: 0, show_factor 0, enable_factor: 
1
var[0] event [HA_TPC_SC][E0M00FEM_V33] formula [], colour [#00AAFF] label 
[Mod_0] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
10
var[1] event [HA_TPC_SC][E0M01FEM_V33] formula [], colour [#FF9000] label 
[Mod_1] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
20
var[2] event [HA_TPC_SC][E0M02FEM_V33] formula [], colour [#FF00A0] label 
[Mod_2] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
30
var[3] event [HA_TPC_SC][E0M03FEM_V33] formula [], colour [#00C030] label 
[Mod_3] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
40
var[4] event [HA_TPC_SC][E0M04FEM_V33] formula [], colour [#A0C0D0] label 
[Mod_4] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
50
var[5] event [HA_TPC_SC][E0M05FEM_V33] formula [], colour [#D0A060] label 
[Mod_5] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
60
var[6] event [HA_TPC_SC][E0M06FEM_V33] formula [], colour [#C04010] label 
[Mod_6] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
70
var[7] event [HA_TPC_SC][E0M07FEM_V33] formula [], colour [#807060] label 
[Mod_7] show_raw_value 0 factor 1.000000 offset 0.000000 voffset 0.000000 order 
80
read_history: nvars 10, hs_read() status 1
read_history: 0: event [HA_TPC_SC], var [E0M00FEM_V33], index 0, odb index 0, 
status 1, num_entries 475
read_history: 1: event [HA_TPC_SC], var [E0M01FEM_V33], index 0, odb index 1, 
status 1, num_entries 475
read_history: 2: event [HA_TPC_SC], var [E0M02FEM_V33], index 0, odb index 2, 
status 1, num_entries 475
read_history: 3: event [HA_TPC_SC], var [E0M03FEM_V33], index 0, odb index 3, 
status 1, num_entries 475
read_history: 4: event [HA_TPC_SC], var [E0M04FEM_V33], index 0, odb index 4, 
status 1, num_entries 475
read_history: 5: event [HA_TPC_SC], var [E0M05FEM_V33], index 0, odb index 5, 
status 1, num_entries 475
read_history: 6: event [HA_TPC_SC], var [E0M06FEM_V33], index 0, odb index 6, 
status 1, num_entries 475
read_history: 7: event [HA_TPC_SC], var [E0M07FEM_V33], index 0, odb index 7, 
status 1, num_entries 475
read_history: 8: event [Run transitions], var [State], index 0, odb index -1, 
status 1, num_entries 0
read_history: 9: event [Run transitions], var [Run number], index 0, odb index 
-2, status 1, num_entries 0


Looking at the source code of mhttpd, these messages originate from:

[mhttpd.cxx line 10279] printf("Load from ODB %s: ", path.c_str());
[mhttpd.cxx line 10280] PrintHistPlot(*hp);
...
[mhttpd.cxx line  8336] int read_history(...
...
[mhttpd.cxx line  8343] int debug = 1;
...

Although seeing many debug messages in the mhttpd does not hurt, these can hide 
important error messages. I would rather suggest to turn off these debug 
messages by commenting out the relevant lines of code and setting the debug 
variable to 0 in read_history().

That is a minor detail and it is always a pleasure to use Midas.

Best regards,
Denis.
 
    Reply  13 Jan 2023, Stefan Ritt, Suggestion, Debug printf remaining in mhttpd.cxx 
These debug statements were added by K.O. on June 24, 2021. He should remove it.

https://bitbucket.org/tmidas/midas/commits/21f7ba89a745cfb0b9d38c66b4297e1aa843cffd

Best,
Stefan
Entry  19 Aug 2022, Konstantin Olchanski, Bug Fix, "Detected duplicate or non-monotonous data" in history files 
serious (but rare) bug was fixed in the history reader. unlucky experiment would see 
errors about "Detected duplicate or non-monotonous data" in some history file, fixed by 
removing/renaming the offending file. (reported by MEG experiment)

it turns out there was nothing wrong with the data files (good), but there
was a nasty bug in the history reader. it did not ensure that we read history
files in chronological order. under some conditions order of files could be
reversed, older files would be read after newer files and trip the built-in
protection against returning non-monotonically increasing history data to the user.

fixed commit 
https://bitbucket.org/tmidas/midas/commits/9893f85ebe33e96cc63f501a0f89e1f8932c894d

for more details, see https://bitbucket.org/tmidas/midas/issues/350/file-history-non-
monotonic-time

K.O.
    Reply  23 Aug 2022, Konstantin Olchanski, Bug Fix, "Detected duplicate or non-monotonous data" in history files 
> serious (but rare) bug was fixed in the history reader.

previous fix was incomplete. please update to git commit
https://bitbucket.org/tmidas/midas/commits/b343c3c98e4e6fd00a00cf686c74c7ccc6da0c63

K.O.
       Reply  17 Nov 2022, Konstantin Olchanski, Bug Fix, "Detected duplicate or non-monotonous data" in history files 
> > serious (but rare) bug was fixed in the history reader.
> previous fix was incomplete. please update to git commit
> https://bitbucket.org/tmidas/midas/commits/b343c3c98e4e6fd00a00cf686c74c7ccc6da0c63

a race condition between reading history file in mhttpd and writing history file in 
mlogger was accidentally introduced. mhttpd would file spurious errors about "timestamp 
is after last timestamp".

fixed, please update to git commit
https://bitbucket.org/tmidas/midas/commits/7a9f6e0c58ffddcacb9ee19934ce3e2033a805ef

fix race condition in history file reader - a race condition was added accidentally - 
first the reader remembers the history file size and the time of the last entry, then it 
goes to read the file and bombs if at the same time mlogger added more entries - their 
time is after the remembered time of last entry and error "timestamp is after last 
timestamp" is triggered.

K.O.
Entry  11 Nov 2022, Frederik Wauters, Bug Fix, O_CREAT in open in split.cxx 
midas currently does not compile on linux

/usr/include/x86_64-linux-gnu/bits/fcntl2.h:50:24: error: call to ‘__open_missing_mode’ declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments
   50 |    __open_missing_mode ();

giving the mode is mandatory: https://man7.org/linux/man-pages/man2/open.2.html

fix is to give open in midas/examples/lowlevel/split.cxx a default mode, e.g. 006600
    Reply  12 Nov 2022, Stefan Ritt, Bug Fix, O_CREAT in open in split.cxx 
> midas currently does not compile on linux
> 
> /usr/include/x86_64-linux-gnu/bits/fcntl2.h:50:24: error: call to ‘__open_missing_mode’ declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments
>    50 |    __open_missing_mode ();
> 
> giving the mode is mandatory: https://man7.org/linux/man-pages/man2/open.2.html
> 
> fix is to give open in midas/examples/lowlevel/split.cxx a default mode, e.g. 006600

Thanks. Fixed.

Stefan
       Reply  17 Nov 2022, Konstantin Olchanski, Bug Fix, O_CREAT in open in split.cxx 
> > midas currently does not compile on linux
> > fix is to give open in midas/examples/lowlevel/split.cxx a default mode, e.g. 006600

I got more warnings from split.cxx, looked at the code and see so many problems that it is easier
to delete it than it is to fix it.

Check for end of file is done incorrectly (check for read() return 0, -1 or short read),
memory overrun if given file name is longer than 80 bytes, no check for valid event length
read from the file, and so on and so on.

A better example for reading and writing midas files is in midasio/test_midasio.cxx. Proper c++ coding, and can read compressed files.

K.O.
Entry  22 Oct 2022, Lars Martin, Suggestion, read_only odbxx? 
I really like the concept of the odbxx interface.
I think it would be a nice feature if one could have a read_only connection, e.g. by declaring a "const midas::odb".
Just for fun I tried if this already works, but the compiler doesn't allow const midas::odb for e.g. the [] operator. I'm guessing this would be non-trivial to implement, but I like the idea of certain Midas clients being able to read the odb without risking corruption.
    Reply  24 Oct 2022, Stefan Ritt, Suggestion, read_only odbxx? 
> I really like the concept of the odbxx interface.
> I think it would be a nice feature if one could have a read_only connection, e.g. by declaring a "const midas::odb".
> Just for fun I tried if this already works, but the compiler doesn't allow const midas::odb for e.g. the [] operator. I'm guessing this would be non-trivial to implement, but I like the idea of certain Midas clients being able to read the odb without risking corruption.

Having a "const midas::odb" probably does not work (at least I would not know how to implement that).

But I could make an internal flag analog to the auto refresh flags. So you would have

    o.set_write_protect(true);

to turn on write protection. Would that work for you?

Best,
Stefan
       Reply  26 Oct 2022, Lars Martin, Suggestion, read_only odbxx? 
> Having a "const midas::odb" probably does not work (at least I would not know how to implement that).
> 
> But I could make an internal flag analog to the auto refresh flags. So you would have
> 
>     o.set_write_protect(true);
> 
> to turn on write protection. Would that work for you?

Absolutely. Looking at the underlying code I was also at a loss how const would work.
I'm mostly just interested in having small clients that only read from the odb (for whatever reason) without risking corrupting it by messing something 
up in the code, especially since such small clients are almost by definition hacked together quickly on the fly.
          Reply  29 Oct 2022, Stefan Ritt, Suggestion, read_only odbxx? 
Ok, I implemented and committed that. Just call

o.set_write_protect(true)

on a key you don't want to modify. If you do so, an exception gets thrown.

Best,
Stefan
Entry  14 Oct 2022, Lars Martin, Suggestion, Allow onchange to refer to arbitrary js function 
Maybe this is already possible, I have a hard time understanding the mhttpd source code.

I would like to use a function defined in the <script> block of my custom page as an onchange callback.

Specific example:
I have an modbthermo that I would like to change to three different colours for "too cold", "just right", and "too hot" (measuring porridge, presumably). The examples only show the explicit (condition)?(val1):(val2) syntax, which doesn't allow more than two values, so I had hoped to replace

onchange="this.dataset.color=this.value > 40?'red':'blue';"

with something like

onchange="this.dataset.color=check_Temp(this.value);"

or

onchange="check_Temp(this.value, this.dataset.color);"

if that's easier somehow. The function itself would then return the colour string, or set the color argument to that string (I'm not sure if JS passes references or just values.)

Is this a possibility?
    Reply  14 Oct 2022, Ben Smith, Info, Allow onchange to refer to arbitrary js function 
> I would like to use a function defined in the <script> block of my custom page as an onchange callback.
>
> Is this a possibility?

Yes, this is already possible. An example was shown in the "modb" section of the custom page documentation, but not in the "Changing properties of controls dynamically" section. I've updated the wiki with an example.

https://daq00.triumf.ca/MidasWiki/index.php/Custom_Page#Changing_properties_of_controls_dynamically
       Reply  22 Oct 2022, Lars Martin, Info, Allow onchange to refer to arbitrary js function 
I figured I wasn't the first to have this idea.
Works great, thanks!
          Reply  22 Oct 2022, Lars Martin, Info, Allow onchange to refer to arbitrary js function 
Actually, now that I look again, there is a mistake in the instructions:
you say

onchange="this.dataset.color=check_therm(this)"

but check_therm doesn't return anything and instead changes the color itself. So you either want the function to return the string and use the above assignment, or use the function you provide and use

onchange="check_therm(this)"
Entry  10 Oct 2022, Zaher Salman, Suggestion, JSON-RPC function to read files mjsonrpc_user.cxx
Hello ,

The midas sequencer uses the function js_seq_list_files to get a list of files in the /Sequencer/State/Path with extension *.msl. It would be nice to generalize this function to be able to read files with other (or any) extension.

Based on the js_seq_list_files I added a function in js_any_list_files mjsonrpc_user.cxx (attached) which does the job. Maybe a better/safer implementation can be made in midas. Are there any plans to do this?

thanks.
Entry  21 Aug 2022, Joseph McKenna, Suggestion, mvodb functionality to get the 'LastWritten' property of a key 


I want to read data from the ODB with the mvodb interface in one of my frontends, it's useful to know how old that data is, so I prototyped functionality in a pull request to mvodb:

https://bitbucket.org/tmidas/mvodb/pull-requests/2/add-readkeylastwritten-function-to-extract
    Reply  22 Aug 2022, Stefan Ritt, Suggestion, mvodb functionality to get the 'LastWritten' property of a key 
> I want to read data from the ODB with the mvodb interface in one of my frontends, it's useful to know how old that data is, so I prototyped functionality in a pull request to mvodb:
> 
> https://bitbucket.org/tmidas/mvodb/pull-requests/2/add-readkeylastwritten-function-to-extract

Thanks for raising that point. I realized that the odbxx API was also missing that functionality, so I added it: 
https://bitbucket.org/tmidas/midas/commits/6991a92c19292eaf67721cb80f182c61db077f45

Best,
Stefan
Entry  15 Aug 2022, Zaher Salman, Bug Report, firefox hangs due to mhistory 
Firefox is hanging/becoming unresponsive due to javascript code. After stopping the script manually to get firefox back in control I have the following message in the console

17:21:28.821 Script terminated by timeout at:
MhistoryGraph.prototype.drawTAxis@http://lem03.psi.ch:8081/mhistory.js:2828:7
MhistoryGraph.prototype.draw@http://lem03.psi.ch:8081/mhistory.js:1792:9
mhistory.js:2828:7

Any ideas how to resolve this??
    Reply  15 Aug 2022, Stefan Ritt, Bug Report, firefox hangs due to mhistory 
> Firefox is hanging/becoming unresponsive due to javascript code. After stopping the script manually to get firefox back in control I have the following message in the console
> 
> 17:21:28.821 Script terminated by timeout at:
> MhistoryGraph.prototype.drawTAxis@http://lem03.psi.ch:8081/mhistory.js:2828:7
> MhistoryGraph.prototype.draw@http://lem03.psi.ch:8081/mhistory.js:1792:9
> mhistory.js:2828:7
> 
> Any ideas how to resolve this??

I have to reproduce the problem. Can you send me the full URL from your browser when you see that problem? Probably you have some "special" axis limits, so we don't see that 
problem anywhere else.

Stefan
       Reply  16 Aug 2022, Zaher Salman, Bug Report, firefox hangs due to mhistory 
> > Firefox is hanging/becoming unresponsive due to javascript code. After stopping the script manually to get firefox back in control I have the following message in the console
> > 
> > 17:21:28.821 Script terminated by timeout at:
> > MhistoryGraph.prototype.drawTAxis@http://lem03.psi.ch:8081/mhistory.js:2828:7
> > MhistoryGraph.prototype.draw@http://lem03.psi.ch:8081/mhistory.js:1792:9
> > mhistory.js:2828:7
> > 
> > Any ideas how to resolve this??
> 
> I have to reproduce the problem. Can you send me the full URL from your browser when you see that problem? Probably you have some "special" axis limits, so we don't see that 
> problem anywhere else.
> 
> Stefan

Hi Stefan and Konstantin,

The URL (reachable only within PSI) is http://lem03.psi.ch:8081/?cmd=custom&page=Mudas
Firefox is version 91.12.0esr (64-bit), but I had similar issues with chrome/chromium too.
The hangs seem to happen randomly so I have not been able to reproduce it yet. 
I have histories here  http://lem03.psi.ch:8081/?cmd=custom&page=Mudas&tab=3 (30 minutes each), but I have also histories popping up in modals though they do not cause any issues. 
I'll try to reproduce it in the coming few days and report again.

thanks,
Zaher
          Reply  16 Aug 2022, Zaher Salman, Bug Report, firefox hangs due to mhistory 
I found the bug. The problem is triggered by changing the firefox window. This calls a function that is supposed to change the size of the history plot and it works well when the history plots are visible but not if the history plots are hidden in a javascript tab (not another firefox tab).

Is there a clean way to resize the history plot if the parent div changes size?? The offending code is
mhist[i].mhg = new MhistoryGraph(mhist[i]);
mhist[i].mhg.initializePanel(i);
mhist[i].mhg.resize();
mhist[i].resize = function () {
   mhis.mhg.resize();
};
             Reply  17 Aug 2022, Stefan Ritt, Bug Report, firefox hangs due to mhistory 
The problem lies in your function mhistory_init_one() in Mudas.js:1965. You can only call "new MhistoryGraph(e)" with an element "e" which is something like

<div class="mjshistory" data-group="..." data-panel="..." data-base-u-r-l="https://host.psi.ch/?cmd=history" title="">

Please note the "data-base-u-r-l". This gets automatically added by the function mhistory_init() in mhistory.js:48. The URL is necessary sot that the upper right button in a history graph works which goes to a history page only showing the current graph.

In you function mhistory_init_one() you forgot the call

   mhist.dataset.baseURL = baseURL; 

where baseURL has to come from the current address bar like

   let baseURL = window.location.href;
   if (baseURL.indexOf("?cmd") > 0)
      baseURL = baseURL.substr(0, baseURL.indexOf("?cmd"));
   baseURL += "?cmd=history";

If you duplicate some functionality from mhistory.js, please make sure to duplicate it completely.

Best,
Stefan
                Reply  17 Aug 2022, Zaher Salman, Bug Report, firefox hangs due to mhistory 
> The problem lies in your function mhistory_init_one() in Mudas.js:1965. You can only call "new MhistoryGraph(e)" with an element "e" which is something like
> 
> <div class="mjshistory" data-group="..." data-panel="..." data-base-u-r-l="https://host.psi.ch/?cmd=history" title="">
> 
> Please note the "data-base-u-r-l". This gets automatically added by the function mhistory_init() in mhistory.js:48. The URL is necessary sot that the upper right button in a history graph works which goes to a history page only showing the current graph.
> 
> In you function mhistory_init_one() you forgot the call
> 
>    mhist.dataset.baseURL = baseURL; 
> 
> where baseURL has to come from the current address bar like
> 
>    let baseURL = window.location.href;
>    if (baseURL.indexOf("?cmd") > 0)
>       baseURL = baseURL.substr(0, baseURL.indexOf("?cmd"));
>    baseURL += "?cmd=history";
> 
> If you duplicate some functionality from mhistory.js, please make sure to duplicate it completely.
> 

Thanks Stefan, but this was not the problem since I am setting the baseURL. You may have looked at the code during my debugging.

Some of my histories are placed in an IFrame object. I eventually realized that my code fails when it tries to resize a history which is placed in an invisible IFrame. I resolved the issue by making sure that I am resizing plots only if they are in a visible IFrame.
  
                   Reply  17 Aug 2022, Stefan Ritt, Bug Report, firefox hangs due to mhistory 
> Some of my histories are placed in an IFrame object. I eventually realized that my code fails 
> when it tries to resize a history which is placed in an invisible IFrame. I resolved the issue 
> by making sure that I am resizing plots only if they are in a visible IFrame.

Just to be clear: You could resolve everything on your side, or do you need to change anything in mhistory.js?

Just a tip: IFrames are not good to put anything in. I recommend just to dynamically crate a <div> element, 
append it to the document body, make it floating and initially invisible. Then put all inside that div. Have
a look how control.js do it. This takes less resources than a complete IFrame and is much easier to handle.

Stefan
          Reply  16 Aug 2022, Konstantin Olchanski, Bug Report, firefox hangs due to mhistory 
> > > Firefox is hanging/becoming unresponsive due to javascript code.
> 
> The URL (reachable only within PSI) is http://lem03.psi.ch:8081/?cmd=custom&page=Mudas

so malfunction is not in the midas history page, but in a custom page. I could help you debug it,
but you would have to provide the complete source code (javascript and html).

> Firefox is version 91.12.0esr (64-bit), but I had similar issues with chrome/chromium too.

my firefox is 103.something. when you say google-chrome has "similar issues",
I read it as "google-chrome does not show this same bug, but shows some other
bug somewhere else". (if I misread you, you have to write better).

but this gives you a front to attack your bugs. basically all browsers should render your
custom page exactly the same (unless you use some obscure or experimental feature, which I
recommend against).

so you tweak your page to identify the source of different rendering results, and try to eliminate it,
hopefully by the time you get your page render exactly the same everywhere, all the real bugs
have gotten shaken out, too. (this is similar to debugging a c++ program by compiling
it on linux, mac, windows, vax, raspbery pi, etc and checking that you get the same result everywhere).

> The hangs seem to happen randomly so I have not been able to reproduce it yet.

I find that javascript debuggers are not setup to debug hangs. I think debugger runs partially
inside the same javascript engine you are debugging, so both hang and debugging is impossible.

(latest google-chrome has another improvement, all pages from the same computer run in the same
javascript engine, so if one midas page stops (on exception or because I debug it), all midas pages
stop and I have to run two different browsers if I want to debug (i.e.) a history page crash
and look at odb at the same time. fun).

K.O.
Entry  05 Aug 2022, Stefan Ritt, Info, Information for midas updates though git 
Several submodules of midas have been re-organized, so if you want to pull the 
newest version, you need a 

git pull --recurse-submodules
git submodule update --init --recursive

before you can build again. To do this automatically the next time, you can do

git config submodule.recurse true

which needs git 2.14 or later. I hope this works for everybody. If there is a 
better way to do that (I'm not a big expert on git) please reply here.

Stefan
    Reply  08 Aug 2022, Konstantin Olchanski, Info, Information for midas updates though git 
> git pull --recurse-submodules
> git submodule update --init --recursive
> git config submodule.recurse true

does not work for me, macos 12.4 git 2.32.1.

after I set "submodule.recurse true", I still have to type "git submodule update --
init --recursive", without --recursive, mscb/mxml is empty and the build bombs.

P.S. the underlying issue is that the mxml submodule is now included twice 
(midas/mxml and midas/mscb/mxml) and there is nothing to enforce that both copies are 
the same. (No idea what happens if the two mxml's are different).
       Reply  08 Aug 2022, Stefan Ritt, Info, Information for midas updates though git 
> after I set "submodule.recurse true", I still have to type "git submodule update --
> init --recursive", without --recursive, mscb/mxml is empty and the build bombs.

Indeed, doesn't work for me either. If some git guru has some more insight, please post 
here!

 
> P.S. the underlying issue is that the mxml submodule is now included twice 
> (midas/mxml and midas/mscb/mxml) and there is nothing to enforce that both copies are 
> the same. (No idea what happens if the two mxml's are different).

The version of each mxml is defined by last commit of the parent repository, which contains 
the hash of the submodule version. If we have to update mxml for some reason, we have to 
commit also mscb with the new version, and then midas with the same version of mxml. If one 
checks out midas then with 

git clone https://bitbucket.org/tmidas/midas --recursive

one gets the same versions for mxml.
Entry  08 Aug 2022, Konstantin Olchanski, Info, odb disallow key names that start or end with spaces 
while testing the new odb editor, we ran into a number of problems with key names 
that start or end with spaces. we cannot think of any valid use case for such key 
names (subdirectories and variables) and we think they could only have been 
created by mistake. ODB now disallows such names. K.O.
Entry  08 Aug 2022, Konstantin Olchanski, Info, midas on ubuntu LTS 22.04 
reporting that as of commit 78f707c0686d22f8329c7a1f1c46d7dccf35ceff, midas builds 
without errors or warnings on Ubuntu LTS 22.04, 20.04, CentOS-7 and MacOS 12.4. 
(except for some warnings from mscb and msc). K.O.
Entry  06 Aug 2022, Stefan Ritt, Info, Improvement of odbxx API 
While the odbxx API has been successfully used since the last months, a potential 
problem with large ODBs surfaced. If you have lots of data in the ODB and load it 
into an object like

midas::odb o("/Equipment");

this might take quite long, since each ODB value is fetched separately, which is 
very quick on a local machine but can take long over a client-server connection. 
For large experiments this can take up to minutes (!).

To get rid of this problem, the underlying object model has been modified. When an 
object is instantiated like above, then the whole ODB tree is fetched in an XML 
buffer in a single transfer, which even for large ODBs usually takes much less 
than a second. Then the XML buffer is decomposed on the client side and converted 
into the proper midas::odb objects. In one case this gave an improvement from 35 
seconds to 0.5 seconds which is significant. To enable the new method, the object 
can be created with a flag like

midas::odb o("/Equipment", true);

which then switches to the new method. One has to take care not to fool oneself 
(like I did) by printing the object like

midas::odb o("/Equipment", true);
std::cout << o << std::endl;

because each read access to any sub-object of o causes a separate read request to 
the server which again can take long. Therefore, one has to switch off the auto 
refresh via

midas::odb o("/Equipment", true);
o.set_auto_refresh_read(false);
std::cout << o << std::endl;

Accessing any sub-object of o then does not cause a client-server request, which 
is not necessary if all objects just have been pulled from the server before. If 
one keeps the object however for a long time in memory, one has to be aware that 
it only contains "old" values from the time if instantiation. If one needs more 
current ODB values, the auto read refresh has to be turned on again.

Stefan
    Reply  08 Aug 2022, Stefan Ritt, Info, Improvement of odbxx API 
After some thought, I changed the API again and removed the flag in the constructor,
so the system now automatically choses the best algorithm depending if the client
is connected to a local or a remote API. So in all cases you use again the old syntax:

midas::odb o("/Equipment");



Stefan
Entry  18 Jul 2022, Konstantin Olchanski, Release, midas-2022-05 
There is a release branch for midas-2022-05 and corresponding git tag midas-2022-
05-b. This branch is known to be stable and is working well for the ALPHA 
experiment at CERN. Latest update to this branch fixes two problems in the 
mserver (rpc timeout and a use-after-free internal error).

https://bitbucket.org/tmidas/midas/branch/release/midas-2022-05
K.O.
Entry  25 Jun 2022, Joseph McKenna, Bug Report, RPC timeout for manalyzer over network 

In ALPHA, I get RPC timeouts running a (reasonably heavy) analyzer on a remote machine (connected directly via a ~30 meter 10Gbe Ethernet cable) after ~5 minutes of running. If I run the analyser locally, I dont not see a timeout...

gdb trace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff5d35859 in __GI_abort () at abort.c:79
#2  0x00005555555a2a22 in rpc_call (routine_id=11111) at /home/alpha/packages/midas/src/midas.cxx:13866
#3  0x000055555562699d in bm_receive_event_rpc (buffer_handle=buffer_handle@entry=2, buf=buf@entry=0x0, buf_size=buf_size@entry=0x0, ppevent=ppevent@entry=0x0, pvec=pvec@entry=0x7fffffffd700,
    timeout_msec=timeout_msec@entry=100) at /home/alpha/packages/midas/src/midas.cxx:10510
#4  0x0000555555631082 in bm_receive_event_vec (buffer_handle=2, pvec=pvec@entry=0x7fffffffd700, timeout_msec=timeout_msec@entry=100) at /home/alpha/packages/midas/src/midas.cxx:10794
#5  0x0000555555673dbb in TMEventBuffer::ReceiveEvent (this=this@entry=0x555557388b30, e=e@entry=0x7fffffffd700, timeout_msec=timeout_msec@entry=100) at /home/alpha/packages/midas/src/tmfe.cxx:312
#6  0x0000555555607b56 in ReceiveEvent (b=0x555557388b30, e=0x7fffffffd6c0, timeout_msec=100) at /home/alpha/packages/midas/manalyzer/manalyzer.cxx:1411
#7  0x000055555560d8dc in ProcessMidasOnlineTmfe (args=..., progname=<optimized out>, hostname=<optimized out>, exptname=<optimized out>, bufname=<optimized out>, event_id=<optimized out>,
    trigger_mask=<optimized out>, sampling_type_string=<optimized out>, num_analyze=0, writer=<optimized out>, multithread=<optimized out>, profiler=<optimized out>,
    queue_interval_check=<optimized out>) at /home/alpha/packages/midas/manalyzer/manalyzer.cxx:1534
#8  0x000055555560f93b in manalyzer_main (argc=<optimized out>, argv=<optimized out>) at /usr/include/c++/9/bits/basic_string.h:2304
#9  0x00007ffff5d37083 in __libc_start_main (main=0x5555555b1130 <main(int, char**)>, argc=8, argv=0x7fffffffdda8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7fffffffdd98) at ../csu/libc-start.c:308
#10 0x00005555555b184e in _start () at /usr/include/c++/9/bits/stl_vector.h:94

Any suggestions? Many thanks
    Reply  18 Jul 2022, Konstantin Olchanski, Bug Report, RPC timeout for manalyzer over network 
> In ALPHA, I get RPC timeouts running a (reasonably heavy) analyzer on a remote machine (connected directly via a ~30 meter 10Gbe Ethernet cable) after ~5 minutes of running. If I run the analyser locally, I dont not see a timeout...

there is a subtle bug in the mserver. under rare conditions, ss_suspend() will recurse in an unexpected way
and mserver will go to sleep waiting for data from a udp socket (that will never arrive, so sleep forever).
remote client will see it as an rpc timeout. in my tests (and in ALPHA-g at CERN, as reported by Joseph),
I see this rare condition to happen about every 5 minutes. in normal use, this is the first time we become
aware of this problem, the best I can tell this bug was in the mserver since day one.

commit https://bitbucket.org/tmidas/midas/commits/fbd06ad9d665b1341bd58b0e28d6625877f3cbd0
to develop and
to release/midas-2022-05

The stack trace that shows the mserver hang/crash (sleep() is the stand-in for the sleep-forever socket read).

(gdb) bt
#0  0x00007f922c53f9e0 in __nanosleep_nocancel () from /lib64/libc.so.6
#1  0x00007f922c53f894 in sleep () from /lib64/libc.so.6
#2  0x0000000000451922 in ss_suspend (millisec=millisec@entry=100, msg=msg@entry=1) at /home/agmini/packages/midas/src/system.cxx:4433
#3  0x0000000000411d53 in bm_wait_for_more_events_locked (pbuf_guard=..., pc=pc@entry=0x7f920639b93c, timeout_msec=timeout_msec@entry=100, 
    unlock_read_cache=unlock_read_cache@entry=1) at /home/agmini/packages/midas/src/midas.cxx:9429
#4  0x00000000004238c3 in bm_fill_read_cache_locked (timeout_msec=100, pbuf_guard=...) at /home/agmini/packages/midas/src/midas.cxx:9003
#5  bm_read_buffer (pbuf=pbuf@entry=0xdf8b50, buffer_handle=buffer_handle@entry=2, bufptr=bufptr@entry=0x0, buf=buf@entry=0x7f9203d75020, 
    buf_size=buf_size@entry=0x7f920639aa20, vecptr=vecptr@entry=0x0, timeout_msec=timeout_msec@entry=100, convert_flags=0, 
    dispatch=dispatch@entry=0) at /home/agmini/packages/midas/src/midas.cxx:10279
#6  0x0000000000424161 in bm_receive_event (buffer_handle=2, destination=0x7f9203d75020, buf_size=0x7f920639aa20, timeout_msec=100)
    at /home/agmini/packages/midas/src/midas.cxx:10649
#7  0x0000000000406ae4 in rpc_server_dispatch (index=11111, prpc_param=0x7ffcad70b7a0) at /home/agmini/packages/midas/progs/mserver.cxx:575
#8  0x000000000041ce9c in rpc_execute (sock=10, buffer=buffer@entry=0xe11570 "g+", convert_flags=0)
    at /home/agmini/packages/midas/src/midas.cxx:15003
#9  0x000000000041d7a5 in rpc_server_receive_rpc (idx=idx@entry=0, sa=0xde6ba0) at /home/agmini/packages/midas/src/midas.cxx:15958
#10 0x0000000000451455 in ss_suspend (millisec=millisec@entry=1000, msg=msg@entry=0) at /home/agmini/packages/midas/src/system.cxx:4575
#11 0x000000000041deb2 in rpc_server_loop () at /home/agmini/packages/midas/src/midas.cxx:15907
#12 0x0000000000405266 in main (argc=9, argv=<optimized out>) at /home/agmini/packages/midas/progs/mserver.cxx:390
(gdb) 

K.O.
Entry  19 Jun 2022, Francesco Renga, Forum, Alarm on variable not updating 
Dear all,
I've an ODB equipment that sometimes loses the connection with the hardware, so that the variables are not updated anymore. The connection can be restored by restarting the frontend. It would be useful to have an alarm based on the time from the last update of some variable (i.e. the alarm is triggered if the variable is not updated for more than X seconds). Is there a method to implement such an alarm in MIDAS?

Thank you very much,
Francesco
    Reply  20 Jun 2022, Stefan Ritt, Forum, Alarm on variable not updating 
There are two functions to do that, one check the last write access, the other the last write access if the run is running. The alarm condition looks like:

access(/Equipment/.../Variables/Input[10]) > 60

which will cause an alarm if the Input[10] is not written for more than 60 seconds. The other function which checks the run status as well is like:

access_running(...odb key...) > 60

You can actually see an example on the MEG alarm page.

Rather than having an alarm for that I would however recommend that you program you frontend such that it realizes if it looses connections, then tries automatically to reconnect or trigger an alarm itself (so-called "internal" alarm). This is also how the MSCB system is working and is much more robust.

Stefan
Entry  20 Jun 2022, jianrun, Bug Report, Error in "midas/src/mana.cxx" 
Dear Midas developers,

When we are running the examples in $MIDASSYS/examples/experiment/, we meet some 
problems when analyzing the results:
1. When we analyze the data using the analyzer: ./analyzer -i run00001.mid -o 
run00001.rz  , we find some bugs: 
"
Root server listening on port 9090...
Running analyzer offline. Stop with "!"
[Analyzer,ERROR] [mana.cxx:1832:bor,ERROR] HBOOK support is not compiled in
[Analyzer,INFO] Set run number 6 in ODB
Load ODB from run 6...OK
run00006.mid:2680  events, 0.00s
"
We think this occurs in the "midas/src/mana.cxx ". How can we solve this?

2. When we analyze the above data, an error also occurs: 
[Analyzer,ERROR] [odb.cxx:847:db_validate_name,ERROR] Invalid name 
"/Analyzer/Tests/Always true/Rate [Hz]" passed to db_create_key_wlocked: should 
not contain "["

We simply fixed that just by replacing the "Rate [Hz]" with "Rate" in the 
test_write in midas/src/mana.cxx 
We are curious whether you can fix the problem permanently in the next version, 
or we are not running the code properly. Thanks!
ELOG V3.1.4-2e1708b5