Back Midas Rome Roody Rootana
  Midas DAQ System, Page 117 of 136  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
ID Date Author Topic Subjectup
  1201   26 Sep 2016 Wes GohnInfomongoose v6.4 is ready for use
Since updating to the most recent midas commit, we get the following error if we try running mhttpd without su privileges: 

>mhttpd -e CR --http 8081
mhttpd is running in setuid-root mode.
mhttpd is listening on port 80
Mongoose version 4 cannot listen to port 80 in setuid mode. Please use mongoose version 6. Sorry, bye!
[mhttpd,ERROR] [midas.c:1960:,ERROR] cm_disconnect_experiment not called at end of program

It works if we run it as root, but that creates other problems. Is there a flag to turn off setuid-root mode? Or some other fix?

Thanks,
Wes
  1202   26 Sep 2016 Konstantin OlchanskiInfomongoose v6.4 is ready for use
> Since updating to the most recent midas commit, we get the following error if we try running mhttpd without su privileges: 
> 
> >mhttpd -e CR --http 8081
> mhttpd is running in setuid-root mode.
> mhttpd is listening on port 80
> Mongoose version 4 cannot listen to port 80 in setuid mode. Please use mongoose version 6. Sorry, bye!
> [mhttpd,ERROR] [midas.c:1960:,ERROR] cm_disconnect_experiment not called at end of program
> 
> It works if we run it as root, but that creates other problems. Is there a flag to turn off setuid-root mode? Or some other fix?
> 


From these messages, it looks like you really are using the setuid-root mode. And indeed it is not usable with the mongoose version 4 implementation in MIDAS.

I can suggest several fixes:

1) the setuid-root mode was only ever intended for use at PSI because of peculiar network configuration of the PSI corporate firewall. It is not intended for general 
use.
1a) I as an author of MIDAS recommend against using the setuid-root mode and against installing mhttpd as setuid-root because it is not secure. (normally you 
would run mhttpd behind an apache https proxy providing https encryption and password protection).
1b) if you follow the midas installation instructions at https://midas.triumf.a you will see that we do not login as root and run "make install" to install mhttpd as 
setuid-root.
1c) if you follow these instructions, or if you run mhttpd from the midas build directory ($MIDASSYS/linux/bin/mhttpd), the setuid-root mode will not activate and 
everything will work ok.

2) you can run in the "old server" mode, but this more does not implement the JSON-RPC methods, so the "programs" and "alarms" pages will not work.
3) you can build mhttpd with the mongoose version 6 implementation, it will work even with the setuid-root mode. To do this, edit the Makefile, comment-out 
"USE_MONGOOSE4=1" and uncomment "USE_MONGOOSE6=1", then make clean, make.

K.O.
  142   26 Jul 2003 Konstantin Olchanski more ODB checks in src/odb.c
Add more checks to db_validate_key() for pkey->total_size, item_size and
num_values. Automatically correct total_size to be item_size*num_values (we
saw this corruption and tested this fix).

K.O.

For your enjoyment, here is the diff:

RCS file: /usr/local/cvsroot/midas/src/odb.c,v
retrieving revision 1.64
diff -r1.64 odb.c
718a719,744
>   /* check key sizes */
>   if ((pkey->total_size < 0)||(pkey->total_size > pheader->key_size))
>     {
>     cm_msg(MERROR, "db_validate_key", "Warning: invalid key \"%s\"
total_size: %d", path, pkey->total_size);
>     return 0;
>     }
> 
>   if ((pkey->item_size < 0)||(pkey->item_size > pheader->key_size))
>     {
>     cm_msg(MERROR, "db_validate_key", "Warning: invalid key \"%s\"
item_size: %d", path, pkey->item_size);
>     return 0;
>     }
> 
>   if ((pkey->num_values < 0)||(pkey->num_values > pheader->key_size))
>     {
>     cm_msg(MERROR, "db_validate_key", "Warning: invalid key \"%s\"
num_values: %d", path, pkey->num_values);
>     return 0;
>     }
> 
>   /* check and correct key size */
>   if (pkey->total_size != pkey->item_size*pkey->num_values)
>     {
>     cm_msg(MINFO,  "db_validate_key", "Warning: corrected key \"%s\" size:
total_size=%d, should be %d*%d=%d", path, pkey->total_size, pkey->item_size,
pkey->num_values, pkey
->item_size*pkey->num_values);
>     pkey->total_size = pkey->item_size*pkey->num_values;
>     }
> 
  108   01 Nov 2003 Stefan Ritt more odb
> I added error checking to the places where we read "/runinfo/run number". In
> general, I do this:

> Affected files:
> src/lazylogger.c
> src/odbedit.c
> src/mlogger.c
> src/mfe.c
> src/odb.c
> src/mana.c
> src/midas.c
> src/mhttpd.c

Now YOU broke the system by editing all these files with something I consider 
temporary debugging code. A run number of zero is *VALILD*. If I want to make 
sure a new experiment starts with run number #1, I put a run number of 0 into 
the ODB. So on the first start the number is incremented by one which results 
in run number from one. So please remove those checks which prevents me of 
doing that. Again, your "run number zero" problem is soemhow specific to your 
environment, and I would not put all these tests into the distribution, 
because this can have side effects, like that one I described above.

- Stefan
  109   01 Nov 2003 Konstantin Olchanski more odb
> > I added error checking to the places where we read "/runinfo/run number". 
> Now YOU broke the system by editing all these files with something I consider 
> temporary debugging code. A run number of zero is *VALILD*.

I think I broke nothing. I do know that run number 0 is a valid odb value. Here
is an audit of all places where I abort on invalid run numbers:

mana.c: line 3676: assert(current_run_number > 0);
we take the run number from an event and write it into ODB. Events cannot have
run number negative or zero.

mana.c:analyze_run(): line 4632: assert(run_number > 0);
we are asked to analyze run "run_number". zero or negative is not valid.

midas.c:assert(run_number > old_run_number);
midas.c:assert(run_number > 1);
this code is not in CVS.

odbedit.c: line 2563: assert(old_run_number >= 0);
run number zero is valid

odbedit.c: line 2641: assert(new_run_number > 0);
starting a new run number zero is not valid

mfe.c: line 1786: if (run_number<=0) cm_msg(MERROR, "main", "aborting on attempt
to use invalid run number %d", run_number);
auto restart from run 0 to 1 is not valid

midas.c: line 3917: if (run_number<=0) cm_msg(MERROR, "cm_transition", "aborting
on attempt to use invalid run number %d",run_number);
transition to run zero or negative is not valid

midas.c: line 16101: if (run_number<0) cm_msg(MERROR, "el_submit", "aborting on
attempt to use invalid run number %d", run_number);
negative run numbers are not valid

mlogger.c: line 3301: if (run_number<=0) cm_msg(MERROR, "main", "aborting on
attempt to use invalid run number %d", run_number);
auto restart from run 0 to run 1 is not valid

K.O.
  110   14 Nov 2003 Stefan Ritt more odb
Ok, I apologize. It's all ok. Thanks for clearifying. Concerning the assert's, it 
would be nice to be able to disable them in release code. Under Windows, the 
assert() is actually a macro which expands to zero if NDEBUG is defined. I 
believe it's the same under linux, but I don't know about VxWorks. So we have 
three options:

1) Keep asserts always. This might possible slow down a DAQ system, but I'm not 
sure how much. Might be negligible.

2) Disable asserts by default (standard make). Only the "experts" can enable it 
in the make file (by removing NDEBUG), since only they know what to do with the 
assertation messages.

3) Let the user decide on the standard installation. Maybe have two libraries, 
one debug, one no-debug. The no-debug can even have the compiler optimization 
disabled, which makes debugging easier.

So what is your opinion (comments from others are welcome as well) of which way 
to go? 
  107   31 Oct 2003 Konstantin Olchanski more odb "run number" error checking
I added error checking to the places where we read "/runinfo/run number". In
general, I do this:

  status = db_get_value("/runinfo/run number",&run_number);
  assert(status==SUCCESS);
  assert(run_number >= 0); (and run_number>0, where appropriate)

Here is the rationale: if we cannot read the run number, something must be
very terribly wrong. I cannot think of any recovery action other than
abort() and make a core dump for our debugging enjoyment.

I considered and rejected adding a "retry" loop: if we allow db_get_value()
to intermittently fail, then it's every use has to be wrapped in a retry
loop, which then should be inside db_get_value(), making it pointless to
have external "retry" loops.

I am now pondering on proposing a "db_get_value_cannot_possibly_fail()"
function (it would abort(), exit() with an error or commit harakiri if it
can't get the value). They way most db_xxx() functions are used in midas,
maybe they should be made "void" and "unfailible", with "STATUS
db_xxx_yes_I_can_fail_and_return_an_error_code()" evil twins. I guess this
is why "they" invented C/C++ exceptions. Anyway, something to think about.

Affected files:
src/lazylogger.c
src/odbedit.c
src/mlogger.c
src/mfe.c
src/odb.c
src/mana.c
src/midas.c
src/mhttpd.c

K.O.
  2045   30 Nov 2020 Konstantin OlchanskiInfomore wisdom from linux kernel people
As you may know, I am a big fan of two software projects - the linux kernel and ROOT. The linux kernel is one of 
the few software projects "done right". ROOT is where normal people try to "get it right" with real-world level 
of success. I use both softwares daily and I try to apply their ways and methods to MIDAS as much as I can.

So just in time for our discussion of array indexes, a talk by gregkh shows
up on slashdot. The title is "how to keep your users happy". (Nobody
ever wants to be nasty to their users, but do read his talk).

https://git.sr.ht/~gregkh/presentation-application_summit/tree/main/keep_users_happy.pdf

The talk refers to some older stuff, still relevant, of course, in case you miss the links
in the pdf file, here they are:

https://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html
https://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html
https://ozlabs.org/~rusty/ols-2003-keynote/img0.html (click on "continue" to see next page)

K.O.
  367   09 Apr 2007 Konstantin OlchanskiInfomove history, elog and alarm functions into separate files
As approved by Stefan, I moved the history (hs_xxx), alarm (al_xxx) and elog (el_xxx) functions out of 
midas.c into separate files. Commited as revision 3665. This change should be transparent to all users. 
K.O.
  513   22 Oct 2008 Konstantin OlchanskiInfomscb timeouts and retries
A new set of functions was added to mscb.h to adjust mscb timeouts and retries to better match specific 
applications:

+   int EXPRT mscb_get_max_retry();
+   int EXPRT mscb_set_max_retry(int max_retry);
+   int EXPRT mscb_get_usb_timeout();
+   int EXPRT mscb_set_usb_timeout(int timeout);
+   int EXPRT mscb_get_eth_max_retry();
+   int EXPRT mscb_set_eth_max_retry(int eth_max_retry);

There are 3 settings:

1) mscb_max_retry: most (all?) mscb operations, like mscb_read(), retry failed mscb transactions up to 
10 times. The corresponding set and get functions allow tuning this retry limit.

2) mscb_usb_timeout: the driver for the USB-MSCB adapter uses a timeout of 6 seconds. 
mscb_set_usb_timeout() permits changing this value.

3) mscb_eth_max_retry: the driver for the Ethernet-MSCB adapter has to deal with UDP packet loss. If 
the adapter does not respond to a UDP command, the UDP command is sent again, with a bigger 
timeout (timeout = 100 * (retry+1), in ms), this is repeated up to 10 times. mscb_set_eth_max_retry() 
permits adjusting this number of retries.

This is how it works for the usb interface:

int mscb_read(...)
   for (retry=0; retry<mscb_max_retry; retry++)
       mscb_exch()
            musb_write(..., mscb_usb_timeout)
            musb_read(..., mscb_usb_timeout)     

This is how it works for the ethernet interface:

int mscb_read(...)
   for (retry=0; retry<mscb_max_retry; retry++)
       mscb_exch()
            for (retry=0; retry<mscb_eth_max_retry; retry++)
                 send_udp_command()
                 wait_for_udp_response(timeout = 100 * (retry+1))

This is how the new functions are intended to be used:
   ...
   int old = mscb_set_max_retry(2);
   ... do stuff ...
   mscb_set_max_retry(old); // restore default value

svn revision 4356.
K.O.
  519   28 Oct 2008 Stefan RittInfomscb timeouts and retries
> A new set of functions was added to mscb.h to adjust mscb timeouts and retries to better match specific 
> applications:
> 
> +   int EXPRT mscb_get_max_retry();
> +   int EXPRT mscb_set_max_retry(int max_retry);
> +   int EXPRT mscb_get_usb_timeout();
> +   int EXPRT mscb_set_usb_timeout(int timeout);
> +   int EXPRT mscb_get_eth_max_retry();
> +   int EXPRT mscb_set_eth_max_retry(int eth_max_retry);

In the spirit of this, a variable retry scheme has been implemented in the mscbdev.c device driver. At the 
MEG experiment, we have one mscb device which is pretty slow, while the others are fast. Therefore it is 
necessary to have a per-device max retry count which can be different for different submasters. I moved 
therefore the max_eth_retry variable into the mscb_fd structure and adjusted a few functions accordingly. I 
did not bother with the other timeouts and retries, since I don't need this for the moment, but it would be 
nice if they would be handled in the same way. Then I added code into mscbdev.c to read the retry variable 
form the ODB under /Equipment/<name>/Settings/Device/<Name>/Retries. The default is 10, but it can be 
changed and becomes valid after the program has been restarted. 
  150   03 Oct 2004 Konstantin OlchanskiInfomscb usb support for macosx
After a felicitous confuence of stellar bodies (Stefan, myself, some mscb hardware
and a mac laptop all in the same room for a few days), I wrote some MacOSX
code to support the MSCB-USB dongle using the native IoKit USB API. During testing,
I was able to communicate with an MSCB High voltage regulator module. I am now
commiting this code to CVS, warts and all (we can clean it up when somebody actually
uses it). Tested compilation on Linux (with libusb) and MacOSX (native
IoKit. MacOSX+libusb is possible but untested), Win32 should be unaffected by my changes,
but I could not test it.
K.O.
  391   29 Jun 2007 Konstantin OlchanskiBug Fixmscb, musbstd fixed on Linux, MacOS
I commited a few minor changes to musbstd and mscb code to make them work on
MacOSX (tested on 10.3.9) and Linux (tested on Fedora 6).

The basic functions work with the MSCB USB master, but I still need to
investigate some cases where the connection hangs and usb communications do not
work until the USB cable is unplugged and plugged back in. I see this problem
both on MacOS and Linux.

Important changes:
1) mscb_select_device() does not work on both Linux and MacOS and is disabled.
Please run "msc -d usb0".
2) on Linux, the Makefile should define -DOS_LINUX and -DHAVE_LIBUSB;
   on MacOS, the Makefile should define -DOS_LINUX and -DOS_DARWIN. (This is
because MacOS is treated as a funny type of Linux).
3) when doing USB communications, one has to use the correct endpoint numbers,
which seem to be system dependant and for now, I hard code them in mscb.c for
the tested systems.

There supposed to be no changes to the Windows code, but I cannot test on
Windows, so if somebody does and finds breakage, please let me know.

K.O.
  392   02 Jul 2007 Stefan RittBug Fixmscb, musbstd fixed on Linux, MacOS

KO wrote:
There supposed to be no changes to the Windows code, but I cannot test on Windows, so if somebody does and finds breakage, please let me know.


I can confirm that revision 3713 still works under Windows.
  394   06 Jul 2007 Konstantin OlchanskiBug Fixmscb, musbstd fixed on Linux, MacOS
> I commited a few minor changes to musbstd and mscb code...
>
> The basic functions work with the MSCB USB master, but I still need to
> investigate some cases where the connection hangs and usb communications do not
> work until the USB cable is unplugged and plugged back in. I see this problem
> both on MacOS and Linux.

I think I fixed the hangs we see on linux and macos - at the end all I had to do is
issue a usb reset to make mscb communicate again.

Also tested on Linux FC6 and SL4.5.

K.O.
  1173   30 Mar 2016 Belina von KrosigkForummserver ERR message saying data area 100% full, though it is free
Hi,

I have just installed Midas and set-up the ODB for a SuperCDMS test-facility (on
a SL6.7 machine). All works fine except that I receive the following error message:

[mserver,ERROR] [odb.c:944:db_validate_db,ERROR] Warning: database data area is
100% full

Which is puzzling for the following reason:

-> I have created the ODB with: odbedit -s 4194304
-> Checking the size of the .ODB.SHM it says: 4.2M
-> When I save the ODB as .xml and check the file's size it says: 1.1M
-> When I start odbedit and check the memory usage issuing 'mem', it says: 
...
Free Key area: 1982136 bytes out of 2097152 bytes
...
Free Data area: 2020072 bytes out of 2097152 bytes
Free: 1982136 (94.5%) keylist, 2020072 (96.3%) data

So it seems like nearly all memory is still free. As a test I created more
instances of one of our front-ends and checked 'mem' again. As expected the free
memory was decreasing. I did this ten times in fact, reaching

...
Free Key area: 1440976 bytes out of 2097152 bytes
...
Free Data area: 1861264 bytes out of 2097152 bytes
Free: 1440976 (68.7%) keylist, 1861264 (88.8%) data

So I could use another >20% of the database data area, which is according to the
error message 100% (resp. >95%) full. Am I misunderstanding the error message?
I'd appreciate any comments or ideas on that subject!

Thanks, Belina
  2718   26 Feb 2024 Maia Henriksson-WardForummserver ERR message saying data area 100% full, though it is free
> Hi,
> 
> I have just installed Midas and set-up the ODB for a SuperCDMS test-facility (on
> a SL6.7 machine). All works fine except that I receive the following error message:
> 
> [mserver,ERROR] [odb.c:944:db_validate_db,ERROR] Warning: database data area is
> 100% full
> 
> Which is puzzling for the following reason:
> 
> -> I have created the ODB with: odbedit -s 4194304
> -> Checking the size of the .ODB.SHM it says: 4.2M
> -> When I save the ODB as .xml and check the file's size it says: 1.1M
> -> When I start odbedit and check the memory usage issuing 'mem', it says: 
> ...
> Free Key area: 1982136 bytes out of 2097152 bytes
> ...
> Free Data area: 2020072 bytes out of 2097152 bytes
> Free: 1982136 (94.5%) keylist, 2020072 (96.3%) data
> 
> So it seems like nearly all memory is still free. As a test I created more
> instances of one of our front-ends and checked 'mem' again. As expected the free
> memory was decreasing. I did this ten times in fact, reaching
> 
> ...
> Free Key area: 1440976 bytes out of 2097152 bytes
> ...
> Free Data area: 1861264 bytes out of 2097152 bytes
> Free: 1440976 (68.7%) keylist, 1861264 (88.8%) data
> 
> So I could use another >20% of the database data area, which is according to the
> error message 100% (resp. >95%) full. Am I misunderstanding the error message?
> I'd appreciate any comments or ideas on that subject!
> 
> Thanks, Belina

This is an old post, but I encountered the same error message recently and was looking for a 
solution here. Here's how I solved it, for anyone else who finds this: 
The size of .ODB.SHM was bigger than the maximum ODB size (4.2M > 4194304 in Belina's case). For us, 
the very large odb size was in error and I suspect it happened because we forgot to shut down midas 
cleanly before shutting the computer down. Using odbedit to load a previously saved copy of the ODB 
did not help me to get .ODB.SHM back to a normal size. Following the instructions on the wiki for 
recovery from a corrupted odb, 
https://daq00.triumf.ca/MidasWiki/index.php/FAQ#How_to_recover_from_a_corrupted_ODB, (odbinit with --cleanup option) should 
work, but didn't for me. Unfortunately I didn't save the output to figure out why. My solution was to manually delete/move/hide 
the .ODB.SHM file, and an equally large file called .ODB.SHM.1701109528, then run odbedit again and reload that same saved copy of my ODB. 
Manually changing files used by mserver is risky - for anyone who has the same problem, I suggest trying odbinit --cleanup -s 
<yoursize> first.
  2543   21 Jun 2023 Gennaro TortoneBug Reportmserver and script execution
Hi,
I have the following setup:

- MIDAS release: release/midas-2022-05-c
- host with MIDAS frontend (mclient)
- host with MIDAS server (mhttpd / mserver)

On mclient I run a frontend with:

./feodt5751 -h mserver -e develop -i 0

On mserver I see frontend ready and ODB variables in place;

I noticed a strange behavior with "/Programs/Execute on start run" and 
"/Programs/Execute on stop run". In details the script to execute at start of run
is executed on "mserver" host but the script to execute at stop of run is executed on
"mclient" host (!)

Is this a bug or I'm missing some documentation links ?

Thanks in advance,
Gennaro
  2550   26 Jun 2023 Stefan RittBug Reportmserver and script execution
Indeed that could well be (and is certainly not intended like that). I checked the code
and found that "execute on start run" and "execute on stop run" are called inside
cm_transition(). That means they are executed on the computer which calls cm_transition().
If you use mhttpd and start a run through the web interface, then mhttpd runs on your
server and "execute on start run" gets executed on your server. If you stop the run
by your frontend running on the client machine (like if a certain number of events 
is reached), then "execute on stop run" gets executed on your client.

An easy way around would not to use "/Equipment/Trigger/Common/Event limit" which
gets check by your frontend and therefore on the client computer, but use 
"/Logger/Channels/0/Settings/Event limit" which gets checked by the logger and
therefore executed on the server computer.

Getting a consistent behaviour (like always executing scripts on the server) would
require a major rework of the run transition framework with probably many undesired
side-effects, so lots of debugging work.

Stefan
  2551   27 Jun 2023 Gennaro TortoneBug Reportmserver and script execution
Hi Stefan,

> Indeed that could well be (and is certainly not intended like that). I checked the code
> and found that "execute on start run" and "execute on stop run" are called inside
> cm_transition(). That means they are executed on the computer which calls cm_transition().
> If you use mhttpd and start a run through the web interface, then mhttpd runs on your
> server and "execute on start run" gets executed on your server. If you stop the run
> by your frontend running on the client machine (like if a certain number of events 
> is reached), then "execute on stop run" gets executed on your client.

ok, this is clear to me...

> An easy way around would not to use "/Equipment/Trigger/Common/Event limit" which
> gets check by your frontend and therefore on the client computer, but use 
> "/Logger/Channels/0/Settings/Event limit" which gets checked by the logger and
> therefore executed on the server computer.

we never used "/Equipment/Trigger/Common/Event limit" but we always used
"/Logger/Channels/0/Settings/Event limit"...

btw I did some tests and I understand that this issue is related to 'deferred transition'
on frontend. Indeed I disabled deferred transition on frontend side and now script
execution is carried out always on MIDAS server;

Cheers,
Gennaro
ELOG V3.1.4-2e1708b5