Back Midas Rome Roody Rootana
  Midas DAQ System, Page 133 of 138  Not logged in ELOG logo
    Reply  30 Nov 2018, Stefan Ritt, Info, status of self-signed https certificates 
> In the mean time, we continue to recommend that mhttpd should be used behind a password protected https proxy (i.e. apache 
> httpd, etc).

I guess this is what moste people do anyhow these days. Do I understand correctly that this then rules out the usage of letsencrype certificates, since the 
host needs to be accessed from outside, which is not possible if running behind a password protected firewall.

Stefan
    Reply  03 Dec 2018, Konstantin Olchanski, Info, status of self-signed https certificates 
> > In the mean time, we continue to recommend that mhttpd should be used behind a password protected https proxy (i.e. apache 
> > httpd, etc).
> 
> I guess this is what moste people do anyhow these days. Do I understand correctly that this then rules out the usage of letsencrype certificates, since the 
> host needs to be accessed from outside, which is not possible if running behind a password protected firewall.
> 
> Stefan

Careful, firewall != proxy, very different things.

A firewall prevents network communications, period. (Like fences and locked doors, there are good reasons to have them).

An https proxy is a way to have encrypted (protected) web communications with a machine behind a firewall.

Basically, we have 4 main cases, all with trouble.

1) mhttpd running on localhost, "just for testing", is in trouble. there is no simple way to get a "blessed" certificate, and self-signed certificates are now "almost forbidden". http is "okey 
for now", but the writing is on the wall. There is no special exception for "local-only" connections.

2a) mhttpd running on an internet-connected machine, with apache httpd, our best case. To get this working one has to configure both apache httpd and the "blessed certificate" 
certbot tool. With luck, both tools work smoothly on current OSes (they do NOT).

2b) same, but without apache httpd. One still has to run certbot, and the "glue" between mhttpd and certbot is currently missing: need a way to point mhttpd to the certbot certificate 
files and a way to reload mhttpd when the certificate is auto-renewed.

3) mhttpd running on a machine behind a corporate firewall. worst case. if firewall Gods make an opening for ports 80 and 443, it becomes case (2a/b), otherwise, one must use some 
kind of https proxy. (Plus there is no trivial way to setup an encrypted secure communication channel between mhttpd and this proxy, a double bad).

K.O.

P.S. I guess one can use nginx as the https proxy instead of apache httpd. I did not try yet. My impression is that everybody uses nginx, except for people who started with apache httpd 
and are too lazy to try nginx.

K.O.
    Reply  10 Jun 2019, Konstantin Olchanski, Info, status of self-signed https certificates 
> > > In the mean time, we continue to recommend that mhttpd should be used behind a password protected https proxy (i.e. apache 
> > > httpd, etc).

There we go. google-chrome 74 refuses to connect to mhttpd configured with a self-signed certificate generated per instructions printed by mhttpd.

Here is the full error text (there is no button to "let me connect to it anyway"):

Your connection is not private
Attackers might be trying to steal your information from musr03.triumf.ca (for example, passwords, messages, or credit cards). Learn more
NET::ERR_CERT_AUTHORITY_INVALID
 
Help improve Safe Browsing by sending some system information and page content to Google. Privacy policy
musr03.triumf.ca normally uses encryption to protect your information. When Google Chrome tried to connect to musr03.triumf.ca this time, the website sent back unusual and incorrect credentials. This may happen when an 
attacker is trying to pretend to be musr03.triumf.ca, or a Wi-Fi sign-in screen has interrupted the connection. Your information is still secure because Google Chrome stopped the connection before any data was exchanged.

You cannot visit musr03.triumf.ca right now because the website uses HSTS. Network errors and attacks are usually temporary, so this page will probably work later.
    Reply  13 Jan 2020, Konstantin Olchanski, Info, status of self-signed https certificates 
Now firefox returns the same error. version 72.0.1.

> daqlabpc.triumf.ca has a security policy called HTTP Strict Transport Security (HSTS), which means that Firefox can only connect to it securely. You can’t add an exception to visit this site.
> Error code: MOZILLA_PKIX_ERROR_SELF_SIGNED_CERT

I think the problem is with HSTS. I enabled HSTS (in mhttpd and in apache httpd) because
SSLlabs encourage it and because my reading of it's description at
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Strict-Transport-Security
makes it sound like a good idea without any downsides.

However, the actual HSTS RFC says something completely different:
https://tools.ietf.org/html/rfc6797

"The aim is to prevent click-through insecurity and address other potential threats".

To me this explains what I see. In contrast to the description at developer.mozilla.org,
firefox (and google chrome) disable "click-through" exceptions for "I do not like this https certificate",
and there is no way to connect to self-signed https.

Bottom line, either use certbot to get blessed https certificate or no https for you.

K.O.


> > > > In the mean time, we continue to recommend that mhttpd should be used behind a password protected https proxy (i.e. apache 
> > > > httpd, etc).
> 
> There we go. google-chrome 74 refuses to connect to mhttpd configured with a self-signed certificate generated per instructions printed by mhttpd.
> 
> Here is the full error text (there is no button to "let me connect to it anyway"):
> 
> Your connection is not private
> Attackers might be trying to steal your information from musr03.triumf.ca (for example, passwords, messages, or credit cards). Learn more
> NET::ERR_CERT_AUTHORITY_INVALID
>  
> Help improve Safe Browsing by sending some system information and page content to Google. Privacy policy
> musr03.triumf.ca normally uses encryption to protect your information. When Google Chrome tried to connect to musr03.triumf.ca this time, the website sent back unusual and incorrect credentials. This may happen when an 
> attacker is trying to pretend to be musr03.triumf.ca, or a Wi-Fi sign-in screen has interrupted the connection. Your information is still secure because Google Chrome stopped the connection before any data was exchanged.
> 
> You cannot visit musr03.triumf.ca right now because the website uses HSTS. Network errors and attacks are usually temporary, so this page will probably work later.
Entry  04 Jun 2020, Lukas Gerritzen, , stime() deprecated in glibc 2.31 
In glibc 2.31, the stime function was deprecated:

* The obsolete function stime is no longer available to newly linked
  binaries, and its declaration has been removed from <time.h>.
  Programs that set the system time should use clock_settime instead.

https://sourceware.org/legacy-ml/libc-announce/2020/msg00001.html

This creates a problem in src/system.cxx:3197:4
Entry  13 Apr 2017, Andreas Suter, Bug Report, stop form odbedit broken 
when I try to stop a run from odbedit I get a core dump.

[ODBEdit1,INFO] Run #31 stopped odbedit: src/system.c:1223: ss_shm_flush:
Assertion `size == mmap_size[handle]' failed. Aborted (core dumped)

midas commit 53af92a5d0...

-----

I checked what happens if I try to stop a run via the mhttpd web-page: this
works! So what is different?

-----

I placed a issue (# 47) on bitbucket as well.

What is the preferred channel to report potential bugs (elog / bitbucket issues)? 
    Reply  13 Apr 2017, Andreas Suter, Bug Report, stop form odbedit broken 
> when I try to stop a run from odbedit I get a core dump.
> 
> [ODBEdit1,INFO] Run #31 stopped odbedit: src/system.c:1223: ss_shm_flush:
> Assertion `size == mmap_size[handle]' failed. Aborted (core dumped)
> 
> midas commit 53af92a5d0...
> 
> -----
> 
> I checked what happens if I try to stop a run via the mhttpd web-page: this
> works! So what is different?
> 
> -----
> 
> I placed a issue (# 47) on bitbucket as well.
> 
> What is the preferred channel to report potential bugs (elog / bitbucket issues)? 

I think I found the problem. Some ODB String values which are **automatically**
generated:

CSS File = STRING : [1024] mhttpd.css
Sqlite dir = STRING : [1024]
History dir = STRING : [1024]
Sound = STRING : [1000] alarm.mp3

are exceeding the MAX_STRING_LENGTH 256 (defined in msystem.h)

It looks as if this screws up quite a bit of the system! When deleting .ODB.SHM and
afterwards try to reload the ODB via a dump I previously made with odbedit, the
following is happening:

1) I get the error message that some strings are too long (exceeding
MAX_STRING_LENGTH). Unfortunately the underlying routine doesn't tell which ODB
variables this is.

2) After this reload, essentially nothing is working anymore. Any client I tried to
start just crashed.

Since it seems that the string length of MAX_STRING_LENGTH is very crucial I would
suggest that db_create_record (or whatever routine is dealing with it) checks for
STRING variables and ensures that they cannot exceed MAX_STRING_LENGTH.

When I shortened in my dump the above variables to MAX_STRING_LENGTH, regenerated the
ODB, also the 'stop' Problem in odbedit is gone.
    Reply  15 Apr 2017, Konstantin Olchanski, Bug Report, stop form odbedit broken 
> > when I try to stop a run from odbedit I get a core dump.
> > 
> > [ODBEdit1,INFO] Run #31 stopped odbedit: src/system.c:1223: ss_shm_flush:
> > Assertion `size == mmap_size[handle]' failed. Aborted (core dumped)
> > 


I am quite puzzled by this situation. We have seen the above error before, tried to track it down, failed. I was 
always thinking this is some kind of strange size mismatch between odb size and shared memory size and 
shared memory save file odb.shm size.

Now with your information, it looks like it is memory corruption.

I always thought there is no length limit to odb strings, except for the odb api problem where you have to 
know the maximum string length for db_get_value() & co otherwise long strings will be corrupted. Today
nobody uses fixed size buffers, either db_get_value() allocates the string of correct size (replacing buffer
overflow errors with memory leak errors), or return std::string.

I shall check on the use of MAX_STRING_SIZE at least in odb itself...

The default value 256 seems to be too small for today's use. (if you want to store json data, web page 
fragments, etc).


K.O.




> > midas commit 53af92a5d0...
> > 
> > -----
> > 
> > I checked what happens if I try to stop a run via the mhttpd web-page: this
> > works! So what is different?
> > 
> > -----
> > 
> > I placed a issue (# 47) on bitbucket as well.
> > 
> > What is the preferred channel to report potential bugs (elog / bitbucket issues)? 
> 
> I think I found the problem. Some ODB String values which are **automatically**
> generated:
> 
> CSS File = STRING : [1024] mhttpd.css
> Sqlite dir = STRING : [1024]
> History dir = STRING : [1024]
> Sound = STRING : [1000] alarm.mp3
> 
> are exceeding the MAX_STRING_LENGTH 256 (defined in msystem.h)
> 
> It looks as if this screws up quite a bit of the system! When deleting .ODB.SHM and
> afterwards try to reload the ODB via a dump I previously made with odbedit, the
> following is happening:
> 
> 1) I get the error message that some strings are too long (exceeding
> MAX_STRING_LENGTH). Unfortunately the underlying routine doesn't tell which ODB
> variables this is.
> 
> 2) After this reload, essentially nothing is working anymore. Any client I tried to
> start just crashed.
> 
> Since it seems that the string length of MAX_STRING_LENGTH is very crucial I would
> suggest that db_create_record (or whatever routine is dealing with it) checks for
> STRING variables and ensures that they cannot exceed MAX_STRING_LENGTH.
> 
> When I shortened in my dump the above variables to MAX_STRING_LENGTH, regenerated the
> ODB, also the 'stop' Problem in odbedit is gone.
    Reply  15 Apr 2017, Konstantin Olchanski, Bug Report, stop form odbedit broken 
> when I try to stop a run from odbedit I get a core dump.
> [ODBEdit1,INFO] Run #31 stopped odbedit: src/system.c:1223: ss_shm_flush:
> Assertion `size == mmap_size[handle]' failed. Aborted (core dumped)
> 

I am puzzled. The crash is at the very end of everything (save odb shared memory to odb.shm),
does the run actually stop, or the crash is before the run is fully stopped? (I guess if you want
to run more odbedit commands after stopping the run, so you care about not crashing).

K.O.
    Reply  24 Apr 2017, Stefan Ritt, Bug Report, stop form odbedit broken 
> CSS File = STRING : [1024] mhttpd.css
> Sqlite dir = STRING : [1024]
> History dir = STRING : [1024]
> Sound = STRING : [1000] alarm.mp3

After a quick discussion with Konstantin, I changed these strings to a length of 256 chars 
(MAX_STRING_LENGTH). Actually all changes I had to made was on code introduced by KO, so I hope I 
did everything correctly. He should carefully check my changes (actually I would have preferred if he 
could change his code himself...).

I agree with KO that the preferred format for saving the ODB should be JSON, but there might be 
experiments with have some old ODB dumps in other formats, so we should not remove the possibility to 
read those formats back.

Stefan
Entry  23 Oct 2008, Konstantin Olchanski, Bug Report, strange output from "odbedit cleanup" 
When I run odbedit remotely (odbedit -h ladd09), the "cleanup" command unexpectedly produces the 
output of the "sor" command (sure enough, there is a call to db_get_open_records() there), but when I run 
it locally, I do not get this output (but db_get_open_records() is still called). Strange. K.O.
    Reply  28 Oct 2008, Stefan Ritt, Bug Report, strange output from "odbedit cleanup" 
> When I run odbedit remotely (odbedit -h ladd09), the "cleanup" command unexpectedly produces the 
> output of the "sor" command (sure enough, there is a call to db_get_open_records() there), but when I run 
> it locally, I do not get this output (but db_get_open_records() is still called). Strange. K.O.

The db_get_open_records() call was by mistake there, I removed it. What remains is that the notification 
message if a client is removed from the ODB goes through the system messages. When running locally, odbedit 
echoes it's own messages, but when running remotely, this is not the case. So the messages can be seen by 
everybody else (plus it ends up in the message file), but not by the remote odbedit where the cleanup is 
started. The quick fix for that is to say "old" in odbedit which shows the last few lines of the message 
file, so one can see any successful cleanup. 
Entry  03 Feb 2024, Pavel Murat, Bug Report, string --> int64 conversion in the python interface ? 
Dear MIDAS experts,

I gave a try to the MIDAS python interface and ran all tests available in midas/python/tests.
Two Int64 tests from test_odb.py had failed (see below), everthong else - succeeded

I'm using a ~ 2.5 weeks-old commit and python 3.9 on SL7 Linux platform.

commit c19b4e696400ee437d8790b7d3819051f66da62d (HEAD -> develop, origin/develop, origin/HEAD)
Author: Zaher Salman <zaher.salman@gmail.com>
Date:   Sun Jan 14 13:18:48 2024 +0100

The symptoms are consistent with a string --> int64 conversion not happening 
where it is needed. 

Perhaps the issue have already been fixed? 

-- many thanks, regards, Pasha
-------------------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mu2etrk/test_stand/pasha_020/midas/python/tests/test_odb.py", line 178, in testInt64
    self.set_and_readback_from_parent_dir("/pytest", "int64_2", [123, 40000000000000000], midas.TID_INT64, True)
  File "/home/mu2etrk/test_stand/pasha_020/midas/python/tests/test_odb.py", line 130, in set_and_readback_from_parent_dir
    self.validate_readback(value, retval[key_name], expected_key_type)
  File "/home/mu2etrk/test_stand/pasha_020/midas/python/tests/test_odb.py", line 87, in validate_readback
    self.assert_equal(val, retval[i], expected_key_type)
  File "/home/mu2etrk/test_stand/pasha_020/midas/python/tests/test_odb.py", line 60, in assert_equal
    self.assertEqual(val1, val2)
AssertionError: 123 != '123'

with the test on line 178 commented out, the test on the next line fails in a similar way:
        
Traceback (most recent call last):
  File "/home/mu2etrk/test_stand/pasha_020/midas/python/tests/test_odb.py", line 179, in testInt64
    self.set_and_readback_from_parent_dir("/pytest", "int64_2", 37, midas.TID_INT64, True)
  File "/home/mu2etrk/test_stand/pasha_020/midas/python/tests/test_odb.py", line 130, in set_and_readback_from_parent_dir
    self.validate_readback(value, retval[key_name], expected_key_type)
  File "/home/mu2etrk/test_stand/pasha_020/midas/python/tests/test_odb.py", line 102, in validate_readback
    self.assert_equal(value, retval, expected_key_type)
  File "/home/mu2etrk/test_stand/pasha_020/midas/python/tests/test_odb.py", line 60, in assert_equal
    self.assertEqual(val1, val2)
AssertionError: 37 != '37'
---------------------------------------------------------------------------
    Reply  05 Feb 2024, Ben Smith, Bug Fix, string --> int64 conversion in the python interface ? 
> The symptoms are consistent with a string --> int64 conversion not happening 
> where it is needed. 

Thanks for the report Pasha. Indeed I was missing a conversion in one place. Fixed now!

Ben
    Reply  13 Feb 2024, Konstantin Olchanski, Bug Fix, string --> int64 conversion in the python interface ? 
> > The symptoms are consistent with a string --> int64 conversion not happening 
> > where it is needed. 
> 
> Thanks for the report Pasha. Indeed I was missing a conversion in one place. Fixed now!
> 

Are we running these tests as part of the nightly build on bitbucket? They would be part of 
the "make test" target. Correct python dependancies may need to be added to the bitbucket OS 
image in bitbucket-pipelines.yml. (This is a PITA to get right).

K.O.
Entry  05 Jun 2018, Frederik Wauters, Forum, strings in sqlite 
I am setting up a sqlite db to serve as a run database.

The easiest option is to use the history sqlite feature, and add run information 
as virtual history events

however:

Invalid tag 0 'Comment' in event 21 'Run Parameters': cannot do history for 
TID_STRING data, sorry!

I'd like to save e.g. the edit on start information , with shift crew checks. 
Would it be easy to allow for text, or is this inherent to the history system 
handling binary data?
    Reply  20 Jul 2018, Konstantin Olchanski, Forum, strings in sqlite 
> Invalid tag 0 'Comment' in event 21 'Run Parameters': cannot do history for 
> TID_STRING data, sorry!

The original MIDAS history API does not have provisions for storing TID_STRING data,
it is a very unfortunate limitation that has been with us for a very long time.

If I ever get around to rewrite the MIDAS history API, I will definitely add support for TID_STRING data.

But not today.

K.O.

P.S. Support for arbitrary binary blobs is also possible, but this will make the midas history
a kind of "a daq inside the daq" thing, probably we do not want to go this direction.

K.O.
Entry  02 May 2005, Stefan Ritt, Info, strlcpy/strlcat moved into separate file 
I had to move strlcpy & strlcat into a separate file "strlcpy.c". A header file
"strlcpy.h" was added as well. This way one can omit the old HAVE_STRLCPY which
made life hard. The windows and linux makefiles were adjusted accordingly, but
for Max OS X there might be some fixes necessary which I could not test.
Entry  12 Nov 2014, Robert Pattie, Forum, struct mismatch logger_channels.pdf
Hi all,
  I've started receiving the following error that I can't track down.  Does
anyone have a suggestion for where to start looking for the cause of this?

[Analyzer,ERROR] [odb.c:9460:db_open_record,ERROR] struct size mismatch for "/"
(expected size: 576, size in ODB: 0)

This error prevents me from running two runs in a row.  I have to close the DAQ
and restart to take multiple runs.  Also it prevents me from running the analyzer
in offline mode. 

I also noticed that several for the ODB directories no longer have the same html
format when viewed through the browser.  I've attached a screen print of the
"/Logger/Channels" page.

Thanks,
Robert
Entry  15 Nov 2013, Konstantin Olchanski, Bug Report, stuck data buffers 
We have seen several times a problem with stuck data buffers. The symptoms are very confusing - 
frontends cannot start, instead hang forever in a state very hard to kill. Also "mdump -s -d -z 
BUF03" for the affected data buffers is stuck.

We have identified the source of this problem - the semaphore for the buffer is locked and nobody 
will ever unlock it - MIDAS relies on a feature of SYSV semaphores where they are automatically 
unlocked by the OS and cannot ever be stuck ever. (see man semop, SEM_UNDO function).

I think this SEM_UNDO function is broken in recent Linux kernels and sometimes the semaphore 
remains locked after the process that locked it has died. MIDAS is not programmed to deal with this 
situation and the stuck semaphore has to be cleared manually.

Here, "BUF3" is used as example, but we have seen "SYSTEM" and ODB with stuck semaphores, too.

Steps:
a) confirm that we are using SYSV semaphores: "ipcs" should show many semaphores
b) identify the stuck semaphore: "strace mdump -s -d -z BUF03".
c) here will be a large printout, but ultimately you will see repeated entries of 
"semtimedop(9633800, {{0, -1, SEM_UNDO}}, 1, {1, 0}^C <unfinished ...>"
d) erase the stuck semaphore "ipcrm -s 9633800", where the number comes from semtimedop() in 
the strace output.
e) try again: "mdump -s -d -z BUF03" should work now.

Ultimately, I think we should switch to POSIX semaphores - they are easier to manage (the strace 
and ipcrm dance becomes "rm /dev/shm/deap_BUF03.sem" - but they do not have the SEM_UNDO 
function, so detection of locked and stuck semaphores will have to be done by MIDAS. (Unless we 
can find some library of semaphore functions that already provides such advanced functionality).

K.O.
ELOG V3.1.4-2e1708b5