Back Midas Rome Roody Rootana
  Midas DAQ System, Page 116 of 136  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
ID Date Author Topic Subjectup
  554   14 Jan 2009 Konstantin OlchanskiForummlogger problem
> The problem was already fixed some time ago, so please update your version from the SVN 
> revision (see https://midas.psi.ch/download.html for details).

I wanted to check out the latest websvn midas repository viewer installed at PSI, so I used the web "annotate/blame" tools 
to trace the fix to this bug down to revision 3660 committed in April 2007. (It turns out that "svn blame" is not very useful 
for tracing *removed* lines, so I ended up doing a manual binary search across different revisions of mlogger.c)

K.O.
  978   11 Mar 2014 Andreas SuterForummlogger problem
I stumbled over a problem which I cannot pin point and would appreciate suggestions.

I set up an experiment, and all of a sudden I noticed the following behaviour.

I can start any number of frontends without any problems as long as mlogger is NOT running.
I can also start mlogger without any problems. However, as soon as I started the mlogger, I cannot start anything else any more (including odbedit). I get the following assertion:
16:07:06 [Logger,INFO] Program Logger on host lem00 started
[local:nemu:S]/>q
[nemu@lem00 2014]$ odbedit -e nemu
odbedit: src/odb.c:753: db_update_open_record: Assertion `xkey->notify_count == pkey->notify_count' failed.
Aborted
This is even happening if I stop all frontends, start only the mlogger and afterwards try to start odbedit.

I tried to see if this is a generic feature on a test experiment, but there I cannot reproduce it. It seems that there is either something wrong with the ODB, something wrong with hotlinks, ..., I don't know.

I would appreciated suggestions how pin point the issue.
  979   11 Mar 2014 Stefan RittForummlogger problem

Andreas Suter wrote:
I stumbled over a problem which I cannot pin point and would appreciate suggestions.

I set up an experiment, and all of a sudden I noticed the following behaviour.

I can start any number of frontends without any problems as long as mlogger is NOT running.
I can also start mlogger without any problems. However, as soon as I started the mlogger, I cannot start anything else any more (including odbedit). I get the following assertion:
16:07:06 [Logger,INFO] Program Logger on host lem00 started
[local:nemu:S]/>q
[nemu@lem00 2014]$ odbedit -e nemu
odbedit: src/odb.c:753: db_update_open_record: Assertion `xkey->notify_count == pkey->notify_count' failed.
Aborted
This is even happening if I stop all frontends, start only the mlogger and afterwards try to start odbedit.

I tried to see if this is a generic feature on a test experiment, but there I cannot reproduce it. It seems that there is either something wrong with the ODB, something wrong with hotlinks, ..., I don't know.

I would appreciated suggestions how pin point the issue.


K.O. put that in: https://bitbucket.org/tmidas/midas/commits/9d7b7c83b275a2bd3c846c4f265ff7f5d53f3426

He should have a look at it.

Have you tried to rebuild your ODB from scratch? (Save in XML, then delete .ODB.SHM, then load again form XML)?

/Stefan
  980   11 Mar 2014 Andreas SuterForummlogger problem

Stefan Ritt wrote:

Andreas Suter wrote:
I stumbled over a problem which I cannot pin point and would appreciate suggestions.

I set up an experiment, and all of a sudden I noticed the following behaviour.

I can start any number of frontends without any problems as long as mlogger is NOT running.
I can also start mlogger without any problems. However, as soon as I started the mlogger, I cannot start anything else any more (including odbedit). I get the following assertion:
16:07:06 [Logger,INFO] Program Logger on host lem00 started
[local:nemu:S]/>q
[nemu@lem00 2014]$ odbedit -e nemu
odbedit: src/odb.c:753: db_update_open_record: Assertion `xkey->notify_count == pkey->notify_count' failed.
Aborted
This is even happening if I stop all frontends, start only the mlogger and afterwards try to start odbedit.

I tried to see if this is a generic feature on a test experiment, but there I cannot reproduce it. It seems that there is either something wrong with the ODB, something wrong with hotlinks, ..., I don't know.

I would appreciated suggestions how pin point the issue.


K.O. put that in: https://bitbucket.org/tmidas/midas/commits/9d7b7c83b275a2bd3c846c4f265ff7f5d53f3426

He should have a look at it.

Have you tried to rebuild your ODB from scratch? (Save in XML, then delete .ODB.SHM, then load again form XML)?

/Stefan

Yes, I could recover the ODB by falling back to a previous dump. Still, I would like to know what is the exact meaning of the above assertion. It might help to understand what are the likely cause which results in the assertion.

/Andreas
  984   14 Mar 2014 Konstantin OlchanskiForummlogger problem
> I stumbled over a problem which I cannot pin point and would appreciate suggestions.
> 
> [nemu@lem00 2014]$ odbedit -e nemu
> odbedit: src/odb.c:753: db_update_open_record: Assertion `xkey->notify_count == pkey-
>notify_count' failed.
> Aborted

I think this is a real bug in MIDAS - I will have to take a look to figure out where this is coming from. At the 
least, if I cannot replace the assert with some corrective action, I may replace it with an error message.

I am glad you could recover by reloading odb.

K.O.
  744   15 Feb 2011 Konstantin OlchanskiBug Fixmlogger stop run on disk full!
The mlogger has a function for detecting when the output disk becomes full - when this condition is 
detected, the run should be stopped. But this did not work if disk is already full and the user tries to start 
a run - the "disk full?" check happened too early and the attempt to stop the run was not succeeding 
because the original start-run transition is still running. Now if "disk full" condition is detected, mlogger 
tries to stop the run every 10 seconds until the run is finally stopped (or dies because disk is full).

mlogger.c svn rev 4976
K.O.
  2582   15 Aug 2023 Konstantin OlchanskiInfomlogger update
A bit of update to the mlogger. In preparation for more cleanup when Stefan is 
here at TRIUMF.

1) fix overwrite of existing files if run number is reset (check for existing 
files was missing in the LZ4, BZ2 & co data path)
2) made output files read-only (midas, json and checksum files)
3) commented out the old code paths

Currently active per-channel ODB settings:

Active - enable or disable mlogger channel
Type - NOT USED
Filename - output filename template, %d are replaced by run number and subrun 
number, also pipe command for PIPE output
Format - NOT USED
Compression - NOT USED
ODB dump - enable/disable writing ODB dump to data file
ODB dump format - "json" is recommended for new experiments
Log messages - write log messages to output file, 0=off, -1=write all messages
Buffer - "SYSTEM" read events from this event buffer
EventID - "-1" for all events
Trigger Mask - "-1" for all events
Event Limit - stop run after so many events
Byte Limit - stop run after so many bytes
Subrun Byte limit - switch to next subrun file after writing so many bytes. 
actual file size is longer than subrun_byte_limit because of ODB dumps.
Tape Capacity - NOT USED
Subdir Format - if not empty, output file name is DIR/SUBDIR/FILENAME, "%" 
format things are expanded by strftime().
Current Filename - updated by mlogger, contains the currently written file name
Data checksum - checksum before compression, use CRC32C for maximum speed, 
SHA512 for maximum security.
File checksum - checksum after compression, CRC32C is good against accidental 
file corruption, SHA512 is cryptographically strong, good against purposeful 
tampering.
Compress - use "lz4" for maximum speed, bzip2 or pbzip2 for maximum compression. 
no compression and gzip are not recommended. (ZFS may apply lz4 compression to 
uncompressed data).
Output - "NULL" do not write anything, "FILE" write to disk, "FTP" write to FTP 
server, "ROOT" write via the mlogger ROOT writer (docs?), "PIPE" pipe data 
through an external command (i.e. for bzip2 compression).
Gzip compression - gzip compression flags (see gzip docs, 1=max speed, 9=max 
compression)
Bzip2 compression - if non-zero, bzip2 compression level (see "bzip2 -h", 1=max 
speed, 9=max compression)
Pbzip2 num cpu - number of CPUs used by parallel bzip2 compression, pbzip2 -p 
flag
Pbzip2 compression - if non-zero, pbzip2 compresison level (see "pbzip2 -h", 
default is 9=max compression)
Pbzip2 options - any additional pbzip2 options, i.e. -l, -m, -p, etc.

Currently active /Logger options:

Data Dir - where to write all output files, if empty, cm_get_path() is used.
Message file date format - not used in mlogger
Message dir - not used in mlogger
Write data - if set to "no", midas file, runlog, etc will not be written.
ODB Dump - at run stop, save odb to disk
ODB Dump File - file name for "ODB Dump" save file. "%d" is replaced by run 
number. "json" format is recommended for new experiments.
ODB Last Dump File - at run start, save ODB to disk. "json" format is 
recommended for new experiments.
Auto restart - run stopped by time limit or event limit is automatically 
restarted
Auto restart delay - wair for some many seconds before restarting the run
Tape message - NOT USED
Run duration - stop the run after so many seconds
Next subrun - change from "no" to "yes" to force mlogger to open a new subrun 
file (should this be per-channel?)
Subrun duration - open new subrun file after so many seconds (should this be 
per-channel?)
History dir - not used in mlogger
Detached transition - "no" use the normal multithreaded transtions 
(recommended), "yes" use mtransition helper to stop and restart runs. sometimes 
files because mtransition is not in the user $PATH or wrong version of 
mtransition is in the user $PATH.

K.O.
  729   29 Oct 2010 Konstantin OlchanskiInfomlogger.c 4858-4862 busted
Please note that mlogger does not work (crashes on run start) starting with svn
rev 4858, fixed in svn 4862. If you have to use this busted version of mlogger,
the crash is fixed by update of history_midas.c to svn rev 4862 or set ODB
/Logger/WriteFileHistory to 'n'. Sorry for the inconvenience. K.O.
  1865   25 Mar 2020 Andreas SuterForummlogger: misleading error messages for ROOT
Dear All,

At our experiment we write ROOT files. When starting/stopping runs we get the following error messages:

[Logger,ERROR] [mlogger.cxx:3358:root_write,ERROR] Cannot write system event into ROOT file, event_id 0xffff8000

[Logger,ERROR] [mlogger.cxx:3358:root_write,ERROR] Cannot write system event into ROOT file, event_id 0xffff8001

Looking into the source code I found that log_write (line 4248) sends these Midas System Events (BOR,EOR) to root_write without filtering them. root_write() checks in a first step if it gets such Midas System Events and if yes, moans.

Wouldn't it be better just to filter these events in log_write, before calling root_write, avoiding unnecessary error messages?

Is there something I miss?

Thanks,
  Andreas
  1866   25 Mar 2020 Konstantin OlchanskiForummlogger: misleading error messages for ROOT
> [Logger,ERROR] [mlogger.cxx:3358:root_write,ERROR] Cannot write system event into ROOT file, event_id 0xffff8000

Hi, Andreas, please open a bug report for this problem on bitbucket, there is now at least 2 bugs against
the ROOT writer (some events are written in duplicate sometimes), and I hope to fix this next time i review
the mlogger (RSN!). Biggest problem is that I do not use the ROOT output myself, so I have no way
to know if ROOT files produced by mlogger are correct or make sense. (without setting up some kind
of test environment with a ROOT file reader.

Thank you for reporting this problem here, so more people know about it.

If somebody has a patch to fix this, please send it in!

K.O.
  1867   27 Mar 2020 Stefan RittForummlogger: misleading error messages for ROOT
Dear simplest solution seems to me to just remove the error message generation and silently ignore the BOE EOR events. 

Committed that change.

Stefan
  1868   27 Mar 2020 Andreas SuterForummlogger: misleading error messages for ROOT
Hi Stefan,

I think this only partially resolves the issue, in log_write:

#ifdef HAVE_ROOT
   } else if (log_chn->format == FORMAT_ROOT) {
      status = root_write(log_chn, pevent, pevent->data_size + sizeof(EVENT_HEADER));
#endif
   }

   actual_time = ss_millitime();
   if ((int) actual_time - (int) start_time > 3000)
      cm_msg(MINFO, "log_write", "Write operation on \'%s\' took %d ms", log_chn->path.c_str(), actual_time - start_time);

   if (status != SS_SUCCESS && !stop_requested) {
      cm_msg(MTALK, "log_write", "Error writing output file, stopping run");
      cm_msg(MERROR, "log_write", "Cannot write \'%s\', error %d, stopping run", log_chn->path.c_str(), status);
      stop_the_run(0);

      return status;
   }

In your solution root_write returns quietly but status == SS_INVALID_FORMAT (not SS_SUCCESS) and hence I get another misleading error message "Error writing output file, stopping run".

In order to prevent this you also would need to change the return value to SS_SUCCESS.
  1869   27 Mar 2020 Stefan RittForummlogger: misleading error messages for ROOT
Ok, changed.

Stefan
  1375   03 Jul 2018 Frederik WautersForummlogger? jamming
We run as follows:

* sis3316 digitizers in a vme crate
* 1-2 midas events /s
* data rate at 20 MB/s

At a rate of 30 MB/s the daq crashed because the I think the mlogger can`t follow:

  * it runs at 100% cpu
  * memory usage of mlogger process goes from 2% to 15%
  * All other processes < 50 % cpu and < 20% RAM

Both the vme frontend and the mlogger crash about 2.5 minutes into a run. Both
the logger and vme fe spit out:
bm_validate_client_pointers: Assertion `pclient->read_pointer >= 0 &&
pclient->read_pointer <= pheader->size' failed.
Aborted

I first thought that writing-to-disk could be a bottle neck. But when I write to
an SSD, same thing.

Is there another bottleneck which keeps the mlogger busy?
  2249   29 Jun 2021 Lukas GerritzenBug Reportmodbcheckbox behaves erroneous with UINT32 variables
For boolean and INT32 variables, modbcheckbox works as expected. You click, it 
sets the variable to true or 1, the checkbox stays checked until you click again 
and it's being set back to 0.

For UINT32 variables, you can turn the variable "on", but the checkbox visually 
becomes unchecked immediately. Clicking again does not set the variable to 
0/false and the tick visually appears for a fraction of a second, but vanishes 
again.
  2250   30 Jun 2021 Stefan RittBug Reportmodbcheckbox behaves erroneous with UINT32 variables
> For boolean and INT32 variables, modbcheckbox works as expected. You click, it 
> sets the variable to true or 1, the checkbox stays checked until you click again 
> and it's being set back to 0.
> 
> For UINT32 variables, you can turn the variable "on", but the checkbox visually 
> becomes unchecked immediately. Clicking again does not set the variable to 
> 0/false and the tick visually appears for a fraction of a second, but vanishes 
> again.

Thanks for reporting that bug. Fixed in

https://bitbucket.org/tmidas/midas/commits/4ef26bdc5a32716efe8e8f0e9ce328bafad6a7bf

Stefan
  2252   30 Jun 2021 Lukas GerritzenBug Reportmodbcheckbox behaves erroneous with UINT32 variables
Thanks for the quick fix.
  2161   07 May 2021 Zaher SalmanBug Reportmodbselect trigget hotlink
It seems that a modbselect triggers a "change" in an ODB which has a hot link. This happens onload (or whenever the custom page is reloaded) and otherwise it behaves as expected, i.e. no change unless the modbselect is actually changed. Is this the intended behaviour? can this be modified?
  2162   10 May 2021 Stefan RittBug Reportmodbselect trigget hotlink
Thanks for reporting that bug, I fixed it in the last commit.

Stefan
  1294   31 May 2017 Konstantin OlchanskiInfomodified db_watch() arguments
for reasons unknown, db_watch() did not have an "info" parameter passed through to the callback 
handler function, like it is done with db_open_record().

This omission makes it difficult to write db_watch handler functions that must watch multiple odb 
trees - db_watch only delivers the hkey of the modified item inside the tree, leaving us with no 
simple way to tell which tree it came from. An example of this is mfe.c watching the Common 
structure for multiple equipments. There are other
uses for the "info" parameter, for example it is needed to implement c++ wrapper classes.

this omission is now corrected at the cost of changing the definition db_watch().

all uses of db_watch() in the midas tree have been corrected, but all out-of-tree programs
will not compile. For quick conversion, add a NULL parameter to db_watch() calls and add a 
"void*info" parameter to your watch handler function.

sorry about this disturbance,
K.O.
ELOG V3.1.4-2e1708b5