  Midas DAQ System, Page 115 of 136
ID   Date   Author   Topic   Subject
  585   02 Jun 2009   Konstantin Olchanski   Bug Report   mlogger duplicate event problem
> We have seen on several daq systems this problem: we start a run and observe that the number of 
> events written by mlogger to the output file is double the number of events actually collected. Upon 
> inspection of the output file, we see that every event is written twice. Restarting the run usually fixes 
> this problem.

Fixed in mlogger.c, svn rev 4497: tr_start() now calls tr_stop() if it was somehow not already called by the end-of-run transition.
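
The guard described above can be sketched roughly as follows. This is only a minimal illustration, not the actual mlogger.c code: the logger_state struct and the sketch_tr_start() / sketch_tr_stop() names are hypothetical stand-ins for whatever state the real tr_start() and tr_stop() manage.

```c
#include <stdio.h>

/* Hypothetical stand-in for the logger's per-run state (not the mlogger.c types). */
typedef struct {
   int run_active;     /* nonzero while output for a run is open */
   int open_outputs;   /* number of currently open outputs */
} logger_state;

/* End-of-run handler: close whatever the current run has open. */
void sketch_tr_stop(logger_state *s)
{
   if (s->run_active) {
      s->open_outputs--;
      s->run_active = 0;
   }
}

/* Begin-of-run handler: clean up first if the end-of-run transition never arrived. */
void sketch_tr_start(logger_state *s)
{
   if (s->run_active) {
      fprintf(stderr, "previous run was not stopped, calling stop first\n");
      sketch_tr_stop(s);
   }
   s->open_outputs++;
   s->run_active = 1;
}
```

With this guard, calling sketch_tr_start() twice in a row without a stop in between still leaves exactly one output open.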

K.O.
  595   16 Jun 2009   Konstantin Olchanski   Bug Report   mlogger duplicate event problem
> > We have seen on several daq systems this problem: we start a run and observe that the number of 
> > events written by mlogger to the output file is double the number of events actually collected. Upon 
> > inspection of the output file, we see that every event is written twice. Restarting the run usually fixes 
> > this problem.
> 
> mlogger.c fixed svn rev 4497. (from tr_start(), call tr_stop() if somehow it was not called already by end-run transition).

There is a new problem: after an unsuccessful run start, the next run start bombs with the error "output file runNNN.mid already exists". One way around this is to 
manually remove the useless data file; another is to bump up the run number. A better solution is to automatically erase the output file created by an unsuccessful run 
start.

K.O.
  1106   02 Sep 2015   Konstantin Olchanski   Info   mlogger history changes
The git branch feature/logger_db_watch is getting ready for merging into main midas.

The main change in the logger is the switch from db_open_record() to db_watch() as the 
method of listening to ODB variables. The new db_watch() function makes it cheap (in the 
number of hotlinks used) to implement "per-variable" history as the new default. In the new 
code, the old "per-equipment" history is no longer available.

In other words:

old per-equipment history: 1 hotlink per equipment
old per-variable history: 1 hotlink per "ls" entry in /eq/xxx/variables (big experiments can 
easily exceed the maximum number of hotlinks!)
new (per-variable) history: (back to) 1 hotlink per equipment
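
As a rough worked example of the counts above, the following sketch computes the number of hotlinks for each scheme. The function names are made up for illustration, and it assumes (for simplicity only) that every equipment has the same number of "ls" entries under its variables directory.

```c
/* Hotlink counts for the three schemes listed above.
   n_equipment and entries_per_equipment are hypothetical inputs;
   real experiments will have a different number of entries per equipment. */

/* old per-equipment history: 1 hotlink per equipment */
unsigned hotlinks_old_per_equipment(unsigned n_equipment)
{
   return n_equipment;
}

/* old per-variable history: 1 hotlink per "ls" entry in /eq/xxx/variables */
unsigned hotlinks_old_per_variable(unsigned n_equipment, unsigned entries_per_equipment)
{
   return n_equipment * entries_per_equipment;
}

/* new (per-variable) history: back to 1 hotlink per equipment */
unsigned hotlinks_new_per_variable(unsigned n_equipment)
{
   return n_equipment;
}
```

For a made-up experiment with 20 equipments of 50 entries each, the old per-variable scheme needs 1000 hotlinks, while the new scheme needs 20.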

Notable changes from old history:

- works as described in my recent notes - the new code will complain about all incorrect use 
of history - where the old code sometimes silently malfunctioned (e.g. symlinks in unexpected 
places) or bombed (e.g. infinite loop reloading the history).
- all references to "PerVariableHistory" in ODB are removed (this is the new default)
- the "structured bank" records (subdirectories under variables, as in /eq/xxx/var/struct/value) 
are now broken up into individual items. This change is forced by the difference between 
db_open_record() and db_watch() for structured banks written using db_set_record(). The old 
per-variable history kept these items together in one event.

This change is also in line with Stefan's suggestion that all compound items, including arrays, 
should be broken up into separate history events. In keeping with this suggestion, right now 
only arrays are not broken up - because of limitations in the history storage level. As history 
storage is improved, arrays will also be broken up into individual elements.

The new code is functionally complete and all are welcome to try it (but beware as it may eat 
your odb or your history storage - make a backup!).

git checkout feature/logger_db_watch

K.O.
  2091   12 Feb 2021   Konstantin Olchanski   Bug Report   mlogger history snafu
There is a problem with mlogger between commits xxx (17 Nov 2020) and a762bb8 (12 Feb 2021). Because of 
confusion between seconds and milliseconds, FILE (mhf*.dat files) and SQL history are recorded with 
incorrect timestamps.

- traditional MIDAS history (*.hst files) does not have this problem (because of a buglet)
- midas-2020-12 release does not have this problem (it has mlogger from midas-2020-08 release)

There are some additional changes in mlogger that we are sorting out. When they are ready, we will make a new 
release of midas.

K.O.
  1068   15 Jul 2015   Konstantin Olchanski   Info   mlogger improvements
A set of improvements to mlogger is in:

a) event buffer (SYSTEM) size up to 2GB
b) test version of LZ4 high speed compression, support for bzip2 and pbzip2

Details:

a) previously contents of shared memory buffers (SYSTEM, SYSMSG, etc) were periodically saved to disk 
files SYSTEM.SHM, SYSMSG.SHM, etc. This was not workable for large event buffers - reading/writing 2GB 
of data takes quite some time. We have decided that saving buffer contents to disk is no longer necessary 
and ss_shm_close() no longer writes SYSTEM.SHM, SYSMSG.SHM, etc. From now on you will still see these 
files created, but their size will be 0. The file ODB.SHM is not affected by this - ODB contents are saved to 
ODB.SHM via ss_shm_flush().

b) as a rework of mlogger file output drivers (using chainable c++ classes), test versions of new 
compression algorithms have been added. In the present test version, they are controlled by the value of 
"compression".

The plan is to ultimately have the following outputs from the mlogger:

- ROOT output - same as before, but you have to use the rmlogger executable
- FTP output - for high speed write over the network
- .mid output for uncompressed data
- .mid.gz - gzip1 compressed data - best compromise between compression ratio and speed - will be the 
new default
- .mid.bz2 via pbzip2 (parallel bzip2) - maximum compression ratio
- .mid.lz4 - lz4 compression for high speed data taking - maximum compression speed

The current test version implements the following selections of "compression":

80 - ROOT output through the new driver (use rmlogger executable)
98 - null output (no file written)
99 - uncompressed disk output
100 - lz4 compression
200 - piped bzip2 compression
201 - piped pbzip2 compression
300 - gzip compression
301 - gzip1 compression
309 - gzip9 compression

in addition the old selections are still available:
0 - uncompressed output
1 - gzip1 compression
9 - gzip9 compression
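
For reference, the selections listed above can be written down as a small lookup table. This is only a sketch to make the mapping explicit: the compression_entry type and the describe_compression() helper are hypothetical and are not part of mlogger. The values and descriptions are copied from the lists above.

```c
#include <stddef.h>

/* One row of the "compression" selections listed above. */
typedef struct {
   int value;
   const char *meaning;
} compression_entry;

static const compression_entry compression_table[] = {
   {   0, "uncompressed output (old selection)" },
   {   1, "gzip1 compression (old selection)" },
   {   9, "gzip9 compression (old selection)" },
   {  80, "ROOT output through the new driver" },
   {  98, "null output (no file written)" },
   {  99, "uncompressed disk output" },
   { 100, "lz4 compression" },
   { 200, "piped bzip2 compression" },
   { 201, "piped pbzip2 compression" },
   { 300, "gzip compression" },
   { 301, "gzip1 compression" },
   { 309, "gzip9 compression" },
};

/* Hypothetical helper: returns the description for a "compression" value,
   or NULL if the value is not one of the selections listed above. */
const char *describe_compression(int value)
{
   size_t n = sizeof(compression_table) / sizeof(compression_table[0]);
   for (size_t i = 0; i < n; i++)
      if (compression_table[i].value == value)
         return compression_table[i].meaning;
   return NULL;
}
```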

The final implementation will include a better way to configure the mlogger output channels.

K.O.
  1076   23 Jul 2015   Konstantin Olchanski   Info   mlogger improvements
> A set of improvements to mlogger is in:
> The current test version implements the following selections of "compression":
> 
> 80 - ROOT output through the new driver (use rmlogger executable)
> ...

Additional output modes through the new output drivers:

81 - FTP output
82 - FTP output with LZ4 compression

The format of the "Channels/xxx/Settings/Filename" for FTP output is like this:
"/localhost, 5555, ftpuser, ftppwd, ., run%05dsub%05d.mid"

- the leading slash is required (for now)
- localhost is the FTP server hostname
- 5555 is the FTP server port number
- ftpuser and ftppwd are the FTP login. password is stored and transmitted in clear text for extra security
- "." is the output directory on the FTP server
- the rest is the file name in the usual format.
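
To make the field layout concrete, here is a minimal sketch of how such a Filename string could be split into its parts. It is not the parser used by the FTP driver: the ftp_target struct and the parse_ftp_filename() / next_field() names are hypothetical, and the sketch assumes that no field contains a comma.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the fields described above. */
typedef struct {
   char host[64];
   int  port;
   char user[64];
   char password[64];
   char directory[128];
   char file_format[128];
} ftp_target;

/* Copy the next comma-separated field into dst, trimming blanks on both
   sides, and advance *cursor past the comma. Returns 0 if the field is
   empty or does not fit into dst. */
static int next_field(const char **cursor, char *dst, size_t dst_size)
{
   const char *p = *cursor;
   while (*p == ' ')
      p++;
   const char *end = strchr(p, ',');
   size_t len = end ? (size_t) (end - p) : strlen(p);
   while (len > 0 && p[len - 1] == ' ')
      len--;
   if (len == 0 || len >= dst_size)
      return 0;
   memcpy(dst, p, len);
   dst[len] = '\0';
   *cursor = end ? end + 1 : p + strlen(p);
   return 1;
}

/* Returns 1 on success, 0 if the string does not match the layout above. */
int parse_ftp_filename(const char *setting, ftp_target *out)
{
   char port_text[16];
   const char *cursor = setting;

   if (*cursor != '/')          /* the leading slash is required */
      return 0;
   cursor++;

   if (!next_field(&cursor, out->host, sizeof(out->host)) ||
       !next_field(&cursor, port_text, sizeof(port_text)) ||
       !next_field(&cursor, out->user, sizeof(out->user)) ||
       !next_field(&cursor, out->password, sizeof(out->password)) ||
       !next_field(&cursor, out->directory, sizeof(out->directory)) ||
       !next_field(&cursor, out->file_format, sizeof(out->file_format)))
      return 0;

   out->port = atoi(port_text);
   return out->port > 0;
}
```

Applied to the example string above, this yields host "localhost", port 5555, directory "." and the file name format "run%05dsub%05d.mid".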

For testing this driver, I run the ftp server like this:

# vsftpd -olisten=YES -obackground=no -olisten_port=5555 -olisten_address=127.0.0.1 -oport_promiscuous=yes -oconnect_from_port_20=no -oftp_data_port=6666

K.O.
  1085   29 Jul 2015   Konstantin Olchanski   Info   mlogger improvements - CRC32C, SHA-2
> A set of improvements to mlogger is in:

Preliminary support for CRC32-zlib, CRC32C, SHA-256 and SHA-512 is in. Checksums are computed correctly, but plumbing configuration is 
preliminary. Good enough for testing and benchmarking.

To enable checksums, set channel compression:
100 - no checksum (LZ4 compression)
11100 - CRC32-zlib checksum
22100 - CRC32C
33100 - SHA-256
44100 - SHA-512
checksums for both uncompressed and compressed data will be computed and reported into midas.log.

To compare:

CRC32-zlib is for compatibility with gzip and zlib tools
CRC32C is for maximum speed
SHA-256 and SHA-512 are for maximum data security

To remember:

- CRC32-zlib is the CRC32 computation from the gzip/png/zlib library (the zlib crc32() function). Note that this is not the same algorithm as "adler32", which is a separate checksum also provided by zlib.
- CRC32C is the most recently improved version of the CRC32 family of checksums. The implementation is from Mark Adler (same Adler as adler32) and uses 
hardware acceleration on recent Intel CPUs.
- SHA-256 and SHA-512 are checksums currently accepted as cryptographically secure. One of them is supposed to be faster on 64-bit 
machines. I implement both for benchmarking.
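
To illustrate how the two CRC variants relate, here is a minimal bit-at-a-time sketch. It is not the code used in mlogger (which uses the zlib routine and Mark Adler's hardware-accelerated CRC32C), and it is far slower than either. The function names are made up for this example. The two checksums differ only in the generator polynomial: 0xEDB88320 for the gzip/zlib CRC32 and 0x82F63B78 for CRC32C (both written in bit-reflected form).

```c
#include <stddef.h>
#include <stdint.h>

/* Generic reflected CRC-32: initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF. */
static uint32_t crc32_bitwise(const void *data, size_t len, uint32_t poly_reflected)
{
   const unsigned char *p = (const unsigned char *) data;
   uint32_t crc = 0xFFFFFFFFu;
   for (size_t i = 0; i < len; i++) {
      crc ^= p[i];
      for (int bit = 0; bit < 8; bit++) {
         /* mask is all ones if the low bit is set, zero otherwise */
         uint32_t mask = 0u - (crc & 1u);
         crc = (crc >> 1) ^ (poly_reflected & mask);
      }
   }
   return crc ^ 0xFFFFFFFFu;
}

/* CRC32 as used by gzip, PNG and zlib's crc32() */
uint32_t sketch_crc32_zlib(const void *data, size_t len)
{
   return crc32_bitwise(data, len, 0xEDB88320u);
}

/* CRC32C (Castagnoli polynomial) */
uint32_t sketch_crc32c(const void *data, size_t len)
{
   return crc32_bitwise(data, len, 0x82F63B78u);
}
```

For the standard nine-byte test string "123456789", the first function gives 0xCBF43926 and the second gives 0xE3069283, the well-known check values for these two CRCs.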

"Cryptographically secure" means "nobody has a practical way to construct two different files with the same checksum".

In simpler words, the file contents cannot change without breaking the checksum - by software bug, by hardware fault, by benign or malicious 
intent.

The CRC family of checksum functions were never cryptographically secure.
MD5 and SHA-1 used to be secure but are no longer considered to be so. MD5 was definitely broken as different files with the same checksum 
have been discovered or constructed.

K.O.
  487   11 Jun 2008   Andreas Suter   Suggestion   mlogger is flooding the message queue
The current version of mlogger (SVN 4215) is flooding our message system with
messages like

> Tue Jun 10 16:42:01 2008 [Logger,INFO] Configured history with 22 events
> Tue Jun 10 16:42:14 2008 [Logger,INFO] Configured history with 22 events
> Tue Jun 10 16:42:26 2008 [Logger,INFO] Configured history with 22 events

This is fatal for us and is blowing up the midas.log like hell. I would prefer it if
one could flag these kinds of messages (ODB /Logger/..), i.e. enable and disable
them. At the moment I have to comment it out in the source code since we cannot
work with it.

Cheers,
  Andreas 
  488   11 Jun 2008   Konstantin Olchanski   Suggestion   mlogger is flooding the message queue
> The current versions of mlogger SVN 4215 is flooding our message system with
> stuff like
> 
> > Tue Jun 10 16:42:01 2008 [Logger,INFO] Configured history with 22 events
> > Tue Jun 10 16:42:14 2008 [Logger,INFO] Configured history with 22 events
> > Tue Jun 10 16:42:26 2008 [Logger,INFO] Configured history with 22 events
> 
> This is fatal to us and blowing up the midas.log like hell. I would prefer if
> one could flag these kind of messages (ODB /Logger/..), i.e. enable and disable
> it. At the moment I have to comment it out in the source code since we cannot
> work with it.

I just sent the attached message to Stefan - please read it.

Before we take any action, we need to understand why history is being
reconfigured every 10 seconds (according to your logfile snippet).

Are you starting a new run every 10 seconds?

If that is what you do and that is your intent, I guess it is atypical usage of
MIDAS and the message from the mlogger is offensive and should be removed/disabled.

If something else is going on, we need to understand it before we sweep trouble
under the carpet by disabling this message.

K.O.

Stefan - there is more bad news - the message is produced when the history
is being reconfigured. This is only supposed to happen when the mlogger
starts or at the beginning of a run.

So these messages are just the tip of an iceberg of some other trouble.

The logic of when history is reconfigured I did not change. So likely
the trouble existed before, but you did not know about it.

We can kill the message, but why is the history being reconfigured
at a rate that "floods the log file"? That cannot possibly be good.

K.O.
  490   16 Jun 2008   Konstantin Olchanski   Suggestion   mlogger is flooding the message queue
> The current versions of mlogger SVN 4215 is flooding our message system with
> stuff like
> 
> > Tue Jun 10 16:42:01 2008 [Logger,INFO] Configured history with 22 events
> > Tue Jun 10 16:42:14 2008 [Logger,INFO] Configured history with 22 events
> > Tue Jun 10 16:42:26 2008 [Logger,INFO] Configured history with 22 events

Problem confirmed on the M11 DAQ system at TRIUMF. We definitely do nothing funny
there, so what is going on? Will investigate.

K.O.
  491   16 Jun 2008   Stefan Ritt   Suggestion   mlogger is flooding the message queue
> > The current versions of mlogger SVN 4215 is flooding our message system with
> > stuff like
> > 
> > > Tue Jun 10 16:42:01 2008 [Logger,INFO] Configured history with 22 events
> > > Tue Jun 10 16:42:14 2008 [Logger,INFO] Configured history with 22 events
> > > Tue Jun 10 16:42:26 2008 [Logger,INFO] Configured history with 22 events
> 
> Problem confirmed on the M11 DAQ system at TRIUMF. We definitely do nothing funny
> there, so what is going on? Will investigate.

The only place I see where this could happen is in mlogger.c, lines 3064ff:

   /* check if event size has changed */
   db_get_record_size(hDB, hKey, 0, &size);
   if (size != hist_log[i].buffer_size) {
      close_history();
      open_history();
      return;
   }

The record size corresponds to /Equipment/<name>/Variables. If this array changes in
size, it will trigger the re-definition of the history. So please have a look there
and check why the record size changes.
  772   27 Jun 2011   Konstantin Olchanski   Info   mlogger lock for runNNN.mid.gz files
By popular request, Stefan R. implemented a locking scheme for mlogger output files.

To use this function, set the mlogger ODB /Logger/Channels/NNN/Settings/Filename
to ".run%05dsub%05d.mid.gz" (note the leading dot).

In this mode, active output files will have a filename with a leading dot
(.run00001sub00001.mid.gz) while the file is being written to. After the file is
closed, it is renamed and the leading dot is removed.
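
The renaming step can be sketched as below. This only illustrates the idea and is not the code Stefan committed: both function names are hypothetical. The first helper derives the final name by dropping the leading dot of the file-name part, and the second calls the standard C rename() once the file has been closed.

```c
#include <stdio.h>
#include <string.h>

/* Build the final path by dropping the leading dot of the file-name part.
   Returns 1 on success, 0 if the name has no leading dot or the
   destination buffer is too small. */
int final_name_from_active(const char *active_path, char *final_path, size_t size)
{
   const char *slash = strrchr(active_path, '/');
   const char *base = slash ? slash + 1 : active_path;
   size_t dir_len = (size_t) (base - active_path);

   if (base[0] != '.' || base[1] == '\0')
      return 0;
   if (strlen(active_path) >= size)   /* conservative: result is one character shorter */
      return 0;

   memcpy(final_path, active_path, dir_len);   /* directory part, if any */
   strcpy(final_path + dir_len, base + 1);     /* file name without the dot */
   return 1;
}

/* Call after the output file has been closed.
   Returns 0 on success, nonzero on failure. */
int publish_output_file(const char *active_path)
{
   char final_path[1024];
   if (!final_name_from_active(active_path, final_path, sizeof(final_path)))
      return -1;
   return rename(active_path, final_path);
}
```

For example, ".run00001sub00001.mid.gz" becomes "run00001sub00001.mid.gz". A reader that only looks for names starting with "run" will then, presumably, never pick up a file that is still being written.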

To use this function with the lazylogger, please set ODB
"/Lazy/Foo/Settings/Filename format" to "run*.mid.gz,run*.xml" (note the leading
text "run"). Set "stay behind" to 0.

svn rev 5080 (or so, check-in by Stefan R.)
K.O.
  548   09 Jan 2009   Derek Escontrias   Forum   mlogger problem
Hi,

I am running Scientific Linux with kernel 2.6.9-34.EL and I have
glibc-2.3.4-2.25. When I run mlogger, I receive the error:

*** glibc detected *** free(): invalid pointer: 0x0073e93e ***
Aborted

Any ideas?
  549   13 Jan 2009   Stefan Ritt   Forum   mlogger problem
> Hi,
> 
> I am running Scientific Linux with kernel 2.6.9-34.EL and  I have
> glibc-2.3.4-2.25. When I run mlogger, I receive the error:
> 
> *** glibc detected *** free(): invalid pointer: 0x0073e93e ***
> Aborted
> 
> Any ideas?

Not much. Try to clean up the ODB (delete the .ODB.SHM file, remove all shared 
memory via ipcrm) and run again. I run under kernel 2.6.18 and glibc 2.5 and this 
problem does not occur. If you cannot fix it, try to run mlogger inside gdb and 
make a stack trace to see who called the free().
  550   13 Jan 2009   Derek Escontrias   Forum   mlogger problem
> > Hi,
> > 
> > I am running Scientific Linux with kernel 2.6.9-34.EL and  I have
> > glibc-2.3.4-2.25. When I run mlogger, I receive the error:
> > 
> > *** glibc detected *** free(): invalid pointer: 0x0073e93e ***
> > Aborted
> > 
> > Any ideas?
> 
> Not much. Try to clean up the ODB (delete the .ODB.SHM file, remove all shared 
> memory via ipcrm) and run again. I run under kernel 2.6.18 and glibc 2.5 and this 
> problem does not occur. If you cannot fix it, try to run mlogger inside gdb and 
> make a stack trace to see who called the free().

Sorry for being vague. I cleaned up the ODB, but it doesn't seem to be the
problem. Here is a sample run of mlogger and gdb:


/**************************************************************
/**************************************************************
/**************************************************************
[root@tsunami AL_Test]# mlogger -v -d
*** glibc detected *** free(): invalid pointer: 0x007f793e ***
Aborted (core dumped)
[root@tsunami AL_Test]# 
[root@tsunami AL_Test]# 
[root@tsunami AL_Test]# 
[root@tsunami AL_Test]# 
[root@tsunami AL_Test]# 
[root@tsunami AL_Test]# 
[root@tsunami AL_Test]# gdb mlogger core.23213 
GNU gdb Red Hat Linux (6.3.0.0-1.143.el4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library
"/lib/tls/libthread_db.so.1".

Core was generated by `mlogger -v -d'.
Program terminated with signal 6, Aborted.
Reading symbols from /home/dayabay/Software/Root/lib/libCore.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libCore.so
Reading symbols from /home/dayabay/Software/Root/lib/libCint.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libCint.so
Reading symbols from /home/dayabay/Software/Root/lib/libRIO.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libRIO.so
Reading symbols from /home/dayabay/Software/Root/lib/libNet.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libNet.so
Reading symbols from /home/dayabay/Software/Root/lib/libHist.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libHist.so
Reading symbols from /home/dayabay/Software/Root/lib/libGraf.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libGraf.so
Reading symbols from /home/dayabay/Software/Root/lib/libGraf3d.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libGraf3d.so
Reading symbols from /home/dayabay/Software/Root/lib/libGpad.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libGpad.so
Reading symbols from /home/dayabay/Software/Root/lib/libTree.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libTree.so
Reading symbols from /home/dayabay/Software/Root/lib/libRint.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libRint.so
Reading symbols from /home/dayabay/Software/Root/lib/libPostscript.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libPostscript.so
Reading symbols from /home/dayabay/Software/Root/lib/libMatrix.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libMatrix.so
Reading symbols from /home/dayabay/Software/Root/lib/libPhysics.so...done.
Loaded symbols for /home/dayabay/Software/Root/lib/libPhysics.so
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libutil.so.1...done.
Loaded symbols for /lib/libutil.so.1
Reading symbols from /lib/tls/libpthread.so.0...done.
Loaded symbols for /lib/tls/libpthread.so.0
Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/libpcre.so.0...done.
Loaded symbols for /lib/libpcre.so.0
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /usr/lib/libfreetype.so.6...done.
Loaded symbols for /usr/lib/libfreetype.so.6
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
#0  0x002e37a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) 
(gdb) 
(gdb) 
(gdb) where
#0  0x002e37a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x016d68b5 in raise () from /lib/tls/libc.so.6
#2  0x016d8329 in abort () from /lib/tls/libc.so.6
#3  0x0170a40a in __libc_message () from /lib/tls/libc.so.6
#4  0x01710a08 in _int_free () from /lib/tls/libc.so.6
#5  0x01710fda in free () from /lib/tls/libc.so.6
#6  0x08057108 in main (argc=3, argv=0xbff94f14) at src/mlogger.c:3473
(gdb) 
/**************************************************************
/**************************************************************
/**************************************************************


I am running Midas 2.0.0 and here is a section of my mlogger.c:


/**************************************************************
/**************************************************************
/**************************************************************
/********************************************************************\

  Name:         mlogger.c
  Created by:   Stefan Ritt

  Contents:     MIDAS logger program

  $Id: mlogger.c 3476 2006-12-20 09:00:26Z ritt $

\********************************************************************/

// stuff...

/*------------------------ main ------------------------------------*/

int main(int argc, char *argv[])
{
   INT status, msg, i, size, run_number, ch = 0, state;
   char host_name[HOST_NAME_LENGTH], exp_name[NAME_LENGTH], dir[256];
   BOOL debug, daemon, save_mode;
   DWORD last_time_kb = 0;
   DWORD last_time_stat = 0;
   HNDLE hktemp;

#ifdef HAVE_ROOT
   char **rargv;
   int rargc;

   /* copy first argument */
   rargc = 0;
   rargv = (char **) malloc(sizeof(char *) * 2);
   rargv[rargc] = (char *) malloc(strlen(argv[rargc]) + 1);
   strcpy(rargv[rargc], argv[rargc]);
   rargc++;

   /* append argument "-b" for batch mode without graphics */
   rargv[rargc] = (char *) malloc(3);
   rargv[rargc++] = "-b";

   TApplication theApp("mlogger", &rargc, rargv);

   /* free argument memory */
   free(rargv[0]);
   free(rargv[1]);   // Line: 3473
   free(rargv);

#endif

// etc...

/**************************************************************
/**************************************************************
/**************************************************************


I'll play with it some, but I wanted to post this info first.
  551   13 Jan 2009   Stefan Ritt   Forum   mlogger problem
> Sorry for being vague. I cleaned up the ODB, but it doesn't seem to be the
> problem. Here is a sample run of mlogger and gdb:

Thanks for the info, that explains the problem. It is related to the lines

rargv[rargc] = (char *)malloc(3);
rargv[rargc++] = "-b";

where one first allocates some memory (3 bytes), but then overwrites the pointer with 
another pointer to some static memory ("-b"). The following

free(rargv[1]);

then tries to free the static memory, which fails.
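
One way to avoid this is to copy the string into the allocated buffer instead of replacing the pointer, so that every element handed to free() really came from malloc(). The following is a minimal sketch of that pattern, not necessarily the change that went into SVN. The make_root_argv() and free_root_argv() helpers are hypothetical names introduced for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Build a two-element argument vector: a copy of argv0 plus "-b".
   Every element is heap-allocated, so every element may be passed to free(). */
char **make_root_argv(const char *argv0)
{
   char **rargv = (char **) malloc(sizeof(char *) * 2);
   if (rargv == NULL)
      return NULL;

   rargv[0] = (char *) malloc(strlen(argv0) + 1);
   rargv[1] = (char *) malloc(3);
   if (rargv[0] == NULL || rargv[1] == NULL) {
      free(rargv[0]);   /* free(NULL) is a no-op */
      free(rargv[1]);
      free(rargv);
      return NULL;
   }

   strcpy(rargv[0], argv0);
   strcpy(rargv[1], "-b");   /* copy into the buffer, do not overwrite the pointer */
   return rargv;
}

/* Release everything allocated by make_root_argv(). */
void free_root_argv(char **rargv)
{
   if (rargv == NULL)
      return;
   free(rargv[0]);
   free(rargv[1]);
   free(rargv);
}
```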

The problem was already fixed some time ago, so please update to the current SVN 
revision (see https://midas.psi.ch/download.html for details).
  554   14 Jan 2009   Konstantin Olchanski   Forum   mlogger problem
> The problem was already fixed some time ago, so please update your version from the SVN 
> revision (see https://midas.psi.ch/download.html for details).

I wanted to check out the latest websvn midas repository viewer installed at PSI, so I used the web "annotate/blame" tools 
to trace the fix to this bug down to revision 3660 committed in April 2007. (It turns out that "svn blame" is not very useful 
for tracing *removed* lines, so I ended up doing a manual binary search across different revisions of mlogger.c)

K.O.
  978   11 Mar 2014   Andreas Suter   Forum   mlogger problem
I stumbled over a problem which I cannot pinpoint and would appreciate suggestions.

I set up an experiment, and all of a sudden I noticed the following behaviour.

I can start any number of frontends without any problems as long as mlogger is NOT running.
I can also start mlogger without any problems. However, as soon as I start mlogger, I cannot start anything else any more (including odbedit). I get the following assertion:
16:07:06 [Logger,INFO] Program Logger on host lem00 started
[local:nemu:S]/>q
[nemu@lem00 2014]$ odbedit -e nemu
odbedit: src/odb.c:753: db_update_open_record: Assertion `xkey->notify_count == pkey->notify_count' failed.
Aborted
This happens even if I stop all frontends, start only mlogger and afterwards try to start odbedit.

I tried to see if this is a generic feature on a test experiment, but there I cannot reproduce it. It seems that there is either something wrong with the ODB, something wrong with hotlinks, ..., I don't know.

I would appreciate suggestions on how to pinpoint the issue.
  979   11 Mar 2014   Stefan Ritt   Forum   mlogger problem

Andreas Suter wrote:
> I stumbled over a problem which I cannot pin point and would appreciate suggestions.
> 
> I set up an experiment, and all of a sudden I noticed the following behaviour.
> 
> I can start any number of frontends without any problems as long as mlogger is NOT running.
> I can also start mlogger without any problems. However, as soon as I started the mlogger, I cannot start anything else any more (including odbedit). I get the following assertion:
> 16:07:06 [Logger,INFO] Program Logger on host lem00 started
> [local:nemu:S]/>q
> [nemu@lem00 2014]$ odbedit -e nemu
> odbedit: src/odb.c:753: db_update_open_record: Assertion `xkey->notify_count == pkey->notify_count' failed.
> Aborted
> This is even happening if I stop all frontends, start only the mlogger and afterwards try to start odbedit.
> 
> I tried to see if this is a generic feature on a test experiment, but there I cannot reproduce it. It seems that there is either something wrong with the ODB, something wrong with hotlinks, ..., I don't know.
> 
> I would appreciated suggestions how pin point the issue.


K.O. put that in: https://bitbucket.org/tmidas/midas/commits/9d7b7c83b275a2bd3c846c4f265ff7f5d53f3426

He should have a look at it.

Have you tried to rebuild your ODB from scratch? (Save in XML, then delete .ODB.SHM, then load again from XML.)

/Stefan
  980   11 Mar 2014   Andreas Suter   Forum   mlogger problem

Stefan Ritt wrote:
> Andreas Suter wrote:
> > I stumbled over a problem which I cannot pin point and would appreciate suggestions.
> > 
> > I set up an experiment, and all of a sudden I noticed the following behaviour.
> > 
> > I can start any number of frontends without any problems as long as mlogger is NOT running.
> > I can also start mlogger without any problems. However, as soon as I started the mlogger, I cannot start anything else any more (including odbedit). I get the following assertion:
> > 16:07:06 [Logger,INFO] Program Logger on host lem00 started
> > [local:nemu:S]/>q
> > [nemu@lem00 2014]$ odbedit -e nemu
> > odbedit: src/odb.c:753: db_update_open_record: Assertion `xkey->notify_count == pkey->notify_count' failed.
> > Aborted
> > This is even happening if I stop all frontends, start only the mlogger and afterwards try to start odbedit.
> > 
> > I tried to see if this is a generic feature on a test experiment, but there I cannot reproduce it. It seems that there is either something wrong with the ODB, something wrong with hotlinks, ..., I don't know.
> > 
> > I would appreciated suggestions how pin point the issue.
> 
> K.O. put that in: https://bitbucket.org/tmidas/midas/commits/9d7b7c83b275a2bd3c846c4f265ff7f5d53f3426
> 
> He should have a look at it.
> 
> Have you tried to rebuild your ODB from scratch? (Save in XML, then delete .ODB.SHM, then load again form XML)?
> 
> /Stefan

Yes, I could recover the ODB by falling back to a previous dump. Still, I would like to know the exact meaning of the above assertion. It might help to understand the likely causes of the assertion.

/Andreas