Back Midas Rome Roody Rootana
  Midas DAQ System, Page 136 of 152  Not logged in ELOG logo
ID Date Author Topicdown Subject
  3051   07 Jun 2025 Mark GrimesBug ReportMemory leak in mhttpd binary RPC code

Hi,

We applied an intermediate fix for this locally and it seems to have fixed our issue.  The attached plot shows the percentage memory use on our machine with 128 Gb memory, as a rough proxy for mhttpd memory use.  After applying our fix mhttpd seems to be happy using ~7% of the memory after being up for 2.5 days.

Our fix to mjson was:

diff --git a/mjson.cxx b/mjson.cxx

index 17ee268..2443510 100644

--- a/mjson.cxx

+++ b/mjson.cxx

@@ -654,8 +654,7 @@ MJsonNode::~MJsonNode() // dtor

       delete subnodes[i];

    subnodes.clear();

 

-   if (arraybuffer_size > 0) {

-      assert(arraybuffer_ptr != NULL);

+   if (arraybuffer_ptr != NULL) {

       free(arraybuffer_ptr);

       arraybuffer_size = 0;

       arraybuffer_ptr = NULL;

We also applied the following in midas for good measure, although I don't think it contributed to the leak we were seeing:

diff --git a/src/mjsonrpc.cxx b/src/mjsonrpc.cxx

index 2201d228..38f0b99b 100644

--- a/src/mjsonrpc.cxx

+++ b/src/mjsonrpc.cxx

@@ -3454,6 +3454,7 @@ static MJsonNode* brpc(const MJsonNode* params)

    status = cm_connect_client(name.c_str(), &hconn);

 

    if (status != RPC_SUCCESS) {

+      free(buf);

       return mjsonrpc_make_result("status", MJsonNode::MakeInt(status));

    }

I hope this is useful to someone.  As previously mentioned we make heavy use of binary RPC, so maybe other experiments don't run into the same problem.

Thanks,

Mark.

  3054   10 Jun 2025 Nik BergerBug ReportHistory variables with leading spaces
By accident we had history variables with leading spaces. The history schema check then decides that this is a new variable (the leading space is not read from the history file) and starts a new file. We found this because the run start became slow due to the many, many history files created. It would be nice to just get an error if one has a malformed variable name like this.

How to reproduce: Try to put a variable with a leading space in the name into the history, repeatedly start runs.
Sugested fix: Produce an error if a history variable has a leading space.
  3055   10 Jun 2025 Konstantin OlchanskiBug ReportMemory leak in mhttpd binary RPC code
I confirm that MJSON_ARRAYBUFFER does not work correctly for zero-size buffers, 
buffer is leaked in the destructor and copied as NULL in MJsonNode::Copy().

I also confirm memory leak in mjsonrpc "brpc" error path (already fixed).

Affected by the MJSON_ARRAYBUFFER memory leak are "brpc" (where user code returns 
a zero-size data buffer) and "js_read_binary_file" (if reading from an empty 
file, return of "new char[0]" is never freed).

"receive_event" and "read_history" RPCs never use zero-size buffers and are not 
affected by this bug.

mjson commit c798c1f0a835f6cea3e505a87bbb4a12b701196c
midas commit 576f2216ba2575b8857070ce7397210555f864e5
rootana commit a0d9bb4d8459f1528f0882bced9f2ab778580295

Please post bug reports a plain-text so I can quote from them.

K.O.
  3058   15 Jun 2025 Mark GrimesBug ReportMemory leak in mhttpd binary RPC code
Many thanks for the fix.  We've applied and see better memory performance.  We still have to kill and restart 
mhttpd after a few days however.  I think the official fix is missing this part:

diff --git a/src/mjsonrpc.cxx b/src/mjsonrpc.cxx
index 2201d228..38f0b99b 100644
--- a/src/mjsonrpc.cxx
+++ b/src/mjsonrpc.cxx
@@ -3454,6 +3454,7 @@ static MJsonNode* brpc(const MJsonNode* params)
    status = cm_connect_client(name.c_str(), &hconn);
 
    if (status != RPC_SUCCESS) {
+      free(buf);
       return mjsonrpc_make_result("status", MJsonNode::MakeInt(status));
    }

When the other process returns a failure the memory block is also currently leaked.  I originally stated "...although I 
don't think it contributed to the leak we were seeing" but it seems this was false.

Thanks,

Mark.


> I confirm that MJSON_ARRAYBUFFER does not work correctly for zero-size buffers, 
> buffer is leaked in the destructor and copied as NULL in MJsonNode::Copy().
> 
> I also confirm memory leak in mjsonrpc "brpc" error path (already fixed).
> 
> Affected by the MJSON_ARRAYBUFFER memory leak are "brpc" (where user code returns 
> a zero-size data buffer) and "js_read_binary_file" (if reading from an empty 
> file, return of "new char[0]" is never freed).
> 
> "receive_event" and "read_history" RPCs never use zero-size buffers and are not 
> affected by this bug.
> 
> mjson commit c798c1f0a835f6cea3e505a87bbb4a12b701196c
> midas commit 576f2216ba2575b8857070ce7397210555f864e5
> rootana commit a0d9bb4d8459f1528f0882bced9f2ab778580295
> 
> Please post bug reports a plain-text so I can quote from them.
> 
> K.O.
  3059   19 Jun 2025 Frederik WautersBug Reportadd history variables
I have encounter this a few times
* Make a new history panel
* Use the web GUI to add history variables
* When I am at the "add history variables" panel, there is not scroll option. So 
depending on the size and zoom of my screen, some variables further down the list 
can not be selected

tried Chrome and Firefox
  3060   19 Jun 2025 Stefan RittBug ReportHistory variables with leading spaces
I added now code to the logger so it properly complains if there would be a leading space in a variable name.

Stefan

> By accident we had history variables with leading spaces. The history schema check then decides that this is a new variable (the leading space is not read from the history file) and starts a new file. We found this because the run start became slow due to the many, many history files created. It would be nice to just get an error if one has a malformed variable name like this.
> 
> How to reproduce: Try to put a variable with a leading space in the name into the history, repeatedly start runs.
> Sugested fix: Produce an error if a history variable has a leading space.
  3061   23 Jun 2025 Stefan RittBug ReportMemory leak in mhttpd binary RPC code
Since this memory leak is quite obvious, I pushed the fix to develop.

Stefan
  3062   04 Jul 2025 Mark GrimesBug ReportMemory leaks in mhttpd
Something changed in our system and we started seeing memory leaks in mhttpd again.  I guess someone 
updated some front end or custom page code that interacted with mhttpd differently.
I found a few memory leaks in some (presumably) rarely seen corner cases and we now see steady 
memory usage.  The branch is fix/memory_leaks 
(https://bitbucket.org/tmidas/midas/branch/fix/memory_leaks) and I opened pull request #55 
(https://bitbucket.org/tmidas/midas/pull-requests/55).  I couldn't find a BitBucket account for you 
Konstantin to add as a reviewer, so it currently has none.

Thanks,

Mark.
  3064   21 Jul 2025 Stefan RittBug ReportDefault write cache size for new equipments breaks compatibility with older equipments
> Perhaps have:
> 
> set_write_cache_size("SYSTEM", 0);
> set_write_cache_size("BUF1", bigsize);
> 
> with an internal std::map<std::string,size_t>; for write cache size for each named buffer

Ok, this is implemented now in mfed.cxx and called from examples/experiment/frontend.cxx

Stefan
  170   22 Oct 2004 Konstantin OlchanskiBug Fixmhttpd message colouring
I commited a fix to mhttpd logic that decides which messages should be shown in
"red" colour- before, any message with square brackets and colons would be
highlighted in red. Now only messages matching the pattern [...:...] are
highlighted. The decision logic was moved into a function message_red(). K.O.
  174   09 Nov 2004 Pierre-Andre AmaudruzBug FixNew transition scheme
Problem:
If cm_set_transition_sequence() is used for changing the sequence number, the 
command odbedit> start/stop/resume/pause -v report the propre sequence but the
action on the client side is actually not performed!

Fix:
Local transition table updated in midas.c (1.226)

Note:
The transition number under /system/clients/<pid>/transition...
is used internally. Changing it won't have any effect on the client action
if sequence number is not registered.
  200   25 Feb 2005 Konstantin OlchanskiBug Fixfixed: double free in FORMAT_MIDAS ybos.c causing lazylogger crashes
We stumbled upon and fixed a "double free" bug in src/ybos.c causing crashes in
lazylogger writing .mid files in the FORMAT_MIDAS format (why does it use
ybos.c? Pierre says- for generic file i/o). Why this code had ever worked before
remains a mystery. K.O.
  211   05 May 2005 Konstantin OlchanskiBug Fixfix: minor bit rot in the example experiment
I fixed some minor bit rot in the example experiment: a few minor Makefile
problems, make the analyzer use the current histogram creation macros, etc. I
also added startup and shutdown scripts. These will be documented as we work
through them with our Summer student. K.O.
  212   02 Aug 2005 Konstantin OlchanskiBug Fixfix odb corruption when running analzer for the first time
I have been plagued by ODB corruption when I run the analyzer for the first time
after setting up the new experiment. Some time ago, I traced this to
mana.c::book_ttree() and now I found and fixed the bug, fix now commited to
midas cvs. In book_ttree(), db_find("/Analyzer/Bank switches") was returning an
error and setting hkey to zero. Then we called db_open_record() with hkey==0,
which cased ODB corruption later on. The normal db_validate_hkey() did not catch
this because it considers hkey==0 to be valid (when most likely it is not). K.O.
  216   18 Aug 2005 Konstantin OlchanskiBug Fixfix race condition between clients on run start/stop, pause/resume
It turns out that the new priority sequencing of run state transitions had a
flaw: the frontends, the analyzer and the logger all registered at priority 500
and were invoked in essentially a random order. For example the frontend could
get a begin-run transition before the logger and so start sending data before
the logger opened the output file. Same for the analyzer and same for the end of
run. Also the sequencing for pause/resume run and begin/end run was different
when the two pairs ought to have identical sequencing.

I now commited changes to mana.c and mlogger.c changing their transition sequencing:

start and resume:
200 - logger (mlogger.c, no change)
300 - analyzer (mana.c, was 500)
500 - frontends (mfe.c, no change)

stop and pause:
500 - frontends (mfe.c, no change)
700 - analyzer (mana.c, was 500)
800 - mlogger (mlogger.c, was 500)

P.S. However, even after this change, the TRIUMF ISAC/Dragon experiment still
see an anomaly in the analyzer, where it receives data events after the
end-of-run transition.

K.O.
  219   01 Sep 2005 Stefan RittBug Fixfix race condition between clients on run start/stop, pause/resume
> It turns out that the new priority sequencing of run state transitions had a
> flaw: the frontends, the analyzer and the logger all registered at priority 500
> and were invoked in essentially a random order. For example the frontend could
> get a begin-run transition before the logger and so start sending data before
> the logger opened the output file. Same for the analyzer and same for the end of
> run. Also the sequencing for pause/resume run and begin/end run was different
> when the two pairs ought to have identical sequencing.
> 
> I now commited changes to mana.c and mlogger.c changing their transition sequencing:
> 
> start and resume:
> 200 - logger (mlogger.c, no change)
> 300 - analyzer (mana.c, was 500)
> 500 - frontends (mfe.c, no change)
> 
> stop and pause:
> 500 - frontends (mfe.c, no change)
> 700 - analyzer (mana.c, was 500)
> 800 - mlogger (mlogger.c, was 500)
> 
> P.S. However, even after this change, the TRIUMF ISAC/Dragon experiment still
> see an anomaly in the analyzer, where it receives data events after the
> end-of-run transition.
> 
> K.O.

Thanks for fixing that bug. It happend because during the implementatoin of the priority
sequencing we have up the pre/post tansition, which took care of the proper sequence
between the logger, frontend and analyzer. The way you modified the sequence is
absolutely correct. It is important to have >10 numbers "around" the frontends (like
450...550) in case one has an experiment with >10 frontends which need to make a
transition in a certain sequence (like the DANCE experiment in Los Alamos).
  229   17 Oct 2005 Exaos LeeBug Fix"make install" error under MacOS X
Under MacOS X, "make install" will cours an error like this:
...
install: darwin/bin/dio: No such file or directory
make: *** [install] Error 71

This can be fixed as the following diff:
404,405c404,405
< $(BIN_DIR)/mcnaf: $(UTL_DIR)/mcnaf.c $(DRV_DIR)/camac/camacrpc.c
<       $(CC) $(CFLAGS) $(OSFLAGS) -o $@ $(UTL_DIR)/mcnaf.c $(DRV_DIR)/camac/camacrpc.c $(LIB) $(LIBS)
---
> $(BIN_DIR)/mcnaf: $(UTL_DIR)/mcnaf.c $(DRV_DIR)/bus/camacrpc.c
>       $(CC) $(CFLAGS) $(OSFLAGS) -o $@ $(UTL_DIR)/mcnaf.c $(DRV_DIR)/bus/camacrpc.c $(LIB) $(LIBS)
438c438,439
<       @for i in mserver mhttpd odbedit mlogger ; \
---
> 
>       @for i in mserver mhttpd odbedit mlogger dio ; \
444,447d444
<       chmod +s $(SYSBIN_DIR)/mhttpd
< 
< ifeq ($(OSTYPE),linux)
<       install -v -m 755 $(BIN_DIR)/dio $(SYSBIN_DIR)
449c446
< endif
---
>       chmod +s $(SYSBIN_DIR)/mhttpd
  235   23 Nov 2005 Stefan RittBug FixEndian swapping in mana.c
It was reported that following code in mana.c :
  /* swap event header if in wrong format */
  if (pevent->serial_number > 0x1000000) {
     WORD_SWAP(&pevent->event_id);
     WORD_SWAP(&pevent->trigger_mask);
     DWORD_SWAP(&pevent->serial_number);
     DWORD_SWAP(&pevent->time_stamp);
     DWORD_SWAP(&pevent->data_size);
  }

does not work correctly for events having a true serial number above 16777216 (=0x10000000). After some considerations, I concluded that there is no good way to determine automatically the endian format of midas events, without adding another field in the header, which would break the compatibility with all recorded data up to date. I therefore changed the above code to
  /* swap event header if in wrong format */
#ifdef SWAP_EVENTS
  WORD_SWAP(&pevent->event_id);
  WORD_SWAP(&pevent->trigger_mask);
  DWORD_SWAP(&pevent->serial_number);
  DWORD_SWAP(&pevent->time_stamp);
  DWORD_SWAP(&pevent->data_size);
#endif

So if one wants to analyze events with the midas analyzer on a PC system for example where the events come from a VxWorks system with the opposite endian encoding, one has to set the flag -DSWAP_EVENTS when compiling the analyzer for that type of analysis.
  252   07 May 2006 Konstantin OlchanskiBug FixUpdate & add VME drivers
I commited fixes for a few minor compilation errors in the VME drivers
(vmicvme.c, etc)
I also added new drivers for the v513 latch and v560 scaler that I wrote for
CERN-ALPHA.

(Maybe I should mention that we also have drivers for the SIS 3820 multiscaler,
the v895 VME discriminator and a few more modules. Will commit them as they mature).

K.O.
  256   18 May 2006 Konstantin OlchanskiBug Fixremoved a few "//" comments to fix compilation on VxWorks
Our VxWorks C compiler (gcc-2.8-something) does not like the "//" comments. Luckily, on VxWorks, we 
only compile a small subset of midas, so there is no point in banning all "//" comments. But I did have to 
convert a couple of them to /* commens */ in odb.c to make it compile. Changes to odb.c commited. K.O.
ELOG V3.1.4-2e1708b5