Back Midas Rome Roody Rootana
  Midas DAQ System, Page 2 of 145  Not logged in ELOG logo
ID Datedown Author Topic Subject
  2905   26 Nov 2024 Maia Henriksson-WardBug ReportTMFE::Sleep() errors
> Hello,
> 
> I've noticed that SC FEs that use the TMFE class with midas-2022-05-c often report errors when calling TMFE:Sleep().
> The error is :
> 
> [tmfe.cxx:1033:TMFE::Sleep,ERROR] select() returned -1, errno 22 (Invalid argument).
> 
> This seems to happen in two different ways:
> 
> 1. Error being reported repeatedly
> 2. Occasional single errors being reported
> 
> When the first of these presents, we typically restart the FE to "solve" the problem.
> Case 2. is typically ignored.
> 
> The code in question is:
> 
> void TMFE::Sleep(double time)
> {
>    int status;
>    fd_set fdset;
>    struct timeval timeout;
>       
>    FD_ZERO(&fdset);
>       
>    timeout.tv_sec = time;
>    timeout.tv_usec = (time-timeout.tv_sec)*1000000.0;
> 
>    while (1) {
>       status = select(1, &fdset, NULL, NULL, &timeout);
> #ifdef EINTR
>       if (status < 0 && errno == EINTR) {
>          continue;
>       }
> #endif
>       break;
>    }
>       
>    if (status < 0) {
>       TMFE::Instance()->Msg(MERROR, "TMFE::Sleep", "select() returned %d, errno %d (%s)", status, errno, strerror(errno));
>    }
> }
> 
> So it looks like either file descriptor of the timeval struct must have a problem.
> From some reading it seems that invalid timeval structs are often caused by one or both
> of tv_sec or tv_usec not being set. In the code above we can see that both appear to be
> correctly set initially.
> 
> From the select() man page I see:
> 
> RETURN VALUE
>        On success, select() and pselect() return the number of file descriptors contained in
>        the three returned descriptor sets (that is, the total number of bits that are set in
>        readfds,  writefds,  exceptfds).  The return value may be zero if the timeout expired
>        before any file descriptors became ready.
> 
>        On error, -1 is returned, and errno is set to indicate the error; the file descriptor
>        sets are unmodified, and timeout becomes undefined.
> 
> The second paragraph quoted from the man page above would indicate to me that perhaps the
> timeout needs to be reset inside the if block. eg:
> 
>       if (status < 0 && errno == EINTR) {
>          timeout.tv_sec = time;
>          timeout.tv_usec = (time-timeout.tv_sec)*1000000.0;
>          continue;
>       }
> 
> Please note that I've only just briefly looked at this and was hoping someone more
> familiar with using select() as a way to sleep() might be better able to understand
> what is happening.
> 
> I wonder also if now that midas requires stricter/newer c++ standards if there maybe
> some more straightforward method to sleep that is sufficiently robust and portable.
> 
> Thanks,
> 
> Nick.

I had the same error a few months ago, though I wasn't using a tagged release. It happened because I was calling TMFE::Sleep() 
with a negative time. If your issues were caused by the same reason, TMFE::Sleep() can handle negative times since commit 
591f78f (https://bitbucket.org/tmidas/midas/commits/591f78f52893d5ffd64bf4e52a1daac537ebd672).

Early in my debugging, I did come to the same conclusions you did, and actually tried a similar solution the one you suggested. 
This was a few months ago and I didn't write down what happened, but I believe it didn't work because in my case the errno was 
something other than EINTR, and/or the timeval was still an invalid argument for sleep because the timeout was still negative. I 
never followed it up because I was able to fix my problem by fixing my frontend.
  2904   26 Nov 2024 Nick HastingsBug ReportTMFE::Sleep() errors
Hello,

I've noticed that SC FEs that use the TMFE class with midas-2022-05-c often report errors when calling TMFE:Sleep().
The error is :

[tmfe.cxx:1033:TMFE::Sleep,ERROR] select() returned -1, errno 22 (Invalid argument).

This seems to happen in two different ways:

1. Error being reported repeatedly
2. Occasional single errors being reported

When the first of these presents, we typically restart the FE to "solve" the problem.
Case 2. is typically ignored.

The code in question is:

void TMFE::Sleep(double time)
{
   int status;
   fd_set fdset;
   struct timeval timeout;
      
   FD_ZERO(&fdset);
      
   timeout.tv_sec = time;
   timeout.tv_usec = (time-timeout.tv_sec)*1000000.0;

   while (1) {
      status = select(1, &fdset, NULL, NULL, &timeout);
#ifdef EINTR
      if (status < 0 && errno == EINTR) {
         continue;
      }
#endif
      break;
   }
      
   if (status < 0) {
      TMFE::Instance()->Msg(MERROR, "TMFE::Sleep", "select() returned %d, errno %d (%s)", status, errno, strerror(errno));
   }
}

So it looks like either file descriptor of the timeval struct must have a problem.
From some reading it seems that invalid timeval structs are often caused by one or both
of tv_sec or tv_usec not being set. In the code above we can see that both appear to be
correctly set initially.

From the select() man page I see:

RETURN VALUE
       On success, select() and pselect() return the number of file descriptors contained in
       the three returned descriptor sets (that is, the total number of bits that are set in
       readfds,  writefds,  exceptfds).  The return value may be zero if the timeout expired
       before any file descriptors became ready.

       On error, -1 is returned, and errno is set to indicate the error; the file descriptor
       sets are unmodified, and timeout becomes undefined.

The second paragraph quoted from the man page above would indicate to me that perhaps the
timeout needs to be reset inside the if block. eg:

      if (status < 0 && errno == EINTR) {
         timeout.tv_sec = time;
         timeout.tv_usec = (time-timeout.tv_sec)*1000000.0;
         continue;
      }

Please note that I've only just briefly looked at this and was hoping someone more
familiar with using select() as a way to sleep() might be better able to understand
what is happening.

I wonder also if now that midas requires stricter/newer c++ standards if there maybe
some more straightforward method to sleep that is sufficiently robust and portable.

Thanks,

Nick.
  2903   24 Nov 2024 Pavel MuratBug ReportODB lock timeout, Difficulty running MIDAS on Rocky 9.4
there is a really good software tool developed by the Fermilab DAQ group, called TRACE - 

https://github.com/art-daq/trace ,

It could be useful for debugging cases like this one. In short, TRACE instruments the code 
with the printouts which could be selectively turned on and off without recompiling the executable. 

TRACE output could go to /dev/stdout (slow output) and/or to a circular buffer implemented via a shared 
memory segment (fast output). Sending unlimited output to the shared memory segment is extremely useful.

TRACE also allows to trigger on certain conditions, again, w/o recompiling the executable. 
For debugging cases like the one in question, that could turn out even more useful, 
however I didn't try the triggering functionality myself. 

-- regards, Pasha
  2902   22 Nov 2024 Konstantin OlchanskiBug ReportODB lock timeout, Difficulty running MIDAS on Rocky 9.4
> try to replicate the issue ...

I see ODB lock timeout (and abort() of everything) in the dsvslice test station. We have 
about 15-20 MIDAS clients connected.

I am pretty sure we have not seen this problem until recently (and I have not seen it 
personally for a very long time). There were no changes to the MIDAS ODB locking code in a 
long time.

I suspect a recent change in the linux kernel. But I am likely to be wrong.

I have 1000 core dumps from this crash of dsvslice, and among them should be the 1 thread
that has ODB locked. Wish me luck finding it. Worst case is to discover that ODB is locked 
but nobody is holding a lock ("missing unlock bug"). This is hard to debug, I would have add 
tracking of "who was the last one to lock it, who forgot to unlock it".

K.O.
  2901   21 Nov 2024 Stefan RittInfoWhat do the status numbers mean and where can I find more information about them?
> [RP Streaming Frontend,ERROR] [midas.cxx:17806:cm_write_event_to_odb,ERROR] 
> cannot create key for bank "DATA" with tid 24 in ODB, db_create_key() status 309
> 
> 
> 
> I just need more information on what the error message means. Which data type 
> refers to tid 24 and what does status 309 indicate?? 

A tid (type identification) of 24 does actually not exist. See midas.h:327, so this tells
me that your bank header got corrupted. Somewhere you write over your data.

Stefan
  2900   21 Nov 2024 Mann GandhiInfoWhat do the status numbers mean and where can I find more information about them?
Hello, 

This is the error message I got:

[RP Streaming Frontend,ERROR] [midas.cxx:17806:cm_write_event_to_odb,ERROR] 
cannot create key for bank "DATA" with tid 24 in ODB, db_create_key() status 309
read_periodic_event: No data in ring buffer or error occurred

[RP Streaming Frontend,ERROR] [odb.cxx:3373:db_create_key,ERROR] invalid key type 
24 to create 'DATA' in '/Equipment/Periodic/Variables'

[RP Streaming Frontend,ERROR] [midas.cxx:17806:cm_write_event_to_odb,ERROR] 
cannot create key for bank "DATA" with tid 24 in ODB, db_create_key() status 309



I just need more information on what the error message means. Which data type 
refers to tid 24 and what does status 309 indicate?? 

There is definitely data in the ring buffer but I keep on getting this error.

Thank you!

M.G 
  2899   18 Nov 2024 Lukas GerritzenSuggestionComma-separated indices in alarm conditions
I have the following use case: I would like to check if two elements of an array exceed a certain threshold. 
However, they are not consecutive. Currently, I have to write two alarms, one checking Array[8] and one 
checking Array [10].

It would be nice if we could enter conditions such as "/Path/To/Array[8,10] > 0.5".

I looked into the code of al_evaluate_condition() and it seems very C-style. I know that you have been 
refactoring a lot of code to work with STL strings and their functions. If you find the time to refactor 
alarm.cxx, I ask that you consider adding comma-separated lists as a new feature.

Cheers
Lukas
  2898   18 Nov 2024 Stefan RittInfoNew sequencer command ODBLOOKUP
> "value not found" sets "index" to ?

It sets it actually to "not found". Since all variables are stings in the sequencer, you can then do a test like

ODBLOOKUP ..., index
if ($index == "not found")
  ...


> "odb key not found" sets "index" to ?

If the odb key is not found, the sequencer aborts.

> link to documentation?

The documentation is where it always has been: 

https://daq00.triumf.ca/MidasWiki/index.php/Sequencer#Sequencer_Commands

/Stefan
  2897   15 Nov 2024 Konstantin OlchanskiInfoNew sequencer command ODBLOOKUP
> A new sequencer command "ODBLOOKUP" has been implemented, which does a lookup of a string in a string 
> array in the ODB given by a path and returns its index as a number. If we have for example an array
> 
> /Examples/Names
>    [0] Hello
>    [1] Test
>    [2] Other
> 
> and do a
>  
> ODBLOOKUP "/Examples/Names", "Test", index
> 
> we get a index equal 1.
> 

"value not found" sets "index" to ?
"odb key not found" sets "index" to ?
link to documentation?

K.O.
  2896   15 Nov 2024 Konstantin OlchanskiSuggestionIssue with creating banks
> Hello, I am a coop student working at SNOLAB.
> void* data_acquisition_thread(void* param)
> {
> 	EVENT_HEADER *pevent;
>       if (complicated) {
> 			status = rb_get_wp(rbh, (void **) &pevent, 0);
>       }
>       bm_compose_event_threadsafe(pevent, 1, 0, 0, &equipment[0].serial_number);
> }

this code is buggy. it should read "EVENT_HEADER *pevent = NULL;" to avoid an uninitialized variable
and bm_compose_event() & co should be inside an "if (pevent != NULL)" block, unless you can absolutely
proove that rb_get_wp() is always called and pevent is never NULL. (even is somebody changes the code later).

if you build your code with "gcc -O2 -g -Wall -Wuninitialized" it would probably warn you about use of uninitilialized 
"pevent".

P.S. for building multithreaded frontends, you are much better off starting from the c++ tmfe frontend framework,
a good starting point is study tmfe_example_everything.cxx.

K.O.
  2895   14 Nov 2024 Mann GandhiSuggestionIssue with creating banks
> All I can see is that your bank header gets corrupted along the way. The funny character reported by 
> cm_write_event_to_odb indicates that your original name "RPD0" got overwritten somewhere, but I could not spot any 
> mistake in your code. 
> 
> I would play around: change max_event_size, produce dummy data of size N instead of the recv() and so on. Also monitor 
> the bank header to see when it gets overwritten. I guess you only write form one thread, so that should be safe, right?
> 
> Best,
> Stefan

Hello Stefan, 

Thank you for the advice. On inspection, I noticed that my event size (when I print bk_size(pevent)) is around 1.4 billion 
which seems absurd so I am not sure why this is the case as well. In addition, is mdump the way to monitor the bank header?
I just recently started using MIDAS so I am a little bit confused. I can attach a link to the github repository where I am 
currently working on this for further clarity since I am sure there is an issue in my code somewhere. 
(https://github.com/mgandhi-1/red-pitaya-frontend/blob/10-issue-with-bank-creation-neeed-to-figure-out-why-banks-are-not-
being-created-correctly/frontend.cxx)

I appreciate the help. Thank you once more.

Best, 
Mann
  2894   14 Nov 2024 Stefan RittSuggestionIssue with creating banks
All I can see is that your bank header gets corrupted along the way. The funny character reported by 
cm_write_event_to_odb indicates that your original name "RPD0" got overwritten somewhere, but I could not spot any 
mistake in your code. 

I would play around: change max_event_size, produce dummy data of size N instead of the recv() and so on. Also monitor 
the bank header to see when it gets overwritten. I guess you only write form one thread, so that should be safe, right?

Best,
Stefan
  2893   14 Nov 2024 Mann GandhiSuggestionIssue with creating banks
Hello, I am a coop student working at SNOLAB. I am currently setting up a frontend 
program to collect data for an experiment I am currently having with my bank being 
initialized correctly with the correct name. I will attach an image of the error and 
a code snippet for clarity. This is a multi-thread program using ring buffers. The 
first thread  is only responsible for data collection of ADC values from the Red 
Pitaya (FPGA) and the second thread does a simple derivative calculation. The 
frontend makes use of the TCP connection to stream data from the Red Pitaya. 

Here is the code snippet. This is the only place in the frontend code where I 
initialize and create a bank to store the ADC values from the Red Pitaya. 

void* data_acquisition_thread(void* param)
{
	printf("Data acquisition thread started\n");
	// Obtain ring buffer for inter-thread data exchange
	EVENT_HEADER *pevent;
	WORD *pdata;
	int status;

	//Set a timeout for the recv function to prevent indefinite blocking
	struct timeval timeout;
	timeout.tv_sec = 10; //seconds
	timeout.tv_usec = 0; // 0 microseconds
	setsockopt(stream_sockfd, SOL_SOCKET, SO_RCVTIMEO, (char *)&timeout, 
sizeof(timeout));



	while (is_readout_thread_enabled())
	{

		if (!readout_enabled())
		{
			usleep(10); // do not produce events when run is stopped
			continue;
		}
		// Acquire a write pointer in the ring buffer
		int status;
		do {
			status = rb_get_wp(rbh, (void **) &pevent, 0);
			if (status == DB_TIMEOUT)
			{
				usleep(5);
				if (!is_readout_thread_enabled()) break;
			}
		} while (status != DB_SUCCESS);

		if (status != DB_SUCCESS) continue;

		// Lock mutex before accessing shared resources
		pthread_mutex_lock(&lock);

		// Buffer for incoming data
		//int16_t temp_buffer[4096] = {0};

		bm_compose_event_threadsafe(pevent, 1, 0, 0, 
&equipment[0].serial_number);
        pdata = (WORD *)(pevent + 1);  // Set pdata to point to the data section of 
the event

		// Initialize the bank and read data directly into the bank
        bk_init32(pevent);
        bk_create(pevent, "RPD0", TID_WORD, (void **)&pdata);

		int bytes_read = recv(stream_sockfd, pdata, max_event_size * 
sizeof(WORD), 0);
		printf("Data received: %d bytes\n", bytes_read);


		if (bytes_read <= 0)
		{
			if (bytes_read == 0)
			{
				printf("Red Pitaya disconnected\n");
				pthread_mutex_unlock(&lock);
				break;

			} else if (errno == EWOULDBLOCK || errno ==EAGAIN)
			{
				printf("Receive timeout\n");
				pthread_mutex_unlock(&lock);
				continue;
			}

			else
			{
				printf("Error reading from the Red Pitaya: %s\n", 
strerror(errno));
				pthread_mutex_unlock(&lock);
				continue;
			}

		}
		
		 // Adjust data pointers after reading
        pdata += bytes_read / sizeof(WORD);
        bk_close(pevent, pdata);

        pevent->data_size = bk_size(pevent);

		// Unlock mutex after writing to the buffer
		pthread_mutex_unlock(&lock);

		// Send event to ring buffer
		rb_increment_wp(rbh, sizeof(EVENT_HEADER) + pevent->data_size);
	}
	pthread_mutex_unlock(&lock);

	return NULL;
}
Attachment 1: Screenshot_from_2024-11-14_12-35-06.png
Screenshot_from_2024-11-14_12-35-06.png
  2892   13 Nov 2024 Stefan RittInfoNew sequencer command ODBLOOKUP
A new sequencer command "ODBLOOKUP" has been implemented, which does a lookup of a string in a string 
array in the ODB given by a path and returns its index as a number. If we have for example an array

/Examples/Names
   [0] Hello
   [1] Test
   [2] Other

and do a
 
ODBLOOKUP "/Examples/Names", "Test", index

we get a index equal 1.


/Stefan
  2891   07 Nov 2024 Stefan RittSuggestionStop run and sequencer button
I don't find this very useful. Some experiments do not only want to stop the run, but also do other cleanup things. To do that, I proposed and "atexit" function like C has it. Then the user can put a run stop there, plus any other cleanup. This will be much more flexible. Think about the "reset" script we have to manually run if we abort a sequencer. The atexit function will come next week, so you should consider to use it instead your additional button.

Stefan
  2890   07 Nov 2024 Lukas GerritzenSuggestionStop run and sequencer button
Due to popular demand among our students, I added a button to the sequencer that stops the run and the sequence. If you find it useful, please consider merging this upstream.
$ git diff sequencer.html
diff --git a/resources/sequencer.html b/resources/sequencer.html
index e7f8a79d..95c7e3d8 100644
--- a/resources/sequencer.html
+++ b/resources/sequencer.html
@@ -115,6 +115,7 @@
               <img src="icons/play.svg" title="Start" class="seqbtn Stopped" onclick="startSeq();">
               <img src="icons/debug.svg" title="Debug" class="seqbtn Stopped" onclick="debugSeq();">
               <img src="icons/square.svg" title="Stop" class="seqbtn Running Paused" onclick="stopSeq();">
+              <img src="icons/x-octagon.svg" title="Stop Run and Sequencer immediately" class="seqbtn Running Paused" onclick="stopRunAndSeq();">
               <img src="icons/pause.svg" title="Pause" class="seqbtn Running" onclick="modbset('/Sequencer/Command/Pause script',true);">
               <img src="icons/resume.svg" title="Resume" class="seqbtn Paused" onclick="modbset('/Sequencer/Command/Resume script',true);">
               <img src="icons/step-over.svg" title="Step Over" class="seqbtn Running Paused" onclick="modbset('/Sequencer/Command/Step over',true);">
[gac-megj@pc13513 resources]$ git diff sequencer.js
diff --git a/resources/sequencer.js b/resources/sequencer.js
index cc5398ef..b75c926c 100644
--- a/resources/sequencer.js
+++ b/resources/sequencer.js
@@ -1582,6 +1582,23 @@ function stopSeq() {
    });
 }

+function stopRunAndSeq() {
+   const message = `Are you sure you want to stop the run and sequence?`;
+   dlgConfirm(message,function(resp) {
+      if (resp) {
+         modbset('/Sequencer/Command/Stop immediately',true);
+
+         mjsonrpc_call("cm_transition", {"transition": "TR_STOP"}).then(function (rpc) {
+            if (rpc.result.status !== 1) {
+               throw new Error("Cannot stop run, cm_transition() status " + rpc.result.status + ", see MIDAS messages");
+            }
+         }).catch(function (error) {
+            mjsonrpc_error_alert(error);
+         });
+      }
+   });
+}
+
 // Show or hide parameters table
 function showParTable(varContainer) {
    let e = document.getElementById(varContainer);
  2889   06 Nov 2024 Amy RobertsBug ReportDifficulty running MIDAS on Rocky 9.4
After following Konstantin's debugging suggestions, I thought I would try to replicate 
the issue on my own computer.  My hope was that I could provide instructions for 
replicating the bug so that the MIDAS team could try debugging things more easily.

However, when I ran the current version of MIDAS in a Rocky 9.4 VM on my laptop (both 
VMWare and VirtualBox), mserver and odbedit ran just fine (!).

I'm currently trying to find out if there's a way to compare the VMs on my machine and 
the machine that's being problematic, I'll report back if I learn anything.
  2886   05 Nov 2024 Maia Henriksson-WardForumHow to properly write a client listens for events on a given buffer?
> If there's some template for writing a client to access event data, that would be 
> very useful (and you can probably just ignore the context I gave below in that 
> case).
> 
> 
> Some context:
> 
> Quite a while ago, I wrote the attached "data pipeline" client whose job was to 
> listen for events, copy their data, and pipe them to a python script. I believe I 
> just stole bits and pieces from mdump.cxx to accomplish this. Later I wrote the 
> attached wrapper class "MidasConnector.cpp" and a main.cpp to generalize
> data_pipeline.cxx a bit. There were a lot of iterations to the code where I had the 
> below problems; so don't take the logic in the attached code as the exact code that 
> caused the issues below.
> 
> However, I'm unable to resolve a couple issues:
> 
> 1. If a timeout is set, everything will work until that timeout is reached. Then 
> regardless of what kind of logic I tried to implement (retry receiving event, 
> disconnect and reconnect client, etc.) the client would refuse to receive more data.
> 
> 2. When I ctrl-C main, it hangs; this is expected because it's stuck in a while 
> loop. But because I can't set a timeout I have to ctrl-C twice; this would 
> occasionally corrupt the ODB which was not ideal. I was able to get around this with 
> some impractical solution involving ncurses I believe.
> 
> 
> Thanks,
> Jack

midas/examples/lowlevel/consume.cxx might be what you're looking for, but I think all 
you're missing is a call to cm_yield() in your loop, so your midas client doesn't get 
killed when the timeout is reached (and also so you can act on shutdown requests from 
midas)

Something like 
      int status = cm_yield(100);
      if (status == SS_ABORT || status == RPC_SHUTDOWN)
         break;

There might be a recommended way to handle the ctrl-c and disconnect from the ODB, but 
off the top of my head I don't remember it. 

Also check out Ben's new(ish) python library, midas/python/examples/event_receiver.py 
might be a much easier solution. And you can use the context manager, which will take 
care of safely disconnecting from midas after you ctrl-C.
  2885   05 Nov 2024 Jack CarltonForumHow to properly write a client listens for events on a given buffer?
If there's some template for writing a client to access event data, that would be 
very useful (and you can probably just ignore the context I gave below in that 
case).


Some context:

Quite a while ago, I wrote the attached "data pipeline" client whose job was to 
listen for events, copy their data, and pipe them to a python script. I believe I 
just stole bits and pieces from mdump.cxx to accomplish this. Later I wrote the 
attached wrapper class "MidasConnector.cpp" and a main.cpp to generalize
data_pipeline.cxx a bit. There were a lot of iterations to the code where I had the 
below problems; so don't take the logic in the attached code as the exact code that 
caused the issues below.

However, I'm unable to resolve a couple issues:

1. If a timeout is set, everything will work until that timeout is reached. Then 
regardless of what kind of logic I tried to implement (retry receiving event, 
disconnect and reconnect client, etc.) the client would refuse to receive more data.

2. When I ctrl-C main, it hangs; this is expected because it's stuck in a while 
loop. But because I can't set a timeout I have to ctrl-C twice; this would 
occasionally corrupt the ODB which was not ideal. I was able to get around this with 
some impractical solution involving ncurses I believe.


Thanks,
Jack
Attachment 1: data_pipeline_(2).cxx
#include "midas.h"
#include "msystem.h"
#include "mrpc.h"
#include "mdsupport.h"
#include <iostream>
#include <unistd.h>
#include <stdio.h>  // Added for popen
#include <stdlib.h> // Added for malloc and free

INT hBufEvent;

void process_event(EVENT_HEADER *pheader) {
    printf("Received event #%d\n", pheader->serial_number);
    printf("Event ID: %d\n", pheader->event_id);
    printf("Data Size: %d bytes\n", pheader->data_size);
    printf("Timestamp: %d\n", pheader->time_stamp);
    printf("Trigger mask: %d\n", pheader->trigger_mask);

    // Print a marker to indicate the start of serialized data
    printf("EVENT_DATA_START\n");

    // Serialize and print the event data
    int* eventData = (int*)((char*)pheader + sizeof(EVENT_HEADER));
    int numIntegers = (pheader->data_size - sizeof(EVENT_HEADER)) / sizeof(int);
    for (int i = 0; i < 8; ++i) {
        printf("%d ", eventData[i]);
    }
    printf("\n");

    // Process the event here
}

int main() {
    HNDLE hDB, hKey;
    char host_name[HOST_NAME_LENGTH], expt_name[NAME_LENGTH], str[80];
    char buf_name[32] = EVENT_BUFFER_NAME, rep_file[128];
    unsigned int status, start_time, stop_time;
    INT ch, request_id, size, get_flag, action, single, i;

    // Define the maximum event size you expect to receive
    INT max_event_size = 4000;

    // Allocate memory for storing event data dynamically
    void* event_data = malloc(max_event_size);

    printf("1\n");
    /* Get if existing the pre-defined experiment */
    cm_get_environment(host_name, sizeof(host_name), expt_name, sizeof(expt_name));
    // Print host_name
    printf("host_name = %s\n", host_name);

    // Print expt_name
    printf("expt_name = %s\n", expt_name);

    printf("2\n");
    /* connect to the experiment */
    status = cm_connect_experiment(host_name, expt_name, "data_pipeline", 0);
    if (status != CM_SUCCESS) {
        return 1;
    }
    printf("3\n");

    status = bm_open_buffer(buf_name, DEFAULT_BUFFER_SIZE, &hBufEvent);
    if (status != BM_SUCCESS && status != BM_CREATED) {
        cm_msg(MERROR, "data_pipeline", "Cannot open buffer \"%s\", bm_open_buffer() status %d", buf_name, status);
        return 1;
    }
    printf("4\n");
    /* set the buffer cache size if requested */
    bm_set_cache_size(hBufEvent, 100000, 0);
    printf("5\n");

    /* place a request for a specific event id */
    status = bm_request_event(hBufEvent, EVENTID_ALL, TRIGGER_ALL, GET_ALL, &request_id, NULL); // Use NULL as the callback routine
    printf("6\n");
    printf("status = %d\n",status);
    // Open a pipe to a Python script for data transfer
    FILE* pipe = popen("python3 data_pipeline.py", "w");
    if (pipe == NULL) {
        perror("popen");
        return 1;
    }

    // Enter the event processing loop
    while (1) {
        // Use the address of max_event_size in bm_receive_event
        status = bm_receive_event(hBufEvent, event_data, &max_event_size, BM_WAIT); // Wait for new data indefinitely

        if (status == BM_SUCCESS) {
            //process_event((EVENT_HEADER*)((char*)event_data + sizeof(EVENT_HEADER)));
            // Send the event data to the Python script via the pipe
            fprintf(pipe, "EVENT_DATA_START\n");
            int* eventData = (int*)((char*)event_data + sizeof(EVENT_HEADER));
            int numIntegers = (max_event_size - sizeof(EVENT_HEADER)) / sizeof(int);
            for (int i = 4; i < 12; ++i) {
                fprintf(pipe, "%d ", eventData[i]);
            }
            fprintf(pipe, "\n");
            fflush(pipe); // Flush the buffer to ensure data is sent immediately
        } else {
            printf("Error receiving event: %d\n", status);
            break; // Exit the loop if an error occurs
        }
    }

    // Close the pipe
    pclose(pipe);

    // Free the dynamically allocated memory
    free(event_data);

    cm_disconnect_experiment();

    printf("7\n");
    return 1;
}
Attachment 2: MidasConnector.cpp
#include "MidasConnector.h"

MidasConnector::MidasConnector(const char* clientName) {
    // Initialize client name
    strncpy(client_name_, clientName, NAME_LENGTH);

    // Get host name and experiment name from environment
    cm_get_environment(host_name_, sizeof(host_name_), experiment_name_, sizeof(experiment_name_));

    // Initialize other private variables if needed
    event_id = EVENTID_ALL;  // Initialize with default value
    trigger_mask = TRIGGER_ALL;  // Initialize with default value
    sampling_type = GET_ALL;  // Initialize with default value (renamed from get_flags)
    buffer_size = DEFAULT_BUFFER_SIZE;  // Initialize with default value
    timeout_millis = BM_WAIT;
    strncpy(buffer_name, EVENT_BUFFER_NAME, sizeof(buffer_name));  // Initialize with default value
}

// Getters for the private variables
short MidasConnector::getEventId() const {
    return event_id;
}

short MidasConnector::getTriggerMask() const {
    return trigger_mask;
}

int MidasConnector::getSamplingType() const {
    return sampling_type;  
}

int MidasConnector::getBufferSize() const {
    return buffer_size;
}

const char* MidasConnector::getBufferName() const {
    return buffer_name;
}

int MidasConnector::getTimeout() const {
    return timeout_millis;
}

HNDLE MidasConnector::getEventBufferHandle() const {
    return hBufEvent;
}

// Setters for the private variables
void MidasConnector::setEventId(short eventId) {
    event_id = eventId;
}

void MidasConnector::setTriggerMask(short triggerMask) {
    trigger_mask = triggerMask;
}

void MidasConnector::setSamplingType(int samplingType) {
    sampling_type = samplingType;  
}

void MidasConnector::setBufferSize(int bufferSize) {
    buffer_size = bufferSize;
}

void MidasConnector::setBufferName(const char* bufferName) {
    strncpy(buffer_name, bufferName, sizeof(buffer_name));
}

void MidasConnector::setTimeout(int timeoutMillis) {
    timeout_millis = timeoutMillis;
}

void MidasConnector::setEventBufferHandle(HNDLE eventBufferHandle) {
    hBufEvent = eventBufferHandle;
}


bool MidasConnector::ConnectToExperiment() {
    // Connect to the experiment
    int status = cm_connect_experiment(host_name_, experiment_name_, client_name_, NULL);
    if (status != CM_SUCCESS) {
        // Handle connection error
        return false;
    }
    return true;
}

void MidasConnector::DisconnectFromExperiment() {
    // Disconnect from the experiment
    cm_disconnect_experiment();
}

bool MidasConnector::OpenEventBuffer() {
    int status = bm_open_buffer(buffer_name, buffer_size, &hBufEvent);
    if (status != BM_SUCCESS && status != BM_CREATED) {
        cm_msg(MERROR, client_name_, "Cannot open buffer \"%s\", bm_open_buffer() status %d", buffer_name, status);
        return false;
    }
    return true;
}

bool MidasConnector::SetCacheSize(int cacheSize) {
    bm_set_cache_size(hBufEvent, cacheSize, 0);
    return true;
}

bool MidasConnector::RequestEvent() {
    int request_id;
    int status = bm_request_event(hBufEvent, event_id, trigger_mask, sampling_type, &request_id, NULL);
    return status == BM_SUCCESS;
}

bool MidasConnector::ReceiveEvent(void* eventBuffer, int& maxEventSize) {
    int status = bm_receive_event(hBufEvent, eventBuffer, &maxEventSize, timeout_millis);
    return status == BM_SUCCESS;
}

Attachment 3: main.cpp
#include "event_processor/EventProcessor.h"
#include "data_transmitter/DataTransmitter.h"
#include "midas_connector/MidasConnector.h"
#include "json.hpp"
#include <fstream>

INT hBufEvent1;
INT hBufEvent2;

// Function to initialize MIDAS and open an event buffer
bool initializeMidas(MidasConnector& midasConnector, const nlohmann::json& config) {
    // Set the MidasConnector properties based on the config
    midasConnector.setEventId(config["eventId"].get<short>());
    midasConnector.setTriggerMask(config["triggerMask"].get<short>());
    midasConnector.setSamplingType(config["samplingType"].get<int>());
    midasConnector.setBufferSize(config["bufferSize"].get<int>());
    midasConnector.setBufferName(config["bufferName"].get<std::string>().c_str());
    midasConnector.setBufferSize(config["bufferSize"].get<int>());

    // Call the ConnectToExperiment method
    if (!midasConnector.ConnectToExperiment()) {
        return false;
    }

    // Call the OpenEventBuffer method
    if (!midasConnector.OpenEventBuffer()) {
        return false;
    }

    // Set the buffer cache size if requested
    midasConnector.SetCacheSize(config["cacheSize"].get<int>());

    // Place a request for a specific event id
    if (!midasConnector.RequestEvent()) {
        return false;
    }

    return true;
}

int main() {
    // Read configuration from a JSON file
    nlohmann::json config;
    std::ifstream configFile("config.json");
    configFile >> config;
    configFile.close();

    // Initialize MidasConnector and connect to the MIDAS experiment
    MidasConnector midasConnector(config["clientName"].get<std::string>().c_str());
    if (!initializeMidas(midasConnector, config)) {
        printf("Error: Failed to initialize MIDAS.\n");
        return 1;
    }

    // Read the maximum event size from the JSON configuration
    INT max_event_size = config["maxEventSize"].get<int>();

    // Allocate memory for storing event data dynamically
    void* event_data = malloc(max_event_size);

    // Initialize EventProcessor with detector mapping file and verbosity flag
    EventProcessor eventProcessor(config["detectorMappingFile"].get<std::string>(), config["verbose"].get<bool>());

    // Initialize DataTransmitter with the ZeroMQ address
    DataTransmitter dataPublisher(config["zmqAddress"].get<std::string>());

    // Connect to the ZeroMQ server
    if (!dataPublisher.bind()) {
        // Handle connection error
        printf("Error: Failed to bind to port %s.\n", config["zmqAddress"].get<std::string>().c_str());
        return 1;
    } else {
        printf("Connected to the ZeroMQ server.\n");
    }


    // Event processing loop
    while (true) {

        midasConnector.ReceiveEvent(event_data, max_event_size);

        //Prcoess data once we have it
        eventProcessor.processEvent(event_data, max_event_size);

        // Serialize the event data with EventProcessor and store it in serializedData
        std::string serializedData = eventProcessor.getSerializedData();

        // Send the serialized data to the ZeroMQ server with DataTransmitter
        if (!dataPublisher.publish(serializedData)) {
            // Handle send error
            printf("Error: Failed to send serialized data.\n");
        }
    }

    // Cleanup and finalize your application
    midasConnector.DisconnectFromExperiment(); // Disconnect from the MIDAS experiment

    return 0;
}
  2884   28 Oct 2024 Amy RobertsBug ReportDifficulty running MIDAS on Rocky 9.4
> Now for each timeout it will print detailed syscall and timing information, if time goes backwards, it should catch it.

It appears that time is moving forward:

[aroberts@sdfcdmsdaq build]$ odbedit
[ODBEdit,ERROR] [odb.cxx:2043:db_open_database,ERROR] Removed ODB client 'ODBEdit', index 0 because process pid 1617119 does 
not exists
[ODBEdit,INFO] Removed open record flag from "/Experiment/Security/RPC hosts/Allowed hosts"
[ODBEdit,INFO] Removed exclusive access mode from "/Experiment/Security/RPC hosts/Allowed hosts"
[ODBEdit,INFO] Corrected 1 ODB entries
[ODBEdit,INFO] Deleted entry '/System/Clients/1617119' for client 'ODBEdit' because it is not connected to ODB
[ODBEdit,INFO] Client 'ODBEdit' on buffer 'SYSMSG' removed by bm_open_buffer because process pid 1617119 does not exist
[local:amy_test:S]/>ss_semaphore_wait_for: semop/semtimedop(5) returned -1, errno 11 (Resource temporarily unavailable), 
start time 0xd4fd98f6, now 0xd4fdc0ef, dt 0x000027f9, timeout 0x00002710 ms, SEMAPHORE TIMEOUT!
[ODBEdit,ERROR] [odb.cxx:2489:db_lock_database,ERROR] cannot lock ODB semaphore, timeout 10000 ms, aborting...
Aborted (core dumped)
ELOG V3.1.4-2e1708b5