ID |
Date |
Author |
Topic |
Subject |
2905
|
26 Nov 2024 |
Maia Henriksson-Ward | Bug Report | TMFE::Sleep() errors | > Hello,
>
> I've noticed that SC FEs that use the TMFE class with midas-2022-05-c often report errors when calling TMFE::Sleep().
> The error is :
>
> [tmfe.cxx:1033:TMFE::Sleep,ERROR] select() returned -1, errno 22 (Invalid argument).
>
> This seems to happen in two different ways:
>
> 1. Error being reported repeatedly
> 2. Occasional single errors being reported
>
> When the first of these presents, we typically restart the FE to "solve" the problem.
> Case 2. is typically ignored.
>
> The code in question is:
>
> void TMFE::Sleep(double time)
> {
> int status;
> fd_set fdset;
> struct timeval timeout;
>
> FD_ZERO(&fdset);
>
> timeout.tv_sec = time;
> timeout.tv_usec = (time-timeout.tv_sec)*1000000.0;
>
> while (1) {
> status = select(1, &fdset, NULL, NULL, &timeout);
> #ifdef EINTR
> if (status < 0 && errno == EINTR) {
> continue;
> }
> #endif
> break;
> }
>
> if (status < 0) {
> TMFE::Instance()->Msg(MERROR, "TMFE::Sleep", "select() returned %d, errno %d (%s)", status, errno, strerror(errno));
> }
> }
>
> So it looks like either the file descriptor set or the timeval struct must have a problem.
> From some reading it seems that invalid timeval structs are often caused by one or both
> of tv_sec or tv_usec not being set. In the code above we can see that both appear to be
> correctly set initially.
>
> From the select() man page I see:
>
> RETURN VALUE
> On success, select() and pselect() return the number of file descriptors contained in
> the three returned descriptor sets (that is, the total number of bits that are set in
> readfds, writefds, exceptfds). The return value may be zero if the timeout expired
> before any file descriptors became ready.
>
> On error, -1 is returned, and errno is set to indicate the error; the file descriptor
> sets are unmodified, and timeout becomes undefined.
>
> The second paragraph quoted from the man page above would indicate to me that perhaps the
> timeout needs to be reset inside the if block. eg:
>
> if (status < 0 && errno == EINTR) {
> timeout.tv_sec = time;
> timeout.tv_usec = (time-timeout.tv_sec)*1000000.0;
> continue;
> }
>
> Please note that I've only just briefly looked at this and was hoping someone more
> familiar with using select() as a way to sleep() might be better able to understand
> what is happening.
>
> I also wonder whether, now that midas requires stricter/newer C++ standards, there may be
> a more straightforward method to sleep that is sufficiently robust and portable.
>
> Thanks,
>
> Nick.
I had the same error a few months ago, though I wasn't using a tagged release. It happened because I was calling TMFE::Sleep()
with a negative time. If your issues have the same cause, note that TMFE::Sleep() can handle negative times since commit
591f78f (https://bitbucket.org/tmidas/midas/commits/591f78f52893d5ffd64bf4e52a1daac537ebd672).
Early in my debugging, I came to the same conclusions you did, and actually tried a solution similar to the one you suggested.
This was a few months ago and I didn't write down what happened, but I believe it didn't work because in my case the errno was
something other than EINTR, and/or the timeval was still an invalid argument for select() because the timeout was still negative. I
never followed it up because I was able to fix my problem by fixing my frontend. |
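On the question of a more straightforward way to sleep, here is a minimal sketch, assuming C++11 or later is available. It uses std::this_thread::sleep_for and returns immediately for zero, negative or non-finite durations instead of passing them on. The name `sleep_seconds` is hypothetical; this is not the code from commit 591f78f and has not been checked against TMFE.

```cpp
#include <chrono>
#include <cmath>
#include <thread>

// Hypothetical helper, not part of MIDAS: sleep for `seconds`.
// Zero, negative, NaN and infinite arguments return immediately.
// Returns the duration that was actually requested, in seconds.
double sleep_seconds(double seconds)
{
   if (!std::isfinite(seconds) || seconds <= 0.0)
      return 0.0;
   // duration<double> counts in seconds, so no manual split into
   // tv_sec/tv_usec is needed and there is no timeval to reset.
   std::this_thread::sleep_for(std::chrono::duration<double>(seconds));
   return seconds;
}
```

With this approach there is no EINTR retry loop to write, so the question of re-initializing the timeout after an interrupted call does not arise.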
2904
|
26 Nov 2024 |
Nick Hastings | Bug Report | TMFE::Sleep() errors | Hello,
I've noticed that SC FEs that use the TMFE class with midas-2022-05-c often report errors when calling TMFE::Sleep().
The error is :
[tmfe.cxx:1033:TMFE::Sleep,ERROR] select() returned -1, errno 22 (Invalid argument).
This seems to happen in two different ways:
1. Error being reported repeatedly
2. Occasional single errors being reported
When the first of these presents, we typically restart the FE to "solve" the problem.
Case 2. is typically ignored.
The code in question is:
void TMFE::Sleep(double time)
{
int status;
fd_set fdset;
struct timeval timeout;
FD_ZERO(&fdset);
timeout.tv_sec = time;
timeout.tv_usec = (time-timeout.tv_sec)*1000000.0;
while (1) {
status = select(1, &fdset, NULL, NULL, &timeout);
#ifdef EINTR
if (status < 0 && errno == EINTR) {
continue;
}
#endif
break;
}
if (status < 0) {
TMFE::Instance()->Msg(MERROR, "TMFE::Sleep", "select() returned %d, errno %d (%s)", status, errno, strerror(errno));
}
}
So it looks like either the file descriptor set or the timeval struct must have a problem.
From some reading it seems that invalid timeval structs are often caused by one or both
of tv_sec or tv_usec not being set. In the code above we can see that both appear to be
correctly set initially.
From the select() man page I see:
RETURN VALUE
On success, select() and pselect() return the number of file descriptors contained in
the three returned descriptor sets (that is, the total number of bits that are set in
readfds, writefds, exceptfds). The return value may be zero if the timeout expired
before any file descriptors became ready.
On error, -1 is returned, and errno is set to indicate the error; the file descriptor
sets are unmodified, and timeout becomes undefined.
The second paragraph quoted from the man page above would indicate to me that perhaps the
timeout needs to be reset inside the if block. eg:
if (status < 0 && errno == EINTR) {
timeout.tv_sec = time;
timeout.tv_usec = (time-timeout.tv_sec)*1000000.0;
continue;
}
Please note that I've only just briefly looked at this and was hoping someone more
familiar with using select() as a way to sleep() might be better able to understand
what is happening.
I also wonder whether, now that midas requires stricter/newer C++ standards, there may be
a more straightforward method to sleep that is sufficiently robust and portable.
Thanks,
Nick. |
2903
|
24 Nov 2024 |
Pavel Murat | Bug Report | ODB lock timeout, Difficulty running MIDAS on Rocky 9.4 | there is a really good software tool developed by the Fermilab DAQ group, called TRACE -
https://github.com/art-daq/trace ,
It could be useful for debugging cases like this one. In short, TRACE instruments the code
with printouts that can be selectively turned on and off without recompiling the executable.
TRACE output can go to /dev/stdout (slow output) and/or to a circular buffer implemented via a shared
memory segment (fast output). Sending unlimited output to the shared memory segment is extremely useful.
TRACE also allows triggering on certain conditions, again without recompiling the executable.
For debugging cases like the one in question, that could turn out to be even more useful,
although I haven't tried the triggering functionality myself.
-- regards, Pasha |
2902
|
22 Nov 2024 |
Konstantin Olchanski | Bug Report | ODB lock timeout, Difficulty running MIDAS on Rocky 9.4 | > try to replicate the issue ...
I see ODB lock timeout (and abort() of everything) in the dsvslice test station. We have
about 15-20 MIDAS clients connected.
I am pretty sure we have not seen this problem until recently (and I have not seen it
personally for a very long time). There were no changes to the MIDAS ODB locking code in a
long time.
I suspect a recent change in the linux kernel. But I am likely to be wrong.
I have 1000 core dumps from this crash of dsvslice, and among them should be the 1 thread
that has ODB locked. Wish me luck finding it. Worst case is to discover that ODB is locked
but nobody is holding a lock ("missing unlock bug"). This is hard to debug, I would have to add
tracking of "who was the last one to lock it, who forgot to unlock it".
K.O. |
2901
|
21 Nov 2024 |
Stefan Ritt | Info | What do the status numbers mean and where can I find more information about them? |
> [RP Streaming Frontend,ERROR] [midas.cxx:17806:cm_write_event_to_odb,ERROR]
> cannot create key for bank "DATA" with tid 24 in ODB, db_create_key() status 309
>
>
>
> I just need more information on what the error message means. Which data type
> refers to tid 24 and what does status 309 indicate??
A tid (type identification) of 24 does not actually exist (see midas.h:327), so this tells
me that your bank header got corrupted. Somewhere you are overwriting your data.
Stefan |
2900
|
21 Nov 2024 |
Mann Gandhi | Info | What do the status numbers mean and where can I find more information about them? | Hello,
This is the error message I got:
[RP Streaming Frontend,ERROR] [midas.cxx:17806:cm_write_event_to_odb,ERROR]
cannot create key for bank "DATA" with tid 24 in ODB, db_create_key() status 309
read_periodic_event: No data in ring buffer or error occurred
[RP Streaming Frontend,ERROR] [odb.cxx:3373:db_create_key,ERROR] invalid key type
24 to create 'DATA' in '/Equipment/Periodic/Variables'
[RP Streaming Frontend,ERROR] [midas.cxx:17806:cm_write_event_to_odb,ERROR]
cannot create key for bank "DATA" with tid 24 in ODB, db_create_key() status 309
I just need more information on what the error message means. Which data type
refers to tid 24 and what does status 309 indicate??
There is definitely data in the ring buffer but I keep on getting this error.
Thank you!
M.G |
2899
|
18 Nov 2024 |
Lukas Gerritzen | Suggestion | Comma-separated indices in alarm conditions | I have the following use case: I would like to check if two elements of an array exceed a certain threshold.
However, they are not consecutive. Currently, I have to write two alarms, one checking Array[8] and one
checking Array[10].
It would be nice if we could enter conditions such as "/Path/To/Array[8,10] > 0.5".
I looked into the code of al_evaluate_condition() and it seems very C-style. I know that you have been
refactoring a lot of code to work with STL strings and their functions. If you find the time to refactor
alarm.cxx, I ask that you consider adding comma-separated lists as a new feature.
Cheers
Lukas |
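To illustrate the request, here is a minimal sketch of how a comma-separated index list, such as the `8,10` in `Array[8,10]`, could be split into integers using STL strings. `parse_index_list` is a hypothetical helper, not a function in alarm.cxx, and it does not reflect how al_evaluate_condition() currently parses conditions.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: turn "8,10" into {8, 10}.
// Throws std::invalid_argument for empty, negative or non-numeric items
// (std::stoi may also throw std::out_of_range for very large numbers).
std::vector<int> parse_index_list(const std::string& text)
{
   std::vector<int> indices;
   std::size_t start = 0;
   while (true) {
      std::size_t comma = text.find(',', start);
      std::size_t len = (comma == std::string::npos) ? std::string::npos : comma - start;
      std::string item = text.substr(start, len);

      std::size_t used = 0;
      int value = std::stoi(item, &used); // skips leading whitespace, throws if no digits
      while (used < item.size() && item[used] == ' ')
         used++;                          // tolerate trailing spaces
      if (used != item.size() || value < 0)
         throw std::invalid_argument("bad array index: " + item);

      indices.push_back(value);
      if (comma == std::string::npos)
         break;
      start = comma + 1;
   }
   return indices;
}
```

An alarm evaluator could then apply the same comparison to each returned index. Whether the alarm should fire if any or if all of the listed elements exceed the threshold is a design choice the suggestion leaves open.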
2898
|
18 Nov 2024 |
Stefan Ritt | Info | New sequencer command ODBLOOKUP | > "value not found" sets "index" to ?
It actually sets it to "not found". Since all variables are strings in the sequencer, you can then do a test like
ODBLOOKUP ..., index
if ($index == "not found")
...
> "odb key not found" sets "index" to ?
If the odb key is not found, the sequencer aborts.
> link to documentation?
The documentation is where it always has been:
https://daq00.triumf.ca/MidasWiki/index.php/Sequencer#Sequencer_Commands
/Stefan |
2897
|
15 Nov 2024 |
Konstantin Olchanski | Info | New sequencer command ODBLOOKUP | > A new sequencer command "ODBLOOKUP" has been implemented, which does a lookup of a string in a string
> array in the ODB given by a path and returns its index as a number. If we have for example an array
>
> /Examples/Names
> [0] Hello
> [1] Test
> [2] Other
>
> and do a
>
> ODBLOOKUP "/Examples/Names", "Test", index
>
> we get an index equal to 1.
>
"value not found" sets "index" to ?
"odb key not found" sets "index" to ?
link to documentation?
K.O. |
2896
|
15 Nov 2024 |
Konstantin Olchanski | Suggestion | Issue with creating banks | > Hello, I am a coop student working at SNOLAB.
> void* data_acquisition_thread(void* param)
> {
> EVENT_HEADER *pevent;
> if (complicated) {
> status = rb_get_wp(rbh, (void **) &pevent, 0);
> }
> bm_compose_event_threadsafe(pevent, 1, 0, 0, &equipment[0].serial_number);
> }
this code is buggy. it should read "EVENT_HEADER *pevent = NULL;" to avoid an uninitialized variable
and bm_compose_event() & co should be inside an "if (pevent != NULL)" block, unless you can absolutely
prove that rb_get_wp() is always called and pevent is never NULL (even if somebody changes the code later).
if you build your code with "gcc -O2 -g -Wall -Wuninitialized" it would probably warn you about use of uninitialized
"pevent".
P.S. for building multithreaded frontends, you are much better off starting from the c++ tmfe frontend framework,
a good starting point is to study tmfe_example_everything.cxx.
K.O. |
2895
|
14 Nov 2024 |
Mann Gandhi | Suggestion | Issue with creating banks | > All I can see is that your bank header gets corrupted along the way. The funny character reported by
> cm_write_event_to_odb indicates that your original name "RPD0" got overwritten somewhere, but I could not spot any
> mistake in your code.
>
> I would play around: change max_event_size, produce dummy data of size N instead of the recv() and so on. Also monitor
> the bank header to see when it gets overwritten. I guess you only write from one thread, so that should be safe, right?
>
> Best,
> Stefan
Hello Stefan,
Thank you for the advice. On inspection, I noticed that my event size (when I print bk_size(pevent)) is around 1.4 billion
which seems absurd so I am not sure why this is the case as well. In addition, is mdump the way to monitor the bank header?
I just recently started using MIDAS so I am a little bit confused. I can attach a link to the github repository where I am
currently working on this for further clarity since I am sure there is an issue in my code somewhere.
(https://github.com/mgandhi-1/red-pitaya-frontend/blob/10-issue-with-bank-creation-neeed-to-figure-out-why-banks-are-not-being-created-correctly/frontend.cxx)
I appreciate the help. Thank you once more.
Best,
Mann |
2894
|
14 Nov 2024 |
Stefan Ritt | Suggestion | Issue with creating banks | All I can see is that your bank header gets corrupted along the way. The funny character reported by
cm_write_event_to_odb indicates that your original name "RPD0" got overwritten somewhere, but I could not spot any
mistake in your code.
I would play around: change max_event_size, produce dummy data of size N instead of the recv() and so on. Also monitor
the bank header to see when it gets overwritten. I guess you only write from one thread, so that should be safe, right?
Best,
Stefan |
2893
|
14 Nov 2024 |
Mann Gandhi | Suggestion | Issue with creating banks | Hello, I am a coop student working at SNOLAB. I am currently setting up a frontend
program to collect data for an experiment. I am having an issue getting my bank
initialized correctly with the correct name. I will attach an image of the error and
a code snippet for clarity. This is a multi-thread program using ring buffers. The
first thread is only responsible for data collection of ADC values from the Red
Pitaya (FPGA) and the second thread does a simple derivative calculation. The
frontend makes use of the TCP connection to stream data from the Red Pitaya.
Here is the code snippet. This is the only place in the frontend code where I
initialize and create a bank to store the ADC values from the Red Pitaya.
void* data_acquisition_thread(void* param)
{
printf("Data acquisition thread started\n");
// Obtain ring buffer for inter-thread data exchange
EVENT_HEADER *pevent;
WORD *pdata;
int status;
//Set a timeout for the recv function to prevent indefinite blocking
struct timeval timeout;
timeout.tv_sec = 10; //seconds
timeout.tv_usec = 0; // 0 microseconds
setsockopt(stream_sockfd, SOL_SOCKET, SO_RCVTIMEO, (char *)&timeout,
sizeof(timeout));
while (is_readout_thread_enabled())
{
if (!readout_enabled())
{
usleep(10); // do not produce events when run is stopped
continue;
}
// Acquire a write pointer in the ring buffer
int status;
do {
status = rb_get_wp(rbh, (void **) &pevent, 0);
if (status == DB_TIMEOUT)
{
usleep(5);
if (!is_readout_thread_enabled()) break;
}
} while (status != DB_SUCCESS);
if (status != DB_SUCCESS) continue;
// Lock mutex before accessing shared resources
pthread_mutex_lock(&lock);
// Buffer for incoming data
//int16_t temp_buffer[4096] = {0};
bm_compose_event_threadsafe(pevent, 1, 0, 0,
&equipment[0].serial_number);
pdata = (WORD *)(pevent + 1); // Set pdata to point to the data section of the event
// Initialize the bank and read data directly into the bank
bk_init32(pevent);
bk_create(pevent, "RPD0", TID_WORD, (void **)&pdata);
int bytes_read = recv(stream_sockfd, pdata, max_event_size *
sizeof(WORD), 0);
printf("Data received: %d bytes\n", bytes_read);
if (bytes_read <= 0)
{
if (bytes_read == 0)
{
printf("Red Pitaya disconnected\n");
pthread_mutex_unlock(&lock);
break;
} else if (errno == EWOULDBLOCK || errno ==EAGAIN)
{
printf("Receive timeout\n");
pthread_mutex_unlock(&lock);
continue;
}
else
{
printf("Error reading from the Red Pitaya: %s\n",
strerror(errno));
pthread_mutex_unlock(&lock);
continue;
}
}
// Adjust data pointers after reading
pdata += bytes_read / sizeof(WORD);
bk_close(pevent, pdata);
pevent->data_size = bk_size(pevent);
// Unlock mutex after writing to the buffer
pthread_mutex_unlock(&lock);
// Send event to ring buffer
rb_increment_wp(rbh, sizeof(EVENT_HEADER) + pevent->data_size);
}
pthread_mutex_unlock(&lock);
return NULL;
} |
Attachment 1: Screenshot_from_2024-11-14_12-35-06.png
|
|
2892
|
13 Nov 2024 |
Stefan Ritt | Info | New sequencer command ODBLOOKUP | A new sequencer command "ODBLOOKUP" has been implemented, which does a lookup of a string in a string
array in the ODB given by a path and returns its index as a number. If we have for example an array
/Examples/Names
[0] Hello
[1] Test
[2] Other
and do a
ODBLOOKUP "/Examples/Names", "Test", index
we get an index equal to 1.
/Stefan |
2891
|
07 Nov 2024 |
Stefan Ritt | Suggestion | Stop run and sequencer button | I don't find this very useful. Some experiments do not only want to stop the run, but also do other cleanup things. To do that, I proposed an "atexit" function like the one C has. Then the user can put a run stop there, plus any other cleanup. This will be much more flexible. Think about the "reset" script we have to run manually if we abort a sequencer. The atexit function will come next week, so you should consider using it instead of your additional button.
Stefan |
2890
|
07 Nov 2024 |
Lukas Gerritzen | Suggestion | Stop run and sequencer button | Due to popular demand among our students, I added a button to the sequencer that stops the run and the sequence. If you find it useful, please consider merging this upstream.
$ git diff sequencer.html
diff --git a/resources/sequencer.html b/resources/sequencer.html
index e7f8a79d..95c7e3d8 100644
--- a/resources/sequencer.html
+++ b/resources/sequencer.html
@@ -115,6 +115,7 @@
<img src="icons/play.svg" title="Start" class="seqbtn Stopped" onclick="startSeq();">
<img src="icons/debug.svg" title="Debug" class="seqbtn Stopped" onclick="debugSeq();">
<img src="icons/square.svg" title="Stop" class="seqbtn Running Paused" onclick="stopSeq();">
+ <img src="icons/x-octagon.svg" title="Stop Run and Sequencer immediately" class="seqbtn Running Paused" onclick="stopRunAndSeq();">
<img src="icons/pause.svg" title="Pause" class="seqbtn Running" onclick="modbset('/Sequencer/Command/Pause script',true);">
<img src="icons/resume.svg" title="Resume" class="seqbtn Paused" onclick="modbset('/Sequencer/Command/Resume script',true);">
<img src="icons/step-over.svg" title="Step Over" class="seqbtn Running Paused" onclick="modbset('/Sequencer/Command/Step over',true);">
[gac-megj@pc13513 resources]$ git diff sequencer.js
diff --git a/resources/sequencer.js b/resources/sequencer.js
index cc5398ef..b75c926c 100644
--- a/resources/sequencer.js
+++ b/resources/sequencer.js
@@ -1582,6 +1582,23 @@ function stopSeq() {
});
}
+function stopRunAndSeq() {
+ const message = `Are you sure you want to stop the run and sequence?`;
+ dlgConfirm(message,function(resp) {
+ if (resp) {
+ modbset('/Sequencer/Command/Stop immediately',true);
+
+ mjsonrpc_call("cm_transition", {"transition": "TR_STOP"}).then(function (rpc) {
+ if (rpc.result.status !== 1) {
+ throw new Error("Cannot stop run, cm_transition() status " + rpc.result.status + ", see MIDAS messages");
+ }
+ }).catch(function (error) {
+ mjsonrpc_error_alert(error);
+ });
+ }
+ });
+}
+
// Show or hide parameters table
function showParTable(varContainer) {
let e = document.getElementById(varContainer);
|
2889
|
06 Nov 2024 |
Amy Roberts | Bug Report | Difficulty running MIDAS on Rocky 9.4 | After following Konstantin's debugging suggestions, I thought I would try to replicate
the issue on my own computer. My hope was that I could provide instructions for
replicating the bug so that the MIDAS team could try debugging things more easily.
However, when I ran the current version of MIDAS in a Rocky 9.4 VM on my laptop (both
VMWare and VirtualBox), mserver and odbedit ran just fine (!).
I'm currently trying to find out if there's a way to compare the VMs on my machine and
the machine that's being problematic. I'll report back if I learn anything. |
2886
|
05 Nov 2024 |
Maia Henriksson-Ward | Forum | How to properly write a client listens for events on a given buffer? | > If there's some template for writing a client to access event data, that would be
> very useful (and you can probably just ignore the context I gave below in that
> case).
>
>
> Some context:
>
> Quite a while ago, I wrote the attached "data pipeline" client whose job was to
> listen for events, copy their data, and pipe them to a python script. I believe I
> just stole bits and pieces from mdump.cxx to accomplish this. Later I wrote the
> attached wrapper class "MidasConnector.cpp" and a main.cpp to generalize
> data_pipeline.cxx a bit. There were a lot of iterations to the code where I had the
> below problems; so don't take the logic in the attached code as the exact code that
> caused the issues below.
>
> However, I'm unable to resolve a couple issues:
>
> 1. If a timeout is set, everything will work until that timeout is reached. Then
> regardless of what kind of logic I tried to implement (retry receiving event,
> disconnect and reconnect client, etc.) the client would refuse to receive more data.
>
> 2. When I ctrl-C main, it hangs; this is expected because it's stuck in a while
> loop. But because I can't set a timeout I have to ctrl-C twice; this would
> occasionally corrupt the ODB which was not ideal. I was able to get around this with
> some impractical solution involving ncurses I believe.
>
>
> Thanks,
> Jack
midas/examples/lowlevel/consume.cxx might be what you're looking for, but I think all
you're missing is a call to cm_yield() in your loop, so your midas client doesn't get
killed when the timeout is reached (and also so you can act on shutdown requests from
midas)
Something like
int status = cm_yield(100);
if (status == SS_ABORT || status == RPC_SHUTDOWN)
break;
There might be a recommended way to handle the ctrl-c and disconnect from the ODB, but
off the top of my head I don't remember it.
Also check out Ben's new(ish) python library; midas/python/examples/event_receiver.py
might be a much easier solution. You can also use the context manager, which will take
care of safely disconnecting from midas after you ctrl-C. |
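For the ctrl-C part, one common generic pattern is to let the signal handler only set a flag and have the loop test it, so that the normal disconnect path still runs. This is a sketch, not a MIDAS-recommended recipe. The loop body below is a placeholder; in a real client it would call cm_yield() and bm_receive_event() and be followed by cm_disconnect_experiment(). The names `g_stop_requested`, `on_sigint` and `run_loop` are hypothetical, and MIDAS may install signal handlers of its own, which this sketch does not account for.

```cpp
#include <csignal>

// Set by the signal handler, polled by the main loop.
static volatile std::sig_atomic_t g_stop_requested = 0;

// Handler does nothing except record that ctrl-C was pressed.
extern "C" void on_sigint(int) { g_stop_requested = 1; }

// Runs until ctrl-C is seen or max_iterations is reached.
// Returns the number of iterations completed.
int run_loop(int max_iterations)
{
   std::signal(SIGINT, on_sigint);
   int n = 0;
   while (!g_stop_requested && n < max_iterations) {
      // placeholder for: cm_yield(100); bm_receive_event(...); process the event
      n++;
   }
   // placeholder for: cm_disconnect_experiment();
   return n;
}
```

For the flag to be noticed, the receive call inside the loop needs a finite timeout (or a non-blocking mode), which ties back to the timeout issue described in point 1.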
2885
|
05 Nov 2024 |
Jack Carlton | Forum | How to properly write a client listens for events on a given buffer? | If there's some template for writing a client to access event data, that would be
very useful (and you can probably just ignore the context I gave below in that
case).
Some context:
Quite a while ago, I wrote the attached "data pipeline" client whose job was to
listen for events, copy their data, and pipe them to a python script. I believe I
just stole bits and pieces from mdump.cxx to accomplish this. Later I wrote the
attached wrapper class "MidasConnector.cpp" and a main.cpp to generalize
data_pipeline.cxx a bit. There were a lot of iterations to the code where I had the
below problems; so don't take the logic in the attached code as the exact code that
caused the issues below.
However, I'm unable to resolve a couple issues:
1. If a timeout is set, everything will work until that timeout is reached. Then
regardless of what kind of logic I tried to implement (retry receiving event,
disconnect and reconnect client, etc.) the client would refuse to receive more data.
2. When I ctrl-C main, it hangs; this is expected because it's stuck in a while
loop. But because I can't set a timeout I have to ctrl-C twice; this would
occasionally corrupt the ODB which was not ideal. I was able to get around this with
some impractical solution involving ncurses I believe.
Thanks,
Jack |
Attachment 1: data_pipeline_(2).cxx
|
#include "midas.h"
#include "msystem.h"
#include "mrpc.h"
#include "mdsupport.h"
#include <iostream>
#include <unistd.h>
#include <stdio.h> // Added for popen
#include <stdlib.h> // Added for malloc and free
INT hBufEvent;
void process_event(EVENT_HEADER *pheader) {
printf("Received event #%d\n", pheader->serial_number);
printf("Event ID: %d\n", pheader->event_id);
printf("Data Size: %d bytes\n", pheader->data_size);
printf("Timestamp: %d\n", pheader->time_stamp);
printf("Trigger mask: %d\n", pheader->trigger_mask);
// Print a marker to indicate the start of serialized data
printf("EVENT_DATA_START\n");
// Serialize and print the event data
int* eventData = (int*)((char*)pheader + sizeof(EVENT_HEADER));
int numIntegers = (pheader->data_size - sizeof(EVENT_HEADER)) / sizeof(int);
for (int i = 0; i < 8; ++i) {
printf("%d ", eventData[i]);
}
printf("\n");
// Process the event here
}
int main() {
HNDLE hDB, hKey;
char host_name[HOST_NAME_LENGTH], expt_name[NAME_LENGTH], str[80];
char buf_name[32] = EVENT_BUFFER_NAME, rep_file[128];
unsigned int status, start_time, stop_time;
INT ch, request_id, size, get_flag, action, single, i;
// Define the maximum event size you expect to receive
INT max_event_size = 4000;
// Allocate memory for storing event data dynamically
void* event_data = malloc(max_event_size);
printf("1\n");
/* Get if existing the pre-defined experiment */
cm_get_environment(host_name, sizeof(host_name), expt_name, sizeof(expt_name));
// Print host_name
printf("host_name = %s\n", host_name);
// Print expt_name
printf("expt_name = %s\n", expt_name);
printf("2\n");
/* connect to the experiment */
status = cm_connect_experiment(host_name, expt_name, "data_pipeline", 0);
if (status != CM_SUCCESS) {
return 1;
}
printf("3\n");
status = bm_open_buffer(buf_name, DEFAULT_BUFFER_SIZE, &hBufEvent);
if (status != BM_SUCCESS && status != BM_CREATED) {
cm_msg(MERROR, "data_pipeline", "Cannot open buffer \"%s\", bm_open_buffer() status %d", buf_name, status);
return 1;
}
printf("4\n");
/* set the buffer cache size if requested */
bm_set_cache_size(hBufEvent, 100000, 0);
printf("5\n");
/* place a request for a specific event id */
status = bm_request_event(hBufEvent, EVENTID_ALL, TRIGGER_ALL, GET_ALL, &request_id, NULL); // Use NULL as the callback routine
printf("6\n");
printf("status = %d\n",status);
// Open a pipe to a Python script for data transfer
FILE* pipe = popen("python3 data_pipeline.py", "w");
if (pipe == NULL) {
perror("popen");
return 1;
}
// Enter the event processing loop
while (1) {
// Use the address of max_event_size in bm_receive_event
status = bm_receive_event(hBufEvent, event_data, &max_event_size, BM_WAIT); // Wait for new data indefinitely
if (status == BM_SUCCESS) {
//process_event((EVENT_HEADER*)((char*)event_data + sizeof(EVENT_HEADER)));
// Send the event data to the Python script via the pipe
fprintf(pipe, "EVENT_DATA_START\n");
int* eventData = (int*)((char*)event_data + sizeof(EVENT_HEADER));
int numIntegers = (max_event_size - sizeof(EVENT_HEADER)) / sizeof(int);
for (int i = 4; i < 12; ++i) {
fprintf(pipe, "%d ", eventData[i]);
}
fprintf(pipe, "\n");
fflush(pipe); // Flush the buffer to ensure data is sent immediately
} else {
printf("Error receiving event: %d\n", status);
break; // Exit the loop if an error occurs
}
}
// Close the pipe
pclose(pipe);
// Free the dynamically allocated memory
free(event_data);
cm_disconnect_experiment();
printf("7\n");
return 1;
}
|
Attachment 2: MidasConnector.cpp
|
#include "MidasConnector.h"
MidasConnector::MidasConnector(const char* clientName) {
// Initialize client name
strncpy(client_name_, clientName, NAME_LENGTH);
// Get host name and experiment name from environment
cm_get_environment(host_name_, sizeof(host_name_), experiment_name_, sizeof(experiment_name_));
// Initialize other private variables if needed
event_id = EVENTID_ALL; // Initialize with default value
trigger_mask = TRIGGER_ALL; // Initialize with default value
sampling_type = GET_ALL; // Initialize with default value (renamed from get_flags)
buffer_size = DEFAULT_BUFFER_SIZE; // Initialize with default value
timeout_millis = BM_WAIT;
strncpy(buffer_name, EVENT_BUFFER_NAME, sizeof(buffer_name)); // Initialize with default value
}
// Getters for the private variables
short MidasConnector::getEventId() const {
return event_id;
}
short MidasConnector::getTriggerMask() const {
return trigger_mask;
}
int MidasConnector::getSamplingType() const {
return sampling_type;
}
int MidasConnector::getBufferSize() const {
return buffer_size;
}
const char* MidasConnector::getBufferName() const {
return buffer_name;
}
int MidasConnector::getTimeout() const {
return timeout_millis;
}
HNDLE MidasConnector::getEventBufferHandle() const {
return hBufEvent;
}
// Setters for the private variables
void MidasConnector::setEventId(short eventId) {
event_id = eventId;
}
void MidasConnector::setTriggerMask(short triggerMask) {
trigger_mask = triggerMask;
}
void MidasConnector::setSamplingType(int samplingType) {
sampling_type = samplingType;
}
void MidasConnector::setBufferSize(int bufferSize) {
buffer_size = bufferSize;
}
void MidasConnector::setBufferName(const char* bufferName) {
strncpy(buffer_name, bufferName, sizeof(buffer_name));
}
void MidasConnector::setTimeout(int timeoutMillis) {
timeout_millis = timeoutMillis;
}
void MidasConnector::setEventBufferHandle(HNDLE eventBufferHandle) {
hBufEvent = eventBufferHandle;
}
bool MidasConnector::ConnectToExperiment() {
// Connect to the experiment
int status = cm_connect_experiment(host_name_, experiment_name_, client_name_, NULL);
if (status != CM_SUCCESS) {
// Handle connection error
return false;
}
return true;
}
void MidasConnector::DisconnectFromExperiment() {
// Disconnect from the experiment
cm_disconnect_experiment();
}
bool MidasConnector::OpenEventBuffer() {
int status = bm_open_buffer(buffer_name, buffer_size, &hBufEvent);
if (status != BM_SUCCESS && status != BM_CREATED) {
cm_msg(MERROR, client_name_, "Cannot open buffer \"%s\", bm_open_buffer() status %d", buffer_name, status);
return false;
}
return true;
}
bool MidasConnector::SetCacheSize(int cacheSize) {
bm_set_cache_size(hBufEvent, cacheSize, 0);
return true;
}
bool MidasConnector::RequestEvent() {
int request_id;
int status = bm_request_event(hBufEvent, event_id, trigger_mask, sampling_type, &request_id, NULL);
return status == BM_SUCCESS;
}
bool MidasConnector::ReceiveEvent(void* eventBuffer, int& maxEventSize) {
int status = bm_receive_event(hBufEvent, eventBuffer, &maxEventSize, timeout_millis);
return status == BM_SUCCESS;
}
|
Attachment 3: main.cpp
|
#include "event_processor/EventProcessor.h"
#include "data_transmitter/DataTransmitter.h"
#include "midas_connector/MidasConnector.h"
#include "json.hpp"
#include <fstream>
INT hBufEvent1;
INT hBufEvent2;
// Function to initialize MIDAS and open an event buffer
bool initializeMidas(MidasConnector& midasConnector, const nlohmann::json& config) {
// Set the MidasConnector properties based on the config
midasConnector.setEventId(config["eventId"].get<short>());
midasConnector.setTriggerMask(config["triggerMask"].get<short>());
midasConnector.setSamplingType(config["samplingType"].get<int>());
midasConnector.setBufferSize(config["bufferSize"].get<int>());
midasConnector.setBufferName(config["bufferName"].get<std::string>().c_str());
midasConnector.setBufferSize(config["bufferSize"].get<int>());
// Call the ConnectToExperiment method
if (!midasConnector.ConnectToExperiment()) {
return false;
}
// Call the OpenEventBuffer method
if (!midasConnector.OpenEventBuffer()) {
return false;
}
// Set the buffer cache size if requested
midasConnector.SetCacheSize(config["cacheSize"].get<int>());
// Place a request for a specific event id
if (!midasConnector.RequestEvent()) {
return false;
}
return true;
}
int main() {
// Read configuration from a JSON file
nlohmann::json config;
std::ifstream configFile("config.json");
configFile >> config;
configFile.close();
// Initialize MidasConnector and connect to the MIDAS experiment
MidasConnector midasConnector(config["clientName"].get<std::string>().c_str());
if (!initializeMidas(midasConnector, config)) {
printf("Error: Failed to initialize MIDAS.\n");
return 1;
}
// Read the maximum event size from the JSON configuration
INT max_event_size = config["maxEventSize"].get<int>();
// Allocate memory for storing event data dynamically
void* event_data = malloc(max_event_size);
// Initialize EventProcessor with detector mapping file and verbosity flag
EventProcessor eventProcessor(config["detectorMappingFile"].get<std::string>(), config["verbose"].get<bool>());
// Initialize DataTransmitter with the ZeroMQ address
DataTransmitter dataPublisher(config["zmqAddress"].get<std::string>());
// Connect to the ZeroMQ server
if (!dataPublisher.bind()) {
// Handle connection error
printf("Error: Failed to bind to port %s.\n", config["zmqAddress"].get<std::string>().c_str());
return 1;
} else {
printf("Connected to the ZeroMQ server.\n");
}
// Event processing loop
while (true) {
midasConnector.ReceiveEvent(event_data, max_event_size);
// Process data once we have it
eventProcessor.processEvent(event_data, max_event_size);
// Serialize the event data with EventProcessor and store it in serializedData
std::string serializedData = eventProcessor.getSerializedData();
// Send the serialized data to the ZeroMQ server with DataTransmitter
if (!dataPublisher.publish(serializedData)) {
// Handle send error
printf("Error: Failed to send serialized data.\n");
}
}
// Cleanup and finalize your application
midasConnector.DisconnectFromExperiment(); // Disconnect from the MIDAS experiment
return 0;
}
|
2884
|
28 Oct 2024 |
Amy Roberts | Bug Report | Difficulty running MIDAS on Rocky 9.4 | > Now for each timeout it will print detailed syscall and timing information, if time goes backwards, it should catch it.
It appears that time is moving forward:
[aroberts@sdfcdmsdaq build]$ odbedit
[ODBEdit,ERROR] [odb.cxx:2043:db_open_database,ERROR] Removed ODB client 'ODBEdit', index 0 because process pid 1617119 does
not exists
[ODBEdit,INFO] Removed open record flag from "/Experiment/Security/RPC hosts/Allowed hosts"
[ODBEdit,INFO] Removed exclusive access mode from "/Experiment/Security/RPC hosts/Allowed hosts"
[ODBEdit,INFO] Corrected 1 ODB entries
[ODBEdit,INFO] Deleted entry '/System/Clients/1617119' for client 'ODBEdit' because it is not connected to ODB
[ODBEdit,INFO] Client 'ODBEdit' on buffer 'SYSMSG' removed by bm_open_buffer because process pid 1617119 does not exist
[local:amy_test:S]/>ss_semaphore_wait_for: semop/semtimedop(5) returned -1, errno 11 (Resource temporarily unavailable),
start time 0xd4fd98f6, now 0xd4fdc0ef, dt 0x000027f9, timeout 0x00002710 ms, SEMAPHORE TIMEOUT!
[ODBEdit,ERROR] [odb.cxx:2489:db_lock_database,ERROR] cannot lock ODB semaphore, timeout 10000 ms, aborting...
Aborted (core dumped) |
|