02 Mar 2016, ZiyiGuo, Forum, Problem with BLTRead
|
> > Dear all,
> >
> > I'm using MIDAS system and CAEN V1721 to digitize the waveform from photomultipliers (
> > and the link bridge to PC is V2718 ). I use BLTRead to read data of the digitizer, but
> > I found that if the event counting rate is high ( about 100KB/s ), the communication
> > of V1721 and PC would be suspended randomly, and I get an error code of -2. Could you
> > give me some suggestion? Thanks a lot.
>
> Hi,
>
> Can you provide the BLTread call fragment code and the PC /var/log/messages at the time of
> the hang up.
> What is needed to restart the daq?
>
> PAA
Hi Pierre-Andre,
Sorry for my late reply; the data acquisition system is currently running another experiment.
Here is my code. Is there something wrong? Thanks!
/* Read FADC data */
int NByteOfOneEvent = HeadSize + SampSize * NChannel;
int NDWordOfOneEvent = NByteOfOneEvent/4;
/* 1. Create FADC bank. One bank for one branch of a tree or one array branch with length. */
bk_create(pevent, "FADC", TID_DWORD, (void**)&pdata);
uint32_t size_remaining_dwords;
int dwords_read;
/* 2. Read out the event and assign it to pdata (bank buffer) */
// read the size of the event to be read
sCAEN = CAENComm_Read32(hFADC[card], V1721_EVENT_SIZE, &size_remaining_dwords);
if( size_remaining_dwords < NDWordOfOneEvent ) {
    printf("\r\nSize of available data is less than the required size of one event.\r\n");
}
/* Read */
DWORD *pFadcData;
sCAEN = CAENComm_BLTRead(hFADC[card], V1721_EVENT_READOUT_BUFFER, pdata, NDWordOfOneEvent, &dwords_read);
// The code in this "if" restarts the communication and logs the time if the communication was suspended
if(sCAEN != 0)
{
    //printf("sCAEN =%d \n", sCAEN);
    time_t t = time(0);
    char tmp[64];
    strftime(tmp,sizeof(tmp),"%Y/%m/%d %X %Z",localtime(&t));
    fprintf(logfile,"%s",tmp); // pass the timestamp as data, not as the format string
    fprintf(logfile,"\n Here met communication error \n");
    printf(" Here met communication error \n");
    //re-establish communication
    sCAEN = CAENComm_CloseDevice(hFADC[card]);
    fprintf(logfile,"sCAEN =%d, device closed **********\n", sCAEN);
    ss_sleep(2000);
    sCAEN = CAENComm_OpenDevice(CAENComm_PCIE_OpticalLink, l, d, FADCBA[card], &(hFADC[card]));
    if (sCAEN == CAENComm_Success) {
        fprintf(logfile,"re-establish communication, handle:%d, sCAEN=%d \n", hFADC[card], sCAEN);
    }
    else {
        sCAEN = CAENComm_OpenDevice(CAENComm_PCIE_OpticalLink, l, d, FADCBA[card], &(hFADC[card]));
        fprintf(logfile,"try open device again sCAEN= %d\n", sCAEN);
    }
    //pause ongoing reading process
    sCAEN = ov1721_AcqCtl(hFADC[card], V1721_RUN_STOP);
    sCAEN = CAENComm_Read32(hFADC[card], V1721_EVENT_STORED, &eStored);
    //discard FADC buffer
    sCAEN = CAENComm_Write32(hFADC[card], V1721_SW_CLEAR, 0);
    fprintf(logfile," number of %d events discarded \n\n", eStored);
    sCAEN = ov1721_AcqCtl(hFADC[card], V1721_RUN_START);
}
//dwords_read: number of words actually read from the device.
if( dwords_read != NDWordOfOneEvent ) {
    printf("\r\nSize of data read out doesn't equal to the required size of one event. \r\n");
}
EvtCounterFadc[card] = *(pdata+2) & 0x00ffffff;
/* 3. Update bank pointer position */
pdata += dwords_read;
/* 4. Finish one bank */
bk_close(pevent, pdata);
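One thing the fragment above leaves open is what happens to the bank pointer after a failed transfer: `pdata` is advanced by `dwords_read` even when the read returned an error. A minimal sketch of a guard follows. The helper `dwords_to_commit` is hypothetical (it is not part of CAENComm or MIDAS), and it assumes that a status of 0 means success, as the `sCAEN != 0` check above does.

```c
#include <assert.h>

/* Hypothetical helper (not a CAENComm or MIDAS call): decide how many
 * 32-bit words of a block transfer may be committed to the bank.
 * Assumes status == 0 means success, as in the check above. */
static int dwords_to_commit(int status, int dwords_read, int dwords_expected)
{
    assert(dwords_expected >= 0);   /* caller must pass a sane event size */
    if (status != 0)
        return 0;                   /* failed transfer: commit nothing */
    if (dwords_read < 0 || dwords_read > dwords_expected)
        return 0;                   /* count out of range: do not trust it */
    return dwords_read;             /* success: commit what was read */
}
```

To use something like this, the status of the block read would have to be saved in its own variable before the recovery branch overwrites `sCAEN`, and the bank pointer would then be advanced by the returned value instead of by `dwords_read` directly.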
|
26 Jun 2019, Hassan, Forum, Problem transferring fetest data from the remote frontend to the backend
|
Hi again, we now have Midas installed on the Rpi (remote frontend machine) and
have managed to run Fetest on it. Now we are at a stage where we want to send
the Fetest data over to the Data Acquisition machine, which also has Midas
installed. We want this data to be read into the Webserver Status page. We have
tried commands such as: (but Fetest then doesn't run)
./fetest -h DAQ-system-ip-address
./fetest -e sampleexpt -h DAQ-system-Ip-address
./fetest -e sampleexpt -h DAQ-system-Ip-address-with-webserver-port
Our experiment name is sampleexpt on both the Rpi and the DAQ machine, in their respective
exptab files. Maybe the Rpi is getting confused as to whether it should be
running the experiment on the Rpi or the DAQ. We need it to run on the DAQ.
Does the mserver have any role in this?
Thank you for your kind help (we summer interns are really stuck!) |
26 Jun 2019, Konstantin Olchanski, Forum, Problem transferring fetest data from the remote frontend to the backend
|
> Hi again, we now have Midas installed on the Rpi (remote frontend machine) and
> have managed to run Fetest on it. Now we are at a stage where we want to send
> the Fetest data over to the Data Acquisition machine ...
>
> Does the mserver have any role in this?
>
Yes. mserver runs on your daq machine and handles connections from frontends running on frontend machines. It needs to be configured
correctly before it will work: in odb on your daq machine, non-local rpc has to be enabled and the frontend machine has to be added to the
midas rpc access control list.
Read this:
https://midas.triumf.ca/MidasWiki/index.php/Quickstart_Linux#Running_with_one_or_more_REMOTE_frontends
And this:
https://midas.triumf.ca/MidasWiki/index.php/Security#MIDAS_programs_on_remote_machines
K.O. |
12 Mar 2019, Francesco Renga, Forum, Problem stopping every second run
|
Dear all,
I'm running a DAQ frontend and it works well if one single run is
taken. If I try to take a second run right after, the run is performed but, when
stopping it, I get the error messages below. Any hint?
Thank you for your help,
Francesco
11:42:24.012 2019/03/12 [mhttpd,ERROR] [midas.c:6022:cm_shutdown,ERROR] Killing
and Deleting client 'cygnus_daq' pid 12472
11:42:24.012 2019/03/12 [mhttpd,ERROR] [midas.c:6019:cm_shutdown,ERROR] Cannot
connect to client 'cygnus_daq' on host 'localhost', port 46341
11:42:24.012 2019/03/12 [mhttpd,ERROR] [midas.c:9539:rpc_client_connect,ERROR]
cannot connect to host "localhost", port 46341: connect() returned -1, errno 111
(Connection refused)
11:42:24.012 2019/03/12 [mhttpd,ERROR] [midas.c:10821:rpc_client_call,ERROR]
call to "cygnus_daq" on "localhost" RPC "rc_transition": error,
ss_recv_net_command() status 411
11:42:24.012 2019/03/12 [mhttpd,ERROR] [system.c:4715:ss_recv_net_command,ERROR]
error receiving network command header, see messages
11:42:24.011 2019/03/12 [mhttpd,ERROR] [system.c:4661:recv_tcp2,ERROR]
unexpected connection closure
11:42:23.994 2019/03/12 [cygnus_daq,ERROR] [midas.c:1951:,ERROR]
cm_disconnect_experiment not called at end of program |
13 Mar 2019, Konstantin Olchanski, Forum, Problem stopping every second run
|
> I'm running a DAQ frontend and it works well if one single run is
> taken. If I try to take a second run right after, the run is performed but, when
> stopping it, I get the error messages below. Any hint?
Sure. I will read the error messages for you: note that they come in reverse time order - oldest message is at the bottom. I
will reverse them in my reading:
> 11:42:23.994 2019/03/12 [cygnus_daq,ERROR] [midas.c:1951:,ERROR]
> cm_disconnect_experiment not called at end of program
your frontend has exited (this error message is printed by the atexit() code, so you did not crash but called exit(),
somehow).
> 11:42:24.011 2019/03/12 [mhttpd,ERROR] [system.c:4661:recv_tcp2,ERROR]
> unexpected connection closure
mhttpd reports that the socket connection to your frontend has closed (because the frontend stopped, closing all its
sockets)
> 11:42:24.012 2019/03/12 [mhttpd,ERROR] [system.c:4715:ss_recv_net_command,ERROR]
> error receiving network command header, see messages
mhttpd network code next layer reports this same error again
> 11:42:24.012 2019/03/12 [mhttpd,ERROR] [midas.c:10821:rpc_client_call,ERROR]
> call to "cygnus_daq" on "localhost" RPC "rc_transition": error,
> ss_recv_net_command() status 411
mhttpd network code next layer reports this same error again; now we see that it was trying
to execute a "run start" (or "stop") RPC call to your frontend. But your frontend unexpectedly
shut down (instead of replying to the RPC).
In the next messages after that, mhttpd decided that your frontend is faulty (does not respond to RPC correctly) and tried to
shut it down, but failed (cannot connect, etc). In the last message, mhttpd cleans up your frontend from ODB (because the
frontend did not clean up after itself - it did not call cm_disconnect_experiment(), per the very first error message).
So this is what we see from the midas messages - your frontend unexpectedly exited during the run transition - if as you
say the run was stopping at the time, it would be in your end_of_run() function.
To debug this, I would do:
a) put some printf() statements in end_of_run() and see what they say during the crash
b) run the frontend inside the debugger, you may need to set a breakpoint on exit() or something like that.
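Both suggestions can be sketched in a few lines of plain C. This is only an illustration under assumptions: the helper names `trace_step`, `install_exit_trap` and `mark_clean_shutdown` are hypothetical and not part of MIDAS. The idea is to flush a marker after every step of end_of_run(), and to register an atexit() handler that complains when the program exits without having announced a clean shutdown.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static int clean_shutdown = 0;   /* set to 1 just before an intended exit */

/* Print a progress marker and flush it, so it survives an abrupt exit. */
static int trace_step(const char *where)
{
    assert(where != NULL);
    int n = fprintf(stderr, "[trace] %s\n", where);
    fflush(stderr);
    return n;
}

/* Runs at every exit(); reports exits that were not announced. */
static void report_unexpected_exit(void)
{
    if (!clean_shutdown)
        fprintf(stderr, "exit() called without clean shutdown\n");
}

/* Register the trap once, early in the program. Returns 0 on success. */
static int install_exit_trap(void)
{
    return atexit(report_unexpected_exit);
}

/* Call right before the program is supposed to end. */
static void mark_clean_shutdown(void)
{
    clean_shutdown = 1;
}
```

The last `[trace]` line printed before the "without clean shutdown" message shows how far end_of_run() got, and `report_unexpected_exit` is a convenient place for a debugger breakpoint.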
Good luck,
K.O.
>
> Thank you for your help,
> Francesco
>
>
> 11:42:24.012 2019/03/12 [mhttpd,ERROR] [midas.c:6022:cm_shutdown,ERROR] Killing
> and Deleting client 'cygnus_daq' pid 12472
>
> 11:42:24.012 2019/03/12 [mhttpd,ERROR] [midas.c:6019:cm_shutdown,ERROR] Cannot
> connect to client 'cygnus_daq' on host 'localhost', port 46341
>
> 11:42:24.012 2019/03/12 [mhttpd,ERROR] [midas.c:9539:rpc_client_connect,ERROR]
> cannot connect to host "localhost", port 46341: connect() returned -1, errno 111
> (Connection refused)
>
> 11:42:24.012 2019/03/12 [mhttpd,ERROR] [midas.c:10821:rpc_client_call,ERROR]
> call to "cygnus_daq" on "localhost" RPC "rc_transition": error,
> ss_recv_net_command() status 411
>
> 11:42:24.012 2019/03/12 [mhttpd,ERROR] [system.c:4715:ss_recv_net_command,ERROR]
> error receiving network command header, see messages
>
> 11:42:24.011 2019/03/12 [mhttpd,ERROR] [system.c:4661:recv_tcp2,ERROR]
> unexpected connection closure
>
> 11:42:23.994 2019/03/12 [cygnus_daq,ERROR] [midas.c:1951:,ERROR]
> cm_disconnect_experiment not called at end of program |
02 Feb 2007, Exaos Lee, Bug Fix, Problem solved by Re-define _syscall0(...)
|
OK, I searched and found that my kernel doesn't support "_syscall0" any more. So I patched system.c as follows (from line 954):
#if defined(OS_DARWIN)
// blank
#elif defined(OS_LINUX)
#include <sys/syscall.h>
#include <unistd.h>
#undef _syscall0
#define _syscall0(type, name) \
type name(void) \
{\
return syscall(__NR_##name); \
}
_syscall0(pid_t,gettid)
#endif
My kernel version:
exaos@memes midas>$ uname -a
Linux memes 2.6.17-10-generic #2 SMP Tue Dec 5 22:28:26 UTC 2006 i686 GNU/Linux
Maybe it's not the perfect way, but it works.  |
06 Feb 2007, Stefan Ritt, Bug Fix, Problem solved by Re-define _syscall0(...)
|
> Exaos Lee wrote: Maybe it's not the perfect way, but it works.
I changed it to:
#ifdef OS_UNIX
return syscall(SYS_gettid);
#endif /* OS_UNIX */
without any #define.
Does this work for you?
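For reference, a self-contained sketch of the same idea, assuming Linux with glibc. The wrapper name `current_thread_id` is hypothetical and is not the function used in system.c.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* makes sure syscall() is declared on glibc */
#endif
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: returns the kernel thread id on Linux. */
static pid_t current_thread_id(void)
{
    long tid = syscall(SYS_gettid);   /* direct system call, no _syscall0 macro */
    assert(tid > 0);                  /* gettid does not fail on Linux */
    return (pid_t) tid;
}
```

In the main thread of a process the value equals getpid(); in other threads it differs, which is what makes it useful as a per-thread identifier.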
- Stefan |
14 Oct 2014, Konstantin Olchanski, Bug Report, Problem in mfe multithread equipments
|
In the ALPHA experiment at CERN I found a problem in mfe.c handling of multithreaded equipments. This problem was in
some forms introduced around May 2013 and around Aug 2013 (commit
https://bitbucket.org/tmidas/midas/src/45984c35b4f7/src/mfe.c) (I hope I got it right).
The effect was very odd - if the event rate of a multithreaded equipment was more than 100 Hz, the event counters on the midas
status page would not increment and the frontend would crash at end of run. Other than that, all the events from the
multithreaded equipment seemed to appear in the SYSTEM buffer and in the data file normally.
This happened: in mfe.c::receive_trigger_event() a loop was introduced (previously,
there was no loop there - there was and still is a loop outside of receive_trigger_event()):
while (1) {
   wait 10 ms for an event
   process event, loop back
   if there is no event, exit
}
Obviously, if the event rate is more than 100 Hz (less than 10 ms between events),
the 10 ms wait will always return an event and we will never exit this loop.
So the mfe.c main loop is now stuck here and will not process any periodic activity
such as updating the equipment statistics (event counters on the midas status page)
or running periodic equipments in the same front end program.
The crash at the end of run will be caused by a timeout in responding to the "end of run" RPC call.
I have a patch in testing that solves this problem by restoring receive_trigger_event() to the original configuration, i.e.
https://bitbucket.org/tmidas/midas/src/6899b96a4f8177d4af92035cd84aadf5a7cbc875/src/mfe.c?at=develop
K.O. |
14 Oct 2014, Konstantin Olchanski, Bug Report, Problem in mfe multithread equipments
|
For my reference:
good version: https://bitbucket.org/tmidas/midas/src/6899b96a4f8177d4af92035cd84aadf5a7cbc875/src/mfe.c?at=develop
first breakage: https://bitbucket.org/tmidas/midas/src/c60259d9a244bdcd296a8c5c6ab0b91de27f9905/src/mfe.c?at=develop
second breakage: https://bitbucket.org/tmidas/midas/src/45984c35b4f7257f90515f29116dec6fb46f2ebc/src/mfe.c?at=develop
The "first breakage" may actually be okay, because there the badnik loop loops over ring buffers, not infinite. But I cannot test it anymore.
K.O. |
15 Oct 2014, Stefan Ritt, Bug Report, Problem in mfe multithread equipments
|
You are absolutely correct, the code is certainly wrong. It looks to me like the
while (rbh)
was put in there for some testing, and I forgot to remove it. The only thing I could imagine is that we want to have a while loop there for performance reasons, like
readout_start = ss_millitime();
while (ss_millitime() - readout_start < (DWORD) eq_info->period) {
   read event
   return 0 if no event found
}
You find this code also in the check_polled_events() routine. It ensures that the routine does not return after every single event, but after the period defined in the
equipment (which is usually 100 ms for polled events). This way the code is more efficient, since we do not check for RPC calls between every event, but just 10 times
per second. This way you can shovel more events through the system, while still being responsive to run stops.
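The behaviour of such a time-limited loop can be checked with a toy model. This is a sketch under assumptions, not the mfe.c implementation: `read_for_one_period` is a hypothetical function in which every read costs a fixed amount of virtual time and `elapsed` stands in for `ss_millitime() - readout_start`.

```c
#include <assert.h>

/* Hypothetical model: every read takes read_cost_ms of virtual time and
 * `queued` events are waiting. Returns the number of events read in one call. */
static int read_for_one_period(int queued, int read_cost_ms, int period_ms)
{
    assert(queued >= 0 && read_cost_ms > 0 && period_ms > 0);
    int elapsed = 0;   /* stands in for ss_millitime() - readout_start */
    int n_read = 0;
    while (elapsed < period_ms) {
        if (n_read == queued)
            return n_read;         /* no event found: return early */
        n_read++;                  /* read one event */
        elapsed += read_cost_ms;
    }
    return n_read;                 /* period used up: back to the main loop */
}
```

However many events are queued, one call returns after roughly one period, so the caller gets back to its RPC checks about once per period instead of after every event or never.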
I don't have any hardware right now to test this, so please put my code above into the routine and commit it if it works.
I notice also a difference in both codes concerning the read buffer handles. The old code uses rbh2, while the new (wrong) code uses rbh. In your case probably both
handles are the same, so it works, but in other experiments, which might use several ring buffers, it will fail. So please use rbh instead of rbh2.
Let me know if it works for you, and if you see any difference in speed between the versions with and without the while loop (actually you will see this only if your trigger
rate maxes out the DAQ).
Cheers,
Stefan |
15 Oct 2014, Stefan Ritt, Bug Report, Problem in mfe multithread equipments
|
Please disregard my previous posting, you don't need the while loop, since it's already in the scheduler (around lines 2160 under /*---- send interrupt events ----*/).
But now I remember the rationale behind it. The loop over the rb[i] is because in MEG I have n calibration threads, each one running on a separate CPU core. So the receive_trigger_event() routine has to collect events from all the
threads, each of them having one ring buffer. In the process of implementing EQ_USER, I changed this somehow, and apparently broke the code by making the while() loop run forever if the event rate is over 100 Hz.
So for the moment please remove the while loop completely, and I will worry later about putting it back correctly when MEG starts again next year.
/Stefan |
16 Oct 2014, Stefan Ritt, Bug Report, Problem in mfe multithread equipments
|
> while (1) {
>    wait 10 ms for an event
>    process event, loop back
>    if there is no event, exit
> }
This code has been rewritten now and should work for event rates >100 Hz.
/Stefan |
09 Jun 2020, Isaac Labrie Boulay, Info, Preparing the VME hardware - VME address jumpers.
|
Hey folks,
I'm currently working on setting up a MIDAS experiment and I am following the
"Setup MIDAS experiment at Triumf" page on MidasWiki
(https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF).
The 3rd line of the hardware checklist under the "Prepare VME hardware" section
has a link to a page that doesn't exist anymore. I'm trying to figure out how to
set up the VME address jumpers on the VME modules.
Does anyone know how to set up the VME modules? Or can anyone send me a link to
instructions?
Thanks a lot for your time.
Isaac |
10 Jun 2020, Konstantin Olchanski, Info, Preparing the VME hardware - VME address jumpers.
|
Hi, if you are not using any VME hardware, then you have no VME address jumpers to
set. https://en.wikipedia.org/wiki/VMEbus
K.O. |
12 Jun 2020, Isaac Labrie Boulay, Info, Preparing the VME hardware - VME address jumpers.
|
> Hi, if you are not using any VME hardware, then you have no VME address jumpers to
> set. https://en.wikipedia.org/wiki/VMEbus
>
> K.O.
Hi thanks for taking the time to help me out. I am using a VME-MWS in this experiment.
Let me know what you think.
Isaac |
05 Aug 2019, Stefan Ritt, Info, Precedence of equipment/common structure
|
Today I fixed a long-annoying problem. We have in each front-end an equipment structure
which defines the event id, event type, readout frequency etc. This is mapped to the ODB
subtree
/Equipment/<name>/Common
In the past, the ODB setting took precedence over the frontend structure. We defined this
like 25 years ago and I forgot what the exact reason was. It causes however many people
(including myself) to fall into this trap: You change something in the front-end EQUIPMENT
structure, you restart the front-end, but the new setting does not take effect since the
(old) ODB value took precedence. After some debugging you find out that you have to both
change the EQUIPMENT structure (which defines the default value for a fresh ODB) and the
ODB value itself.
So I changed it in the current develop tree that the front-end structure takes precedence.
You still have a hot-link, so if you want to change anything while the front-end is running
(like the readout period), you can do that in the ODB and it takes effect immediately. But
when you start the front-end the next time, the value from the EQUIPMENT structure is
taken again. So please be aware of this new feature.
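The two start-up policies can be written down as a toy model. The types and the function below are hypothetical stand-ins and not the MIDAS EQUIPMENT structure or the ODB API; they only illustrate which value ends up in effect when the front-end starts.

```c
#include <assert.h>

/* Toy stand-ins (hypothetical, not the MIDAS EQUIPMENT structure or ODB API). */
typedef struct {
    int event_id;
    int period_ms;
} common_t;

typedef enum { ODB_WINS, FRONTEND_WINS } precedence_t;

/* Decide the effective settings at front-end start-up.
 * odb_has_entry is 0 for a fresh ODB, where the code values are the defaults. */
static common_t common_at_startup(common_t code, common_t odb,
                                  int odb_has_entry, precedence_t rule)
{
    assert(rule == ODB_WINS || rule == FRONTEND_WINS);
    if (!odb_has_entry || rule == FRONTEND_WINS)
        return code;   /* values from the front-end source code are used */
    return odb;        /* old behaviour: the existing ODB value wins */
}
```

With `ODB_WINS` and an existing ODB entry, a period changed in the source code from 1000 ms to 500 ms stays at 1000 ms after a restart, which is the trap described above. With `FRONTEND_WINS` the restart yields 500 ms.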
Happy BC day,
Stefan |
06 Aug 2019, Thomas Lindner, Info, Precedence of equipment/common structure
|
Hi Stefan,
This change does not sound like a good idea to me. I think that this change will cause just as much confusion as before; probably more since you are changing established behaviour.
It is common that MIDAS frontends usually have a Settings directory in the ODB where details about the frontend behaviour are set. The Settings directory might get initialized from strings in the frontend code, but after initialization the Settings in the ODB have precedence and define how the frontend will behave. Indeed, most of my custom webpages are designed to control my frontend programs through their Settings ODB tree.
So you have created a situation where Frontend/Settings in the ODB has precedence and is the main place for changing frontend behaviour; but Frontend/Common in the ODB is essentially meaningless and will get overwritten the next time the frontend restarts. That seems likely to confuse people.
If you really want to make this change I suggest that you delete the Frontend/Common directory entirely; or make it read-only so that people aren't fooled into changing it.
Thomas
> Today I fixed a long-annoying problem. We have in each front-end an equipment structure
> which defined the event id, event type, readout frequency etc. This is mapped to the ODB
> subtree
>
> /Equipment/<name>/Common
>
> In the past, the ODB setting took precedence over the frontend structure. We defined this
> like 25 years ago and I forgot what the exact reason was. It causes however many people
> (including myself) to fall into this trap: You change something in the front-end EQUIPMENT
> structure, you restart the front-end, but the new setting does not take effect since the
> (old) ODB value took precedence. After some debugging you find out that you have to both
> change the EQUIPMENT structure (which defines the default value for a fresh ODB) and the
> ODB value itself.
>
> So I changed it in the current develop tree that the front-end structure takes precedence.
> You still have a hot-link, so if you want to change anything while the front-end is running
> (like the readout period), you can do that in the ODB and it takes effect immediately. But
> when you start the front-end the next time, the value from the EQUIPMENT structure is
> taken again. So please be aware of this new feature.
>
> Happy BC day,
> Stefan |
06 Aug 2019, Stefan Ritt, Info, Precedence of equipment/common structure
|
Hi Thomas,
the change only affects Equipment/<name>/Common, not Equipment/<name>/Settings.
The Common subtree is still hot-linked into the frontend, so things can be changed while it is running if needed. This mainly concerns the readout period of periodic events.
Sometimes you want to change this quickly without restarting the frontend. Changing the other settings is kind of dangerous. If you change the ID of an event on the fly
you won't be able to analyze your data. So having this read-only in the ODB might be a good idea (you still need it in the ODB for the status page), except for the values
you want to change (like the readout period).
Let's see what other people have to say.
Stefan
> Hi Stefan,
>
> This change does not sound like a good idea to me. I think that this change will cause just as much confusion as before; probably more since you are changing established behaviour.
>
> It is common that MIDAS frontends usually have a Settings directory in the ODB where details about the frontend behaviour are set. The Settings directory might get initialized from strings in the frontend code, but after initialization the Settings in the ODB have precedence and define how the frontend will behave. Indeed, most of my custom webpages are designed to control my frontend programs through their Settings ODB tree.
>
> So you have created a situation where Frontend/Settings in the ODB has precedence and is the main place for changing frontend behaviour; but Frontend/Common in the ODB is essentially meaningless and will get overwritten the next time the frontend restarts. That seems likely to confuse people.
>
> If you really want to make this change I suggest that you delete the Frontend/Common directory entirely; or make it read-only so that people aren't fooled into changing it.
>
> Thomas
>
>
>
> > Today I fixed a long-annoying problem. We have in each front-end an equipment structure
> > which defined the event id, event type, readout frequency etc. This is mapped to the ODB
> > subtree
> >
> > /Equipment/<name>/Common
> >
> > In the past, the ODB setting took precedence over the frontend structure. We defined this
> > like 25 years ago and I forgot what the exact reason was. It causes however many people
> > (including myself) to fall into this trap: You change something in the front-end EQUIPMENT
> > structure, you restart the front-end, but the new setting does not take effect since the
> > (old) ODB value took precedence. After some debugging you find out that you have to both
> > change the EQUIPMENT structure (which defines the default value for a fresh ODB) and the
> > ODB value itself.
> >
> > So I changed it in the current develop tree that the front-end structure takes precedence.
> > You still have a hot-link, so if you want to change anything while the front-end is running
> > (like the readout period), you can do that in the ODB and it takes effect immediately. But
> > when you start the front-end the next time, the value from the EQUIPMENT structure is
> > taken again. So please be aware of this new feature.
> >
> > Happy BC day,
> > Stefan |
06 Aug 2019, Stefan Ritt, Info, Precedence of equipment/common structure
|
After some internal discussion, I decided to undo my previous change again, in order not to break existing habits. Instead, I created a new function
set_odb_equipment_common(equipment, name);
which should be called from frontend_init() and which explicitly copies all data from the equipment structure in the front-end into the ODB.
Stefan |
09 Aug 2019, Konstantin Olchanski, Info, Precedence of equipment/common structure
|
> Today I fixed a long-annoying problem. ...
> /Equipment/<name>/Common
> In the past, the ODB setting took precedence over the frontend structure...
> We defined this like 25 years ago and I forgot what the exact reason was.
> It causes however many people (including myself) to fall into this trap: ...
There is a good number of confusions regarding entries in /eq/xxx/common:
- for some of them, the frontend code settings take precedence and overwrite settings in odb ("frontend file name")
- for some of them, ODB takes precedence and frontend code values are ignored ("read on" and "period")
- for some of them, changes in ODB take effect immediately (via db_watch) ("period")
- for some of them, frontend restart is required for changes to take effect (output event buffer name "buffer")
- some of them continuously update the odb values ("status", "status color")
I do not think there is a simple way to improve on this.
(One solution would replace the single "common" with several subdirectories, "per function":
one would have items where the code takes precedence, one would have items where odb takes
precedence (in effect, "standard settings"), one would have items that the frontend always updates
and that should not be changed via odb ("frontend name", etc). I am not sure this one solution
is necessarily an "improvement").
Lacking any ideas for improvements, I vote for the status quo. (plus a review of the documentation to ensure we have clearly
written up what each entry in "common" does and whether the user is permitted to edit it in odb).
K.O. |
|