01 Dec 2024, Pavel Murat, Bug Report, EQ_PERIODIC-only equipment ?
|
> There is no requirement that you pair an EQ_PERIODIC with an EQ_TRIGGER. Take for example
>
> midas/examples/experiment/frontend.cxx
>
> and remove the triggered event there. The frontend runs happily with the periodic event only (I just tried that myself). You probably have some problem in
> your event definition. Start with the running example frontend, and add your code line by line until you see the problem.
Hi Stefan, thank you very much!
As the pointer to the readout function and pointers to device drivers are all defined in the same structure (EQUIPMENT),
I was naively assuming that the readout function should be set during the class driver initialization.
Now it is clear that the equipment responding to EQ_PERIODIC events doesn't have to have drivers,
and specifying the readout function is the responsibility of the user.
I got around it exactly this way yesterday, but thought I was hacking the system :)
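For the record, a minimal sketch of such a driver-less periodic equipment, modeled on the
EQUIPMENT list in midas/examples/experiment/frontend.cxx (the event ID, buffer name and period
below are placeholders, not a definitive recipe):

INT read_periodic_event(char *pevent, INT off);   /* user-supplied readout routine */

EQUIPMENT equipment[] = {
   {"Periodic",                       /* equipment name */
    {2, 0,                            /* event ID, trigger mask */
     "SYSTEM",                        /* event buffer */
     EQ_PERIODIC,                     /* periodic equipment: no drivers needed */
     0,                               /* event source (unused here) */
     "MIDAS",                         /* data format */
     TRUE,                            /* enabled */
     RO_RUNNING | RO_TRANSITIONS,     /* when to read */
     1000,                            /* readout period in ms */
     0, 0, 0,                         /* event limit, subevents, history */
     "", "", ""},
    read_periodic_event,              /* readout routine set by the user */
   },
   {""}
};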
-- regards, Pasha |
29 Dec 2024, Pavel Murat, Forum, time ordering of run transition calls to TMFeEquipment things
|
Dear MIDAS experts,
I have a question about the "tmfe approach" to implementing MIDAS frontends. If I read the code correctly,
within this approach it is the TMFeEquipment things, not the TMFrontends themselves,
which handle the run transitions - the TMFrontend class
https://bitbucket.org/tmidas/midas/src/423082fb67c7711813fcda61f7cd03784c398f49/include/tmfe.h#lines-306:378
simply doesn't have methods to handle those directly.
So how does a user control the sequence in which TMFeEquipment::HandleBeginRun functions of different
TMFeEquipment pieces are called at begin run? There are two cases to consider: TMFeEquipment things
defined by the same TMFrontend and by different TMFrontends.
Many thanks and happy holidays to everyone!
-- regards, Pasha
|
02 Jan 2025, Pavel Murat, Forum, time ordering of run transition calls to TMFeEquipment things
|
Hi K.O., your clarification is much appreciated!
"
> I am not sure what you are trying to do. It is always easier to suggest a solution to a specific problem.
I think I owe you an explanation :)
Consider ~ 40 nodes with two FPGAs (PCIE cards) per node, talking to the detector hardware.
One of those FPGAs, in addition to reading the data, performs the global timing synchronization.
The high-bandwidth data readout is not controlled by MIDAS, so all frontends perform only 'slow control'-type functions.
In MIDAS language, an FPGA implements two different units of slow control equipment:
one - configuring and controlling a single FPGA (equipment type A), and another one - synchronizing
multiple FPGAs (equipment type B). On one of the nodes, unit A and unit B share the FPGA card,
so they better be controlled by the same frontend.
For one, I need to make sure that all type A equipment units, managed by multiple frontends,
are initialized before the [single] type B unit which shares the frontend with the type A unit.
And, of course, the end-of-run transition has to be handled in the opposite order - the type B unit
shuts down first.
As 'periodic' actions for all registered pieces of equipment are performed in the same loop [thread],
registering the equipment in the needed order - first A, then B - should give a solution - thanks for making that clear.
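Just to spell out what I mean, a rough sketch (the TMFeEquipment constructor arguments and the
TMFeResult/TMFeOk() return convention are my guesses and should be checked against include/tmfe.h;
only FeAddEquipment() and the Handle*Run() ordering are taken from your description):

// type A: configures and controls one FPGA
class EqTypeA : public TMFeEquipment {
public:
   using TMFeEquipment::TMFeEquipment;
   TMFeResult HandleBeginRun(int run_number) override {
      // configure this FPGA first
      return TMFeOk();
   }
   TMFeResult HandleEndRun(int run_number) override {
      // called in reverse order, i.e. after type B has shut down
      return TMFeOk();
   }
};

// type B: global timing synchronization across FPGAs
class EqTypeB : public TMFeEquipment {
public:
   using TMFeEquipment::TMFeEquipment;
   TMFeResult HandleBeginRun(int run_number) override {
      // runs after all type A equipments of this frontend
      return TMFeOk();
   }
};

// registration order defines the begin-of-run order (end-of-run is reversed)
void add_my_equipment(TMFrontend* fe)
{
   fe->FeAddEquipment(new EqTypeA("fpga01", __FILE__));   // A first ...
   fe->FeAddEquipment(new EqTypeB("sync01", __FILE__));   // ... then B
}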
>
> 1) "time ordering of run transitions" - of course midas transitions are ordered by transition sequence numbers
> and the tmfe class provides methods to control this. ditto for the mfe.cxx frontends.
>
> 2) for one TMFrontend, the order of calling HandleBeginRun() is the order in which equipments were added to the
> frontend using FeAddEquipment(). HandleEndRun() is called in reverse order. (I better check this).
the ordering of the rpc handler calls in tmfe's tr_stop/tr_pause/tr_resume functions is ok.
>
> 3) to have multiple TMFrontends in one program would be unusual (mfe.cxx frontends completely do not support
> this), but should work. Everything was coded to support this, but it was never tested in practice because we
> cannot invent any useful use-case for it. HandleBeginRun() handlers are likely to be called in the order the frontends are
> created. (I could check this and confirm it works, as long as you have a valid use-case for this configuration).
agreed, I don't think there is a good use case for that, so no need to spend time checking.
>
> 4) Frontend X has EquipmentA and EquipmentB, you want EqA::HandleBeginRun() to be called at run transition 200
> and EqB::HandleBeginRun() to be called at run transition 400.
>
> This is not directly supported by mfe.cxx frontends (the begin_run() handler is a global function) and I did not
> directly implement it in the TMFE frontend.
>
> But I think this would be a useful improvement. I will look into this.
In the simplest case, registering the equipment units in the right order is definitely the answer.
However, a single FPGA can perform multiple logically independent tasks and thus represent
multiple logical units of equipment. Those units, however, are not independent: they share the hardware (FPGA)
and thus depend on each other. Giving users full control over the sequence in which those logical units
execute their run transitions is quite likely to be needed, for example, to work around peculiarities of the
custom-made kernel drivers.
>
> Likely I will add per-equipment data members fEqConfBeginRunSeqNo, fEqConfEndRunSeqno, etc. Value 0 would
> unregister the corresponding run transition handler. This would clean up the code quite a bit, a bunch
> of RegisterTransitionXXX functions could go away.
this also makes sense. -- thanks again, regards, Pasha
>
> K.O. |
09 Dec 2003, Paul Knowles, , db_close_record non-local/non-return
|
Hi All,
I have found a weird one:
The following code executes on the frontend machine in the
frontend_exit() routine, and connects to the odb running on
another separate machine:
...
cm_msg(MINFO, __func__, "line %d", __LINE__);
cm_get_experiment_database(&hdb, NULL);
cm_msg(MINFO, __func__, "line %d", __LINE__);
status = db_find_key(hdb, 0, "/Experiment/Run Parameters", &hkey);
cm_msg(MINFO, __func__, "line %d, hkey=%d, status=%d",
       __LINE__, hkey, status);
checkstat("db_find_key returned status %d", status);
cm_msg(MINFO, __func__, "line %d", __LINE__);
status = db_close_record(hdb, hkey);
/* NOTREACHED!! the above call to db_close_record
   doesn't return! */
cm_msg(MINFO, __func__, "line %d, status=%d", __LINE__, status);
checkstat("db_close_record returned status %d", status);
checkstat is a macro that does the following:
#define checkstat(format, arg...)                    \
   do { if (status != DB_SUCCESS) {                  \
           cm_msg(MERROR, __func__, format, ## arg); \
           return FE_ERR_ODB; } } while (0)
The key exists, and the status of the search is 1
(i.e., DB_SUCCESS) and the rest of the code tries to run. What gets
really weird is that the db_close_record _doesn't_ _return_.
The code following the NOTREACHED comment just doesn't get
called. I get the message from the __LINE__ just in front
of the call, but not the message afterwards (cm_msg and printf
were tried). Somehow db_close_record is causing a non-local
exit or signal or something. No error message is printed and the
frontend continues to exit with exit code 0. But, since the rest
of my frontend_exit/odb closing doesn't happen, the odb is left in
a lost state requiring a cleanup. If I comment out the calls to
db_close_record, the rest of my frontend_exit runs normally
and the cm_disconnect_experiment() in mfe.c eventually closes my
open records correctly (I expect, anyway) and this is the present
workaround I am using. The terror I have is that several of my
hotlinked callback routines will call the close_record routine
when resetting illegal values. No end of hilarity will result there...
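For reference, the pattern in question is the usual hot-link open/close pair, roughly as below
(RUN_PARAM and the helper names are placeholders for my real code, not verbatim):

typedef struct { INT dummy; } RUN_PARAM;      /* placeholder for the real record */

static RUN_PARAM run_param;
static HNDLE hdb, hkey;

/* dispatcher called by the ODB when the record changes; this is where the
   "resetting illegal values" code lives, and it may call db_close_record() */
void run_param_changed(INT hDB, INT hKey, void *info)
{
   /* validate run_param here ... */
}

INT open_run_param(void)
{
   cm_get_experiment_database(&hdb, NULL);
   db_find_key(hdb, 0, "/Experiment/Run Parameters", &hkey);
   return db_open_record(hdb, hkey, &run_param, sizeof(run_param),
                         MODE_READ, run_param_changed, NULL);
}

INT close_run_param(void)
{
   return db_close_record(hdb, hkey);         /* the call that never returns over RPC */
}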
I was using the same code in the frontend under 1.9.2 and
have only recently upgraded to 1.9.3-? tarball from PAA and
there were no problems using the 1.9.2 code: this is a 1.9.3
issue.
I have localized the weirdness to what I think is the RPC interface.
Running the nullfrontend (no camac access) on the same machine as
hosts the ODB I can make the problem appear and disappear in the
following way:
(odb is local on machine ``monet'')
nullfe -h monet -e acqmonad : db_close_record will get lost
nullfe -e acqmonad : db_close_record works as expected.
I've also tried the patch for the 256-byte ODB string bug since
many of the open records have strings of that length, but that isn't
it. The only substantial-looking change to mserver from 1.9.2 to 1.9.3
is the SIGPIPE ignore and that doesn't look like a good candidate either.
Can this be that some of the
#IFDEF LOCAL_ROUTINES
that got moved about in odb.c and others
are causing the remote call to get confused?
Clearly the answer is to just use stable and happy 1.9.2, but the
people for whom I am working now really want to use ROOT for
an analyzer...
cheers,
.p.
Paul Knowles. phone: 41 26 300 90 64
email: Paul.Knowles@unifr.ch Fax: 41 26 300 97 47
finger me at pexppc33.unifr.ch for more contact information |
18 Dec 2003, Paul Knowles, , Poll about default indent style
|
Hi Stefan,
> once and forever, I am considering using the "indent" program which comes
> with every linux installation. Running indent regularly on all our code
> ensures a consistent look.
I think this can be called a Good Thing.
> The "-kr" style does the standard K&R style,
> but used tabs (which is not good), and does a 4-column
> indention which is I think too much. So I would propose
> following flags:
> indent -kr -nut -i2 -di8 -bad <filename.c>
(some of this is a repeat from an earlier mail to SR):
You might also want a -l90 for a longer line length than 75
characters. K&R style with indentation from 5 to 8 spaces
is a good indicator of complexity: as soon as 40 characters
of code wind up unreadably squashed to the right of the
screen, you have to refactor to have fewer indentation
levels. This means you wind up rolling up the inner parts
of deeply nested conditionals or loops as separate
functions, making the whole code easier to understand.
I think that setting -i2 is ``going around the problem''
of deep nesting. If you really need to keep the indentation
tabs less than 4 (8 is ideal) because your code is falling off the
right edge of the screen, you are indented too deeply. Why do
I say that? There is the famous ``7+-1'' idea that you can hold
in your head only 7 ideas (give or take one) at any time. I'm not
that smart and I top out at about 5: so, for example, a conditional
in a loop in a conditional in a switch is about as deep a level
of nesting as I can easily understand (remember that I also have
to hold the line I'm working on as well): that's 4 levels, plus one for the
function itself and we are at 40 characters away from the right edge
of the screen using -i8 and have some 40 characters available for writing code
(how often is a line of code really longer than about 40 characters?).
On top of that, the indentation is easily seen so you know immediately
whether you are at the upper conditional or the inner conditional. A -i2
just doesn't make the difference big enough. -i5 is a happy balance
with enough visual clue as to the indentation level, but leaves you 50
to 60 characters for the code line itself.
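To put numbers on it, here is a made-up fragment hand-indented at the two widths (not literal
output of the indent program); at 8 columns per level the depth jumps out, at 2 it does not:

/* -i8: four levels of nesting consume 32 columns and are obvious */
void f8(int n, int *v)
{
        for (int i = 0; i < n; i++) {
                if (v[i] > 0) {
                        switch (v[i] % 2) {
                        case 0:
                                v[i] /= 2;
                                break;
                        }
                }
        }
}

/* -i2: the same code, four levels fit in 8 columns and blur together */
void f2(int n, int *v)
{
  for (int i = 0; i < n; i++) {
    if (v[i] > 0) {
      switch (v[i] % 2) {
      case 0:
        v[i] /= 2;
        break;
      }
    }
  }
}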
However, if you are indenting very deeply, then the poor reader can't hold
on to the context: there are more than 6 or 7 things to keep in mind.
In those cases, roll up the inner levels as a separate function and
call it that way. The inner complexity of the nested statements gets
nicely abstracted and then dumb people like me can understand what
you are doing.
So, in brief: indent is a good idea, and -in with n>=4 will be best.
I don't think -i2 will lend itself to making the code so much easier
to read.
thanks for listening.
.p. |
07 Aug 2019, Paolo Baesso, Bug Report, ROOTANA bug?
|
Hi,
I posted on the ROOTANA elog but there seems to be little activity there...
Could someone confirm if this is a bug?
https://midas.triumf.ca/elog/Rootana/14
Another user replied that they are encountering the same issue, so I think it is unlikely it is just our installation.
While ROOTANA is unusable for us, I tried to use the example Frontend and Analyzer (under the Experiment source folder). The analyzer does not seem to do much though. A root file is produced but nothing is placed into it. Is that normal?
Any help would be welcome. |
21 May 2024, Nikolay, Bug Report, experiment from midas/examples
|
There are 2 bugs in midas/examples/experiment:
1) In the frontend a bank named "PRDC" is created for the scaler event, but in the analyzer
module scaler.cxx a bank named "SCLR" is searched for in the same event (see the sketch after this list).
2) mana.cxx, which is linked from analyzer.cxx, reports: Invalid name "/Analyzer/Tests/Always
true/Rate [Hz]" passed to db_create_key: should not contain "[".
Looks like the ODB doesn't like the '[' and ']' characters.
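For bug 1) the fix is simply to use the same bank name on both sides; a sketch of the usual
bk_create()/bk_locate() pairing (both snippets assume #include "midas.h"; variable names and
the scaler count are placeholders):

/* frontend readout routine: create the scaler bank as "PRDC" */
INT read_scaler_event(char *pevent, INT off)
{
   DWORD *pdata;
   bk_init(pevent);
   bk_create(pevent, "PRDC", TID_DWORD, (void **)&pdata);
   for (int i = 0; i < 8; i++)
      *pdata++ = 0;                            /* placeholder scaler values */
   bk_close(pevent, pdata);
   return bk_size(pevent);
}

/* analyzer module scaler.cxx: look the bank up under the same name */
INT scaler_event(EVENT_HEADER *pheader, void *pevent)
{
   DWORD *psclr;
   int n = bk_locate(pevent, "PRDC", &psclr);  /* must match the name used in the frontend */
   if (n > 0) {
      /* ... accumulate the n scaler words ... */
   }
   return SUCCESS;
}
|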
02 May 2023, Niklaus Berger, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option
|
Thanks for all the helpful hints. When finally managing to evade all timeouts and attach the debugger at just the right moment, we find that we get a segfault in mserver at L827:
case RPC_DB_COPY_XML:
   status = db_copy_xml(CHNDLE(0), CHNDLE(1), CSTRING(2), CPINT(3), CBOOL(4));
Some printf debugging then pointed us to the fact that the culprit is the pointer de-referencing in CBOOL(4). This in turn can be traced back to mrpc.cxx L282 ff, where the line with the arrow was missing:
{RPC_DB_COPY_XML, "db_copy_xml",
{{TID_INT32, RPC_IN},
{TID_INT32, RPC_IN},
{TID_ARRAY, RPC_OUT | RPC_VARARRAY},
{TID_INT32, RPC_IN | RPC_OUT},
-> {TID_BOOL, RPC_IN},
{0}}},
If we put that in, the mserver process completes peacefully and we get a segfault in the client ("Wrong key type in XML file") which we will attempt to debug next. Shall I create a pull request for the additional RPC argument or will you just fix this on the fly? |
02 May 2023, Niklaus Berger, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option
|
And now we also fixed the client segfault, odb.cxx L8992 also needs to know about the header:
if (rpc_is_remote())
   return rpc_call(RPC_DB_COPY_XML, hDB, hKey, buffer, buffer_size, header);
(last argument was missing before). |
26 Jul 2019, Nik Berger, Bug Report, History/Endianness
|
Hi,
I have a bank of floats with slow control values that I store to the history and
ODB. When reading the history, both in the web browser and with mhist, the floats
get read with the wrong endianness; under /equipment/variables in the ODB they
however display correctly. The system is an Intel OpenSUSE Linux box. Any ideas?
Thanks
Nik |
06 Oct 2019, Nik Berger, Bug Report, History data size mismatch
|
Logging a list of variables to the history via links in the history ODB subtree,
we get messages as follows at every run start:
19:43:24.009 2019/10/06 [Logger,ERROR] [history_schema.cxx:2676:hs_write_event,ERROR] Event 'System' data size mismatch: expected 412 bytes, got 416 bytes
19:43:24.008 2019/10/06 [Logger,ERROR] [history_schema.cxx:2676:hs_write_event,ERROR] Event 'System' data size mismatch: expected 412 bytes, got 416 bytes
19:43:23.850 2019/10/06 [Logger,ERROR] [history_schema.cxx:455:hs_write_event,ERROR] Event 'System' data size mismatch count: 25, expected 412 bytes, hs_write_event() called with as much as 416 bytes
19:43:23.850 2019/10/06 [Logger,ERROR] [history_schema.cxx:455:hs_write_event,ERROR] Event 'System' data size mismatch count: 25, expected 412 bytes, hs_write_event() called with as much as 416 bytes
The history calculates the size of a record from the sizes of the individual variables (history_schema.cxx, L2666 ff), whereas the ODB delivers the data aligned/padded to the size of the largest value in the record.
In our history, a long list of doubles (64-bit) was followed by three floats (32-bit), leading to a padded response from the ODB, 4 bytes longer than the history expects.
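As a worked example (10 doubles chosen just for illustration, the real list is longer): the naive
sum is 10*8 + 3*4 = 92 bytes, which is what the history expects, while the padded C struct is
rounded up to the next multiple of 8, i.e. 96 bytes, which is what the ODB hands over:

#include <stdio.h>

struct record {
   double d[10];   /* 10 * 8 = 80 bytes */
   float  f[3];    /*  3 * 4 = 12 bytes */
};

int main(void)
{
   printf("sum of members: %zu\n", 10 * sizeof(double) + 3 * sizeof(float));  /* 92 */
   printf("sizeof(struct): %zu\n", sizeof(struct record));                    /* 96 */
   return 0;
}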
Quick fix: Add another 32 bit dummy variable to the history. Gets rid of the error messages...
Should probably be fixed at a deeper level... |
10 Oct 2019, Nik Berger, Bug Report, History data size mismatch
|
>I wonder why do you this via ODB links. The "standard" way of writing to the history should be to create events for an equipment and flag this equipment as being written to the
>history. All variables under /Equipment/<name>/Variables then automatically go into the history and you don't have to worry about ODB links. Only variables not fitting the
>equipment/variables scheme should be dealt with via ODB links, like variables under equipment/statistics or parameters in another ODB tree. In a typical midas experiment, only
>very few variables typically go into the 'System' event. This is however probably not a solution to your problem. If you have a similar structure (doubles plus an odd number of floats)
>under 'variables', you might get the same error.
>
> > In our history, a long list of doubles (64-bit) was followed by three floats (32-bit)
>
We do this in the MuX DAQ and mix things that come directly from MIDAS (the MIDAS trigger rate) and things from the
analyzer (rates in the self-triggering detectors) and some temperatures from yet another source. Yes, we could have
kept that apart, yes, in this case a double would also work (and not break things), but a bug is a bug...
I could think of sensible use cases where doubles and ints are mixed and I also know quite a few areas where it makes
sense to use floats...
Nik |
28 Aug 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indicies
|
Hello experts,
I have been writing a SC frontend for a powersupply. I have used the model
where the frontend can be started with "-i n" option so that each fe can
control a different supply. During the development/testing of the program I
would normally only run a single instance with "-i 1". However when I started
a second instance with "-i 2" I found problems with the history plots that
were being made for the original "-i 1" instance. The variable being plotted
seemed to randomly jump between the value from the "-i 1" instance and
the "-i 2" instance. confirmed that the "correct" values exist for each
frontend in the odb under /Equipment/Foo01/Variables and
/Equipment/Foo02/Variables
This is also not just a plotting artifact since I was also
able to see the two different values by running mhist.
I saw this behaviour using midas-2019-03 and also the head of the development
branch (686e4de2b55023b0d1936c60bcf4767c5e6caac0 from just under 48 hours ago).
I was able to reproduce this with a stripped down frontend that just
sets a variable that is equal to its frontend_index. Please find the code
and Makefile attached. Presumably I've done something wrong in my
implementation that hopefully a more experienced person can spot quite
quickly, but please let me know if any more information is needed.
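In case it helps, the stripped-down frontend follows the standard indexed-equipment pattern,
roughly like this (a sketch from memory, the attached code is authoritative):

/* readout routine: publish the frontend index as the single variable "Data" */
INT read_dummy(char *pevent, INT off)
{
   double *pdata;
   bk_init(pevent);
   bk_create(pevent, "DATA", TID_DOUBLE, (void **)&pdata);
   *pdata++ = get_frontend_index();   /* the "-i n" value: FeDummy01 writes 1, FeDummy02 writes 2 */
   bk_close(pevent, pdata);
   return bk_size(pevent);
}

EQUIPMENT equipment[] = {
   {"FeDummy%02d",                    /* "%02d" is replaced by the "-i" index */
    {3, 0, "SYSTEM", EQ_PERIODIC, 0, "MIDAS", TRUE,
     RO_ALWAYS | RO_ODB, 1000,        /* read every second, also between runs */
     0, 0, 1,                         /* event limit, subevents, log history */
     "", "", ""},
    read_dummy,
   },
   {""}
};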
I have seen this behaviour on both Debian 10 and on a CentOS 7 Singularity
image running on top of Debian 10.
Thanks,
Nick.
P.S. I made the topic of this post "Forum" and not "Bug Report" since I
expect the root of this problem is somewhere between the keyboard and chair. |
28 Aug 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indicies
|
Hi Stefan,
thanks for you quick reply.
> My first question would be why are you using several font-ends at all?
Because I was following the model used for many of the frontends for the ND280 FGD.
> That makes things more
> complicated than needed. In the normal FE framework, you can define either several equipment
> served by one frontend, or even one equipment linked to several devices. In the MEG experiment
> we have one slow control frontend controlling ~100 devices without problem. In the old days there
> was a problem that some slow devices could throttle the readout, but since the invention of multi-
> threaded slow control equipment, each device gets its own thread so they don't block each other.
Perhaps things have changed in the 10 years since the FGD SC code was written. I can do it
differently but doing it that way seemed natural since around 90% of the frontend code that I
have seen does it that way.
Nick. |
29 Aug 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indicies
|
Hi LCP,
thanks for the suggestion and link. Unfortunately I don't think this explains it.
Nick. |
29 Aug 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indicies
|
Hi Ben,
thanks for your reply. I can confirm that your suggested workaround does indeed
make the problem disappear.
I guess this issue hasn't been seen at T2K since we use MYSQL for the history.
Thanks,
Nick. |
16 Sep 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indicies
|
Hi Konstantin,
thanks for your reply.
> > thanks for your reply. I can confirm that your suggested workaround does indeed
> > make the problem dissapear.
> > I guess this issue hasn't been seen at T2K since we use MYSQL for the history.
>
> I think you found the source of the problem, confused event id assignments. To confirm,
> can you email me (or post here) the output of odbedit "ls -l /History/Events".
Sorry, do you want this from after I've applied the fix suggested by Ben, or with the original code
that I posted?
With the original code it only shows one frontend even though both are running:
[local:e666:S]History>ls -l /History/Events
Key name Type #Val Size Last Opn Mode Value
---------------------------------------------------------------------------
1 STRING 1 10 2m 0 RWD FeDummy02
0 STRING 1 16 2m 0 RWD Run transitions
[local:e666:S]History> scl
Name Host
mhttpd localhost
fedummy01 localhost
fedummy02 localhost
ODBEdit localhost
Logger localhost
[local:e666:S]History>ls "/History/Display/Default/Dummy/
Timescale 1h
Zero ylow n
Show run markers y
Show values y
Sort Vars n
Log axis n
Minimum 0
Maximum 0
Variables
FeDummy01:Data
FeDummy02:Data
Label
Colour
#00AAFF
#FF9000
Factor
0
0
Offset
0
0
Buttons
10m
1h
3h
12h
24h
3d
7d
Formula
Show old vars n
> If that's the problem, you can avoid it completely by switching to a history storage method
> that does not rely on magic mapping between equipment names and numeric event id's:
> try the "FILE" method (set odb /Logger/History/FILE/Active to "y", restart the logger) or
> the "MYSQL" method (you will need to setup a mysql database). You tell mhttpd and mhist which
> history data to read by setting ODB /History/LoggerHistoryChannel to one of the channel names
> from /logger/history/, restart mhttpd. (mhttpd and mhist used to print a message "reading history
> data from channel XXX", but somebody removed this message).
Using the original code I posted and switching from MIDAS history to FILE history did not seem to
change the random behaviour in the history plots.
Regards,
Nick. |
18 Sep 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indicies
|
Hi Konstantin,
> > [local:e666:S]History>ls -l /History/Events
> > Key name Type #Val Size Last Opn Mode Value
> > ---------------------------------------------------------------------------
> > 1 STRING 1 10 2m 0 RWD FeDummy02
> > 0 STRING 1 16 2m 0 RWD Run transitions
>
> Something is very broken. There should be more entries here, at least
> there should be entries for "FeDummy01" and usually there is also an entry
> for "FeDummy" because one invariably runs fedummy without "-i" at least once.
This is a fresh experiment that I started just to test this issue; that is why there are not many
entries in /History/Events. I agree though that we should expect to see a FeDummy01 entry.
> The fact that changing from "midas" storage to "file" storage makes no difference
> also indicates that something is very broken.
>
> I want to debug this.
>
> Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
> with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat" files
> from the "file" storage.
When I started this experiment yesterday(?) I disabled the Midas history when I enabled the file
history. Just now I re-enabled the Midas history, so they are currently both active.
% ls -l work/online/{*.hst,mhf*.dat}
-rw-r--r-- 1 hastings hastings 14996 Sep 17 10:21 work/online/190917.hst
-rw-r--r-- 1 hastings hastings 3292 Sep 18 16:29 work/online/190918.hst
-rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy01.dat
-rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy02.dat
-rw-r--r-- 1 hastings hastings 166 Sep 17 10:17 work/online/mhf_1568683062_20190917_run_transitions.dat
And again, just as a sanity check:
% odbedit -c 'ls -l /History/Events'
Key name Type #Val Size Last Opn Mode Value
---------------------------------------------------------------------------
1 STRING 1 10 1m 0 RWD FeDummy02
0 STRING 1 16 1m 0 RWD Run transitions
Regards,
Nick. |