ID | Date | Author | Topic | Subject
1812 | 07 Feb 2020 | Pintaudi Giorgio | Info | Force triggering of idle routine of a frontend
Dear Stefan,
Thank you for the advice. I will try to modify the driver as you say. As for the dynamic change of the readout rate, you are basically telling me that it is not achievable without dirty hacks like mine, and that it is better to find a way to avoid it.
Best regards
Giorgio
Stefan Ritt wrote: | Dear Giorgio,
ok, now I'm slowly getting your point.
Dynamically changing the slow control readout rate is possible with your modification, but I consider this bad practice.
You mentioned the case of your HV over a quirky serial line. I had the same problem some years ago. Rather than reducing the readout rate to reduce the number of errors, I modified my device driver. If the connection is broken, the driver silently tries to reconnect. Only if the reconnect fails for more than a given period (like 1 min) is an error produced. Otherwise the driver reads as fast as possible. Imagine you have some instability in your HV which only lasts for a few seconds. If you read only once per minute, you might miss it. We worked hard to make the slow control system multi-threaded, so a slow, many-times-retrying-to-reconnect driver does not slow down any other equipment. On the other hand, if the reconnect fails for a minute, then you know that your HV unit really has a problem that the shifter should follow up on.
Best,
Stefan |
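A minimal sketch of the reconnect-with-grace-period strategy described in the quote above (not the actual driver code; hv_read_channel() and hv_reconnect() are hypothetical helpers standing in for the real device access):

   /* On a read error, keep retrying the connection silently and only
      report a problem once the link has been down for about a minute. */
   static DWORD first_failure = 0;        /* ss_millitime() of first failed read */

   int read_with_reconnect(int channel, float *value)
   {
      if (hv_read_channel(channel, value) == FE_SUCCESS) {
         first_failure = 0;               /* link is healthy again */
         return FE_SUCCESS;
      }

      if (first_failure == 0)
         first_failure = ss_millitime();

      hv_reconnect();                     /* silent retry, no message yet */

      if (ss_millitime() - first_failure > 60000)     /* down for > 1 min */
         cm_msg(MERROR, "read_with_reconnect",
                "HV connection lost for more than one minute");

      return FE_ERR_HW;
   }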
1813 | 09 Feb 2020 | Stefan Ritt | Info | Force triggering of idle routine of a frontend
Your dirty hacks will probably work, but what you REALLY want is to always read out your HV as fast as possible, not only during run transitions or ramping. We had a case where a detector produced electrostatic discharges which only lasted for a second or so, and we were happy to detect this in spikes in the HV current. With measurements of only one per minute we would not have realized that so quickly.
Stefan
Pintaudi Giorgio wrote: | Dear Stefan,
Thank you for the advice. I will try to modify the driver as you say. As for the dynamic change of the readout rate, you are basically telling me that it is not achievable without dirty hacks like mine, and that it is better to find a way to avoid it.
|
1816 | 10 Feb 2020 | Konstantin Olchanski | Info | Force triggering of idle routine of a frontend
> We had a case where a detector produced electrostatic discharges which only lasted for a second or so
> and we were happy to detect this in spikes in the HV current. With measurements of only one per minute
> we would not have realized that so quickly.
For the T2K/ND280 TPC we implemented something similar. The TPC uses MicroMegas detectors, which spark during normal operation. We asked Wiener/ISEG to implement a "spark counting mode" for us (and they did). In this mode, a high voltage over-current (a MicroMegas spark) sets a special flag instead of tripping the high voltage. Our MIDAS frontend reads this flag at a rate of about once per minute; if the flag is set, it clears it, increments the software spark counter and reads the flag again. If the flag is still set (i.e. it failed to clear), this was not a normal spark but a high voltage breakdown, and the offending channel is shut down. I believe this mode is still part of the normal ISEG firmware.
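A minimal sketch of that spark-counting readout logic (not the actual ND280 code; iseg_read_spark_flag(), iseg_clear_spark_flag() and iseg_channel_off() are hypothetical helpers standing in for the SNMP accesses):

   static int spark_counter[64];          /* software spark counter per channel */

   void poll_spark_flag(int channel)
   {
      if (!iseg_read_spark_flag(channel))
         return;                          /* no spark since the last poll */

      iseg_clear_spark_flag(channel);
      spark_counter[channel]++;           /* normal spark: just count it */

      if (iseg_read_spark_flag(channel)) {
         /* flag could not be cleared: persistent over-current, treat it
            as a high voltage breakdown and shut the channel down */
         iseg_channel_off(channel);
         cm_msg(MERROR, "poll_spark_flag",
                "HV breakdown on channel %d, channel switched off", channel);
      }
   }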
Because the Wiener/ISEG interface uses SNMP to "read all data in one operation", the MIDAS "device driver" structure was not useful. The readout was a simple loop, the readout frequency was easy to control, and indeed we read the high voltage at an increased frequency during ramping. This was easy to implement because we did not have to fight the MIDAS "device driver" framework.
If you want a similar solution (talk to the device, interpret the data, record values to ODB and history, generate MIDAS events, all without hand-holding from, or arm-wrestling with, the rest of MIDAS), I recommend the new tmfe.h/tmfe.cxx C++ frontend; see the two examples in midas/progs/fetest_tmfe.cxx and fetest_tmfe_thread.cxx (single-threaded and multi-threaded).
K.O. |
1821 | 12 Feb 2020 | Stefan Ritt | Info | Force triggering of idle routine of a frontend
I had another look at the issue. If you set the event limit to zero in the EQUIPMENT list, then the idle() routine of your class driver is called as often as possible, typically at 100 Hz. It is then up to you what to do in the class driver. The hv_idle() routine of the HV class driver shipped with the distribution, for example, reads a channel more often if it has been changed recently. Look at these lines:
   /* additionally read channel recently updated if not multithreaded */
   if (!(hv_info->driver[hv_info->last_channel]->flags & DF_MULTITHREAD)) {
      act_time = ss_millitime();

      act = (hv_info->last_channel_updated + 1) % hv_info->num_channels;
      while (!(act_time - hv_info->last_change[act] < 10000)) {
         act = (act + 1) % hv_info->num_channels;
         if (act == hv_info->last_channel_updated) {
            /* none found, so return */
            return status;
         }
      }

      /* updated channel found, so read it additionally */
      status = hv_read(pequipment, act);
      hv_info->last_channel_updated = act;
   }
You can do similar things there, for example while you are ramping.
At the end of a run, the class driver's cd_xx_read() routine is called by the framework, which in turn sends a full MIDAS event down the stream, but it takes the current slow control values from its local cache, not from the actual device (otherwise stopping a run could be very slow). So if you want all values at the end of the run with good precision, you have to read them DURING the run as fast as possible. That is why I posted my comment about fixing dropped serial connections automatically and reading as fast as possible.
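For example, a hypothetical extension of hv_idle() along those lines could read every ramping channel on each idle call instead of waiting for the normal round-robin. This is only a sketch: the "demand" and "measured" field names are assumptions modelled on the HV class driver, so check the HV_INFO structure in your copy of it (fabs() comes from <math.h>):

   /* additionally read all channels that are still ramping */
   int i;
   for (i = 0; i < hv_info->num_channels; i++)
      if (fabs(hv_info->demand[i] - hv_info->measured[i]) > 1.0) {
         status = hv_read(pequipment, i);    /* channel is ramping: read it now */
         hv_info->last_channel_updated = i;
      }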
Stefan
Pintaudi Giorgio wrote: | Dear Stefan,
Thank you for the advice. I will try to modify the driver as you say. As for the dynamic change of the readout rate, you are basically telling me that it is not achievable without dirty hacks like mine, and that it is better to find a way to avoid it.
|
885 | 10 May 2013 | Konstantin Olchanski | Bug Fix | Fixed: crash if alarm "write elog message" is enabled
If the MIDAS alarm property "write elog message" is enabled, an uninitialized variable "tag" is passed to el_submit() and, depending on your luck, causes a crash. "tag" is supposed to be, and now is, a NUL-terminated string. The only other uses of el_submit() are in mhttpd.cxx and mserver.c, where it is called correctly.
alarm.c svn rev 5361
K.O. |
948 | 15 Jan 2014 | Konstantin Olchanski | Bug Fix | Fixed spurious symlinks to midas.log
In some experiments (e.g. DEAP), we see spurious symlinks to midas.log scattered just about everywhere. I have now traced this to an uninitialized variable in cm_msg_log() and it should be fixed now. K.O. |
118 | 30 Oct 2003 | Stefan Ritt | | Fixed several potential problems for ODB corruption
I just realized that db_set_value, db_set_data, db_set_num_values and db_merge_data do not check for num_values == 0. With such a parameter the ODB can become corrupted, since zero-length ODB entries are not allowed. I fixed the corresponding places in odb.c and committed the changes. Everyone with ODB corruption problems should update that code. |
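A caller-side illustration of the same point (a minimal sketch; hDB, hKey, data and num_values are assumed to exist in the calling code): never pass num_values == 0 when writing to the ODB, since zero-length entries are invalid and older versions of odb.c did not reject them.

   if (num_values > 0)
      status = db_set_data(hDB, hKey, data, num_values * sizeof(INT),
                           num_values, TID_INT);
   else
      cm_msg(MERROR, "write_settings", "refusing to write a zero-length array");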
257 | 18 May 2006 | Stefan Ritt | Bug Fix | Fixed problems with reload of custom pages
We had a problem with custom pages and reloading them. If they contain an editable ODB field, one can change the ODB value through the custom page. The URL then contains a "?cmd=Set&value=x&index=x" section, which stays in the browser's address bar after the ODB value has been updated. If the value later changes in the ODB by some other means and one presses "reload" in the browser, the above URL gets executed again and the value gets changed back, which is not wanted.
The problem has been fixed such that mhttpd redirects the browser, after setting a variable, to the URL without the "Set" command from above. |
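A generic illustration of that redirect-after-set idea (a minimal sketch, not mhttpd's actual code): after applying the "?cmd=Set&value=...&index=..." request, answer with an HTTP redirect to the bare page URL, so that a browser reload re-displays the page instead of re-applying the old value.

   #include <stdio.h>

   /* reply with a 302 redirect to the page URL without the Set parameters */
   void reply_redirect(FILE *out, const char *page_url)
   {
      fprintf(out,
              "HTTP/1.1 302 Found\r\n"
              "Location: %s\r\n"
              "Content-Length: 0\r\n\r\n", page_url);
   }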
573 | 07 May 2009 | Konstantin Olchanski | Bug Fix | Fixed mlogger run start and stop
Fixed problems with mlogger starting and stopping runs.
The basic difficulty was that the mlogger used ASYNC transitions, which did not implement proper transition sequencing according to the transition sequence numbers: basically all clients were called at the same time, regardless of how long they took to process the transition.
Switching from ASYNC to SYNC transitions introduces a deadlock between the mlogger (which does not read data from the SYSTEM buffer while inside cm_transition) and any program trying to write into the SYSTEM buffer (the buffer is full and the writer does not listen for transition requests while waiting for the mlogger, which in turn is trying to call the writer's transition handler).
Then we invented the mtransition helper program. In the original implementation for T2K it was spawned directly from the mlogger to stop the run (avoiding the deadlock). Then cm_transition(DETACHED) was introduced, but the mlogger start/stop/restart run logic became broken. One problem was that when the auto restart delay is zero, mtransition tries to restart the run before the previous run is stopped (instead, mlogger should restart the run from its tr_stop() handler). Another problem was the auto restart delay counting from the time when we start stopping the run: because stopping the run can take an unpredictable time, depending on what the various frontends have to do, it is impossible to have a predictable delay between runs (again, this is fixed by restarting the run from mlogger.c::tr_stop()).
All this has been straightened out by svn revision 4484. Basically, the old run stop/restart logic was restored in mlogger.c, using cm_transition(DETACH) to avoid the deadlocks.
To remind all, these are the present controls for transitions initiated by mlogger (a small example of setting them follows the list):
/experiment/transition debug flag - set to "2" to capture transition sequences in midas.log
/experiment/transition timeout and transition connect timeout - one can change the default timeouts as needed to accommodate non-cooperative frontends.
/logger/async transitions - do not use mtransition; do ASYNC transitions, as before.
/logger/auto restart delay - delay between stopping the run (mlogger.c::tr_stop) and starting the next run.
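For example, a client could change these settings programmatically as sketched below (key names as listed above; the exact ODB key types may differ in your experiment, so treat this as an illustration only):

   HNDLE hDB;
   INT   debug = 2, delay = 10;

   cm_get_experiment_database(&hDB, NULL);
   db_set_value(hDB, 0, "/Experiment/Transition debug flag", &debug,
                sizeof(debug), 1, TID_INT);
   db_set_value(hDB, 0, "/Logger/Auto restart delay", &delay,
                sizeof(delay), 1, TID_INT);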
svn rev 4484
K.O. |
1319 | 02 Nov 2017 | Konstantin Olchanski | Bug Fix | Fixed mlogger memory corruption, updated mxml
In the agdaq system I see memory corruption in the mlogger. There were at least two bugs: one memory allocation error in mxml and one incorrect memset() in mlogger.cxx. The mxml bug is fixed in the mxml repository, the mlogger.cxx bug is fixed in the midas-2017-10 branch.
I suggest that everybody update mxml to the latest version (without waiting for the new midas release):
https://bitbucket.org/tmidas/mxml/commits/branch/master
K.O. |
535 | 27 Nov 2008 | Konstantin Olchanski | Info | Fixed mlogger crash, was Per-variable history implementation in the mlogger
> revision 4142+4143 are minor fixes, refactoring (switch the code to use helper
> functions) and implementation of history for structured banks
The implementation of "history for structured banks" had a bug: tags inside structured banks were counted incorrectly, leading to memory overwrites and an mlogger crash in open_history().
This problem is now fixed (plus added assert() checks to crash out if an overwrite of the tags[] array is detected).
svn revision 4398.
K.O. |
259 | 25 May 2006 | Stefan Ritt | Bug Fix | Fixed compiler warnings with gcc 3.4.4
I fixed a couple of compiler warnings which came up with the new gcc 3.4.4. Seems like the compiler gets more and more picky. There are still warnings left in ybos.c and in mcnaf.c, which I leave to the original author |
260 | 25 May 2006 | Pierre-Andre Amaudruz | Bug Fix | Fixed compiler warnings with gcc 3.4.4
Stefan Ritt wrote: | I fixed a couple of compiler warnings which came up with the new gcc 3.4.4. Seems like the compiler gets more and more picky. There are still warnings left in ybos.c and in mcnaf.c, which I leave to the original author |
Pierre-A. Amaudruz wrote: | ybos.c, cnaf_callback.c, mcnaf.c and mana.c have been corrected too. |
722 | 23 Sep 2010 | Konstantin Olchanski | Info | Fixed ODB corruption by javascript ODBGet(nonexistent)
Prior to odb.c rev 4829 and mhttpd.c rev 4830, committed a few minutes ago, the HTML javascript call ODBGet("/non_existant_odb_entry") caused ODB corruption requiring an ODB reload from a backup file.
It turns out that ODBGet() tries to create ODB entries if they do not already exist, but because ODBGet() was called without the "type", "length", etc. arguments, the mhttpd "jset" command was issued with "type" set to zero. This resulted in a db_create_key() call with "type" set to zero, which created an invalid ODB entry.
odb.c rev 4829 adds a check for "type<=0" (a check for "type>=TID_LAST" was already there).
In addition, mhttpd.c rev 4830 adds a "jset" check for type==0.
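The check itself is simple; a minimal sketch of the kind of guard described (the actual code is in odb.c rev 4829 and may differ in detail):

   /* reject invalid types before creating the key */
   if (type <= 0 || type >= TID_LAST)
      return DB_INVALID_PARAM;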
K.O. |
282 | 31 Jul 2006 | Konstantin Olchanski | Bug Fix | Fix user memory corruption in ODB
We have been seeing consistent user memory corruption while setting up a new
experiment. This has been traced to a user memory overwrite in ODB db_set_data()
function and this problem is now fixed. This error was triggered by our frontend
code constantly changing the size of a MIDAS data bank that was also written
into ODB via the RO_ODB option. K.O. |
532 | 27 Nov 2008 | Konstantin Olchanski | Bug Fix | Fix ss_file_size() on 32-bit Linux
It turns out that on 32-bit Linux, ss_file_size() returns the wrong answer for files bigger than 2 GB (4 GB?). The Linux stat() system call returns an error (which is ignored) and bogus file size data (which is returned to the caller).
On 64-bit Linux (compiled with -m64), stat() appears to return correct data.
Related functions ss_disk_size() and ss_disk_free() return correct answers on
both 32-bit and 64-bit Linux (biggest disk I tried was 5.5 TB).
I now fixed this problem by using the stat64() system call for "#ifdef OS_LINUX".
I also changed ss_file_size(), ss_disk_size() and ss_disk_free() to return -1 if
the system call returns an error. I also added a test program
utils/test_ss_file_size.c.
svn revision 4397.
K.O. |
536 | 01 Dec 2008 | Stefan Ritt | Bug Fix | Fix ss_file_size() on 32-bit Linux
> I also changed ss_file_size(), ss_disk_size() and ss_disk_free() to return -1 if
> the system call returns an error. I also added a test program
> utils/test_ss_file_size.c.
Under 64-bit SL5, the test program gave:
For [(null)], file size: -1, disk size: -0.001, disk free -0.001
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `/bin/ls -ld (null)'
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `/bin/df -k (null)'
Anyhow, I guess that this test program just accidentally slipped into the repository. Test programs for the developers should not be in the repository, since they are not of much use to the average user. If I had added every test I made as an individual test program, we would by now have tons of test programs making the whole distribution pretty bulky, and nobody would know how to use them now. So I removed the test program again. If people do not agree, I suggest making a central "main" test program which combines all tests. I know there are also some C structure alignment tests etc., which could then all be combined into a single, well-documented test program. |
538 | 02 Dec 2008 | Stefan Ritt | Bug Fix | Fix ss_file_size() on 32-bit Linux
> I now fixed this problem by using the stat64() system call for "#ifdef OS_LINUX".
That does not work if _LARGEFILE64_SOURCE is not defined. In that case, the compiler complains that stat64 is undefined. Since many Makefiles for frontends out there do not have _LARGEFILE64_SOURCE defined, I changed system.c so that stat64 is only used if that flag is defined:
#ifdef _LARGEFILE64_SOURE
   struct stat64 stat_buf;
   int status;

   /* allocate buffer with file size */
   status = stat64(path, &stat_buf);
   if (status != 0)
      return -1;

   return (double) stat_buf.st_size;
#else
... |
539 | 02 Dec 2008 | Konstantin Olchanski | Bug Fix | Fix ss_file_size() on 32-bit Linux
> > I now fixed this problem by using the stat64() system call for "#ifdef OS_LINUX".
> That does not work if _LARGEFILE64_SOURCE is not defined.
> #ifdef _LARGEFILE64_SOURE
> struct stat64 stat_buf;
This does not work (observe the typo in the #ifdef). But you cannot know this because you already deleted the test program I wrote and committed to svn exactly to detect and prevent this kind of breakage (plus to give the Solaris, BSD and other weirdo users some way to check that ss_file_size() works on their systems).
K.O. |
540 | 02 Dec 2008 | Stefan Ritt | Bug Fix | Fix ss_file_size() on 32-bit Linux
K.O. wrote: | This does not work (observe the typo in the #ifdef). |
Sorry for that, I fixed and committed it.
K.O. wrote: | But you cannot know this because you already deleted the test program I wrote and committed to svn exactly to detect and prevent this kind of breakage (plus to give the Solaris, BSD and other weirdo users some way to check that ss_file_size() works on their systems). |
Well, you figured it out even without the test program in the distribution! But I'm sure no other user would have known how to use your test program to diagnose this problem, so 99% of the users would scratch their heads about this undocumented program and get confused. I believe we two are responsible for making sure the midas kernel functions work correctly, and the average user should not have to bother with them. I agree that it's handy for you to have this little test program in the distribution, so you can run it everywhere you install midas. But for me it would be handy to have files with, let's say, nature's constants, particle decay lifetimes, a list of ASCII codes, and so on. But it would clutter up the distribution, and the disadvantage of annoying users would be bigger than my personal benefit, so I don't do it.
If you absolutely want to keep a certain test functionality, you can add it to a "central" test program, write some help and documentation for it, and educate users how to use it and how to report any errors back to you. Maybe some printout like "all tests ok", plus a specific comment if a test fails, would be helpful for the normal user. This test program could then also contain other tests like C structure alignment (which sometimes is a problem), some mutex tests, and whatever else we collect along the road. An alternative would be to add this as a "test" command inside odbedit. |