ID | Date | Author | Topic | Subject
453 | 07 Mar 2008 | Stefan Ritt | Bug Report | array overflows and other bugs |
> I have just compiled MIDAS svn 4132 on a fresh SuSE 10.3 x86_64 system and gcc
> found a bunch of bugs, I guess.
Ahh, great! gcc is getting more and more clever. Each time gcc is updated, it finds
a few new issues.
Indeed some are real bugs, and I will work down the list as time permits. I see
however no immediate threat (you are not using fragmented events, a transition 12
never occurs, etc.). Issue #4 from your list has to be checked by Pierre-Andre. |
462 | 10 Mar 2008 | Stefan Ritt | Bug Report | array overflows and other bugs |
There were some trivial and some non-trivial issues. Glad the compiler picked up on
this!
> I see loads of warnings during compile, most of which I know from earlier
> compiles:
> * warning: dereferencing type-punned pointer will break strict-aliasing rules
> * warning: pointer targets in passing argument 3 of 'getsockname' differ in
> signedness
I ignore these for the moment until I have gcc 4.2 myself (we use Scientific
Linux 5, which has gcc 4.1 for the moment). As Randolf correctly pointed out, you
can silence gcc with the proper flag there. The warnings have no influence on the
stability of midas.
> (1)=========================
> src/midas.c:7398: warning: array subscript is above array bounds
> Inspection of midas.c:
>
> if (i == MAX_DEFRAG_EVENTS) {
> /* no buffer available -> no first fragment received */
> 7398: free(defrag_buffer[i].pevent);
> memset(&defrag_buffer[i].event_id, 0, sizeof(EVENT_DEFRAG_BUFFER));
> cm_msg(MERROR, "bm_defragement_event",
> "Received fragment without first fragment (ID %d) Ser#:%d",
> pevent->event_id & 0x0FFF, pevent->serial_number);
> return;
> }
The free() was simply wrong at that place; I removed it.
> (2)==========================
> src/midas.c:2958: warning: array subscript is above array bounds
>
> for (i = 0; i < 13; i++)
> 2958 if (trans_name[i].transition == transition)
> break;
Fixed that by
   for (i = 0; ; i++)
      if (trans_name[i].name[0] == 0 || trans_name[i].transition == transition)
         break;
since trans_name[i].name == "" indicates the end of the list.
> (3)=============================
> mfe.c:
> src/mfe.c:412: warning: array subscript is above array bounds
> src/mfe.c:311: warning: array subscript is above array bounds
> src/mfe.c:340: warning: array subscript is above array bounds
>
> 412: device_drv->dd(CMD_GET_DEMAND, device_drv->dd_info, i,
> &device_drv->mt_buffer->channel[i].array[CMD_GET_DEMAND]);
The code at 412 was wrong there; the demand value is queried later by the device
driver directly. For the other two occurrences (311 and 340) I really had to
increase the array size by one. This issue can cause segfaults if you have a slow
control front-end which uses multithreading (not many people use it except me).
> (4)=========================
> src/lazylogger.c:1957: warning: array subscript is below array bounds
>
> if ((channel < 0) && (lazyinfo[channel].hKey != 0))
>
> That is lazyinfo[something below zero].
This has to be fixed by Pierre. I guess an or instead of an and would do it, but
I'm not 100% sure.
> (5)=============================
> More warnings an expert might want to have a look at:
>
> * warning: deprecated conversion from string constant to 'char*'
>
> * src/fal.c:106: warning: non-local variable '<anonymous struct> out_info'
> uses anonymous type
> * src/fal.c:3064: warning: non-local variable '<anonymous struct> eb' uses
> anonymous type
>
> I attach the full output of make.
> Could someone knowledgeable please have a look at these warnings and fix them?
Uahhh. Especially the "const char*" vs. "char*" is in principle right, but will
cause a major rework. Probably hundreds of occurrences have to be fixed. Many strings
must be declared const, others not. It will help the programmer to find some errors
at compile time which would otherwise show up only at runtime (like writing into a
fixed string), but I will only go through that when I have gcc 4.2 installed
myself, and have two free days to work on this ;-)
> They make me a bit nervous when thinking about data integrity, and
> there are now so many that they actually start to hide serious stuff
> like the ones I presented.
Except for the slow control stuff (which is only an issue for multithreaded
frontends), none of the above things will have an influence on the data integrity.
But I agree that they should be fixed.
- Stefan |
377 | 22 May 2007 | Randolf Pohl | Bug Report | analyzer_init called by odb_load |
Hi,
I wonder why mana.c:odb_load() calls analyzer_init(). This way analyzer_init
is called TWICE or more times:
first from mana.c:mana_init(), for each invocation of the analyzer, and
second from mana.c:odb_load(), for each run to be analyzed.
Isn't this a bug? It can mess up several things (like mallocs) if you don't
take the necessary precautions. Other module_init functions are correctly
called only once, before all runs are analyzed.
I have the feeling that odb_load should NOT call analyzer_init. Or am I wrong
(probably, but please explain to me)? Do I have to live with it and make sure
that my beautiful global initialization in analyzer_init is only done once? :-)
Cheers,
Randolf
And here is the annotated log using the ROOT example experiment
(several modules changed/added to print their respective names)
:~/midas/examples/root> ./analyzer -e exa_root -i run%05d.mid -r 1 3
analyzer_init <-- ok
Root server listening on port 9090...
adc_calib_init <-- ok
adc_summing_init <-- ok
scaler_init <-- ok
Running analyzer offline. Stop with "!"
Set run number 1 in ODB
Load ODB from run 1...
analyzer_init <-- not ok, or is it?
OK
run00001.mid:777 events, 0.00s
Set run number 2 in ODB
Load ODB from run 2...
analyzer_init <-- not ok, or is it?
OK
run00002.mid:7227 events, 0.03s
Set run number 3 in ODB
Load ODB from run 3...
analyzer_init <-- not ok, or is it?
OK
run00003.mid:13866 events, 0.06s
adc_calib_exit
adc_summing_exit
scaler_exit
analyzer_exit |
378 | 22 May 2007 | Stefan Ritt | Bug Report | analyzer_init called by odb_load |
The reason to call analyzer_init in odb_load is the following:
Assume you run the analyzer offline, analyzing many files in series. Then assume
that you have /Experiment/Run Parameters, which is actively used by the analyzer
(like beam settings etc.). In this case you do a db_open_record() to map
/Experiment/Run Parameters to the exp_param C structure. For this mapping to work,
the ODB structure and the C structure have to be exactly the same. Now assume that
you changed your run parameters over time, like you added some comment later. Now
you want to analyze several runs, some before and some after the modification.
Both sets have a different structure in /Experiment/Run Parameters, which is a
problem, since the compiled analyzer can only have a single C structure. My "poor"
solution was to call analyzer_init after each loading of the ODB from the *.mid
file. The db_create_record() call matches the C structure to the ODB structure by
modifying the ODB structure if necessary. So if you added one parameter later, this
(modified) structure gets loaded by odb_load, but then it gets adjusted in
analyzer_init().
I understand now that this case might not happen so often, and you are more
bothered by the fact that analyzer_init gets called several times. There must
however be a hook for offline analysis so that the user code can correct the ODB
structure. So I propose to add a flag to analyzer_init, such as
INT analyzer_init(BOOL bFirst)
{
}
If bFirst equals TRUE, the function got called from mana_init(); if FALSE, it got
called from odb_load. Then you can put code like
INT analyzer_init(BOOL bFirst)
{
   if (bFirst) {
      p = malloc(...);
      ...
   }
}
If you agree, I will modify the code and commit the change.
- Stefan |
379 | 22 May 2007 | Randolf Pohl | Bug Report | analyzer_init called by odb_load |
Thanks for the quick reply, Stefan.
Please don't change anything in the code unless you find it really important. I guess
changing the analyzer_init prototype will break a lot of code out there?
In fact, I think I do understand this behavior now.
And even without your suggested fix there is a simple workaround: I add a static
variable to my analyzer_init.cxx file, and do something similar to your bFirst fix.
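For illustration, a minimal sketch of that static-variable guard (the variable name
is arbitrary and the sketch is untested):
   INT analyzer_init()
   {
      static BOOL first_call = TRUE;
      if (first_call) {
         first_call = FALSE;
         /* expensive one-time setup (mallocs etc.) goes here, so a second
            call from odb_load() does no harm */
      }
      return SUCCESS;
   }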
In conclusion, commit your fix if it does not harm others. Postpone this commit to a
future new version of midas which breaks a lot of things anyway...
A last question, for me to understand: Why not call db_open_record in
ana_begin_of_run then?
Cheers,
Randolf |
380 | 22 May 2007 | Stefan Ritt | Bug Report | analyzer_init called by odb_load |
> Thanks for the quick reply, Stefan.
>
> Please don't change anything in the code unless you find it really important.
> I guess changing the analyzer_init prototype will break a lot of code out there?
>
> In fact, I think I do understand this behavior now.
> And even without your suggested fix there is a simple workaround: I add a static
> variable to my analyzer_init.cxx file, and do something similar to your bFirst
> fix.
>
> In conclusion, commit your fix if it does not harm others. Postpone this
> commit to a future new version of midas which breaks a lot of things anyway...
>
> A last question, for me to understand: Why not call db_open_record in
> ana_begin_of_run then?
I fully agree with you that db_open_record would better go into ana_begin_of_run
(and analyzer_init not being called in odb_load), and I fully agree with you that
changing the code would break many experiments. ;-)
So I guess we leave it as it is right now, as you suggested. |
1245 | 27 Feb 2017 | William Moore | Suggestion | analyzer failing to load ODB parameters |
Hi,
I am attempting to compile and run analysis code on a completely different,
unconnected system from the DAQ computer for the experiment. The analyzer was
developed previously, and my goal is to get it running and then update it to
achieve my needs. Before compiling the analyzer, I load a backup ODB file in
odbedit and generate experim.h. I then compile the analyzer with that experim.h
file. When I run the analyzer I get the following output:
> MIDAS version 2.1ROOT version 5.34/36Root server listening on port 9090...
> Running analyzer offline. Stop with "!"
> Configuration file "/somedir/switches.odb" loaded
> [Analyzer,INFO] Set run number 1290 in ODB
> Load ODB from run 1290...[Analyzer,INFO] cannot load value "Client Notify":
> write protected
> [Analyzer,INFO] cannot load value "Prompt": write protected
.
.
.
> [Analyzer,INFO] cannot load value "LANSCE-ops": write protected
> MIDAS version 2.1ROOT version 5.34/36OK
> Configuration file "/somedir/switches.odb" loaded
> Data_Raw/run01290.mid.gz:16355 Data_Analyzed/run01290.root:15208 events, 0.43s
I have confirmed that all files being used have read/write access for all users. The
analyzer does populate a .root output file with filled histograms; however, not
all histograms are filled. I believe this is because histograms that relied on
an ODB parameter that failed to load did not populate. Any ideas as to what I am
doing wrong or how I could resolve this issue would be greatly appreciated.
Thanks,
William Moore |
420 | 04 Feb 2008 | Robert Pattie | Forum | analyzer crashes at high rates |
I'm using midas to read data from a waveform digitizer at event rates of
10-30 kHz. To accomplish this the digitizer is read via block transfers and the
raw data put into a single MIDAS event. Thus a MIDAS event could contain up to
250 physical events and at maximum 350 kBytes. In the analyzer modules I had
been analyzing the first physics event contained in a MIDAS event with no
problem. Recently I tried to analyze all the physical events. At low rates,
100 Hz-1 kHz, this was no problem: 1-5 physical events in a MIDAS event. At
higher rates, 10-20 kHz, where there are about 40 physical events per MIDAS event,
the analyzer keeps up for a few seconds, then segfaults with "'shared object
read from target memory' has disappear; keeping it symbols". Any suggestions as
to why the analyzer is crashing would be very helpful.
Thanks,
Robert |
421 | 05 Feb 2008 | Stefan Ritt | Forum | analyzer crashes at high rates |
> I'm using midas to read data from a waveform digitizer at event rates of
> 10-30 kHz. To accomplish this the digitizer is read via block transfers and the
> raw data put into a single MIDAS event. Thus a MIDAS event could contain up to
> 250 physical events and at maximum 350 kBytes. In the analyzer modules I had
> been analyzing the first physics event contained in a MIDAS event with no
> problem. Recently I tried to analyze all the physical events. At low rates,
> 100 Hz-1 kHz, this was no problem: 1-5 physical events in a MIDAS event. At
> higher rates, 10-20 kHz, where there are about 40 physical events per MIDAS event,
> the analyzer keeps up for a few seconds, then segfaults with "'shared object
> read from target memory' has disappear; keeping it symbols". Any suggestions as
> to why the analyzer is crashing would be very helpful.
I personally have never seen this error message. The analyzer is designed such that
it produces "back pressure" if the data rate is higher than the analysis rate and
you have "request all events" on. The only things I can imagine are the following two
issues:
- At higher rates, where you have more than 40 physical events per MIDAS event, there
is some bug in your analysis code which gets exploited only in that case. Maybe some
temporary array which is only 35 entries long, or something like this.
- The back pressure mentioned above will slow down the frontend. If your computer
busy logic is not working correctly, you might get more triggers than you can
acquire. Maybe then the data gets screwed up and the analyzer chokes on it.
Finding the exact reason is not simple. For sure you have to run the analyzer inside
the debugger to see exactly where the segfault happens. You may then have to
produce some dummy data in the frontend (like always sending the same event) to
disentangle possible trigger problems from other problems.
Best regards,
Stefan |
855 | 28 Jan 2013 | Robert Pattie | Forum | analyzer cannot connect to the statistics database |
I've managed to put the analyzer into a state where it cannot connect to the
statistics database. The error message suggests another analyzer is connected.
I've recompiled MIDAS and the user code, restarted the computer, etc., and the
analyzer cannot connect. If I run "odbedit -c clean", I can start the analyzer,
but get the same error when exiting or starting a run. I've commented out all the
user code in analyzer.c and its associated analyzer modules, and the read-event
code in the frontend, and nothing resolves this issue. Any suggestions?
The output from attempting to run the analyzer is:
Connect to experiment nnbarxwnr...[odb.c:1013:db_open_database,ERROR] Removed ODB
client 'Analyzer', index 0 because process pid 31982 does not exists
Deleted entry '/System/Clients/31982' for client 'Analyzer' because it is not
connected to ODB
OK
Root server listening on port 9090...
Loading previous online histos from ./data/last.root
ss_mutex_wait_for: pthread_mutex_lock() returned errno 22 (Invalid argument),
aborting...
When attempting to clean up the Analyzer tree in the ODB I receive the message
"deletion of key not allowed."
It appears that running the analyzer sets the permissions of the Statistics tree of
my analyzer module to RWDE.
Adding the following lines to my start up script eliminate the above problem:
odbedit -c clean
odbedit -c "chmod 7 Analyzer/"
odbedit -c "rm /Analyzer/fADCs/Statistics"
Now when starting a run the analyzer crashes with this error:
analyzer: src/midas.c:11443: rpc_execute: Assertion `return_buffer' failed.
Aborted (core dumped)
and the messages in the ODB are:
[system.c:4295:recv_tcp,ERROR] header: recv returned 0, n_received = 0, unexpected
connection closure
[midas.c:10042:rpc_client_call,ERROR] recv_tcp() failed, routine = "rc_transition",
host = "LANL-FADC-DAQ"
[midas.c:4130:cm_transition,ERROR] Could not start a run: cm_transition() status 503,
message 'Unknown error 503 from client 'Analyzer' on host LANL-FADC-DAQ'
Deleted entry '/System/Clients/1001' for client 'Analyzer' because process pid 1001
does not exists
[midas.c:8893:rpc_client_check,ERROR] Connection broken to "Analyzer" on host
LANL-FADC-DAQ
Run #180 start aborted
Error: Unknown error 503 from client 'Analyzer' on host LANL-FADC-DAQ
20:05:02 [Logger,INFO] Deleting previous file "./data/run00180.mid"
20:05:02 [ODBEdit,ERROR] [system.c:4295:recv_tcp,ERROR] header: recv returned 0,
n_received = 0, unexpected connection closure
20:05:02 [ODBEdit,ERROR] [midas.c:10042:rpc_client_call,ERROR] recv_tcp() failed,
routine = "rc_transition", host = "LANL-FADC-DAQ"
20:05:02 [ODBEdit,ERROR] [midas.c:4130:cm_transition,ERROR] Could not start a run:
cm_transition() status 503, message 'Unknown error 503 from client 'Analyzer' on host
LANL-FADC-DAQ'
20:05:02 [ODBEdit,INFO] Deleted entry '/System/Clients/1001' for client 'Analyzer'
because process pid 1001 does not exists
20:05:02 [ODBEdit,ERROR] [midas.c:8893:rpc_client_check,ERROR] Connection broken to
"Analyzer" on host LANL-FADC-DAQ
20:05:02 [ODBEdit,INFO] Run #180 start aborted
20:05:03 [mdump,INFO] Client 'Analyzer' on buffer 'SYSTEM' removed by cm_watchdog
because process pid 1001 does not exist
20:05:11 [mhttpd,INFO] Client 'Analyzer' (PID 1001) on database 'ODB' removed by
cm_watchdog (idle 10.1s,TO 10s)
Thanks,
Robert Pattie |
856 | 01 Feb 2013 | Randolf Pohl | Forum | analyzer cannot connect to the statistics database |
The simplest thing is probably to delete all files .[A-Z]*.SHM in the odb directory (the
one you specified in /etc/exptab).
This wipes the ODB, shared memory and all the other obscure stuff, giving you a clean,
fresh start.
Of course it wipes all the valuable stuff, too. That's why it's handy to sometimes open
odbedit and "save odb_<yyyymmdd>.odb". You can reload the thing after such a fatal
"rm .[A-Z]*.SHM" |
857 | 01 Feb 2013 | Stefan Ritt | Forum | analyzer cannot connect to the statistics database |
> The simplest thing is probably to delete all files .[A-Z]*.SHM in the odb directory (the
> one you specified in /etc/exptab).
> This wipes the ODB, shared memory and all the other obscure stuff, giving you a clean,
> fresh start.
>
> Of course it wipes all the valuable stuff, too. That's why it's handy to sometimes open
> odbedit and "save odb_<yyyymmdd>.odb". You can reload the thing after such a fatal
> "rm .[A-Z]*.SHM"
Thanks Randolf for helping out, I was not in the office this week.
In addition to deleting the *.SHM files, it's sometimes necessary to delete the shared
memory. You do this with the command-line tools:
   ipcs -m
   ipcrm -m <shmid>
/Stefan |
2401 | 13 May 2022 | Konstantin Olchanski | Info | analysis of corner cases in event buffer write cache |
introduction:
to remember, bm_send_event() writes an event to the write cache, bm_flush_cache()
writes the contents of the write cache into the shared memory event buffer, buffer
free space is consumed. in the usual case, mlogger is reading events from the shared
memory event buffer, buffer free space is released. there is also a read cache, not
part of this discussion.
the purpose of the write cache is to reduce contention for the shared memory
semaphore. in the case of a large number of small events, the semaphore is locked
once per cache flush instead of once per event. correct tuning of write cache and
event size can reduce the lock rate from >100 kHz to around 100 Hz or lower.
analysis:
for correct operation of bm_send_event() under all conditions we need to consider
all corner cases:
1) no write cache: (cache size set to 0)
- event_size > buffer_size -> reject the event (obviously)
- event_size > 0.5 * buffer_size -> only 1 event fits into the buffer, next write
will stall until mlogger reads the previous event (sequential operation, bad)
- event_size < 0.3 * buffer_size -> at least 2 events fit into the buffer (good)
decision: limit event size to 0.5 to 0.3 * buffer_size (current limit is 0.5 *
buffer_size, I think).
consequence: buffer size limit is 2 Gbytes (32-bit byte offsets, code is only 31-
bit-clean), max event size is between 1 Gbytes and 0.6 Gbytes.
2) writing to write cache:
- event_size > cache_size -> flush cache, write event directly to buffer
- event_size > 0.5 * cache_size -> inefficient use of cache: write to cache, next
event does not fit, flush to buffer, repeat. no gain in semaphore locking (bad), one
additional memcpy() (event to cache and cache to buffer) (bad)
- event_size < 0.3 * cache_size -> multiple events fit into cache, but probably no
gain in semaphore locking
decision: events that are bigger than 0.3 to 0.1 * cache_size should not go through
the cache. (flush cache, write directly to buffer).
3) flush write cache to buffer:
- cache_size > buffer_size -> cannot flush in 1 operation, must have a loop and
flush the cache in pieces
- cache_size between 0.5 and 1.0 * buffer_size -> can flush in 1 operation, but must
wait for mlogger to fully empty the buffer (sequential operation, bad)
- cache size < 0.3 * buffer_size -> can flush in 1 operation, at least 2 "flushes"
fit inside the buffer (good)
decision: limit write cache size to 0.3 * buffer_size. (current limit is
0.25*buffer_size).
consequences:
- write cache size limit is 0.3..0.25 * 2GB = 0.6..0.5 Gbytes
- cached event size limit is 0.3..0.1 * 0.5 GBytes = 150..50 Mbytes
- minimum number of cached events: 3 to 10
- semaphore locks reduced: 3 to 10 locks become 1 lock (all events cached),
4 to 11 locks become 2 locks (big event causes cache flush).
4) complications:
- there is a periodic 1/second bm_flush_cache() that flushes the cache early and
reduces its efficiency (but it is needed to avoid having data stuck in the cache for
a long time)
- if multiple frontends use large write cache (~ 0.3..0.5 * buffer_size), again,
sequential operation can happen (bad)
- write cache is per-frontend, not per-equipment. if different equipments request
different cache sizes, mfe.c and tmfe c++ frontends complain about this, but the
user has to sort it out.
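for illustration, the decisions above could look like this in code (a minimal
sketch with made-up names, thresholds taken from the ranges above, not the actual
bm_send_event() implementation):
int should_cache_event(int event_size, int cache_size, int buffer_size)
{
   if (event_size > buffer_size / 2)
      return -1;   /* reject: event does not safely fit into the buffer */
   if (cache_size == 0)
      return 0;    /* no write cache: write directly to the buffer */
   if (event_size > cache_size / 3)
      return 0;    /* big event: flush the cache, then write directly */
   return 1;       /* small event: append to the write cache */
}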
K.O. |
2403 | 16 May 2022 | Konstantin Olchanski | Info | analysis of corner cases in event buffer write cache |
> for correct operation of bm_send_event() under all conditions we need to ...
to continue computation from last message:
default SYSTEM buffer size: 32 MiBytes
default max event size: 4 MiBytes
hard max buffer size: 2 Gbytes (code is only 31-bit-clean)
hard max event size: 2 Gbytes (code is only 31-bit-clean)
max event size currently: 32 Mbytes (same as buffer size)
max event size per (1) in previous post: 32*0.5..0.3 = 16..9 MiBytes
number of default-max-size events buffered: 32/4 = 8.
number of per (1) max-size events buffered: 2 or 3
number of current max-size events buffered: 0 (bad, frontend is serialized with mlogger)
default write cache size: 100 kbytes
max write cache size currently: buffer size / 4 = 32/4 = 8 MiBytes
max write cache size per (3) in previous post: buffer_size / 3 = 10 Mbytes
hard max write cache size per (3): 2 Gbytes/3 = 600 Mbytes
max size of cached events:
current: 100 kbytes (same as cache size)
per (2) in previous post: 0.1..0.3 * cache size = 10..30 kbytes
per (2), 1 Mbyte cache: 0.1..0.3 * cache size = 100..300 kbytes
hard max size: 0.1..0.3 * hard_max_cache_size = 0.1..0.3 * 600 = 60..180 Mbytes.
max data rate before event buffer semaphore locking rate exceeds 100 Hz:
1 kbyte events, no write cache: 100 kbytes/sec
1 kbyte events, 100 kbyte cache: 100 events cached, cache flush rate 100 Hz -> 100*1kbyte*100Hz -> 10 Mbytes/sec
1 kbyte events, 1 Mbyte cache: 1000 events cached, cache flush rate 100 Hz -> 100 Mbytes/sec (1gige ethernet)
N kbyte events, 1 Mbyte cache: same thing (data rate is limited by cache flush rate 100 Hz)
100 kbyte events, 1 Mbyte cache, not cached per (2): 100kbyte*100Hz = 10 Mbytes/sec
300 kbyte events, 1 Mbyte cache, not cached per (2): 300kbyte*100Hz = 30 Mbytes/sec
N00 kbyte events: N0 Mbytes/sec (500->50, etc)
1 kbyte events, 10 Mbyte cache: 10000 events cached, cache flush rate 100 Hz -> 1000 Mbytes/sec (10gige ethernet)
N kbyte events, 10 Mbyte cache: same thing (data rate is limited by cache flush rate 100 Hz)
1000 kbyte events, 10 Mbyte cache, not cached per (2): 1000kbyte*100Hz = 100 Mbytes/sec
3000 kbyte events, 10 Mbyte cache, not cached per (2): 3000kbyte*100Hz = 300 Mbytes/sec
N000 kbyte events: N00 Mbytes/sec (4000->400, 5000->500, etc)
default max event size: 4 Mibytes*100Hz = 400 Mbytes/sec (exceeds 1gige ethernet)
hard max event size (divided by 10 to buffer 10 events): 200 Mbytes*100Hz -> 20 Gbytes/sec
max event rate before event buffer semaphore locking rate exceeds 100 Hz:
1 kbyte events, no write cache: 100 Hz (obviously)
1 kbyte events, 100 kbyte cache: 100 events cached, cache flush rate 100 Hz -> 10 kHz
1 kbyte events, 1 Mbyte cache: 1000 events cached, cache flush rate 100 Hz -> 100 kHz
N kbyte events, 1 Mbyte cache: 1000/N events cached, cache flush rate 100 Hz -> 100/N kHz
1 kbyte events, 10 Mbyte cache: 10000 events cached, cache flush rate 100 Hz -> 1000 kHz
N kbyte events, 10 Mbyte cache: 10000/N events cached, cache flush rate 100 Hz -> 1000/N kHz
100 kbyte events, not cached per (2): 100 Hz (obviously)
300 kbyte events, not cached per (2): 100 Hz (obviously)
default max event size: 100 Hz (obviously)
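to summarize the scaling above in one formula (my paraphrase of the numbers above,
not a new measurement):
   max_data_rate  ~= cache_size * max_lock_rate   (for events small enough to cache)
   max_event_rate ~= (cache_size / event_size) * max_lock_rate
with max_lock_rate = 100 Hz, a 10 Mbyte cache gives ~1000 Mbytes/sec, and for
1 kbyte events ~1000 kHz, matching the entries above.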
K.O. |
2404 | 16 May 2022 | Konstantin Olchanski | Info | analysis of corner cases in event buffer write cache |
> > for correct operation of bm_send_event() under all conditions we need to ...
> to continue computation from last message:
if I got my numbers right, for present-day hardware (1gige/10gige data rates, 100 Hz max locking rate), we should
increase the default buffer write cache size from 100 kbytes to 10 Mbytes.
this cache size will permit processing of the full mix of small/big events
at the full mix of event rates without exceeding the 100 Hz semaphore locking rate.
with the 10 Mbyte write cache, default event buffer size should be 30-40 Mbytes (current size is 33 Mbytes, so does
not need to change).
this computation is for 1 writer (1 reader, mlogger). it is a typical case for our experiments.
multiple writers can run into contention for event buffer space.
consider 10 writers want to flush their 10 Mbyte write cache all at the same time:
if buffer size is the default 33 Mbytes, the first 3 writers will have successful write cache flush,
but the other 7 will stall, there is no space in the buffer, we have to wait for mlogger to free
some (mlogger writing X Mbytes/sec will take Y milliseconds to liberate 10 Mbytes of space for the 4th writer
to successfully flush, writers 5..10 are still stalled).
but a system with 10 writers each writing at 10 Mbytes/sec (1 Hz default cache flush
rate), i.e. 100 Mbytes/sec total, will likely have a SYSTEM buffer size of at least
200-300 Mbytes (to buffer 1-2 seconds of data against
any delays in writing to disk/network storage).
so there should be no problem in practice.
K.O. |
1981 | 12 Aug 2020 | Yan Liu | Suggestion | adding db_get_mode to check access mode for keys |
Hello,
I am wondering if there is a function that checks the access mode for a key? I
found the db_set_mode() function that allows me to set the access mode for a key,
but failed to find its counterpart get function.
Thanks in advance,
Yan |
1982 | 13 Aug 2020 | Stefan Ritt | Suggestion | adding db_get_mode to check access mode for keys |
> Hello,
>
> I am wondering if there is a function that checks the access mode for a key? I
> found the db_set_mode() function that allows me to set the access mode for a key,
> but failed to find its counterpart get function.
>
> Thanks in advance,
> Yan
KEY k;
db_get_key(hDB, handle, &k);
std::cout << k.access_mode << std::endl;
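If you want that wrapped up as the missing counterpart function, a minimal sketch
(the name and signature are my suggestion, not an existing midas function):
INT db_get_mode(HNDLE hDB, HNDLE hKey, WORD *access_mode)
{
   KEY key;
   INT status = db_get_key(hDB, hKey, &key);  /* fetch the full KEY record */
   if (status == DB_SUCCESS)
      *access_mode = key.access_mode;
   return status;
}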
/Stefan |
1983 | 13 Aug 2020 | Yan Liu | Suggestion | adding db_get_mode to check access mode for keys |
Thank you!
Yan
> > Hello,
> >
> > I am wondering if there is a function that checks the access mode for a key? I
> > found the db_set_mode() function that allows me to set the access mode for a key,
> > but failed to find its counterpart get function.
> >
> > Thanks in advance,
> > Yan
>
>
> KEY k;
> db_get_key(hDB, handle, &k);
> std::cout << k.access_mode << std::endl;
>
> /Stefan |
811 | 22 Jun 2012 | Zisis Papandreou | Info | adding 2nd ADC and TDC to crate |
Hi folks:
we've been running midas-1.9.5 for a few years here at Regina. We are now
working on a larger cosmic ray test that requires a second ADC and a second TDC
module in our Camac crate (we use the hytek1331 controller, by the way). We're
baffled as to how to set this up properly. Specifically we have tried:
frontend.c
/* number of channels */
#define N_ADC 12
(changed this from the old '8' to '12', and it seems to work for Lecroy 2249)
#define SLOT_ADC0 10
#define SLOT_TDC0 9
#define SLOT_ADC1 15
#define SLOT_TDC1 14
Is this the way to define the additional slots (by adding 0, 1 indices)?
Also, we were not able to get a new bank (ADC1) working, so we used a loop to
tag the second ADC values onto those of the first.
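(For reference, a minimal sketch of what a separate ADC1 bank in the readout
routine might look like, using the standard MIDAS bank calls; it assumes
bk_init(pevent) was already called, the CAMAC function codes depend on the module,
and the sketch is untested:)
   WORD *pdata;
   int i;
   bk_create(pevent, "ADC1", TID_WORD, (void **)&pdata);
   for (i = 0; i < N_ADC; i++)
      cam16i(CRATE, SLOT_ADC1, i, 0, pdata++);   /* read channel i of 2nd ADC */
   bk_close(pevent, pdata);
The matching { "ADC1", TID_WORD, N_ADC, NULL } entry in the analyzer's
trigger_bank_list (commented out in the attached analyzer.c) would also need to be
enabled.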
If someone has an example of how to handle multiple ADCs and TDCs, and
suggestions as to where changes need to be made (header files, analyser, etc.),
this would be great.
Thanks, Zisis...
P.S. I am attaching the relevant files. |
Attachment 1: frontend.c
/********************************************************************\
Name: frontend.c
Created by: Stefan Ritt
Contents: Experiment specific readout code (user part) of
Midas frontend. This example simulates a "trigger
event" and a "scaler event" which are filled with
CAMAC or random data. The trigger event is filled
with two banks (ADC0 and TDC0), the scaler event
with one bank (SCLR).
$Log: frontend.c,v $
Revision 1.14 2002/05/16 21:09:53 midas
Added max_event_size_frag
Revision 1.11 2000/08/21 10:32:51 midas
Added max_event_size, set event_buffer_size = 10*max_event_size
Revision 1.10 2000/03/13 18:53:29 pierre
- Added 2nd arg in readout functions (offset for Super event)
Revision 1.9 2000/03/02 22:00:00 midas
Added number of subevents as zero
Revision 1.8 1999/02/24 16:27:01 midas
Added some "real" readout code
Revision 1.7 1999/01/20 09:03:38 midas
Added LAM_SOURCE_CRATE and LAM_SOURCE_STATION macros
Revision 1.6 1999/01/19 10:27:30 midas
Use new LAM_SOURCE and LAM_STATION macros
Revision 1.5 1998/11/09 09:14:41 midas
Added code to simulate random data
Revision 1.4 1998/10/29 14:27:46 midas
Added note about FE_ERR_HW in frontend_init()
Revision 1.3 1998/10/28 15:50:58 midas
Changed lam to DWORD
Revision 1.2 1998/10/12 12:18:58 midas
Added Log tag in header
\********************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include "midas.h"
#include "mcstd.h"
#include "experim.h"
/* make frontend functions callable from the C framework */
#ifdef __cplusplus
extern "C" {frontend.c
#endif
/*-- Globals -------------------------------------------------------*/
/* The frontend name (client name) as seen by other MIDAS clients */
char *frontend_name = "Sample Frontend";
/* The frontend file name, don't change it */
char *frontend_file_name = __FILE__;
/* frontend_loop is called periodically if this variable is TRUE */
BOOL frontend_call_loop = FALSE;
/* a frontend status page is displayed with this frequency in ms */
INT display_period = 3000;
/* maximum event size produced by this frontend */
INT max_event_size = 10000;
/* maximum event size for fragmented events (EQ_FRAGMENTED) */
INT max_event_size_frag = 5*1024*1024;
/* buffer size to hold events */
INT event_buffer_size = 10*10000;
/* number of channels */
#define N_ADC 12
#define N_TDC 8
#define N_SCLR 8
/* CAMAC crate and slots */
#define CRATE 0
#define SLOT_IO 23
#define SLOT_ADC0 10
#define SLOT_TDC0 9
#define SLOT_ADC1 15
#define SLOT_TDC1 14
#define SLOT_SCLR 12
/*-- Function declarations -----------------------------------------*/
INT frontend_init();
INT frontend_exit();
INT begin_of_run(INT run_number, char *error);
INT end_of_run(INT run_number, char *error);
INT pause_run(INT run_number, char *error);
INT resume_run(INT run_number, char *error);
INT frontend_loop();
INT read_trigger_event(char *pevent, INT off);
INT read_scaler_event(char *pevent, INT off);
/*-- Equipment list ------------------------------------------------*/
#undef USE_INT
EQUIPMENT equipment[] = {
{ "Trigger", /* equipment name */
1, 0, /* event ID, trigger mask */
"SYSTEM", /* event buffer */
#ifdef USE_INT
EQ_INTERRUPT, /* equipment type */
#else
EQ_POLLED, /* equipment type */
#endif
LAM_SOURCE(CRATE,LAM_STATION(SLOT_TDC0)), /* event source crate 0, TDC */
"MIDAS", /* format */
TRUE, /* enabled */
RO_RUNNING | /* read only when running */
RO_ODB, /* and update ODB */
500, /* poll for 500ms */
0, /* stop run after this event limit */
0, /* number of sub events */
0, /* don't log history */
"", "", "",
read_trigger_event, /* readout routine */
},
{ "Scaler", /* equipment name */
2, 0, /* event ID, trigger mask */
"SYSTEM", /* event buffer */
EQ_PERIODIC |
EQ_MANUAL_TRIG, /* equipment type */
0, /* event source */
"MIDAS", /* format */
TRUE, /* enabled */
RO_RUNNING |
RO_TRANSITIONS | /* read when running and on transitions */
RO_ODB, /* and update ODB */
10000, /* read every 10 sec */
0, /* stop run after this event limit */
0, /* number of sub events */
0, /* log history */
"", "", "",
read_scaler_event, /* readout routine */
},
{ "" }
};
#ifdef __cplusplus
}
#endif
/********************************************************************\
Callback routines for system transitions
These routines are called whenever a system transition like start/
stop of a run occurs. The routines are called on the following
occasions:
frontend_init: When the frontend program is started. This routine
should initialize the hardware.
frontend_exit: When the frontend program is shut down. Can be used
to release any locked resources like memory, communications
ports, etc.
begin_of_run: When a new run is started. Clear scalers, open
rungates, etc.
end_of_run: Called on a request to stop a run. Can send
end-of-run event and close run gates.
pause_run: When a run is paused. Should disable trigger events.
resume_run: When a run is resumed. Should enable trigger events.
\********************************************************************/
/*-- Frontend Init -------------------------------------------------*/
INT frontend_init()
{
/* hardware initialization */
cam_init();
cam_crate_clear(CRATE);
cam_crate_zinit(CRATE);
cam_inhibit_set(CRATE);
/* enable LAM in IO unit */
/* camc(CRATE, SLOT_IO, 0, 26); */
/* enable LAM in crate controller */
/* cam_lam_enable(CRATE, SLOT_IO); */
/* reset external LAM Flip-Flop */
/* camo(CRATE, SLOT_IO, 1, 16, 0xFF); */
/* camo(CRATE, SLOT_IO, 1, 16, 0); */
/* print message and return FE_ERR_HW if frontend should not be started */
return SUCCESS;
}
/*-- Frontend Exit -------------------------------------------------*/
INT frontend_exit()
{
return SUCCESS;
}
/*-- Begin of Run --------------------------------------------------*/
INT begin_of_run(INT run_number, char *error)
{
/* put here clear scalers etc. */
/* clear TDC units */
camc(CRATE, SLOT_TDC0, 0, 9);
camc(CRATE, SLOT_TDC1, 0, 9);
/* clear ADC units */
camc(CRATE, SLOT_ADC0, 0, 9);
camc(CRATE, SLOT_ADC1, 0, 9);
/* disable LAM in ADC and TDC1 units */
camc(CRATE, SLOT_ADC0, 0, 24);
camc(CRATE, SLOT_ADC1, 0, 24);
camc(CRATE, SLOT_TDC1, 0, 24);
/* enable LAM in TDC0 unit */
camc(CRATE, SLOT_TDC0, 0, 26);
cam_inhibit_clear(CRATE);
cam_lam_enable(CRATE, SLOT_TDC0);
return SUCCESS;
}
/*-- End of Run ----------------------------------------------------*/
INT end_of_run(INT run_number, char *error)
{
camc(CRATE, SLOT_TDC0, 0, 24);
camc(CRATE, SLOT_ADC0, 0, 24);
camc(CRATE, SLOT_TDC1, 0, 24);
camc(CRATE, SLOT_ADC1, 0, 24);
cam_inhibit_set(CRATE);
return SUCCESS;
}
/*-- Pause Run -----------------------------------------------------*/
INT pause_run(INT run_number, char *error)
{
return SUCCESS;
}
/*-- Resume Run ----------------------------------------------------*/
INT resume_run(INT run_number, char *error)
{
return SUCCESS;
}
/*-- Frontend Loop -------------------------------------------------*/
INT frontend_loop()
{
/* if frontend_call_loop is true, this routine gets called when
the frontend is idle or once between every event */
return SUCCESS;
}
/*------------------------------------------------------------------*/
/********************************************************************\
Readout routines for different events
\********************************************************************/
/*-- Trigger event routines ----------------------------------------*/
INT poll_event(INT source, INT count, BOOL test)
/* Polling routine for events. Returns TRUE if event
is available. If test equals TRUE, don't return. The test
flag is used to time the polling */
{
int i;
DWORD lam;
... 155 more lines ...
Attachment 2: analyzer.c
/********************************************************************\
Name: analyzer.c
Created by: Stefan Ritt
Contents: System part of Analyzer code for sample experiment
$Log: analyzer.c,v $
Revision 1.4 2000/03/02 22:00:18 midas
Changed events sent to double
Revision 1.3 1998/10/29 14:18:19 midas
Used hDB consistently
Revision 1.2 1998/10/12 12:18:58 midas
Added Log tag in header
\********************************************************************/
/* standard includes */
#include <stdio.h>
#include <time.h>
/* midas includes */
#include "midas.h"
#include "experim.h"
#include "analyzer.h"
/* cernlib includes */
#ifdef OS_WINNT
#define VISUAL_CPLUSPLUS
#endif
#ifdef __linux__
#define f2cFortran
#endif
#ifndef MANA_LITE
#include <cfortran.h>
#include <hbook.h>
PAWC_DEFINE(1000000);
#endif
/*-- Globals -------------------------------------------------------*/
/* The analyzer name (client name) as seen by other MIDAS clients */
char *analyzer_name = "Analyzer";
/* analyzer_loop is called with this interval in ms (0 to disable) */
INT analyzer_loop_period = 0;
/* default ODB size */
INT odb_size = DEFAULT_ODB_SIZE;
/* ODB structures */
RUNINFO runinfo;
GLOBAL_PARAM global_param;
EXP_PARAM exp_param;
TRIGGER_SETTINGS trigger_settings;
/*-- Module declarations -------------------------------------------*/
extern ANA_MODULE scaler_accum_module;
extern ANA_MODULE adc_calib_module;
extern ANA_MODULE adc_summing_module;
ANA_MODULE *scaler_module[] = {
&scaler_accum_module,
NULL
};
ANA_MODULE *trigger_module[] = {
&adc_calib_module,
&adc_summing_module,
NULL
};
/*-- Bank definitions ----------------------------------------------*/
ASUM_BANK_STR(asum_bank_str);
BANK_LIST trigger_bank_list[] = {
/* online banks */
{ "ADC0", TID_WORD, 2*N_ADC, NULL },
/* { "ADC1", TID_WORD, N_ADC, NULL },
{ "TDC1", TID_WORD, N_TDC, NULL }, */
{ "TDC0", TID_WORD, 2*N_TDC, NULL },
/* calculated banks */
{ "CADC", TID_FLOAT, N_ADC, NULL },
{ "ASUM", TID_STRUCT, sizeof(ASUM_BANK), asum_bank_str },
{ "" },
};
BANK_LIST scaler_bank_list[] = {
/* online banks */
{ "SCLR", TID_DWORD, N_ADC, NULL },
/* calculated banks */
{ "ACUM", TID_DOUBLE, N_ADC, NULL },
{ "" },
};
/*-- Event request list --------------------------------------------*/
ANALYZE_REQUEST analyze_request[] = {
{ "Trigger", /* equipment name */
1, /* event ID */
TRIGGER_ALL, /* trigger mask */
GET_SOME, /* get some events */
"SYSTEM", /* event buffer */
TRUE, /* enabled */
"", "",
NULL, /* analyzer routine */
trigger_module, /* module list */
trigger_bank_list, /* bank list */
1000, /* RWNT buffer size */
TRUE, /* Use tests for this event */
},
{ "Scaler", /* equipment name */
2, /* event ID */
TRIGGER_ALL, /* trigger mask */
GET_ALL, /* get all events */
"SYSTEM", /* event buffer */
TRUE, /* enabled */
"", "",
NULL, /* analyzer routine */
scaler_module, /* module list */
scaler_bank_list, /* bank list */
100, /* RWNT buffer size */
},
{ "" }
};
/*-- Analyzer Init -------------------------------------------------*/
INT analyzer_init()
{
HNDLE hDB, hKey;
char str[80];
RUNINFO_STR(runinfo_str);
EXP_PARAM_STR(exp_param_str);
EXP_EDIT_STR(exp_edit_str);
GLOBAL_PARAM_STR(global_param_str);
TRIGGER_SETTINGS_STR(trigger_settings_str);
/* open ODB structures */
cm_get_experiment_database(&hDB, NULL);
db_create_record(hDB, 0, "/Runinfo", strcomb(runinfo_str));
db_find_key(hDB, 0, "/Runinfo", &hKey);
if (db_open_record(hDB, hKey, &runinfo, sizeof(runinfo), MODE_READ, NULL, NULL) != DB_SUCCESS)
{
cm_msg(MERROR, "analyzer_init", "Cannot open \"/Runinfo\" tree in ODB");
return 0;
}
db_create_record(hDB, 0, "/Experiment/Run Parameters", strcomb(exp_param_str));
db_find_key(hDB, 0, "/Experiment/Run Parameters", &hKey);
if (db_open_record(hDB, hKey, &exp_param, sizeof(exp_param), MODE_READ, NULL, NULL) != DB_SUCCESS)
{
cm_msg(MERROR, "analyzer_init", "Cannot open \"/Experiment/Run Parameters\" tree in ODB");
return 0;
}
db_create_record(hDB, 0, "/Experiment/Edit on start", strcomb(exp_edit_str));
sprintf(str, "/%s/Parameters/Global", analyzer_name);
db_create_record(hDB, 0, str, strcomb(global_param_str));
db_find_key(hDB, 0, str, &hKey);
if (db_open_record(hDB, hKey, &global_param, sizeof(global_param), MODE_READ, NULL, NULL) != DB_SUCCESS)
{
cm_msg(MERROR, "analyzer_init", "Cannot open \"%s\" tree in ODB", str);
return 0;
}
db_create_record(hDB, 0, "/Equipment/Trigger/Settings", strcomb(trigger_settings_str));
db_find_key(hDB, 0, "/Equipment/Trigger/Settings", &hKey);
if (db_open_record(hDB, hKey, &trigger_settings, sizeof(trigger_settings), MODE_READ, NULL, NULL) != DB_SUCCESS)
{
cm_msg(MERROR, "analyzer_init", "Cannot open \"/Equipment/Trigger/Settings\" tree in ODB");
return 0;
}
return SUCCESS;
}
/*-- Analyzer Exit -------------------------------------------------*/
INT analyzer_exit()
{
return CM_SUCCESS;
}
/*-- Begin of Run --------------------------------------------------*/
INT ana_begin_of_run(INT run_number, char *error)
{
return CM_SUCCESS;
}
/*-- End of Run ----------------------------------------------------*/
INT ana_end_of_run(INT run_number, char *error)
{
FILE *f;
time_t now;
char str[256];
int size;
double n;
HNDLE hDB;
BOOL flag;
cm_get_experiment_database(&hDB, NULL);
/* update run log if run was written and running online */
size = sizeof(flag);
db_get_value(hDB, 0, "/Logger/Write data", &flag, &size, TID_BOOL, TRUE);
/* if (flag && runinfo.online_mode == 1) */
if (flag )
{
/* update run log */
size = sizeof(str);
str[0] = 0;
db_get_value(hDB, 0, "/Logger/Data Dir", str, &size, TID_STRING, TRUE);
if (str[0] != 0)
if (str[strlen(str)-1] != DIR_SEPARATOR)
strcat(str, DIR_SEPARATOR_STR);
strcat(str, "runlog.txt");
f = fopen(str, "a");
time(&now);
strcpy(str, ctime(&now));
str[10] = 0;
fprintf(f, "%s\t%3d\t", str, runinfo.run_number);
strcpy(str, runinfo.start_time);
str[19] = 0;
fprintf(f, "%s\t", str+11);
strcpy(str, ctime(&now));
str[19] = 0;
fprintf(f, "%s\t", str+11);
size = sizeof(n);
db_get_value(hDB, 0, "/Equipment/Trigger/Statistics/Events sent", &n, &size, TID_DOUBLE, TRUE);
fprintf(f, "%5.1lfk\t", n/1000);
fprintf(f, "%s\n", exp_param.comment);
fclose(f);
}
return CM_SUCCESS;
}
/*-- Pause Run -----------------------------------------------------*/
INT ana_pause_run(INT run_number, char *error)
{
return CM_SUCCESS;
}
/*-- Resume Run ----------------------------------------------------*/
INT ana_resume_run(INT run_number, char *error)
{
return CM_SUCCESS;
}
/*-- Analyzer Loop -------------------------------------------------*/
INT analyzer_loop()
{
return CM_SUCCESS;
}
/*------------------------------------------------------------------*/
Attachment 3: analyzer.h
/********************************************************************\
Name: analyzer.h
Created by: Stefan Ritt
Contents: Analyzer global include file
$Log: analyzer.h,v $
Revision 1.2 1998/10/12 12:18:58 midas
Added Log tag in header
\********************************************************************/
/*-- Parameters ----------------------------------------------------*/
/* number of channels */
#define N_ADC 12
#define N_TDC 8
#define N_SCLR 8
/*-- Histo ID bases ------------------------------------------------*/
#define ADCCALIB_ID_BASE 2000
#define ADCSUM_ID_BASE 3000
2389 | 30 Apr 2022 | Konstantin Olchanski | Info | added web pages for "show odb clients" and "show open records" |
for a long time, midas web pages have been missing the equivalent of odbedit
"scl" and "sor" to display current odb clients and current odb open records.
this is now added as buttons "show open records" and "show odb clients" in the
odb editor page.
as in odbedit, "sor" shows open records under the current subtree, i.e. if you
are looking at /equipment, you will not see open records for /experiment. to see
all open records, go to "/".
commit b1ab7e67ecf785744fff092708d8389f222b14a4
K.O. |