26 Jul 2007, Stefan Ritt, Info, Change of pointer type in mvmestd.h
|
I had to change the pointer type of mvme_read and mvme_write to (void *) instead
of (mvme_locaddr_t *) to avoid warnings under 64-bit Linux. Please adjust your
VME drivers if necessary.
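For reference, the affected declarations in mvmestd.h presumably now look roughly like
this (a sketch from memory, not copied from the repository - check your own copy):

   /* old */
   int mvme_read (MVME_INTERFACE *mvme, mvme_locaddr_t *dst, mvme_addr_t vme_addr, mvme_size_t n_bytes);
   /* new */
   int mvme_read (MVME_INTERFACE *mvme, void *dst, mvme_addr_t vme_addr, mvme_size_t n_bytes);
   int mvme_write(MVME_INTERFACE *mvme, mvme_addr_t vme_addr, void *src, mvme_size_t n_bytes);

A driver implementing the old prototypes only needs its signatures (and any internal casts)
adjusted accordingly.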
- Stefan |
12 Aug 2007, Konstantin Olchanski, Info, Change of pointer type in mvmestd.h
|
> I had to change the pointer type of mvme_read and mvme_write to (void *) instead
> of (mvme_locaddr_t *) to avoid warnings under 64-bit Linux. Please adjust your
> VME drivers if necessary.
Updated: vmicvme.c (VMIVME-7750/7805) and gefvme.c (GEFANUC V7865)
K.O. |
29 Jun 2007, Konstantin Olchanski, Bug Fix, mscb, musbstd fixed on Linux, MacOS
|
I committed a few minor changes to the musbstd and mscb code to make them work on
MacOSX (tested on 10.3.9) and Linux (tested on Fedora 6).
The basic functions work with the MSCB USB master, but I still need to
investigate some cases where the connection hangs and usb communications do not
work until the USB cable is unplugged and plugged back in. I see this problem
both on MacOS and Linux.
Important changes:
1) mscb_select_device() does not work on either Linux or MacOS and is disabled.
Please run "msc -d usb0".
2) on Linux, the Makefile should define -DOS_LINUX and -DHAVE_LIBUSB;
on MacOS, the Makefile should define -DOS_LINUX and -DOS_DARWIN (MacOS is
treated as a funny type of Linux); see the sketch after this list.
3) when doing USB communications, one has to use the correct endpoint numbers,
which seem to be system dependent, and for now I hard-code them in mscb.c for
the tested systems.
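For point 2, a minimal Makefile fragment would look something like this (the library names
are my assumptions, not taken from the distribution - adjust them to your setup):

   # Linux: libusb-based USB access
   CFLAGS += -DOS_LINUX -DHAVE_LIBUSB
   LIBS   += -lusb

   # MacOSX (treated as a funny type of Linux)
   # CFLAGS += -DOS_LINUX -DOS_DARWIN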
There are supposed to be no changes to the Windows code, but I cannot test on
Windows, so if somebody does and finds breakage, please let me know.
K.O. |
02 Jul 2007, Stefan Ritt, Bug Fix, mscb, musbstd fixed on Linux, MacOS
|
KO wrote: | There are supposed to be no changes to the Windows code, but I cannot test on Windows, so if somebody does and finds breakage, please let me know. |
I can confirm that revision 3713 still works under Windows. |
06 Jul 2007, Konstantin Olchanski, Bug Fix, mscb, musbstd fixed on Linux, MacOS
|
> I commited a few minor changes to musbstd and mscb code...
>
> The basic functions work with the MSCB USB master, but I still need to
> investigate some cases where the connection hangs and usb communications do not
> work until the USB cable is unplugged and plugged back in. I see this problem
> both on MacOS and Linux.
I think I fixed the hangs we see on Linux and MacOS - in the end all I had to do was
issue a USB reset to make MSCB communicate again.
Also tested on Linux FC6 and SL4.5.
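For the curious, the recovery is of this general form, using libusb-0.1 (a sketch only;
recover_mscb_usb and the surrounding bookkeeping are hypothetical, not the actual mscb.c code):

   #include <usb.h>   /* libusb-0.1 */

   /* if MSCB communication hangs, reset the USB device; the old handle
      becomes stale, so the caller must re-enumerate and usb_open() again */
   static int recover_mscb_usb(usb_dev_handle *udev)
   {
      if (usb_reset(udev) < 0)
         return -1;    /* reset failed */
      usb_close(udev);
      return 0;
   }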
K.O. |
10 May 2007, Konstantin Olchanski, Info, RHEL5/SL5 success!
|
FWIW, I am running latest 32-bit MIDAS on an AM2 dual core AMD machine under 64-bit SL5. Everything
seems to work correctly. K.O.
P.S. For the record, the compiler produces two sets of warnings:
- warning: pointer targets in passing argument 3 of '...' differ in signedness
- warning: dereferencing type-punned pointer will break strict-aliasing rules
(I do not understand the meaning of the second warning. type-punned pointer, huh?)
K.O. |
03 Jul 2007, Ryu Sawada, Info, RHEL5/SL5 success!
|
> P.S. For the record, the compiler produces two sets of warnings:
> - warning: dereferencing type-punned pointer will break strict-aliasing rules
> (I do not understand the meaning of the second warning. type-punned pointer, huh?)
This is because the strict aliasing rule is broken in this code.
In the ISO C99 standard it is illegal to access the same memory location through pointers of incompatible types.
Code that breaks the rule still compiles, but the result is undefined.
For example, the following code gives different results depending on whether -O2 is used, because -O2 includes the -fstrict-aliasing option.
When -fstrict-aliasing is in effect, the compiler may optimize the code assuming the strict aliasing rule holds.
#include <stdio.h>

int main()
{
   int ii = 1;
   float *ff = (float *) &ii;
   *ff = 2;
   printf("%d\n", ii);
   return 0;
}
GCC warns about this kind of code with a message like: warning: dereferencing type-punned pointer will break strict-aliasing rules
The behavior also differs between compilers: GCC3 does not warn, while GCC4 does (GCC3 is the default on SL4, while GCC4 is
the default on SL5).
The results differ as well: for the following code, GCC3 gives 0, while GCC4 gives 1.
#include <stdio.h>

typedef struct xxx { int ii; } XX;

int main()
{
   XX a;
   a.ii = 1;
   *(short *) &a.ii = 0;
   printf("%d\n", a.ii);
   return 0;
}
More dangerous is that compilers do not always warn about it. For example, the following code compiles without warnings even
when you use -Wall (including -Wstrict-aliasing), but the result changes depending on compile flags and compiler versions.

#include <stdio.h>
#include <string.h>
#include <malloc.h>

int main()
{
   int *ii = (int *) malloc(8);
   ii[0] = 1;
   ii[1] = 2;
   float *ff = (float *) ii;
   ff[0] = 3;
   ff[1] = 4;
   printf("%d %d\n", ii[0], ii[1]);
   return 0;
}
A safer way is to disable the -fstrict-aliasing compile flag. For example, you may change the compile flags for midas to "-O2 -fno-strict-aliasing".
The disadvantage is that the code may run somewhat slower.
The best way is to modify the code so that it obeys the strict aliasing rule.
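For example, the first snippet above can be made strict-aliasing-safe by copying the bytes
with memcpy instead of casting the pointer (a minimal illustration, not midas code):

   #include <stdio.h>
   #include <string.h>

   int main()
   {
      int   ii = 1;
      float ff = 2;

      memcpy(&ii, &ff, sizeof(ii));   /* well-defined, unlike *(float *)&ii = 2 */
      printf("%d\n", ii);
      return 0;
   }

This prints the same value with and without -fstrict-aliasing.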
Best regards |
07 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline
|
Hello,
I am having a problem with the root-based analyzer. It crashes when I try to
analyze multiple runs OFFLINE using the "-i run%05d.mid -o result%05d.root -r
1 2" feature.
I can reproduce the problem with the example experiment which comes with the
MIDAS distribution:
Running the analyzer ONLINE works fine: One can start and stop runs one after
the other, roody shows the histograms being reset and then filled again and
such.
But OFFLINE, the analyzer crashes when trying to analyze the SECOND run in a
sequence. So
./analyzer -i run%05d.mid -o result%05d.root -r 1 1 works (only run 1)
./analyzer -i run%05d.mid -o result%05d.root -r 1 3 dies on run 2
Output attached (I added printf's to the "init"-modules, but that's irrelevant
here)
My own analyzer shows the same effect. There I got the impression the segfault
happens on the first attempt to Fill/Reset/SetName etc. a histogram in the 2nd
run. But with the midas example it looks like the analyzer finishes filling
histos even for run 2, but then dies in eor.
Can you reproduce the problem?
I run MIDAS on an Intel Quadcore, 64 bit SuSE Linux 10.2.
pohl@lamb2:~/midas/examples/root> gcc --version
gcc (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux)
(maybe 4.1.2 "PRERELEASE" is the problem? See message ID 344)
I am using midas rev. 3674 (April 19, 2007), but I got the impression there
has since not been a change relevant to this problem. Please correct me if I
am wrong, then I would try it with Rev HEAD.
(My version includes already the fix to the x86_64 segfault problem of message
ID 337)
Best regards,
Randolf |
08 Jun 2007, Stefan Ritt, Forum, crash when analyzing multiple runs offline
|
Unfortunately I don't have time right now to debug the problem, but I could see
roughly what it could be. The analyzer crashes inside CloseRootOutputFile:
#5 <signal handler called>
#6 0x00002b5f52ad5ee5 in free () from /lib64/libc.so.6
#7 0x000000000040c89b in CloseRootOutputFile () at src/mana.c:1489
in the line
free(tree_struct.event_tree[i].branch);
If a "free" crashes, it might indicate that the memory beyond the allocated space
got corrupted. The branch gets allocated in book_ttree(), once for each
analyze_request[i]. The branch gets filled in write_event_ttree():
/* fill tree both online and offline */
if (!exclude_all)
et->tree->Fill();
Maybe one should put printf debugging statements in these places to see what's
going on. |
09 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline
|
Hello Stefan,
tree_struct.n_tree keeps counting up from run to run (in book_ttree). This should
presumably not be the case, since CloseRootOutputFile() frees the trees at eor().
------------------- output ---------------------------
lamb@lamb2:~/midas/root_3705> ./analyzer -e
exa_root -i /tmp/midas/examples/root/run%05d.mid -o /tmp/midas/run%05d.root -r 1 2
Root server listening on port 9090...
Running analyzer offline. Stop with "!"
book_ttree: tree_struct.n_tree = 1
book_ttree: tree_struct.n_tree = 2
Set run number 1 in ODB
Load ODB from run 1...OK
/tmp/midas/examples/root/run00001.mid:2722 /tmp/midas/run00001.root:2720 events,
0.21s
book_ttree: tree_struct.n_tree = 3 <<---- !!!!
book_ttree: tree_struct.n_tree = 4
Set run number 2 in ODB
Load ODB from run 2...OK
/tmp/midas/examples/root/run00002.mid:2347 /tmp/midas/run00002.root:2345 events,
0.18s
*** Break *** segmentation violation
----------------- \output ----------------------------
Adding this one line fixes the segfault problem for the root example expt.
----------------- code -------------------------
lamb@lamb2:/data/software/midas/midas_3705/src/src> svn diff mana.c
Index: mana.c
===================================================================
--- mana.c (revision 3705)
+++ mana.c (working copy)
@@ -1496,6 +1496,7 @@
/* delete event tree */
free(tree_struct.event_tree);
tree_struct.event_tree = NULL;
+ tree_struct.n_tree = 0;
// go to ROOT root directory
gROOT->cd();
---------------- \code ---------------------------
Please check if this gives the intended behaviour. I am not very familiar with the
midas internals.
Unfortunately my own analyzer's segfault problem is not solved by this patch. I
guess I have to keep searching for a bug on my side..... :-)
Cheers,
Randolf |
10 Jun 2007, Stefan Ritt, Forum, crash when analyzing multiple runs offline
|
> tree_struct.n_tree keeps counting up from run to run (in book_ttree). This should
> presumably not be the case, since CloseRootOutputFile() frees the trees at eor().
Yes, this is indeed a bug. I applied your change and committed the new code. |
11 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline
|
Hello again,
just for the record, in case somebody else runs into the same problem...
I have hunted down "my" segfault problem to the fact that I book histograms not
in <module>_init, but in <module>_bor. I have to do so, because only in bor do I
know which histograms to book, as this information comes from the ODB (booking
only histograms for CAMAC modules which were set to "read" in the ODB). The core
dump happens on the first access (->Fill, ->SetName,...) of one of these histos
in the 2nd run analyzed offline ("./analyzer -r n m").
In mana.c:bor (line 1854) it is stated that "all ROOT objects created by user module
bor() functions go to the output file", and then a gManaOutputFile->cd() is done.
Consequently, the histograms vanish after the file is closed, therefore the
segfault when trying to access them in the 2nd run. (I keep track of existing
histograms, only booking the missing histos in bor.)
The problem goes away with "gROOT->cd()" in <module>_bor, before fiddling with
TFolders and booking the histogram.
I do, however, not really understand the intention why histos booked in bor() go
to only the file, whereas histos booked in init() go to memory. Could you please
comment briefly? Maybe I missed the most important point. And what about online
mode, should this work?
Thanks a lot in advance,
Randolf |
11 Jun 2007, Stefan Ritt, Forum, crash when analyzing multiple runs offline
|
> I have hunted down "my" segfault problem to the fact that I book histograms not
> in <module>_init, but in <module>_bor. I have to do so, because only in bor do I
> know which histograms to book, as this information comes from the ODB (booking
> only histograms for CAMAC modules which were set to "read" in the ODB). The core
> dump happens on the first access (->Fill, ->SetName,...) of one of these histos
> in the 2nd run analyzed offline ("./analyzer -r n m").
>
> In mana.c:bor (line 1854) it is stated that "all ROOT objects created by user module
> bor() functions go to the output file", and then a gManaOutputFile->cd() is done.
> Consequently, the histograms vanish after the file is closed, therefore the
> segfault when trying to access them in the 2nd run. (I keep track of existing
> histograms, only booking the missing histos in bor.)
>
> The problem goes away with "gROOT->cd()" in <module>_bor, before fiddling with
> TFolders and booking the histogram.
ROOT has the strange concept of "current working directory", coming from the fact that
ROOT was written by Fortran and PAW people, who were used to having directories and
subdirectories with a persistent state (not really object-oriented style). So one can
set the "current working directory" to the root (=memory) with gROOT->cd() and to a
subdirectory which will later be written into a file with gManaOutputFile->cd(). If
you do the first one, the histograms are created only in memory, while in the latter
case they are also created in memory, but will later be written into the output file
in the routine CloseRootOutputFile(). So if you do a gROOT->cd() in <module>_bor,
these histograms will not be written to file. So I guess your solution is not a real
solution.
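In other words (a sketch of the two choices; gManaOutputFile is the output TFile kept by mana.c):

   gROOT->cd();             // book here: histo lives only in memory,
   h1 = h1_book(....);      // it is never written by CloseRootOutputFile()

   gManaOutputFile->cd();   // book here: histo is still created in memory,
   h2 = h1_book(....);      // but gets written to the file in CloseRootOutputFile()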
> I do, however, not really understand the intention why histos booked in bor() go
> to only the file, whereas histos booked in init() go to memory. Could you please
> comment briefly? Maybe I missed the most important point. And what about online
> mode, should this work?
The root output file is opened in bor() and closed in eor(). For a histo to go to the
file, it must be booked after opening the file, that is after bor() in mana.c and
therefore after the gManaOutputFile->cd().
I agree with you that the current scheme is not satisfactory. When running online, you
want to keep the histos between the runs. When running offline, you delete and
re-create them for each run. It would be better to create all histos online and
offline under gROOT, and just copy them to gManaOutputFile before writing them. I have
to admit that this root code was never really used in a productive environment for
offline analysis, so there might be some issues here and there. Some people write
ROOT files directly in the logger, and then do a ROOT-only (without the midas
analyzer) analysis. Unfortunately I'm busy these days and cannot write any code right
now. But if you feel like something should be modified in mana.c, please send it to me
and I can incorporate it into the standard code. |
12 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline
|
Hi
> So I guess your solution is not a real solution.
I was not precise enough on what I do. This way the histograms persist in memory, but
they are also written to every file:
e.g. in module "trig_tdc":
TDirectory *savedir = gDirectory; // will restore this afterwards
gROOT->cd(); // go to the ROOT root directory (memory)
// make sure we are in the right "analyzer module folder"
TDC_Folder = (TFolder *) gROOT->FindObjectAny("trig_tdc");
gHistoFolderStack->Add((TObject *) TDC_Folder);
...(loop over all TDCs, figure out which histos exist, and which need to be booked)
open_subfolder("raw4208");
hrTDC = h1_book(....); // create histo in memory, but it shows up in the file, too.
close_subfolder(); //raw4208
// restore gHistoFolderStack (we added a folder when entering routine)
gHistoFolderStack->Remove(gHistoFolderStack->Last());
// restore current directory
savedir->cd();
When deleting histos I do:
gManaHistosFolder->RecursiveRemove(*pHisto);
(*pHisto)->Delete();
(*pHisto) = NULL; // for my book-keeping of existing histos.
You don't have to clear the histos explicitly between runs. gManaHistosFolder does this
magic for you.
> But if you feel like something should be modified in mana.c, please send it to me
> and I can incorporate it into the standard code.
No, the code is fine. I just wanted to explain my problem and a solution to it, because
I thought that somebody might run into the same problem, too.
Ciao,
Randolf |
22 May 2007, Randolf Pohl, Bug Report, analyzer_init called by odb_load
|
Hi,
I wonder why mana.c:odb_load() calls analyzer_init(). This way analyzer_init
is called TWICE or more times:
first from mana.c:mana_init(), for each invocation of the analyzer, and
second from mana.c:odb_load(), for each run to be analyzed
Isn't this a bug? It can mess up several things (like mallocs) if you don't
take the necessary precautions. Other module_init functions are correctly
called only once, before all runs are analyzed.
I have the feeling, that odb_load should NOT call analyzer_init. Or am I wrong
(probably, but please explain to me)? Do I have to live with it and make sure
that my beautiful global initialization in analyzer_init is only done once?
:-)
Cheers,
Randolf
And here is the annotated log using the ROOT example experiment
(several modules changed/added to print their respective names)
:~/midas/examples/root> ./analyzer -e exa_root -i run%05d.mid -r 1 3
analyzer_init <-- ok
Root server listening on port 9090...
adc_calib_init <-- ok
adc_summing_init <-- ok
scaler_init <-- ok
Running analyzer offline. Stop with "!"
Set run number 1 in ODB
Load ODB from run 1...
analyzer_init <-- not ok, or is it?
OK
run00001.mid:777 events, 0.00s
Set run number 2 in ODB
Load ODB from run 2...
analyzer_init <-- not ok, or is it?
OK
run00002.mid:7227 events, 0.03s
Set run number 3 in ODB
Load ODB from run 3...
analyzer_init <-- not ok, or is it?
OK
run00003.mid:13866 events, 0.06s
adc_calib_exit
adc_summing_exit
scaler_exit
analyzer_exit |
22 May 2007, Stefan Ritt, Bug Report, analyzer_init called by odb_load
|
The reason to call analyzer_init in odb_load is the following:
Assume you run the analyzer offline, analyzing many files in series. Then assume
that you have /Experiment/Run Parameters, which is actively used by the analyzer
(like beam settings etc.). In this case you do a db_open_record() to map
/Experiment/Run Parameters to the exp_param C structure. For this mapping to work,
the ODB structure and the C structure have to be exactly the same. Now assume that
you changed your run parameters over time, like you added some comment later. Now
you want to analyze several runs, some before and some after the modification.
Both sets have a different structure in /Experiment/Run Parameters, which is a
problem, since the compiled analyzer can only have a single C structure. My "poor"
solution was to call analyzer_init after each loading of the ODB from the *.mid
file. The db_create_record() call matches the C structure to the ODB structure by
modifying the ODB structure if necessary. So if you added one parameter later, this
(modified) structure gets loaded by odb_load, but then it gets adjusted in
analyzer_init().
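For context, the mapping in question is of this form (a sketch; EXP_PARAM, EXP_PARAM_STR and
exp_param come from the experiment's generated experim.h and analyzer.c, so the exact names
here are assumptions):

   EXP_PARAM exp_param;
   EXP_PARAM_STR(exp_param_str);
   HNDLE hDB, hKey;

   cm_get_experiment_database(&hDB, NULL);

   /* adjust the ODB structure to match the compiled C structure ... */
   db_create_record(hDB, 0, "/Experiment/Run Parameters", strcomb(exp_param_str));

   /* ... then map the ODB subtree onto the C structure */
   db_find_key(hDB, 0, "/Experiment/Run Parameters", &hKey);
   db_open_record(hDB, hKey, &exp_param, sizeof(exp_param), MODE_READ, NULL, NULL);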
I understand now that this case might not happen so often, and you are more
bothered by the fact that analyzer_init gets called several times. There must
however be a hook in offline analysis so that the user code can correct the ODB
structure. So I propose to add a flag to analyzer_init, such as
INT analyzer_init(BOOL bFirst)
{
}

If bFirst equals TRUE, the function got called from mana_init(), if FALSE, it got
called from odb_load. Then you can put code like

INT analyzer_init(BOOL bFirst)
{
   if (bFirst) {
      p = malloc()
      ...
   }
}
If you agree, I will modify the code and commit the change.
- Stefan |
22 May 2007, Randolf Pohl, Bug Report, analyzer_init called by odb_load
|
Thanks for the quick reply, Stefan.
Please don't change anything in the code unless you find it really important. I guess
changing the analyzer_init prototype will break a lot of code out there?
In fact, I think I do understand this behavior now.
And even without your suggested fix there is a simple workaround: I add a static
variable to my analyzer_init.cxx file, and do something similar to your bFirst fix.
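The workaround is of this form (hypothetical user code in analyzer_init.cxx, not part of the
midas distribution):

   INT analyzer_init()
   {
      static BOOL first_call = TRUE;

      if (first_call) {
         first_call = FALSE;
         /* one-time initialization: mallocs, global setup, ... */
      }

      /* per-run ODB adjustments (safe to repeat) go here */
      return SUCCESS;
   }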
In conclusion, commit your fix if it does not harm others. Postpone this commit to a
future new version of midas which breaks a lot of things anyway...
A last question, for me to understand: Why not call db_open_record in
ana_begin_of_run then?
Cheers,
Randolf |
22 May 2007, Stefan Ritt, Bug Report, analyzer_init called by odb_load
|
> Thanks for the quick reply, Stefan.
>
> Please don't change anything in the code unless you find it really important. I guess
> changing the analyzer_init prototype will break a lot of code out there?
>
> In fact, I think I do understand this behavior now.
> And even without your suggested fix there is a simple workaround: I add a static
> variable to my analyzer_init.cxx file, and do something similar to your bFirst fix.
>
> In conclusion, commit your fix if it does not harm others. Postpone this commit to a
> future new version of midas which breaks a lot of things anyway...
>
> A last question, for me to understand: Why not call db_open_record in
> ana_begin_of_run then?
I fully agree with you that db_open_record would better go into ana_begin_of_run (and
analyzer_init not being called in odb_load), and I fully agree with you that changing the
code would break many experiments. ;-)
So I guess we leave it as it is right now as you suggested. |
21 May 2007, Konstantin Olchanski, Info, mhttpd changes to use /History/Tags data
|
I am slowly committing the changes to the history code. This installment adds
code to mhttpd to use the /History/Tags data (to be) generated by the mlogger.
In a nutshell, the logger fills /History/Tags to "remember" what events,
variables and tags exist in the history files.
This replaces the old code that attempts to guess the contents of history files
by looking at the /Equipment tree.
To ease the transition to the new system, I am leaving all the old code alive
and active in the absence of "/History/Tags" entries.
As soon as one starts using the new mlogger (to be committed), the new tags-based
mhttpd code will activate itself.
K.O. |
09 May 2007, Carl Metelko, Forum, Splitting data transfer and control onto different networks
|
Hi,
I'm setting up a system with two networks with the intention of having
control info (odb, alarm) on the 192.168.0.x
and the frontend readout on 192.168.1.x
Is there any easy way of doing this?
I'm also trying to separate processes onto different machines, is there
any way to not have mserver,mhttpd and (mlogger,mevt) all run on the same
machine?
Thanks,
Carl Metelko |
09 May 2007, Stefan Ritt, Forum, Splitting data transfer and control onto different networks
|
Hi Carl,
so far I have not experienced any problems running ODB & alarm traffic on the same link as
the readout, since the data usually goes frontend->backend, and all other messages
from backend->frontend. So before you do something complicated, try it first the
easy way and check if you have problems at all. So far I don't know anybody who
has separated the network interfaces, so I have no description for that.
You can however separate processes. The easiest way is to buy a multi-core machine. If
you want to use separate computers, note that receiving events over the
network is not very optimized. So you should run the mserver connected to the frontend,
the event builder and mlogger on the same machine. mhttpd can easily live on
another machine, and there is not much CPU consumption from it (unless you
plot long history trends). Running mserver, the event builder and mlogger on the
same machine (dual Xeon mainboard) gave me easily 50 MB/sec (actually disk
limited), and not both CPUs were near 100%. If you put any receiving process (like
the event builder, mlogger or the analyzer) on a separate machine, you might see
a bottleneck on the event receiving side of maybe 10 MB/sec or so (never really
tried recently).
Best regards,
Stefan
> Hi,
> I'm setting up a system with two networks with the intention of having
> control info (odb, alarm) on the 192.168.0.x
> and the frontend readout on 192.168.1.x
>
> Is there any easy way of doing this?
> I'm also trying to separate processes onto different machines, is there
> any way to not have mserver,mhttpd and (mlogger,mevt) all run on the same
> machine?
> Thanks,
> Carl Metelko |
09 May 2007, Konstantin Olchanski, Forum, Splitting data transfer and control onto different networks
|
> I'm setting up a system with two networks with the intension of having
> control info (odb, alarm) on the 192.168.0.x
> and the frontend readout on 192.168.1.x
We have some experience with this at TRIUMF - in the TWIST experiment we run the main data-generating
frontends on a private network - it is a supported configuration and it works fine.
We ran into one problem after adding some code to the frontends for stopping the run upon detecting
some data errors - stopping runs requires sending RPC transactions to every midas client, so we had to
add static network routes for routing packets between midas nodes on the private network and midas
nodes on the normal network.
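For reference, the routes are of this general form (the addresses follow the example at the
top of this thread and are otherwise made up; the gateway is the machine sitting on both networks):

   # on a midas node on the 192.168.0.x network, reach the private frontend subnet
   route add -net 192.168.1.0 netmask 255.255.255.0 gw 192.168.0.1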
> I'm also trying to separate processes onto different machines, is there
> any way to not have mserver,mhttpd and (mlogger,mevt) all run on the same machine?
mserver runs on the machine with the ODB shared memory by definition (think of it as "nfs server").
mhttpd typically runs on the machine with the ODB shared memory and until recently it had no code for
connecting to the mserver. I recently fixed some of it, and now you can run mhttpd in "history mode"
through the mserver. This is useful for offloading the generation of history plots to another cpu or
another machine. In our case, we run the "history mhttpd" on the machine that holds the history files.
mlogger could be made to run remotely via the mserver, but presently it will refuse to do so, as it has
some code that requires direct access to midas shared memory. If data has to be written to a remote
filesystem, the consensus is that it is more efficient to run mserver locally and let the OS handle remote
filesystem access (NFS, etc).
All other midas programs should be able to run remotely via the mserver.
K.O. |
14 May 2007, Carl Metelko, Forum, Splitting data transfer and control onto different networks
|
Hi,
thanks for the advice. We do have dual core Xeons so we'll try running
most things on the server. Unless it proves to be a problem we'll run all
MIDAS signals on one network and NFS etc on the other.
I do have one more query about running systems like Konstantin's.
What we would like to do is have a 'mirror' server serving multiple
online monitoring machines, so that the load on the main server is constant no matter
what the demands on the mirror are.
Is there a way to set this up? Or would it be best to have a remote analyser
making short (1min) root files shared with the online monitoring? |
10 May 2007, Konstantin Olchanski, Bug Fix, Fix error reporting from cm_transition()
|
For some time now, error reporting from cm_transition() has been broken.
The typical symptom: when starting a run from mhttpd and a transition error occurred, the run did not
start (good) but the user was presented with a message "Success" in big letters (confusing the user).
Part of the problem was caused by user-written frontends that return an empty error string. Code in
cm_transition() now detects this and shows the numeric value of the error status returned by the frontend.
This is fixed in revision 3681.
The error string "Success" is now returned only when cm_transition() was successful, and other error
reporting inside this function was cleaned up.
K.O. |
10 May 2007, Konstantin Olchanski, Bug Fix, mhttpd: fix broken boolean arrays in "edit on start"
|
For some time now, boolean arrays did not work correctly in "/experiment/edit on start". This is now fixed
in rev 3680. K.O. |
10 Apr 2007, Dan Gastler, Forum, Interrupt code for VME?
|
Hello,
Is there any example code for using midas for interrupt driven data
collection over VME? I am using a Struck SIS3100 PCI/VME setup to connect to my
VME crate. Thanks,
-Dan |
09 Apr 2007, Konstantin Olchanski, Info, move history, elog and alarm functions into separate files
|
As approved by Stefan, I moved the history (hs_xxx), alarm (al_xxx) and elog (el_xxx) functions out of
midas.c into separate files. Committed as revision 3665. This change should be transparent to all users.
K.O. |
02 Apr 2007, Exaos Lee, Bug Fix, SIGABT of "mlogger" and possible fix
|
Version: svn 3658
Code: mlogger.c
Problem: After execution of "mlogger", a SIGABRT appears.
Compiler: GCC 4.1.2, under Ubuntu Linux 7.04 AMD64
Possible fix:
Change the code in "mlogger.c" from
/* append argument "-b" for batch mode without graphics */
rargv[rargc] = (char *) malloc(3);
rargv[rargc++] = "-b";
TApplication theApp("mlogger", &rargc, rargv);
/* free argument memory */
free(rargv[0]);
free(rargv[1]);
free(rargv);
to
/* append argument "-b" for batch mode without graphics */
rargv[rargc] = (char *) malloc(3);
rargv[rargc++] = "-b";
TApplication theApp("mlogger", &rargc, rargv);
/* free argument memory */
free(rargv[0]);
/*free(rargv[1]);*/
free(rargv);
I think it might be a problem with 'rargv[rargc++] = "-b"'. You may try the following test program:
#include <stdio.h>
#include <string.h>   /* for strcpy() */
#include <malloc.h>

int main(int argc, char **argv)
{
   char *pp;
   pp = (char *) malloc(sizeof(char) * 3);
   /* pp = "-b"; */
   strcpy(pp, "-b");
   printf("PP=%s\n", pp);
   free(pp);
   return 0;
}
If using "pp=\"-b\"", a SIGABRT appears. |
03 Apr 2007, Stefan Ritt, Bug Fix, SIGABT of "mlogger" and possible fix
|
Exaos Lee wrote: | Version: svn 3658
Code: mlogger.c
Problem: After execution of "mlogger", a SIGABRT appears.
Compiler: GCC 4.1.2, under Ubuntu Linux 7.04 AMD64
Possible fix:
Change the code in "mlogger.c" from
/* append argument "-b" for batch mode without graphics */
rargv[rargc] = (char *) malloc(3);
rargv[rargc++] = "-b";
TApplication theApp("mlogger", &rargc, rargv);
/* free argument memory */
free(rargv[0]);
free(rargv[1]);
free(rargv);
to
/* append argument "-b" for batch mode without graphics */
rargv[rargc] = (char *) malloc(3);
rargv[rargc++] = "-b";
TApplication theApp("mlogger", &rargc, rargv);
/* free argument memory */
free(rargv[0]);
/*free(rargv[1]);*/
free(rargv);
I think it might be a problem with 'rargv[rargc++] = "-b"'. |
Actually the line
rargv[rargc] = (char *) malloc(3);
also needs to be removed, since rargv[1] points to "-b", which is static memory and does not need any allocation. I committed the change.
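Presumably the committed code now reads roughly like this (a sketch, not checked against the repository):

   /* append argument "-b" for batch mode without graphics */
   rargv[rargc++] = "-b";
   TApplication theApp("mlogger", &rargc, rargv);
   /* free argument memory */
   free(rargv[0]);
   free(rargv);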
03 Apr 2007, Stefan Ritt, Info, Switch to Visual C++ 2005 under Windows
|
I had to switch to Visual C++ 2005 under Windows. This required the upgrade of
all project files under \midas\nt\ and fixing a few warnings, since the new
compiler is more picky.
Note that in order to use most C RTL functions, you have to define two
preprocessor statements:
#define _CRT_SECURE_NO_DEPRECATE
#define _CRT_NONSTDC_NO_DEPRECATE
either at the beginning of a file (before you include stdio.h), or via the
project property page under C/C++ / Preprocessor / Preprocessor Definitions,
where you also have the WIN32 and the _CONSOLE definitions. I adapted all
project files in the distribution, but for all local projects this has to be
done additionally. |
23 Feb 2007, Konstantin Olchanski, Info, RFC- history system improvements
|
While running the ALPHA experiment at CERN, we stressed and broke the MIDAS history system. We
generated about 0.5 GB of history data per day, and this killed the performance of the history plot
system in mhttpd - we had to wait for *minutes* to look at any plots of any variables.
One way to address this problem could be by changing the way ALPHA slow controls data is collected.
Another way to address this problem could be by improving the midas history system by removing
some of the existing limitations and inefficiencies, enabling it to handle the ever increasing data
volumes we keep throwing at it.
I feel the second approach (improving midas) is more useful in general and it appears that big
improvements can be made by small modifications of existing code. No rewrites of midas are required.
Read on.
Issue 1: in the mlogger, history is recorded with fairly coarse granularity.
For an equipment, if any variable changes, *all* variables for that equipment are written into the history
file.
Historically, this worked fairly well for experiments with low data rates (a few history changes per
minute) and with variables equally distributed between different equipments. But even for a modest
sized experiment like TRIUMF-E614-TWIST, recording many variables when only one has changed has
been a visible inefficiency. Current experiments wish to record more history data more frequently, but
even with latest and greatest hardware, in the case of ALPHA, this inefficiency has become a
performance killer.
One could solve this problem by refactoring the data (one variable per equipment/one equipment per
variable). I find this approach inelegant and contrary to the "midas way" (whatever that is).
An alternative would be to change the mlogger to record history with per-variable granularity. When
one variable changes, only that variable is recorded. Preliminary examination of the existing code
indicates that history writing in the mlogger is already structured in a way that makes it easy to
implement, while the history reading code does not seem to need any changes at all.
Issue 2: all history data is recorded into a single file.
Again, this has worked well historically. In fact, until not so long ago, it was the only sane way to record
history data because operating systems could not efficiently write data into multiple files at the same
time. Insufficient data buffering, suboptimal storage allocation strategies - all leading to bad
performance. Latest Linux kernels have largely resolved all such issues.
The present problem arises when recording large amounts of history data (say 100 variables) and then
making a history plot of 1 variable. Because data for the one variable of interest is spread across the
whole file, effectively, the whole file has to be read into memory, data for 1 variable collected and data
for the other 99 variables skipped.
In this case, a speed up by a factor of 100 could be obtained by recording (say) one variable per history
file. (Yes, the history code does use "lseek", but the seek granularity of modern disks is very coarse and
in my tests, reading the whole file (streaming) is almost faster than seeking through it).
One has to be very careful when looking at these numbers and running benchmarks. Modern computers
with fast disks and large RAM performs very well no matter how history data is stored and organized.
Performance problems surface only under the load when running the production system, when the
disks are busy recording the main data stream and all RAM is consumed by user applications doing
data analysis.
The obvious solution to this problem is to record each variable into a separate data file. This will
require modifications to the history writing code in the mlogger and to the history reading code in
mhttpd, mhist & co.
An extra challenge in this task is to minimize changes to the existing code and to keep compatibility
with the existing data files - new code should be able to read existing data files.
I propose to organize data into subdirectores:
history/equipmentNNN/variableVVV/YYMMDD.hst
This scheme does two good things for the history plotting in mhttpd:
1) note that mhttpd always plots one variable at a time, and the variables are addressed by equipment
(int) and variable name (string) (plus the array index). In the proposed scheme, the code would know
exactly which history file to open to get the data, no scanning of directories or seeking inside the
history file.
2) when setting up mhttpd history plots, the code can easily see what equipment and variables exist
and *ever existed*. The present code only examines the latest history file and cannot see variables that
have been deleted (or not yet written into the existing file). For example, one cannot see variables that
existed in the 2005 history but were removed (or renamed) in 2006. (Yes, it can be done by an expert
using mhist to examine the 2005 history files and odbedit to manually setup the history plots).
Over the next few weeks, I will proceed with implementing these two improvements: (1) make the mlogger write
history with per-variable granularity; (2) split the history file into one file per variable. If my initial
assessment is correct and the changes indeed are small, contained, non-intrusive and compatible with
existing history files, I will submit them for inclusion into mainline midas.
K.O. |
26 Feb 2007, Stefan Ritt, Info, RFC- history system improvements
|
I agree with what you propose. I'm pretty sure you are right that you will get a significant improvement in readout speed
of the history system. So far there was no big request for improving the history system, since the performance in
the experiments I was involved in was good. In MEG for example, we have ~20MB of history data per day, and all
plots even going back some months can be made in a couple of seconds. Have a look for example at
http://midas.psi.ch/megon00/HS/PCS/Pressures.gif?hscale=1843200&hoffset=-5068800
This plot stretches over two weeks and involves ~500 MB of history data, and is prepared in a couple of seconds.
The key question here is how big the disk cache of the OS is. The above plot does not read all 500 MB, but skips
many data points in order to obtain ~1000 data points (one per pixel) for the requested period. To find these
data points, it reads and scans the history index files (yymmdd.idx), which are only a few percent of the
yymmdd.hst data files. The index file contains only the time stamp, the event id and the location of the event in
the *.hst file. Scanning the index file is as efficient as scanning a history file with a single variable. Now
comes the access of the history file. For ~1000 data points, 1000 locations have to be read. This requires
reading in the FAT table for the history file and accessing the sector clusters containing the data. In worst
case one has to read 1000 clusters. With a cluster size of 2kB this will be 2MB of data, something which can be
read very quickly. On the MEG system I observe that the first history plot takes about 5 seconds, while all
consecutive plots take about 1 second. This indicates that the FAT information is cached by the OS. This depends
of course as you indicated correctly on how much memory is available for disk caching, how many processes are
running etc. and will finally determine how fast your history access will be.
So if you implement your proposed new scheme, please consider the following:
- Scanning a single variable file is about the same as scanning the current index file. You save however the
access to the data file. If you plot several variables together, you have to access several "single variable
files", so your access time scales with the number of variables. In the current system, it's likely that
different variables from the same event are located in the same cluster. So you have to read the history file
once for each variable, but after the first variable the sectors of interest are very likely cached by the OS. So
I would estimate that the break-even point is about 2-3 variables. I mean if you read more than three variables,
your proposed method might get slower than the current one. This is of course not the case if there are very many
events in the history file. In that case the index file might be much bigger, since it gets a new entry if *any*
variable in an event changes. If all index files together are bigger than your disk cache, the system will become
slow (and I guess that's what you see). In MEG, the index file is about 1MB per day, so a few weeks fit easily
into the disk cache.
- In order not to get too much data, the history system needs fine tuning. Each slow control system class driver
has an "update threshold", which is used to determine if a variable has "changed". For some noisy channels, it
might be worth setting the threshold at 3 sigma of the noise level (RMS). This can reduce your history data
dramatically. For some equipment, you might even consider defining a minimum update period. This is done via
"/Equipment/<name>/Common/Log history". If that variable is set to 10, the time between two consecutive history
records is at least 10 seconds. For some temperatures for example it might make sense to set this even to one
minute or so, depending on how fast your temperatures change.
- If you implement a per-variable history, you probably have to use the per-event hot link in the ODB. Otherwise
you would exceed the number of hot links MAX_OPEN_RECORDS which is currently 256. If you then get a hot link
update, you have to check manually which variable(s) have changed in log_history() in mlogger.c
- Before you actually go and implement the full system, I would write some small test code to "simulate" the new
scheme. Write some dummy files with the full data you expect in the ALPHA experiment and see what the improvement
is under realistic conditions. Only if you see a big improvement is it worth implementing the full code. Test this
on various machines to get a better overview. Maybe it's worth testing different file systems and cluster sizes as
well.
- If there is an improvement, I'm more than happy to replace the current history code in midas. It might however
not be clean to have a heterogeneous history system, where some files are in the old format and some in the new.
It might be better to write a little conversion routine which converts the old format into the new one, even
omitting records where single variables did not change. This conversion could even be put into the standard
mlogger code and executed automatically if the logger is started first and finds some old data files.
Even if the speed improvement is not so big, one will certainly win a lot on disk file size (like if only one
variable out of 100 changes). This will probably make it worth implementing anyhow. |
16 Mar 2007, Konstantin Olchanski, Info, RFC- history system improvements
|
> Let's improve the midas history system...
After implementing 2 prototypes, one aspect of the new design is starting to firm up enough to write it down (I do so in a mock FAQ format).
Q. I ran an experiment at triumf, returned home and now I have a bunch of midas history files (*.hst) on my laptop. How do I export these history
data to some useful format?
A. Run "mhdump *.hst | import_to_sql.perl" or "mh2ttree -o history.root *.hst" (export to mysql or ROOT TTree respectively). (TBW:
import_to_sql.perl and mh2ttree)
Q. I have all these midas history files (*.hst), how do I look at them with mhttpd?
A. Follow these steps:
1) setup a blank experiment (no frontends, no analyzer, no mlogger), make sure you can run odbedit and mhttpd.
2) put (symlink) the history files into the history (data) directory
3) run "mhdump -t *.hst > tags.cmd"
4) run "odbedit -c @tags.cmd"
5) start mhttpd, go to the "history" page, setup history plots
6) look at history plots as usual
As always, all the cool stuff is happening behind the scenes:
- in step (3) and (4) we create ODB entries for all events and tags in the history files:
/history/tags/2 = "Trigger" <--- declare event 2 "Trigger" (was equipment "Trigger" while we were taking data)
/history/tags/2:Rate = 1 <--- declare tag "Rate" as an array of one element
/history/tags/2:Scalers = 10 <--- declare tag "Scalers" as an array of 10 elements
... and so forth for each event and tag that ever existed in the history files.
When running a live experiment, the /history/tags entries are created by the mlogger.
- in step (5), the history plot setup page reads the names of history events and tags from /history/tags. The existing code for extracting the
names of events and tags from the /equipment tree goes away. The variables part of history plots are saved the same way as now, i.e.
"Trigger:Rate" and "Trigger:Scalers[3]" - existing plot definitions continue working as before.
- in step (6), to plot the variable named "Trigger:Scalers[3]", the mhttpd code again reads /history/tags to find out that "Trigger" corresponds to
event id 2 and "Scalers" is a valid array (of size 10). This is enough to call hs_read() with the correct arguments to read the existing .hst files - the
existing code will even regenerate the .idx and .def history files.
How do existing experiments migrate to the new code? It is all automatic, no user actions needed. For writing history files, there are no changes.
For reading history files, the "new mhttpd" expects to find /history/tags, which will be created automatically by the "new mlogger".
I am presently cleaning up the implementation of this idea in mhttpd and in the mlogger (only those 2 files are affected - 2 functions in mhttpd.c
and 1 function in mlogger.c) and after some testing it will be ready for committing to midas svn.
The next step would be changes in mlogger.c for recording the history for each variable separately (each variable gets its own event id). I have
this implemented, but interaction with mhttpd is still in flux and I may want to run the new code at CERN for a few months before I deem it stable
enough for general use.
K.O. |
06 Mar 2007, Konstantin Olchanski, Info, commited mhttpd fixes & improvements
|
I committed the mhttpd fixes and improvements to the history code accumulated while running the ALPHA
experiment at CERN:
- fix crashes and infinite loops while generating history plots (also seen in TWIST)
- permit more than 10 variables per history plot
- let users set their own colours for variables on history plot
- (finally) add GUI elements for setting minimum and maximum values on a plot
- implement special "history" mode. In this mode, the master mhttpd does all the work, except for
generating of history plots, which is done in a separate mhttpd running in history mode, possibly on a
different computer (via ODB variable "/history/url").
I also have improvements to the mhttpd elog code (better formatting of email) and to the "export history
plot as CSV" function, which I will not be commiting: for elog, we switched to the standalone elogd; and
CSV export is still very broken, even with my fixes.
The committed fixes have been in use at CERN since last Summer, but I could have introduced errors
during the merge & commit. I am now using this new code, so any new errors should surface and get
squashed quickly.
K.O. |
27 Feb 2007, Piotr Zolnierczuk, Forum, event builder scalability
|
Hi there:
I have a question: is there anybody out there running MIDAS with an event builder
that assembles events from more than just a few front-ends (say on the order of
0x10 or more)?
Any experiences with scalability?
Cheers
Piotr |
27 Feb 2007, Stefan Ritt, Forum, event builder scalability
|
> Hi there:
> I have a question: is there anybody out there running MIDAS with an event builder
> that assembles events from more than just a few front-ends (say on the order of
> 0x10 or more)?
> Any experiences with scalability?
At the MEG experiment at PSI we run with 5 front-ends (later 8), each running at
about 10 MB/sec. This gives an overall rate of 50MB/sec without any problem. The
CPU load on the backend (2.6 GHz dual Xeon) is 30% for the event builder and 26%
for the logger. The DANCE experiment at Los Alamos runs 17 front-ends if I'm not
mistaken (John?). |
27 Feb 2007, John M O'Donnell, Forum, event builder scalability
|
At Los Alamos, we have 15+1 frontends - the 15 between them read about 2 or 3
TB/hour and reduce it to 1 to 5 GB/hour which is then sent to the mevb on a 17th
computer. The 16th frontend handles deadtime issues and scalers (small data rate).
The frontends are 1 GHz Pentium 3s, and the backend is a 2.8 GHz dual-CPU machine with hyperthreading.
The interconnect is 100 Mb ethernet from the frontends to a switch, and 1 Gb ethernet from the
switch to the backend.
Our bottleneck is (a) the compactPCI backplane reading data from the waveform digitizers
to the frontend CPUs and (b) CPU power on the frontend CPUs to analyze the waveforms.
John |
27 Feb 2007, Stefan Ritt, Forum, event builder scalability
|
> Our bottleneck is (a) the compactPCI backplane reading data from the waveform digitizers
> to the frontend CPUs and (b) CPU power on the frontend CPUs to analyze the waveforms.
I forgot to mention that our front-ends at MEG are 2.8 GHz dual Xeon with hyperthreading.
This gives 4 "virtual" CPU cores, which are really necessary for waveform calibration and
analysis. It makes use of the new multi-threading feature in the midas front-end. I actually
run 7 threads (one VME readout, 4 calibration threads, one encoding thread and the
main thread sending data to the backend). This speeds up data taking by a factor of four
compared to a single thread. So if one plans for waveform analysis in the frontend to
reduce the data, I would recommend a box with dual quad cores. |
02 Mar 2007, Kevin Lynch, Forum, event builder scalability
|
> Hi there:
> I have a question: is there anybody out there running MIDAS with an event builder
> that assembles events from more than just a few front-ends (say on the order of
> 0x10 or more)?
> Any experiences with scalability?
>
> Cheers
> Piotr
Mulan (which you hopefully remember with great fondness :-) is currently running
around ten frontends, six of which produce data at any rate. If I'm remembering
correctly, the event builder handles about 30-40MB/s. You could probably ping Tim
Gorringe or his current postdoc Volodya Tishenko (tishenko@pa.uky.edu) if you want
more details. Volodya solved a significant number of throughput related
bottlenecks in the year leading up to our 2006 run. |
03 Mar 2007, Piotr Zolnierczuk, Forum, event builder scalability
|
Hi all,
thank you for all responses.
It seems that there's no problem running MIDAS with event builder assembling
data from ~10 front-ends. How about ~100? One possible solution is to have a
multi-tiered architecture.
The reason I am asking is that we are in the process of designing an Ethernet-based
DAQ system with front-ends running on embedded computers (Linux/ARM
CPU/Xilinx FPGA), and MIDAS is one of my options as a DAQ framework.
I am open for advice/suggestions.
Thanks again
Piotr |
03 Mar 2007, Stefan Ritt, Forum, event builder scalability
|
> It seems that there's no problem running MIDAS with event builder assembling
> data from ~10 front-ends. How about ~100? One possible solution is to have a
> multi-tiered architecture.
>
> The reason I am asking is that we are in the process of designing an Ethernet-based
> DAQ system with front-ends running on embedded computers (Linux/ARM
> CPU/Xilinx FPGA), and MIDAS is one of my options as a DAQ framework.
> I am open for advice/suggestions.
The event builder is a standalone application not part of the "midas core". It
receives data from N producers and combines the fragments into events based on
their serial number as a dedicated process. If it would become a bottleneck, it
can simply be redesigned and optimized. I currently have good experience with
multi-threaded applications running on multi-core CPUs. Implementing your
multi-tiered architecture as a multi-threaded event builder, where each of ten
threads receives data from ten front-ends, combines them and passes them to the
"collector thread", would make sense to me. Between the threads you can pass data
at many GB/sec, as compared to an ethernet-based architecture. I recently
implemented the rb_xxx functions inside midas.c which let you pass data between
threads on a zero-copy basis.
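The scheme looks roughly like this (signatures quoted from memory of midas.h, so treat them
as assumptions; fill_event() and process_event() are hypothetical user functions):

   int rbh;
   void *wp, *rp;

   rb_create(10 * 1024 * 1024, MAX_EVENT_SIZE, &rbh);   /* shared ring buffer */

   /* producer thread: obtain a write pointer, build the event in place, commit it */
   if (rb_get_wp(rbh, &wp, 100) == DB_SUCCESS)
      rb_increment_wp(rbh, fill_event(wp));

   /* consumer thread: obtain a read pointer, use the event in place, release it */
   if (rb_get_rp(rbh, &rp, 100) == DB_SUCCESS)
      rb_increment_rp(rbh, process_event(rp));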
Inside the core functions of midas there are no limitations whatsoever. All
counters etc. are 32-bit, so you can run 2^32 data consumers etc. You will first
hit the OS process limit. What I'm more concerned about is your network bandwidth. If
you run 100 front-ends each with more than 1MB/sec, you would hit the 1GBit limit
of your network card. If you put in more network interfaces, you will hit the disk
I/O limit, which is around 100-200MB/sec even on larger RAID1 disk arrays (unless
you do data compression during event building).
Another limit I see is the run transition. On each start/stop of a run, the
process which wants to start/stop the run has to contact all producers via a TCP
connection. Opening 100 TCP connections will take maybe 10-30 seconds, which is not
very convenient. A multi-threaded approach would help, but this is not (yet)
implemented; maybe you would have to do it yourself.
Another approach would be that you put the event building "in front of midas". All
your front-ends run a specific protocol outside of midas. They send their data to
a collecting process which acts as a single front-end to midas. So in the midas
framework you see only a single front-end, which gets its data not from hardware,
but from 100 other nodes. This way you can optimize the protocol between your
front-end nodes and the collector process for your application. Run transitions
can be done through multicast UDP messages for example, which will even work with
1000 front-ends. But you have to implement that yourself.
I would start with the first approach: take the out-of-the-box midas and see how
far you get. If you have access to a normal Linux cluster, you can simply run ten
dummy front-ends on each of ten nodes, thus simulating 100 front-ends, and see how
far you get. If the event builder is the bottleneck, do an optimization or a
redesign. If the run transitions become your bottleneck, switch to method two. Either
way you can utilize the downstream part of midas, like the logger, the
history system, etc., so you would still gain a lot compared to a design from scratch.
Best regards,
Stefan |
26 Feb 2007, Stefan Ritt, Info, Fragmented polled events
|
Fragmented polled events have been implemented in SVN revision 3625.
Fragmentation is a method of breaking down large (>MB) events into smaller
pieces and send them through the shared memory buffers, reassembling them at the
output. In the past this was only possible for periodic events (such as large
histograms read out once every few seconds), but now this is also possible for
polled events. |
26 Feb 2007, Stefan Ritt, Info, Usage of event channel for improved throughput
|
Starting from SVN revision 3642, sending events from the front-end has been revised.
For a long time there has been a special TCP socket established between any front-end and the mserver which can be used to bypass the midas RPC layer completely and send only events. There was a #define USE_EVENT_CHANNEL, but to my knowledge nobody used it.
While optimizing data throughput for the MEG experiment, I revisited this mechanism and finally got it working. Here are some benchmark tests made with the produce program on two dual-CPU machines connected via Gigabit Ethernet:
Using normal RPC socket:

  event size   speed [MB/sec]   CPU usage front-end   CPU usage server
  ====================================================================
          40          3                  22                 100
        1000         44                  25                 100
      100000        101                  14                  50

Using new event socket:

  event size   speed [MB/sec]   CPU usage front-end   CPU usage server
  ====================================================================
          40         12                 100                  34
        1000         99                  58                  59
      100000        101                  14                  43
As can be seen, the CPU load on the server drops significantly for smaller events since the processing time per event is reduced. If the transfer was limited by the server, the throughput goes up significantly. For large events the bottleneck on the server side is the memcpy of events, so no big improvement is visible. The saved CPU time however can be used to analyze more events for example.
The event socket is now enabled by default in the front-end by setting
rpc_mode = 1
in mfe.c and should be checked carefully in various experiments. There is a small chance that events get stuck in the buffer cache on the server side at the end of the run, in which case they would show up as the first events of the next run. I know that this problem happened in some experiment before, but that must have been unrelated to the rpc_mode. So please check again and report any problem with the new rpc_mode. |
23 Feb 2007, Konstantin Olchanski, Info, RFC- support for writing to removable hard disk storage
|
At triumf, we are developing a system to use removable hard drives to store data collected by midas
daq stations. The basic idea is to replace storage on 300 GB DLT tapes with storage on removable
esata, usb2 or firewire 750 GB hard drives.
To minimize culture shock, we stay as close as possible to the "tape" paradigm. Two removable disks
are used in tandem. Data is written to the first removable disk until it is full. Then midas automatically
switches to the second disk and asks the operator to replace the full disk with a blank disk. Similar to
handling tapes, the operator takes the full disk and stores it on the shelf (offline); takes a blank disk
and connects it to the computer. To read data from one of the disks, the operator takes the disk from
the shelf and connects it to the daq computer or to some other computer equipped with a compatible
removable storage bay. The full data disks are mounted read-only to prevent accidental data
modifications.
Two pieces of software are needed to implement this system:
1) midas support for switching to alternate output disks as they become full. Data could be written to
the removable disk directly by the mlogger (no extra data copy on local disks) or by the lazylogger
(mlogger writes the data to the local disk, then the lazylogger copies it to the removable disk). Writing
directly to the removable disk is more efficient as it avoids the one extra data copy operation by the
lazylogger.
2) a user interface utility for mounting and dismounting removable disks. Handling of removable disks
cannot be fully automatic: before unplugging a removable disk, the user has to inform the system; after
connecting a removable disk, the user has to tell the system to mount it read-only (for existing data),
read-write (to add more data) or to initialize a blank disk (fdisk+mkfs). (Also, some SATA interfaces do
not implement automatic hot-plug: they have to be manually told "please look for new disks").
We are presently evaluating various internal SATA hot-plug enclosures. We evaluated external eSATA
and USB2 enclosures and decided not to use them: while the performance is adequate, the presence of
extra bulky components (eSATA and USB cables, non-standardized power bricks) and the extra cost of
eSATA and USB hard drive enclosures make them unattractive.
I am open to suggestions and comments. I am most interested in hearing which data path (mlogger or
the lazylogger) would be most useful for other users.
K.O. |
23 Feb 2007, John M O'Donnell, Info, RFC- support for writing to removable hard disk storage
|
We stopped using tapes at Los Alamos a while ago. The model we use is:
write data with mlogger to a local RAID system. This is NFS-mounted read-only on the analysis machines, and
becomes the working copy for most tasks. We then copy the data to external hard drives. We have been using USB. The
USB system is sometimes a little flaky (Linux 2.4.21-7), so we have a computer dedicated to this task. The USB driver
can be reloaded, or, if the user is not so knowledgeable, the computer can be rebooted. Users on this computer
have sudo privileges, so they can format hard drives. The disks are inserted into boxes while in use, and stored on
a shelf for data archival, so we don't need a lot of enclosures.
I use the automounter to mount and unmount the drives. With a 10-second timeout, the user only needs to wait a
few seconds before unplugging the disk ("cat /proc/mounts" allows them to check if they want); dmesg allows
them to find the drive letter. This works for any device which later appears as a SCSI disk. The automounter
manages /mnt/usb for vfat-formatted devices, and /mnt/usbl for ext3-formatted devices (preferred for data
archiving).
autofs config files are:
/etc/auto.usb
# This is an automounter map and it has the following format
# key [ -mount-options-separated-by-comma ] location
# Details may be found in the autofs(5) manpage
* -fstype=auto,nosuid,nodev,umask=0000,noatime :/dev/&
/etc/auto.usbl
# This is an automounter map and it has the following format
# key [ -mount-options-separated-by-comma ] location
# Details may be found in the autofs(5) manpage
* -fstype=auto,nosuid,nodev :/dev/&
/etc/auto.master contains
/mnt/usb /etc/auto.usb --timeout=10
/mnt/usbl /etc/auto.usbl --timeout=10
John.
> At triumf, we are developing a system to use removable hard drives to store data collected by midas
> daq stations. The basic idea is to replace storage on 300 GB DLT tapes with storage on removable
> esata, usb2 or firewire 750 GB hard drives.
>
> To minimize culture shock, we stay as close as possible to the "tape" paradigm. Two removable disks
> are used in tandem. Data is written to the first removable disk until it is full. Then midas automatically
> switches to the second disk and asks the operator to replace the full disk with a blank disk. Similar to
> handling tapes, the operator takes the full disk and stores it on the shelf (offline); takes a blank disk
> and connects it to the computer. To read data from one of the disks, the operator takes the disk from
> the shelf and connects it to the daq computer or to some other computer equipped with a compatible
> removable storage bay. The full data disks are mounted read-only to prevent accidental data
> modifications.
>
> Two pieces of software are needed to implement this system:
>
> 1) midas support for switching to alternate output disks as they become full. Data could be written to
> the removable disk directly by the mlogger (no extra data copy on local disks) or by the lazylogger
> (mlogger writes the data to the local disk, then the lazylogger copies it to the removable disk). Writing
> directly to the removable disk is more efficient as it avoids the one extra data copy operation by the
> lazylogger.
>
> 2) a user interface utility for mounting and dismounting removable disks. Handling of removable disks
> cannot be fully automatic: before unplugging a removable disk, the user has to inform the system; after
> connecting a removable disk, the user has to tell the system to mount it read-only (for existing data),
> read-write (to add more data) or to initialize a blank disk (fdisk+mkfs). (Also, some SATA interfaces do
> not implement automatic hot-plug: they have to be manually told "please look for new disks").
>
> We are presently evaluating various internal SATA hot-plug enclosures. We evaluated external eSATA
> and USB2 enclosures and decided not to use them: while the performance is adequate, presence of
> extra bulky components (eSATA and USB cables, non-standardized power bricks) and the extra cost of
> eSATA and USB hard drive enclosures makes them unattractive.
>
> I am open to suggestions and comments. I am most interested in hearing which data path (mlogger or
> the lazylogger) would be most useful for other users.
>
> K.O. |
26 Feb 2007, Stefan Ritt, Info, RFC- support for writing to removable hard disk storage
|
In the MEG experiment, we simply installed 100 TB of RAID disks and don't need to change anything.
But seriously, you are right that such a system might be beneficial. I propose to extend the current logger code to switch disks. In the current tr_start() function in mlogger, the code checks "subdir_format" to create separate subdirectories, e.g. once per week. One could extend this code in the following way:
- Add an array of strings named "Path", such as
  /dev/sda1/datadir/
  /dev/sdb1/datadir/
- On each run stop, check whether the current disk has enough space for one more run (see the sketch below). Take either the "Byte limit" of that channel, or the actual size of the last run multiplied by two or so. If the disk is "almost full", switch to the next array element of "Path". Append the file name, such as "/dev/sda1/datadir/run1234.mid", put it into "Current filename" as feedback for the user, and write to the new disk/file from then on.
- Add a string like "Execute on switch", which gets called after switching to the next disk. This shell script can then handle the unmounting of the full disk, notify the user, etc. This is similar to "/Programs/Execute on start run" in the ODB, but it is only called when the disk is switched.
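A minimal sketch of the "enough space for one more run" check, assuming Linux statfs(); the function name and the way the result would be wired into mlogger are hypothetical, not existing code:

  #include <sys/vfs.h>   /* statfs() on Linux */

  /* return 1 if "path" cannot hold roughly one more run of "bytes_needed" bytes */
  static int disk_almost_full(const char *path, double bytes_needed)
  {
     struct statfs st;
     if (statfs(path, &st) != 0)
        return 1;                            /* treat errors as "full", to be safe */
     double free_bytes = (double) st.f_bavail * (double) st.f_bsize;
     return free_bytes < bytes_needed;
  }

At run stop, bytes_needed would be either the "Byte limit" of the channel or about twice the size of the last run; if the check fails, mlogger would advance to the next "Path" entry and call the "Execute on switch" script. |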
05 Feb 2007, Fedor Ignatov, Bug Report, segmentation violation of analyzer on a x86_64
|
Hello,
When I connect to the analyzer on an x86_64 machine (with Roody),
the analyzer breaks with a segmentation violation in the root_server_thread function.
The same code works fine on a 32-bit processor.
I found that the problem is in the exchange of pointers between the analyzer and the client.
Before a pointer is sent, it is stored in an int (size 4 instead of 8) at
this place:
Index: src/mana.c
===================================================================
--- src/mana.c (revision 3498)
+++ src/mana.c (working copy)
@@ -5386,7 +5386,7 @@
 
    //write pointer
    message->Reset(kMESS_ANY);
-   int p = (POINTER_T) obj;
+   POINTER_T p = (POINTER_T) obj;
    *message << p;
    sock->Send(*message);
Sincerely Yours,
Fedor Ignatov |
06 Feb 2007, Stefan Ritt, Bug Report, segmentation violation of analyzer on a x86_64
|
> Hello,
>
> When I connect to analyzer on a x86_64 processor(with Roody),
> a analyzer break with segmentation violation in the root_server_thread function.
> Same code are working fine on a 32bit processor.
> As I found the problem are in exchanging of pointers between analyzer and client.
> Before to send a pointer, it is saved a pointer in int (size=4, instead of 8) at
> this place:
> Index: src/mana.c
> ===================================================================
> --- src/mana.c (revision 3498)
> +++ src/mana.c (working copy)
> @@ -5386,7 +5386,7 @@
>
> //write pointer
> message->Reset(kMESS_ANY);
> - int p = (POINTER_T) obj;
> + POINTER_T p = (POINTER_T) obj;
> *message << p;
> sock->Send(*message);
>
>
> Sincerely Yours,
> Fedor Ignatov
Do I understand you right: with your patch it works even on 64 bit? Or do you
mean there is still a segmentation violation? Anyhow, I committed your patch, since the
"int" is clearly incorrect.
- Stefan |
06 Feb 2007, Fedor Ignatov, Bug Report, segmentation violation of analyzer on a x86_64
|
Yes, right. The segmentation violation is solved with this patch. Now it works
fine on x86_64.
Fedor
> Do I understand you right? With your patch it works even on 64 bit, right? Or do you
> mean there is still a segmentation violation? Anyhow I committed your patch since the
> "int" is clearly incorrect.
>
> - Stefan |
17 Feb 2007, Konstantin Olchanski, Bug Report, segmentation violation of analyzer on a x86_64
|
> Yes right, Problem of a segmentation violation is solved with this patch. Now it works
> fine on x86_64.
Right. I confirm this. I have this exact same fix in my stand-alone copy of the midas
histogram server, and should commit it to MIDAS CVS as well.
K.O. |
|