ELOG Midas


Midas DAQ System, Page 18 of 152


ID	Date	Author	Topic	Subject
396	13 Jul 2007	Stefan Ritt	Forum	Midas on a x86_64 - incompatible with x86_32
> The biggest problem here is that making 32-bit ODB and 64-bit ODB compatible requires breaking one or
> the other (My proposed changes break the 64-bit version. Alternatively, one could add explicit padding
> to these data structures and break the 32-bit ODB).
>
> I think it is important to make 32-bit and 64-bit code compatible: at TRIUMF we have to use a mixed
> environment because our latest host computers all run 64-bit Linux while all our VME processors and all
> older machines can only run 32-bit code; this incompatibility causes us weekly headaches.
>
> Any thoughts?

I agree that 32-bit and 64-bit should be made compatible. In the long run, everything will be 64-bit, so I would suggest breaking the 32-bit ODB and adding padding there where needed, probably with some conditional compiling. This keeps the native 64-bit packing, which will probably be somewhat optimized for 64-bit architectures and therefore might be a bit faster in the long run, when most systems are 64-bit.

After this has been implemented and well tested, I would make an official announcement of the 32-bit break in the ODB and release a new version, so people can update from a TAR file if necessary. Existing ODBs can be converted to the new format by exporting them in XML form and importing them again after the upgrade.
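To illustrate the padding idea, here is a minimal sketch. The structure is hypothetical and not an actual ODB structure; it only shows how an explicit padding member can give the same layout on 32-bit and 64-bit x86, where a double inside a struct is otherwise aligned to 4 or 8 bytes, respectively. For simplicity the padding here is unconditional rather than conditionally compiled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, NOT a real ODB structure. */
typedef struct {
    int32_t type;    /* 4 bytes at offset 0 */
    int32_t pad;     /* explicit padding, so 'value' starts at offset 8 everywhere */
    double  value;   /* 8 bytes at offset 8 */
} PADDED_RECORD;

/* Helpers to inspect the layout at run time. */
size_t padded_record_value_offset(void)
{
    return offsetof(PADDED_RECORD, value);
}

size_t padded_record_size(void)
{
    return sizeof(PADDED_RECORD);
}
```

With the explicit member, both builds should report an offset of 8 and a size of 16, so a shared memory segment written by one can be read by the other.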
397	26 Jul 2007	Stefan Ritt	Info	Change of pointer type in mvmestd.h
I had to change the pointer type of mvme_read and mvme_write to (void *) instead of (mvme_locaddr_t *) to avoid warnings under 64-bit Linux. Please adjust your VME drivers if necessary. - Stefan
405	03 Sep 2007	Stefan Ritt	Bug Report	how to handle end of run?
> I am having problems with handling the end-of-run situation in my midas
> frontend. I have a device that continuously sends data (over USB) and I read
> this data in my "read_event" function.
>
> Everything is good until the end-of-run, at which time this happens:
> 0) mfe.c calls my read_event() to read the data (loop until the end-of-run
> transition)
> 1) mfe.c calls my end_of_run()
> 2) here, I tell the device "please stop sending data"
> 3) all seems good, but wait!!!
> 4) there is all this data generated between step 0 and step 2 still sitting
> inside the device and it has nowhere to go: the run is ended, the output file is
> closed, my read_event() will never be called ever again (well, until the next run).
>
> It seems to me mfe.c needs to have one more function, something like
> "pre_end_of_run()" that works like this:
> 0) mfe.c calls my read_event() to read the data (loop until the end-of-run
> transition)
> 1) mfe.c calls pre_end_of_run(), here I tell the device to stop sending data
> 2) mfe.c calls read_event() for the very last time, to give me the opportunity
> to read and send away any data I still may have.
> 3) mfe.c calls the end_of_run(). The run is truly finished.
>
> Any thoughts?

You can achieve the desired functionality without changing mfe.c:

0) mfe.c calls read_event
1) mfe.c calls end_of_run. Your end_of_run tells the device to stop sending data and flushes the remaining data.

At this point you have to reimplement a small part of the mfe.c functionality yourself, but basically you only need a bm_compose_event() and a bm_send_event(), so just a few lines of code. If you want to have the final event number right in your equipment, you also need to update eq->events_sent accordingly.

Given that 99% of the experiments do not need this functionality, I propose that we keep mfe.c as it is and you add the few lines of code to the user part of your specific frontend.

Stefan
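The control flow of such a flushing end_of_run() can be sketched as follows. This is only an illustration under stated assumptions: FAKE_DEVICE, fake_device_read() and fake_send_event() are hypothetical stand-ins invented for this example, and the counter events_sent plays the role of eq->events_sent. In a real frontend, the send step would be the bm_compose_event() and bm_send_event() calls mentioned above.

```c
#define DEVICE_FIFO_DEPTH 16

/* Hypothetical stand-in for a device with an internal FIFO. */
typedef struct {
    int fifo[DEVICE_FIFO_DEPTH];
    int count;       /* words still buffered inside the device */
    int streaming;   /* 1 while the device keeps producing data */
} FAKE_DEVICE;

static int events_sent = 0;   /* plays the role of eq->events_sent */

/* Stand-in for composing one event and sending it to the buffer manager. */
static void fake_send_event(int payload)
{
    (void) payload;
    events_sent++;
}

/* Read one buffered word; returns 0 once the device is empty. */
static int fake_device_read(FAKE_DEVICE *dev, int *word)
{
    if (dev->count == 0)
        return 0;
    *word = dev->fifo[--dev->count];
    return 1;
}

/* What a user end_of_run() could do: stop the device, then drain it.
   Returns the number of events flushed. */
int end_of_run_flush(FAKE_DEVICE *dev)
{
    int word, flushed = 0;

    dev->streaming = 0;                     /* "please stop sending data" */
    while (fake_device_read(dev, &word)) {
        fake_send_event(word);              /* real code: compose + send the event */
        flushed++;
    }
    return flushed;
}
```

The important point is the order: the device is stopped first, and only then is it drained, so the loop is guaranteed to terminate.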
406	06 Sep 2007	Stefan Ritt	Info	Introduction of MIDAS_MAX_EVENT_SIZE
We had the problem that different experiments used different MAX_EVENT_SIZE values (the MEG experiment actually uses 10 MB!). If each experiment changes the value in midas.h and accidentally commits it, other experiments are affected. Therefore I modified midas.h and the Makefile to accept a new environment variable MIDAS_MAX_EVENT_SIZE. If this variable is set, the Makefile passes its value to midas.h, where it supersedes the default value, which is currently 4 MB. PAA: Can you please add this to the documentation at the right spot? Thanks.
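The override mechanism can be sketched like this. It is an assumption that the Makefile hands the value over as a preprocessor define (for example -DMIDAS_MAX_EVENT_SIZE=10485760); the exact lines in midas.h may look different, and max_event_size() is a helper added only for this example.

```c
/* If the build defines MIDAS_MAX_EVENT_SIZE, it wins; otherwise use the default. */
#ifdef MIDAS_MAX_EVENT_SIZE
#define MAX_EVENT_SIZE MIDAS_MAX_EVENT_SIZE
#else
#define MAX_EVENT_SIZE (4 * 1024 * 1024)   /* default: 4 MB */
#endif

/* Hypothetical helper: report the value this translation unit was built with. */
unsigned long max_event_size(void)
{
    return (unsigned long) MAX_EVENT_SIZE;
}
```

Since the value is fixed at compile time, all programs of one experiment have to be built with the same setting.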
409	08 Oct 2007	Stefan Ritt	Bug Report	Error in data format- ending blocks on 32bit boundary x86_64
> Hi,
> I found that midas banks can be given an extra 32 bits of zeros when
> trying to keep to 32bit boundary on my x86_64.
>
> This can be fixed by changing (in midas.h)
> #define ALIGN8(x) (((x)+7) & ~7)
> to
> #define ALIGN8(x) (((x)+3) & ~3)
>
> Are there any bad consequences of doing this?

Yes. ALIGN8 means 'align to 8-byte boundary' (64-bit), and if you change that, you break the code at various locations. Furthermore, 8-byte aligned access is faster on x86_64 than 4-byte aligned access, so you will get a performance penalty. Of course, if you have very many small banks, the zero padding can cause some overhead, but in that case you could combine some data into a single bank.
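A small sketch of what the macro does, using the definition quoted above. The helper bank_padding() is hypothetical and only added to make the padding visible.

```c
#include <stddef.h>

/* Round x up to the next multiple of 8 (definition as quoted from midas.h). */
#define ALIGN8(x) (((x) + 7) & ~7)

/* Hypothetical helper: number of zero bytes appended to a bank of n bytes. */
size_t bank_padding(size_t n)
{
    return ALIGN8(n) - n;
}
```

A 12-byte bank is therefore padded by 4 bytes (the "extra 32 bits of zeros"), while a 16-byte bank gets no padding at all.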
410	11 Oct 2007	Stefan Ritt	Bug Report	_syscall0 not available on gcc 4.1.1
Dear Stephan,

I am writing on behalf of the LiBeRACE collaboration at Berkeley/Livermore.

We are trying to use midas (2.0.0) for our acquisition system. However, we had some difficulties compiling it on Linux Fedora Core 6 with gcc 4.1.1. I tried to trace back the problem and found that _syscall0 in system.c is actually an obsolete call (since gcc 4.x apparently). Playing with assembly language being beyond my competence, I would like to know if you ever came across this situation recently and if you have any suggestion(s).

With my best regards
Julien GIBELIN
------------------------------------------------------
GIBELIN Julien
Lawrence Berkeley National Laboratory
Nuclear Science Division
One Cyclotron Rd. MS 88R0192
BERKELEY, CA 94720-8101
Tel: +1 (510) 495-2695
Fax: +1 (510) 486-7983
------------------------------------------------------
411	11 Oct 2007	Stefan Ritt	Bug Report	_syscall0 not available on gcc 4.1.1
> Dear Stephan,
>
> I am writing on behalf of the LiBeRACE collaboration
> at Berkeley/Livermore.
>
> We are trying to use midas (2.0.0) for our acquisition system.
> However we had some difficulties to compile it on LINUX Fedora
> Core 6 with gcc 4.1.1
> I tried to trace back the problem and I found that _syscall0 in
> system.c is actually an obsolete call (since gcc 4.x apparently).
> Playing with assembly language being beyond my competence, I would
> like to know if you ever came across this situation recently and
> if you have any suggestion(s).

The '_syscall0' function call was replaced by 'syscall' in SVN revision 3583. I would recommend that you switch to the current SVN version (see http://ladd00.triumf.ca/~daqweb/doc/midas/html/quickstart.html on how to obtain the SVN version). If the problem persists, please let us know.

- Stefan
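For readers who hit the same compiler error in their own code, here is a minimal Linux-only sketch of the replacement pattern. It assumes the call in question obtains the thread ID (gettid); the function name get_thread_id() is made up for this example and is not the name used in system.c.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* makes syscall() visible in <unistd.h> on glibc */
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the kernel thread ID of the calling thread (Linux only).
   Old style was the macro _syscall0(pid_t, gettid), which newer kernel
   headers no longer provide. */
pid_t get_thread_id(void)
{
    return (pid_t) syscall(SYS_gettid);
}
```

In the main thread of a process, the value returned is the same as getpid().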
414	17 Oct 2007	Stefan Ritt	Forum	Multi-core CPUs
> I have this beautiful Intel Quadcore with fast disks, but MIDAS does obviously
> only make use of one CPU at a time. Has anybody of you already done some work
> on making MIDAS parallel? Event-based data analysis should be the best
> candidate for this.

There are ring buffer routines rb_xxx for distributed event analysis, but this is currently only implemented in the front-end framework. These routines are pretty simple, and their integration into the analyzer should not be very difficult. Unfortunately I don't have time for that right now. We do our analysis such that we analyze four different runs in parallel on a quadcore machine.

- Stefan
418	27 Nov 2007	Stefan Ritt	Info	ODB links to array elements implemented
In revision 4090 I implemented ODB links to individual array elements. Now you can have for example:

  Key name                        Type    #Val  Size  Last Opn Mode Value
  ---------------------------------------------------------------------------
  array                           INT     10    4     2m   0   RWD
                                          [0]             0
                                          [1]             0
                                          [2]             123
                                          [3]             0
                                          [4]             0
                                          [5]             0
                                          [6]             0
                                          [7]             0
                                          [8]             0
                                          [9]             0
  element2 -> /array[2]           INT     1     4     3m   0   RWD  123

In this case, the link "element2" points to the third element of "array", but is treated like a single value. These links are very useful for example for the "Edit on start" parameters, which can now point to individual array elements. The same is true for the "Links BOR" when the logger writes to a MySQL database.

This change required major modifications in the ODB. I have carefully tested the example experiment from the distribution to verify that everything is fine, but I'm not 100% sure that I covered all possible situations. So if you update to revision 4090+ and you observe some strange behavior related to links in the ODB, please report it.

There are two new functions related to this change:

  db_get_link()
  db_get_link_data()

They are counterparts of db_get_key() and db_get_data(), respectively, but do not follow links in the ODB. These functions are probably not of much use outside odbedit and mhttpd, which are supposed to display links explicitly. Most user applications want to follow links without even knowing that these are links.
419	07 Jan 2008	Stefan Ritt	Info	Roll-back for history system added
The midas history system always had the problem that the database can get corrupted if the disk where the history records (.hst & .idx) are stored gets full. This can happen if a history event can only be written partially on the almost full disk. If some space is freed up later (by deleting other files), the writing continues at the old position, leaving the partial event in the database. In that case the whole history data of the current day cannot be read because it is corrupted.

To solve the problem, a roll-back system has been implemented in the hs_write_event() function. If an event cannot be written fully, the history file is restored to its old state, so the partial event is removed from the end of the file via truncation. This way only the data which could not be written to the disk is missing in the history file, but the other data from that day is still valid and readable.

The change has been committed in revision 4107.
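The roll-back idea can be sketched with plain POSIX calls. This is not the code from hs_write_event(); the function names are made up, and max_bytes is a hypothetical test hook that caps the number of bytes written in order to simulate a full disk.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Create an empty scratch file for the demonstration; returns fd or -1. */
int open_scratch_file(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
}

/*
 * Append one record. If it cannot be written completely, truncate the file
 * back to its previous length so that no partial record is left behind.
 * Pass max_bytes >= len for normal operation.
 * Returns 0 on success, -1 if the record was rolled back.
 */
int append_with_rollback(int fd, const void *rec, size_t len, size_t max_bytes)
{
    off_t start = lseek(fd, 0, SEEK_END);    /* remember the old end of file */
    size_t todo = len < max_bytes ? len : max_bytes;
    ssize_t n;

    if (start < 0)
        return -1;

    n = write(fd, rec, todo);
    if (n == (ssize_t) len)
        return 0;                            /* record written in full */

    /* Partial or failed write: restore the old file length. */
    if (ftruncate(fd, start) == 0)
        lseek(fd, start, SEEK_SET);
    return -1;
}
```

After a failed append the file has exactly the length it had before, so all earlier records stay readable.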
421	05 Feb 2008	Stefan Ritt	Forum	analyzer crashes at high rates
> I'm using midas to read data from a waveform digitizer at event rates of
> 10-30kHz. To accomplish this the digitizer is read via Block transfers and the
> raw data put into a single MIDAS event. Thus a MIDAS event could contain up to
> 250 physical events and at maximum 350kBytes. In the analyzer modules I had
> been analyzing the first physics event contained in a MIDAS event with no
> problem. Recently I tried to analyze all the physical events. At low rates,
> 100Hz-1kHz, this was no problem, 1-5 physical events in a MIDAS event. At
> higher rates 10-20kHz, where there are about 40 physical events per MIDAS event,
> the analyzer keeps up for a few seconds then seg faults with " 'shared object
> read from target memory' has disappear; keeping it symbols". Any suggestions as
> to why the analyzer is crashing would be very helpful.

I personally have never seen this error message. The analyzer is designed such that it produces "back pressure" if the data rate is higher than the analysis rate and you have "request all events" on. The only things I can imagine are the following two issues:

- At higher rates, where you have more than 40 physical events per MIDAS event, there is some bug in your analysis code which only shows up in that case. Maybe some temporary array which is only 35 entries long or something like this.

- The back pressure mentioned above will slow down the frontend. If your computer busy logic is not working correctly, you might get more triggers than you can acquire. Maybe the data then gets corrupted and the analyzer chokes on it.

Finding the exact reason is not simple. You certainly have to run the analyzer inside the debugger to see exactly where the segfault happens. You may then have to produce some dummy data in the frontend (like always sending the same event) to disentangle possible trigger problems from other problems.

Best regards,

Stefan
422	05 Feb 2008	Stefan Ritt	Info	Implementation of relative paths in mhttpd
A major change was made to mhttpd, changing all internal URLs to relative paths. This allows proxy access to mhttpd via an Apache server for example, which might be needed to securely access an experiment from outside the lab through a firewall. The following settings can be placed into the Apache configuration, assuming the experiment runs on machine "online1.your.domain", and Apache on a publicly available machine "www.your.domain":

  Redirect permanent /online1 http://www.your.domain/online1
  ProxyPass /online1/ http://online1.your.domain/
  <Location "/online1">
    AuthType Basic
    AuthName ...
    AuthUserFile ...
    Require user ...
  </Location>

If the URL http://www.your.domain/online1 is accessed, it gets redirected (after optional authentication) to http://online1.your.domain. If you click on the mhttpd history page for example, mhttpd would normally redirect this to http://online1.your.domain/HS/, but this is not correct since you want to go through the proxy www.your.domain. The new relative redirection inside mhttpd now redirects the history page correctly to http://www.your.domain/online1/HS/.

I had to change many places inside mhttpd to make this work, and I'm not 100% sure if I covered all occurrences. So if you upgrade to mhttpd revision 4115 and observe some error accessing some pages, please report it to me.

- Stefan
425	06 Feb 2008	Stefan Ritt	Forum	rpc timeout, related to event_size and watch dog? need help
Most likely you changed the maximal event size in midas.h, but you did not recompile all programs. The maximal event size goes into the size of the shared memory buffer, so all participating programs have to have the same setting, especially the mserver program. So do the following:

- update to the latest midas version, which is revision 4116
- modify only MAX_EVENT_SIZE in your midas.h. The other settings you modified might have bad side effects. If you increase the RPC timeout, the error will still happen, just later. It comes from the fact that you send events that are too big to the server (or the logger), which refuses to take them or simply crashes, so the RPC call never returns and after the timeout you get the error.
- recompile all midas programs, don't forget the mserver program
- run the standard demo frontend from the distribution

I tried the above and it just worked fine for me.
427	06 Feb 2008	Stefan Ritt	Forum	rpc timeout, related to event_size and watch dog? need help
First of all, I would appreciate it if you did not post your entry ten times. Each time you edit it, you produce an email notification going to everybody, so people might get annoyed at receiving too many emails from you. Think about what you want to write and then post once.

Second, I told you to use the frontend from the distribution, but you used your own code. Since I successfully ran the demo frontend with the large event size, the origin of your problem must be "in between". So start with the demo frontend, try it, then modify its buffer size in frontend.c, then try again. When I told you to recompile midas, I meant you should also recompile your front-end each time you change midas.h. The mserver is automatically recompiled when you recompile and install midas (just check the /usr/local/bin/mserver date and time to confirm that it got updated during your last "make install"). Then add things from your specific front-end program step by step to see at which step the problem first occurs. This gives you some hint where the real cause lies.
431	13 Feb 2008	Stefan Ritt	Info	Roll-back for history system added
> But to make things more interesting we had another history outage this week - we
> happen to write history files to an NFS server (not recommended! do not do this!) and
> when the NFS server had a glitch, history files got corrupted - because during the
> glitch NFS was not available, I think this roll-back feature would not have helped.

Actually I put our history data on a separate file system, on a separate disk controlled by a separate RAID controller! If you write bulk data with the logger, and want to read history files at the same time with mhttpd, you get a bottleneck if both data are on the same physical disk. Separating this (and even the controller) sped things up dramatically.

The rollback will not work for NFS, since it requires truncating the file if an event gets only partially written. While on a full file system you always can delete data, this does not work if NFS is down. This explains the behavior.

> Anyhow, I now have a patch to allow hs_read() to "skip the bad spots" in the history
> files. (hs_gen_index() also needs a patch).
>
> In the nutshell, if invalid history data is detected, the code continues to read the
> data one byte at a time, looking for valid event_id markers (etc).
>
> The code looks sane by inspection, and if nobody objects, I would like to commit it
> in the next few days.

Great. I was thinking of something like this myself. From a quick look, your code looks good. The best of course would be if we had some "magic number" for re-synchronizing the data stream, but that would blow up the file length. So searching for the right event id is good, but will not work 100%. Also the check

  if (irec.time < last_irec_time)

to see if the history is broken is very weak. If you take random data, it will be true 50% and false 50% of the time. If one however makes a check

  if ((irec.time - last_irec_time) > 3600*24)

this would work correctly with random data in >99% of all cases (3600*24/2^32). Maybe you should change that.
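A minimal sketch of the stronger check, assuming the record time stamps are unsigned 32-bit values in seconds (the 2^32 above suggests this, but it is an assumption). The function and macro names are made up for this example.

```c
#include <stdint.h>

/* Largest accepted step between two consecutive records: one day in seconds. */
#define MAX_TIME_STEP (3600u * 24u)

/*
 * Return 1 if the step from last_time to this_time is implausible.
 * With unsigned arithmetic a step backwards wraps around to a huge value,
 * so it is rejected as well as a jump of more than one day.
 */
int time_looks_broken(uint32_t last_time, uint32_t this_time)
{
    return (uint32_t) (this_time - last_time) > MAX_TIME_STEP;
}
```

For a random 32-bit value only 86401 out of 2^32 possible differences pass this test, which is where the >99% comes from.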
432	14 Feb 2008	Stefan Ritt	Info	mhttpd history display updates
You misspelled one ODB entry:

Line 9014: sprintf(str, "/History/Display/%s/Label", path);
Line 9028: sprintf(str, "/History/Display/%s/Labels", path);
                                               ---^

I wonder how you could have tested that code for half a year without noticing this error. I fixed and committed it.
439	19 Feb 2008	Stefan Ritt	Bug Fix	"make install" error on MacOS 10.4.7, svn 3366
> I forgot to mention that, the following (and similar) lines:
> install -v -D -m 755 $$file $(SYSBIN_DIR)/`basename $$file` ; \
> are changed into
> install -v -d -m 755 $$file $(SYSBIN_DIR)/`basename $$file` ; \
>
> since -D is an illegal option for install. I am not sure whether -D in Linux means the same thing for -d in MacOSX install.

-D under Linux means:

  -D  create all leading components of DEST except the last, then copy SOURCE to DEST; useful in the 1st format

This means that if you install for the first time and either SYSBIN_DIR or `basename $$file` does not exist, it will be created on-the-fly by the install program. If OSX does not support this, you somehow have to create these subdirectories manually.
442	21 Feb 2008	Stefan Ritt	Bug Report	mhttpd safari 3.0.4 redirect problem
> /* start command */
> if (getparam("Start")) {
> /* for NT: close reply socket before starting subprocess */
> - redirect2("?cmd=programs");
> + redirect2("/?cmd=programs");

The second version won't work if mhttpd is run under an Apache proxy. Assume the proxy redirects

  http://proxy.ca/midas

to

  http://daq.ca:8080

If you now do a redirect to "/?cmd=programs", you will end up at

  http://proxy.ca/?cmd=programs

which is not what you want. I tried to put a "./?cmd=programs", and that brings you to

  http://proxy.ca/midas/./?cmd=programs

which is correctly redirected to

  http://daq.ca:8080/?cmd=programs

I tried with the Windows version (ughhh) of Safari and it worked for me. So give it a try, the change is committed.

> ODB corruption happens here:
>
> sprintf(str, "/Programs/%s/Start command", name);
> - db_get_value(hDB, 0, str, command, &size, TID_STRING, TRUE);
> + db_get_value(hDB, 0, str, command, &size, TID_STRING, FALSE);
> if (command[0]) {
> ss_system(command);
>
> It looks like db_get_value() would corrupt ODB if given funny "str". When Safari explodes,
> funny strings are generated.

What happens is an endless redirect from xxxx -> xxxx?cmd=Programs. So in the end you have

  http://url.ca?cmd=programs?cmd=programs?cmd=programs?cmd=programs....

and eventually you get a stack overflow, which busts everything.

> The simple fix is to replace "TRUE" with "FALSE", then at least db_get_value() does not try to make bogus
> entries in ODB.

I did both: putting FALSE there and adding

  if (strchr(name, '?'))
     *strchr(name, '?') = 0;

which keeps the URL short. So for me it looks fine at the moment, but I cannot guarantee that everything works, so keep an eye on that.
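The effect of cutting the name at the first '?' can be tried in isolation with a small helper. The function name strip_query() is made up for this example, and unlike the two-line fix it looks up the '?' only once.

```c
#include <string.h>

/* Cut the string at the first '?', if there is one, so that an appended
   "?cmd=..." part cannot pile up. The string is modified in place. */
void strip_query(char *name)
{
    char *q = strchr(name, '?');

    if (q != NULL)
        *q = '\0';
}
```

A name like "Logger?cmd=programs?cmd=programs" becomes "Logger", and a name without '?' is left untouched.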
453	07 Mar 2008	Stefan Ritt	Bug Report	array overflows and other bugs
> I have just compiled MIDAS svn 4132 on a fresh SuSE 10.3 x86_64 system and gcc
> found a bunch of bugs, I guess.

Ahh, great! gcc is getting more and more clever. Each time gcc is updated, it finds a few new issues. Indeed some are real bugs, and I will work down the list as time permits. I see however no immediate threat (you are not using fragmented events, a transition 12 never occurs, etc.). Issue #4 from your list has to be checked by Pierre-Andre.
456	09 Mar 2008	Stefan Ritt	Suggestion	New Makefile for building MIDAS
> I rewrote the Makefile for MIDAS in order to make it tidy. I tested it on my box
> and it works here.
> 1. The full file is separated into several parts
>    a. initialized setup
>    b. environment setup
>    c. specify OS-specific flags
>    d. processing environment for building flags
>    e. targets
> 2. The file is less than 400 lines now. The original one is more than 500 lines.
> 3. The modified one is easy for debugging.
>
> I tried to learn "autoconf" and "automake" in order to make building MIDAS more
> compatible for various platforms. But I haven't enough time now. Hope somebody
> can help it. The attached file is original named "Makefile.in" for using "autoconf".

I think it is a good idea to clean up the Makefile. It grew over many years and certainly had some inconsistencies. We did however not use "autoconf" since it is not of much use here. It is meant for covering small differences between different Unix flavors, but the midas source code is supposed to run not only on Unix, but also on vxWorks and Windows. As you can imagine, the differences are much more severe and a simple makefile generator cannot cover the details. Furthermore, under Windows there is no such thing as autoconf. So all the work to make the source code compile on all systems has been put into system.c using conditional compiling. Putting another abstraction layer on top of this would probably complicate things more than simplify them.

I will test your Makefile, and I will also ask the guys at TRIUMF to do so. Once we conclude that it works fine, we can replace the original Makefile in the distribution.
