It appears that the latest version of MIDAS malfunctions on PowerPC/VxWorks
machines, below are two problem reports. As reported, previous versions of MIDAS
work fine, I guess that reduces the probability of it being buggy user code. At
least one of the problems feels like a missing endian conversion somewhere, but
I am not aware of any recent changes in the MIDAS RPC code... We will be trying
to debug both problems, but any insight would be greatly appreciated.
K.O.
From suz@triumf.ca Tue May 30 16:58:16 2006
Date: Tue, 30 May 2006 16:58:16 -0700 (PDT)
From: Suzannah Daviel <suz@triumf.ca>
To: konstantin olchanski <olchansk@triumf.ca>
Subject: rpc problems
Hi Konstantin,
Herewith a description of the problems,
Suzannah
Problem on system A:
--------------------
After upgrading the Linux operating system from RH9 to SL4, and installing
latest Midas software, the first time a manual trigger is issued, the VxWorks
frontend (running
on a PPC) crashes:
Output on PPC consol:
trigger histo event from status page
rpc_client_accept: starting with sock:11
program
Exception current instruction address: 0x01ac7388
Machine Status Register: 0x0008b030
Condition Register: 0x24000082
Task: 0x1b47908 "mfe"
The histo event is usually large so is fragmented. It is sent out by a
manual trigger and at end of run. When the run is ended (before an event
request using a manual trigger so program has not yet crashed) the histo
event is sent successfully.
After returning to the previous version of Midas but still running SL4,
this problem disappeared.
Problem on system B:
--------------------
Again, SL9 was installed, and the Midas software updated to the latest.
When sending a periodic (non-fragmented) event, after a while, one of the
parameters appears to become corrupted, and a lot of rpc_call error
messages appear. These continue while data is still successfully sent out
until the run is ended.
Tue May 9 05:20:29 2006 [Mdarc] *** data saved in file
/is01_data/bnmr/dlog/2006/040377.msr_v5 at Tue May 9 05:20:29
2006 (SN=5) ***
Tue May 9 05:21:30 2006 [Mdarc] *** data saved in file
/is01_data/bnmr/dlog/2006/040377.msr_v6 at Tue May 9 05:21:30
2006 (SN=6) ***
Tue May 9 05:22:31 2006 [Mdarc] *** data saved in file
/is01_data/bnmr/dlog/2006/040377.msr_v7 at Tue May 9 05:22:31
2006 (SN=7) ***
Tue May 9 05:23:12 2006 [feBNMR] [midas.c:9325:rpc_call] parameters
(1099059848) too large for network buffer
(524344); param_size=1099059808
Tue May 9 05:23:12 2006 [feBNMR] [midas.c:9325:rpc_call] parameters
(1099059848) too large for network buffer
(524344); param_size=1099059808
........................................
Tue May 9 05:23:31 2006 [feBNMR] [midas.c:9325:rpc_call] parameters
(1099059848) too large for network buffer
(524344); param_size=1099059808
Tue May 9 05:23:32 2006 [feBNMR] [midas.c:9325:rpc_call] parameters
(1099059848) too large for network buffer
(524344); param_size=1099059808
Tue May 9 05:23:32 2006 [Mdarc] *** data saved in file
/is01_data/bnmr/dlog/2006/040377.msr_v8 at Tue May 9 05:23:32
2006 (SN=8) ***
Tue May 9 05:23:32 2006 [feBNMR] [midas.c:9325:rpc_call] parameters
(1099059848) too large for network buffer
(524344); param_size=1099059808
Tue May 9 05:23:33 2006 [feBNMR] [midas.c:9325:rpc_call] parameters
(1099059848) too large for network buffer
(524344); param_size=1099059808
etc.
Another example showing that the corrupted parameter varies in size:
Thu Apr 13 19:00:00 2006 [mhttpd] Run #30005 started
Thu Apr 13 19:00:08 2006 [Mdarc] *** Saved data file
/is01_data/bnmr/dlog/2006/030005.msr_v1 at Thu Apr 13 19:00:08 2006 ***
Thu Apr 13 19:01:10 2006 [Mdarc] *** Saved data file
/is01_data/bnmr/dlog/2006/030005.msr_v2 at Thu Apr 13 19:01:10 2006 ***
Thu Apr 13 19:02:14 2006 [Mdarc] *** Saved data file
/is01_data/bnmr/dlog/2006/030005.msr_v3 at Thu Apr 13 19:02:14 2006 ***
Thu Apr 13 19:03:20 2006 [Mdarc] *** Saved data file
/is01_data/bnmr/dlog/2006/030005.msr_v4 at Thu Apr 13 19:03:20 2006 ***
Thu Apr 13 19:04:22 2006 [Mdarc] *** Saved data file
/is01_data/bnmr/dlog/2006/030005.msr_v5 at Thu Apr 13 19:04:22 2006 ***
Thu Apr 13 19:05:12 2006 [feBNMR] [midas.c:9323:rpc_call] parameters
(1077739560) too large for network buffer
(524344)
Thu Apr 13 19:05:13 2006 [feBNMR] [midas.c:9323:rpc_call] parameters
(1077739560) too large for network buffer
(524344)
etc. |