Back Midas Rome Roody Rootana
  Midas DAQ System, Page 56 of 146  Not logged in ELOG logo
Entry  14 Feb 2024, Konstantin Olchanski, Info, bitbucket permissions 
I pushed some buttons in bitbucket user groups and permissions to make it happy 
wrt recent changes.

The intended configuration is this:

- two user groups: admins and developers
- admins has full control over the workspace, project and repositories ("Admin" 
permission)
- developers have push permission for all repositories (not the "create 
repository" permission, this is limited to admins) ("Write" permission).
- there seems to be a quirk, admins also need to be in the developers group or 
some things do not work (like "run pipeline", which set me off into doing all 
this).
- admins "Admin" permission is set at the "workspace" level and is inherited 
down to project and repository level.
- developers "Write" permission is set at the "project" level and is inherited 
down to repository level.
- individual repositories in the "MIDAS" project also seem to have explicit 
(non-inhertited) permissions, I think this is redundant and I will probably 
remove them at some point (not today).

K.O.
Entry  18 Jan 2019, Konstantin Olchanski, Info, bitbucket issue tracker "feature" 
It turns out the bitbucket issue tracker has a feature - I cannot make it automatically add me 
to the watcher list of all new issues.

So when you create a new issue, I think I get one message about it, and no more.

If there is some other activity on the issue (Stefan answers, Thomas answers),
I most definitely will not see any of it.

So in case you wondered why I sometimes completely do not react to some bug reports, this 
is why.

So much for "advanced" and "highly automated" and "intelligent" bug tracking.

K.O.
    Reply  14 Mar 2019, Konstantin Olchanski, Info, bitbucket issue tracker "feature" 
> It turns out the bitbucket issue tracker has a feature - I cannot make it automatically add me 
> to the watcher list of all new issues.
> 
> So when you create a new issue, I think I get one message about it, and no more.
> 

I retested this and I confirm that for new issues I am not a watcher "by default". If I reply to an issue,
the system enters an inconsistent state - it thinks I am a watcher, the best I can tell, but it also
says "0 watchers".

K.O.
Entry  16 Mar 2023, Konstantin Olchanski, Forum, bitbucket issue spam cleaned 
midas bitbucket repository had a spam attack, about 40 spam messages were posted 
into the issues. I was able to delete them manually. No idea how they got past 
bitbucket spam filters and if they are spam or an attack against automated issue 
tracker tools or an attack against the repo owner (who is vulnerable as they rush 
in to deal with the spam). if this happens again, "anonymous issues" may have to 
be disabled, bitbucket login required. K.O.
    Reply  16 Mar 2023, Konstantin Olchanski, Forum, bitbucket issue spam cleaned 
> midas bitbucket repository had a spam attack, about 40 spam messages were posted 
> into the issues. I was able to delete them manually. No idea how they got past 
> bitbucket spam filters and if they are spam or an attack against automated issue 
> tracker tools or an attack against the repo owner (who is vulnerable as they rush 
> in to deal with the spam). if this happens again, "anonymous issues" may have to 
> be disabled, bitbucket login required. K.O.

Two more spam messages, deleted. "Anonymous users can create issues" is now turned off.

K.O.
    Reply  16 Mar 2023, Konstantin Olchanski, Forum, bitbucket issue spam cleaned 
> > midas bitbucket repository had a spam attack, about 40 spam messages were posted 
> > into the issues. I was able to delete them manually. No idea how they got past 
> > bitbucket spam filters and if they are spam or an attack against automated issue 
> > tracker tools or an attack against the repo owner (who is vulnerable as they rush 
> > in to deal with the spam). if this happens again, "anonymous issues" may have to 
> > be disabled, bitbucket login required. K.O.
> 
> Two more spam messages, deleted. "Anonymous users can create issues" is now turned off.

Also, same for: rootana.
Also, empty issue trackers disabled: mvodb, midasio, mscb.

K.O.
Entry  12 Sep 2024, Konstantin Olchanski, Bug Fix, bitbucket builds repaired 
bitbucket builds work again, also added ubuntu-24 and almalinux-9.

two problems fixed:
- cmake file in examples/experiment was replaced by a non-working version
- unannounced change of strlcpy() to mstrlcpy() broke "make remoteonly"

P.S. I should also fix the rootana and the roody bitbucket builds.

K.O.
Entry  02 Jun 2021, Konstantin Olchanski, Info, bitbucket build truncated 
I truncated the bitbucket build to only build on ubuntu LTS 20.04.

Somehow all the other build targets - centos-7, centos-8, ubuntu-18 - have
an obsolete version of cmake. I do not know where the bitbucket os images
get these obsolete versions of cmake - my centos-7 and centos-8 have much
more recent versions of cmake.

If somebody has time to figure it out, please go at it, I would like very
much to have centos-7 and centos-8 builds restored (with ROOT), also
to have a ubuntu LTS 20.04 build with ROOT. (For me, debugging bitbucket
builds is extremely time consuming).

Right now many midas cmake files require cmake 3.12 (released in late 2018).

I do not know why that particular version of cmake (I took the number
from the tutorials I used).

I do not know what is the actual version of cmake that MIDAS (and ROOTANA) 
require/depend on.

I wish there were a tool that would look at a cmake file, examine all the 
features it uses and report the lowest version of cmake that supports them.

K.O.
    Reply  10 Jun 2019, Konstantin Olchanski, Release, bin and lib symlinks, mxml-2019-03-a, midas-2019-03-h 
> > > > the midas release 2019-03 is ready for general use.

The latest version of MIDAS puts libraries and executables in $MIDASSYS/lib and bin (the "linux" part of pathname is removed).

Some packages (rootana) have been already changed to use this new scheme and they will not build against older versions of midas. 
I recommend that you create following symlinks to make old versions of midas compatible with the new scheme:

cd $MIDASSYS # (~/packages/midas)
ln -s linux/bin .
ln -s linux/lib .

K.O.
    Reply  11 Jun 2019, Stefan Ritt, Release, bin and lib symlinks, mxml-2019-03-a, midas-2019-03-h 
> The latest version of MIDAS puts libraries and executables in $MIDASSYS/lib and bin (the "linux" part of pathname is removed).
> 
> Some packages (rootana) have been already changed to use this new scheme and they will not build against older versions of midas. 
> I recommend that you create following symlinks to make old versions of midas compatible with the new scheme:
> 
> cd $MIDASSYS # (~/packages/midas)
> ln -s linux/bin .
> ln -s linux/lib .

If i'm not mistaken the proper commands are

cd $MIDASSYS
ln -s ../bin linux/bin
ln -s ../lib linux/lib

Alternatively, you can change your PATH to point to $MIDASSYS/bin instead of $MIDASSYS/linux/bin and link against $MIDASSYS/lib instead of 
$MIDASSYS/linux/lib

Stefan
    Reply  17 Jun 2019, Konstantin Olchanski, Release, bin and lib symlinks, mxml-2019-03-a, midas-2019-03-h 
> 
> If i'm not mistaken the proper commands are
> 
> cd $MIDASSYS
> mkdir linux
> ln -s ../bin linux/bin
> ln -s ../lib linux/lib
> 

This is for making the new midas look like the old midas. My instructions were for making the old midas looking like the new midas.

Old midas:
packages/midas/linux/bin, linux/lib with symlinks for
packages/midas/bin -> linux/bin, etc

New midas:
packages/midas/bin, lib with symlinks for
packages/midas/linux/bin -> ../bin, etc.

K.O.
Entry  30 May 2006, Konstantin Olchanski, Bug Report, badness with vxworks/ppc 
It appears that the latest version of MIDAS malfunctions on PowerPC/VxWorks
machines, below are two problem reports. As reported, previous versions of MIDAS
work fine, I guess that reduces the probability of it being buggy user code. At
least one of the problems feels like a missing endian conversion somewhere, but
I am not aware of any recent changes in the MIDAS RPC code... We will be trying
to debug both problems, but any insight would be greatly appreciated.

K.O.


From suz@triumf.ca  Tue May 30 16:58:16 2006
Date: Tue, 30 May 2006 16:58:16 -0700 (PDT)
From: Suzannah Daviel <suz@triumf.ca>
To: konstantin olchanski <olchansk@triumf.ca>
Subject: rpc problems

Hi Konstantin,

Herewith a description of the problems,

Suzannah

Problem on system A:
--------------------

After upgrading the Linux operating system from RH9 to SL4, and installing
latest Midas software, the first time a manual trigger is issued, the VxWorks
frontend (running
on a PPC) crashes:


Output on PPC consol:

trigger histo event from status page

rpc_client_accept: starting with sock:11

program
Exception current instruction address: 0x01ac7388
Machine Status Register: 0x0008b030
Condition Register: 0x24000082
Task: 0x1b47908 "mfe"



The histo event is usually large so is fragmented. It is sent out by a
manual trigger and at end of run. When the run is ended (before an event
request using a manual trigger so program has not yet crashed) the histo
event is sent successfully.

After returning to the previous version of Midas but still running SL4,
this problem disappeared.




Problem on system B:
--------------------

Again, SL9 was installed, and the Midas software updated to the latest.
When sending a periodic (non-fragmented) event, after a while, one of the
parameters appears to become corrupted, and a lot of rpc_call error
messages appear. These continue while data is still successfully sent out
until the run is ended.



Tue May  9 05:20:29 2006 [Mdarc] *** data saved in file
/is01_data/bnmr/dlog/2006/040377.msr_v5 at Tue May  9 05:20:29
2006 (SN=5) ***
Tue May  9 05:21:30 2006 [Mdarc] *** data saved in file
/is01_data/bnmr/dlog/2006/040377.msr_v6 at Tue May  9 05:21:30
2006 (SN=6) ***
Tue May  9 05:22:31 2006 [Mdarc] *** data saved in file
/is01_data/bnmr/dlog/2006/040377.msr_v7 at Tue May  9 05:22:31
2006 (SN=7) ***

Tue May  9 05:23:12 2006 [feBNMR] [midas.c:9325:rpc_call] parameters
(1099059848) too large for network buffer
(524344); param_size=1099059808
Tue May  9 05:23:12 2006 [feBNMR] [midas.c:9325:rpc_call] parameters
(1099059848) too large for network buffer
(524344); param_size=1099059808
........................................
Tue May  9 05:23:31 2006 [feBNMR] [midas.c:9325:rpc_call] parameters
(1099059848) too large for network buffer
(524344); param_size=1099059808
Tue May  9 05:23:32 2006 [feBNMR] [midas.c:9325:rpc_call] parameters
(1099059848) too large for network buffer
(524344); param_size=1099059808

Tue May  9 05:23:32 2006 [Mdarc] *** data saved in file
/is01_data/bnmr/dlog/2006/040377.msr_v8 at Tue May  9 05:23:32
2006 (SN=8) ***

Tue May  9 05:23:32 2006 [feBNMR] [midas.c:9325:rpc_call] parameters
(1099059848) too large for network buffer
(524344); param_size=1099059808
Tue May  9 05:23:33 2006 [feBNMR] [midas.c:9325:rpc_call] parameters
(1099059848) too large for network buffer
(524344); param_size=1099059808
etc.

Another example showing that the corrupted parameter varies in size:

Thu Apr 13 19:00:00 2006 [mhttpd] Run #30005 started
Thu Apr 13 19:00:08 2006 [Mdarc] *** Saved data file
/is01_data/bnmr/dlog/2006/030005.msr_v1 at Thu Apr 13 19:00:08 2006 ***
Thu Apr 13 19:01:10 2006 [Mdarc] *** Saved data file
/is01_data/bnmr/dlog/2006/030005.msr_v2 at Thu Apr 13 19:01:10 2006 ***
Thu Apr 13 19:02:14 2006 [Mdarc] *** Saved data file
/is01_data/bnmr/dlog/2006/030005.msr_v3 at Thu Apr 13 19:02:14 2006 ***
Thu Apr 13 19:03:20 2006 [Mdarc] *** Saved data file
/is01_data/bnmr/dlog/2006/030005.msr_v4 at Thu Apr 13 19:03:20 2006 ***
Thu Apr 13 19:04:22 2006 [Mdarc] *** Saved data file
/is01_data/bnmr/dlog/2006/030005.msr_v5 at Thu Apr 13 19:04:22 2006 ***
Thu Apr 13 19:05:12 2006 [feBNMR] [midas.c:9323:rpc_call] parameters
(1077739560) too large for network buffer
(524344)
Thu Apr 13 19:05:13 2006 [feBNMR] [midas.c:9323:rpc_call] parameters
(1077739560) too large for network buffer
(524344)
etc.
Entry  17 Mar 2017, Pierre Gorel, Bug Report, badly managed case in history_schema.cxx: dat file empty 
For an unknown reason, Logger died few days ago while writing the history. The
file mhf_1489577446_20170315_system.dat was created, but was empty.

When trying to restart Logger, I would get a seg fault without any special error
message.

I tracked the issue to the "read_file_schema" function in history_schema.cxx

* L4731, a pointer to HsFileSchema *s is declared.
* L4747, We enter a while(1) loop.
* L4749, get char on the filename.
In our case, the file was empty, so the variable "b" gets NULL and the loop breaks.

Problem: the memory allocation for "s" is later in the loop, L4768.
Upon exiting the loop, L4854, we try to access record_size on a NULL pointer ==>
SegFault. 

It would be nice to at least have a message before breaking the loop...
    Reply  15 Apr 2017, Konstantin Olchanski, Bug Report, badly managed case in history_schema.cxx: dat file empty 
> For an unknown reason, Logger died few days ago while writing the history. The
> file mhf_1489577446_20170315_system.dat was created, but was empty.

I ran into same problem installing new midas in the alpha experiment at cern. It should be fixed now:
https://bitbucket.org/tmidas/midas/commits/788021d9cb39a348a40e36f1b35b1440e06aa744

K.O.

> 
> When trying to restart Logger, I would get a seg fault without any special error
> message.
> 
> I tracked the issue to the "read_file_schema" function in history_schema.cxx
> 
> * L4731, a pointer to HsFileSchema *s is declared.
> * L4747, We enter a while(1) loop.
> * L4749, get char on the filename.
> In our case, the file was empty, so the variable "b" gets NULL and the loop breaks.
> 
> Problem: the memory allocation for "s" is later in the loop, L4768.
> Upon exiting the loop, L4854, we try to access record_size on a NULL pointer ==>
> SegFault. 
> 
> It would be nice to at least have a message before breaking the loop...
Entry  30 Nov 2003, Konstantin Olchanski, , bad call to cm_cleanup() in fal.c 
fal.c does not compile: it calls cm_cleanup() with one argument when there
should be two arguments. K.O.
    Reply  30 Nov 2003, Stefan Ritt, , bad call to cm_cleanup() in fal.c 
> fal.c does not compile: it calls cm_cleanup() with one argument when there
> should be two arguments. K.O.

Fixed and committed.
Entry  22 Oct 2013, Konstantin Olchanski, Info, audit of db_get_record() 
Record-oriented ODB functions db_create_record(), db_get_record(), db_check_record() and 
db_set_record() require special attention to the consistency between their "C struct"s (usually defined in 
midas.h), their initialization strings (usually defined in midas.h) and the contents of ODB.

When these 3 items become inconsistent, the corresponding midas functions tend to break.

Unlike ODB internal structures and event buffer internal structures, these record-oriented functions are 
not part of the midas binary-compatibility abi and they are not protected by db_validate_sizes().

From time to time, new items are added to some of these data structures. Usually this does not cause 
problems, but recently we had some difficulty with the runinfo and equipment structures, prompting this 
audit.

db_check_record: note: (C) means that this record is created there

alarm.c: alarm_odb_str(C)
mana.c: skipped
mfe.c: equipment_common_str, equipment_statistics_str(C), event_descrip(C), bank_list(C)
mhttpd.cxx: cgif_label_str(C), cgif_bar_str(C), runinfo_str(C), equipment_common_str(C)
mlogger.cxx: ch_settings_str(C)
sequencer.cxx: sequencer_str(C)

db_create_record:

alarm.c: alarm_odb_str, alarm_periodic_str, alarm_class_str
fal.c: skipped
mfe.c: equipment_common_str
midas.c: program_info_str (maybe)
odb.c: (maybe)
lazylogger.cxx: lazy_settings, lazy_statistics
mhttpd.cxx: runinfo_str
mlogger.cxx: chn_settings_str

db_get_record: (hard to do with grep, will have to check every db_get_record by hand)

alarm.c: alarm, class, program_info
fal.c: skipped
mana.c: skipped
midas.c: program_info
odb.c: (maybe)
lazylogger.cxx: lazyst
mhttpd.cxx: runinfo, equipment, ?hkeytemp?, chn_settings, chn_stats, ?label?, ?bar?
mlogger.cxx: ?, ?, chn_stats, chn, settings
sequencer.cxx: hkeyseq

db_set_record:

alarm.c: hkeyalarm, hkeyclass, ???, program_info
fal.c: skipped
mana.c: skipped
mfe.c: equipment_info, ?event structure?
odb.c: (maybe)
lazelogger.cxx: lazyst
mlogger.cxx: chn_stat
sequencer.cxx: seq

db_open_record: note: (W) means MODE_WRITE

fal.c: skipped
mana.c: skipped
mfe.c: equipment_info, equipment_stats(W)
midas.c: requested_transition
odbedit.c: key_update - generic test of hotlink
lazylogger.cxx: runstate, lazyst(W), lazy?
mlogger.cxx: history, chn_statistics, chn_settings
sequencer.cxx: seq

K.O.
Entry  16 Mar 2019, Gennaro Tortone, Forum, assertion failed 
Hi,
I'm developing a Slow Control equipment on a Linux board that send data on a remote server
running 'mserver'; the build goes fine, but  when I run the executable it seems that an assertion in 
midas.c failed:

[dfe01,INFO] Slow control equipment initialized
dfe: src/midas.c:838: cm_msg_flush_buffer: Assertion `rp[3]=='_'' failed.

if I remove line 838 from midas.c (fixing message length) the problem disappear...

Regards,
Gennaro
    Reply  18 Mar 2019, Konstantin Olchanski, Forum, assertion failed 
> [dfe01,INFO] Slow control equipment initialized
> dfe: src/midas.c:838: cm_msg_flush_buffer: Assertion `rp[3]=='_'' failed.
> if I remove line 838 from midas.c (fixing message length) the problem disappear...

Thank you for reporting this problem.

It is very strange, the check is for message start "MSG_", why "M", "S" and "G" are there
but "_" is missing? And you remove the check for "_" and the rest of the message is also okey?
Very odd.

I look at the code in midas.c and I also see is that the ring buffer has no protection against
overflow, it is created for max message length of around 1000 bytes, but I look at the code 
that feeds messages into it (cm_msg_format()) and it also has a buffer overrun
possibility (sprintf() instead snprintf()). Actually I just ran into this buffer overrun
myself when adding memory leak debugging code into mhttpd.

Anyhow, before this message that causes the crash, maybe perchance you have a long
message before it? Of length 1000 bytes of longer? (1000 bytes is 12 lines of 80 chars).

Or this is the very first message you generate? (other than the normal messages generated by mfe.c?)

Is there anything in midas.log around the time of the crash? (post it here or email it to me?) Also I would
like to see everything printed by your frontend from the start time all the way to the crash.

You can also add this code "assert(4+3*sizeof(int)+len < 1020)" in cm_msg_buffer() right before
rb_increment_wp() - it this assert fails, we definitely determine that we have a buffer overflow.

Thank in advance,
K.O.
    Reply  19 Mar 2019, Gennaro Tortone, Forum, assertion failed 
> > [dfe01,INFO] Slow control equipment initialized
> > dfe: src/midas.c:838: cm_msg_flush_buffer: Assertion `rp[3]=='_'' failed.
> > if I remove line 838 from midas.c (fixing message length) the problem disappear...
> 
> Thank you for reporting this problem.
> 
> It is very strange, the check is for message start "MSG_", why "M", "S" and "G" are there
> but "_" is missing? And you remove the check for "_" and the rest of the message is also okey?
> Very odd.

if I remove the check for "_" then the first message is empty and next messages are ok...
If I don't remove the check the frontend fails at start and I find these lines in midas.log:

14:46:29.719 2019/03/19 [dfe01,INFO] Program dfe01 on host lxaria02 started
14:46:29.731 2019/03/19 [dfe01,INFO] Dome FE initialized
14:46:29.737 2019/03/19 [dfe01,ERROR] [system.c:4709:recv_tcp2,ERROR] unexpected connection closure
14:46:29.737 2019/03/19 [dfe01,ERROR] [midas.c:12814:recv_event_server,ERROR] recv_tcp2(header) returned -1
14:46:29.737 2019/03/19 [dfe01,ERROR] [midas.c:14699:rpc_server_receive,ERROR] recv_event_server() returned -1, abort
14:46:29.737 2019/03/19 [dfe01,TALK] Program 'dfe01' on host 'lxaria02' aborted
 
> You can also add this code "assert(4+3*sizeof(int)+len < 1020)" in cm_msg_buffer() right before
> rb_increment_wp() - it this assert fails, we definitely determine that we have a buffer overflow.

I added assert you suggested in cm_msg_buffer() function before rp_increment_wp() and 
result is always the same at same line:

[dfe01,INFO] Dome FE initialized
Dome0001-rc:
[dfe01,INFO] Slow control equipment initialized
dfe: src/midas.c:839: cm_msg_flush_buffer: Assertion `rp[3]=='_'' failed.
Aborted

Regards,
Gennaro
ELOG V3.1.4-2e1708b5