19 Feb 2017, NguyenMinhTruong, Bug Report, increase event buffer size
|
I am sorry for my late reply.
The memory in my PC is 16 GB.
I checked the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM".
But there are no buffer sizes in "/Experiment".
After running "ipcrm -M 0x4d040761 size0x204a3c", removing .SYSTEM.SHM and running MIDAS again, I still get the error "Shared memory segment with key 0x4d040761 already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c"
M.T |
20 Feb 2017, Konstantin Olchanski, Bug Report, increase event buffer size
|
> memory in my PC is 16 GB
You can safely go to buffer size 100 Mbytes or more.
> I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM".
Good.
> But there is no buffer sizes in "/Experiment"
This is strange. How old is your midas? What does it say on the "help" page in "Revision"?
> After run "ipcrm -M 0x4d040761 size0x204a3c"
This command is wrong. It probably gave you an error instead of removing the shared memory, that's why
nothing worked afterwards.
My copy of system.c reads this:
cm_msg(MERROR, "ss_shm_open", "Shared memory segment with key 0x%x already exists, please remove it manually: ipcrm -M 0x%x", key, key);
Note how there is no text "size0x..." in my copy? What does your copy say? Did somebody change it?
> remove .SYSTEM.SHM and run MIDAS again, I still get error "Shared memory segment
> with key 0x4d040761 already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" M.T
Yes, that's because the ipcrm command is wrong and did not work,
it should read "ipcrm -M 0x4d040761" without the spurious "size..." text.
K.O. |
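For reference, what "ipcrm -M <key>" does can be sketched with the System V calls directly. This is only a minimal sketch, not MIDAS code: the helper names sysv_segment_exists(), sysv_segment_remove() and ipcrm_command() are hypothetical, and the only thing taken from the thread is the "ipcrm -M 0x%x" format of the cm_msg() call quoted above.

```cpp
#include <cerrno>
#include <cstdio>
#include <string>
#include <sys/ipc.h>
#include <sys/shm.h>

// Hypothetical helper: true if a SysV shared memory segment with this key exists.
bool sysv_segment_exists(key_t key)
{
   // size 0 and no IPC_CREAT: only look up an existing segment.
   // EACCES means the segment exists but we may not access it.
   return shmget(key, 0, 0) != -1 || errno == EACCES;
}

// Hypothetical helper with the same intent as "ipcrm -M <key>".
// Returns true if the segment was found and marked for removal.
bool sysv_segment_remove(key_t key)
{
   int shmid = shmget(key, 0, 0);
   if (shmid == -1)
      return false; // no such segment, or no permission
   // IPC_RMID marks the segment for destruction once the last process detaches.
   return shmctl(shmid, IPC_RMID, nullptr) == 0;
}

// Builds the command text with the same format string as the cm_msg() call:
// the key only, with no trailing "size..." argument.
std::string ipcrm_command(unsigned key)
{
   char buf[64];
   std::snprintf(buf, sizeof(buf), "ipcrm -M 0x%x", key);
   return buf;
}
```

With the key from the error message, ipcrm_command(0x4d040761) gives the command without the spurious "size..." text.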
20 Feb 2017, Konstantin Olchanski, Bug Report, increase event buffer size
|
> > memory in my PC is 16 GB
>
> You can safely go to buffer size 100 Mbytes or more.
>
> > I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM".
>
> Good.
No, wait, this is all wrong. If it says POSIX shared memory, how come it later
complains about SYSV shared memory and tells you to run SYSV shared memory
commands like ipcrm?!?
> > But there is no buffer sizes in "/Experiment"
Now this kind of makes sense - you are probably running a strange mixture
of very old and recent MIDAS. Probably your current version is so old
that it does not use .SHM_TYPE.TXT and can only do SYSV shared memory,
and so old that it does not have "/Experiment/buffer sizes".
But at some point you must have run a recent version of midas, or you would
not have the file .SHM_TYPE.TXT in your experiment directory.
I say:
a) run the correct ipcrm command (without the spurious "size..." text)
b) review your computer contents to identify all the versions of midas
and to make sure you are using the midas you want to use (old or new,
whatever), but not some wrong version by accident (incorrect PATH setting, etc)
As MIDAS developers, we usually recommend that you use the latest version of MIDAS;
the latest version is certainly simpler to debug.
K.O. |
14 Mar 2017, Andreas Suter, Bug Report, mhttpd - /Experiment/Menu Buttons - git-sha a350e8db11
|
I think a little bug has sneaked into mhttpd: when starting an experiment
from scratch and starting mhttpd, the Menu Buttons are missing and,
correctly, I get periodic error messages. I expected that the default ODB entry
for the Menu Buttons is created if it doesn't exist. As far as I can see, this happens
now because the default creation of the 'Menu Buttons' is now tagged as an obsolete
feature. In case this is not a bug but a feature, it should be documented. |
14 Mar 2017, Konstantin Olchanski, Bug Report, mhttpd - /Experiment/Menu Buttons - git-sha a350e8db11
|
> I think there sneaked in a little bug in the mhttpd: when starting an experiment
> from scratch and starting the mhttpd, the Menu Buttons are missing and,
> correctly, I get periodic error messages. I expected that the default ODB entry
> for the Menu Buttons is create if it doesn't exist. As far as I see this happens
> now since the default creation of the 'Menu Buttons' is now tag as an obsolete
> feature. In case this is not a bug but a feature, it should documented.
I think you are right. Will fix.
K.O. |
16 Mar 2017, Konstantin Olchanski, Bug Report, Replaced with /experiment/menu, mhttpd - /Experiment/Menu Buttons - git-sha a350e8db11
|
> > I think there sneaked in a little bug in the mhttpd: when starting an experiment
> > from scratch and starting the mhttpd, the Menu Buttons are missing
Ok, the original problem was a small bug in the javascript code for the menu buttons (fixed now),
but I was moved to implement something I wanted to do for a long time.
The menu configuration is now done through a subdirectory /experiment/menu. Each entry corresponds to
one menu button. Set to "y" to show it, set to "n" to hide it.
Buttons are displayed in the same order as they are in ODB. To change the order of buttons,
change their order in ODB (odbedit command "move").
This fixes the long standing problem with adding new midas pages - they were not automatically added to
the existing "menu buttons" lists. So for example when the "chat" page was added, I did not know about it
for a long time (and some people still do not know about its existence) because it was not included in
my "/experiment/menu buttons" list in all my already existing experiments. When the "start" and
"transition" pages were added, probably nobody knew that they existed.
Now new buttons for new pages are automatically added to the list (via mhttpd.cxx::init_menu_buttons()),
and the users have the option to hide them by setting their values to "n".
K.O. |
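The scheme described above (one y/n entry per button, ODB order preserved, new buttons added automatically) can be sketched as follows. This is a toy model, not the code of mhttpd.cxx::init_menu_buttons(): the Menu type and both function names are hypothetical, and the assumption that a newly added button starts out shown is mine.

```cpp
#include <string>
#include <utility>
#include <vector>

// Ordered list of (button name, shown?) pairs, standing in for /experiment/menu.
using Menu = std::vector<std::pair<std::string, bool>>;

// Appends every known button that is missing from the menu, shown by default
// (assumption). Existing entries keep their position and their y/n value.
void add_missing_buttons(Menu& menu, const std::vector<std::string>& known)
{
   for (const std::string& name : known) {
      bool found = false;
      for (const auto& entry : menu) {
         if (entry.first == name) {
            found = true;
            break;
         }
      }
      if (!found)
         menu.emplace_back(name, true);
   }
}

// Returns the names to display, in stored order, skipping entries set to "n".
std::vector<std::string> visible_buttons(const Menu& menu)
{
   std::vector<std::string> out;
   for (const auto& entry : menu) {
      if (entry.second)
         out.push_back(entry.first);
   }
   return out;
}
```

A button the user has hidden stays hidden when the list of known buttons grows, which is the point of keeping per-button values instead of one fixed list.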
16 Mar 2017, Thomas Lindner, Bug Report, Replaced with /experiment/menu, mhttpd - /Experiment/Menu Buttons - git-sha a350e8db11
|
> > > I think there sneaked in a little bug in the mhttpd: when starting an experiment
> > > from scratch and starting the mhttpd, the Menu Buttons are missing
>
> Ok, the original problem with a small bug in the javascript code for the menu buttons (fixed now),
> but I was moved to implement something I wanted to do for a long time.
>
Is this change backwards compatible with an old ODB? I.e., if I upgrade MIDAS, will it notice that I have the old-style key "/Experiment/Menu Buttons"
and replace it with equivalently set keys in /Experiment/Menu? Or will it just continue to use the old-style ODB key? |
17 Mar 2017, Pierre Gorel, Bug Report, badly managed case in history_schema.cxx: dat file empty
|
For an unknown reason, Logger died few days ago while writing the history. The
file mhf_1489577446_20170315_system.dat was created, but was empty.
When trying to restart Logger, I would get a seg fault without any special error
message.
I tracked the issue to the "read_file_schema" function in history_schema.cxx
* L4731, a pointer to HsFileSchema *s is declared.
* L4747, We enter a while(1) loop.
* L4749, get char on the filename.
In our case, the file was empty, so the variable "b" gets NULL and the loop breaks.
Problem: the memory allocation for "s" is later in the loop, L4768.
Upon exiting the loop, L4854, we try to access record_size on a NULL pointer ==>
SegFault.
It would be nice to at least have a message before breaking the loop... |
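The failure mode described above, and the kind of guard that avoids it, can be sketched like this. It is not the actual read_file_schema() code: FileSchemaSketch is a simplified stand-in for HsFileSchema, read_schema_sketch() is a hypothetical name, and the "record_size: N" header line is an assumed format used only for illustration.

```cpp
#include <cstdio>
#include <cstdlib>
#include <istream>
#include <memory>
#include <sstream>
#include <string>

// Simplified stand-in for HsFileSchema.
struct FileSchemaSketch {
   std::size_t record_size = 0;
};

// Reads header lines up to the first blank line. Returns nullptr and prints a
// message when the stream is empty, instead of dereferencing a null pointer.
std::unique_ptr<FileSchemaSketch> read_schema_sketch(std::istream& in)
{
   std::unique_ptr<FileSchemaSketch> s; // null until the first line is read
   std::string line;
   while (std::getline(in, line)) {
      if (line.empty())
         break; // end of header
      if (!s)
         s.reset(new FileSchemaSketch); // allocation happens inside the loop
      if (line.rfind("record_size: ", 0) == 0)
         s->record_size = std::strtoul(line.c_str() + 13, nullptr, 10);
   }
   if (!s) {
      // Empty file: the loop body never ran, so there is nothing to return.
      std::fprintf(stderr, "read_schema_sketch: file is empty, no schema read\n");
      return nullptr;
   }
   return s;
}
```

The caller then has to check for nullptr before touching record_size, which is the check that was missing after the loop.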
28 Mar 2017, Konstantin Olchanski, Bug Report, Replaced with /experiment/menu, mhttpd - /Experiment/Menu Buttons - git-sha a350e8db11
|
> > > > I think there sneaked in a little bug in the mhttpd: when starting an experiment
> > > > from scratch and starting the mhttpd, the Menu Buttons are missing
> >
> > Ok, the original problem with a small bug in the javascript code for the menu buttons (fixed now),
> > but I was moved to implement something I wanted to do for a long time.
> >
>
> Is this change back-wards compatible with an old ODB? Ie, if I upgrade MIDAS, will it notice that I have the old-style key "/Experiment/Menu Buttons"
> and replace it equivalently set keys in /Experiment/Menu? Or will it just continue to use the old-style ODB key?
I am trying to keep some compatibility between the web pages and mhttpd. I think in most cases, old mhttpd should continue to work
against new web pages (assuming matching mhttpd.js & co). But old web pages would probably break against new mhttpd, mostly due
to the rapid pace of their development.
Anyhow, the midas web page forms menu buttons in this order:
/Experiment/Menu, if it does not exist, then:
/Experiment/menu buttons, if it does not exist, then
built in list of menu buttons, which includes all possible buttons, hardcoded in mhttpd.js.
In cooperation with mhttpd: new mhttpd
- will automatically create the tree /experiment/menu with all buttons enabled
- will complain about the existence of /experiment/menu buttons, instruct user to delete it.
So to answer the question:
after git pull, make, restart mhttpd, you will see all possible menu buttons and you will have to go
into the odb editor to disable the buttons you do not want to see (i.e. the mscb button).
I did it this way on purpose, to give old-time midas users an opportunity to discover
some of the newly added buttons and pages, like the "chat" page, or the "example" page. If I migrated
the existing "menu buttons" verbatim, to the new tree, I would not even today know
that the "chat" page exists (I do not think it was ever announced or described on this forum
or anywhere in the documentation).
K.O. |
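The lookup order described above can be sketched as a simple fallback chain. This is a toy model and not the code in mhttpd.js: FakeOdb is a hypothetical stand-in for the ODB, resolve_menu() is a hypothetical name, and the built-in list is a placeholder, not the real hardcoded list.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for the ODB: path -> list of button names. Not the MIDAS API.
using FakeOdb = std::map<std::string, std::vector<std::string>>;

// Returns the buttons from the first source that exists:
// 1) /Experiment/Menu, 2) /Experiment/Menu Buttons, 3) the built-in list.
std::vector<std::string> resolve_menu(const FakeOdb& odb)
{
   // Placeholder for the list hardcoded in mhttpd.js (contents assumed).
   static const std::vector<std::string> builtin = {"Status", "ODB", "Messages", "Chat"};

   FakeOdb::const_iterator it = odb.find("/Experiment/Menu");
   if (it != odb.end())
      return it->second;

   it = odb.find("/Experiment/Menu Buttons");
   if (it != odb.end())
      return it->second;

   return builtin;
}
```

Because the new tree is checked first, an old "/Experiment/Menu Buttons" key is ignored as soon as /Experiment/Menu exists.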
05 Apr 2017, Andreas Suter, Bug Report, Equipment Expand doesn't work anymore
|
I liked the possibility to hide away Equipment on the main page very much. It
is also nice to have the '+' to get it quickly back when needed. However, this
seems not to work anymore (git c9d9d604803). Is this a feature or did something go
wrong? |
10 Apr 2017, Stefan Ritt, Bug Report, Equipment Expand doesn't work anymore
|
> I'd liked very much the possibility to hide away Equipment on the main page. It
> is also nice to have the '+' to get it quickly back when needed. However, this
> seems not to work anymore (git c9d9d604803). Is this a feature or something went
> wrong?
The expansion of the equipment list is handled by a Cookie ("expeq" being 1 or 0). When Konstantin
implemented the mongoose server instead of the internal mhttp server, he neglected to evaluate
this cookie. I fixed this now (also renamed the cookie to "midas_expeq") in the current development
branch. Please check if it's working.
Stefan |
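Evaluating such a cookie on the server side amounts to finding one name=value pair in the "Cookie:" request header. A minimal sketch, assuming the header value has already been extracted as a string; cookie_value() and equipment_expanded() are hypothetical names and this is not the mhttpd implementation:

```cpp
#include <string>

// Returns the value of cookie `name` from a Cookie header value such as
// "a=1; midas_expeq=1", or an empty string if the cookie is absent.
std::string cookie_value(const std::string& header, const std::string& name)
{
   std::size_t pos = 0;
   while (pos < header.size()) {
      std::size_t end = header.find(';', pos);
      if (end == std::string::npos)
         end = header.size();
      std::string pair = header.substr(pos, end - pos);
      std::size_t start = pair.find_first_not_of(' '); // skip space after ';'
      std::size_t eq = pair.find('=');
      if (start != std::string::npos && eq != std::string::npos && eq >= start &&
          pair.compare(start, eq - start, name) == 0)
         return pair.substr(eq + 1);
      pos = end + 1;
   }
   return "";
}

// The equipment list is expanded when the cookie is present and set to "1".
bool equipment_expanded(const std::string& cookie_header)
{
   return cookie_value(cookie_header, "midas_expeq") == "1";
}
```

A browser that still sends the old "expeq" cookie is treated as not expanded here, since only the renamed cookie is looked up.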
10 Apr 2017, Andreas Suter, Bug Report, Equipment Expand doesn't work anymore
|
> > I'd liked very much the possibility to hide away Equipment on the main page. It
> > is also nice to have the '+' to get it quickly back when needed. However, this
> > seems not to work anymore (git c9d9d604803). Is this a feature or something went
> > wrong?
>
> The expansion of the equipment list is handled by a Cookie ("expeq" being 1 or 0). When Konstantin
> implemented the mongoose server instead of the internal mhttp server, he neglected to evaluate
> this cookie. I fixed this now (also renamed the cookie to "midas_expeq") in the current development
> branch. Please check if it's working.
>
> Stefan
Tested it on two machines and expansion is back and working! Thanks a lot!
Andreas |
13 Apr 2017, Andreas Suter, Bug Report, stop from odbedit broken
|
When I try to stop a run from odbedit I get a core dump.
[ODBEdit1,INFO] Run #31 stopped odbedit: src/system.c:1223: ss_shm_flush:
Assertion `size == mmap_size[handle]' failed. Aborted (core dumped)
midas commit 53af92a5d0...
-----
I checked what happens if I try to stop a run via the mhttpd web-page: this
works! So what is different?
-----
I filed an issue (#47) on bitbucket as well.
What is the preferred channel to report potential bugs (elog / bitbucket issues)? |
13 Apr 2017, Andreas Suter, Bug Report, stop from odbedit broken
|
> when I try to stop a run from odbedit I get a core dump.
>
> [ODBEdit1,INFO] Run #31 stopped odbedit: src/system.c:1223: ss_shm_flush:
> Assertion `size == mmap_size[handle]' failed. Aborted (core dumped)
>
> midas commit 53af92a5d0...
>
> -----
>
> I checked what happens if I try to stop a run via the mhttpd web-page: this
> works! So what is different?
>
> -----
>
> I placed a issue (# 47) on bitbucket as well.
>
> What is the preferred channel to report potential bugs (elog / bitbucket issues)?
I think I found the problem. Some ODB String values which are **automatically**
generated:
CSS File = STRING : [1024] mhttpd.css
Sqlite dir = STRING : [1024]
History dir = STRING : [1024]
Sound = STRING : [1000] alarm.mp3
are exceeding the MAX_STRING_LENGTH 256 (defined in msystem.h)
It looks as if this screws up quite a bit of the system! When I delete .ODB.SHM and
afterwards try to reload the ODB from a dump I previously made with odbedit, the
following happens:
1) I get the error message that some strings are too long (exceeding
MAX_STRING_LENGTH). Unfortunately the underlying routine doesn't tell which ODB
variables this is.
2) After this reload, essentially nothing is working anymore. Any client I tried to
start just crashed.
Since the string length limit MAX_STRING_LENGTH seems to be crucial, I would
suggest that db_create_record (or whatever routine is dealing with it) checks
STRING variables and ensures that they cannot exceed MAX_STRING_LENGTH.
When I shortened the above variables in my dump to MAX_STRING_LENGTH and regenerated the
ODB, the 'stop' problem in odbedit was gone as well. |
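The check suggested above, including the missing report of which variables are too long, could look roughly like this. It is a sketch over a toy list of keys, not the ODB API: StringKeySketch and overlong_keys() are hypothetical, and the limit of 256 is the MAX_STRING_LENGTH value quoted in this report.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Value of MAX_STRING_LENGTH as quoted in this report.
const std::size_t kMaxStringLength = 256;

// Toy description of one STRING key: its name and its declared size,
// i.e. the number in [brackets] in an odbedit dump.
struct StringKeySketch {
   std::string name;
   std::size_t size;
};

// Returns the names of all keys whose declared size exceeds the limit,
// so that an error message can name the offending variables.
std::vector<std::string> overlong_keys(const std::vector<StringKeySketch>& keys)
{
   std::vector<std::string> bad;
   for (const StringKeySketch& k : keys) {
      if (k.size > kMaxStringLength)
         bad.push_back(k.name);
   }
   return bad;
}
```

For the values listed above, "CSS File" [1024] and "Sound" [1000] would both be reported.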
15 Apr 2017, Konstantin Olchanski, Bug Report, Equipment Expand doesn't work anymore
|
> > > I'd liked very much the possibility to hide away Equipment on the main page. It
> > > is also nice to have the '+' to get it quickly back when needed. However, this
> > > seems not to work anymore (git c9d9d604803). Is this a feature or something went
> > > wrong?
> >
> > The expansion of the equipment list is handled by a Cookie ("expeq" being 1 or 0). When Konstantin
> > implemented the mongoose server instead of the internal mhttp server, he neglected to evaluate
> > this cookie. I fixed this now (also renamed the cookie to "midas_expeq") in the current development
> > branch. Please check if it's working.
> >
> > Stefan
>
> Tested it on two machines and expansion is back and working! Thanks a lot!
>
Confirmed fixed. Thanks. Not sure how this got lost.
K.O. |
15 Apr 2017, Konstantin Olchanski, Bug Report, stop from odbedit broken
|
> > when I try to stop a run from odbedit I get a core dump.
> >
> > [ODBEdit1,INFO] Run #31 stopped odbedit: src/system.c:1223: ss_shm_flush:
> > Assertion `size == mmap_size[handle]' failed. Aborted (core dumped)
> >
I am quite puzzled by this situation. We have seen the above error before, tried to track it down, failed. I was
always thinking this is some kind of strange size mismatch between odb size and shared memory size and
shared memory save file odb.shm size.
Now with your information, it looks like it is memory corruption.
I always thought there is no length limit to odb strings, except for the odb api problem where you have to
know the maximum string length for db_get_value() & co, otherwise long strings will be corrupted. Today
nobody uses fixed size buffers: either db_get_value() allocates a string of the correct size (replacing buffer
overflow errors with memory leak errors), or it returns std::string.
I shall check on the use of MAX_STRING_LENGTH at least in odb itself...
The default value 256 seems to be too small for today's use. (if you want to store json data, web page
fragments, etc).
K.O.
> > midas commit 53af92a5d0...
> >
> > -----
> >
> > I checked what happens if I try to stop a run via the mhttpd web-page: this
> > works! So what is different?
> >
> > -----
> >
> > I placed a issue (# 47) on bitbucket as well.
> >
> > What is the preferred channel to report potential bugs (elog / bitbucket issues)?
>
> I think I found the problem. Some ODB String values which are **automatically**
> generated:
>
> CSS File = STRING : [1024] mhttpd.css
> Sqlite dir = STRING : [1024]
> History dir = STRING : [1024]
> Sound = STRING : [1000] alarm.mp3
>
> are exceeding the MAX_STRING_LENGTH 256 (defined in msystem.h)
>
> It looks as if this screws up quite a bit of the system! When deleting .ODB.SHM and
> afterwards try to reload the ODB via a dump I previously made with odbedit, the
> following is happening:
>
> 1) I get the error message that some strings are too long (exceeding
> MAX_STRING_LENGTH). Unfortunately the underlying routine doesn't tell which ODB
> variables this is.
>
> 2) After this reload, essentially nothing is working anymore. Any client I tried to
> start just crashed.
>
> Since it seems that the string length of MAX_STRING_LENGTH is very crucial I would
> suggest that db_create_record (or whatever routine is dealing with it) checks for
> STRING variables and ensures that they cannot exceed MAX_STRING_LENGTH.
>
> When I shortened in my dump the above variables to MAX_STRING_LENGTH, regenerated the
> ODB, also the 'stop' Problem in odbedit is gone. |
15 Apr 2017, Konstantin Olchanski, Bug Report, where to report bugs, stop from odbedit broken
|
>
> What is the preferred channel to report potential bugs (elog / bitbucket issues)?
>
I prefer that bugs be reported on this forum here. Most bugs affect every midas user, so best to notify the
whole community.
Bitbucket has a nice bug tracking system, but there are a couple of problems:
a) only a couple of people see the bug reports for midas, minimizing probability of fix.
b) bug reports on bitbucket stay on bitbucket, we do not have backups and archives
of bug reports, if tomorrow bitbucket goes belly-up, our bug database goes poof! with them.
c) I can search the bug reports on this forum using "grep" (I am sure there is a "find" button
on the bitbucket web page and it finds what I am looking for right away).
So if you have a bug report that others should know about (i.e. the "+" button on the status page does
not work), I say use this forum.
If you have a bug that you think is unique to you, not interesting to others (i.e. my midas crashes when I
do X), file it on bitbucket. If you see no activity on the bitbucket for a week or two, repost it here.
K.O. |
15 Apr 2017, Konstantin Olchanski, Bug Report, badly managed case in history_schema.cxx: dat file empty
|
> For an unknown reason, Logger died few days ago while writing the history. The
> file mhf_1489577446_20170315_system.dat was created, but was empty.
I ran into the same problem installing new midas in the alpha experiment at cern. It should be fixed now:
https://bitbucket.org/tmidas/midas/commits/788021d9cb39a348a40e36f1b35b1440e06aa744
K.O.
>
> When trying to restart Logger, I would get a seg fault without any special error
> message.
>
> I tracked the issue to the "read_file_schema" function in history_schema.cxx
>
> * L4731, a pointer to HsFileSchema *s is declared.
> * L4747, We enter a while(1) loop.
> * L4749, get char on the filename.
> In our case, the file was empty, so the variable "b" gets NULL and the loop breaks.
>
> Problem: the memory allocation for "s" is later in the loop, L4768.
> Upon exiting the loop, L4854, we try to access record_size on a NULL pointer ==>
> SegFault.
>
> It would be nice to at least have a message before breaking the loop... |
|