09 Oct 2012, Stefan Ritt, Bug Fix, [PATCH] mana.c compile fix, gz files
|
> Hi,
>
> I had to apply the attached patch to convince SuSE Linux 12.2 to compile mana.c
> gcc version is "(SUSE Linux) 4.6.2"
>
> Problem is that gz{write,close, etc.} expect a 1st argument of type gzFile (see
> zlib.h), whereas out_file is FILE*. In fact, out_file is a cast to FILE*, even
> in the case when we work on a gzfile (HAVE_ZLIB).
>
> Could you please confirm that the patch is correct, and possibly apply it to trunk?
>
> I haven't checked if mana works as advertised now.
>
> Cheers,
>
>
> Randolf
I applied your patch to the trunk.
Best,
Stefan |
17 Dec 2024, Lukas Gerritzen, Bug Report, [History plots] "Jump to current time" resets x range to 7d
|
To reproduce:
- Open a history plot, click [-] a few times until the x axis shows more than 7 days.
- Scroll to the past (left)
- Click "Jump to current time" (the triangle)
Expected result:
The upper limit of the x axis is at the current time and the lower limit is "now" minus whatever range you had before
(>7d)
Actual result:
The upper limit is the current time, the lower limit is now - 7d
(The interval seems unchanged if the range was < 7d before clicking "Jump to current time") |
19 Dec 2024, Stefan Ritt, Bug Report, [History plots] "Jump to current time" resets x range to 7d
|
I had put in a check which limits the range to 7d into the past if you press the "play" button, but now I'm not sure why this was needed. I removed it again and
things seem to be fine. Change is committed to develop.
Stefan |
13 Jun 2006, Stefan Ritt, Info, ZLIB dependency modified
|
Due to recent problems with the ROME analyzer having zlib.h both in the
system and in the midas tree, it has been decided to change the zlib policy in midas. By default, zlib support is not included in the midas analyzer. If one wants it (but I guess only very few experiments need that), one can do a
make NEED_ZLIB=1
to compile zlib support into mana.c.
Under Linux (& Co.), zlib is normally pre-installed these days. The header file will therefore be taken from /usr/include and the library from /usr/lib/libz.a. Under Windows, zlib is still included in the distribution and has to be added manually to the Visual C++ project file. |
04 Aug 2010, Konstantin Olchanski, Info, YBOS support now optional, disabled by default
|
As of svn rev 4800, YBOS support was made optional, disabled by default. (But note that ybos.c is still used
by mdump). See HAVE_YBOS in the Makefile.
K.O. |
31 Aug 2010, Konstantin Olchanski, Info, YBOS support now optional, disabled by default
|
> As of svn rev 4800, YBOS support was made optional, disabled by default. (But note that ybos.c is still used
> by mdump). See HAVE_YBOS in the Makefile.
It looks like some example drivers in .../drivers/class want to link against YBOS libraries. This fails because ybos.o is missing from the MIDAS library.
After discussions with SR and PAA, we think YBOS support can be removed or made optional, but there are too many of these drivers for me to fix
them all right now in five minutes. Please accept my apology and use these workarounds:
If you get linker errors because of missing YBOS functions:
1) enable YBOS support in the Makefile (uncomment HAVE_YBOS=1), or
2) "#ifdef HAVE_YBOS" all places that call YBOS functions
Solution (2) is preferable as it permits us to eventually remove YBOS completely. If you fix files from MIDAS svn, please do send me patches or diffs (or
post them here).
K.O. |
08 Sep 2010, Stefan Ritt, Info, YBOS support now optional, disabled by default
|
> It looks like some example drivers in .../drivers/class want to link against YBOS libraries.
> This fails because ybos.o is missing from the MIDAS library.
I fixed the class drivers in the meantime (SVN 4814).
There is however another problem: the lazylogger needs YBOS support compiled in if the FTP transfer mode is used.
At PSI we are stuck with FTP at the moment, so we still need YBOS there (although none of the data is in YBOS format).
Maybe there is a chance that this will be fixed some time and we can get rid of YBOS. |
27 May 2021, Lukas Gerritzen, Bug Report, Wrong location for mysql.h on our Linux systems
|
Hi,
with the recent fix of the CMakeLists.txt, it seems like another bug surfaced.
In midas/progs/mlogger.cxx:48/49, the mysql header files are included without a
prefix. However, mysql.h and mysqld_error.h are in a subdirectory, so for our
systems, the lines should be
48 #include <mysql/mysql.h>
49 #include <mysql/mysqld_error.h>
This is the case with MariaDB 10.5.5 on OpenSuse Leap 15.2, MariaDB 10.5.5 on
Fedora Workstation 34 and MySQL 5.5.60 on Raspbian 10.
If this problem occurs for other Linux/MySQL versions as well, it should be
fixed in mlogger.cxx and midas/src/history_schema.cxx.
If this problem only occurs on some distributions or MySQL versions, it needs
some more differentiation than #ifdef OS_UNIX.
Also, this somehow seems familiar, wasn't there such a problem in the past? |
27 May 2021, Nick Hastings, Bug Report, Wrong location for mysql.h on our Linux systems
|
Hi,
> with the recent fix of the CMakeLists.txt, it seems like another bug surfaced.
> In midas/progs/mlogger.cxx:48/49, the mysql header files are included without a
> prefix. However, mysql.h and mysqld_error.h are in a subdirectory, so for our
> systems, the lines should be
> 48 #include <mysql/mysql.h>
> 49 #include <mysql/mysqld_error.h>
> This is the case with MariaDB 10.5.5 on OpenSuse Leap 15.2, MariaDB 10.5.5 on
> Fedora Workstation 34 and MySQL 5.5.60 on Raspbian 10.
>
> If this problem occurs for other Linux/MySQL versions as well, it should be
> fixed in mlogger.cxx and midas/src/history_schema.cxx.
> If this problem only occurs on some distributions or MySQL versions, it needs
> some more differentiation than #ifdef OS_UNIX.
What does "mariadb_config --cflags" or "mysql_config --cflags" return on
these systems? For mariadb 10.3.27 on Debian 10 it returns both paths:
% mariadb_config --cflags
-I/usr/include/mariadb -I/usr/include/mariadb/mysql
Note also that mysql.h and mysqld_error.h reside in /usr/include/mariadb *not*
/usr/include/mariadb/mysql so using "#include <mysql/mysql.h>" would not work.
On CentOS 7 with mariadb 5.5.68:
% mysql_config --include
-I/usr/include/mysql
% ls -l /usr/include/mysql/mysql*.h
-rw-r--r--. 1 root root 38516 May 6 2020 /usr/include/mysql/mysql.h
-r--r--r--. 1 root root 76949 Oct 2 2020 /usr/include/mysql/mysqld_ername.h
-r--r--r--. 1 root root 28805 Oct 2 2020 /usr/include/mysql/mysqld_error.h
-rw-r--r--. 1 root root 24717 May 6 2020 /usr/include/mysql/mysql_com.h
-rw-r--r--. 1 root root 1167 May 6 2020 /usr/include/mysql/mysql_embed.h
-rw-r--r--. 1 root root 2143 May 6 2020 /usr/include/mysql/mysql_time.h
-r--r--r--. 1 root root 938 Oct 2 2020 /usr/include/mysql/mysql_version.h
So this seems to be the correct setup for both Debian and RHEL. If this is to
be worked around in Midas I would think it would be better to do it at the
cmake level than by putting another #ifdef in the code.
Cheers,
Nick. |
02 Jun 2021, Konstantin Olchanski, Bug Report, Wrong location for mysql.h on our Linux systems
|
> % mariadb_config --cflags
> -I/usr/include/mariadb -I/usr/include/mariadb/mysql
I get similar, both .../include and .../include/mysql are in my include path,
so both #include "mysql/mysql.h" and #include "mysql.h" work.
I added a message to cmake to report the MySQL CFLAGS and libraries, so next time
this is a problem, we can see what happened from the cmake output:
4ed0:midas olchansk$ make cmake | grep MySQL
...
-- MIDAS: Found MySQL version 10.4.16
-- MIDAS: MySQL CFLAGS: -I/opt/local/include/mariadb-10.4/mysql;-I/opt/local/include/mariadb-10.4/mysql/mysql and libs: -L/opt/local/lib/mariadb-10.4/mysql/ -lmariadb
K.O. |
13 Feb 2020, Marius Koeppel, Forum, Writting Midas Events via FPGAs
|
Dear all,
we are creating MIDAS events directly inside an FPGA and sending them off via DMA into the PC RAM. To read out this RAM via MIDAS, the FPGA sends a pointer to where it has written the last 4 kB of data. We use this pointer to tell the MIDAS ring buffer where the new events are. The buffer looks something like:
// event 1
dma_buf[0] = 0x00000001; // Trigger and Event ID
dma_buf[1] = 0x00000001; // Serial number
dma_buf[2] = TIME; // time
dma_buf[3] = 18*4-4*4; // event size
dma_buf[4] = 18*4-6*4; // all bank size
dma_buf[5] = 0x11; // flags
// bank 0
dma_buf[6] = 0x46454230; // bank name
dma_buf[7] = 0x6; // bank type TID_DWORD
dma_buf[8] = 0x3*4; // data size
dma_buf[9] = 0xAFFEAFFE; // data
dma_buf[10] = 0xAFFEAFFE; // data
dma_buf[11] = 0xAFFEAFFE; // data
// bank 1
dma_buf[12] = 0x46454231; // bank name
dma_buf[13] = 0x6; // bank type TID_DWORD
dma_buf[14] = 0x3*4; // data size
dma_buf[15] = 0xAFFEAFFE; // data
dma_buf[16] = 0xAFFEAFFE; // data
dma_buf[17] = 0xAFFEAFFE; // data
// event 2
.....
dma_buf[fpga_pointer] = 0xXXXXXXXX;
And we do something like:
while (true) {
    // obtain buffer space
    status = rb_get_wp(rbh, (void **)&pdata, 10);
    fpga_pointer = fpga.read_last_data_add();
    wlen = fpga_pointer - last_fpga_pointer; // new data, in 32-bit words
    copy_n(&dma_buf[last_fpga_pointer], wlen, pdata);
    rb_status = rb_increment_wp(rbh, wlen * 4); // in bytes
    last_fpga_pointer = fpga_pointer;
}
Leaving out the case where dma_buf wraps around, this works fine for a small data rate. But if we increase the rate, fpga_pointer also increases really fast and wlen gets quite big. In fact it gets bigger than max_event_size, which is checked in rb_increment_wp, leading to an error.
The problem is that no single event is actually too big; rather, we have multiple events in the buffer, which are handed to midas in one step. So we think that in this case rb_increment_wp is comparing the wrong thing. Also, increasing max_event_size does not help.
Remark: dma_buf is volatile, so memcpy is not possible here.
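The wrap-around case that is left out above could be handled along the following lines. This is a minimal, self-contained sketch and not MIDAS or driver code: copy_from_dma_ring and its parameters are hypothetical names, the indices are assumed to count 32-bit words, and read_idx == write_idx is assumed to mean "no new data". A plain loop stands in for copy_n so that every read of the volatile source is an explicit element access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of MIDAS): copy the 32-bit words in
// [read_idx, write_idx) out of a circular volatile buffer with ring_words
// entries, wrapping at the end. Returns the number of words copied.
std::size_t copy_from_dma_ring(const volatile std::uint32_t* ring,
                               std::size_t ring_words,
                               std::size_t read_idx,
                               std::size_t write_idx,
                               std::uint32_t* dst) {
    // Number of new words, taking the wrap-around into account.
    const std::size_t n = (write_idx + ring_words - read_idx) % ring_words;
    for (std::size_t i = 0; i < n; ++i) {
        // Element-wise read from the volatile source.
        dst[i] = ring[(read_idx + i) % ring_words];
    }
    return n;
}
```

The returned word count times 4 would be the byte count passed to rb_increment_wp, and it still has to stay within the maximum event size given to rb_create().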
Cheers,
Marius |
13 Feb 2020, Stefan Ritt, Forum, Writting Midas Events via FPGAs
|
The rb_xxx functions are (thoroughly tested) robust against high data rates, given that you use them as intended:
1) When you create the ring buffer via rb_create(), specify the maximum event size (midas event, not bank size!). Later there is no protection any more, so if you obtain pdata from rb_get_wp, you can of course write 4GB to pdata, overwriting everything in your memory, causing a fatal crash. It's your duty not to write more bytes into pdata than what you specified in rb_create().
2) When you obtain a write pointer to the ring buffer via rb_get_wp(), this function might fail when the receiving side reads data slower than the producing side, simply because the buffer is full. In that case the producing side has to wait to get new buffer space. If your call to rb_get_wp() returns DB_TIMEOUT, it means that the function did not obtain enough free space for the next event. In that case you have to wait (like ss_sleep(10)) and try again. Only when rb_get_wp() returns DB_SUCCESS are you allowed to write into pdata (up to the maximum event size specified in rb_create() of course).
/Stefan |
13 Feb 2020, Stefan Ritt, Forum, Writting Midas Events via FPGAs
|
The rb_xxx functions are (thoroughly tested!) robust against high data rates, given that you use them as intended:
1) When you create the ring buffer via rb_create(), specify the maximum event size (overall event size, not bank size!). Later there is no protection any more, so if you obtain pdata from rb_get_wp, you can of course write 4GB to pdata, overwriting everything in your memory, causing a total crash. It's your responsibility not to write more bytes into pdata than what you specified as max event size in rb_create().
2) When you obtain a write pointer to the ring buffer via rb_get_wp, this function might fail when the receiving side reads data slower than the producing side, simply because the buffer is full. In that case the producing side has to wait until space is freed up in the buffer by the receiving side. If your call to rb_get_wp returns DB_TIMEOUT, it means that the function did not obtain enough free space for the next event. In that case you have to wait (like ss_sleep(10)) and try again, until you succeed. Only when rb_get_wp() returns DB_SUCCESS are you allowed to write into pdata, up to the maximum event size specified in rb_create of course. I don't see this behaviour in your code. You would need something like
do {
status = rb_get_wp(rbh, (void **)&pdata, 10);
if (status == DB_TIMEOUT)
ss_sleep(10);
} while (status == DB_TIMEOUT);
Best,
Stefan |
14 Feb 2020, Konrad Briggl, Forum, Writting Midas Events via FPGAs
|
Hello Stefan,
is there a difference for the later data processing (after writing the ring buffer blocks)
if we write single events or multiple events in one rb_get_wp - memcopy - rb_increment_wp cycle?
Both Marius and I have seen some inconsistencies in the number of events produced, as reported on the status page, when writing multiple events in one go,
so I was wondering if this is due to us treating the buffer badly or to the way midas handles the events after that.
Given that we produce the full event in our (FPGA) domain, an option would be to always copy one event at a time from the DMA buffer to the midas system buffer in a loop.
The question is whether there is a difference (for midas) between
[pseudo code, much simplified]
while(dma_read_index < last_dma_write_index){
if(rb_get_wp(pdata)!=SUCCESS){
dma_read_index+=event_size;
continue;
}
copy_n(dma_buffer, pdata, event_size);
rb_increment_wp(event_size);
dma_read_index+=event_size;
}
and
while(dma_read_index < last_dma_write_index){
if(rb_get_wp(pdata)!=SUCCESS){
...
};
total_size=max_n_events_that_fit_in_rb_block();
copy_n(dma_buffer, pdata, total_size);
rb_increment_wp(total_size);
dma_read_index+=total_size;
}
Cheers,
Konrad |
14 Feb 2020, Stefan Ritt, Forum, Writting Midas Events via FPGAs
|
The rb_xxx functions are midas event agnostic. The receiving side in mfe.cxx (line 1418 in receive_trigger_event), however, pulls one event at a time. If you
see some inconsistency, I would put some debugging code there.
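Since the receiving side takes one event at a time, a block holding several events has to be cut at event boundaries before it is handed over. A minimal sketch of that walk follows. It assumes the layout from the dma_buf listing earlier in this thread (a 4-word event header whose fourth word is the payload size in bytes); split_events is a hypothetical helper, not a MIDAS function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout, taken from the dma_buf listing in this thread: a 4-word
// (16-byte) event header whose 4th word holds the payload size in bytes.
constexpr std::size_t kHeaderWords = 4;

// Hypothetical helper: return the word offset at which each complete event
// starts within buf[0..n_words). Stops at the first incomplete or malformed
// event, so the remainder can be retried once more data has arrived.
std::vector<std::size_t> split_events(const std::uint32_t* buf, std::size_t n_words) {
    std::vector<std::size_t> starts;
    std::size_t pos = 0;
    while (pos + kHeaderWords <= n_words) {
        const std::size_t payload_bytes = buf[pos + 3];
        if (payload_bytes % 4 != 0) break;  // not a whole number of words
        const std::size_t event_words = kHeaderWords + payload_bytes / 4;
        if (pos + event_words > n_words) break;  // event not complete yet
        starts.push_back(pos);
        pos += event_words;
    }
    return starts;
}
```

Each returned offset marks one event that could be copied in its own rb_get_wp / rb_increment_wp cycle, which keeps every write within the maximum event size.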
Stefan |
20 Feb 2020, Konstantin Olchanski, Forum, Writting Midas Events via FPGAs
|
> rb_xxx functions are midas event agnostic. The receiving side in mfe.cxx (lines 1418 in receive_trigger_event) however pulls one event at a time. If you
> have some inconsistency I would put some debugging code there.
I agree with Stefan, I do not think there are any bugs in the ring buffer code.
But. I do not think we ever did DMA the data directly into the ring buffer. Hmm...
I just checked, this is what we do (and this worked in the ALPHA Si-strip DAQ system for 10 years now):
- mfe.cxx multithread equipment
- mfe readout thread grabs pointer from ring buffer
- mfe creates event headers, etc
- calls our read_event() function
- creates data bank
- DMA data into the data bank (this is the DMA from VME block reads, using DMA controller inside the UniverseII and tsi148 VME-to-PCI bridges)
- close data bank
- return to mfe
- mfe readout thread increments the ring buffer
- mfe main thread grabs events from ring buffer, sends them to the mserver
So there could be trouble:
a) the ring buffer code does not have the required "volatile" (ahem, "atomic") annotations, so DMA may have a bad interaction with compiler optimizations (values stored in registers
instead of in memory, etc)
b) the DMA driver must doctor the memory settings to (1) mark the DMA target memory uncachable or (1b) invalidate the cache after DMA completes, (2) mark the DMA target
memory unswappable.
So I see possibilities for the ring buffer to malfunction.
But now I am curious, which DMA controller do you use? The Altera or Xilinx PCIe block with the vendor-supplied DMA driver? Or do you do DMA on an ARM SoC FPGA (no PCI/PCIe,
different DMA controller, different DMA driver)?
I am curious because we will be implementing pretty much what you do on ARM SoC FPGAs pretty soon, so it is good to know
if there is trouble to expect.
But I will probably use the tmfe.h c++ frontend and a "pure c++" ring buffer instead of mfe.cxx and the midas "rb" ring buffer.
(I did not look at your code at all, there could be a bug right there, this ring buffer stuff is tricky. With luck there is no bug
in your DMA driver. The DMA drivers for our VME bridges did have bugs.)
K.O. |
20 Feb 2020, Marius Koeppel, Forum, Writting Midas Events via FPGAs
|
We also agree, and we have found the problem now. Since we build everything (MIDAS event header, bank header, banks etc.) in the FPGA, we struggled a bit with the MIDAS data format (http://lmu.web.psi.ch/docu/manuals/bulk_manuals/software/midas195/html/AppendixA.html). We thought that only the MIDAS event needs to be aligned to 64 bits, but as it turned out the bank data also needs to be aligned (Stefan has already updated the wiki page). Since we are using BANK32, this was a bit unclear to us because the bank header is not 64-bit aligned. We handle this now by adding empty data, and the system is running.
Our setup looks like this:
Software:
- mfe.cxx multithread equipment
- mfe readout thread grabs pointer from dma ring buffer
- since the dma buffer is volatile we do copy_n for transforming the data to MIDAS
- the data is already in the MIDAS format so done from our side :)
- mfe readout thread increments the ring buffer
- mfe main thread grabs events from ring buffer, sends them to the mserver
Firmware:
- Arria 10 development board
- Altera PCIe block
- Own DMA engine since we are doing burst writing DMA with PCIe 3.0.
- Own device driver
- no interrupts
If you have more questions, feel free to ask. |
20 Feb 2020, Stefan Ritt, Forum, Writting Midas Events via FPGAs
|
Actually the cause of all of this is a real bug in the midas functions. We want each bank 8-byte aligned, so there is code in bk_close like:
midas.cxx:14788:
((BANK_HEADER *) event)->data_size += sizeof(BANK32) + ALIGN8(pbk32->data_size);
While the old sizeof(BANK)=8, the extended sizeof(BANK32)=12, so not 8-byte aligned. This code should rather be:
((BANK_HEADER *) event)->data_size += ALIGN8(sizeof(BANK32) +pbk32->data_size);
But if we change that, it would break every midas data file on this planet!
The only chance I see is to use the "flags" in the BANK_HEADER to distinguish a current bank from a "correct" bank.
So we could introduce a flag BANK_FORMAT_ALIGNED which distinguishes between the two pieces of code above.
Then bk_iterate32 would look at that flag and do the right thing.
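The difference between the two expressions is easy to check numerically. The sketch below is only an illustration: align8 is a local stand-in that is assumed to round up to the next multiple of 8, the 12-byte header size is the sizeof(BANK32) quoted above, and the function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: ALIGN8 rounds up to the next multiple of 8. This is a local
// stand-in, not the macro from the midas headers.
constexpr std::uint32_t align8(std::uint32_t x) { return (x + 7u) & ~7u; }

// sizeof(BANK32) as quoted in this thread.
constexpr std::uint32_t kBank32HeaderBytes = 12;

// Size increment as in the quoted bk_close line: header + ALIGN8(data).
constexpr std::uint32_t current_increment(std::uint32_t data_size) {
    return kBank32HeaderBytes + align8(data_size);
}

// Size increment as in the proposed replacement: ALIGN8(header + data).
constexpr std::uint32_t proposed_increment(std::uint32_t data_size) {
    return align8(kBank32HeaderBytes + data_size);
}
```

For the 12-byte banks from the start of this thread, the current form adds 28 bytes and the proposed form adds 24. The current form is always 12 plus a multiple of 8, so it is never a multiple of 8 itself, which is why consecutive banks alternate between two alignments.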
Any thoughts?
Best,
Stefan |
21 Feb 2020, Konstantin Olchanski, Forum, Writting Midas Events via FPGAs
|
Hi, Stefan - is this our famous 64-bit misalignment, where we have each alternating bank aligned and misaligned at 64 bits? Without changing the data
format, one can always store data in 64-bit aligned banks by inserting dummy banks between real banks:
event header
bank header
bank1 --- 64-bit aligned --- with data
bank2 --- misaligned, no data
bank3 --- 64-bit aligned --- with data
bank4 --- misaligned, no data
...
For sure, this wastes space for bank2, bank4, etc., but at 12 bytes per bank, maybe this is negligible overhead compared to the total event size.
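The effect of the dummy banks can be sketched with a few lines of arithmetic. The code below assumes a 12-byte BANK32 header and bank data padded to a multiple of 8 bytes, as discussed in this thread; data_offsets is a hypothetical helper, and the offsets are relative to the first bank header, so which of the two residues counts as "aligned" depends on where the bank area starts in memory.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumptions: 12-byte BANK32 header, bank data padded to a multiple of
// 8 bytes. Returns, for each bank, the byte offset of its data relative to
// the start of the first bank header.
std::vector<std::uint32_t> data_offsets(const std::vector<std::uint32_t>& data_sizes) {
    std::vector<std::uint32_t> offsets;
    std::uint32_t pos = 0;
    for (std::uint32_t size : data_sizes) {
        pos += 12;                 // bank header
        offsets.push_back(pos);    // bank data starts here
        pos += (size + 7u) & ~7u;  // data padded to 8 bytes
    }
    return offsets;
}
```

With three 12-byte banks the data offsets are 12, 40 and 68, so the residue modulo 8 alternates 4, 0, 4. With an empty dummy bank after each real bank, the real banks land at 12, 52 and 92, all with the same residue.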
BTW, aligned-to-64-bit is old news. In the PWB FPGA, I have 128-bit data paths to DDR RAM, the data has to be aligned to 128 bits, or else!
K.O. |
21 Feb 2020, Konstantin Olchanski, Forum, Writting Midas Events via FPGAs
|
> We also agree and found the problem now.
Good. What was wrong?
> - Own DMA engine since we are doing burst writing DMA with PCIe 3.0.
> - Own device driver
Scary stuff.
> - no interrupts
Right. As far as I can tell, interrupts are no longer useful in Linux: the interrupt handler cannot do any real work and has to hand off to a kernel thread, resulting
in so much latency and overhead that one might as well poll for the data... And for DMA data transfers, the data rate is well known,
so it is easy to predict how long the DMA will run and to sleep for that amount of time instead of waiting for an interrupt.
K.O. |
|