Roland Mainz
2013-07-14 16:16:03 UTC
Hi!
----
During benchmarking I noticed an issue with AST grep(1) - it uses
|mmap()| but doesn't use |madvise(..., MADV_SEQUENTIAL, ...)| ... I
dug around a little in the code and noticed that while sfio has
|SF_SEQUENTIAL| there is no way to set it at |sfopen()| time...
... what would be the best place to fix it ? Putting it into
src/lib/libcmd/grep.c doesn't help other cases where huge regex data
are processed, and there are cases where |mmap()| may not work (e.g.
the filesystem doesn't support |mmap()| or the chunk size is too small)
but we could still use |posix_fadvise(..., POSIX_FADV_SEQUENTIAL)| ...
would a new |sfioadvise()| call be a good idea ?
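
To make the idea concrete, here is a rough sketch of the fallback logic
in plain POSIX C, outside of sfio. It is only an illustration: the
names |count_lines_sequential()| and |enum seq_hint| are made up for
this example and are not part of sfio or AST grep, and a real
|sfioadvise()| would have to live inside sfio's buffer handling. The
sketch tries |mmap()| plus |madvise(..., MADV_SEQUENTIAL)| first and
falls back to |posix_fadvise(..., POSIX_FADV_SEQUENTIAL)| plus
|read()| when the mapping is not possible.

```c
#define _DEFAULT_SOURCE /* glibc: expose madvise() and MADV_SEQUENTIAL */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names for illustration only; nothing here is sfio API. */
enum seq_hint { SEQ_HINT_NONE = 0, SEQ_HINT_MADVISE, SEQ_HINT_FADVISE };

/*
 * Count newline bytes readable from fd while telling the kernel that
 * access is sequential. *hint reports which advice call succeeded.
 * Returns the line count, or -1 on error.
 */
long count_lines_sequential(int fd, enum seq_hint *hint)
{
    struct stat st;
    long lines = 0;

    *hint = SEQ_HINT_NONE;
    if (fstat(fd, &st) != 0)
        return -1;

    /* Preferred path: map the whole regular file and advise the mapping. */
    if (S_ISREG(st.st_mode) && st.st_size > 0) {
        size_t len = (size_t)st.st_size;
        void *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);

        if (map != MAP_FAILED) {
            const char *p = map;

            /* The advice is only a hint, so a failure here is not fatal. */
            if (madvise(map, len, MADV_SEQUENTIAL) == 0)
                *hint = SEQ_HINT_MADVISE;
            for (size_t i = 0; i < len; i++)
                if (p[i] == '\n')
                    lines++;
            munmap(map, len);
            return lines;
        }
    }

    /*
     * Fallback path: no mapping (pipe, empty file, filesystem without
     * mmap() support). posix_fadvise() returns an error number instead
     * of setting errno; a length of 0 means "up to end of file".
     */
    if (posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL) == 0)
        *hint = SEQ_HINT_FADVISE;

    char buf[65536];
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0)
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] == '\n')
                lines++;
    return n < 0 ? -1 : lines;
}
```

Both advice calls are treated as best effort here, so a caller gets the
same result whether or not the kernel accepts the hint; only the page
reuse behaviour differs.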
** Notes:
- The performance improvement measured via "time"/"timex" may
be minor for idle systems because |madvise(..., MADV_SEQUENTIAL,
...)| (and to a lesser degree |posix_fadvise(...,
POSIX_FADV_SEQUENTIAL)|) affects the time needed until an I/O page
gets re-used for something else. The trouble is that there are
multiple ways to get them re-used... and in some cases (like
Solaris >= 11.1) it may even be a separate "CPU strand" (on the same
CPU (sharing the same MMU)) which does the reusing (Solaris >= 11.1
offloaded some VM tasks to different strands to make applications
faster by parallelising the VM work).
Or in short: The performance improvement is for a complete system (e.g.
being able to process more data) but may have little effect for an
individual process run (except when the VM system is already under
pressure... then the performance benefit can be huge).
----
Bye,
Roland
--
__ . . __
(o.\ \/ /.o) roland.mainz at nrubsig.org
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 3992797
(;O/ \/ \O;)