Discussion:
[ast-developers] libast memory allocator enhancements
Irek Szczesniak
2014-01-20 23:32:32 UTC
Does anyone have Phong Vo <kpv at research.att.com>'s new email address?
The following work (i.e., the per-thread caching, etc.) in Illumos might
be very interesting...

Forwarded conversation
Subject: [developer] 4489-4491 ptcumem and other libumem enhancements
------------------------

From: Robert Mustacchi <rm at joyent.com>
Date: Thu, Jan 16, 2014 at 3:59 AM
To: illumos Developer <developer at lists.illumos.org>


This combines work done over the past few years at Joyent enhancing
libumem. It's broken down into two commits: one adds 128k slabs and
makes VM_BESTFIT the default; the other adds per-thread caching to
umem. For more on umem, see
http://dtrace.org/blogs/rm/2012/07/16/per-thread-caching-in-libumem/.

https://us-east.manta.joyent.com/rmustacc/public/webrevs/4489/index.html

Robert



----------
From: Dan McDonald <danmcd at omniti.com>
Date: Mon, Jan 20, 2014 at 4:33 AM
To: Robert Mustacchi <rm at joyent.com>
Cc: illumos Developer <developer at lists.illumos.org>
This combines work done over the past few years at Joyent enhancing
libumem. It's broken down into two commits: one adds 128k slabs and
makes VM_BESTFIT the default; the other adds per-thread caching to
umem. For more on umem, see
http://dtrace.org/blogs/rm/2012/07/16/per-thread-caching-in-libumem/.
https://us-east.manta.joyent.com/rmustacc/public/webrevs/4489/index.html
I mostly understood this except for one constant, and I suspect that's
me being dense. I also didn't find much wrong, save for one possible
attack-surface reduction.

Dan

-----------

exception_lists/check_rtime
usr/src/cmd/mdb/common/modules/libc/libc.c
usr/src/cmd/mdb/common/modules/libumem/libumem.c
usr/src/cmd/mdb/common/modules/libumem/umem.c
usr/src/cmd/mdb/intel/amd64/libumem/Makefile
usr/src/cmd/mdb/intel/ia32/libumem/Makefile
usr/src/cmd/mdb/sparc/v7/libumem/Makefile
usr/src/cmd/mdb/sparc/v9/libumem/Makefile
usr/src/lib/libc/amd64/Makefile
usr/src/lib/libc/i386/Makefile.com
usr/src/lib/libc/port/mapfile-vers
usr/src/lib/libc/port/threads/thr.c
usr/src/lib/libc/sparc/Makefile.com
usr/src/lib/libc/sparcv9/Makefile.com
usr/src/lib/libumem/Makefile.com
usr/src/lib/libumem/common/envvar.c
usr/src/lib/libumem/common/malloc.c
usr/src/lib/libumem/common/stub_stand.c
usr/src/lib/libumem/common/umem_base.h
usr/src/lib/libumem/common/umem_impl.h
usr/src/lib/libumem/common/vmem_base.c
usr/src/lib/libumem/i386/asm_subr.s
usr/src/lib/libc/port/threads/tmem.c
usr/src/man/man3malloc/umem_alloc.3malloc

* Files look good, no comments needed.


usr/src/lib/libc/inc/thr_uberdata.h

* Why 16 for tm_roots? Ahhh, it's on lines 405-408 of umem.c. Mention
the big-theory statement, please?


usr/src/lib/libumem/amd64/umem_genasm.c

* Line 567: Grammar nit: "it's" ("it is", not "single-byte addl's
MAX_UINT32")


usr/src/lib/libumem/common/mapfile-vers

* Have you thought through the security considerations of allowing
asm_subr.o's text segment to be writable? Or am I worrying about nothing?
Worst case, you create a new .o that JUST has these writable malloc
linkages in it, I think.


usr/src/lib/libumem/common/umem.c

* Sorry for being dense, but 16 for TMNUMROOTS still seems arbitrary. Please
enlighten me, or tell me where in here or in umem_genasm.c I missed it?!


usr/src/lib/libumem/i386/umem_genasm.c

* Lines 32 & 557: it's/its grammar.


usr/src/lib/libumem/sparc/umem_genasm.c

* WHAT?!? NO SPARC?!? okay okay... put the baseball bat down... just
kidding. :)

----------
From: Josef 'Jeff' Sipek <jeffpc at josefsipek.net>
Date: Mon, Jan 20, 2014 at 4:30 PM
To: Robert Mustacchi <rm at joyent.com>
Cc: illumos Developer <developer at lists.illumos.org>
This combines work done over the past few years at Joyent enhancing
libumem. It's broken down into two commits: one adds 128k slabs and
makes VM_BESTFIT the default; the other adds per-thread caching to
umem. For more on umem, see
http://dtrace.org/blogs/rm/2012/07/16/per-thread-caching-in-libumem/.
https://us-east.manta.joyent.com/rmustacc/public/webrevs/4489/index.html
Fascinating read in umem.c...

I'm not exactly thrilled about some of the arch specifics in common code.
Yes, I know we support only x86 and sparc (and no one actually uses sparc),
and the ptcumem changes are supported only on x86. For example, line 489
mentions a 5-byte jump. Should some of this architecture-specific
documentation move into the architecture-specific files? Ah, I see that
there's some there. I feel like having arch-specific details in the common
files is begging for contradictory comments down the line as one gets
updated but the other doesn't.

umem.c:552: is that supposed to be s/makefile/mapfile/?

would it make sense to make umem_genasm_supported const?

stub_stand.c:151: is atomic_swap_64 supposed to be atomic? (I'm not familiar
with the requirements for stubs.)

malloc.c: Is there any "config" in illumos so that places like this can
check for a (compile time) feature instead of an architecture?

Jeff.

--
Once you have their hardware. Never give it back.
(The First Rule of Hardware Acquisition)

----------
From: Robert Mustacchi <rm at joyent.com>
Date: Mon, Jan 20, 2014 at 5:42 PM
To: Josef 'Jeff' Sipek <jeffpc at josefsipek.net>
Cc: illumos Developer <developer at lists.illumos.org>
This combines work done over the past few years at Joyent enhancing
libumem. It's broken down into two commits: one adds 128k slabs and
makes VM_BESTFIT the default; the other adds per-thread caching to
umem. For more on umem, see
http://dtrace.org/blogs/rm/2012/07/16/per-thread-caching-in-libumem/.
https://us-east.manta.joyent.com/rmustacc/public/webrevs/4489/index.html
Fascinating read in umem.c...
I'm not exactly thrilled about some of the arch specifics in common code.
Yes, I know we support only x86 and sparc (and no one actually uses sparc),
and the ptcumem changes are supported only on x86. For example, line 489
mentions a 5-byte jump. Should some of this architecture-specific
documentation move into the architecture-specific files? Ah, I see that
there's some there. I feel like having arch-specific details in the common
files is begging for contradictory comments down the line as one gets
updated but the other doesn't.
It's not a perfect world, and I don't think there's a great answer. The
vast majority of the block of text that I added to the big theory
statement is equally viable across all architectures that illumos does
or may some day support. If someone wanted to port this to sparc, which
I'd welcome, or if we enter the mystical future where illumos supports
other architectures, the contents there tell the writer how to go about
doing it; the parts about breaking the generated asm down into different
portions are the same as what you'd do on any architecture.

There are a few places where we delve into x86 specifics (8.2 and 8.3),
which I've updated to try to generalize. I'd really rather not break out
the vast majority of this into the per-architecture implementations, as
otherwise I fear we'll hit the problem you describe much sooner, since
we'd actually be duplicating content.
umem.c:552: is that supposed to be s/makefile/mapfile/?
Yeah, it is, thanks.
would it make sense to make umem_genasm_supported const?
I don't think it really changes anything, but sure, I can do that.
stub_stand.c:151: is atomic_swap_64 supposed to be atomic? (I'm not familiar
with the requirements for stubs.)
So, no, not really. These are basically just used for kmdb and aren't
atomic. See what we're doing with mutex_lock and umem_atomic_add_64 in
cmd/mdb/common/kmdb/kmdb_umemglue.c. If you feel it would be more
consistent, I could move its stub implementation into there and do the
same rename tricks.
malloc.c: Is there any "config" in illumos so that places like this can
check for a (compile time) feature instead of an architecture?
We only have that in terms of features of the architecture and processor
or language, see <sys/isa_defs.h> and <sys/feature_tests.h>. So, for
better or worse, this is what we do in the rest of the system.
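For the curious, the conventional pattern looks roughly like this (the
branch bodies are illustrative placeholders):

	#include <sys/isa_defs.h>

	/*
	 * Gate on architecture macros from <sys/isa_defs.h> rather than
	 * a feature-test macro; this mirrors the rest of the system.
	 */
	#if defined(__amd64) || defined(__i386)
	/* generate the ptcumem malloc/free fast path */
	#else
	/* fall back to the generic malloc/free entry points */
	#endif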

I've updated the webrev in place.

Robert

----------
From: Josef 'Jeff' Sipek <jeffpc at josefsipek.net>
Date: Mon, Jan 20, 2014 at 8:16 PM
To: Robert Mustacchi <rm at joyent.com>
Cc: illumos Developer <developer at lists.illumos.org>


Generalizing is what I was looking for.
umem.c:552: is that supposed to be s/makefile/mapfile/?
Yeah, it is, thanks.
would it make sense to make umem_genasm_supported const?
I don't think it really changes anything, but sure, I can do that.
.data vs. .rodata? Since it's not a tunable, we might as well have the
compiler and kernel (assuming non-writable .rodata) tell us about any bad
behavior.

Should umem_base.h be updated too?
stub_stand.c:151: is atomic_swap_64 supposed to be atomic? (I'm not familiar
with the requirements for stubs.)
So, no, not really. These are basically just used for kmdb and aren't
atomic. See what we're doing with mutex_lock and umem_atomic_add_64 in
cmd/mdb/common/kmdb/kmdb_umemglue.c. If you feel it would be more
consistent, I could move its stub implementation into there and do the
same rename tricks.
I don't think I quite have the big picture here to decide which way makes
more sense, so I leave it up to you to decide. (I just made sure to mention
it in case it did need to be atomic.)
malloc.c: Is there any "config" in illumos so that places like this can
check for a (compile time) feature instead of an architecture?
We only have that in terms of features of the architecture and processor
or language, see <sys/isa_defs.h> and <sys/feature_tests.h>. So, for
better or worse, this is what we do in the rest of the system.
Yeah, that's what I thought.

Some more feedback:

umem_genasm.c:116: 0x0 -> 0x00

Overall, LGTM.

Jeff.

--
Mankind invented the atomic bomb, but no mouse would ever construct a
mousetrap.
- Albert Einstein

----------
From: Robert Mustacchi <rm at joyent.com>
Date: Mon, Jan 20, 2014 at 9:08 PM
To: Josef 'Jeff' Sipek <jeffpc at josefsipek.net>
Cc: illumos Developer <developer at lists.illumos.org>
Should umem_base.h be updated too?
Yes, I forgot about that. Thanks.
stub_stand.c:151: is atomic_swap_64 supposed to be atomic? (I'm not familiar
with the requirements for stubs.)
So, no, not really. These are basically just used for kmdb and aren't
atomic. See what we're doing with mutex_lock and umem_atomic_add_64 in
cmd/mdb/common/kmdb/kmdb_umemglue.c. If you feel it would be more
consistent, I could move its stub implementation into there and do the
same rename tricks.
I don't think I quite have the big picture here to decide which way makes
more sense, so I leave it up to you to decide. (I just made sure to mention
it in case it did need to be atomic.)
I've opted to do the latter to be more consistent.
umem_genasm.c:116: 0x0 -> 0x00
Fixed.

Webrev updated in place.

Robert

----------
From: Josef 'Jeff' Sipek <jeffpc at josefsipek.net>
Date: Mon, Jan 20, 2014 at 9:24 PM
To: Robert Mustacchi <rm at joyent.com>
Cc: illumos Developer <developer at lists.illumos.org>
Should umem_base.h be updated too?
Yes, I forgot about that. Thanks.
stub_stand.c:151: is atomic_swap_64 supposed to be atomic? (I'm not familiar
with the requirements for stubs.)
So, no, not really. These are basically just used for kmdb and aren't
atomic. See what we're doing with mutex_lock and umem_atomic_add_64 in
cmd/mdb/common/kmdb/kmdb_umemglue.c. If you feel it would be more
consistent, I could move its stub implementation into there and do the
same rename tricks.
I don't think I quite have the big picture here to decide which way makes
more sense, so I leave it up to you to decide. (I just made sure to mention
it in case it did need to be atomic.)
I've opted to do the latter to be more consistent.
Turns out that I have no idea how the rename tricks work. Assuming that it
works, LGTM.

Jeff.
umem_genasm.c:116: 0x0 -> 0x00
Fixed.
Webrev updated in place.
Robert
--
*NOTE: This message is ROT-13 encrypted twice for extra protection*

----------
From: Robert Mustacchi <rm at joyent.com>
Date: Tue, Jan 21, 2014 at 12:21 AM
To: Dan McDonald <danmcd at omniti.com>
Cc: illumos Developer <developer at lists.illumos.org>


Hey Dan,

Responses inline.
usr/src/lib/libc/inc/thr_uberdata.h
* Why 16 for tm_roots? Ahhh, it's on lines 405-408 of umem.c. Mention
the big-theory statement, please?
I've added a comment to direct folks there.
usr/src/lib/libumem/common/mapfile-vers
* Have you thought through the security considerations of allowing
asm_subr.o's text segment to be writable? Or am I worrying about nothing?
Worst case, you create a new .o that JUST has these writable malloc
linkages in it, I think.
If we wanted to do anything, we could mprotect it after the work done by
umem_genasm(). That'd actually solve the problems this could cause. I
think that's a reasonable, low-priority follow-up RFE.
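A sketch of what that follow-up could look like (the function and the
region boundaries are hypothetical, not actual libumem internals):

	#include <sys/mman.h>
	#include <unistd.h>
	#include <stdint.h>

	/*
	 * Hypothetical follow-up: once umem_genasm() has finished
	 * patching the malloc/free trampolines, drop PROT_WRITE on the
	 * page(s) holding them so the text is no longer writable.
	 */
	static int
	umem_protect_trampolines(void *start, size_t len)
	{
		uintptr_t pgsz = (uintptr_t)sysconf(_SC_PAGESIZE);
		uintptr_t base = (uintptr_t)start & ~(pgsz - 1);
		size_t span = ((uintptr_t)start - base) + len;

		return (mprotect((void *)base, span, PROT_READ | PROT_EXEC));
	}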
usr/src/lib/libumem/common/umem.c
* Sorry for being dense, but 16 for TMNUMROOTS still seems arbitrary. Please
enlighten me, or tell me where in here or in umem_genasm.c I missed it?!
Well, it isn't properly explained, so I'll try to add something to the
comments. It is slightly arbitrary, but it was chosen based on the sizes
of the allocations the per-thread cache would cover and the likelihood
we've seen for applications to allocate those sizes. With 16 roots,
allocations of <= 256 bytes on i386 and <= 448 bytes on amd64 will be
cached. Beyond those sizes you start getting into more seriously sized
objects, which are much more noticeable elsewhere in the system, so we
felt the efficacy of the per-thread cache would decrease and that folks
doing larger allocations are much more likely to notice them and less
likely to make them every time by accident.
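To make that concrete, a hedged sketch of the size-to-root mapping; the
size table here is illustrative, not the exact libumem cache table:

	#include <stddef.h>

	#define	TMNUMROOTS	16

	/* Illustrative small-cache sizes; one per-thread root per size. */
	static const size_t tm_sizes[TMNUMROOTS] = {
		8, 16, 32, 48, 64, 80, 96, 112,
		128, 160, 192, 224, 256, 320, 384, 448
	};

	/* Map a request to its root; -1 means take the normal umem path. */
	static int
	tm_root_index(size_t size)
	{
		int i;

		for (i = 0; i < TMNUMROOTS; i++) {
			if (size <= tm_sizes[i])
				return (i);
		}
		return (-1);
	}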

I've updated the webrev in place.

Robert
--
Irek
Glenn Fowler
2014-01-22 07:46:18 UTC
Phong peruses the list

the top comment in src/lib/libast/vmalloc/vmbest.c:

/* Best-fit allocation method extended for concurrency.
**
** The memory arena of a region is organized as follows:
** 1. Raw memory are (possibly noncontiguous) segments (Seg_t) obtained
** via Vmdisc_t.memoryf. Segments may be further partitioned into
** "superblocks" on requests from memory packs (more below). The
** collection of superblocks is called the "segment pool".
** 2. A region consists of one or more packs. Each pack typically holds
** one or more superblocks from the segment pool. These blocks are
** partitioned into smaller blocks for allocation. General allocation
** is done in a best-fit manner with a splay tree holding free blocks
** by size. Caches of small blocks are kept to speed up allocation.
** 3. Packs are created dynamically and kept in an array. An allocation
** request uses the ASO's ID of the calling thread/process to hash
** into this array and search for packs with available memory. Thus,
** allocation is thread-preferred but not thread-specific since
** all packs may be used by all threads/processes.
**
** Written by Kiem-Phong Vo, 01/16/1994, 12/21/2012.
*/
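Point 3 above (thread-preferred pack selection) can be sketched roughly
as follows; Pack_t and the helpers are hypothetical stand-ins for the
vmalloc internals, not the actual vmbest.c code:

	#include <stddef.h>

	typedef struct Pack_s Pack_t;		/* hypothetical pack type */

	extern Pack_t	*Packs[];		/* dynamically grown pack array */
	extern int	Npacks;
	extern unsigned	asothreadid(void);	/* ASO thread/process ID */
	extern void	*packalloc(Pack_t *, size_t); /* 0 if pack busy/full */

	void *
	bestalloc(size_t size)
	{
		int	i, h = (int)(asothreadid() % (unsigned)Npacks);
		void	*ptr;

		/* try the thread's preferred pack first, then the others */
		for (i = 0; i < Npacks; i++)
			if ((ptr = packalloc(Packs[(h + i) % Npacks], size)) != 0)
				return (ptr);
		return (0);	/* all packs exhausted; caller grows the region */
	}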

if you want to compare malloc implementation X against vmalloc w.r.t.
multi-thread/proc behavior, then look at these tests:

tcache-scratch.c
tcache-thrash.c
tcontent.c
tmtmalloc.c
tperform.c
tsafemalloc.c
tsignal.c

in src/cmd/tests/vmalloc

I believe the tests use just malloc()/free() when VMALLOC is not defined
(i.e., non-ast builds). I'm sure the list would be interested in
comparison results.
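That shim presumably looks something like this (the macro and region
names are assumptions, not quoted from the tests):

	/*
	 * Assumed shape of the test allocator shim: ast builds define
	 * VMALLOC and allocate from a vmalloc region (Vmheap is libast's
	 * default region); non-ast builds use plain malloc()/free().
	 */
	#ifdef VMALLOC
	#include <vmalloc.h>
	#define	ALLOC(n)	vmalloc(Vmheap, (n))
	#define	FREE(p)		vmfree(Vmheap, (p))
	#else
	#include <stdlib.h>
	#define	ALLOC(n)	malloc(n)
	#define	FREE(p)		free(p)
	#endif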

to see how the tests are run for ast malloc:

bin/package use
cd tests
nmake -n test.vmalloc
Post by Irek Szczesniak
Does anyone have Phong Vo <kpv at research.att.com>'s new email address?
The following work (i.e., the per-thread caching, etc.) in Illumos might
be very interesting...
Forwarded conversation
Subject: [developer] 4489-4491 ptcumem and other libumem enhancements
------------------------
From: Robert Mustacchi <rm at joyent.com>
Date: Thu, Jan 16, 2014 at 3:59 AM
To: illumos Developer <developer at lists.illumos.org>
This combines work done over the past few years at Joyent enhancing
libumem. It's broken down into two commits: one adds 128k slabs and
makes VM_BESTFIT the default; the other adds per-thread caching to
umem. For more on umem, see
http://dtrace.org/blogs/rm/2012/07/16/per-thread-caching-in-libumem/.
https://us-east.manta.joyent.com/rmustacc/public/webrevs/4489/index.html
Robert
Phong Vo
2014-01-22 15:18:12 UTC
I recall testing Vmalloc against libumem (perhaps not the latest version)
and a few other malloc implementations (Hoard, JEMalloc, MTmalloc,
TCmalloc) about a year ago while still at AT&T. The only other malloc that
was competitive with Vmalloc over a whole range of tests (in the list that
Glenn gave) was Google's TCmalloc. TCmalloc and Vmalloc edged each other on
time depending on tests and platforms (Solaris vs. Linux), but Vmalloc
usually won on space compaction.

Overall, I think the thread-preferred model of Vmalloc allows better
adaptation than other thread-specific mallocs in terms of managing all
memory in the arena for an environment where threads (and parallel
processes) are not uniform in their computational loads. This was proven
out in the Daytona database system where Vmalloc was used to manage shared
memory (shmem) across processes doing rather different types of
computations.

Phong
Post by Glenn Fowler
Phong peruses the list
/* Best-fit allocation method extended for concurrency.
**
** 1. Raw memory are (possibly noncontiguous) segments (Seg_t) obtained
** via Vmdisc_t.memoryf. Segments may be further partitioned into
** "superblocks" on requests from memory packs (more below). The
** collection of superblocks is called the "segment pool".
** 2. A region consists of one or more packs. Each pack typically holds
** one or more superblocks from the segment pool. These blocks are
** partitioned into smaller blocks for allocation. General allocation
** is done in a best-fit manner with a splay tree holding free blocks
** by size. Caches of small blocks are kept to speed up allocation.
** 3. Packs are created dynamically and kept in an array. An allocation
** request uses the ASO's ID of the calling thread/process to hash
** into this array and search for packs with available memory. Thus,
** allocation is thread-preferred but not thread-specific since
** all packs may be used by all threads/processes.
**
** Written by Kiem-Phong Vo, 01/16/1994, 12/21/2012.
*/
if you want to compare malloc implementation X against vmalloc w.r.t.
multi-thread/proc behavior, then look at these tests:
tcache-scratch.c
tcache-thrash.c
tcontent.c
tmtmalloc.c
tperform.c
tsafemalloc.c
tsignal.c
in src/cmd/tests/vmalloc
I believe the tests use just malloc()/free() when VMALLOC is not defined
(i.e., non-ast builds). I'm sure the list would be interested in
comparison results.
bin/package use
cd tests
nmake -n test.vmalloc
Post by Irek Szczesniak
Does anyone have Phong Vo <kpv at research.att.com>'s new email address?
The following work (i.e., the per-thread caching, etc.) in Illumos might
be very interesting...
Forwarded conversation
Subject: [developer] 4489-4491 ptcumem and other libumem enhancements
------------------------
From: Robert Mustacchi <rm at joyent.com>
Date: Thu, Jan 16, 2014 at 3:59 AM
To: illumos Developer <developer at lists.illumos.org>
This combines work done over the past few years at Joyent enhancing
libumem. It's broken down into two commits: one adds 128k slabs and
makes VM_BESTFIT the default; the other adds per-thread caching to
umem. For more on umem, see
http://dtrace.org/blogs/rm/2012/07/16/per-thread-caching-in-libumem/.
https://us-east.manta.joyent.com/rmustacc/public/webrevs/4489/index.html
Robert