malloc

From Seo Wiki - Search Engine Optimization and Programming Languages

Jump to: navigation, search

In computing, malloc is a subroutine for performing dynamic memory allocation in the C and C++ programming languages. It is part of the standard library for both languages. Many implementations of malloc are available, each of which performs differently depending on the computing hardware and how a program is written. Performance varies in both execution time and required memory. Programs must properly manage dynamic memory allocated through the use of malloc to avoid memory leaks and memory corruption.

Contents

Rationale

The C programming language manages memory statically, automatically, or dynamically. Static-duration variables are allocated in main (fixed) memory and persist for the lifetime of the program; automatic-duration variables are allocated on the stack and come and go as functions are called and return. For static-duration and, before C99 (which allows variable-length automatic arrays[1]), automatic-duration variables, the size of the allocation is required to be compile-time constant. If the required size is not known until run-time (for example, if data of arbitrary size is being read from the user or from a disk file), then using fixed-size data objects is inadequate.

The lifetime of allocated memory is also a concern. Neither static- nor automatic-duration memory is adequate for all situations. Automatic-allocated data cannot persist across multiple function calls, while static data persists for the life of the program whether it is needed or not. In many situations the programmer requires greater flexibility in managing the lifetime of allocated memory.

These limitations are avoided by using dynamic memory allocation in which memory is more explicitly (but more flexibly) managed, typically, by allocating it from the heap, an area of memory structured for this purpose. In C, the library function malloc is used to allocate a block of memory on the heap. The program accesses this block of memory via a pointer that malloc returns. When the memory is no longer needed, the pointer is passed to free which deallocates the memory so that it can be used for other purposes.

Some platforms provide library calls which allow run-time dynamic allocation from the C stack rather than the heap (e.g. Unix alloca()[2], Microsoft Windows CRTL's malloca()[3]). This memory is automatically freed when the calling function ends. The need for this is lessened by changes in the C99 standard, which added support for variable-length arrays of block scope having sizes determined at runtime.

Dynamic memory allocation in C

The malloc function is one of the functions in standard C to allocate memory. Its function prototype is

void *malloc(size_t size);

which allocates size bytes of memory. If the allocation succeeds, a pointer to the block of memory is returned, otherwise a NULL pointer is returned.

Memory allocated via malloc is persistent: it will continue to exist until the program terminates or the memory is explicitly deallocated by the programmer (that is, the block is said to be "freed"). This is achieved by use of the free function. Its prototype is

void free(void *pointer);

which releases the block of memory pointed to by pointer. pointer must have been previously returned by malloc, calloc, or realloc and must only be passed to free once.

Usage example

The standard method of creating an array of 10 int objects:

int array[10];

However, if one wishes to allocate a similar array dynamically, the following code could be used:

/* Allocate space for an array with ten elements of type int. */
int *ptr = (int*)malloc(10 * sizeof (int));
if (ptr == NULL) {
    /* Memory could not be allocated, the program should handle the error here as appropriate. */
} else {
    /* Allocation succeeded.  Do something.  */
    free(ptr);  /* We are done with the int objects, and free the associated pointer. */ 
    ptr = NULL; /* The pointer must not be used again, unless re-assigned to using malloc again. */
}

malloc returns a null pointer to indicate that no memory is available, or that some other error occurred which prevented memory being allocated.

A useful idiom with malloc is shown in this example:

int *ptr = malloc(10 * sizeof *ptr);

That is, instead of writing a hard-wired type into the argument to malloc, one uses the sizeof operator on the content of the pointer to be allocated. This ensures that the types on the left and right of the assignment will never get out of sync when code is revised.

Casting and type safety

malloc returns a void pointer (void *), which indicates that it is a pointer to a region of unknown data type. One may "cast" (see type conversion) this pointer to a specific type, as in

int *ptr = (int*)malloc(10 * sizeof (int));

When using C, this is considered bad practice; it is redundant under the C standard. Moreover, putting in a cast may mask failure to include the header stdlib.h, in which the prototype for malloc is found. In the absence of a prototype for malloc, the C compiler will assume that malloc returns an int, and will issue a warning in a context such as the above, provided the error is not masked by a cast.

The returned pointer need not be explicitly cast to a more specific pointer type, since ANSI C defines an implicit conversion between the void pointer type and other pointers to objects. An explicit cast of malloc's return value is sometimes performed because malloc originally returned a char *, but this cast is unnecessary in standard C code.[4][5] Omitting the cast, however, creates an incompatibility with C++, which does require it.

The lack of a specific pointer type returned from malloc is type-unsafe behaviour: malloc allocates based on byte count but not on type. This distinguishes it from the C++ new operator that returns a pointer whose type relies on the operand. The latter is closer to type-safe behaviour, though neither are type-safe since the pointer type can be overridden (see C Type Safety).

Related functions

calloc

malloc returns a block of memory that is allocated for the programmer to use, but is uninitialized. The memory is usually initialized by hand if necessary—either via the memset function, or by one or more assignment statements that dereference the pointer. An alternative is to use the calloc function, which allocates memory and then initializes it. Its prototype is

void *calloc(size_t nelements, size_t elementSize);

which allocates a region of memory, initialized to 0, of size nelements × elementSize.

realloc

It is often useful to be able to grow or shrink a block of memory. This can be done using realloc which returns a pointer to a memory region of the specified size, which contains the same data as the old region pointed to by pointer (truncated to the minimum of the old and new sizes). If realloc is unable to resize the memory region in place, it allocates new storage, copies the required data, and frees the old pointer. If this allocation fails, realloc maintains the original pointer unaltered, and returns the null pointer value. The newly allocated region of memory is uninitialized (its contents are not predictable). The function prototype is

void *realloc(void *pointer, size_t size);

realloc behaves like malloc if the first argument is NULL:

void *p = malloc(42);
void *p = realloc(NULL, 42); /* equivalent */

In the C89 standard, realloc with length 0 is the same as a free(), and NULL is returned [1]. In the C99 standard, this is no longer the case, the allocated memory block is reduced in size to zero bytes and a non-NULL pointer is returned (which cannot be directly dereferenced, since it points at no allocated memory, but it can be used in future calls to realloc and free) [2]. In the Open Group Unix specification the behaviour is implementation-defined [3]: either it matches C89, or C99. Under all standards, NULL can be returned on memory allocation failure. When using realloc, one should always use a temporary variable. For example

void *p = malloc(orig_size);
/* and later... */
void *tmp = realloc(p, big_size); 
if (tmp != NULL) {
   p = tmp; /* OK, assign new, larger storage to p */
} else {
   /* handle the problem somehow */
}

If instead one did

void *p = malloc(orig_size);
/* and later... */
p = realloc(p, big_size);

and if it is not possible to obtain big_size bytes of memory, then p will have value NULL and we no longer have a pointer to the memory previously allocated for p, creating a memory leak (see below).

Common errors

The improper use of malloc and related functions can frequently be a source of bugs.

Allocation failure

malloc is not guaranteed to succeed — if there is no memory available, or if the program has exceeded the amount of memory it is allowed to reference, malloc will return a null pointer. Many programs do not check for malloc failure. Such a program would attempt to use the null pointer returned by malloc as if it pointed to allocated memory, and the program would crash.

Memory leaks

When a call to malloc, calloc or realloc succeeds, the return value of the call should eventually be passed to the free function. This releases the allocated memory, allowing it to be reused to satisfy other memory allocation requests. If this is not done, the allocated memory will not be released until the process exits (and in some environments, not even then) — in other words, a memory leak will occur. Typically, memory leaks are caused by losing track of pointers, for example not using a temporary pointer for the return value of realloc, which may lead to the original pointer being overwritten with a null pointer, for example:

void *ptr;
size_t size = BUFSIZ;
 
ptr = malloc(size);
 
/* some further execution happens here... */
 
/* now the buffer size needs to be doubled */
if (size > SIZE_MAX / 2) {
  /* handle overflow error */
  /* ... */
  return (1);
}
size *= 2;
ptr = realloc(ptr, size);
if (ptr == NULL) {
  /* the realloc failed (it returned a null pointer), but the original address in ptr has been lost
     so the memory cannot be freed and a leak has occurred */
  /* ... */
  return 1;
}
/* ... */

Use after free

After a pointer has been passed to free, it becomes a dangling pointer: it references a region of memory with undefined content, which may not be available for use. The pointer's value cannot be accessed. For example:

int *ptr = malloc(sizeof (int));
free(ptr);
*ptr = 0; /* Undefined behavior */

Code like this has undefined behavior: its effect may vary. Even attempting to print the variable with printf is undefined behavior (assuming malloc did not return a null pointer); for example:

printf("%p", (void *) *ptr); /* Undefined behavior */

Commonly, the system may have reused freed memory for other purposes. Therefore, writing through a pointer to a deallocated region of memory may result in overwriting another piece of data somewhere else in the program. Depending on what data is overwritten, this may result in data corruption or cause the program to crash at a later time. A particularly bad example of this problem is if the same pointer is passed to free twice, known as a double free. To avoid this, some programmers set pointers to NULL after passing them to free[6]:

free(ptr); 
ptr = NULL; /*is safe (it does nothing).*/

However, this will not protect other aliases to the same pointer from being doubly freed.

Best practice is that a pointer passes out of scope immediately after being freed.

Freeing unallocated memory

Another problem is when free is passed an address that wasn't allocated by malloc, realloc or calloc. This can be caused when a pointer to a literal string or the name of a declared array is passed to free, for example:

char *msg = "Default message";
int tbl[100];

Passing either of the above pointers to free will result in undefined behaviour.

Implementations

The implementation of memory management depends greatly upon operating system and architecture. Some operating systems supply an allocator for malloc, while others supply functions to control certain regions of data. The same dynamic memory allocator is often used to implement both malloc and operator new in C++. Hence, it is referred to below as the allocator rather than malloc.

Heap-based

Implementation of the allocator on IA-32 architectures is commonly done using the heap, or data segment. The allocator will usually expand and contract the heap to fulfill allocation requests.

The heap method suffers from a few inherent flaws, stemming entirely from fragmentation. Like any method of memory allocation, the heap will become fragmented; that is, there will be sections of used and unused memory in the allocated space on the heap. A good allocator will attempt to find an unused area of already allocated memory to use before resorting to expanding the heap. The major problem with this method is that the heap has only two significant attributes: base, or the beginning of the heap in virtual memory space; and length, or its size. The heap requires enough system memory to fill its entire length, and its base can never change. Thus, any large areas of unused memory are wasted. The heap can get "stuck" in this position if a small used segment exists at the end of the heap, which could waste any magnitude of address space, from a few megabytes to a few hundred.

dlmalloc (the glibc allocator)

Since 2.3 release GNU C library (glibc) uses a modified ptmalloc2, based on "Doug Lea's Malloc" (dlmalloc).[7]

Memory on the heap is allocated as "chunks", an 8-byte aligned data structure which contains a header and usable memory. Allocated memory contains 4 bytes overhead for the size of the chunk and usage flags. Unallocated chunks also store pointers to other free chunks in the usable space area, making the minimum chunk size 24-bytes.[7]

Unallocated memory is grouped into "bins" of similar sizes, implemented by using a double-linked list of chunks (with pointers stored in the unallocated space inside the chunk).[7]

If additional memory needs to be allocated, dlmalloc often uses the brk call to the Linux kernel to increase the size of the heap. Increasing the size of the heap increases the size of the top-most chunk (wilderness chunk), which is always unallocated, and is treated specially by malloc.[7]

For requests above a threshold, the memory is allocated using the mmap system call. The threshold is 1 megabyte by default, but can be changed by calling the mallopt function.[8] The mmap method averts problems with huge buffers trapping a small allocation at the end after their expiration, but always allocates an entire page of memory, which on many architectures are 4096 bytes in size. See Sanderson, Bruce. "RAM, Virtual Memory, Pagefile and all that stuff", Microsoft Help and Support, 2004-12-12.

FreeBSD's jemalloc

Since FreeBSD 7.0, old malloc implementation (phkmalloc) was replaced by new one, written by Jason Evans. Main reason for this was lack of scalability of phkmalloc in terms of multithreading. In order to avoid lock contention, jemalloc uses separate 'arenas' for each CPU. Experiments measuring number of allocations per second in multithreading application has shown that this makes it scale linearly with the number of threads, while for both pkhmalloc and dlmalloc performance was inverse proportional to the number of threads.[9]

Since Firefox 3, jemalloc is used as the default allocator instead of using the one provided by the operating system. This improves performance and lowers memory consumption, due to less fragmentation.

OpenBSD's malloc

OpenBSD's implementation of the malloc function makes use of mmap. For requests greater in size than one page, the entire allocation is retrieved using mmap; smaller sizes are assigned from memory pools maintained by malloc within a number of "bucket pages," also allocated with mmap. On a call to free, memory is released and unmapped from the process address space using munmap. This system is designed to improve security by taking advantage of the address space layout randomization and gap page features implemented as part of OpenBSD's mmap system call, and to detect use-after-free bugs—as a large memory allocation is completely unmapped after it is freed, further use causes a segmentation fault and termination of the program.

Hoard's malloc

The Hoard memory allocator is an allocator whose goal is scalable memory allocation performance. Like OpenBSD's allocator, Hoard uses mmap exclusively, but manages memory in chunks of 64 kilobytes called superblocks. Hoard's heap is logically divided into a single global heap and a number of per-processor heaps. In addition, there is a thread-local cache that can hold a limited number of superblocks. By allocating only from superblocks on the local per-thread or per-processor heap, and moving mostly-empty superblocks to the global heap so they can be reused by other processors, Hoard keeps fragmentation low while achieving near linear scalability with the number of threads.[4]

In-kernel

Operating system kernels need to allocate memory just as application programs do. The implementation of malloc within a kernel often differs significantly from the implementations used by C libraries, however. For example, memory buffers might need to conform to special restrictions imposed by DMA, or the memory allocation function might be called from interrupt context [5]. This necessitates a malloc implementation tightly integrated with the virtual memory subsystem of the operating system kernel.

Allocation size limits

The largest possible memory block malloc can allocate depends on the host system, particularly the size of physical memory and the operating system implementation. Theoretically, the largest number should be the maximum value that can be held in a size_t type, which is an implementation-dependent unsigned integer representing the size of an area of memory. The maximum value is 2CHAR_BIT*sizeof(size_t) − 1, or the constant SIZE_MAX in the C99 standard.

See also

References

  1. gcc manual on gnu.org accessed at December 14, 2008
  2. http://man.freebsd.org/alloca
  3. malloca() page on MSDN Visual C++ Developer Center. Accessed on 12th March 2009
  4. comp.lang.c FAQ list · Question 7.7b on C-FAQ accessed at March 9, 2007
  5. FAQ > Explanations of... > Casting malloc on Cprogramming.com accessed at March 9, 2007
  6. The Open Group Base Specifications Issue 6 on The Open Group accessed at March 9, 2007
  7. 7.0 7.1 7.2 7.3 Kaempf, Michel (2001). "Vudo malloc tricks". Phrack (57): 8. http://phrack.org/issues.html?issue=57&id=8&mode=txt. Retrieved 2009-04-29. 
  8. "Malloc Tunable Parameters". GNU. http://www.gnu.org/software/libc/manual/html_node/Malloc-Tunable-Parameters.html. Retrieved 2009-05-02. 
  9. http://people.freebsd.org/~jasone/jemalloc/bsdcan2006/jemalloc.pdf

External links

it:Malloc ja:Malloc pt:Malloc ru:Malloc sr:Malloc

Personal tools

Served in 0.283 secs.