Undefined behavior


In computer science, undefined behavior is a feature of some programming languages, most famously C. To simplify the specification and leave implementations some flexibility, the specification of such a language deliberately leaves the results of certain operations undefined.

For example, in C the use of any automatic variable before it has been initialized yields undefined behavior, as do division by zero and indexing an array outside of its defined bounds (see buffer overflow). This frees the compiler to do whatever is easiest or most efficient when such a program is submitted. In general, the behavior of the program after that point is also undefined. In particular, the compiler is never required to diagnose undefined behavior. Programs that invoke it may therefore appear to compile and even run without errors at first, only to fail on another system, or even on another date. Once undefined behavior occurs, anything could happen as far as the language specification is concerned, including nothing at all.
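As a minimal sketch of how the first and third cases mentioned above are usually avoided, the function below initializes its accumulator before reading it and keeps every index inside the array's bounds. The name sum_values and its parameters are hypothetical, chosen only for illustration. The sketch assumes that values points to at least count elements and that the total fits in an int.

```cpp
#include <cstddef>

// Hypothetical helper: adds up the first `count` elements of `values`.
// Assumes `values` refers to at least `count` ints and the sum fits in an int.
int sum_values(const int* values, std::size_t count)
{
    int total = 0; // initialized, so the first "total +=" reads a defined value
    for (std::size_t i = 0; i < count; ++i) // i stays within [0, count)
    {
        total += values[i]; // never indexes outside the stated bounds
    }
    return total;
}
```

Leaving total uninitialized, or letting the loop run to i <= count, would bring back exactly the kinds of undefined behavior described above, and a compiler would not be obliged to report either mistake.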

In some languages (including C), even the compiler is not bound to behave in a sensible manner once undefined behavior has been invoked. One instance of such latitude being used as an Easter egg is the behavior of early versions of the GCC C compiler when given a program containing the #pragma directive, which has implementation-defined behavior according to the C standard. ("Implementation-defined" is more restrictive than "undefined": it requires the implementation to document what it does.) In practice, many C implementations recognize, for example, #pragma once as a rough equivalent of #include guards. GCC 1.21, however, upon finding a #pragma directive, would instead attempt to launch commonly distributed Unix games such as NetHack and Rogue, or start Emacs running a simulation of the Towers of Hanoi.[1]

Under some circumstances there can be specific restrictions on undefined behavior. For example, the instruction set architecture of a CPU might leave the behavior of some forms of an instruction undefined. If the CPU supports memory protection, however, the architecture specification will probably include a blanket rule stating that no user-accessible instruction may cause a hole in the operating system's security. An implementation of the architecture would then be permitted to corrupt all user registers in response to such an instruction, but would not be allowed to, for example, switch into supervisor mode.

Examples in C++

Attempting to modify a string literal causes undefined behavior:[2]

int main()
{
  char * p = "wikipedia"; // relies on the implicit conversion from const char[] to char*, deprecated in C++03 and ill-formed since C++11
  p[0] = 'W'; // undefined behavior
}
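A well-defined alternative, sketched here under the assumption that the goal is simply a modified copy of the text, is to copy the literal into writable storage first. The function name capitalize_first is hypothetical, and it assumes text is a non-null, null-terminated string.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns a copy of `text` with its first character upper-cased.
// Assumes `text` is a non-null, null-terminated string.
std::string capitalize_first(const char* text)
{
    std::string copy(text); // the characters now live in writable storage owned by `copy`
    if (!copy.empty())
    {
        // std::toupper expects a value representable as unsigned char
        copy[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(copy[0])));
    }
    return copy; // the string literal itself is never modified
}
```

Calling capitalize_first("wikipedia") modifies only the copy, so the literal is left untouched.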

Division by zero results in undefined behavior:[3]

int f(int x)
{
  return x/0; // undefined behavior
}
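One common way to keep a division well-defined is to test the operands before dividing. The sketch below uses a hypothetical function, checked_div, that reports failure instead of dividing when the result would be undefined. Besides a zero divisor it also rejects INT_MIN / -1, whose quotient cannot be represented in an int on the usual two's complement implementations.

```cpp
#include <climits>

// Hypothetical helper: stores numerator / denominator in `quotient` and returns true,
// or returns false (leaving `quotient` untouched) when the division would be undefined.
bool checked_div(int numerator, int denominator, int& quotient)
{
    if (denominator == 0)
    {
        return false; // division by zero is undefined
    }
    if (numerator == INT_MIN && denominator == -1)
    {
        return false; // the mathematical result does not fit in an int
    }
    quotient = numerator / denominator; // both operands have been validated
    return true;
}
```

The caller decides how to handle a false return, for example by reporting an error to the user.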

Certain pointer operations may result in undefined behavior:[4]

int main()
{
  int arr[4] = {0, 1, 2, 3};
  int * p = arr + 5; // undefined behavior
}
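For comparison, arr + 4 in the example above would be valid to form, because a pointer one past the last element is allowed as long as it is not dereferenced. The sketch below checks the index before doing any pointer arithmetic, so that an out-of-range pointer is never computed. The function name element_at is hypothetical, and it assumes first points to an array of at least length elements.

```cpp
#include <cstddef>

// Hypothetical helper: returns a pointer to element `index` of the array starting at `first`,
// or a null pointer when `index` is not a valid element index.
// Assumes `first` points to an array of at least `length` ints.
const int* element_at(const int* first, std::size_t length, std::size_t index)
{
    if (index >= length)
    {
        return nullptr; // refuse before any out-of-range pointer is formed
    }
    return first + index; // stays inside the array, so the arithmetic is well-defined
}
```

With the four-element arr from the example, element_at(arr, 4, 5) returns a null pointer rather than evaluating arr + 5.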

References

  1. "A Pragmatic Decision" quotes the March 1988 issue of UNIX Review magazine, which referred to GCC version 1.17 but got the order wrong. "Everything2: #pragma" gives the correct order. The actual code is in file "cccp.c" in the GCC 1.21 distribution: http://www.oldlinux.org/Linux.old/gnu/gcc-1/
  2. ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++ §2.13.4 String literals [lex.string] para. 2
  3. ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++ §5.6 Multiplicative operators [expr.mul] para. 4
  4. ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++ §5.7 Additive operators [expr.add] para. 5
