C string

From Seo Wiki - Search Engine Optimization and Programming Languages

In computing, a C string is a character sequence stored as a one-dimensional character array and terminated with a null character ('\0', called NUL in ASCII). The name refers to the ubiquitous C programming language, which uses this string representation. Alternative names are ASCIIZ and null-terminated string.
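As a minimal sketch of the layout described above, the two-character literal "Hi" occupies three bytes once the terminator is counted. The names greeting and is_terminator are illustrative only and not part of any library.

```c
#include <assert.h>
#include <stddef.h>

/* "Hi" is stored as three bytes: 'H', 'i', and the terminating '\0'. */
static const char greeting[] = "Hi";

/* Hypothetical helper: reports whether the byte at index i of s is the
   NUL terminator. The caller must keep i within the array. */
int is_terminator(const char *s, size_t i)
{
    assert(s != NULL);
    return s[i] == '\0';
}
```

Here sizeof greeting is 3 although the visible text has two characters, and is_terminator(greeting, 2) is true.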

The length of a C string is found by searching for the first NUL byte. This takes O(n) (linear) time with respect to the string length, and it also means that a string cannot contain a NUL byte. When C and the languages it was derived from were developed, memory was extremely limited, so spending only one byte of overhead per string was imperative. The only popular alternative (often called a "Pascal string") also spent one byte, using it to store the length. This allowed the string to contain NUL and made finding the length take O(1) (constant) time, but it limited the length to 255 bytes. This limit proved to be far more restrictive than the limitations of C strings.
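The two ways of finding a length can be sketched as follows. cstr_length mirrors what strlen does conceptually (real implementations are usually more optimized), and pstr_length assumes a Pascal-style layout in which byte 0 holds the length; both names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* C string: scan for the first NUL byte. O(n) in the string length. */
size_t cstr_length(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Pascal-style string (assumed layout): byte 0 stores the length, so the
   length is read in O(1) but can never exceed 255. */
size_t pstr_length(const unsigned char *p)
{
    assert(p != NULL);
    return p[0];
}
```

Note that cstr_length stops at the first NUL, so any bytes after an embedded NUL are invisible to it, whereas the Pascal-style string may carry NUL bytes in its data.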

On modern systems memory usage is less of a concern, so a larger field can be used for the length (if a program has vast numbers of short strings, a hash table can be used to save memory instead). Most replacements for C strings, such as the C++ std::string container and the Qt string class, store the length in 32 bits or more. NUL bytes can therefore be placed in the string, and finding the length is O(1).
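A counted string of this kind might look like the sketch below. The struct and function names are made up for illustration and do not reflect how std::string or the Qt string class is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: the length is stored, not discovered. */
struct counted_str {
    size_t len;          /* number of data bytes */
    const char *data;    /* may contain NUL bytes; need not end in one */
};

struct counted_str counted_from(const char *bytes, size_t len)
{
    assert(bytes != NULL);
    struct counted_str s = { len, bytes };
    return s;
}

/* O(1): just read the stored field. */
size_t counted_length(struct counted_str s)
{
    return s.len;
}

/* Compares all len bytes, including any embedded NULs. */
int counted_equal(struct counted_str a, struct counted_str b)
{
    return a.len == b.len && memcmp(a.data, b.data, a.len) == 0;
}
```

With this layout, the five bytes of "ab\0cd" form a string of length 5, while strlen on the same bytes would report 2.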

Making a "copy" of a C string with any number of bytes removed from the start can be done by just advancing the pointer, an O(1) constant-time operation that copies no bytes (a string whose length is stored in front of its data, by contrast, must be copied or wrapped in a separate view). Many pieces of software have taken advantage of this, making it difficult to change them to a new string style without a serious speed impact.
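The pointer trick can be sketched like this. Both helpers are hypothetical; they return a pointer into the original storage, so the result is only valid while the original string is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns s with its first n bytes dropped. No bytes are copied.
   The caller must ensure n <= strlen(s). */
const char *skip_prefix(const char *s, size_t n)
{
    assert(s != NULL);
    return s + n;
}

/* If s begins with prefix, returns the remainder of s; otherwise s itself. */
const char *strip_prefix(const char *s, const char *prefix)
{
    assert(s != NULL && prefix != NULL);
    size_t n = strlen(prefix);
    return strncmp(s, prefix, n) == 0 ? s + n : s;
}
```

For example, strip_prefix("/usr/lib", "/usr") yields a pointer to the "/lib" portion of the same array rather than a new allocation.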

NUL termination has historically created security problems. A bug or a malicious program can insert a NUL into the middle of a string, truncating it unexpectedly. A common bug was to omit the NUL at the end of a string. This often went undetected because a NUL happened to follow the string anyway; when none did, program-internal data lying after the string was leaked as if it were part of the string. Because finding the length is expensive, many programs did not check it before copying a string into a fixed-size buffer, causing a buffer overflow.
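When the true byte count of some input is known from elsewhere (for example from a read call), the truncation problem can be detected before the data is handed to C string functions. The helper below is a hypothetical sketch built on the standard memchr function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the len-byte buffer contains a NUL anywhere in its first
   len bytes, meaning C string functions would see fewer than len bytes. */
int has_embedded_nul(const char *buf, size_t len)
{
    assert(buf != NULL);
    return memchr(buf, '\0', len) != NULL;
}
```

A caller could reject or escape such input instead of letting strlen or strcpy silently stop at the embedded NUL.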

Many attempts have been made to make C string handling less error-prone. These range from adding safer and more useful functions, such as strdup and strlcpy, to entire wrappers that treat the string as an opaque object, such as the MFC CString class, which internally represents the string as a C string but does not require the programmer to handle memory allocation.
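Since strlcpy is not part of ISO C and is not available on every platform, the idea behind such functions can be illustrated with a hand-written helper. copy_bounded is a hypothetical name, and its return value (a fit/truncated flag) differs from what strlcpy returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst, which has room for cap bytes,
   and always NUL-terminates dst. Returns 1 if all of src fit, 0 if the
   copy was truncated. */
int copy_bounded(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL && cap > 0);
    size_t n = strlen(src);
    size_t m = n < cap ? n : cap - 1;   /* leave room for the terminator */
    memcpy(dst, src, m);
    dst[m] = '\0';
    return n < cap;
}
```

Unlike strcpy, this never writes past the end of dst, and unlike strncpy it never leaves dst without a terminator.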


C String header

The C standard library header string.h (<cstring> in C++) is used to work with C strings. Confusion and programming errors arise when strings are treated as simple data types, because the built-in = and == operators act on the pointer rather than on the characters it points to. Specific functions have to be used instead, such as strcpy for assignment and strcmp or strncmp for comparison.
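The difference between comparing pointers and comparing contents can be shown with a short sketch; same_text is an illustrative wrapper around strcmp, not a standard function.

```c
#include <assert.h>
#include <string.h>

/* Compares the characters of two C strings, not their addresses. */
int same_text(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

Two separate arrays that both hold "cat" have different addresses, so == on pointers to them is false while same_text returns 1. Likewise, filling a buffer with the text requires strcpy (or a bounded variant); assigning one pointer to another only makes both refer to the same array.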

Functions included in <cstring>
Operation      Function   Description
Copying        memcpy     Copy a block of memory
               memmove    Move a block of memory (the regions may overlap)
               strcpy     Copy a string
               strncpy    Copy up to n characters from a string
Concatenation  strcat     Concatenate strings
               strncat    Append up to n characters from a string
Comparison     memcmp     Compare two blocks of memory
               strcmp     Compare two strings
               strcoll    Compare two strings using the current locale
               strncmp    Compare the first n characters of two strings
               strxfrm    Transform a string using the current locale
Searching      memchr     Locate a character in a block of memory
               strchr     Locate the first occurrence of a character in a string
               strcspn    Get the length of the initial span containing no characters from a set
               strpbrk    Locate the first occurrence of any character from a set
               strrchr    Locate the last occurrence of a character in a string
               strspn     Get the length of the initial span containing only characters from a set
               strstr     Locate a substring
               strtok     Split a string into tokens
Other          memset     Fill a block of memory
               strerror   Get a pointer to an error message string
               strlen     Get the length of a string
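As one example of combining the searching functions listed above, the sketch below counts delimiter-separated tokens with strspn and strcspn. count_tokens is a hypothetical helper; it is shown instead of strtok because strtok modifies the string it scans.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the tokens in s that are separated by any characters from delims.
   The input string is not modified. */
size_t count_tokens(const char *s, const char *delims)
{
    assert(s != NULL && delims != NULL);
    size_t count = 0;
    s += strspn(s, delims);           /* skip leading delimiters */
    while (*s != '\0') {
        count++;
        s += strcspn(s, delims);      /* skip over the token itself */
        s += strspn(s, delims);       /* skip the delimiters after it */
    }
    return count;
}
```

Runs of consecutive delimiters are treated as a single separator, which matches how strtok splits its input.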


C strings are equivalent to the strings created by the .ASCIZ directive of the PDP-11 and VAX macro assembly languages and the ASCIZ directive of the MACRO-10 macro assembly language for the PDP-10.
