Interview Questions
1. What is a local block?
A local block is any portion of a C program that is enclosed by the left brace ({) and the right brace (}). A C function contains left and right braces, and therefore anything between the two braces is contained in a local block. An if statement or a switch statement can also contain braces, so the portion of code between these two braces would be considered a local block.
Additionally, you might want to create your own local block without the aid of a C function or keyword construct. This is perfectly legal. Variables can be declared within local blocks, but they must be declared only at the beginning of a local block. Variables declared in this manner are visible only within the local block. Duplicate variable names declared within a local block take precedence over variables with the same name declared outside the local block. Here is an example of a program that uses local blocks:
#include <stdio.h>
void main(void);
void main()
{
/* Begin local block for function main() */
int test_var = 10;
printf("Test variable before the if statement: %d\n", test_var);
if (test_var > 5)
{
/* Begin local block for "if" statement */
int test_var = 5;
printf("Test variable within the if statement: %d\n",
test_var);
{
/* Begin independent local block (not tied to
any function or keyword) */
int test_var = 0;
printf(
"Test variable within the independent local block:%d\n",
test_var);
}
/* End independent local block */
}
/* End local block for "if" statement */
printf("Test variable after the if statement: %d\n", test_var);
}
/* End local block for function main() */
This example program produces the following output:
Test variable before the if statement: 10
Test variable within the if statement: 5
Test variable within the independent local block: 0
Test variable after the if statement: 10
Notice that as each test_var was defined, it took precedence over the previously defined test_var. Also notice that when the if statement local block had ended, the program had reentered the scope of the original test_var, and its value was 10.
2. Should variables be stored in local blocks?
The use of local blocks for storing variables is unusual and therefore should be avoided, with only rare exceptions. One of these exceptions would be for debugging purposes, when you might want to declare a local instance of a global variable to test within your function. You also might want to use a local block hen you want to make your program more readable in the current context.
Sometimes having the variable declared closer to where it is used makes your program more readable. However, well-written programs usually do not have to resort to declaring variables in this manner, and you should avoid using local blocks.
3. When is a switch statement better than multiple if statements?
A switch statement is generally best to use when you have more than two conditional expressions based on a single variable of numeric type. For instance, rather than the code
if (x == 1)
printf("x is equal to one.\n");
else if (x == 2)
printf("x is equal to two.\n");
else if (x == 3)
printf("x is equal to three.\n");
else
printf("x is not equal to one, two, or three.\n");
the following code is easier to read and maintain:
switch (x)
{
case 1: printf("x is equal to one.\n");
break;
case 2: printf("x is equal to two.\n");
break;
case 3: printf("x is equal to three.\n");
break;
default: printf("x is not equal to one, two, or three.\n");
break;
}
Notice that for this method to work, the conditional expression must be based on a variable of numeric type in order to use the switch statement. Also, the conditional expression must be based on a single variable. For instance, even though the following if statement contains more than two conditions, it is not a candidate for using a switchstatement because it is based on string comparisons and not numeric comparisons:
char* name = "Lupto";
if (!stricmp(name, "Isaac"))
printf("Your name means 'Laughter'.\n");
else if (!stricmp(name, "Amy"))
printf("Your name means 'Beloved'.\n ");
else if (!stricmp(name, "Lloyd"))
printf("Your name means 'Mysterious'.\n ");
else
printf("I haven't a clue as to what your name means.\n");
4. Is a default case necessary in a switch statement?
No, but it is not a bad idea to put default statements in switch statements for error- or logic-checking purposes. For instance, the following switch statement is perfectly normal:
switch (char_code)
{
case 'Y':
case 'y': printf("You answered YES!\n");
break;
case 'N':
case 'n': printf("You answered NO!\n");
break;
}
Consider, however, what would happen if an unknown character code were passed to this switch statement. The program would not print anything. It would be a good idea, therefore, to insert a default case where this condition would be taken care of:
...
default: printf("Unknown response: %d\n", char_code);
break;
...
Additionally, default cases come in handy for logic checking. For instance, if your switch statement handled a fixed number of conditions and you considered any value outside those conditions to be a logic error, you could insert adefault case which would flag that condition. Consider the following example:
void move_cursor(int direction)
{
switch (direction)
{
case UP: cursor_up();
break;
case DOWN: cursor_down();
break;
case LEFT: cursor_left();
break;
case RIGHT: cursor_right();
break;
default: printf("Logic error on line number %ld!!!\n",
__LINE__);
break;
}
}
5. Can the last case of a switch statement skip including the break?
Even though the last case of a switch statement does not require a break statement at the end, you should addbreak statements to all cases of the switch statement, including the last case. You should do so primarily because your program has a strong chance of being maintained by someone other than you who might add cases but neglect to notice that the last case has no break statement.
This oversight would cause what would formerly be the last case statement to "fall through" to the new statements added to the bottom of the switch statement. Putting a break after each case statement would prevent this possible mishap and make your program more "bulletproof." Besides, most of today's optimizing compilers will optimize out the last break, so there will be no performance degradation if you add it.
6. Other than in a for statement, when is the comma operator used?
The comma operator is commonly used to separate variable declarations, function arguments, and expressions, as well as the elements of a for statement. Look closely at the following program, which shows some of the many ways a comma can be used:
#include <stdio.h>
#include <stdlib.h>
void main(void);
void main()
{
/* Here, the comma operator is used to separate
three variable declarations. */
int i, j, k;
/* Notice how you can use the comma operator to perform
multiple initializations on the same line. */
i = 0, j = 1, k = 2;
printf("i = %d, j = %d, k = %d\n", i, j, k);
/* Here, the comma operator is used to execute three expressions
in one line: assign k to i, increment j, and increment k.
The value that i receives is always the rightmost expression. */
i = (j++, k++);
printf("i = %d, j = %d, k = %d\n", i, j, k);
/* Here, the while statement uses the comma operator to
assign the value of i as well as test it. */
while (i = (rand() % 100), i != 50)
printf("i is %d, trying again...\n", i);
printf("\nGuess what? i is 50!\n");
}
Notice the line that reads
i = (j++, k++);
This line actually performs three actions at once. These are the three actions, in order:
1. Assigns the value of k to i. This happens because the left value (lvalue) always evaluates to the rightmost argument. In this case, it evaluates to k. Notice that it does not evaluate to k++, because k++ is a postfix incremental expression, and k is not incremented until the assignment of k to i is made. If the expression had read++k, the value of ++k would be assigned to i because it is a prefix incremental expression, and it is incremented before the assignment is made.
2. Increments j.
3. Increments k.
Also, notice the strange-looking while statement:
while (i = (rand() % 100), i != 50)
printf("i is %d, trying again...\n");
Here, the comma operator separates two expressions, each of which is evaluated for each iteration of the whilestatement. The first expression, to the left of the comma, assigns i to a random number from 0 to 99.
The second expression, which is more commonly found in a while statement, is a conditional expression that tests to see whether i is not equal to 50. For each iteration of the while statement, i is assigned a new random number, and the value of i is checked to see that it is not 50. Eventually, i is randomly assigned the value 50, and thewhile statement terminates.
7. How can you tell whether a loop ended prematurely?
Generally, loops are dependent on one or more variables. Your program can check those variables outside the loop to ensure that the loop executed properly. For instance, consider the following example:
#define REQUESTED_BLOCKS 512
int x;
char* cp[REQUESTED_BLOCKS];
/* Attempt (in vain, I must add...) to
allocate 512 10KB blocks in memory. */
for (x=0; x< REQUESTED_BLOCKS; x++)
{
cp[x] = (char*) malloc(10000, 1);
if (cp[x] == (char*) NULL)
break;
}
/* If x is less than REQUESTED_BLOCKS,
the loop has ended prematurely. */
if (x < REQUESTED_BLOCKS)
printf("Bummer! My loop ended prematurely!\n");
Notice that for the loop to execute successfully, it would have had to iterate through 512 times. Immediately following the loop, this condition is tested to see whether the loop ended prematurely. If the variable x is anything less than 512, some error has occurred.
8. What is the difference between goto and long jmp( ) and setjmp()?
A goto statement implements a local jump of program execution, and the longjmp() and setjmp() functions implement a nonlocal, or far, jump of program execution. Generally, a jump in execution of any kind should be avoided because it is not considered good programming practice to use such statements as goto and longjmp in your program.
A goto statement simply bypasses code in your program and jumps to a predefined position. To use the gotostatement, you give it a labeled position to jump to. This predefined position must be within the same function. You cannot implement gotos between functions. Here is an example of a goto statement:
void bad_programmers_function(void)
{
int x;
printf("Excuse me while I count to 5000...\n");
x = 1;
while (1)
{
printf("%d\n", x);
if (x == 5000)
goto all_done;
else
x = x + 1;
}
all_done:
printf("Whew! That wasn't so bad, was it?\n");
}
This example could have been written much better, avoiding the use of a goto statement. Here is an example of an improved implementation:
void better_function(void)
{
int x;
printf("Excuse me while I count to 5000...\n");
for (x=1; x<=5000; x++)
printf("%d\n", x);
printf("Whew! That wasn't so bad, was it?\n");
}
As previously mentioned, the longjmp() and setjmp() functions implement a nonlocal goto. When your program calls setjmp(), the current state of your program is saved in a structure of type jmp_buf. Later, your program can call the longjmp() function to restore the program's state as it was when you called setjmp(). Unlike the gotostatement, the longjmp() and setjmp() functions do not need to be implemented in the same function.
However, there is a major drawback to using these functions: your program, when restored to its previously saved state, will lose its references to any dynamically allocated memory between the longjmp() and the setjmp(). This means you will waste memory for every malloc() or calloc() you have implemented between your longjmp() andsetjmp(), and your program will be horribly inefficient. It is highly recommended that you avoid using functions such as longjmp() and setjmp() because they, like the goto statement, are quite often an indication of poor programming practice.
Here is an example of the longjmp() and setjmp() functions:
#include <stdio.h>
#include <setjmp.h>
jmp_buf saved_state;
void main(void);
void call_longjmp(void);
void main(void)
{
int ret_code;
printf("The current state of the program is being saved...\n");
ret_code = setjmp(saved_state);
if (ret_code == 1)
{
printf("The longjmp function has been called.\n");
printf("The program's previous state has been restored.\n");
exit(0);
}
printf("I am about to call longjmp and\n");
printf("return to the previous program state...\n");
call_longjmp();
}
void call_longjmp(void)
{
longjmp(saved_state, 1);
}
9. What is an lvalue?
An lvalue is an expression to which a value can be assigned. The lvalue expression is located on the left side of an assignment statement, whereas an rvalue is located on the right side of an assignment statement. Each assignment statement must have an lvalue and an rvalue. The lvalue expression must reference a storable variable in memory. It cannot be a constant. For instance, the following lines show a few examples of lvalues:
int x;
int* p_int;
x = 1;
*p_int = 5;
The variable x is an integer, which is a storable location in memory. Therefore, the statement x = 1 qualifies x to be an lvalue. Notice the second assignment statement, *p_int = 5. By using the * modifier to reference the area of memory that p_int points to, *p_int is qualified as an lvalue. In contrast, here are a few examples of what would not be considered lvalues:
#define CONST_VAL 10
int x;
/* example 1 */
1 = x;
/* example 2 */
CONST_VAL = 5;
In both statements, the left side of the statement evaluates to a constant value that cannot be changed because constants do not represent storable locations in memory. Therefore, these two assignment statements do not contain lvalues and will be flagged by your compiler as errors.
10. Can an array be an lvalue?
Is an array an expression to which we can assign a value? The answer to this question is no, because an array is composed of several separate array elements that cannot be treated as a whole for assignment purposes. The following statement is therefore illegal:
int x[5], y[5];
x = y;
You could, however, use a for loop to iterate through each element of the array and assign values individually, such as in this example:
int i;
int x[5];
int y[5];
...
for (i=0; i<5; i++)
x[i] = y[i]
...
Additionally, you might want to copy the whole array all at once. You can do so using a library function such as thememcpy() function, which is shown here:
memcpy(x, y, sizeof(y));
It should be noted here that unlike arrays, structures can be treated as lvalues. Thus, you can assign one structure variable to another structure variable of the same type, such as this:
typedef struct t_name
{
char last_name[25];
char first_name[15];
char middle_init[2];
} NAME;
...
NAME my_name, your_name;
...
your_name = my_name;
...
In the preceding example, the entire contents of the my_name structure were copied into the your_name structure. This is essentially the same as the following line:
memcpy(your_name, my_name, sizeof(your_name));
11. What is an rvalue?
rvalue can be defined as an expression that can be assigned to an lvalue. The rvalue appears on the right side of an assignment statement.
Unlike an lvalue, an rvalue can be a constant or an expression, as shown here:
int x, y;
x = 1; /* 1 is an rvalue; x is an lvalue */
y = (x + 1); /* (x + 1) is an rvalue; y is an lvalue */
An assignment statement must have both an lvalue and an rvalue. Therefore, the following statement would not compile because it is missing an rvalue:
int x; x = void_function_call() /* the function void_function_call() returns nothing */
If the function had returned an integer, it would be considered an rvalue because it evaluates into something that the lvalue, x, can store.
12. Is left-to-right or right-to-left order guaranteed for operator precedence?
The simple answer to this question is neither. The C language does not always evaluate left-to-right or right-to-left. Generally, function calls are evaluated first, followed by complex expressions and then simple expressions.
Additionally, most of today's popular C compilers often rearrange the order in which the expression is evaluated in order to get better optimized code. You therefore should always implicitly define your operator precedence by using parentheses.
For example, consider the following expression:
a = b + c/d / function_call() * 5
The way this expression is to be evaluated is totally ambiguous, and you probably will not get the results you want. Instead, try writing it by using implicit operator precedence:
a = b + (((c/d) / function_call()) * 5)
Using this method, you can be assured that your expression will be evaluated properly and that the compiler will not rearrange operators for optimization purposes.
13. What is the difference between ++var and var++?
The ++ operator is called the increment operator. When the operator is placed before the variable (++var), the variable is incremented by 1 before it is used in the expression. When the operator is placed after the variable (var++), the expression is evaluated, and then the variable is incremented by 1.
The same holds true for the decrement operator (--). When the operator is placed before the variable, you are said to have a prefix operation. When the operator is placed after the variable, you are said to have a postfix operation.
For instance, consider the following example of postfix incrementation:
int x, y;
x = 1;
y = (x++ * 5);
In this example, postfix incrementation is used, and x is not incremented until after the evaluation of the expression is done. Therefore, y evaluates to 1 times 5, or 5. After the evaluation, x is incremented to 2.
Now look at an example using prefix incrementation:
int x, y;
x = 1;
y = (++x * 5);
This example is the same as the first one, except that this example uses prefix incrementation rather than postfix. Therefore, x is incremented before the expression is evaluated, making it 2. Hence, y evaluates to 2 times 5, or 10.
14. What does the modulus operator do?
The modulus operator (%) gives the remainder of two divided numbers. For instance, consider the following portion of code:
x = 15/7
If x were an integer, the resulting value of x would be 2. However, consider what would happen if you were to apply the modulus operator to the same equation:
x = 15%7
The result of this expression would be the remainder of 15 divided by 7, or 1. This is to say that 15 divided by 7 is 2 with a remainder of 1.
The modulus operator is commonly used to determine whether one number is evenly divisible into another. For instance, if you wanted to print every third letter of the alphabet, you would use the following code:
int x;
for (x=1; x<=26; x++)
if ((x%3) == 0)
printf("%c", x+64);
The preceding example would output the string "cfilorux", which represents every third letter in the alphabet.
Variable And Data Storage
1. Where in memory are my variables stored?
Variables can be stored in several places in memory, depending on their lifetime. Variables that are defined outside any function (whether of global or file static scope), and variables that are defined inside a function as static variables, exist for the lifetime of the program's execution. These variables are stored in the "data segment." The data segment is a fixed-size area in memory set aside for these variables. The data segment is subdivided into two parts, one for initialized variables and another for uninitialized variables.
Variables that are defined inside a function as auto variables (that are not defined with the keyword static) come into existence when the program begins executing the block of code (delimited by curly braces {}) containing them, and they cease to exist when the program leaves that block of code.
Variables that are the arguments to functions exist only during the call to that function. These variables are stored on the "stack". The stack is an area of memory that starts out small and grows automatically up to some predefined limit. In DOS and other systems without virtual memory, the limit is set either when the program is compiled or when it begins executing. In UNIX and other systems with virtual memory, the limit is set by the system, and it is usually so large that it can be ignored by the programmer.
The third and final area doesn't actually store variables but can be used to store data pointed to by variables. Pointer variables that are assigned to the result of a call to the malloc() function contain the address of a dynamically allocated area of memory. This memory is in an area called the "heap." The heap is another area that starts out small and grows, but it grows only when the programmer explicitly calls malloc() or other memory allocation functions, such as calloc(). The heap can share a memory segment with either the data segment or the stack, or it can have its own segment. It all depends on the compiler options and operating system. The heap, like the stack, has a limit on how much it can grow, and the same rules apply as to how that limit is determined.
2. Do variables need to be initialized?
No. All variables should be given a value before they are used, and a good compiler will help you find variables that are used before they are set to a value. Variables need not be initialized, however. Variables defined outside a function or defined inside a function with the static keyword are already initialized to 0 for you if you do not explicitly initialize them.
Automatic variables are variables defined inside a function or block of code without the static keyword. These variables have undefined values if you don't explicitly initialize them. If you don't initialize an automatic variable, you must make sure you assign to it before using the value.
Space on the heap allocated by calling malloc() contains undefined data as well and must be set to a known value before being used. Space allocated by calling calloc() is set to 0 for you when it is allocated.
3. What is page thrashing?
Some operating systems (such as UNIX or Windows in enhanced mode) use virtual memory. Virtual memory is a technique for making a machine behave as if it had more memory than it really has, by using disk space to simulate RAM (random-access memory). In the 80386 and higher Intel CPU chips, and in most other modern microprocessors (such as the Motorola 68030, Sparc, and Power PC), exists a piece of hardware called the Memory Management Unit, or MMU.
The MMU treats memory as if it were composed of a series of "pages." A page of memory is a block of contiguous bytes of a certain size, usually 4096 or 8192 bytes. The operating system sets up and maintains a table for each running program called the Process Memory Map, or PMM. This is a table of all the pages of memory that program can access and where each is really located.
Every time your program accesses any portion of memory, the address (called a "virtual address") is processed by the MMU. The MMU looks in the PMM to find out where the memory is really located (called the "physical address"). The physical address can be any location in memory or on disk that the operating system has assigned for it. If the location the program wants to access is on disk, the page containing it must be read from disk into memory, and the PMM must be updated to reflect this action (this is called a "page fault"). Hope you're still with me, because here's the tricky part. Because accessing the disk is so much slower than accessing RAM, the operating system tries to keep as much of the virtual memory as possible in RAM. If you're running a large enough program (or several small programs at once), there might not be enough RAM to hold all the memory used by the programs, so some of it must be moved out of RAM and onto disk (this action is called "paging out").
The operating system tries to guess which areas of memory aren't likely to be used for a while (usually based on how the memory has been used in the past). If it guesses wrong, or if your programs are accessing lots of memory in lots of places, many page faults will occur in order to read in the pages that were paged out. Because all of RAM is being used, for each page read in to be accessed, another page must be paged out. This can lead to more page faults, because now a different page of memory has been moved to disk. The problem of many page faults occurring in a short time, called "page thrashing," can drastically cut the performance of a system.
Programs that frequently access many widely separated locations in memory are more likely to cause page thrashing on a system. So is running many small programs that all continue to run even when you are not actively using them. To reduce page thrashing, you can run fewer programs simultaneously. Or you can try changing the way a large program works to maximize the capability of the operating system to guess which pages won't be needed. You can achieve this effect by caching values or changing lookup algorithms in large data structures, or sometimes by changing to a memory allocation library which provides an implementation of malloc() that allocates memory more efficiently. Finally, you might consider adding more RAM to the system to reduce the need to page out.
4. What is a const pointer?
The access modifier keyword const is a promise the programmer makes to the compiler that the value of a variable will not be changed after it is initialized. The compiler will enforce that promise as best it can by not enabling the programmer to write code which modifies a variable that has been declared const.
A "const pointer," or more correctly, a "pointer to const," is a pointer which points to data that is const (constant, or unchanging). A pointer to const is declared by putting the word const at the beginning of the pointer declaration. This declares a pointer which points to data that can't be modified. The pointer itself can be modified. The following example illustrates some legal and illegal uses of a const pointer:
const char *str = "hello";
char c = *str /* legal */
str++; /* legal */
*str = 'a'; /* illegal */
str[1] = 'b'; /* illegal */
The first two statements here are legal because they do not modify the data that str points to. The next two statements are illegal because they modify the data pointed to by str.
Pointers to const are most often used in declaring function parameters. For instance, a function that counted the number of characters in a string would not need to change the contents of the string, and it might be written this way:
my_strlen(const char *str)
{
int count = 0;
while (*str++)
{
count++;
}
return count;
}
Note that non-const pointers are implicitly converted to const pointers when needed, but const pointers are not converted to non-const pointers. This means that my_strlen() could be called with either a const or a non-const character pointer.
5. When should the register modifier be used? Does it really help?
The register modifier hints to the compiler that the variable will be heavily used and should be kept in the CPU's registers, if possible, so that it can be accessed faster. There are several restrictions on the use of the register modifier.
First, the variable must be of a type that can be held in the CPU's register. This usually means a single value of a size less than or equal to the size of an integer. Some machines have registers that can hold floating-point numbers as well.
Second, because the variable might not be stored in memory, its address cannot be taken with the unary and operator. An attempt to do so is flagged as an error by the compiler. Some additional rules affect how useful the register modifier is. Because the number of registers is limited, and because some registers can hold only certain types of data (such as pointers or floating-point numbers), the number and types of register modifiers that will actually have any effect are dependent on what machine the program will run on. Any additional register modifiers are silently ignored by the compiler.
Also, in some cases, it might actually be slower to keep a variable in a register because that register then becomes unavailable for other purposes or because the variable isn't used enough to justify the overhead of loading and storing it.
So when should the register modifier be used? The answer is never, with most modern compilers. Early C compilers did not keep any variables in registers unless directed to do so, and the register modifier was a valuable addition to the language. C compiler design has advanced to the point, however, where the compiler will usually make better decisions than the programmer about which variables should be stored in registers. In fact, many compilers actually ignore the register modifier, which is perfectly legal, because it is only a hint and not a directive.
In the rare event that a program is too slow, and you know that the problem is due to a variable being stored in memory, you might try adding the register modifier as a last resort, but don't be surprised if this action doesn't change the speed of the program.
6. When should the volatile modifier be used?
The volatile modifier is a directive to the compiler's optimizer that operations involving this variable should not be optimized in certain ways. There are two special cases in which use of the volatile modifier is desirable. The first case involves memory-mapped hardware (a device such as a graphics adaptor that appears to the computer's hardware as if it were part of the computer's memory), and the second involves shared memory (memory used by two or more programs running simultaneously).
Most computers have a set of registers that can be accessed faster than the computer's main memory. A good compiler will perform a kind of optimization called "redundant load and store removal." The compiler looks for places in the code where it can either remove an instruction to load data from memory because the value is already in a register, or remove an instruction to store data to memory because the value can stay in a register until it is changed again anyway.
If a variable is a pointer to something other than normal memory, such as memory-mapped ports on a peripheral, redundant load and store optimizations might be detrimental. For instance, here's a piece of code that might be used to time some operation:
time_t time_addition(volatile const struct timer *t, int a)
{
int n;
int x;
time_t then;
x = 0;
then = t->value;
for (n = 0; n < 1000; n++)
{
x = x + a;
}
return t->value - then;
}
In this code, the variable t->value is actually a hardware counter that is being incremented as time passes. The function adds the value of a to x 1000 times, and it returns the amount the timer was incremented by while the 1000 additions were being performed.
Without the volatile modifier, a clever optimizer might assume that the value of t does not change during the execution of the function, because there is no statement that explicitly changes it. In that case, there's no need to read it from memory a second time and subtract it, because the answer will always be 0. The compiler might therefore "optimize" the function by making it always return 0.
If a variable points to data in shared memory, you also don't want the compiler to perform redundant load and store optimizations. Shared memory is normally used to enable two programs to communicate with each other by having one program store data in the shared portion of memory and the other program read the same portion of memory. If the compiler optimizes away a load or store of shared memory, communication between the two programs will be affected.
7. Can a variable be both const and volatile?
Yes. The const modifier means that this code cannot change the value of the variable, but that does not mean that the value cannot be changed by means outside this code. For instance, the timer structure was accessed through a volatile const pointer. The function itself did not change the value of the timer, so it was declared const. However, the value was changed by hardware on the computer, so it was declared volatile. If a variable is both const andvolatile, the two modifiers can appear in either order.
8. When should the const modifier be used?
There are several reasons to use const pointers. First, it allows the compiler to catch errors in which code accidentally changes the value of a variable, as in
while (*str = 0) /* programmer meant to write *str != 0 */
{
/* some code here */
str++;
}
in which the = sign is a typographical error. Without the const in the declaration of str, the program would compile but not run properly.
Another reason is efficiency. The compiler might be able to make certain optimizations to the code generated if it knows that a variable will not be changed.
Any function parameter which points to data that is not modified by the function or by any function it calls should declare the pointer a pointer to const. Function parameters that are passed by value (rather than through a pointer) can be declared const if neither the function nor any function it calls modifies the data.
In practice, however, such parameters are usually declared const only if it might be more efficient for the compiler to access the data through a pointer than by copying it.
9. How reliable are floating-point comparisons?
Floating-point numbers are the "black art" of computer programming. One reason why this is so is that there is no optimal way to represent an arbitrary number. The Institute of Electrical and Electronic Engineers (IEEE) has developed a standard for the representation of floating-point numbers, but you cannot guarantee that every machine you use will conform to the standard.
Even if your machine does conform to the standard, there are deeper issues. It can be shown mathematically that there are an infinite number of "real" numbers between any two numbers. For the computer to distinguish between two numbers, the bits that represent them must differ. To represent an infinite number of different bit patterns would take an infinite number of bits. Because the computer must represent a large range of numbers in a small number of bits (usually 32 to 64 bits), it has to make approximate representations of most numbers.
Because floating-point numbers are so tricky to deal with, it's generally bad practice to compare a floating- point number for equality with anything. Inequalities are much safer. If, for instance, you want to step through a range of numbers in small increments, you might write this:
#include <stdio.h>
const float first = 0.0;
const float last = 70.0;
const float small = 0.007;
main()
{
float f;
for (f = first; f != last && f < last + 1.0; f += small)
;
printf("f is now %g\n", f);
}
However, rounding errors and small differences in the representation of the variable small might cause f to never be equal to last (it might go from being just under it to being just over it). Thus, the loop would go past the value last. The inequality f < last + 1.0 has been added to prevent the program from running on for a very long time if this happens. If you run this program and the value printed for f is 71 or more, this is what has happened.
A safer way to write this loop is to use the inequality f < last to test for the loop ending, as in this example:
float f;
for (f = first; f < last; f += small)
;
You could even precompute the number of times the loop should be executed and use an integer to count iterations of the loop, as in this example:
float f;
int count = (last - first) / small;
for (f = first; count-- > 0; f += small)
10. How can you determine the maximum value that a numeric variable can hold?
The easiest way to find out how large or small a number that a particular type can hold is to use the values defined in the ANSI standard header file limits.h. This file contains many useful constants defining the values that can be held by various types, including these:
Value
Description
CHAR_BIT
-
Number of bits in a char
CHAR_MAX
-
Maximum decimal integer value of a char
CHAR_MIN
-
Minimum decimal integer value of a char
MB_LEN_MAX
-
Maximum number of bytes in a multibyte character
INT_MAX
-
Maximum decimal value of an int
INT_MIN
-
Minimum decimal value of an int
LONG_MAX
-
Maximum decimal value of a long
LONG_MIN
-
Minimum decimal value of a long
SCHAR_MAX
-
Maximum decimal integer value of a signed char
SCHAR_MIN
-
Minimum decimal integer value of a signed char
SHRT_MAX
-
Maximum decimal value of a short
SHRT_MIN
-
Minimum decimal value of a short
UCHAR_MAX
-
Maximum decimal integer value of unsigned char
UINT_MAX
-
Maximum decimal value of an unsigned integer
ULONG_MAX
-
Maximum decimal value of an unsigned long int
USHRT_MAX
-
Maximum decimal value of an unsigned short int
For integral types, on a machine that uses two's complement arithmetic (which is just about any machine you're likely to use), a signed type can hold numbers from -2(number of bits - 1) to +2(number of bits - 1) - 1.
An unsigned type can hold values from 0 to +2(number of bits)- 1. For instance, a 16-bit signed integer can hold numbers from -215(-32768) to +215 - 1 (32767).
11. Are there any problems with performing mathematical operations on different variable types?
C has three categories of built-in data types: pointer types, integral types, and floating-point types. Pointer types are the most restrictive in terms of the operations that can be performed on them. They are limited to
- subtraction of two pointers, valid only when both pointers point to elements in the same array. The result is the same as subtracting the integer subscripts corresponding to the two pointers.
+ addition of a pointer and an integral type. The result is a pointer that points to the element which would be selected by that integer.
Floating-point types consist of the built-in types float, double, and long double. Integral types consist of char, unsigned char, short, unsigned short, int, unsigned int, long, and unsigned long. All of these types can have the following arithmetic operations performed on them:
+ Addition
- Subtraction
* Multiplication
/ Division
Integral types also can have those four operations performed on them, as well as the following operations: % Modulo or remainder of division
<< Shift left
>> Shift right
& Bitwise AND operation
| Bitwise OR operation
^ Bitwise exclusive OR operation
! Logical negative operation
~ Bitwise "one's complement" operation
Although C permits "mixed mode" expressions (an arithmetic expression involving different types), it actually converts the types to be the same type before performing the operations (except for the case of pointer arithmetic described previously). The process of automatic type conversion is called "operator promotion."
12. What is operator promotion?
If an operation is specified with operands of two different types, they are converted to the smallest type that can hold both values. The result has the same type as the two operands wind up having. To interpret the rules, read the following table from the top down, and stop at the first rule that applies.
If Either Operand Is
And the Other Is
Change Them To
long double
-
any other type
-
long double
double
-
any smaller type
-
Double
float
-
any smaller type
-
Float
unsigned long
-
any integral type
-
unsigned long
long
-
unsigned > LONG_MAX
-
Long
long
-
any smaller type
-
Long
unsigned
-
any signed type
-
Unsigned
The following example code illustrates some cases of operator promotion. The variable f1 is set to 3/4. Because both 3 and 4 are integers, integer division is performed, and the result is the integer 0. The variable f2 is set to 3/4.0. Because 4.0 is a float, the number 3 is converted to a float as well, and the result is the float 0.75.
#include <stdio.h>
main()
{
float f1 = 3 / 4;
float f2 = 3 / 4.0;
printf("3 / 4 == %g or %g depending on the type used.\n", f1, f2);
}
13. When should a type cast be used?
There are two situations in which to use a type cast. The first use is to change the type of an operand to an arithmetic operation so that the operation will be performed properly. The variable f1 is set to the result of dividing the integer i by the integer j. The result is 0, because integer division is used. The variable f2 is set to the result of dividing i by j as well. However, the (float) type cast causes i to be converted to a float. That in turn causes floating-point division to be used and gives the result 0.75.
#include <stdio.h>
main()
{
int i = 3;
int j = 4;
float f1 = i / j;
float f2 = (float) i / j;
printf("3 / 4 == %g or %g depending on the type used.\n", f1, f2);
}
The second case is to cast pointer types to and from void * in order to interface with functions that expect or return void pointers. For example, the following line type casts the return value of the call to malloc() to be a pointer to a foo structure.
struct foo *p = (struct foo *) malloc(sizeof(struct foo));
14. When should a type cast not be used?
A type cast should not be used to override a const or volatile declaration. Overriding these type modifiers can cause the program to fail to run correctly.
A type cast should not be used to turn a pointer to one type of structure or data type into another. In the rare events in which this action is beneficial, using a union to hold the values makes the programmer's intentions clearer.
15. Is it acceptable to declare/define a variable in a C header?
A global variable that must be accessed from more than one file can and should be declared in a header file. In addition, such a variable must be defined in one source file. Variables should not be defined in header files, because the header file can be included in multiple source files, which would cause multiple definitions of the variable.
The ANSI C standard will allow multiple external definitions, provided that there is only one initialization. But because there's really no advantage to using this feature, it's probably best to avoid it and maintain a higher level of portability.
"Global" variables that do not have to be accessed from more than one file should be declared static and should not appear in a header file.
16. What is the difference between declaring a variable and defining a variable?
Declaring a variable means describing its type to the compiler but not allocating any space for it. Defining a variable means declaring it and also allocating space to hold the variable. You can also initialize a variable at the time it is defined. Here is a declaration of a variable and a structure, and two variable definitions, one with initialization:
extern int decl1; /* this is a declaration */
struct decl2
{
int member;
}; /* this just declares the type--no variable mentioned */
int def1 = 8; /* this is a definition */
int def2; /* this is a definition */
To put it another way, a declaration says to the compiler, "Somewhere in my program will be a variable with this name, and this is what type it is." A definition says, "Right here is this variable with this name and this type."
A variable can be declared many times, but it must be defined exactly once. For this reason, defi nitions do not belong in header files, where they might get #included into more than one place in your program.
17. Can static variables be declared in a header file?
You can't declare a static variable without defining it as well (this is because the storage class modifiers staticand extern are mutually exclusive). A static variable can be defined in a header file, but this would cause each source file that included the header file to have its own private copy of the variable, which is probably not what was intended.
18. What is the benefit of using const for declaring constants?
The benefit of using the const keyword is that the compiler might be able to make optimizations based on the knowledge that the value of the variable will not change. In addition, the compiler will try to ensure that the values won't be changed inadvertently.
Of course, the same benefits apply to #defined constants. The reason to use const rather than #define to define a constant is that a const variable can be of any type (such as a struct, which can't be represented by a#defined constant). Also, because a const variable is a real variable, it has an address that can be used, if needed, and it resides in only one place in memory.
Bit And Bytes
1. What is the most efficient way to store flag values?
A flag is a value used to make a decision between two or more options in the execution of a program. For instance, the /w flag on the MS-DOS dir command causes the command to display filenames in several columns across the screen instead of displaying them one per line. In which a flag is used to indicate which of two possible types is held in a union. Because a flag has a small number of values (often only two), it is tempting to save memory space by not storing each flag in its own int or char.
Efficiency in this case is a tradeoff between size and speed. The most memory-space efficient way to store a flag value is as single bits or groups of bits just large enough to hold all the possible values. This is because most computers cannot address individual bits in memory, so the bit or bits of interest must be extracted from the bytes that contain it.
The most time-efficient way to store flag values is to keep each in its own integer variable. Unfortunately, this method can waste up to 31 bits of a 32-bit variable, which can lead to very inefficient use of memory. If there are only a few flags, it doesn't matter how they are stored. If there are many flags, it might be advantageous to store them packed in an array of characters or integers. They must then be extracted by a process called bit masking, in which unwanted bits are removed from the ones of interest.
Sometimes it is possible to combine a flag with another value to save space. It might be possible to use high- order bits of integers that have values smaller than what an integer can hold. Another possibility is that some data is always a multiple of 2 or 4, so the low-order bits can be used to store a flag.
2. What is meant by "bit masking"?
Bit masking means selecting only certain bits from byte(s) that might have many bits set. To examine some bits of a byte, the byte is bitwise "ANDed" with a mask that is a number consisting of only those bits of interest. For instance, to look at the one's digit (rightmost digit) of the variable flags, you bitwise AND it with a mask of one (the bitwise AND operator in C is &):
flags & 1;
To set the bits of interest, the number is bitwise "ORed" with the bit mask (the bitwise OR operator in C is |). For instance, you could set the one's digit of flags like so:
flags = flags | 1;
Or, equivalently, you could set it like this:
flags |= 1;
To clear the bits of interest, the number is bitwise ANDed with the one's complement of the bit mask. The "one's complement" of a number is the number with all its one bits changed to zeros and all its zero bits changed to ones. The one's complement operator in C is ~. For instance, you could clear the one's digit of flags like so:
flags = flags & ~1;
Or, equivalently, you could clear it like this:
flags &= ~1;
Sometimes it is easier to use macros to manipulate flag values.
Example Program : Macros that make manipulating flags easier.
/* Bit Masking */
/* Bit masking can be used to switch a character
between lowercase and uppercase */
#define BIT_POS(N) ( 1U << (N) )
#define SET_FLAG(N, F) ( (N) |= (F) )
#define CLR_FLAG(N, F) ( (N) &= -(F) )
#define TST_FLAG(N, F) ( (N) & (F) )
#define BIT_RANGE(N, M) ( BIT_POS((M)+1 - (N))-1 << (N) )
#define BIT_SHIFTL(B, N) ( (unsigned)(B) << (N) )
#define BIT_SHIFTR(B, N) ( (unsigned)(B) >> (N) )
#define SET_MFLAG(N, F, V) ( CLR_FLAG(N, F), SET_FLAG(N, V) )
#define CLR_MFLAG(N, F) ( (N) &= ~(F) )
#define GET_MFLAG(N, F) ( (N) & (F) )
#include <stdio.h>
void main()
{
unsigned char ascii_char = 'A'; /* char = 8 bits only */
int test_nbr = 10;
printf("Starting character = %c\n", ascii_char);
/* The 5th bit position determines if the character is
uppercase or lowercase.
5th bit = 0 - Uppercase
5th bit = 1 - Lowercase */
printf("\nTurn 5th bit on = %c\n", SET_FLAG(ascii_char, BIT_POS(5)) );
printf("Turn 5th bit off = %c\n\n", CLR_FLAG(ascii_char, BIT_POS(5)) );
printf("Look at shifting bits\n");
printf("=====================\n");
printf("Current value = %d\n", test_nbr);
printf("Shifting one position left = %d\n",
test_nbr = BIT_SHIFTL(test_nbr, 1) );
printf("Shifting two positions right = %d\n",
BIT_SHIFTR(test_nbr, 2) );
}
BIT_POS(N) takes an integer N and returns a bit mask corresponding to that single bit position (BIT_POS(0) returns a bit mask for the one's digit, BIT_POS(1) returns a bit mask for the two's digit, and so on). So instead of writing
#define A_FLAG 4096
#define B_FLAG 8192
you can write
#define A_FLAG BIT_POS(12)
#define B_FLAG BIT_POS(13)
which is less prone to errors.
The SET_FLAG(N, F) macro sets the bit at position F of variable N. Its opposite is CLR_FLAG(N, F), which clears the bit at position F of variable N. Finally, TST_FLAG(N, F) can be used to test the value of the bit at position F of variable N, as in
if (TST_FLAG(flags, A_FLAG))
/* do something */;
The macro BIT_RANGE(N, M) produces a bit mask corresponding to bit positions N through M, inclusive. With this macro, instead of writing
#define FIRST_OCTAL_DIGIT 7 /* 111 */
#define SECOND_OCTAL_DIGIT 56 /* 111000 */
you can write
#define FIRST_OCTAL_DIGIT BIT_RANGE(0, 2) /* 111 */
#define SECOND_OCTAL_DIGIT BIT_RANGE(3, 5) /* 111000 */
which more clearly indicates which bits are meant.
The macro BIT_SHIFT(B, N) can be used to shift value B into the proper bit range (starting with bit N). For instance, if you had a flag called C that could take on one of five possible colors, the colors might be defined like this:
#define C_FLAG BIT_RANGE(8, 10) /* 11100000000 */
/* here are all the values the C flag can take on */
#define C_BLACK BIT_SHIFTL(0, 8) /* 00000000000 */
#define C_RED BIT_SHIFTL(1, 8) /* 00100000000 */
#define C_GREEN BIT_SHIFTL(2, 8) /* 01000000000 */
#define C_BLUE BIT_SHIFTL(3, 8) /* 01100000000 */
#define C_WHITE BIT_SHIFTL(4, 8) /* 10000000000 */
#define C_ZERO C_BLACK
#define C_LARGEST C_WHITE
/* A truly paranoid programmer might do this */
#if C_LARGEST > C_FLAG
Cause an error message. The flag C_FLAG is not
big enough to hold all its possible values.
#endif /* C_LARGEST > C_FLAG */
The macro SET_MFLAG(N, F, V) sets flag F in variable N to the value V. The macro CLR_MFLAG(N, F) is identical toCLR_FLAG(N, F), except the name is changed so that all the operations on multibit flags have a similar naming convention. The macro GET_MFLAG(N, F) gets the value of flag F in variable N, so it can be tested, as in
if (GET_MFLAG(flags, C_FLAG) == C_BLUE)
/* do something */;
3. Are bit fields portable?
Bit fields are not portable. Because bit fields cannot span machine words, and because the number of bits in a machine word is different on different machines, a particular program using bit fields might not even compile on a particular machine.
Assuming that your program does compile, the order in which bits are assigned to bit fields is not defined. Therefore, different compilers, or even different versions of the same compiler, could produce code that would not work properly on data generated by compiled older code. Stay away from using bit fields, except in cases in which the machine can directly address bits in memory and the compiler can generate code to take advantage of it and the increase in speed to be gained would be essential to the operation of the program.
4. Is it better to bitshift a value than to multiply by 2?
Any decent optimizing compiler will generate the same code no matter which way you write it. Use whichever form is more readable in the context in which it appears. The following program's assembler code can be viewed with a tool such as CODEVIEW on DOS/Windows or the disassembler (usually called "dis") on UNIX machines:
Example: Multiplying by 2 and shifting left by 1 are often the same.
void main()
{
unsigned int test_nbr = 300;
test_nbr *= 2;
test_nbr = 300;
test_nbr <<= 1;
}
5. What is meant by high-order and low-order bytes?
We generally write numbers from left to right, with the most significant digit first. To understand what is meant by the "significance" of a digit, think of how much happier you would be if the first digit of your paycheck was increased by one compared to the last digit being increased by one.
The bits in a byte of computer memory can be considered digits of a number written in base 2. That means the least significant bit represents one, the next bit represents 2´1, or 2, the next bit represents 2´2´1, or 4, and so on. If you consider two bytes of memory as representing a single 16-bit number, one byte will hold the least significant 8 bits, and the other will hold the most significant 8 bits. Figure shows the bits arranged into two bytes. The byte holding the least significant 8 bits is called the least significant byte, or low-order byte. The byte containing the most significant 8 bits is the most significant byte, or high- order byte.
6. How are 16- and 32-bit numbers stored?
A 16-bit number takes two bytes of storage, a most significant byte and a least significant byte. If you write the 16-bit number on paper, you would start with the most significant byte and end with the least significant byte. There is no convention for which order to store them in memory, however.
Let's call the most significant byte M and the least significant byte L. There are two possible ways to store these bytes in memory. You could store M first, followed by L, or L first, followed by M. Storing byte M first in memory is called "forward" or "big-endian" byte ordering. The term big endian comes from the fact that the "big end" of the number comes first, and it is also a reference to the book Gulliver's Travels, in which the term refers to people who eat their boiled eggs with the big end on top.
Storing byte L first is called "reverse" or "little-endian" byte ordering. Most machines store data in a big- endian format. Intel CPUs store data in a little-endian format, however, which can be confusing when someone is trying to connect an Intel microprocessor-based machine to anything else.
A 32-bit number takes four bytes of storage. Let's call them Mm, Ml, Lm, and Ll in decreasing order of significance. There are 4! (4 factorial, or 24) different ways in which these bytes can be ordered. Over the years, computer designers have used just about all 24 ways. The most popular two ways in use today, however, are (Mm, Ml, Lm, Ll), which is big-endian, and (Ll, Lm, Ml, Mm), which is little-endian. As with 16-bit numbers, most machines store 32-bit numbers in a big-endian format, but Intel machines store 32-bit numbers in a little-endian format.
Functions
1. When should I declare a function?
Functions that are used only in the current source file should be declared as static, and the function's declaration should appear in the current source file along with the definition of the function. Functions used outside of the current source file should have their declarations put in a header file, which can be included in whatever source file is going to use that function. For instance, if a function named stat_func() is used only in the source file stat.c, it should be declared as shown here:
/* stat.c */
#include <stdio.h>
static int stat_func(int, int); /* static declaration of stat_func() */
void main(void);
void main(void)
{
...
rc = stat_func(1, 2);
...
}
/* definition (body) of stat_func() */
static int stat_func(int arg1, int arg2)
{
...
return rc;
}
In this example, the function named stat_func() is never used outside of the source file stat.c. There is therefore no reason for the prototype (or declaration) of the function to be visible outside of the stat.c source file. Thus, to avoid any confusion with other functions that might have the same name, the declaration of stat_func() should be put in the same source file as the declaration of stat_func().
In the following example, the function glob_func() is declared and used in the source file global.c and is used in the source file extern.c. Because glob_func() is used outside of the source file in which it's declared, the declaration of glob_func() should be put in a header file (in this example, named proto.h) to be included in both the global.c and the extern.c source files. This is how it's done:
/* proto.h */
int glob_func(int, int); /* declaration of the glob_func() function */
/* global.c */
#include <stdio.h>
#include "proto.h"
/* include this proto.h file for the declaration of glob_func() */
void main(void);
void main(void)
{
...
rc = glob_func(1, 2);
...
}
/* definition (body) of the glob_func() function */
int glob_func(int arg1, int arg2)
{
...
return rc;
}
/* extern.c */
#include <stdio.h>
#include "proto.h"
/* include this proto.h file for the declaration of glob_func() */
void ext_func(void);
void ext_func(void)
{
...
/* call glob_func(), which is defined in the global.c source file */
rc = glob_func(10, 20);
...
}
In the preceding example, the declaration of glob_func() is put in the header file named proto.h becauseglob_func() is used in both the global.c and the extern.c source files. Now, whenever glob_func() is going to be used, you simply need to include the proto.h header file, and you will automatically have the function's declaration. This will help your compiler when it is checking parameters and return values from global functions you are using in your programs. Notice that your function declarations should always appear before the first function declaration in your source file.
In general, if you think your function might be of some use outside of the current source file, you should put its declaration in a header file so that other modules can access it. Otherwise, if you are sure your function will never be used outside of the current source file, you should declare the function as static and include the declaration only in the current source file.
2. Why should I prototype a function?
A function prototype tells the compiler what kind of arguments a function is looking to receive and what kind of return value a function is going to give back. This approach helps the compiler ensure that calls to a function are made correctly and that no erroneous type conversions are taking place. For instance, consider the following prototype:
int some_func(int, char*, long);
Looking at this prototype, the compiler can check all references (including the definition of some_func()) to ensure that three parameters are used (an integer, a character pointer, and then a long integer) and that a return value of type integer is received. If the compiler finds differences between the prototype and calls to the function or the definition of the function, an error or a warning can be generated to avoid errors in your source code. For instance, the following examples would be flagged as incorrect, given the preceding
prototype of some_func():
x = some_func(1); /* not enough arguments passed */
x = some_func("HELLO!", 1, "DUDE!"); /* wrong type of arguments used */
x = some_func(1, str, 2879, "T"); /* too many arguments passed */
/* In the following example, the return value expected
from some_func() is not an integer: */
long* lValue;
lValue = some_func(1, str, 2879); /* some_func() returns an int,
not a long* */
Using prototypes, the compiler can also ensure that the function definition, or body, is correct and correlates with the prototype. For instance, the following definition of some_func() is not the same as its prototype, and it therefore would be flagged by the compiler:
int some_func(char* string, long lValue, int iValue) /* wrong order of
parameters */
{
...
}
The bottom line on prototypes is that you should always include them in your source code because they provide a good error-checking mechanism to ensure that your functions are being used correctly. Besides, many of today's popular compilers give you warnings when compiling if they can't find a prototype for a function that is being referenced.
3. How many parameters should a function have?
There is no set number or "guideline" limit to the number of parameters your functions can have. However, it is considered bad programming style for your functions to contain an inordinately high (eight or more) number of parameters. The number of parameters a function has also directly affects the speed at which it is called—the more parameters, the slower the function call. Therefore, if possible, you should minimize the number of parameters you use in a function. If you are using more than four parameters, you might want to rethink your function design and calling conventions.
One technique that can be helpful if you find yourself with a large number of function parameters is to put your function parameters in a structure. Consider the following program, which contains a function namedprint_report() that uses 10 parameters. Instead of making an enormous function declaration and proto- type, theprint_report() function uses a structure to get its parameters:
#include <stdio.h>
typedef struct
{
int orientation;
char rpt_name[25];
char rpt_path[40];
int destination;
char output_file[25];
int starting_page;
int ending_page;
char db_name[25];
char db_path[40];
int draft_quality;
} RPT_PARMS;
void main(void);
int print_report(RPT_PARMS*);
void main(void)
{
RPT_PARMS rpt_parm; /* define the report parameter
structure variable */
...
/* set up the report parameter structure variable to pass to the
print_report() function */
rpt_parm.orientation = ORIENT_LANDSCAPE;
rpt_parm.rpt_name = "QSALES.RPT";
rpt_parm.rpt_path = "C:\REPORTS";
rpt_parm.destination = DEST_FILE;
rpt_parm.output_file = "QSALES.TXT";
rpt_parm.starting_page = 1;
rpt_parm.ending_page = RPT_END;
rpt_parm.db_name = "SALES.DB";
rpt_parm.db_path = "C:\DATA";
rpt_parm.draft_quality = TRUE;
/* Call the print_report() function, passing it a pointer to the
parameters instead of passing it a long list of 10 separate
parameters. */
ret_code = print_report(&rpt_parm);
...
}
int print_report(RPT_PARMS* p)
{
int rc;
...
/* access the report parameters passed to the print_report()
function */
orient_printer(p->orientation);
set_printer_quality((p->draft_quality == TRUE) ? DRAFT : NORMAL);
...
return rc;
}
The preceding example avoided a large, messy function prototype and definition by setting up a predefined structure of type RPT_PARMS to hold the 10 parameters that were needed by the print_report() function. The only possible disadvantage to this approach is that by removing the parameters from the function definition, you are bypassing the compiler's capability to type-check each of the parameters for validity during the compile stage.
Generally, you should keep your functions small and focused, with as few parameters as possible to help with execution speed. If you find yourself writing lengthy functions with many parameters, maybe you should rethink your function design or consider using the structure-passing technique presented here. Additionally, keeping your functions small and focused will help when you are trying to isolate and fix bugs in your programs.
4. What is a static function?
A static function is a function whose scope is limited to the current source file. Scope refers to the visibility of a function or variable. If the function or variable is visible outside of the current source file, it is said to have global, or external, scope. If the function or variable is not visible outside of the current source file, it is said to havelocal, or static, scope.
A static function therefore can be seen and used only by other functions within the current source file. When you have a function that you know will not be used outside of the current source file or if you have a function that you do not want being used outside of the current source file, you should declare it as static. Declaring local functions as static is considered good programming practice. You should use static functions often to avoid possible conflicts with external functions that might have the same name.
For instance, consider the following example program, which contains two functions. The first function,open_customer_table(), is a global function that can be called by any module. The second function,open_customer_indexes(), is a local function that will never be called by another module. This is because you can't have the customer's index files open without first having the customer table open. Here is the code:
#include <stdio.h>
int open_customer_table(void); /* global function, callable from
any module */
static int open_customer_indexes(void); /* local function, used only in
this module */
int open_customer_table(void)
{
int ret_code;
/* open the customer table */
...
if (ret_code == OK)
{
ret_code = open_customer_indexes();
}
return ret_code;
}
static int open_customer_indexes(void)
{
int ret_code;
/* open the index files used for this table */
...
return ret_code;
}
Generally, if the function you are writing will not be used outside of the current source file, you should declare it as static.
5. Should a function contain a return statement if it does not return a value?
In C, void functions (those that do not return a value to the calling function) are not required to include a return statement. Therefore, it is not necessary to include a return statement in your functions declared as being void.
In some cases, your function might trigger some critical error, and an immediate exit from the function might be necessary. In this case, it is perfectly acceptable to use a return statement to bypass the rest of the function's code. However, keep in mind that it is not considered good programming practice to litter your functions with return statements-generally, you should keep your function's exit point as focused and clean as possible.
6. How can you pass an array to a function by value?
An array can be passed to a function by value by declaring in the called function the array name with square brackets ([ and ]) attached to the end. When calling the function, simply pass the address of the array (that is, the array's name) to the called function. For instance, the following program passes the array x[] to the function namedbyval_func() by value:
#include <stdio.h>
void byval_func(int[]); /* the byval_func() function is passed an
integer array by value */
void main(void);
void main(void)
{
int x[10];
int y;
/* Set up the integer array. */
for (y=0; y<10; y++)
x[y] = y;
/* Call byval_func(), passing the x array by value. */
byval_func(x);
}
/* The byval_function receives an integer array by value. */
void byval_func(int i[])
{
int y;
/* Print the contents of the integer array. */
for (y=0; y<10; y++)
printf("%d\n", i[y]);
}
In this example program, an integer array named x is defined and initialized with 10 values. The functionbyval_func() is declared as follows:
int byval_func(int[]);
The int[] parameter tells the compiler that the byval_func() function will take one argument—an array of integers. When the byval_func() function is called, you pass the address of the array to byval_func():byval_func(x);
Because the array is being passed by value, an exact copy of the array is made and placed on the stack. The called function then receives this copy of the array and can print it. Because the array passed to byval_func() is a copy of the original array, modifying the array within the byval_func() function has no effect on the original array.
Passing arrays of any kind to functions can be very costly in several ways. First, this approach is very inefficient because an entire copy of the array must be made and placed on the stack. This takes up valuable program time, and your program execution time is degraded. Second, because a copy of the array is made, more memory (stack) space is required. Third, copying the array requires more code generated by the compiler, so your program is larger.
Instead of passing arrays to functions by value, you should consider passing arrays to functions by reference: this means including a pointer to the original array. When you use this method, no copy of the array is made. Your programs are therefore smaller and more efficient, and they take up less stack space. To pass an array by reference, you simply declare in the called function prototype a pointer to the data type you are holding in the array.
Consider the following program, which passes the same array (x) to a function:
#include <stdio.h>
void const_func(const int*);
void main(void);
void main(void)
{
int x[10];
int y;
/* Set up the integer array. */
for (y=0; y<10; y++)
x[y] = y;
/* Call const_func(), passing the x array by reference. */
const_func(x);
}
/* The const_function receives an integer array by reference.
Notice that the pointer is declared as const, which renders
it unmodifiable by the const_func() function. */
void const_func(const int* i)
{
int y;
/* Print the contents of the integer array. */
for (y=0; y<10; y++)
printf("%d\n", *(i+y));
}
In the preceding example program, an integer array named x is defined and initialized with 10 values. The function const_func() is declared as follows:
int const_func(const int*);
The const int* parameter tells the compiler that the const_func() function will take one argument—a constant pointer to an integer. When the const_func() function is called, you pass the address of the array toconst_func():
const_func(x);
Because the array is being passed by reference, no copy of the array is made and placed on the stack. The called function receives simply a constant pointer to an integer. The called function must be coded to be smart enough to know that what it is really receiving is a constant pointer to an array of integers. The const modifier is used to prevent the const_func() from accidentally modifying any elements of the original array.
The only possible drawback to this alternative method of passing arrays is that the called function must be coded correctly to access the array—it is not readily apparent by the const_func() function prototype or definition that it is being passed a reference to an array of integers. You will find, however, that this method is much quicker and more efficient, and it is recommended when speed is of utmost importance.
7. Is it possible to execute code even after the program exits the main() function?
The standard C library provides a function named atexit() that can be used to perform "cleanup" operations when your program terminates. You can set up a set of functions you want to perform automatically when your program exits by passing function pointers to the atexit() function. Here's an example of a program that uses the atexit()function:
#include <stdio.h>
#include <stdlib.h>
void close_files(void);
void print_registration_message(void);
int main(int, char**);
int main(int argc, char** argv)
{
...
atexit(print_registration_message);
atexit(close_files);
while (rec_count < max_records)
{
process_one_record();
}
exit(0);
}
This example program uses the atexit() function to signify that the close_files() function and theprint_registration_message() function need to be called automatically when the program exits. When themain() function ends, these two functions will be called to close the files and print the registration message. There are two things that should be noted regarding the atexit() function. First, the functions you specify to execute at program termination must be declared as void functions that take no parameters. Second, the functions you designate with the atexit() function are stacked in the order in which they are called with atexit(), and therefore they are executed in a last-in, first-out (LIFO) method. Keep this information in mind when using the atexit()function. In the preceding example, the atexit() function is stacked as shown here:
atexit(print_registration_message);
atexit(close_files);
Because the LIFO method is used, the close_files() function will be called first, and then theprint_registration_message() function will be called.
The atexit() function can come in handy when you want to ensure that certain functions (such as closing your program's data files) are performed before your program terminates.
8. What does a function declared as PASCAL do differently?
A C function declared as PASCAL uses a different calling convention than a "regular" C function. Normally, C function parameters are passed right to left; with the PASCAL calling convention, the parameters are passed left to right.
Consider the following function, which is declared normally in a C program:
int regular_func(int, char*, long);
Using the standard C calling convention, the parameters are pushed on the stack from right to left. This means that when the regular_func() function is called in C, the stack will contain the following parameters:
long
char*
int
The function calling regular_func() is responsible for restoring the stack when regular_func() returns.
When the PASCAL calling convention is being used, the parameters are pushed on the stack from left to right.
Consider the following function, which is declared as using the PASCAL calling convention:
int PASCAL pascal_func(int, char*, long);
When the function pascal_func() is called in C, the stack will contain the following parameters:
int
char*
long
The function being called is responsible for restoring the stack pointer. Why does this matter? Is there any benefit to using PASCAL functions?
Functions that use the PASCAL calling convention are more efficient than regular C functions—the function calls tend to be slightly faster. Microsoft Windows is an example of an operating environment that uses the PASCAL calling convention. The Windows SDK (Software Development Kit) contains hundreds of functions declared as PASCAL.
When Windows was first designed and written in the late 1980s, using the PASCAL modifier tended to make a noticeable difference in program execution speed. In today's world of fast machinery, the PASCAL modifier is much less of a catalyst when it comes to the speed of your programs. In fact, Microsoft has abandoned the PASCAL calling convention style for the Windows NT operating system.
In your world of programming, if milliseconds make a big difference in your programs, you might want to use the PASCAL modifier when declaring your functions. Most of the time, however, the difference in speed is hardly noticeable, and you would do just fine to use C's regular calling convention.
9. Is using exit() the same as using return?
No. The exit() function is used to exit your program and return control to the operating system. The return statement is used to return from a function and return control to the calling function. If you issue a return from themain() function, you are essentially returning control to the calling function, which is the operating system. In this case, the return statement and exit() function are similar. Here is an example of a program that uses the exit()function and return statement:
#include <stdio.h>
#include <stdlib.h>
int main(int, char**);
int do_processing(void);
int do_something_daring();
int main(int argc, char** argv)
{
int ret_code;
if (argc < 3)
{
printf("Wrong number of arguments used!\n");
/* return 1 to the operating system */
exit(1);
}
ret_code = do_processing();
...
/* return 0 to the operating system */
exit(0);
}
int do_processing(void)
{
int rc;
rc = do_something_daring();
if (rc == ERROR)
{
printf("Something fishy is going on around here..."\n);
/* return rc to the operating system */
exit(rc);
}
/* return 0 to the calling function */
return 0;
}
In the main() function, the program is exited if the argument count (argc) is less than 3. The statement exit(1);tells the program to exit and return the number 1 to the operating system. The operating system can then decide what to do based on the return value of the program. For instance, many DOS batch files check the environment variable named ERRORLEVEL for the return value of executable programs.
Strings
1. What is the difference between a string copy (strcpy) and a memory copy (memcpy)? When should each be used?
The strcpy() function is designed to work exclusively with strings. It copies each byte of the source string to the destination string and stops when the terminating null character (\0) has been moved. On the other hand, thememcpy() function is designed to work with any type of data.
Because not all data ends with a null character, you must provide the memcpy() function with the number of bytes you want to copy from the source to the destination. The following program shows examples of both the strcpy()and the memcpy() functions:
#include <stdio.h>
#include <string.h>
typedef struct cust_str {
int id;
char last_name[20];
char first_name[15];
} CUSTREC;
void main(void);
void main(void)
{
char* src_string = "This is the source string";
char dest_string[50];
CUSTREC src_cust;
CUSTREC dest_cust;
printf("Hello! I'm going to copy src_string into dest_string!\n");
/* Copy src_string into dest_string. Notice that the destination
string is the first argument. Notice also that the strcpy()
function returns a pointer to the destination string. */
printf("Done! dest_string is: %s\n",
strcpy(dest_string, src_string));
printf("Encore! Let's copy one CUSTREC to another.\n");
printf("I'll copy src_cust into dest_cust.\n");
/* First, initialize the src_cust data members. */
src_cust.id = 1;
strcpy(src_cust.last_name, "Strahan");
strcpy(src_cust.first_name, "Troy");
/* Now, use the memcpy() function to copy the src_cust structure to
the dest_cust structure. Notice that, just as with strcpy(), the
destination comes first. */
memcpy(&dest_cust, &src_cust, sizeof(CUSTREC));
printf("Done! I just copied customer number #%d (%s %s).",
dest_cust.id, dest_cust.first_name, dest_cust.last_name);
}
When dealing with strings, you generally should use the strcpy() function, because it is easier to use with strings. When dealing with abstract data other than strings (such as structures), you should use the memcpy() function.
2. How can I remove the trailing spaces from a string?
The C language does not provide a standard function that removes trailing spaces from a string. It is easy, however, to build your own function to do just this. The following program uses a custom function named rtrim() to remove the trailing spaces from a string. It carries out this action by iterating through the string backward, starting at the character before the terminating null character (\0) and ending when it finds the first nonspace character. When the program finds a nonspace character, it sets the next character in the string to the terminating null character (\0), thereby effectively eliminating all the trailing blanks. Here is how this task is performed:
#include <stdio.h>
#include <string.h>
void main(void);
char* rtrim(char*);
void main(void)
{
char* trail_str = "This string has trailing spaces in it. ";
/* Show the status of the string before calling the rtrim()
function. */
printf("Before calling rtrim(), trail_str is '%s'\n", trail_str);
printf("and has a length of %d.\n", strlen(trail_str));
/* Call the rtrim() function to remove the trailing blanks. */
rtrim(trail_str);
/* Show the status of the string
after calling the rtrim() function. */
printf("After calling rtrim(), trail_str is '%s'\n", trail_str);
printf("and has a length of %d.\n", strlen(trail_str));
}
/* The rtrim() function removes trailing spaces from a string. */
char* rtrim(char* str)
{
int n = strlen(str) - 1; /* Start at the character BEFORE
the null character (\0). */
while (n>0) /* Make sure we don't go out of bounds... */
{
if (*(str+n) != ' ') /* If we find a nonspace character: */
{
*(str+n+1) = '\0'; /* Put the null character at one
character past our current
position. */
break; /* Break out of the loop. */
}
else /* Otherwise, keep moving backward in the string. */
n--;
}
return str; /* Return a pointer to the string. */
}
Notice that the rtrim() function works because in C, strings are terminated by the null character. With the insertion of a null character after the last nonspace character, the string is considered terminated at that point, and all characters beyond the null character are ignored.
3. How can I remove the leading spaces from a string?
The C language does not provide a standard function that removes leading spaces from a string. It is easy, however, to build your own function to do just this. you can easily construct a custom function that uses the rtrim()function in conjunction with the standard C library function strrev() to remove the leading spaces from a string. Look at how this task is performed:
#include <stdio.h>
#include <string.h>
void main(void);
char* ltrim(char*);
char* rtrim(char*);
void main(void)
{
char* lead_str = " This string has leading spaces in it.";
/* Show the status of the string before calling the ltrim()
function. */
printf("Before calling ltrim(), lead_str is '%s'\n", lead_str);
printf("and has a length of %d.\n", strlen(lead_str));
/* Call the ltrim() function to remove the leading blanks. */
ltrim(lead_str);
/* Show the status of the string
after calling the ltrim() function. */
printf("After calling ltrim(), lead_str is '%s'\n", lead_str);
printf("and has a length of %d.\n", strlen(lead_str));
}
/* The ltrim() function removes leading spaces from a string. */
char* ltrim(char* str)
{
strrev(str); /* Call strrev() to reverse the string. */
rtrim(str); /* Call rtrim() to remove the "trailing" spaces. */
strrev(str); /* Restore the string's original order. */
return str; /* Return a pointer to the string. */
}
/* The rtrim() function removes trailing spaces from a string. */
char* rtrim(char* str)
{
int n = strlen(str) - 1; /* Start at the character BEFORE
the null character (\0). */
while (n>0) /* Make sure we don't go out of bounds... */
{
if (*(str+n) != ' ') /* If we find a nonspace character: */
{
*(str+n+1) = '\0'; /* Put the null character at one
character past our current
position. */
break; /* Break out of the loop. */
}
else /* Otherwise, keep moving backward in the string. */
n--;
}
return str; /* Return a pointer to the string. */
}
Notice that the ltrim() function performs the following tasks: First, it calls the standard C library functionstrrev(), which reverses the string that is passed to it. This action puts the original string in reverse order, thereby creating "trailing spaces" rather than leading spaces. Now, the rtrim() function is used to remove the "trailing spaces" from the string. After this task is done, the strrev() function is called again to "reverse" the string, thereby putting it back in its original order.
4. How can I right-justify a string?
Even though the C language does not provide a standard function that right-justifies a string, you can easily build your own function to perform this action. Using the rtrim() function, you can create your own function to take a string and right-justify it. Here is how this task is accomplished:
#include <stdio.h>
#include <string.h>
#include <malloc.h>
void main(void);
char* rjust(char*);
char* rtrim(char*);
void main(void)
{
char* rjust_str = "This string is not right-justified. ";
/* Show the status of the string before calling the rjust()
function. */
printf("Before calling rjust(), rjust_str is '%s'\n.", rjust_str);
/* Call the rjust() function to right-justify this string. */
rjust(rjust_str);
/* Show the status of the string
after calling the rjust() function. */
printf("After calling rjust(), rjust_str is '%s'\n.", rjust_str);
}
/* The rjust() function right-justifies a string. */
char* rjust(char* str)
{
int n = strlen(str); /* Save the original length of the string. */
char* dup_str;
dup_str = strdup(str); /* Make an exact duplicate of the string. */
rtrim(dup_str); /* Trim off the trailing spaces. */
/* Call sprintf() to do a virtual "printf" back into the original
string. By passing sprintf() the length of the original string,
we force the output to be the same size as the original, and by
default the sprintf() right-justifies the output. The sprintf()
function fills the beginning of the string with spaces to make
it the same size as the original string. */
sprintf(str, "%*.*s", n, n, dup_str);
free(dup_str); /* Free the memory taken by
the duplicated string. */
return str; /* Return a pointer to the string. */
}
/* The rtrim() function removes trailing spaces from a string. */
char* rtrim(char* str)
{
int n = strlen(str) - 1; /* Start at the character BEFORE the null
character (\0). */
while (n>0) /* Make sure we don't go out of bounds... */
{
if (*(str+n) != ' ') /* If we find a nonspace character: */
{
*(str+n+1) = '\0'; /* Put the null character at one
character past our current
position. */
break; /* Break out of the loop. */
}
else /* Otherwise, keep moving backward in the string. */
n--;
}
return str; /* Return a pointer to the string. */
}
The rjust() function first saves the length of the original string in a variable named n. This step is needed because the output string must be the same length as the input string. Next, the rjust() function calls the standard C library function named strdup() to create a duplicate of the original string. A duplicate of the string is required because the original version of the string is going to be overwritten with a right-justified version. After the duplicate string is created, a call to the rtrim() function is invoked (using the duplicate string, not the original), which eliminates all trailing spaces from the duplicate string.
Next, the standard C library function sprintf() is called to rewrite the new string to its original place in memory. The sprintf() function is passed the original length of the string (stored in n), thereby forcing the output string to be the same length as the original. Because sprintf() by default right-justifies string output, the output string is filled with leading spaces to make it the same size as the original string. This has the effect of right-justifying the input string. Finally, because the strdup() function dynamically allocates memory, the free() function is called to free up the memory taken by the duplicate string.
5. How can I pad a string to a known length?
Padding strings to a fixed length can be handy when you are printing fixed-length data such as tables or spreadsheets. You can easily perform this task using the printf() function. The following example program shows how to accomplish this task:
#include <stdio.h>
char *data[25] = {
"REGION", "--Q1--", "--Q2--", "--Q3--", " --Q4--",
"North", "10090.50", "12200.10", "26653.12", "62634.32",
"South", "21662.37", "95843.23", "23788.23", "48279.28",
"East", "23889.38", "23789.05", "89432.84", "29874.48",
"West", "85933.82", "74373.23", "78457.23", "28799.84" };
void main(void);
void main(void)
{
int x;
for (x=0; x<25; x++)
{
if ((x % 5) == 0 && (x != 0))
printf("\n");
printf("%-10.10s", data[x]);
}
}
In this example, a character array (char* data[]) is filled with this year's sales data for four regions. Of course, you would want to print this data in an orderly fashion, not just print one figure after the other with no formatting. This being the case, the following statement is used to print the data:
printf("%-10.10s", data[x]);
The "%-10.10s" argument tells the printf() function that you are printing a string and you want to force it to be 10 characters long. By default, the string is right-justified, but by including the minus sign (-) before the first 10, you tell the printf() function to left-justify your string. This action forces the printf() function to pad the string with spaces to make it 10 characters long. The result is a clean, formatted spreadsheet-like
output:
REGION --Q1-- --Q2-- --Q3-- --Q4--
North 10090.50 12200.10 26653.12 62634.32
South 21662.37 95843.23 23788.23 48279.28
East 23889.38 23789.05 89432.84 29874.48
West 85933.82 74373.23 78457.23 28799.84
6. How can I copy just a portion of a string?
You can use the standard C library function strncpy() to copy one portion of a string into another string. Thestrncpy() function takes three arguments: the first argument is the destination string, the second argument is the source string, and the third argument is an integer representing the number of characters you want to copy from the source string to the destination string. For example, consider the following program, which uses the strncpy()function to copy portions of one string to another:
#include <stdio.h>
#include <string.h>
void main(void);
void main(void)
{
char* source_str = "THIS IS THE SOURCE STRING";
char dest_str1[40] = {0}, dest_str2[40] = {0};
/* Use strncpy() to copy only the first 11 characters. */
strncpy(dest_str1, source_str, 11);
printf("How about that! dest_str1 is now: '%s'!!!\n", dest_str1);
/* Now, use strncpy() to copy only the last 13 characters. */
strncpy(dest_str2, source_str + (strlen(source_str) - 13), 13);
printf("Whoa! dest_str2 is now: '%s'!!!\n", dest_str2);
}
The first call to strncpy() in this example program copies the first 11 characters of the source string intodest_str1. This example is fairly straightforward, one you might use often. The second call is a bit more complicated and deserves some explanation. In the second argument to the strncpy() function call, the total length of thesource_str string is calculated (using the strlen() function). Then, 13 (the number of characters you want to print) is subtracted from the total length of source_str. This gives the number of remaining characters in source_str. This number is then added to the address of source_str to give a pointer to an address in the source string that is 13 characters from the end of source_str.
Then, for the last argument, the number 13 is specified to denote that 13 characters are to be copied out of the string. The combination of these three arguments in the second call to strncpy() sets dest_str2 equal to the last 13 characters of source_str.
The example program prints the following output:
How about that! dest_str1 is now: 'THIS IS THE'!!!
Whoa! dest_str2 is now: 'SOURCE STRING'!!!
Notice that before source_str was copied to dest_str1 and dest_st2, dest_str1 and dest_str2 had to be initialized to null characters (\0). This is because the strncpy() function does not automatically append a null character to the string you are copying to. Therefore, you must ensure that you have put the null character after the string you have copied, or else you might wind up with garbage being printed.
7. How can I convert a number to a string?
The standard C library provides several functions for converting numbers of all formats (integers, longs, floats, and so on) to strings and vice versa. One of these functions, itoa(), is used here to illustrate how an integer is converted to a string:
#include <stdio.h>
#include <stdlib.h>
void main(void);
void main(void)
{
int num = 100;
char str[25];
itoa(num, str, 10);
printf("The number 'num' is %d and the string 'str' is %s.\n",
num, str);
}
Notice that the itoa() function takes three arguments: the first argument is the number you want to convert to the string, the second is the destination string to put the converted number into, and the third is the base, or radix, to be used when converting the number. The preceding example uses the common base 10 to convert the number to the string.
The following functions can be used to convert integers to strings:
Function Name
Purpose
itoa()
-
Converts an integer value to a string.
ltoa()
-
Converts a long integer value to a string.
ultoa()
-
Converts an unsigned long integer value to a string.
Note that the itoa(), ltoa(), and ultoa() functions are not ANSI compatible. An alternative way to convert an integer to a string (that is ANSI compatible) is to use the sprintf() function, as in the following example:
#include <stdio.h>
#include <stdlib.h>
void main(void);
void main(void)
{
int num = 100;
char str[25];
sprintf(str, "%d", num);
printf("The number 'num' is %d and the string 'str' is %s.\n",
num, str);
}
When floating-point numbers are being converted, a different set of functions must be used. Here is an example of a program that uses the standard C library function fcvt() to convert a floating-point value to a string:
#include <stdio.h>
#include <stdlib.h>
void main(void);
void main(void)
{
double num = 12345.678;
char* str;
int dec_pl, sign, ndigits = 3; /* Keep 3 digits of precision. */
str = fcvt(num, ndigits, &dec_pl, &sign); /* Convert the float
to a string. */
printf("Original number: %f\n", num); /* Print the original
floating-point
value. */
printf("Converted string: %s\n", str); /* Print the converted
string's value */
printf("Decimal place: %d\n", dec_pl); /* Print the location of
the decimal point. */
printf("Sign: %d\n", sign); /* Print the sign.
0 = positive,
1 = negative. */
}
Notice that the fcvt() function is quite different from the itoa() function used previously. The fcvt() function takes four arguments. The first argument is the floating-point value you want to convert. The second argument is the number of digits to be stored to the right of the decimal point. The third argument is a pointer to an integer that is used to return the position of the decimal point in the converted string. The fourth argument is a pointer to an integer that is used to return the sign of the converted number (0 is positive, 1 is negative).
Note that the converted string does not contain the actual decimal point. Instead, the fcvt() returns the position of the decimal point as it would have been if it were in the string. In the preceding example, the dec_pl integer variable contains the number 5 because the decimal point is located after the fifth digit in the resulting string. If you wanted the resulting string to include the decimal point, you could use the gcvt() function (described in the following table).
The following functions can be used to convert floating-point values to strings:
Function
Purpose
ecvt()
-
Converts a double-precision floating-point value to a string without an embedded decimal point.
fcvt()
-
Same as ecvt(), but forces the precision to a specified number of digits.
gcvt()
-
Converts a double-precision floating-point value to a string with an embedded decimal point.
8. How can I convert a string to a number?
The standard C library provides several functions for converting strings to numbers of all formats (integers, longs, floats, and so on) and vice versa. One of these functions, atoi(), is used here to illustrate how a string is converted to an integer:
#include <stdio.h>
#include <stdlib.h>
void main(void);
{
int num;
char* str = "100";
num = atoi(str);
printf("The string 'str' is %s and the number 'num' is %d.\n",
str, num);
}
To use the atoi() function, you simply pass it the string containing the number you want to convert. The return value from the atoi() function is the converted integer value.
The following functions can be used to convert strings to numbers:
Function Name
Purpose
atof()
-
Converts a string to a double-precision floating-point value.
atoi()
-
Converts a string to an integer.
atol()
-
Converts a string to a long integer.
strtod()
-
Converts a string to a double-precision floating-point value and reports any "leftover" numbers that could not be converted.
strtol()
-
Converts a string to a long integer and reports any "leftover" numbers that could not be converted.
strtoul()
-
Converts a string to an unsigned long integer and reports any "leftover" numbers that could not be converted.
Sometimes, you might want to trap overflow errors that can occur when converting a string to a number that results in an overflow condition. The following program shows an example of the strtoul() function, which traps this overflow condition:
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
void main(void);
void main(void)
{
char* str = "1234567891011121314151617181920";
unsigned long num;
char* leftover;
num = strtoul(str, &leftover, 10);
printf("Original string: %s\n", str);
printf("Converted number: %lu\n", num);
printf("Leftover characters: %s\n", leftover);
}
In this example, the string to be converted is much too large to fit into an unsigned long integer variable. Thestrtoul() function therefore returns ULONG_MAX (4294967295) and sets the char* leftover to point to the character in the string that caused it to overflow. It also sets the global variable errno to ERANGE to notify the caller of the function that an overflow condition has occurred. The strtod() and strtol() functions work exactly the same way as the strtoul() function shown above. Refer to your C compiler documentation for more information regarding the syntax of these functions.
9. How can you tell whether two strings are the same?
The standard C library provides several functions to compare two strings to see whether they are the same. One of these functions, strcmp(), is used here to show how this task is accomplished:
#include <stdio.h>
#include <string.h>
void main(void);
void main(void)
{
char* str_1 = "abc";
char* str_2 = "abc";
char* str_3 = "ABC";
if (strcmp(str_1, str_2) == 0)
printf("str_1 is equal to str_2.\n");
else
printf("str_1 is not equal to str_2.\n");
if (strcmp(str_1, str_3) == 0)
printf("str_1 is equal to str_3.\n");
else
printf("str_1 is not equal to str_3.\n");
}
This program produces the following output:
str_1 is equal to str_2.
str_1 is not equal to str_3.
Notice that the strcmp() function is passed two arguments that correspond to the two strings you want to compare. It performs a case-sensitive lexicographic comparison of the two strings and returns one of the following values:
Return Value
Meaning
<0
-
The first string is less than the second string.
0
-
The two strings are equal.
>0
-
The first string is greater than the second string.
In the preceding example code, strcmp() returns 0 when comparing str_1 (which is "abc") and str_2 (which is "abc"). However, when comparing str_1 (which is "abc") with str_3 (which is "ABC"), strcmp() returns a value greater than 0, because the string "ABC" is greater than (in ASCII order) the string "abc".
Many variations of the strcmp() function perform the same basic function (comparing two strings), but with slight differences. The following table lists some of the functions available that are similar to strcmp():
Function Name
Description
strcmp()
-
Case-sensitive comparison of two strings
strcmpi()
-
Case-insensitive comparison of two strings
stricmp()
-
Same as strcmpi()
strncmp()
-
Case-sensitive comparison of a portion of two strings
strnicmp()
-
Case-insensitive comparison of a portion of two strings
Looking at the example provided previously, if you were to replace the call to strcmp() with a call to strcmpi() (a case-insensitive version of strcmp()), the two strings "abc" and "ABC" would be reported as being equal.
10. How do you print only part of a string?
The following program shows how to print only part of a string using the printf() function:
#include <stdio.h>
#include <string.h>
void main(void);
void main(void)
{
char* source_str = "THIS IS THE SOURCE STRING";
/* Use printf() to print the first 11 characters of source_str. */
printf("First 11 characters: '%11.11s'\n", source_str);
/* Use printf() to print only the
last 13 characters of source_str. */
printf("Last 13 characters: '%13.13s'\n",
source_str + (strlen(source_str) - 13));
}
This example program produces the following output:
First 11 characters: 'THIS IS THE'
Last 13 characters: 'SOURCE STRING'
The first call to printf() uses the argument "%11.11s" to force the printf() function to make the output exactly 11 characters long. Because the source string is longer than 11 characters, it is truncated, and only the first 11 characters are printed. The second call to printf() is a bit more tricky. The total length of the source_str string is calculated (using the strlen() function). Then, 13 (the number of characters you want to print) is subtracted from the total length of source_str.
This gives the number of remaining characters in source_str. This number is then added to the address of source_str to give a pointer to an address in the source string that is 13 characters from the end of source_str. By using the argument "%13.13s", the program forces the output to be exactly 13 characters long, and thus the last 13 characters of the string are printed.
1. What is a local block?
A local block is any portion of a C program that is enclosed by the left brace ({) and the right brace (}). A C function contains left and right braces, and therefore anything between the two braces is contained in a local block. An if statement or a switch statement can also contain braces, so the portion of code between these two braces would be considered a local block.
Additionally, you might want to create your own local block without the aid of a C function or keyword construct. This is perfectly legal. Variables can be declared within local blocks, but they must be declared only at the beginning of a local block. Variables declared in this manner are visible only within the local block. Duplicate variable names declared within a local block take precedence over variables with the same name declared outside the local block. Here is an example of a program that uses local blocks:
#include <stdio.h>
void main(void);
void main()
{
/* Begin local block for function main() */
int test_var = 10;
printf("Test variable before the if statement: %d\n", test_var);
if (test_var > 5)
{
/* Begin local block for "if" statement */
int test_var = 5;
printf("Test variable within the if statement: %d\n",
test_var);
{
/* Begin independent local block (not tied to
any function or keyword) */
int test_var = 0;
printf(
"Test variable within the independent local block:%d\n",
test_var);
}
/* End independent local block */
}
/* End local block for "if" statement */
printf("Test variable after the if statement: %d\n", test_var);
}
/* End local block for function main() */
This example program produces the following output:
Test variable before the if statement: 10
Test variable within the if statement: 5
Test variable within the independent local block: 0
Test variable after the if statement: 10
Notice that as each test_var was defined, it took precedence over the previously defined test_var. Also notice that when the if statement local block had ended, the program had reentered the scope of the original test_var, and its value was 10.
2. Should variables be stored in local blocks?
The use of local blocks for storing variables is unusual and therefore should be avoided, with only rare exceptions. One of these exceptions would be for debugging purposes, when you might want to declare a local instance of a global variable to test within your function. You also might want to use a local block hen you want to make your program more readable in the current context.
Sometimes having the variable declared closer to where it is used makes your program more readable. However, well-written programs usually do not have to resort to declaring variables in this manner, and you should avoid using local blocks.
3. When is a switch statement better than multiple if statements?
A switch statement is generally best to use when you have more than two conditional expressions based on a single variable of numeric type. For instance, rather than the code
if (x == 1)
printf("x is equal to one.\n");
else if (x == 2)
printf("x is equal to two.\n");
else if (x == 3)
printf("x is equal to three.\n");
else
printf("x is not equal to one, two, or three.\n");
the following code is easier to read and maintain:
switch (x)
{
case 1: printf("x is equal to one.\n");
break;
case 2: printf("x is equal to two.\n");
break;
case 3: printf("x is equal to three.\n");
break;
default: printf("x is not equal to one, two, or three.\n");
break;
}
Notice that for this method to work, the conditional expression must be based on a variable of numeric type in order to use the switch statement. Also, the conditional expression must be based on a single variable. For instance, even though the following if statement contains more than two conditions, it is not a candidate for using a switchstatement because it is based on string comparisons and not numeric comparisons:
char* name = "Lupto";
if (!stricmp(name, "Isaac"))
printf("Your name means 'Laughter'.\n");
else if (!stricmp(name, "Amy"))
printf("Your name means 'Beloved'.\n ");
else if (!stricmp(name, "Lloyd"))
printf("Your name means 'Mysterious'.\n ");
else
printf("I haven't a clue as to what your name means.\n");
4. Is a default case necessary in a switch statement?
No, but it is not a bad idea to put default statements in switch statements for error- or logic-checking purposes. For instance, the following switch statement is perfectly normal:
switch (char_code)
{
case 'Y':
case 'y': printf("You answered YES!\n");
break;
case 'N':
case 'n': printf("You answered NO!\n");
break;
}
Consider, however, what would happen if an unknown character code were passed to this switch statement. The program would not print anything. It would be a good idea, therefore, to insert a default case where this condition would be taken care of:
...
default: printf("Unknown response: %d\n", char_code);
break;
...
Additionally, default cases come in handy for logic checking. For instance, if your switch statement handled a fixed number of conditions and you considered any value outside those conditions to be a logic error, you could insert adefault case which would flag that condition. Consider the following example:
void move_cursor(int direction)
{
switch (direction)
{
case UP: cursor_up();
break;
case DOWN: cursor_down();
break;
case LEFT: cursor_left();
break;
case RIGHT: cursor_right();
break;
default: printf("Logic error on line number %ld!!!\n",
__LINE__);
break;
}
}
5. Can the last case of a switch statement skip including the break?
Even though the last case of a switch statement does not require a break statement at the end, you should addbreak statements to all cases of the switch statement, including the last case. You should do so primarily because your program has a strong chance of being maintained by someone other than you who might add cases but neglect to notice that the last case has no break statement.
This oversight would cause what would formerly be the last case statement to "fall through" to the new statements added to the bottom of the switch statement. Putting a break after each case statement would prevent this possible mishap and make your program more "bulletproof." Besides, most of today's optimizing compilers will optimize out the last break, so there will be no performance degradation if you add it.
6. Other than in a for statement, when is the comma operator used?
The comma operator is commonly used to separate variable declarations, function arguments, and expressions, as well as the elements of a for statement. Look closely at the following program, which shows some of the many ways a comma can be used:
#include <stdio.h>
#include <stdlib.h>
void main(void);
void main()
{
/* Here, the comma operator is used to separate
three variable declarations. */
int i, j, k;
/* Notice how you can use the comma operator to perform
multiple initializations on the same line. */
i = 0, j = 1, k = 2;
printf("i = %d, j = %d, k = %d\n", i, j, k);
/* Here, the comma operator is used to execute three expressions
in one line: assign k to i, increment j, and increment k.
The value that i receives is always the rightmost expression. */
i = (j++, k++);
printf("i = %d, j = %d, k = %d\n", i, j, k);
/* Here, the while statement uses the comma operator to
assign the value of i as well as test it. */
while (i = (rand() % 100), i != 50)
printf("i is %d, trying again...\n", i);
printf("\nGuess what? i is 50!\n");
}
Notice the line that reads
i = (j++, k++);
This line actually performs three actions at once. These are the three actions, in order:
1. Assigns the value of k to i. This happens because the left value (lvalue) always evaluates to the rightmost argument. In this case, it evaluates to k. Notice that it does not evaluate to k++, because k++ is a postfix incremental expression, and k is not incremented until the assignment of k to i is made. If the expression had read++k, the value of ++k would be assigned to i because it is a prefix incremental expression, and it is incremented before the assignment is made.
2. Increments j.
3. Increments k.
Also, notice the strange-looking while statement:
while (i = (rand() % 100), i != 50)
printf("i is %d, trying again...\n");
Here, the comma operator separates two expressions, each of which is evaluated for each iteration of the whilestatement. The first expression, to the left of the comma, assigns i to a random number from 0 to 99.
The second expression, which is more commonly found in a while statement, is a conditional expression that tests to see whether i is not equal to 50. For each iteration of the while statement, i is assigned a new random number, and the value of i is checked to see that it is not 50. Eventually, i is randomly assigned the value 50, and thewhile statement terminates.
7. How can you tell whether a loop ended prematurely?
Generally, loops are dependent on one or more variables. Your program can check those variables outside the loop to ensure that the loop executed properly. For instance, consider the following example:
#define REQUESTED_BLOCKS 512
int x;
char* cp[REQUESTED_BLOCKS];
/* Attempt (in vain, I must add...) to
allocate 512 10KB blocks in memory. */
for (x=0; x< REQUESTED_BLOCKS; x++)
{
cp[x] = (char*) malloc(10000, 1);
if (cp[x] == (char*) NULL)
break;
}
/* If x is less than REQUESTED_BLOCKS,
the loop has ended prematurely. */
if (x < REQUESTED_BLOCKS)
printf("Bummer! My loop ended prematurely!\n");
Notice that for the loop to execute successfully, it would have had to iterate through 512 times. Immediately following the loop, this condition is tested to see whether the loop ended prematurely. If the variable x is anything less than 512, some error has occurred.
8. What is the difference between goto and long jmp( ) and setjmp()?
A goto statement implements a local jump of program execution, and the longjmp() and setjmp() functions implement a nonlocal, or far, jump of program execution. Generally, a jump in execution of any kind should be avoided because it is not considered good programming practice to use such statements as goto and longjmp in your program.
A goto statement simply bypasses code in your program and jumps to a predefined position. To use the gotostatement, you give it a labeled position to jump to. This predefined position must be within the same function. You cannot implement gotos between functions. Here is an example of a goto statement:
void bad_programmers_function(void)
{
int x;
printf("Excuse me while I count to 5000...\n");
x = 1;
while (1)
{
printf("%d\n", x);
if (x == 5000)
goto all_done;
else
x = x + 1;
}
all_done:
printf("Whew! That wasn't so bad, was it?\n");
}
This example could have been written much better, avoiding the use of a goto statement. Here is an example of an improved implementation:
void better_function(void)
{
int x;
printf("Excuse me while I count to 5000...\n");
for (x=1; x<=5000; x++)
printf("%d\n", x);
printf("Whew! That wasn't so bad, was it?\n");
}
As previously mentioned, the longjmp() and setjmp() functions implement a nonlocal goto. When your program calls setjmp(), the current state of your program is saved in a structure of type jmp_buf. Later, your program can call the longjmp() function to restore the program's state as it was when you called setjmp(). Unlike the gotostatement, the longjmp() and setjmp() functions do not need to be implemented in the same function.
However, there is a major drawback to using these functions: your program, when restored to its previously saved state, will lose its references to any dynamically allocated memory between the longjmp() and the setjmp(). This means you will waste memory for every malloc() or calloc() you have implemented between your longjmp() andsetjmp(), and your program will be horribly inefficient. It is highly recommended that you avoid using functions such as longjmp() and setjmp() because they, like the goto statement, are quite often an indication of poor programming practice.
Here is an example of the longjmp() and setjmp() functions:
#include <stdio.h>
#include <setjmp.h>
jmp_buf saved_state;
void main(void);
void call_longjmp(void);
void main(void)
{
int ret_code;
printf("The current state of the program is being saved...\n");
ret_code = setjmp(saved_state);
if (ret_code == 1)
{
printf("The longjmp function has been called.\n");
printf("The program's previous state has been restored.\n");
exit(0);
}
printf("I am about to call longjmp and\n");
printf("return to the previous program state...\n");
call_longjmp();
}
void call_longjmp(void)
{
longjmp(saved_state, 1);
}
9. What is an lvalue?
An lvalue is an expression to which a value can be assigned. The lvalue expression is located on the left side of an assignment statement, whereas an rvalue is located on the right side of an assignment statement. Each assignment statement must have an lvalue and an rvalue. The lvalue expression must reference a storable variable in memory. It cannot be a constant. For instance, the following lines show a few examples of lvalues:
int x;
int* p_int;
x = 1;
*p_int = 5;
The variable x is an integer, which is a storable location in memory. Therefore, the statement x = 1 qualifies x to be an lvalue. Notice the second assignment statement, *p_int = 5. By using the * modifier to reference the area of memory that p_int points to, *p_int is qualified as an lvalue. In contrast, here are a few examples of what would not be considered lvalues:
#define CONST_VAL 10
int x;
/* example 1 */
1 = x;
/* example 2 */
CONST_VAL = 5;
In both statements, the left side of the statement evaluates to a constant value that cannot be changed because constants do not represent storable locations in memory. Therefore, these two assignment statements do not contain lvalues and will be flagged by your compiler as errors.
10. Can an array be an lvalue?
Is an array an expression to which we can assign a value? The answer to this question is no, because an array is composed of several separate array elements that cannot be treated as a whole for assignment purposes. The following statement is therefore illegal:
int x[5], y[5];
x = y;
You could, however, use a for loop to iterate through each element of the array and assign values individually, such as in this example:
int i;
int x[5];
int y[5];
...
for (i=0; i<5; i++)
x[i] = y[i]
...
Additionally, you might want to copy the whole array all at once. You can do so using a library function such as thememcpy() function, which is shown here:
memcpy(x, y, sizeof(y));
It should be noted here that unlike arrays, structures can be treated as lvalues. Thus, you can assign one structure variable to another structure variable of the same type, such as this:
typedef struct t_name
{
char last_name[25];
char first_name[15];
char middle_init[2];
} NAME;
...
NAME my_name, your_name;
...
your_name = my_name;
...
In the preceding example, the entire contents of the my_name structure were copied into the your_name structure. This is essentially the same as the following line:
memcpy(your_name, my_name, sizeof(your_name));
11. What is an rvalue?
rvalue can be defined as an expression that can be assigned to an lvalue. The rvalue appears on the right side of an assignment statement.
Unlike an lvalue, an rvalue can be a constant or an expression, as shown here:
int x, y;
x = 1; /* 1 is an rvalue; x is an lvalue */
y = (x + 1); /* (x + 1) is an rvalue; y is an lvalue */
An assignment statement must have both an lvalue and an rvalue. Therefore, the following statement would not compile because it is missing an rvalue:
int x; x = void_function_call() /* the function void_function_call() returns nothing */
If the function had returned an integer, it would be considered an rvalue because it evaluates into something that the lvalue, x, can store.
12. Is left-to-right or right-to-left order guaranteed for operator precedence?
The simple answer to this question is neither. The C language does not always evaluate left-to-right or right-to-left. Generally, function calls are evaluated first, followed by complex expressions and then simple expressions.
Additionally, most of today's popular C compilers often rearrange the order in which the expression is evaluated in order to get better optimized code. You therefore should always implicitly define your operator precedence by using parentheses.
For example, consider the following expression:
a = b + c/d / function_call() * 5
The way this expression is to be evaluated is totally ambiguous, and you probably will not get the results you want. Instead, try writing it by using implicit operator precedence:
a = b + (((c/d) / function_call()) * 5)
Using this method, you can be assured that your expression will be evaluated properly and that the compiler will not rearrange operators for optimization purposes.
13. What is the difference between ++var and var++?
The ++ operator is called the increment operator. When the operator is placed before the variable (++var), the variable is incremented by 1 before it is used in the expression. When the operator is placed after the variable (var++), the expression is evaluated, and then the variable is incremented by 1.
The same holds true for the decrement operator (--). When the operator is placed before the variable, you are said to have a prefix operation. When the operator is placed after the variable, you are said to have a postfix operation.
For instance, consider the following example of postfix incrementation:
int x, y;
x = 1;
y = (x++ * 5);
In this example, postfix incrementation is used, and x is not incremented until after the evaluation of the expression is done. Therefore, y evaluates to 1 times 5, or 5. After the evaluation, x is incremented to 2.
Now look at an example using prefix incrementation:
int x, y;
x = 1;
y = (++x * 5);
This example is the same as the first one, except that this example uses prefix incrementation rather than postfix. Therefore, x is incremented before the expression is evaluated, making it 2. Hence, y evaluates to 2 times 5, or 10.
14. What does the modulus operator do?
The modulus operator (%) gives the remainder of two divided numbers. For instance, consider the following portion of code:
x = 15/7
If x were an integer, the resulting value of x would be 2. However, consider what would happen if you were to apply the modulus operator to the same equation:
x = 15%7
The result of this expression would be the remainder of 15 divided by 7, or 1. This is to say that 15 divided by 7 is 2 with a remainder of 1.
The modulus operator is commonly used to determine whether one number is evenly divisible into another. For instance, if you wanted to print every third letter of the alphabet, you would use the following code:
int x;
for (x=1; x<=26; x++)
if ((x%3) == 0)
printf("%c", x+64);
The preceding example would output the string "cfilorux", which represents every third letter in the alphabet.
Variable And Data Storage
1. Where in memory are my variables stored?
Variables can be stored in several places in memory, depending on their lifetime. Variables that are defined outside any function (whether of global or file static scope), and variables that are defined inside a function as static variables, exist for the lifetime of the program's execution. These variables are stored in the "data segment." The data segment is a fixed-size area in memory set aside for these variables. The data segment is subdivided into two parts, one for initialized variables and another for uninitialized variables.
Variables that are defined inside a function as auto variables (that are not defined with the keyword static) come into existence when the program begins executing the block of code (delimited by curly braces {}) containing them, and they cease to exist when the program leaves that block of code.
Variables that are the arguments to functions exist only during the call to that function. These variables are stored on the "stack". The stack is an area of memory that starts out small and grows automatically up to some predefined limit. In DOS and other systems without virtual memory, the limit is set either when the program is compiled or when it begins executing. In UNIX and other systems with virtual memory, the limit is set by the system, and it is usually so large that it can be ignored by the programmer.
The third and final area doesn't actually store variables but can be used to store data pointed to by variables. Pointer variables that are assigned to the result of a call to the malloc() function contain the address of a dynamically allocated area of memory. This memory is in an area called the "heap." The heap is another area that starts out small and grows, but it grows only when the programmer explicitly calls malloc() or other memory allocation functions, such as calloc(). The heap can share a memory segment with either the data segment or the stack, or it can have its own segment. It all depends on the compiler options and operating system. The heap, like the stack, has a limit on how much it can grow, and the same rules apply as to how that limit is determined.
2. Do variables need to be initialized?
No. All variables should be given a value before they are used, and a good compiler will help you find variables that are used before they are set to a value. Variables need not be initialized, however. Variables defined outside a function or defined inside a function with the static keyword are already initialized to 0 for you if you do not explicitly initialize them.
Automatic variables are variables defined inside a function or block of code without the static keyword. These variables have undefined values if you don't explicitly initialize them. If you don't initialize an automatic variable, you must make sure you assign to it before using the value.
Space on the heap allocated by calling malloc() contains undefined data as well and must be set to a known value before being used. Space allocated by calling calloc() is set to 0 for you when it is allocated.
3. What is page thrashing?
Some operating systems (such as UNIX or Windows in enhanced mode) use virtual memory. Virtual memory is a technique for making a machine behave as if it had more memory than it really has, by using disk space to simulate RAM (random-access memory). In the 80386 and higher Intel CPU chips, and in most other modern microprocessors (such as the Motorola 68030, Sparc, and Power PC), exists a piece of hardware called the Memory Management Unit, or MMU.
The MMU treats memory as if it were composed of a series of "pages." A page of memory is a block of contiguous bytes of a certain size, usually 4096 or 8192 bytes. The operating system sets up and maintains a table for each running program called the Process Memory Map, or PMM. This is a table of all the pages of memory that program can access and where each is really located.
Every time your program accesses any portion of memory, the address (called a "virtual address") is processed by the MMU. The MMU looks in the PMM to find out where the memory is really located (called the "physical address"). The physical address can be any location in memory or on disk that the operating system has assigned for it. If the location the program wants to access is on disk, the page containing it must be read from disk into memory, and the PMM must be updated to reflect this action (this is called a "page fault"). Hope you're still with me, because here's the tricky part. Because accessing the disk is so much slower than accessing RAM, the operating system tries to keep as much of the virtual memory as possible in RAM. If you're running a large enough program (or several small programs at once), there might not be enough RAM to hold all the memory used by the programs, so some of it must be moved out of RAM and onto disk (this action is called "paging out").
The operating system tries to guess which areas of memory aren't likely to be used for a while (usually based on how the memory has been used in the past). If it guesses wrong, or if your programs are accessing lots of memory in lots of places, many page faults will occur in order to read in the pages that were paged out. Because all of RAM is being used, for each page read in to be accessed, another page must be paged out. This can lead to more page faults, because now a different page of memory has been moved to disk. The problem of many page faults occurring in a short time, called "page thrashing," can drastically cut the performance of a system.
Programs that frequently access many widely separated locations in memory are more likely to cause page thrashing on a system. So is running many small programs that all continue to run even when you are not actively using them. To reduce page thrashing, you can run fewer programs simultaneously. Or you can try changing the way a large program works to maximize the capability of the operating system to guess which pages won't be needed. You can achieve this effect by caching values or changing lookup algorithms in large data structures, or sometimes by changing to a memory allocation library which provides an implementation of malloc() that allocates memory more efficiently. Finally, you might consider adding more RAM to the system to reduce the need to page out.
4. What is a const pointer?
The access modifier keyword const is a promise the programmer makes to the compiler that the value of a variable will not be changed after it is initialized. The compiler will enforce that promise as best it can by not enabling the programmer to write code which modifies a variable that has been declared const.
A "const pointer," or more correctly, a "pointer to const," is a pointer which points to data that is const (constant, or unchanging). A pointer to const is declared by putting the word const at the beginning of the pointer declaration. This declares a pointer which points to data that can't be modified. The pointer itself can be modified. The following example illustrates some legal and illegal uses of a const pointer:
const char *str = "hello";
char c = *str /* legal */
str++; /* legal */
*str = 'a'; /* illegal */
str[1] = 'b'; /* illegal */
The first two statements here are legal because they do not modify the data that str points to. The next two statements are illegal because they modify the data pointed to by str.
Pointers to const are most often used in declaring function parameters. For instance, a function that counted the number of characters in a string would not need to change the contents of the string, and it might be written this way:
my_strlen(const char *str)
{
int count = 0;
while (*str++)
{
count++;
}
return count;
}
Note that non-const pointers are implicitly converted to const pointers when needed, but const pointers are not converted to non-const pointers. This means that my_strlen() could be called with either a const or a non-const character pointer.
5. When should the register modifier be used? Does it really help?
The register modifier hints to the compiler that the variable will be heavily used and should be kept in the CPU's registers, if possible, so that it can be accessed faster. There are several restrictions on the use of the register modifier.
First, the variable must be of a type that can be held in the CPU's register. This usually means a single value of a size less than or equal to the size of an integer. Some machines have registers that can hold floating-point numbers as well.
Second, because the variable might not be stored in memory, its address cannot be taken with the unary and operator. An attempt to do so is flagged as an error by the compiler. Some additional rules affect how useful the register modifier is. Because the number of registers is limited, and because some registers can hold only certain types of data (such as pointers or floating-point numbers), the number and types of register modifiers that will actually have any effect are dependent on what machine the program will run on. Any additional register modifiers are silently ignored by the compiler.
Also, in some cases, it might actually be slower to keep a variable in a register because that register then becomes unavailable for other purposes or because the variable isn't used enough to justify the overhead of loading and storing it.
So when should the register modifier be used? The answer is never, with most modern compilers. Early C compilers did not keep any variables in registers unless directed to do so, and the register modifier was a valuable addition to the language. C compiler design has advanced to the point, however, where the compiler will usually make better decisions than the programmer about which variables should be stored in registers. In fact, many compilers actually ignore the register modifier, which is perfectly legal, because it is only a hint and not a directive.
In the rare event that a program is too slow, and you know that the problem is due to a variable being stored in memory, you might try adding the register modifier as a last resort, but don't be surprised if this action doesn't change the speed of the program.
6. When should the volatile modifier be used?
The volatile modifier is a directive to the compiler's optimizer that operations involving this variable should not be optimized in certain ways. There are two special cases in which use of the volatile modifier is desirable. The first case involves memory-mapped hardware (a device such as a graphics adaptor that appears to the computer's hardware as if it were part of the computer's memory), and the second involves shared memory (memory used by two or more programs running simultaneously).
Most computers have a set of registers that can be accessed faster than the computer's main memory. A good compiler will perform a kind of optimization called "redundant load and store removal." The compiler looks for places in the code where it can either remove an instruction to load data from memory because the value is already in a register, or remove an instruction to store data to memory because the value can stay in a register until it is changed again anyway.
If a variable is a pointer to something other than normal memory, such as memory-mapped ports on a peripheral, redundant load and store optimizations might be detrimental. For instance, here's a piece of code that might be used to time some operation:
time_t time_addition(volatile const struct timer *t, int a)
{
int n;
int x;
time_t then;
x = 0;
then = t->value;
for (n = 0; n < 1000; n++)
{
x = x + a;
}
return t->value - then;
}
In this code, the variable t->value is actually a hardware counter that is being incremented as time passes. The function adds the value of a to x 1000 times, and it returns the amount the timer was incremented by while the 1000 additions were being performed.
Without the volatile modifier, a clever optimizer might assume that the value of t does not change during the execution of the function, because there is no statement that explicitly changes it. In that case, there's no need to read it from memory a second time and subtract it, because the answer will always be 0. The compiler might therefore "optimize" the function by making it always return 0.
If a variable points to data in shared memory, you also don't want the compiler to perform redundant load and store optimizations. Shared memory is normally used to enable two programs to communicate with each other by having one program store data in the shared portion of memory and the other program read the same portion of memory. If the compiler optimizes away a load or store of shared memory, communication between the two programs will be affected.
7. Can a variable be both const and volatile?
Yes. The const modifier means that this code cannot change the value of the variable, but that does not mean that the value cannot be changed by means outside this code. For instance, the timer structure was accessed through a volatile const pointer. The function itself did not change the value of the timer, so it was declared const. However, the value was changed by hardware on the computer, so it was declared volatile. If a variable is both const andvolatile, the two modifiers can appear in either order.
8. When should the const modifier be used?
There are several reasons to use const pointers. First, it allows the compiler to catch errors in which code accidentally changes the value of a variable, as in
while (*str = 0) /* programmer meant to write *str != 0 */
{
/* some code here */
str++;
}
in which the = sign is a typographical error. Without the const in the declaration of str, the program would compile but not run properly.
Another reason is efficiency. The compiler might be able to make certain optimizations to the code generated if it knows that a variable will not be changed.
Any function parameter which points to data that is not modified by the function or by any function it calls should declare the pointer a pointer to const. Function parameters that are passed by value (rather than through a pointer) can be declared const if neither the function nor any function it calls modifies the data.
In practice, however, such parameters are usually declared const only if it might be more efficient for the compiler to access the data through a pointer than by copying it.
9. How reliable are floating-point comparisons?
Floating-point numbers are the "black art" of computer programming. One reason why this is so is that there is no optimal way to represent an arbitrary number. The Institute of Electrical and Electronic Engineers (IEEE) has developed a standard for the representation of floating-point numbers, but you cannot guarantee that every machine you use will conform to the standard.
Even if your machine does conform to the standard, there are deeper issues. It can be shown mathematically that there are an infinite number of "real" numbers between any two numbers. For the computer to distinguish between two numbers, the bits that represent them must differ. To represent an infinite number of different bit patterns would take an infinite number of bits. Because the computer must represent a large range of numbers in a small number of bits (usually 32 to 64 bits), it has to make approximate representations of most numbers.
Because floating-point numbers are so tricky to deal with, it's generally bad practice to compare a floating- point number for equality with anything. Inequalities are much safer. If, for instance, you want to step through a range of numbers in small increments, you might write this:
#include <stdio.h>
const float first = 0.0;
const float last = 70.0;
const float small = 0.007;
main()
{
float f;
for (f = first; f != last && f < last + 1.0; f += small)
;
printf("f is now %g\n", f);
}
However, rounding errors and small differences in the representation of the variable small might cause f to never be equal to last (it might go from being just under it to being just over it). Thus, the loop would go past the value last. The inequality f < last + 1.0 has been added to prevent the program from running on for a very long time if this happens. If you run this program and the value printed for f is 71 or more, this is what has happened.
A safer way to write this loop is to use the inequality f < last to test for the loop ending, as in this example:
float f;
for (f = first; f < last; f += small)
;
You could even precompute the number of times the loop should be executed and use an integer to count iterations of the loop, as in this example:
float f;
int count = (last - first) / small;
for (f = first; count-- > 0; f += small)
10. How can you determine the maximum value that a numeric variable can hold?
The easiest way to find out how large or small a number that a particular type can hold is to use the values defined in the ANSI standard header file limits.h. This file contains many useful constants defining the values that can be held by various types, including these:
Value
Description
CHAR_BIT
-
Number of bits in a char
CHAR_MAX
-
Maximum decimal integer value of a char
CHAR_MIN
-
Minimum decimal integer value of a char
MB_LEN_MAX
-
Maximum number of bytes in a multibyte character
INT_MAX
-
Maximum decimal value of an int
INT_MIN
-
Minimum decimal value of an int
LONG_MAX
-
Maximum decimal value of a long
LONG_MIN
-
Minimum decimal value of a long
SCHAR_MAX
-
Maximum decimal integer value of a signed char
SCHAR_MIN
-
Minimum decimal integer value of a signed char
SHRT_MAX
-
Maximum decimal value of a short
SHRT_MIN
-
Minimum decimal value of a short
UCHAR_MAX
-
Maximum decimal integer value of unsigned char
UINT_MAX
-
Maximum decimal value of an unsigned integer
ULONG_MAX
-
Maximum decimal value of an unsigned long int
USHRT_MAX
-
Maximum decimal value of an unsigned short int
For integral types, on a machine that uses two's complement arithmetic (which is just about any machine you're likely to use), a signed type can hold numbers from -2(number of bits - 1) to +2(number of bits - 1) - 1.
An unsigned type can hold values from 0 to +2(number of bits)- 1. For instance, a 16-bit signed integer can hold numbers from -215(-32768) to +215 - 1 (32767).
11. Are there any problems with performing mathematical operations on different variable types?
C has three categories of built-in data types: pointer types, integral types, and floating-point types. Pointer types are the most restrictive in terms of the operations that can be performed on them. They are limited to
- subtraction of two pointers, valid only when both pointers point to elements in the same array. The result is the same as subtracting the integer subscripts corresponding to the two pointers.
+ addition of a pointer and an integral type. The result is a pointer that points to the element which would be selected by that integer.
Floating-point types consist of the built-in types float, double, and long double. Integral types consist of char, unsigned char, short, unsigned short, int, unsigned int, long, and unsigned long. All of these types can have the following arithmetic operations performed on them:
+ Addition
- Subtraction
* Multiplication
/ Division
Integral types also can have those four operations performed on them, as well as the following operations: % Modulo or remainder of division
<< Shift left
>> Shift right
& Bitwise AND operation
| Bitwise OR operation
^ Bitwise exclusive OR operation
! Logical negative operation
~ Bitwise "one's complement" operation
Although C permits "mixed mode" expressions (an arithmetic expression involving different types), it actually converts the types to be the same type before performing the operations (except for the case of pointer arithmetic described previously). The process of automatic type conversion is called "operator promotion."
12. What is operator promotion?
If an operation is specified with operands of two different types, they are converted to the smallest type that can hold both values. The result has the same type as the two operands wind up having. To interpret the rules, read the following table from the top down, and stop at the first rule that applies.
If Either Operand Is
And the Other Is
Change Them To
long double
-
any other type
-
long double
double
-
any smaller type
-
Double
float
-
any smaller type
-
Float
unsigned long
-
any integral type
-
unsigned long
long
-
unsigned > LONG_MAX
-
Long
long
-
any smaller type
-
Long
unsigned
-
any signed type
-
Unsigned
The following example code illustrates some cases of operator promotion. The variable f1 is set to 3/4. Because both 3 and 4 are integers, integer division is performed, and the result is the integer 0. The variable f2 is set to 3/4.0. Because 4.0 is a float, the number 3 is converted to a float as well, and the result is the float 0.75.
#include <stdio.h>
main()
{
float f1 = 3 / 4;
float f2 = 3 / 4.0;
printf("3 / 4 == %g or %g depending on the type used.\n", f1, f2);
}
13. When should a type cast be used?
There are two situations in which to use a type cast. The first use is to change the type of an operand to an arithmetic operation so that the operation will be performed properly. The variable f1 is set to the result of dividing the integer i by the integer j. The result is 0, because integer division is used. The variable f2 is set to the result of dividing i by j as well. However, the (float) type cast causes i to be converted to a float. That in turn causes floating-point division to be used and gives the result 0.75.
#include <stdio.h>
main()
{
int i = 3;
int j = 4;
float f1 = i / j;
float f2 = (float) i / j;
printf("3 / 4 == %g or %g depending on the type used.\n", f1, f2);
}
The second case is to cast pointer types to and from void * in order to interface with functions that expect or return void pointers. For example, the following line type casts the return value of the call to malloc() to be a pointer to a foo structure.
struct foo *p = (struct foo *) malloc(sizeof(struct foo));
14. When should a type cast not be used?
A type cast should not be used to override a const or volatile declaration. Overriding these type modifiers can cause the program to fail to run correctly.
A type cast should not be used to turn a pointer to one type of structure or data type into another. In the rare events in which this action is beneficial, using a union to hold the values makes the programmer's intentions clearer.
15. Is it acceptable to declare/define a variable in a C header?
A global variable that must be accessed from more than one file can and should be declared in a header file. In addition, such a variable must be defined in one source file. Variables should not be defined in header files, because the header file can be included in multiple source files, which would cause multiple definitions of the variable.
The ANSI C standard will allow multiple external definitions, provided that there is only one initialization. But because there's really no advantage to using this feature, it's probably best to avoid it and maintain a higher level of portability.
"Global" variables that do not have to be accessed from more than one file should be declared static and should not appear in a header file.
16. What is the difference between declaring a variable and defining a variable?
Declaring a variable means describing its type to the compiler but not allocating any space for it. Defining a variable means declaring it and also allocating space to hold the variable. You can also initialize a variable at the time it is defined. Here is a declaration of a variable and a structure, and two variable definitions, one with initialization:
extern int decl1; /* this is a declaration */
struct decl2
{
int member;
}; /* this just declares the type--no variable mentioned */
int def1 = 8; /* this is a definition */
int def2; /* this is a definition */
To put it another way, a declaration says to the compiler, "Somewhere in my program will be a variable with this name, and this is what type it is." A definition says, "Right here is this variable with this name and this type."
A variable can be declared many times, but it must be defined exactly once. For this reason, defi nitions do not belong in header files, where they might get #included into more than one place in your program.
17. Can static variables be declared in a header file?
You can't declare a static variable without defining it as well (this is because the storage class modifiers staticand extern are mutually exclusive). A static variable can be defined in a header file, but this would cause each source file that included the header file to have its own private copy of the variable, which is probably not what was intended.
18. What is the benefit of using const for declaring constants?
The benefit of using the const keyword is that the compiler might be able to make optimizations based on the knowledge that the value of the variable will not change. In addition, the compiler will try to ensure that the values won't be changed inadvertently.
Of course, the same benefits apply to #defined constants. The reason to use const rather than #define to define a constant is that a const variable can be of any type (such as a struct, which can't be represented by a#defined constant). Also, because a const variable is a real variable, it has an address that can be used, if needed, and it resides in only one place in memory.
Bit And Bytes
1. What is the most efficient way to store flag values?
A flag is a value used to make a decision between two or more options in the execution of a program. For instance, the /w flag on the MS-DOS dir command causes the command to display filenames in several columns across the screen instead of displaying them one per line. In which a flag is used to indicate which of two possible types is held in a union. Because a flag has a small number of values (often only two), it is tempting to save memory space by not storing each flag in its own int or char.
Efficiency in this case is a tradeoff between size and speed. The most memory-space efficient way to store a flag value is as single bits or groups of bits just large enough to hold all the possible values. This is because most computers cannot address individual bits in memory, so the bit or bits of interest must be extracted from the bytes that contain it.
The most time-efficient way to store flag values is to keep each in its own integer variable. Unfortunately, this method can waste up to 31 bits of a 32-bit variable, which can lead to very inefficient use of memory. If there are only a few flags, it doesn't matter how they are stored. If there are many flags, it might be advantageous to store them packed in an array of characters or integers. They must then be extracted by a process called bit masking, in which unwanted bits are removed from the ones of interest.
Sometimes it is possible to combine a flag with another value to save space. It might be possible to use high- order bits of integers that have values smaller than what an integer can hold. Another possibility is that some data is always a multiple of 2 or 4, so the low-order bits can be used to store a flag.
2. What is meant by "bit masking"?
Bit masking means selecting only certain bits from byte(s) that might have many bits set. To examine some bits of a byte, the byte is bitwise "ANDed" with a mask that is a number consisting of only those bits of interest. For instance, to look at the one's digit (rightmost digit) of the variable flags, you bitwise AND it with a mask of one (the bitwise AND operator in C is &):
flags & 1;
To set the bits of interest, the number is bitwise "ORed" with the bit mask (the bitwise OR operator in C is |). For instance, you could set the one's digit of flags like so:
flags = flags | 1;
Or, equivalently, you could set it like this:
flags |= 1;
To clear the bits of interest, the number is bitwise ANDed with the one's complement of the bit mask. The "one's complement" of a number is the number with all its one bits changed to zeros and all its zero bits changed to ones. The one's complement operator in C is ~. For instance, you could clear the one's digit of flags like so:
flags = flags & ~1;
Or, equivalently, you could clear it like this:
flags &= ~1;
Sometimes it is easier to use macros to manipulate flag values.
Example Program : Macros that make manipulating flags easier.
/* Bit Masking */
/* Bit masking can be used to switch a character
between lowercase and uppercase */
#define BIT_POS(N) ( 1U << (N) )
#define SET_FLAG(N, F) ( (N) |= (F) )
#define CLR_FLAG(N, F) ( (N) &= -(F) )
#define TST_FLAG(N, F) ( (N) & (F) )
#define BIT_RANGE(N, M) ( BIT_POS((M)+1 - (N))-1 << (N) )
#define BIT_SHIFTL(B, N) ( (unsigned)(B) << (N) )
#define BIT_SHIFTR(B, N) ( (unsigned)(B) >> (N) )
#define SET_MFLAG(N, F, V) ( CLR_FLAG(N, F), SET_FLAG(N, V) )
#define CLR_MFLAG(N, F) ( (N) &= ~(F) )
#define GET_MFLAG(N, F) ( (N) & (F) )
#include <stdio.h>
void main()
{
unsigned char ascii_char = 'A'; /* char = 8 bits only */
int test_nbr = 10;
printf("Starting character = %c\n", ascii_char);
/* The 5th bit position determines if the character is
uppercase or lowercase.
5th bit = 0 - Uppercase
5th bit = 1 - Lowercase */
printf("\nTurn 5th bit on = %c\n", SET_FLAG(ascii_char, BIT_POS(5)) );
printf("Turn 5th bit off = %c\n\n", CLR_FLAG(ascii_char, BIT_POS(5)) );
printf("Look at shifting bits\n");
printf("=====================\n");
printf("Current value = %d\n", test_nbr);
printf("Shifting one position left = %d\n",
test_nbr = BIT_SHIFTL(test_nbr, 1) );
printf("Shifting two positions right = %d\n",
BIT_SHIFTR(test_nbr, 2) );
}
BIT_POS(N) takes an integer N and returns a bit mask corresponding to that single bit position (BIT_POS(0) returns a bit mask for the one's digit, BIT_POS(1) returns a bit mask for the two's digit, and so on). So instead of writing
#define A_FLAG 4096
#define B_FLAG 8192
you can write
#define A_FLAG BIT_POS(12)
#define B_FLAG BIT_POS(13)
which is less prone to errors.
The SET_FLAG(N, F) macro sets the bit at position F of variable N. Its opposite is CLR_FLAG(N, F), which clears the bit at position F of variable N. Finally, TST_FLAG(N, F) can be used to test the value of the bit at position F of variable N, as in
if (TST_FLAG(flags, A_FLAG))
/* do something */;
The macro BIT_RANGE(N, M) produces a bit mask corresponding to bit positions N through M, inclusive. With this macro, instead of writing
#define FIRST_OCTAL_DIGIT 7 /* 111 */
#define SECOND_OCTAL_DIGIT 56 /* 111000 */
you can write
#define FIRST_OCTAL_DIGIT BIT_RANGE(0, 2) /* 111 */
#define SECOND_OCTAL_DIGIT BIT_RANGE(3, 5) /* 111000 */
which more clearly indicates which bits are meant.
The macro BIT_SHIFT(B, N) can be used to shift value B into the proper bit range (starting with bit N). For instance, if you had a flag called C that could take on one of five possible colors, the colors might be defined like this:
#define C_FLAG BIT_RANGE(8, 10) /* 11100000000 */
/* here are all the values the C flag can take on */
#define C_BLACK BIT_SHIFTL(0, 8) /* 00000000000 */
#define C_RED BIT_SHIFTL(1, 8) /* 00100000000 */
#define C_GREEN BIT_SHIFTL(2, 8) /* 01000000000 */
#define C_BLUE BIT_SHIFTL(3, 8) /* 01100000000 */
#define C_WHITE BIT_SHIFTL(4, 8) /* 10000000000 */
#define C_ZERO C_BLACK
#define C_LARGEST C_WHITE
/* A truly paranoid programmer might do this */
#if C_LARGEST > C_FLAG
Cause an error message. The flag C_FLAG is not
big enough to hold all its possible values.
#endif /* C_LARGEST > C_FLAG */
The macro SET_MFLAG(N, F, V) sets flag F in variable N to the value V. The macro CLR_MFLAG(N, F) is identical toCLR_FLAG(N, F), except the name is changed so that all the operations on multibit flags have a similar naming convention. The macro GET_MFLAG(N, F) gets the value of flag F in variable N, so it can be tested, as in
if (GET_MFLAG(flags, C_FLAG) == C_BLUE)
/* do something */;
3. Are bit fields portable?
Bit fields are not portable. Because bit fields cannot span machine words, and because the number of bits in a machine word is different on different machines, a particular program using bit fields might not even compile on a particular machine.
Assuming that your program does compile, the order in which bits are assigned to bit fields is not defined. Therefore, different compilers, or even different versions of the same compiler, could produce code that would not work properly on data generated by compiled older code. Stay away from using bit fields, except in cases in which the machine can directly address bits in memory and the compiler can generate code to take advantage of it and the increase in speed to be gained would be essential to the operation of the program.
4. Is it better to bitshift a value than to multiply by 2?
Any decent optimizing compiler will generate the same code no matter which way you write it. Use whichever form is more readable in the context in which it appears. The following program's assembler code can be viewed with a tool such as CODEVIEW on DOS/Windows or the disassembler (usually called "dis") on UNIX machines:
Example: Multiplying by 2 and shifting left by 1 are often the same.
void main()
{
unsigned int test_nbr = 300;
test_nbr *= 2;
test_nbr = 300;
test_nbr <<= 1;
}
5. What is meant by high-order and low-order bytes?
We generally write numbers from left to right, with the most significant digit first. To understand what is meant by the "significance" of a digit, think of how much happier you would be if the first digit of your paycheck was increased by one compared to the last digit being increased by one.
The bits in a byte of computer memory can be considered digits of a number written in base 2. That means the least significant bit represents one, the next bit represents 2´1, or 2, the next bit represents 2´2´1, or 4, and so on. If you consider two bytes of memory as representing a single 16-bit number, one byte will hold the least significant 8 bits, and the other will hold the most significant 8 bits. Figure shows the bits arranged into two bytes. The byte holding the least significant 8 bits is called the least significant byte, or low-order byte. The byte containing the most significant 8 bits is the most significant byte, or high- order byte.
6. How are 16- and 32-bit numbers stored?
A 16-bit number takes two bytes of storage, a most significant byte and a least significant byte. If you write the 16-bit number on paper, you would start with the most significant byte and end with the least significant byte. There is no convention for which order to store them in memory, however.
Let's call the most significant byte M and the least significant byte L. There are two possible ways to store these bytes in memory. You could store M first, followed by L, or L first, followed by M. Storing byte M first in memory is called "forward" or "big-endian" byte ordering. The term big endian comes from the fact that the "big end" of the number comes first, and it is also a reference to the book Gulliver's Travels, in which the term refers to people who eat their boiled eggs with the big end on top.
Storing byte L first is called "reverse" or "little-endian" byte ordering. Most machines store data in a big- endian format. Intel CPUs store data in a little-endian format, however, which can be confusing when someone is trying to connect an Intel microprocessor-based machine to anything else.
A 32-bit number takes four bytes of storage. Let's call them Mm, Ml, Lm, and Ll in decreasing order of significance. There are 4! (4 factorial, or 24) different ways in which these bytes can be ordered. Over the years, computer designers have used just about all 24 ways. The most popular two ways in use today, however, are (Mm, Ml, Lm, Ll), which is big-endian, and (Ll, Lm, Ml, Mm), which is little-endian. As with 16-bit numbers, most machines store 32-bit numbers in a big-endian format, but Intel machines store 32-bit numbers in a little-endian format.
Functions
1. When should I declare a function?
Functions that are used only in the current source file should be declared as static, and the function's declaration should appear in the current source file along with the definition of the function. Functions used outside of the current source file should have their declarations put in a header file, which can be included in whatever source file is going to use that function. For instance, if a function named stat_func() is used only in the source file stat.c, it should be declared as shown here:
/* stat.c */
#include <stdio.h>
static int stat_func(int, int); /* static declaration of stat_func() */
void main(void);
void main(void)
{
...
rc = stat_func(1, 2);
...
}
/* definition (body) of stat_func() */
static int stat_func(int arg1, int arg2)
{
...
return rc;
}
In this example, the function named stat_func() is never used outside of the source file stat.c. There is therefore no reason for the prototype (or declaration) of the function to be visible outside of the stat.c source file. Thus, to avoid any confusion with other functions that might have the same name, the declaration of stat_func() should be put in the same source file as the declaration of stat_func().
In the following example, the function glob_func() is declared and used in the source file global.c and is used in the source file extern.c. Because glob_func() is used outside of the source file in which it's declared, the declaration of glob_func() should be put in a header file (in this example, named proto.h) to be included in both the global.c and the extern.c source files. This is how it's done:
/* proto.h */
int glob_func(int, int); /* declaration of the glob_func() function */
/* global.c */
#include <stdio.h>
#include "proto.h"
/* include this proto.h file for the declaration of glob_func() */
void main(void);
void main(void)
{
...
rc = glob_func(1, 2);
...
}
/* definition (body) of the glob_func() function */
int glob_func(int arg1, int arg2)
{
...
return rc;
}
/* extern.c */
#include <stdio.h>
#include "proto.h"
/* include this proto.h file for the declaration of glob_func() */
void ext_func(void);
void ext_func(void)
{
...
/* call glob_func(), which is defined in the global.c source file */
rc = glob_func(10, 20);
...
}
In the preceding example, the declaration of glob_func() is put in the header file named proto.h becauseglob_func() is used in both the global.c and the extern.c source files. Now, whenever glob_func() is going to be used, you simply need to include the proto.h header file, and you will automatically have the function's declaration. This will help your compiler when it is checking parameters and return values from global functions you are using in your programs. Notice that your function declarations should always appear before the first function declaration in your source file.
In general, if you think your function might be of some use outside of the current source file, you should put its declaration in a header file so that other modules can access it. Otherwise, if you are sure your function will never be used outside of the current source file, you should declare the function as static and include the declaration only in the current source file.
2. Why should I prototype a function?
A function prototype tells the compiler what kind of arguments a function is looking to receive and what kind of return value a function is going to give back. This approach helps the compiler ensure that calls to a function are made correctly and that no erroneous type conversions are taking place. For instance, consider the following prototype:
int some_func(int, char*, long);
Looking at this prototype, the compiler can check all references (including the definition of some_func()) to ensure that three parameters are used (an integer, a character pointer, and then a long integer) and that a return value of type integer is received. If the compiler finds differences between the prototype and calls to the function or the definition of the function, an error or a warning can be generated to avoid errors in your source code. For instance, the following examples would be flagged as incorrect, given the preceding
prototype of some_func():
x = some_func(1); /* not enough arguments passed */
x = some_func("HELLO!", 1, "DUDE!"); /* wrong type of arguments used */
x = some_func(1, str, 2879, "T"); /* too many arguments passed */
/* In the following example, the return value expected
from some_func() is not an integer: */
long* lValue;
lValue = some_func(1, str, 2879); /* some_func() returns an int,
not a long* */
Using prototypes, the compiler can also ensure that the function definition, or body, is correct and correlates with the prototype. For instance, the following definition of some_func() is not the same as its prototype, and it therefore would be flagged by the compiler:
int some_func(char* string, long lValue, int iValue) /* wrong order of
parameters */
{
...
}
The bottom line on prototypes is that you should always include them in your source code because they provide a good error-checking mechanism to ensure that your functions are being used correctly. Besides, many of today's popular compilers give you warnings when compiling if they can't find a prototype for a function that is being referenced.
3. How many parameters should a function have?
There is no set number or "guideline" limit to the number of parameters your functions can have. However, it is considered bad programming style for your functions to contain an inordinately high (eight or more) number of parameters. The number of parameters a function has also directly affects the speed at which it is called—the more parameters, the slower the function call. Therefore, if possible, you should minimize the number of parameters you use in a function. If you are using more than four parameters, you might want to rethink your function design and calling conventions.
One technique that can be helpful if you find yourself with a large number of function parameters is to put your function parameters in a structure. Consider the following program, which contains a function namedprint_report() that uses 10 parameters. Instead of making an enormous function declaration and proto- type, theprint_report() function uses a structure to get its parameters:
#include <stdio.h>
typedef struct
{
int orientation;
char rpt_name[25];
char rpt_path[40];
int destination;
char output_file[25];
int starting_page;
int ending_page;
char db_name[25];
char db_path[40];
int draft_quality;
} RPT_PARMS;
void main(void);
int print_report(RPT_PARMS*);
void main(void)
{
RPT_PARMS rpt_parm; /* define the report parameter
structure variable */
...
/* set up the report parameter structure variable to pass to the
print_report() function */
rpt_parm.orientation = ORIENT_LANDSCAPE;
rpt_parm.rpt_name = "QSALES.RPT";
rpt_parm.rpt_path = "C:\REPORTS";
rpt_parm.destination = DEST_FILE;
rpt_parm.output_file = "QSALES.TXT";
rpt_parm.starting_page = 1;
rpt_parm.ending_page = RPT_END;
rpt_parm.db_name = "SALES.DB";
rpt_parm.db_path = "C:\DATA";
rpt_parm.draft_quality = TRUE;
/* Call the print_report() function, passing it a pointer to the
parameters instead of passing it a long list of 10 separate
parameters. */
ret_code = print_report(&rpt_parm);
...
}
int print_report(RPT_PARMS* p)
{
int rc;
...
/* access the report parameters passed to the print_report()
function */
orient_printer(p->orientation);
set_printer_quality((p->draft_quality == TRUE) ? DRAFT : NORMAL);
...
return rc;
}
The preceding example avoided a large, messy function prototype and definition by setting up a predefined structure of type RPT_PARMS to hold the 10 parameters that were needed by the print_report() function. The only possible disadvantage to this approach is that by removing the parameters from the function definition, you are bypassing the compiler's capability to type-check each of the parameters for validity during the compile stage.
Generally, you should keep your functions small and focused, with as few parameters as possible to help with execution speed. If you find yourself writing lengthy functions with many parameters, maybe you should rethink your function design or consider using the structure-passing technique presented here. Additionally, keeping your functions small and focused will help when you are trying to isolate and fix bugs in your programs.
4. What is a static function?
A static function is a function whose scope is limited to the current source file. Scope refers to the visibility of a function or variable. If the function or variable is visible outside of the current source file, it is said to have global, or external, scope. If the function or variable is not visible outside of the current source file, it is said to havelocal, or static, scope.
A static function therefore can be seen and used only by other functions within the current source file. When you have a function that you know will not be used outside of the current source file or if you have a function that you do not want being used outside of the current source file, you should declare it as static. Declaring local functions as static is considered good programming practice. You should use static functions often to avoid possible conflicts with external functions that might have the same name.
For instance, consider the following example program, which contains two functions. The first function,open_customer_table(), is a global function that can be called by any module. The second function,open_customer_indexes(), is a local function that will never be called by another module. This is because you can't have the customer's index files open without first having the customer table open. Here is the code:
#include <stdio.h>
int open_customer_table(void); /* global function, callable from
any module */
static int open_customer_indexes(void); /* local function, used only in
this module */
int open_customer_table(void)
{
int ret_code;
/* open the customer table */
...
if (ret_code == OK)
{
ret_code = open_customer_indexes();
}
return ret_code;
}
static int open_customer_indexes(void)
{
int ret_code;
/* open the index files used for this table */
...
return ret_code;
}
Generally, if the function you are writing will not be used outside of the current source file, you should declare it as static.
5. Should a function contain a return statement if it does not return a value?
In C, void functions (those that do not return a value to the calling function) are not required to include a return statement. Therefore, it is not necessary to include a return statement in your functions declared as being void.
In some cases, your function might trigger some critical error, and an immediate exit from the function might be necessary. In this case, it is perfectly acceptable to use a return statement to bypass the rest of the function's code. However, keep in mind that it is not considered good programming practice to litter your functions with return statements-generally, you should keep your function's exit point as focused and clean as possible.
6. How can you pass an array to a function by value?
An array can be passed to a function by value by declaring in the called function the array name with square brackets ([ and ]) attached to the end. When calling the function, simply pass the address of the array (that is, the array's name) to the called function. For instance, the following program passes the array x[] to the function namedbyval_func() by value:
#include <stdio.h>
void byval_func(int[]); /* the byval_func() function is passed an
integer array by value */
void main(void);
void main(void)
{
int x[10];
int y;
/* Set up the integer array. */
for (y=0; y<10; y++)
x[y] = y;
/* Call byval_func(), passing the x array by value. */
byval_func(x);
}
/* The byval_function receives an integer array by value. */
void byval_func(int i[])
{
int y;
/* Print the contents of the integer array. */
for (y=0; y<10; y++)
printf("%d\n", i[y]);
}
In this example program, an integer array named x is defined and initialized with 10 values. The functionbyval_func() is declared as follows:
int byval_func(int[]);
The int[] parameter tells the compiler that the byval_func() function will take one argument—an array of integers. When the byval_func() function is called, you pass the address of the array to byval_func():byval_func(x);
Because the array is being passed by value, an exact copy of the array is made and placed on the stack. The called function then receives this copy of the array and can print it. Because the array passed to byval_func() is a copy of the original array, modifying the array within the byval_func() function has no effect on the original array.
Passing arrays of any kind to functions can be very costly in several ways. First, this approach is very inefficient because an entire copy of the array must be made and placed on the stack. This takes up valuable program time, and your program execution time is degraded. Second, because a copy of the array is made, more memory (stack) space is required. Third, copying the array requires more code generated by the compiler, so your program is larger.
Instead of passing arrays to functions by value, you should consider passing arrays to functions by reference: this means including a pointer to the original array. When you use this method, no copy of the array is made. Your programs are therefore smaller and more efficient, and they take up less stack space. To pass an array by reference, you simply declare in the called function prototype a pointer to the data type you are holding in the array.
Consider the following program, which passes the same array (x) to a function:
#include <stdio.h>
void const_func(const int*);
void main(void);
void main(void)
{
int x[10];
int y;
/* Set up the integer array. */
for (y=0; y<10; y++)
x[y] = y;
/* Call const_func(), passing the x array by reference. */
const_func(x);
}
/* The const_function receives an integer array by reference.
Notice that the pointer is declared as const, which renders
it unmodifiable by the const_func() function. */
void const_func(const int* i)
{
int y;
/* Print the contents of the integer array. */
for (y=0; y<10; y++)
printf("%d\n", *(i+y));
}
In the preceding example program, an integer array named x is defined and initialized with 10 values. The function const_func() is declared as follows:
int const_func(const int*);
The const int* parameter tells the compiler that the const_func() function will take one argument—a constant pointer to an integer. When the const_func() function is called, you pass the address of the array toconst_func():
const_func(x);
Because the array is being passed by reference, no copy of the array is made and placed on the stack. The called function receives simply a constant pointer to an integer. The called function must be coded to be smart enough to know that what it is really receiving is a constant pointer to an array of integers. The const modifier is used to prevent the const_func() from accidentally modifying any elements of the original array.
The only possible drawback to this alternative method of passing arrays is that the called function must be coded correctly to access the array—it is not readily apparent by the const_func() function prototype or definition that it is being passed a reference to an array of integers. You will find, however, that this method is much quicker and more efficient, and it is recommended when speed is of utmost importance.
7. Is it possible to execute code even after the program exits the main() function?
The standard C library provides a function named atexit() that can be used to perform "cleanup" operations when your program terminates. You can set up a set of functions you want to perform automatically when your program exits by passing function pointers to the atexit() function. Here's an example of a program that uses the atexit()function:
#include <stdio.h>
#include <stdlib.h>
void close_files(void);
void print_registration_message(void);
int main(int, char**);
int main(int argc, char** argv)
{
...
atexit(print_registration_message);
atexit(close_files);
while (rec_count < max_records)
{
process_one_record();
}
exit(0);
}
This example program uses the atexit() function to signify that the close_files() function and theprint_registration_message() function need to be called automatically when the program exits. When themain() function ends, these two functions will be called to close the files and print the registration message. There are two things that should be noted regarding the atexit() function. First, the functions you specify to execute at program termination must be declared as void functions that take no parameters. Second, the functions you designate with the atexit() function are stacked in the order in which they are called with atexit(), and therefore they are executed in a last-in, first-out (LIFO) method. Keep this information in mind when using the atexit()function. In the preceding example, the atexit() function is stacked as shown here:
atexit(print_registration_message);
atexit(close_files);
Because the LIFO method is used, the close_files() function will be called first, and then theprint_registration_message() function will be called.
The atexit() function can come in handy when you want to ensure that certain functions (such as closing your program's data files) are performed before your program terminates.
8. What does a function declared as PASCAL do differently?
A C function declared as PASCAL uses a different calling convention than a "regular" C function. Normally, C function parameters are passed right to left; with the PASCAL calling convention, the parameters are passed left to right.
Consider the following function, which is declared normally in a C program:
int regular_func(int, char*, long);
Using the standard C calling convention, the parameters are pushed on the stack from right to left. This means that when the regular_func() function is called in C, the stack will contain the following parameters:
long
char*
int
The function calling regular_func() is responsible for restoring the stack when regular_func() returns.
When the PASCAL calling convention is being used, the parameters are pushed on the stack from left to right.
Consider the following function, which is declared as using the PASCAL calling convention:
int PASCAL pascal_func(int, char*, long);
When the function pascal_func() is called in C, the stack will contain the following parameters:
int
char*
long
The function being called is responsible for restoring the stack pointer. Why does this matter? Is there any benefit to using PASCAL functions?
Functions that use the PASCAL calling convention are more efficient than regular C functions—the function calls tend to be slightly faster. Microsoft Windows is an example of an operating environment that uses the PASCAL calling convention. The Windows SDK (Software Development Kit) contains hundreds of functions declared as PASCAL.
When Windows was first designed and written in the late 1980s, using the PASCAL modifier tended to make a noticeable difference in program execution speed. In today's world of fast machinery, the PASCAL modifier is much less of a catalyst when it comes to the speed of your programs. In fact, Microsoft has abandoned the PASCAL calling convention style for the Windows NT operating system.
In your world of programming, if milliseconds make a big difference in your programs, you might want to use the PASCAL modifier when declaring your functions. Most of the time, however, the difference in speed is hardly noticeable, and you would do just fine to use C's regular calling convention.
9. Is using exit() the same as using return?
No. The exit() function is used to exit your program and return control to the operating system. The return statement is used to return from a function and return control to the calling function. If you issue a return from themain() function, you are essentially returning control to the calling function, which is the operating system. In this case, the return statement and exit() function are similar. Here is an example of a program that uses the exit()function and return statement:
#include <stdio.h>
#include <stdlib.h>
int main(int, char**);
int do_processing(void);
int do_something_daring();
int main(int argc, char** argv)
{
int ret_code;
if (argc < 3)
{
printf("Wrong number of arguments used!\n");
/* return 1 to the operating system */
exit(1);
}
ret_code = do_processing();
...
/* return 0 to the operating system */
exit(0);
}
int do_processing(void)
{
int rc;
rc = do_something_daring();
if (rc == ERROR)
{
printf("Something fishy is going on around here..."\n);
/* return rc to the operating system */
exit(rc);
}
/* return 0 to the calling function */
return 0;
}
In the main() function, the program is exited if the argument count (argc) is less than 3. The statement exit(1);tells the program to exit and return the number 1 to the operating system. The operating system can then decide what to do based on the return value of the program. For instance, many DOS batch files check the environment variable named ERRORLEVEL for the return value of executable programs.
Strings
1. What is the difference between a string copy (strcpy) and a memory copy (memcpy)? When should each be used?
The strcpy() function is designed to work exclusively with strings. It copies each byte of the source string to the destination string and stops when the terminating null character (\0) has been moved. On the other hand, thememcpy() function is designed to work with any type of data.
Because not all data ends with a null character, you must provide the memcpy() function with the number of bytes you want to copy from the source to the destination. The following program shows examples of both the strcpy()and the memcpy() functions:
#include <stdio.h>
#include <string.h>
typedef struct cust_str {
int id;
char last_name[20];
char first_name[15];
} CUSTREC;
void main(void);
void main(void)
{
char* src_string = "This is the source string";
char dest_string[50];
CUSTREC src_cust;
CUSTREC dest_cust;
printf("Hello! I'm going to copy src_string into dest_string!\n");
/* Copy src_string into dest_string. Notice that the destination
string is the first argument. Notice also that the strcpy()
function returns a pointer to the destination string. */
printf("Done! dest_string is: %s\n",
strcpy(dest_string, src_string));
printf("Encore! Let's copy one CUSTREC to another.\n");
printf("I'll copy src_cust into dest_cust.\n");
/* First, initialize the src_cust data members. */
src_cust.id = 1;
strcpy(src_cust.last_name, "Strahan");
strcpy(src_cust.first_name, "Troy");
/* Now, use the memcpy() function to copy the src_cust structure to
the dest_cust structure. Notice that, just as with strcpy(), the
destination comes first. */
memcpy(&dest_cust, &src_cust, sizeof(CUSTREC));
printf("Done! I just copied customer number #%d (%s %s).",
dest_cust.id, dest_cust.first_name, dest_cust.last_name);
}
When dealing with strings, you generally should use the strcpy() function, because it is easier to use with strings. When dealing with abstract data other than strings (such as structures), you should use the memcpy() function.
2. How can I remove the trailing spaces from a string?
The C language does not provide a standard function that removes trailing spaces from a string. It is easy, however, to build your own function to do just this. The following program uses a custom function named rtrim() to remove the trailing spaces from a string. It carries out this action by iterating through the string backward, starting at the character before the terminating null character (\0) and ending when it finds the first nonspace character. When the program finds a nonspace character, it sets the next character in the string to the terminating null character (\0), thereby effectively eliminating all the trailing blanks. Here is how this task is performed:
#include <stdio.h>
#include <string.h>
void main(void);
char* rtrim(char*);
void main(void)
{
char* trail_str = "This string has trailing spaces in it. ";
/* Show the status of the string before calling the rtrim()
function. */
printf("Before calling rtrim(), trail_str is '%s'\n", trail_str);
printf("and has a length of %d.\n", strlen(trail_str));
/* Call the rtrim() function to remove the trailing blanks. */
rtrim(trail_str);
/* Show the status of the string
after calling the rtrim() function. */
printf("After calling rtrim(), trail_str is '%s'\n", trail_str);
printf("and has a length of %d.\n", strlen(trail_str));
}
/* The rtrim() function removes trailing spaces from a string. */
char* rtrim(char* str)
{
int n = strlen(str) - 1; /* Start at the character BEFORE
the null character (\0). */
while (n>0) /* Make sure we don't go out of bounds... */
{
if (*(str+n) != ' ') /* If we find a nonspace character: */
{
*(str+n+1) = '\0'; /* Put the null character at one
character past our current
position. */
break; /* Break out of the loop. */
}
else /* Otherwise, keep moving backward in the string. */
n--;
}
return str; /* Return a pointer to the string. */
}
Notice that the rtrim() function works because in C, strings are terminated by the null character. With the insertion of a null character after the last nonspace character, the string is considered terminated at that point, and all characters beyond the null character are ignored.
3. How can I remove the leading spaces from a string?
The C language does not provide a standard function that removes leading spaces from a string. It is easy, however, to build your own function to do just this. you can easily construct a custom function that uses the rtrim()function in conjunction with the standard C library function strrev() to remove the leading spaces from a string. Look at how this task is performed:
#include <stdio.h>
#include <string.h>
void main(void);
char* ltrim(char*);
char* rtrim(char*);
void main(void)
{
char* lead_str = " This string has leading spaces in it.";
/* Show the status of the string before calling the ltrim()
function. */
printf("Before calling ltrim(), lead_str is '%s'\n", lead_str);
printf("and has a length of %d.\n", strlen(lead_str));
/* Call the ltrim() function to remove the leading blanks. */
ltrim(lead_str);
/* Show the status of the string
after calling the ltrim() function. */
printf("After calling ltrim(), lead_str is '%s'\n", lead_str);
printf("and has a length of %d.\n", strlen(lead_str));
}
/* The ltrim() function removes leading spaces from a string. */
char* ltrim(char* str)
{
strrev(str); /* Call strrev() to reverse the string. */
rtrim(str); /* Call rtrim() to remove the "trailing" spaces. */
strrev(str); /* Restore the string's original order. */
return str; /* Return a pointer to the string. */
}
/* The rtrim() function removes trailing spaces from a string. */
char* rtrim(char* str)
{
int n = strlen(str) - 1; /* Start at the character BEFORE
the null character (\0). */
while (n>0) /* Make sure we don't go out of bounds... */
{
if (*(str+n) != ' ') /* If we find a nonspace character: */
{
*(str+n+1) = '\0'; /* Put the null character at one
character past our current
position. */
break; /* Break out of the loop. */
}
else /* Otherwise, keep moving backward in the string. */
n--;
}
return str; /* Return a pointer to the string. */
}
Notice that the ltrim() function performs the following tasks: First, it calls the standard C library functionstrrev(), which reverses the string that is passed to it. This action puts the original string in reverse order, thereby creating "trailing spaces" rather than leading spaces. Now, the rtrim() function is used to remove the "trailing spaces" from the string. After this task is done, the strrev() function is called again to "reverse" the string, thereby putting it back in its original order.
4. How can I right-justify a string?
Even though the C language does not provide a standard function that right-justifies a string, you can easily build your own function to perform this action. Using the rtrim() function, you can create your own function to take a string and right-justify it. Here is how this task is accomplished:
#include <stdio.h>
#include <string.h>
#include <malloc.h>
void main(void);
char* rjust(char*);
char* rtrim(char*);
void main(void)
{
char* rjust_str = "This string is not right-justified. ";
/* Show the status of the string before calling the rjust()
function. */
printf("Before calling rjust(), rjust_str is '%s'\n.", rjust_str);
/* Call the rjust() function to right-justify this string. */
rjust(rjust_str);
/* Show the status of the string
after calling the rjust() function. */
printf("After calling rjust(), rjust_str is '%s'\n.", rjust_str);
}
/* The rjust() function right-justifies a string. */
char* rjust(char* str)
{
int n = strlen(str); /* Save the original length of the string. */
char* dup_str;
dup_str = strdup(str); /* Make an exact duplicate of the string. */
rtrim(dup_str); /* Trim off the trailing spaces. */
/* Call sprintf() to do a virtual "printf" back into the original
string. By passing sprintf() the length of the original string,
we force the output to be the same size as the original, and by
default the sprintf() right-justifies the output. The sprintf()
function fills the beginning of the string with spaces to make
it the same size as the original string. */
sprintf(str, "%*.*s", n, n, dup_str);
free(dup_str); /* Free the memory taken by
the duplicated string. */
return str; /* Return a pointer to the string. */
}
/* The rtrim() function removes trailing spaces from a string. */
char* rtrim(char* str)
{
int n = strlen(str) - 1; /* Start at the character BEFORE the null
character (\0). */
while (n>0) /* Make sure we don't go out of bounds... */
{
if (*(str+n) != ' ') /* If we find a nonspace character: */
{
*(str+n+1) = '\0'; /* Put the null character at one
character past our current
position. */
break; /* Break out of the loop. */
}
else /* Otherwise, keep moving backward in the string. */
n--;
}
return str; /* Return a pointer to the string. */
}
The rjust() function first saves the length of the original string in a variable named n. This step is needed because the output string must be the same length as the input string. Next, the rjust() function calls the standard C library function named strdup() to create a duplicate of the original string. A duplicate of the string is required because the original version of the string is going to be overwritten with a right-justified version. After the duplicate string is created, a call to the rtrim() function is invoked (using the duplicate string, not the original), which eliminates all trailing spaces from the duplicate string.
Next, the standard C library function sprintf() is called to rewrite the new string to its original place in memory. The sprintf() function is passed the original length of the string (stored in n), thereby forcing the output string to be the same length as the original. Because sprintf() by default right-justifies string output, the output string is filled with leading spaces to make it the same size as the original string. This has the effect of right-justifying the input string. Finally, because the strdup() function dynamically allocates memory, the free() function is called to free up the memory taken by the duplicate string.
5. How can I pad a string to a known length?
Padding strings to a fixed length can be handy when you are printing fixed-length data such as tables or spreadsheets. You can easily perform this task using the printf() function. The following example program shows how to accomplish this task:
#include <stdio.h>
char *data[25] = {
"REGION", "--Q1--", "--Q2--", "--Q3--", " --Q4--",
"North", "10090.50", "12200.10", "26653.12", "62634.32",
"South", "21662.37", "95843.23", "23788.23", "48279.28",
"East", "23889.38", "23789.05", "89432.84", "29874.48",
"West", "85933.82", "74373.23", "78457.23", "28799.84" };
void main(void);
void main(void)
{
int x;
for (x=0; x<25; x++)
{
if ((x % 5) == 0 && (x != 0))
printf("\n");
printf("%-10.10s", data[x]);
}
}
In this example, a character array (char* data[]) is filled with this year's sales data for four regions. Of course, you would want to print this data in an orderly fashion, not just print one figure after the other with no formatting. This being the case, the following statement is used to print the data:
printf("%-10.10s", data[x]);
The "%-10.10s" argument tells the printf() function that you are printing a string and you want to force it to be 10 characters long. By default, the string is right-justified, but by including the minus sign (-) before the first 10, you tell the printf() function to left-justify your string. This action forces the printf() function to pad the string with spaces to make it 10 characters long. The result is a clean, formatted spreadsheet-like
output:
REGION --Q1-- --Q2-- --Q3-- --Q4--
North 10090.50 12200.10 26653.12 62634.32
South 21662.37 95843.23 23788.23 48279.28
East 23889.38 23789.05 89432.84 29874.48
West 85933.82 74373.23 78457.23 28799.84
6. How can I copy just a portion of a string?
You can use the standard C library function strncpy() to copy one portion of a string into another string. Thestrncpy() function takes three arguments: the first argument is the destination string, the second argument is the source string, and the third argument is an integer representing the number of characters you want to copy from the source string to the destination string. For example, consider the following program, which uses the strncpy()function to copy portions of one string to another:
#include <stdio.h>
#include <string.h>
void main(void);
void main(void)
{
char* source_str = "THIS IS THE SOURCE STRING";
char dest_str1[40] = {0}, dest_str2[40] = {0};
/* Use strncpy() to copy only the first 11 characters. */
strncpy(dest_str1, source_str, 11);
printf("How about that! dest_str1 is now: '%s'!!!\n", dest_str1);
/* Now, use strncpy() to copy only the last 13 characters. */
strncpy(dest_str2, source_str + (strlen(source_str) - 13), 13);
printf("Whoa! dest_str2 is now: '%s'!!!\n", dest_str2);
}
The first call to strncpy() in this example program copies the first 11 characters of the source string intodest_str1. This example is fairly straightforward, one you might use often. The second call is a bit more complicated and deserves some explanation. In the second argument to the strncpy() function call, the total length of thesource_str string is calculated (using the strlen() function). Then, 13 (the number of characters you want to print) is subtracted from the total length of source_str. This gives the number of remaining characters in source_str. This number is then added to the address of source_str to give a pointer to an address in the source string that is 13 characters from the end of source_str.
Then, for the last argument, the number 13 is specified to denote that 13 characters are to be copied out of the string. The combination of these three arguments in the second call to strncpy() sets dest_str2 equal to the last 13 characters of source_str.
The example program prints the following output:
How about that! dest_str1 is now: 'THIS IS THE'!!!
Whoa! dest_str2 is now: 'SOURCE STRING'!!!
Notice that before source_str was copied to dest_str1 and dest_st2, dest_str1 and dest_str2 had to be initialized to null characters (\0). This is because the strncpy() function does not automatically append a null character to the string you are copying to. Therefore, you must ensure that you have put the null character after the string you have copied, or else you might wind up with garbage being printed.
7. How can I convert a number to a string?
The standard C library provides several functions for converting numbers of all formats (integers, longs, floats, and so on) to strings and vice versa. One of these functions, itoa(), is used here to illustrate how an integer is converted to a string:
#include <stdio.h>
#include <stdlib.h>
void main(void);
void main(void)
{
int num = 100;
char str[25];
itoa(num, str, 10);
printf("The number 'num' is %d and the string 'str' is %s.\n",
num, str);
}
Notice that the itoa() function takes three arguments: the first argument is the number you want to convert to the string, the second is the destination string to put the converted number into, and the third is the base, or radix, to be used when converting the number. The preceding example uses the common base 10 to convert the number to the string.
The following functions can be used to convert integers to strings:
Function Name
Purpose
itoa()
-
Converts an integer value to a string.
ltoa()
-
Converts a long integer value to a string.
ultoa()
-
Converts an unsigned long integer value to a string.
Note that the itoa(), ltoa(), and ultoa() functions are not ANSI compatible. An alternative way to convert an integer to a string (that is ANSI compatible) is to use the sprintf() function, as in the following example:
#include <stdio.h>
#include <stdlib.h>
void main(void);
void main(void)
{
int num = 100;
char str[25];
sprintf(str, "%d", num);
printf("The number 'num' is %d and the string 'str' is %s.\n",
num, str);
}
When floating-point numbers are being converted, a different set of functions must be used. Here is an example of a program that uses the standard C library function fcvt() to convert a floating-point value to a string:
#include <stdio.h>
#include <stdlib.h>
void main(void);
void main(void)
{
double num = 12345.678;
char* str;
int dec_pl, sign, ndigits = 3; /* Keep 3 digits of precision. */
str = fcvt(num, ndigits, &dec_pl, &sign); /* Convert the float
to a string. */
printf("Original number: %f\n", num); /* Print the original
floating-point
value. */
printf("Converted string: %s\n", str); /* Print the converted
string's value */
printf("Decimal place: %d\n", dec_pl); /* Print the location of
the decimal point. */
printf("Sign: %d\n", sign); /* Print the sign.
0 = positive,
1 = negative. */
}
Notice that the fcvt() function is quite different from the itoa() function used previously. The fcvt() function takes four arguments. The first argument is the floating-point value you want to convert. The second argument is the number of digits to be stored to the right of the decimal point. The third argument is a pointer to an integer that is used to return the position of the decimal point in the converted string. The fourth argument is a pointer to an integer that is used to return the sign of the converted number (0 is positive, 1 is negative).
Note that the converted string does not contain the actual decimal point. Instead, the fcvt() returns the position of the decimal point as it would have been if it were in the string. In the preceding example, the dec_pl integer variable contains the number 5 because the decimal point is located after the fifth digit in the resulting string. If you wanted the resulting string to include the decimal point, you could use the gcvt() function (described in the following table).
The following functions can be used to convert floating-point values to strings:
Function
Purpose
ecvt()
-
Converts a double-precision floating-point value to a string without an embedded decimal point.
fcvt()
-
Same as ecvt(), but forces the precision to a specified number of digits.
gcvt()
-
Converts a double-precision floating-point value to a string with an embedded decimal point.
8. How can I convert a string to a number?
The standard C library provides several functions for converting strings to numbers of all formats (integers, longs, floats, and so on) and vice versa. One of these functions, atoi(), is used here to illustrate how a string is converted to an integer:
#include <stdio.h>
#include <stdlib.h>
void main(void);
{
int num;
char* str = "100";
num = atoi(str);
printf("The string 'str' is %s and the number 'num' is %d.\n",
str, num);
}
To use the atoi() function, you simply pass it the string containing the number you want to convert. The return value from the atoi() function is the converted integer value.
The following functions can be used to convert strings to numbers:
Function Name
Purpose
atof()
-
Converts a string to a double-precision floating-point value.
atoi()
-
Converts a string to an integer.
atol()
-
Converts a string to a long integer.
strtod()
-
Converts a string to a double-precision floating-point value and reports any "leftover" numbers that could not be converted.
strtol()
-
Converts a string to a long integer and reports any "leftover" numbers that could not be converted.
strtoul()
-
Converts a string to an unsigned long integer and reports any "leftover" numbers that could not be converted.
Sometimes, you might want to trap overflow errors that can occur when converting a string to a number that results in an overflow condition. The following program shows an example of the strtoul() function, which traps this overflow condition:
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
void main(void);
void main(void)
{
char* str = "1234567891011121314151617181920";
unsigned long num;
char* leftover;
num = strtoul(str, &leftover, 10);
printf("Original string: %s\n", str);
printf("Converted number: %lu\n", num);
printf("Leftover characters: %s\n", leftover);
}
In this example, the string to be converted is much too large to fit into an unsigned long integer variable. Thestrtoul() function therefore returns ULONG_MAX (4294967295) and sets the char* leftover to point to the character in the string that caused it to overflow. It also sets the global variable errno to ERANGE to notify the caller of the function that an overflow condition has occurred. The strtod() and strtol() functions work exactly the same way as the strtoul() function shown above. Refer to your C compiler documentation for more information regarding the syntax of these functions.
9. How can you tell whether two strings are the same?
The standard C library provides several functions to compare two strings to see whether they are the same. One of these functions, strcmp(), is used here to show how this task is accomplished:
#include <stdio.h>
#include <string.h>
void main(void);
void main(void)
{
char* str_1 = "abc";
char* str_2 = "abc";
char* str_3 = "ABC";
if (strcmp(str_1, str_2) == 0)
printf("str_1 is equal to str_2.\n");
else
printf("str_1 is not equal to str_2.\n");
if (strcmp(str_1, str_3) == 0)
printf("str_1 is equal to str_3.\n");
else
printf("str_1 is not equal to str_3.\n");
}
This program produces the following output:
str_1 is equal to str_2.
str_1 is not equal to str_3.
Notice that the strcmp() function is passed two arguments that correspond to the two strings you want to compare. It performs a case-sensitive lexicographic comparison of the two strings and returns one of the following values:
Return Value
Meaning
<0
-
The first string is less than the second string.
0
-
The two strings are equal.
>0
-
The first string is greater than the second string.
In the preceding example code, strcmp() returns 0 when comparing str_1 (which is "abc") and str_2 (which is "abc"). However, when comparing str_1 (which is "abc") with str_3 (which is "ABC"), strcmp() returns a value greater than 0, because the string "ABC" is greater than (in ASCII order) the string "abc".
Many variations of the strcmp() function perform the same basic function (comparing two strings), but with slight differences. The following table lists some of the functions available that are similar to strcmp():
Function Name
Description
strcmp()
-
Case-sensitive comparison of two strings
strcmpi()
-
Case-insensitive comparison of two strings
stricmp()
-
Same as strcmpi()
strncmp()
-
Case-sensitive comparison of a portion of two strings
strnicmp()
-
Case-insensitive comparison of a portion of two strings
Looking at the example provided previously, if you were to replace the call to strcmp() with a call to strcmpi() (a case-insensitive version of strcmp()), the two strings "abc" and "ABC" would be reported as being equal.
10. How do you print only part of a string?
The following program shows how to print only part of a string using the printf() function:
#include <stdio.h>
#include <string.h>
void main(void);
void main(void)
{
char* source_str = "THIS IS THE SOURCE STRING";
/* Use printf() to print the first 11 characters of source_str. */
printf("First 11 characters: '%11.11s'\n", source_str);
/* Use printf() to print only the
last 13 characters of source_str. */
printf("Last 13 characters: '%13.13s'\n",
source_str + (strlen(source_str) - 13));
}
This example program produces the following output:
First 11 characters: 'THIS IS THE'
Last 13 characters: 'SOURCE STRING'
The first call to printf() uses the argument "%11.11s" to force the printf() function to make the output exactly 11 characters long. Because the source string is longer than 11 characters, it is truncated, and only the first 11 characters are printed. The second call to printf() is a bit more tricky. The total length of the source_str string is calculated (using the strlen() function). Then, 13 (the number of characters you want to print) is subtracted from the total length of source_str.
This gives the number of remaining characters in source_str. This number is then added to the address of source_str to give a pointer to an address in the source string that is 13 characters from the end of source_str. By using the argument "%13.13s", the program forces the output to be exactly 13 characters long, and thus the last 13 characters of the string are printed.