C Programming Tutorial 5: Structures and Unions

1.0 Structures

A structure is a collection of related variables that describe a single entity. The variables are known as members of the structure and are stored in memory in the order in which they are declared. The size of a structure is the sum of the sizes of its members plus any padding that the compiler inserts for alignment. Structures are known as records in many programming languages. A common example of a structure is the payroll record data for an employee.

struct employee {
    char name [NAME_SIZE + 1];  // +1 for null character
    char address [ADDRESS_SIZE + 1];
    char dob [9]; // ddmmyyyy + null
    char id [ID_SIZE + 1];
    // ...
};

Notice the semicolon after the closing brace. A structure declaration like the one above declares a type or, to use the C terminology, defines a structure tag, employee. To define a variable, we put the variable name after the closing brace, or after struct and the tag, as in the examples below.

struct employee emp;   // defines emp with the structure struct employee

struct point {
    double x; // x coordinate
    double y; // y coordinate
} point; // defines point with structure struct point  
 
struct {
    char city [CITYNAME_SIZE + 1];
    double latitude;
    double longitude;
    float temperature;
    float wind_speed;
    float humidity;
    float dew_point;
} weather; // defines variable weather, struct tag is not mandatory                   

We can access the members of a structure using the dot operator. For example, the city member of the above structure can be accessed as weather.city.

strcpy (weather.city, "Delhi");
printf ("%s\n", weather.city);

The following operations are allowed on structures.

  1. Structures with the same tag can be assigned to each other.
  2. Structures can be passed to functions.
  3. Structures can be returned from functions.
  4. Structures can be initialized at the time of definition.

2.0 Pointers to structures

Just like other variables, we can define a pointer to a structure and initialize it with the address of a structure. C provides a special operator for accessing members of a structure using a pointer. It is ->, a minus sign followed by the greater than symbol. For example,

struct employee emp, *ptr;

strcpy (emp.name, "Alice");
ptr = &emp;

printf ("%s\n", ptr -> name);  //prints Alice

3.0 Arrays of structures

Consider the data for the students in a class. We could have an array for the names of the students, with as many elements as there are students in the class. Since array indices start from zero, we can follow the convention that the array index for a student is the student's roll number minus 1. Assuming the students study five subjects, we could have five more arrays for storing the marks obtained by the students in the respective subjects. We can see where all this is leading. Instead of keeping six parallel arrays, we can have a structure for a student, and an array of struct student for storing the data for the whole class.

#define NAME_SIZE 40
#define CLASS_SIZE 60

struct student {
    char name [NAME_SIZE + 1];
    int marks [5];
};

struct student student [CLASS_SIZE];

We can initialize the structures of the array student as shown for the first three elements below.

struct student student [CLASS_SIZE] = {
    {"Alice", {78, 56, 87, 76, 45}},   // inner braces enclose the marks array
    {"Bob",   {43, 65, 45, 88, 92}},
    {"Carol", {53, 95, 75, 78, 72}},
    // ...
};

int main (int argc, char **argv)
{
    printf ("%s\n", student [1].name);  // prints Bob
}

4.0 Self-referential structures

Self-referential structures have members which point to (other) structures with the same tag. For example, consider the structure for a node of a binary tree.

struct tnode {
    char *city;            // key
    double temperature;    // degrees Celsius
    struct tnode *left;
    struct tnode *right;
};

The last two members, left and right are pointers to the structure, struct tnode, itself. This helps in designing recursive algorithms for traversing the binary tree. A program using a binary tree for data storage and retrieval is given here.

5.0 Example program: Dictionary

As an example of structures, pointers and arrays, we have a dictionary program for the English language. Our dictionary program starts with a clean slate; there are no words in the dictionary at the start of the program. But you can add words to the dictionary, and once a word is added, you can look up its meaning later. The data structures used in the program can be represented graphically as in the figure below.

[Figure: Dictionary data structure]

These data structures are implemented with the following declarations.

#define MAX_WORDS 273001

typedef enum {
    NOUN = 1,
    PRONOUN,
    VERB,
    ADJECTIVE, 
    ADVERB, 
    PROPOSITION,
    CONJUNCTION,
    INTERJECTION
} Part_of_Speech;

struct word_meanings {
    Part_of_Speech pos;
    char **meanings;
};

struct word {
    char *word;
    char *pronunciation;
    struct word_meanings **word_meanings_arr;
    struct word *next;
};

struct word *words [MAX_WORDS];

The English language has about 273,000 words. We use a hashing algorithm for fast access to the words. To store or look up a given word, a hash function is applied to the word. The hash function uses a constant integer value, MAX_WORDS, which is a prime number close to the maximum number of keys. In this example, the total number of English words, or keys, is taken as 273000 and a prime number close to this is 273001. So, 273001 is the value of MAX_WORDS. The hash function takes a word as input and gives an integer index in the range 0 to (MAX_WORDS - 1) as output. We use this index for storing keys and the associated information.

It is possible that the hash function gives the same index for multiple keys. So the keys and the associated data are chained in a linked list at the array index given by the hash function. The main array, words, is an array of 273001 pointers to the structure struct word. The structure struct word comprises a pointer to the actual word, a pointer to its pronunciation (we keep the phonetic spelling, with the accented syllable in uppercase), a pointer to an array of pointers to struct word_meanings structures, and, since multiple keys can map to the same index, a pointer to the next word structure.

A word can have multiple parts of speech; for example, it can be a noun and also a verb. And, for each part of speech, it can have multiple (shades of) meanings. So, for each word, there is a NULL-terminated array of pointers to struct word_meanings structures, one for each part of speech. And, finally, each of these structures points to a NULL-terminated array of meanings. There are two main functions, one to search for a word and display its meanings and the other to store a word and its meanings. The program is as follows.

//
//           dictionary.c: English dictionary
//           
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

#define MAX_WORDS 273001

typedef enum {
    NOUN = 1,
    PRONOUN,
    VERB,
    ADJECTIVE, 
    ADVERB, 
    PROPOSITION,
    CONJUNCTION,
    INTERJECTION
} Part_of_Speech;

struct word_meanings {
    Part_of_Speech pos;
    char **meanings;
};

struct word {
    char *word;
    char *pronunciation;
    struct word_meanings **word_meanings_arr;
    struct word *next;
};

struct word *words [MAX_WORDS];

char buffer [128];

int meaning (char *word);
void enter_word (char *word);
int hash (char *word);
void error (char *msg);

int main (int argc, char **argv)
{
    /* initialize */
    for (int i = 0; i < MAX_WORDS; i++)
        words [i] = NULL;

    printf ("\n\nType a word: ");
    while (fgets (buffer, sizeof (buffer), stdin)) {
        // strip the trailing newline, if any
        buffer [strcspn (buffer, "\n")] = '\0';

        // find and print the meaning; offer to add the word if it is missing
        if (meaning (buffer) == -1)
            enter_word (buffer);
        printf ("\n\nType a word: ");
    }
    printf ("\n");
}

int meaning (char *word)
{
    int index = hash (word);
    bool found = false;
    struct word *ptr = words [index];

    while (ptr) {
        if (strcmp ((ptr -> word), word) == 0) {
            found = true;
	    break;
	}
        ptr = ptr -> next;
    }

    if (!found) 
	return -1;

    // display the meaning
    printf ("%s  [ %s ]\n", ptr -> word, ptr -> pronunciation);
    struct word_meanings **wm_ptr = ptr -> word_meanings_arr;

    while (*wm_ptr) {
	printf ("\n");
        switch ((*wm_ptr) -> pos) {
	    case NOUN:
                       printf ("Noun\n");
		       break;
	    case PRONOUN: 
                       printf ("Pronoun\n");
	               break;
	    case VERB: 
                       printf ("Verb\n");
		       break;
	    case ADJECTIVE: 
                       printf ("Adjective\n");
		       break;
	    case ADVERB:  
                       printf ("Adverb\n");
		       break;
	    case PROPOSITION: 
                       printf ("Preposition\n");
		       break;
	    case CONJUNCTION: 
                       printf ("Conjunction\n");
		       break;
	    case INTERJECTION:
                       printf ("Interjection\n");
		       break;
        }		
	char **meaning = (*wm_ptr) -> meanings;

	while (*meaning) 
            printf ("%s\n", *meaning++);
	wm_ptr++;
    }

    return 0;
}

void enter_word (char *word)
{
    char buf [128];

    printf ("No entry for %s in the dictionary. Do you wish create it? ", word);
    if (fgets (buf, sizeof (buf), stdin) == NULL)
	return;
    if (buf [0] != 'y' && buf [0] != 'Y')
        return;

    // Create an entry in the dictionary
    struct word *word_ptr;
    
    if ((word_ptr = malloc (sizeof (struct word))) == NULL)
	error ("malloc");

    if ((word_ptr -> word = malloc (strlen (word) + 1)) == NULL)
	error ("malloc");

    strcpy (word_ptr -> word, word);

    printf ("\n%s\n\n", word_ptr -> word);

    printf ("Pronunciation: ");
    if (fgets (buf, sizeof (buf), stdin) == NULL)
	return;

    int len = strlen (buf);

    if (buf [len - 1] == '\n')
        buf [len - 1] = '\0';

    if ((word_ptr -> pronunciation = malloc (strlen (buf) + 1)) == NULL)
	error ("malloc");
    strcpy (word_ptr -> pronunciation, buf);

    int parts_of_speech, num_meanings;

    bool got_it = false;

    while (!got_it) {
        printf ("How many parts of speech for this word? ");
        if (fgets (buf, sizeof (buf), stdin) == NULL)
            return;
        // reject input that is not a number, or is out of range
        if (sscanf (buf, "%d", &parts_of_speech) != 1 ||
            parts_of_speech < 1 || parts_of_speech > 5) {
            printf ("Up to 5 parts of speech allowed, try again\n");
            continue;
        }
        else
            got_it = true;
    }

    struct word_meanings **word_meanings_ptr;
    if ((word_meanings_ptr = malloc (sizeof (struct word_meanings *) * (parts_of_speech + 1))) == NULL)
	error ("malloc");

    word_ptr -> word_meanings_arr = word_meanings_ptr;

    for (int i = 0; i < parts_of_speech; i++)  {
	got_it = false;
	int choice;
	while (!got_it) {
            printf ("Enter Part of Speech: \n\n");
	    printf ("1 Noun\n2 Pronoun\n3 Verb\n4 Adjective\n5 Adverb\n6 Preposition\n7 Conjunction\n8 Interjection\n\n0 Done\n\n Enter Part of Speech: ");
            if (fgets (buf, sizeof (buf), stdin) == NULL)
	        return;
            // accept only a number in the range 1 to 8
            if (sscanf (buf, "%d", &choice) == 1 && choice > 0 && choice < 9)
                got_it = true;
            else
                printf ("Incorrect Part of Speech, try again\n\n");
	}
        struct word_meanings *word_meanings_ptr1;

	if ((word_meanings_ptr1 = malloc (sizeof (struct word_meanings))) == NULL)
	    error ("malloc");
	word_meanings_ptr1 -> pos = choice;
        word_meanings_ptr [i] = word_meanings_ptr1;
	got_it = false;

	while (!got_it) {
	    printf ("How many meanings? ");
            if (fgets (buf, sizeof (buf), stdin) == NULL)
	        return;
            // reject input that is not a number, or is out of range
            if (sscanf (buf, "%d", &num_meanings) != 1 ||
                num_meanings < 1 || num_meanings > 10) {
                printf ("Up to 10 meanings allowed, try again\n");
                continue;
            }
            else
                got_it = true;
        }

	char **meanings_ptr;

	if ((meanings_ptr = malloc (sizeof (char *) * (num_meanings + 1))) == NULL)
	    error ("malloc");

	for (int j = 0; j < num_meanings; j++) {
            printf ("Meaning %d: ", j + 1);
            if (fgets (buf, sizeof (buf), stdin) == NULL)
	        return;

            len = strlen (buf);

            if (buf [len - 1] == '\n')
                buf [len - 1] = '\0';

            char *cp;
	    if ((cp = malloc (strlen (buf) + 1)) == NULL)
		error ("malloc");
	    strcpy (cp, buf);
	    meanings_ptr [j] = cp;
	}
	meanings_ptr [num_meanings] = NULL;
	word_meanings_ptr1 -> meanings = meanings_ptr;
    }
    word_meanings_ptr [parts_of_speech] = NULL;

    // Link this word in the dictionary
    int index = hash (word);
    struct word *ptr;
    ptr = words [index];
    words [index] = word_ptr;
    word_ptr -> next = ptr;
}

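int hash (char *word)
{
    unsigned int sum = 0;

    // add the characters as unsigned values, so that the index
    // can never be negative, even if char is a signed type
    while (*word)
        sum += (unsigned char) *word++;
    return sum % MAX_WORDS;
}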

void error (char *msg)
{
    perror (msg);
    exit (1);
}

We can compile and run the above program.

$ gcc dictionary.c -o dictionary
$ ./dictionary

Type a word: truth
No entry for truth in the dictionary. Do you wish create it? y

truth

Pronunciation: trooth
How many parts of speech for this word? 1
Enter Part of Speech: 

1 Noun
2 Pronoun
3 Verb
4 Adjective
5 Adverb
6 Preposition
7 Conjunction
8 Interjection

0 Done

 Enter Part of Speech: 1
How many meanings? 4
Meaning 1: true facts
Meaning 2: conforming to fact
Meaning 3: being true to somebody
Meaning 4: something acknowledged to be true

Type a word: endeavor
No entry for endeavor in the dictionary. Do you wish create it? y

endeavor

Pronunciation: en-DEV-er
How many parts of speech for this word? 2
Enter Part of Speech: 

1 Noun
2 Pronoun
3 Verb
4 Adjective
5 Adverb
6 Preposition
7 Conjunction
8 Interjection

0 Done

 Enter Part of Speech: 1
How many meanings? 1
Meaning 1: a sincere effort
Enter Part of Speech: 

1 Noun
2 Pronoun
3 Verb
4 Adjective
5 Adverb
6 Preposition
7 Conjunction
8 Interjection

0 Done

 Enter Part of Speech: 3
How many meanings? 4
Meaning 1: to exert oneself
Meaning 2: to attempt
Meaning 3: to do something
Meaning 4: to work with purpose

Type a word: truth
truth  [ trooth ]

Noun
true facts
conforming to fact
being true to somebody
something acknowledged to be true

Type a word: endeavor
endeavor  [ en-DEV-er ]

Noun
a sincere effort

Verb
to exert oneself
to attempt
to do something
to work with purpose
...

6.0 Unions

Unions are variables that may hold values of different types and sizes at different times. For example,

union u_example {
    int ival;
    double dval;
    char cname [15];
};

A variable of type union u_example can hold an integer, a double or a character array, but only one of them at a time. The size of a union variable is the size of its largest member plus any padding required.

union u_example example;

printf ("sizeof (example) = %zu\n", sizeof (example));  // typically prints 16

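Without a designator, an initializer in braces applies to the first member of the union. To give a value to any other member, we can use an explicit assignment after the definition or, since C99, a designated initializer such as {.dval = 2.5}.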

union u_example example = {17};

printf ("%d\n", example.ival); //prints 17

union u_example bad_example = {"hello"}; // Error: a string cannot initialize the first member, ival

strcpy (example.cname, "Hello");
printf ("%s\n", example.cname); //prints "Hello"

union d_example {
    char cname [15];
    int ival;
    double dval;
};
union d_example oexample = {"hello"} ; // OK, cname is the first member
printf ("%s\n", oexample.cname); // prints hello

How does a program know which member of a union is being used? Programs using unions need to maintain a separate variable which keeps track of the actual member of a union being used. For example, the program given below keeps track of the value stored in the union variable example in another variable, etype.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum {
    INT,
    DOUBLE,
    CHAR
} Type;

union u_example {
    int ival;
    double dval;
    char cname [15];
};

int main (int argc, char **argv)
{
    union u_example example = {17};
    Type etype = INT;

    strcpy (example.cname, "Hello");
    etype = CHAR;

    // print the value of example - prints "Hello"
    switch (etype) {
        case INT: printf ("%d\n", example.ival);
                  break;
        case DOUBLE: printf ("%f\n", example.dval);
                  break;
        case CHAR: printf ("%s\n", example.cname);
                  break;
        default: printf ("Error\n");
    }
}