top

Search

C Tutorial

.

UpGrad

C Tutorial

Data Structures in C

What is a Data Structure?

A data structure is a specialised format for organizing and storing data. It's essentially a collection of data values, the relationships among them, and the functions or operations that can be applied to the data. Effective use of data structures enables efficient algorithms and drastically improves the performance of the software or an entire system.

Types of Data Structures

Data structures can be broadly classified into two categories:

1. Linear Data Structures: These are data structures where data elements are sequentially connected, where each element is connected to its previous and next elements. Examples include Arrays, Stacks, Queues, and Linked Lists.

2. Non-linear Data Structures: In non-linear data structures, data elements aren't sequentially connected and take a hierarchical structure. Examples include Trees and Graphs.

Below, let's dissect each type of data structure with relevant examples in C.

Array Data Structure in C

An array refers to a linear data structure storing a collection of elements (values or variables), each identified by at least one array index or key. Elements in an array are of the same type, and this type can be anything: integers, floats, objects, or even other arrays.

When defining an array in C, we specify its type and maximum length.

int array[5] = {1, 2, 3, 4, 5}; // An integer array of length 5

Here, we declare an integer array of length 5, with initial values from 1 to 5.

Stack Data Structure in C

A Stack is a type of linear data structure that adheres to the LIFO principle. This means that the most recently added element to the stack will be the first one to be removed.

Think of a stack as a stack of books. You can only add (push) a new book on top of the stack and remove (pop) the book that is currently on top of the stack.

In C, a stack can be implemented using an array and an integer representing the top index. The provided code demonstrates the implementation of a stack data structure, including functions for initialisation, retrieving the top element, pushing elements onto the stack, and popping elements from the stack. 

#define MAX 100

typedef struct {
    int top;
    int data[MAX];
} Stack;

void Stack_Init(Stack *S)
{
    S->top = -1;
}

int Stack_Top(Stack *S)
{
    return S->data[S->top];
}

void Stack_Push(Stack *S, int d)
{
    if (S->top == MAX - 1) { // stack is full
        printf("Error: stack overflow\n");
        return;
    } 

    S->data[++S->top] = d;
}

void Stack_Pop(Stack *S)
{
    if (S->top == -1) { // stack is empty 
        printf("Error: No element to pop\n");
        return;
    }

    S->top--;
}

Queue Data Structure in C

A Queue refers to a linear data structure in C which follows the FIFO principle (First In, First Out). This means that the element that is added first will be the first one to be removed. A queue can be visualised as a line of people waiting for a service, where the person who arrives first is served first.

A queue can be implemented in C using an array and two pointers: front and rear. The front pointer keeps track of the front element, and the rear pointer keeps track of the rear element of the queue. 

#define MAX_SIZE 100

typedef struct {
    int front, rear;
    int data[MAX_SIZE];
} Queue;

void Queue_Init(Queue *Q)
{
    Q->front = -1;
    Q->rear = -1;
}

int is_empty(Queue *Q)
{
    return Q->front == -1;
}

void enqueue(Queue *Q, int value)
{
    if (Q->rear == MAX_SIZE - 1) {
        printf("Error: Queue is full\n");
        return;
    }

    Q->data[++Q->rear] = value;

    if (Q->front == -1) {
        Q->front = 0;
    }
}

void dequeue(Queue *Q)
{
    if (is_empty(Q)) {
        printf("Error: Queue is empty\n");
        return;
    }

    if (Q->front == Q->rear) {
        Q->front = -1;
        Q->rear = -1;
    } else {
        Q->front++;
    }
}

The code defines a Queue structure with an array 'data' and two integers 'front' and 'rear' that represent the positions of the front and rear elements in the queue. Queue_Init initialises the queue, enqueue adds an element to the rear, and dequeue removes the front element.

Linked List Data Structure in C

A Linked List is a data structure consisting of a node sequence comprising data and a pointer to the next node in the sequence. Linked lists are not constrained by a fixed size like arrays and can dynamically grow or shrink as needed. Linked lists provide efficient insertion and deletion operations at any position within the list. 

In C, different types of linked lists can be implemented based on their structure and functionality. Let's explore some of the commonly used types:

Singly Linked List

A Singly Linked List is the most basic type of linked list. Each node in a singly linked list contains data and a pointer to the proceeding node present in the sequence. The last node's pointer points to NULL, indicating the end of the list. Singly linked lists allow traversal in one direction only, from the head (first node) to the tail (last node).

typedef struct Node {
    int data;
    struct Node* next;
} Node;

Doubly Linked List

A Doubly Linked List extends the functionality of a singly linked list by including an additional pointer in each node. In addition to the next pointer, each node in a doubly linked list also has a previous pointer that points to the previous node, further enabling traversal in both directions.

typedef struct Node {
    int data;
    struct Node* prev;
    struct Node* next;
} Node;

Circular Linked List

A Circular Linked List is a variation of a linked list in which the last node's pointer points back to the first node, forming a circular structure. Circular linked lists can either be singly or doubly linked. 

typedef struct Node {
    int data;
    struct Node* next;
} Node;

Skip List

A Skip List allows for efficient searching, insertion, and deletion operations in logarithmic time complexity. It achieves this efficiency by maintaining multiple layers of linked lists, with each layer skipping over a variable number of elements. Skip lists are particularly useful for maintaining sorted lists and performing fast search operations.

typedef struct SkipListNode {
    int data;
    struct SkipListNode* next;
    struct SkipListNode* down;
} SkipListNode;

Here’s a linked list example tp understand accurate implementation-

#include <stdio.h>
#include <stdlib.h>

typedef struct Node {
    int data;
    struct Node* next;
} Node;

Node* createNode(int value)
{

    Node* newNode = (Node*)malloc(sizeof(Node));
    newNode->data = value;
    newNode->next = NULL;
    return newNode;
}

void insertAtBeginning(Node** head, int value)
{
    Node* newNode = createNode(value);
    newNode->next = *head;
    *head = newNode;
}

void insertAtEnd(Node** head, int value)
{
    Node* newNode = createNode(value);
    if (*head == NULL) {
        *head = newNode;
    } else {
        Node* curr = *head;
        while (curr->next != NULL) {
            curr = curr->next;
        }
        curr->next = newNode;
    }
}

void displayList(Node* head)
{
    Node* curr = head;
    while (curr != NULL) {
        printf("%d ", curr->data);
        curr = curr->next;
    }
    printf("\n");
}

The code defines a Node structure for the linked list, where each node holds data and a pointer to the next node. Functions like createNode, insertAtBeginning, insertAtEnd, and displayList are used to create nodes, insert them at the beginning or end of the list, and display the list's elements.

Tree Data Structure in C

A Tree is a non-linear data structure composed of nodes that are connected by edges. In a tree, each node can have zero or more child nodes, forming a hierarchical structure. The topmost node in the tree is called the root, and each node can have a parent node (except the root) and zero or more child nodes.

In C, a tree can be implemented using different types, including -

Binary Tree

A Binary Tree is a tree data structure where each node can have at most two child nodes: a left child and a right child. The left child node is typically smaller or less than the parent node, while the right child node is greater or equal to the parent node. Binary trees are commonly used for efficient searching and sorting algorithms like binary search.

typedef struct TreeNode {
    int value;
    struct TreeNode* leftChild;
    struct TreeNode* rightChild;
} 

Binary Search Tree (BST)

A Binary Search Tree (BST) is a variant of a binary tree that maintains a specific order among its nodes. For any given node, all the nodes in its left subtree are smaller, and all the nodes in its right subtree are greater. This ordering property allows for efficient search, insertion and deletion operations. BSTs are widely used in database systems, symbol tables, and dynamic sets.

typedef struct TreeNode {
    int data;
    struct TreeNode* left;
    struct TreeNode* right;
} TreeNode;

AVL Tree

An AVL Tree is a self-balancing binary search tree in which the heights of any node's left and right subtrees differ by at most one. It ensures that the tree remains balanced. AVL trees provide fast search, insertion, and deletion operations, making them suitable for applications that require frequent modifications while maintaining balance.

typedef struct AVLNode {
    int value;
    struct AVLNode* leftChild;
    struct AVLNode* rightChild;
    int nodeHeight;
} AVLNode;

Red-Black Tree

A Red-Black Tree is another self-balancing binary search tree that guarantees logarithmic time complexity for search, insert, and delete operations. It uses a colouring scheme and a set of rules to ensure that the tree remains balanced. Each node is coloured either red or black, and the tree maintains balance by performing rotations and colour adjustments. Red-Black Trees are widely used in libraries and frameworks for efficient data storage and retrieval.

typedef enum { RED_COLOR, BLACK_COLOR } Color;

typedef struct RBNode {
    int value;
    Color nodeColor;
    struct RBNode* leftChild;
    struct RBNode* rightChild;
    struct RBNode* parentNode;
} RBNode;

B-Tree

A B-Tree is a self-balancing search tree that is designed to efficiently store large amounts of data on secondary storage devices. A B-Tree generalises the concept of a binary search tree by allowing multiple keys and child pointers per node. It maintains a balance between depth and number of keys, optimising disk I/O operations by reducing the number of access times.

typedef struct BTreeNode {
    int* keys;
    int t;
    struct BTreeNode** child;
    int n;
    bool leaf;
} BTreeNode;

Graph Data Structure in C

A Graph refers to a non-linear C data structure comprising a set of vertices (nodes) and edges that connect pairs of vertices. It is a powerful data structure used to represent relationships between different entities. Graphs can be classified as - directed (edges with a specific direction) or undirected (edges without a direction).

In C, a graph can be implemented using an adjacency matrix or an adjacency list. The adjacency matrix representation uses a 2D array to store the connections between vertices, while the adjacency list representation uses an array of linked lists.

Adjacency Matrix

An Adjacency Matrix is a 2D matrix representing the connections between vertices in a graph. The rows and columns of the matrix correspond to the vertices, and the entries indicate whether there is an edge between two vertices. 

#define MAX_VERTICES 100

typedef struct {
    int matrix[MAX_VERTICES][MAX_VERTICES];
    int numVertices;
} AdjacencyMatrixGraph;

Adjacency List

An Adjacency List is a data structure that represents a graph as an array of linked lists. Each element in the array represents a vertex, and the linked list associated with each vertex stores its adjacent vertices. The adjacency list representation is more space-efficient for sparse graphs, as it only stores existing connections. 

typedef struct Node {
    int dest;
    struct Node* next;
} Node;

typedef struct {
    int numVertices;
    Node** adjList;
} AdjacencyListGraph;

Incidence Matrix

An Incidence Matrix is a 2D matrix that represents a graph by indicating the presence or absence of edges between vertices and edges. The rows of the matrix represent the vertices, and the columns represent the edges. Each entry in the matrix can be a Boolean value indicating whether the vertex is connected to the edge. The incidence matrix representation is useful for graphs with a large number of edges but requires more space than the adjacency matrix for dense graphs.

#define MAX_VERTICES 100
#define MAX_EDGES 100

typedef struct {
    int matrix[MAX_VERTICES][MAX_EDGES];
    int numVertices;
    int numEdges;
} IncidenceMatrixGraph;

Edge List

An Edge List is a simple list that represents a graph by storing the edges as pairs of vertices. Each edge is represented by a tuple (u, v), where u and v are the vertices connected by the edge. The edge list representation is straightforward and efficient in terms of memory usage for sparse graphs. However, it may require a linear search for edge existence and may not support efficient neighbour traversal.

typedef struct {
    int numVertices;
    int numEdges;
    int** edgeList;
} EdgeListGraph;

These are some of the commonly used graph data structures in C. Each type has its own advantages and trade-offs in terms of space complexity, time complexity, and suitability for different graph operations. 

Conclusion

In this comprehensive guide, we explored various data structures in C, including arrays, stacks, queues, linked lists, trees, and graphs. Understanding these data structures and algorithms in C is essential for solving complex problems.

To further enhance your knowledge and skills in data structures, algorithms, and computer science, explore the Executive PG Program in Data Science course offered by upGrad. With this course, you can gain practical insights, hands-on experience, and industry-relevant knowledge to excel in your programming journey. 

FAQs

1. What is the difference between an array and a linked list?

An array is a fixed-size data structure that stores elements sequentially in memory, while a linked list is a dynamic data structure where elements are stored in separate nodes that are connected via pointers.

2. What is the time complexity of searching for an element in a binary search tree?

The time complexity for searching in a binary search tree is O(log n) on average, where n is the number of nodes in the tree.

3. What is the benefit of using an adjacency list to represent a graph?

The adjacency list representation is efficient for sparse graphs, as it only stores the connections between vertices, resulting in reduced memory usage.

Leave a Reply

Your email address will not be published. Required fields are marked *