58 Essential Data Structure Viva Questions + Sample Answers: 2025 Edition
By Rohit Sharma
Updated on May 21, 2025 | 47 min read | 13.15K+ views
Did you know? Recent breakthroughs in data structure technology are making it possible for systems to automatically swap out data structures in real-time! By tracking how data structures are used, these smart systems dynamically choose the most efficient options on the fly, slashing memory usage and boosting performance.
Data structures are at the heart of computer science, and understanding them thoroughly is key to succeeding in a viva. This blog covers a wide spectrum of topics, ranging from foundational elements like arrays and linked lists to more advanced structures such as trees, graphs, and hash tables. Each section is crafted to break down complex ideas into simple explanations, ensuring a deep understanding of the subject matter.
In this blog, you’ll explore 58 essential questions related to data structures. It also includes clear, practical answers designed to sharpen knowledge and enhance preparation for 2025 exams.
Want to deepen your understanding of data structures? Enroll in upGrad's Online Software Development Courses and get hands-on experience doing 5+ capstone projects with industry-leading tools and technologies. Join today!
This set of six beginner-friendly data structure and algorithm analysis viva questions is designed to help you understand fundamental concepts with ease. Use these questions to strengthen your understanding and prepare effectively for entry-level exams and interviews.
With the growing demand for skilled professionals, it's important to strengthen your abilities. Take a look at these top courses to refine your skills and advance your career.
Let’s explore the details with the following questions and answers.
How to Answer
Sample Answer
A data structure is a way of organizing and storing data so that we can perform operations like searching, inserting, or deleting efficiently. The choice of data structure is crucial because it can dramatically improve both time and memory usage.
For example, arrays allow quick access to elements using an index, but when frequent insertions or deletions are required, linked lists become more efficient as they allow dynamic changes without shifting all the elements in memory.
Become a top-tier full-stack developer by learning AI tools like OpenAI and GitHub Copilot, and work on hands-on projects. Join upGrad’s Full Stack Development Bootcamp and get personalized mentorship for career growth.
Also Read: What are Data Structures & Algorithm
How to Answer
Sample Answer
“Storage structure refers to how data resides in the computer’s main memory, such as arrays or linked lists that the program handles directly. File structure points to how data is organized on external media, like a hard drive.
In file structure, access times are usually slower, and we often rely on indexing and specialized formats to retrieve files efficiently.”
Key Differences:
| Aspect | Storage Structure (Main Memory) | File Structure (Secondary Storage) |
| --- | --- | --- |
| Location | Resides in the computer’s main memory | Stored on external devices (hard drives, SSDs, etc.) |
| Access Speed | Generally fast, with near-instant read/write times | Slower due to mechanical or interface limits |
| Volatility | Data is volatile and lost when power is off | Data persists until explicitly removed or overwritten |
| Organization | Uses arrays, stacks, or similar structures in RAM | Uses indices or file systems on disk |
| Common Usage | Real-time data processing and manipulation | Persistent storage, archival, and large file handling |
How to Answer
Sample Answer
Algorithmic complexity refers to how an algorithm’s time and space requirements grow as the input size increases. Time complexity measures the amount of time an algorithm takes to execute, while space complexity tracks the memory it consumes. If a program is processing large datasets, choosing the wrong algorithm can lead to significant performance issues.
For instance, an algorithm with O(n^2) time complexity will become slow as data grows, whereas an algorithm with O(log n) complexity will scale much better. Space complexity also matters: a data structure like a hash table can minimize search time but might require extra memory.
Also Read: Algorithm Complexity and Data Structure: Types of Time Complexity
How to Answer
Sample Answer
Big O, Big Omega, and Big Theta are used to describe an algorithm’s performance. Big O represents the worst-case scenario, describing the upper bound of an algorithm’s time or space complexity. Big Omega is the lower bound, showing the best-case scenario, while Big Theta indicates the exact or tight bound, where both the upper and lower bounds are the same.
These notations allow us to evaluate and compare the efficiency of different algorithms regardless of the underlying hardware, ensuring we select the most efficient approach based on our needs.
How to Answer
Sample Answer
Divide-and-conquer is an approach where a problem is broken down into smaller subproblems, which are solved independently, and then their results are combined to solve the original problem. This is powerful because by breaking the problem into smaller parts, we reduce the complexity of the task. Quick Sort and Merge Sort are two classic examples.
Quick Sort partitions data around a pivot, sorting each partition recursively, while Merge Sort divides the data in half and merges the sorted halves. Both algorithms exploit divide-and-conquer to achieve faster performance, especially for large datasets.
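To make the divide-and-conquer idea concrete, here is a minimal Merge Sort sketch in Python (the function name and structure are illustrative, not tied to any particular library):

```python
def merge_sort(arr):
    # Base case: a list of 0 or 1 elements is already sorted
    if len(arr) <= 1:
        return arr
    # Divide: split the list in half and sort each half recursively
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    # Conquer: merge the two sorted halves back together
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```

Each level of recursion does O(n) merging work across O(log n) levels, which is where the O(n log n) bound comes from.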
How to Answer
Sample Answer
A friend function is a function that is not a member of the class but is granted access to its private and protected members. Classes may declare functions or other classes as ‘friends’ when external functions need to interact with private data without exposing it publicly.
For example, in operator overloading, we often use friend functions to directly modify private variables of a class object, allowing efficient, readable code while preserving encapsulation.
Also Read: 74 C++ Interview Questions and Answers for 2025
An array is a basic data structure that holds elements in contiguous memory, allowing direct index-based operations. It is beneficial when random access matters, and there is no need for frequent insertions or deletions in the middle.
Arrays show up in many coding tasks, from storing sensor readings to representing sequences of objects. They also serve as a building block for more complex layouts.
The next few questions address array basics, handling duplicates, and more specialized scenarios such as sparse storage.
How to Answer
Sample Answer
An array is a collection of elements stored in contiguous memory locations, where each element is directly accessible by its index in constant time, O(1). In contrast, a linked list consists of nodes scattered in memory, where each node contains data and a reference to the next node.
While arrays excel at random access, linked lists are better suited for scenarios where frequent insertions and deletions are required because they don’t require shifting elements as arrays do.
Advance your understanding of machine learning algorithms with upGrad’s Executive Diploma in Machine Learning and AI. Gain practical, hands-on experience that bridges theory and application, preparing you for in-demand roles like AI Engineer and ML Specialist.
How to Answer
Sample Answer
In row-major order, the elements of a 2D array are stored sequentially by row. For example, the first row is stored first, followed by the second row, and so on. In column-major order, the elements of each column are stored together, and the next column is stored after the current one. This choice impacts memory access patterns: row-major order works better when iterating over rows, as accessing elements in sequence reduces cache misses. Column-major order is preferred when operations require accessing elements by column, as it aligns better with the data layout.
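The two layouts come down to a simple offset formula, sketched below (the helper names are illustrative):

```python
def row_major_index(row, col, num_cols):
    # In row-major layout, rows are laid out one after another,
    # so an element's flat offset is: row * num_cols + col
    return row * num_cols + col

def col_major_index(row, col, num_rows):
    # In column-major layout, columns are contiguous instead:
    # offset is col * num_rows + row
    return col * num_rows + row

# A 2x3 grid stored row-major: [a00, a01, a02, a10, a11, a12]
# a12 (row 1, col 2) sits at flat offset 1*3 + 2 = 5
print(row_major_index(1, 2, 3))  # 5
```

Iterating in the same order the elements are laid out keeps memory accesses sequential, which is why the layout choice affects cache behavior.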
How to Answer
Sample Answer
Multidimensional arrays are useful when working with data that spans multiple dimensions, like images or grids. For example, in my previous project, I worked on a simulation that tracked the movement of particles in a 3D space. We used a 3D array to store the position, velocity, and acceleration of each particle across three spatial dimensions. This structure allowed us to efficiently update and query the properties of each particle, optimizing the overall performance of the simulation.
Also Read: Multidimensional Array in Java
How to Answer
Sample Answer
One approach to finding duplicates is to first sort the array, which takes O(n log n) time, then perform a linear scan to detect adjacent duplicate values. Another approach is to use a hash set, which can check for duplicates in O(n) time on average, though it requires extra memory. Sorting reduces memory usage but takes longer to execute, while the hash set method is faster but requires additional space to store the elements seen so far.
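Both approaches can be sketched in a few lines of Python (the function names are illustrative):

```python
def has_duplicates_sorted(arr):
    # O(n log n): sort first, then any duplicates become neighbours
    s = sorted(arr)
    return any(s[i] == s[i + 1] for i in range(len(s) - 1))

def has_duplicates_hashed(arr):
    # O(n) on average: remember what we've already seen in a set
    seen = set()
    for x in arr:
        if x in seen:
            return True
        seen.add(x)
    return False
```

The first trades time for memory; the second trades memory for time, which mirrors the discussion above.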
How to Answer
Sample Answer
A sparse array is one where most of the elements are default values, such as zero, and only a few elements contain meaningful data. Instead of allocating memory for all positions, sparse arrays store only the non-default values along with their indices.
This is especially useful in situations where the data is sparse, such as representing a chessboard, where most squares are empty. Sparse arrays save memory and allow us to efficiently handle large datasets with many empty or default values.
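A dictionary keyed by index is one simple way to sketch a sparse array in Python (the class name and chessboard example are illustrative):

```python
class SparseArray:
    """Stores only non-default values, keyed by index."""
    def __init__(self, length, default=0):
        self.length = length
        self.default = default
        self.data = {}  # index -> value, only for non-default entries

    def __setitem__(self, i, value):
        if value == self.default:
            self.data.pop(i, None)  # storing the default frees the slot
        else:
            self.data[i] = value

    def __getitem__(self, i):
        return self.data.get(i, self.default)

board = SparseArray(64)     # a mostly-empty chessboard
board[0] = 'rook'
print(board[0], board[63])  # rook 0
print(len(board.data))      # 1 slot actually stored
```

Memory use grows with the number of meaningful entries, not with the logical length of the array.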
Also Read: A Guide to Sparse Matrix Representation with Examples
Linked lists use pointers to connect nodes one by one, which makes insertion or deletion less dependent on shifting large blocks of elements. This model differs from arrays, where every item lives in a contiguous space in memory.
Linked lists in data structure also come in single, doubly, and circular variants, each with a unique way of linking nodes.
The questions below look at how these lists work, why they’re valuable, and when they’re a better choice than an array.
How to Answer
Sample Answer
A singly linked list is a sequence of nodes where each node contains data and a pointer to the next node. This structure does not require contiguous memory, making it flexible for dynamic changes. Insertions and deletions can be performed in constant time (O(1)) if the relevant pointer references are updated correctly.
In contrast, an array requires shifting elements to maintain order when an insertion or deletion occurs, making operations in the middle of the array more costly, especially as the array grows in size. Therefore, while linked lists excel in dynamic growth and frequent modifications, arrays are better when random access is needed, due to their O(1) access time.
Struggling with DSA interview questions? Master them for free with upGrad’s Data Structures and Algorithms course. Learn key concepts of data structure and explore real-world applications. Start learning now!
How to Answer
Sample Answer
One of the most common techniques for detecting a loop in a singly linked list is Floyd's cycle-finding algorithm, also known as the tortoise and hare method. It uses two pointers: a slow pointer, which moves one node at a time, and a fast pointer, which moves two nodes at a time. If there is a loop, the fast pointer will eventually meet the slow pointer within the cycle. Another alternative is to store each visited node’s address in a set. If a node is encountered again, a loop exists. While Floyd’s algorithm runs in O(n) time with O(1) space, the set-based method requires extra space for the set.
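Floyd's algorithm is short enough to sketch directly (the node class here is a minimal illustration):

```python
class ListNode:
    def __init__(self, value):
        self.value = value
        self.next = None

def has_loop(head):
    slow = fast = head
    while fast and fast.next:
        slow = slow.next        # tortoise moves one node
        fast = fast.next.next   # hare moves two nodes
        if slow is fast:        # the pointers can only meet inside a cycle
            return True
    return False
```

If the list ends (the fast pointer hits None), there is no loop; if a cycle exists, the fast pointer laps the slow one, so they must eventually coincide.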
How to Answer
Sample Answer
A doubly linked list is a type of linked list where each node has two pointers: one pointing to the next node and another pointing to the previous node. This enables easy traversal in both forward and backward directions. In contrast, a singly linked list only stores a reference to the next node, making backward traversal more complex and inefficient. However, each node in a doubly linked list requires extra memory to store the second pointer, which is a trade-off for the added flexibility in traversal.
How to Answer
Sample Answer
A singly linked list uses less memory because each node only stores a single pointer. It is easier to implement and uses less space, making it more memory-efficient. However, insertion or deletion in the middle requires traversal from the head of the list.
In contrast, a doubly linked list uses more memory since each node stores two pointers (next and previous), but this allows faster insertion and deletion at both ends and in the middle because the previous node is easily accessible. When it comes to searching, both have a time complexity of O(n) unless additional indexing is applied, but doubly linked lists make it easier to navigate backward.
How to Answer
Sample Answer
A linked list is ideal when frequent insertions and deletions are required, especially in the middle of the structure, or when the exact size of the data is not known in advance. Linked lists excel at dynamic memory allocation without the need for resizing or shifting elements.
Conversely, an array would be preferred when random access to elements is crucial, as accessing an element by index in an array is an O(1) operation. Arrays are also the better choice when memory layout is important, as their contiguous structure benefits algorithms like sorting that rely on consecutive memory allocation for optimal performance.
📌 Did You Know? According to academic forums, Linked Lists, Stacks & Queues, and Tree Traversals are among the top 5 most frequently asked viva questions in Data Structures.
Stacks follow a simple last-in, first-out approach. Items enter at the top, and the last one added is the first removed. This concept appears in undo mechanisms, expression parsing, and many other computing tasks. Operations such as push and pop happen at a single end, making stack-based logic clear and often efficient.
The next set of questions explores how stacks work, why they handle certain jobs so well, and how they manage function calls in programs.
How to Answer
Sample Answer
A stack is a collection of elements where the most recently added element is the first to be removed. This is known as LIFO (Last-In, First-Out). Think of a stack of plates at a buffet: you add plates to the top and remove the top plate when you need one. The last plate you put on the stack is the first one you’ll take off. This model illustrates how stacks work—only the top element is accessible at any given time, which is why operations like push and pop occur at the top of the stack.
How to Answer
Sample Answer
Typical operations on a stack include push(x) to add an element to the top, pop() to remove the top element, and peek() to view the top element without removing it. Both push and pop are typically O(1) because they involve adding or removing an item from a single end (the top of the stack), without shifting any other elements. The constant-time performance is achieved because only the top element is modified, and no traversal or rearrangement of the stack is required.
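As a quick sketch, a plain Python list can stand in for a stack, with `append` and `pop` acting on the top (an illustrative idiom rather than a dedicated stack class):

```python
stack = []             # a plain Python list used as a stack
stack.append(10)       # push(10)
stack.append(20)       # push(20)
top = stack[-1]        # peek() -> 20, the top stays on the stack
popped = stack.pop()   # pop()  -> 20, removed from the top
# append and pop both act on one end only, so each is O(1) amortized
```

Because every operation touches only the last element, no shifting or traversal is ever needed.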
How to Answer
Sample Answer
“A stack is frequently used in evaluating arithmetic expressions like converting from infix to postfix notation. When parsing an expression, the stack holds operators and ensures the correct precedence is maintained. For example, in converting an infix expression like (3 + 5) * 2 to postfix, the stack temporarily holds the operators to ensure they are applied in the correct order.
Similarly, balanced parentheses checking relies on stacks. Each time an opening bracket is encountered, it’s pushed onto the stack, and when a closing bracket appears, the stack is popped to ensure the parentheses match. If the stack is empty at the end, the parentheses are balanced.”
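The balanced-parentheses check described above can be sketched like this (the function name is illustrative):

```python
def is_balanced(expr):
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in expr:
        if ch in '([{':
            stack.append(ch)  # push every opening bracket
        elif ch in pairs:
            # a closing bracket must match the most recent opener
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # balanced only if nothing is left over

print(is_balanced('(3 + 5) * [2]'))  # True
print(is_balanced('(]'))             # False
```

The stack naturally enforces nesting: the most recently opened bracket must be the first one closed, which is exactly LIFO.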
How to Answer
Sample Answer
“Stack underflow occurs when there is an attempt to pop an element from an empty stack. It typically results in an error, as there are no elements to remove. Stack overflow, on the other hand, happens when a push operation is attempted on a stack that is already at its maximum capacity, leading to memory exhaustion.
Some high-level implementations resize the stack automatically to prevent overflow. Both conditions indicate that the stack’s boundaries have been exceeded, and appropriate checks need to be in place to prevent them.”
Also Read: Overflow And Underflow in C
How to Answer
Sample Answer
“In programming languages, a call stack is used to manage function calls. When a function is called, the call stack stores information about the function, such as local variables and the return address. As the function completes, this information is popped from the stack, and control is transferred back to the previous function. This LIFO structure ensures that the most recent function call is always completed first. The call stack helps manage the execution order and ensures that each function’s state is preserved separately.”
Also Read: How to Implement Stacks in Data Structure? Stack Operations Explained
Queues arrange data so that the first item placed is the first one removed. This style suits scenarios where requests or tasks must be processed in arrival order. Unlike stacks, queues insert elements at one end and remove them from the other. Many operating systems employ queues for scheduling, and various applications rely on them to maintain consistent data flow.
The questions below highlight several queue features and variants, including circular queues and deques.
How to Answer
Sample Answer
“A queue processes elements in the order they enter, known as FIFO. This differs from a stack’s LIFO model, which removes the most recent item first. A queue typically adds data at the rear and removes it from the front, making it better suited for situations that must follow a strict arrival order, such as print jobs or network packets.”
Key differences between a queue and a stack:
| Aspect | Queue (FIFO) | Stack (LIFO) |
| --- | --- | --- |
| Main Principle | First-In, First-Out: earliest item inserted leaves first | Last-In, First-Out: most recent item inserted leaves first |
| Insertion Point | Rear (enqueue) | Top (push) |
| Removal Point | Front (dequeue) | Top (pop) |
| Real-World Use | Print jobs, task scheduling, buffering | Function call stack, undo mechanisms, expression parsing |
| Access Pattern | Strictly from opposite ends | Single point for both add/remove |
How to Answer
Sample Answer
“Enqueue places a new element at the back (rear) of the queue, while dequeue removes an element from the front. A linked-list implementation maintains two pointers: one for the head (front) and one for the tail (rear).
In an array-based queue (especially a circular version), two indices track the front and rear. Storing a new element is as simple as placing it at queue[rear] and incrementing rear by one (possibly wrapping around in a circular fashion).
Removing an item increments the front. Neither action requires shifting the entire array, so each operation completes in O(1).”
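The array-based circular version described above can be sketched as follows (the class and method names are illustrative):

```python
class CircularQueue:
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.front = 0   # index of the oldest element
        self.size = 0    # how many slots are occupied

    def enqueue(self, x):
        if self.size == len(self.buf):
            raise OverflowError("queue full")
        # rear position wraps around the end of the array
        rear = (self.front + self.size) % len(self.buf)
        self.buf[rear] = x
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("queue empty")
        x = self.buf[self.front]
        self.front = (self.front + 1) % len(self.buf)  # wrap around
        self.size -= 1
        return x
```

The modulo arithmetic is what lets both indices "wrap" so that freed slots at the front are reused, keeping every operation O(1) with no shifting.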
How to Answer
Sample Answer
"A normal queue has a fixed size, and when elements are dequeued, the space at the front of the queue becomes wasted, even though there may still be room at the rear. This can lead to inefficient memory use, especially when the queue is not completely full.
A circular queue, however, resolves this by allowing the rear pointer to 'wrap around' to the front of the array once space becomes available. This approach ensures that all allocated memory is used efficiently, as the elements fill the queue in a continuous circular manner. This prevents wasted space and helps maximize the buffer usage."
Also Read: Difference Between Circular Queue and Linear Queue
How to Answer
Sample Answer
"A deque (double-ended queue) is a data structure that allows insertion and removal of elements from both the front and the rear, unlike a standard queue, which only supports insertion at the rear and removal at the front. This flexibility makes a deque ideal for situations where we need to perform operations at both ends efficiently.
For instance, a deque is often used in scheduling algorithms where tasks can be added or removed from either end based on priority. This is in contrast to a standard queue, which processes elements strictly in the order they were added."
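Python's standard library ships a ready-made deque in `collections`, which makes the both-ends behavior easy to demonstrate (the task names are illustrative):

```python
from collections import deque

tasks = deque()
tasks.append('low')         # insert at the rear, like a normal queue
tasks.appendleft('urgent')  # a deque can also insert at the front
first = tasks.popleft()     # remove from the front -> 'urgent'
last = tasks.pop()          # remove from the rear  -> 'low'
```

All four operations run in O(1), which is what makes deques attractive for scheduling at either end.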
Also Read: Deque interface in Java with Example
How to Answer
Sample Answer
"Queues are essential in systems where tasks need to be processed in the order they arrive. For example:
Hashing maps data to fixed-size indexes, which allows rapid access and updates when the hash function distributes keys evenly. This approach gives near O(1) lookups on average, but collisions can still arise if two keys produce the same index. Different collision-resolution methods limit performance loss and maintain quick retrieval.
The following questions cover hash functions, collision handling, rehashing, and typical software cases where hash tables excel.
How to Answer
Sample Answer
"Hashing is a technique used to map data (such as a string or integer) to a specific index in a hash table using a hash function. A good hash function distributes the keys evenly across the table, which minimizes the number of collisions (when two keys map to the same index). This allows for quick constant-time (O(1)) lookups, as once the key is hashed, we can directly access the value stored at that index. However, if the hash function leads to a poor distribution of keys, many collisions can occur, slowing down the lookup process."
Also Read: A Comprehensive Guide on Hashing in Data Structures
How to Answer
Sample Answer
"A collision occurs when two different keys hash to the same index in a hash table. To handle collisions, we use strategies like separate chaining and open addressing. In separate chaining, each index in the hash table points to a linked list that stores all the keys hashing to that index. In open addressing, when a collision occurs, the algorithm probes the next available spot in the table (using linear or quadratic probing) to find an open slot for the key."
How to Answer
Sample Answer
"In separate chaining, when multiple keys hash to the same index, they are stored in a linked list at that index. This allows the hash table to store an arbitrary number of keys at each index but comes at the cost of extra memory for the linked lists. In open addressing, all keys are stored within the table itself. When a collision occurs, the algorithm searches for the next available slot (using probing) until it finds an open position. While open addressing uses less memory, it can suffer from clustering where groups of keys are placed in contiguous slots, degrading performance."
How to Answer
Explain the concept of load factor as a ratio. Mention what happens when it exceeds a threshold, and how this affects collisions, rehashing, and lookup efficiency.
Sample Answer
“The load factor is the ratio of stored elements to the total capacity of the table (for example, the number of keys divided by the number of buckets). A higher ratio can mean more collisions.
Many implementations pick a threshold (like 0.75). When the load factor goes beyond that level, the table is resized — this process is called rehashing. Rehashing redistributes all existing keys into a bigger array, restoring efficient lookups at the cost of a one-time overhead during resizing.”
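The ratio and the resize trigger can be sketched as follows, assuming a chained table where each bucket is a list of (key, value) pairs (function names and the 0.75 threshold are illustrative):

```python
def load_factor(num_keys, num_buckets):
    # ratio of stored keys to available buckets
    return num_keys / num_buckets

def maybe_rehash(buckets, num_keys, threshold=0.75):
    # If the table is too full, double the bucket count and
    # redistribute every (key, value) pair into the new array.
    if load_factor(num_keys, len(buckets)) <= threshold:
        return buckets
    new_buckets = [[] for _ in range(len(buckets) * 2)]
    for chain in buckets:
        for key, value in chain:
            new_buckets[hash(key) % len(new_buckets)].append((key, value))
    return new_buckets
```

Every key must be re-indexed because the modulus changes with the table size, which is the one-time cost of rehashing mentioned above.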
How to Answer
Point to real-world systems where fast key-based access is essential, such as compilers and caching. Emphasize why hash tables are used based on time complexity benefits.
Sample Answer
“Hash tables often power language compilers, storing identifiers in a symbol table for quick lookup. They also serve as the basis for many caching systems, where items are stored and retrieved by key at high speed. These scenarios leverage the near O(1) average lookup to keep operations efficient and reduce response times.”
Also Read: Hash Tables and Hash Maps in Python
Trees store information in a hierarchical manner, starting with a root and branching into various child nodes. This structure suits problems involving hierarchical data, such as file systems and search operations. Some trees allow quick lookups by enforcing ordering rules, while others focus on maintaining balance for consistent performance.
The following questions explore binary tree questions, binary search trees, and self-balancing trees used in large-scale applications.
How to Answer
Define a binary tree with its child constraints. Compare it to a general tree to highlight what makes binary trees simpler for certain algorithms.
Sample Answer
“A binary tree is a tree structure where each node has at most two children: a left child and a right child. This contrasts with a general tree, which can have an arbitrary number of children for each node.
The binary constraint simplifies certain algorithms and data-handling tasks, making it easier to implement operations like tree traversals or specialized forms of searching. A general tree could have multiple child pointers, but a binary tree always limits each node to two.”
Also Read: Binary Tree in Data Structure: Properties, Types, Representation & Benefits
How to Answer
Explain the BST rule involving left and right subtrees. Connect this ordering to efficient search by halving the problem space at each level.
Sample Answer
“A BST enforces a special ordering: all values in the left subtree of a node must be smaller than the node’s key, and all values in the right subtree must be larger. This rule applies recursively throughout the tree. Because of this arrangement, lookups compare a target value to the current node and decide whether to branch left or right.
In a balanced BST, this cuts the search space roughly in half with each comparison, leading to an average of O(log n) time. A general binary tree lacks this ordering, so a search might inspect most or all nodes.”
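The ordering rule translates directly into short insert and search routines, sketched here (the names are illustrative):

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def bst_insert(root, key):
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)   # smaller keys go left
    else:
        root.right = bst_insert(root.right, key) # larger keys go right
    return root

def bst_search(root, key):
    while root:
        if key == root.key:
            return True
        # each comparison discards one whole subtree
        root = root.left if key < root.key else root.right
    return False
```

Each comparison discards an entire subtree, which is what yields O(log n) lookups when the tree stays balanced.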
Also Read: Binary Tree vs Binary Search Tree: Difference Between Binary Tree and Binary Search Tree
How to Answer
List the three traversal types with their visiting orders. For inorder, mention its use with BSTs for sorted output.
Sample Answer
“Tree traversals define the order in which nodes are visited. In a preorder traversal, the root is visited first, then the left subtree, then the right subtree. In an inorder traversal, the left subtree is visited first, then the root, then the right subtree. In a postorder traversal, both subtrees are visited before the root.
Inorder traversal is often useful for situations like printing the contents of a BST in ascending order because it naturally visits nodes in sorted order based on the BST’s arrangement.
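The three traversals differ only in where the root visit happens, as this sketch shows (the node class is a minimal illustration):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def preorder(node, out):
    if node:
        out.append(node.key)       # visit the root first
        preorder(node.left, out)
        preorder(node.right, out)

def inorder(node, out):
    if node:
        inorder(node.left, out)
        out.append(node.key)       # root between the two subtrees
        inorder(node.right, out)

def postorder(node, out):
    if node:
        postorder(node.left, out)
        postorder(node.right, out)
        out.append(node.key)       # root last

root = Node(2, Node(1), Node(3))   # a tiny BST
```

On this tree, inorder yields [1, 2, 3], the BST's keys in ascending order.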
How to Answer
Define AVL as a self-balancing BST. Explain the role of height balancing and rotations, and how they maintain optimal time complexity.
Sample Answer
“An AVL tree is a self-balancing binary search tree. It monitors the heights of the left and right subtrees for every node and maintains their difference at no more than one. If an operation like insertion or deletion causes an imbalance, the tree rebalances itself through rotations.
Keeping the height close to log n prevents degenerate cases that can degrade performance in a normal BST, thus preserving O(log n) lookups, insertions, and deletions on average."
Here’s sample code (Python) illustrating rebalancing on insert:
The code manages a self-balancing Binary Search Tree. Each node tracks its own height, and insertion follows normal BST rules before checking balance factors. If the tree is unbalanced, rotations adjust local pointers to keep the height near log(n).
class AVLNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1

def get_height(root):
    if not root:
        return 0
    return root.height

def get_balance(root):
    if not root:
        return 0
    return get_height(root.left) - get_height(root.right)

def right_rotate(z):
    y = z.left
    T3 = y.right
    # Perform rotation
    y.right = z
    z.left = T3
    # Update heights
    z.height = 1 + max(get_height(z.left), get_height(z.right))
    y.height = 1 + max(get_height(y.left), get_height(y.right))
    # Return the new root
    return y

def left_rotate(z):
    y = z.right
    T2 = y.left
    # Perform rotation
    y.left = z
    z.right = T2
    # Update heights
    z.height = 1 + max(get_height(z.left), get_height(z.right))
    y.height = 1 + max(get_height(y.left), get_height(y.right))
    # Return the new root
    return y

def insert(root, key):
    # 1. Regular BST insertion
    if not root:
        return AVLNode(key)
    elif key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)

    # 2. Update this node's height
    root.height = 1 + max(get_height(root.left), get_height(root.right))

    # 3. Get the balance factor
    balance = get_balance(root)

    # 4. Rebalance if needed
    # Case 1: Left Left
    if balance > 1 and key < root.left.key:
        return right_rotate(root)
    # Case 2: Right Right
    if balance < -1 and key > root.right.key:
        return left_rotate(root)
    # Case 3: Left Right
    if balance > 1 and key > root.left.key:
        root.left = left_rotate(root.left)
        return right_rotate(root)
    # Case 4: Right Left
    if balance < -1 and key < root.right.key:
        root.right = right_rotate(root.right)
        return left_rotate(root)

    return root
How to Answer
Define B-trees in terms of structure and purpose. Highlight their relevance in disk-based systems and large-scale indexing.
Sample Answer
“A B-tree is a self-balancing tree that generalizes the concept of a binary search tree to allow multiple keys per node, along with multiple child pointers. It’s designed to optimize disk access by reducing the number of read/write operations required when searching or updating.
In a B+ tree, all actual data records appear in leaf nodes, linked for sorted access. B-trees and B+ trees typically appear in databases and file systems where large volumes of data must be indexed efficiently, and block-based I/O performance matters.”
Graphs in data structure capture relationships between items in a flexible structure, allowing multiple connections (edges) between different points (vertices). They can represent anything from roads on a map to user connections on a social platform. Unlike trees with a clear parent-child hierarchy, graphs may contain cycles or complex linkages.
The questions below explore various graph representations, search algorithms, and methods for finding paths and spanning trees.
How to Answer
Define a graph in terms of nodes and edges. Contrast it with trees by highlighting the absence of cycles in trees and the presence of a strict parent-child relationship. Mention real-world applications for both.
Sample Answer
“A graph is a set of vertices (nodes) connected by edges, which may form loops or intricate paths. A tree is a special case of a graph that has no cycles and exactly one path between any two nodes.
In a tree, each node (except the root) has exactly one parent, while a graph can have multiple ways to reach the same vertex, potentially including cycles.”
Key differences between a Graph and a Tree:
| Aspect | Graph | Tree |
| --- | --- | --- |
| Structure | Consists of vertices and edges, possibly including cycles | A connected acyclic graph with exactly one path between any two nodes |
| Cycles | May have cycles | No cycles; any loop breaks it from being a tree |
| Connectivity | Might be fully connected, partially connected, or disconnected | Always connected when considered a valid tree (except an empty tree) |
| Parent-Child Links | Not strictly defined | Each node (except root) has exactly one parent |
| Typical Use Cases | Social networks, road maps, dependency graphs | Hierarchies, file directory structures, family trees |
How to Answer
Explain both structures and how they represent connections. Emphasize when one saves more space, especially in sparse graphs. Mention the memory vs. access-time trade-offs.
Sample Answer
“An adjacency list keeps a list of neighboring vertices for each node. If node A connects to node B and node C, those neighbors appear in A’s list. An adjacency matrix uses a 2D array where entry [i][j] indicates whether there is an edge between node i and node j.
When the graph is sparse, adjacency lists save space by storing only existing edges, whereas an adjacency matrix always allocates memory for every possible edge, even if many remain unused.”
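To make the contrast concrete, here is a minimal Python sketch; the vertex names and edges are invented for illustration:

```python
# The same small undirected graph, stored two ways.
# Vertices: A, B, C; edges: A-B and A-C.

# Adjacency list: each vertex maps only to its neighbors.
adj_list = {
    "A": ["B", "C"],
    "B": ["A"],
    "C": ["A"],
}

# Adjacency matrix: one row/column per vertex; 1 marks an edge.
#            A  B  C
adj_matrix = [
    [0, 1, 1],  # A
    [1, 0, 0],  # B
    [1, 0, 0],  # C
]

# Edge lookup is O(1) in the matrix but O(degree) in the list;
# the list, however, stores only the edges that actually exist.
print("B" in adj_list["A"])   # True
print(adj_matrix[0][1] == 1)  # True
```

For a sparse graph with millions of vertices, the matrix would allocate one cell per *possible* edge, while the list grows only with the edges present.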
How to Answer
Define each type based on edge direction. Provide simple real-world analogies like one-way and two-way roads. Give a specific example of systems where each graph type is used.
Sample Answer
“A directed graph has edges that point from one vertex to another, like a one-way street where traffic can only move in one direction. An undirected graph treats connections as two-way, similar to a road allowing traffic in both directions.
A social network where someone can follow another user without a reciprocal follow might be modeled by a directed graph, whereas a basic friendship link often suits an undirected graph because each connection goes both ways by default.”
Also Read: Types of Graphs in Data Structure & Applications
How to Answer
Define both traversal algorithms in terms of their strategy (level-order vs. depth-first). Link each to a practical scenario like shortest path or cycle detection.
Sample Answer
“Breadth-First Search explores nodes level by level, starting at a source vertex and visiting all its immediate neighbors before moving on. This strategy often helps find the shortest path in an unweighted graph. Depth-First Search moves along one branch deeply before backtracking, which suits detecting cycles or exploring all connected components quickly.
A shortest-path problem might use BFS, while a cycle-finding algorithm or path-based puzzle can favor DFS.”
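The two traversals can be sketched as follows; the small graph and its node names are assumptions for illustration only:

```python
from collections import deque

graph = {  # illustrative directed graph
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}

def bfs(start):
    """Visit nodes level by level using a FIFO queue."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order

def dfs(start, seen=None):
    """Go deep along one branch before backtracking (recursive)."""
    if seen is None:
        seen = set()
    seen.add(start)
    order = [start]
    for nxt in graph[start]:
        if nxt not in seen:
            order.extend(dfs(nxt, seen))
    return order

print(bfs("A"))  # ['A', 'B', 'C', 'D']
print(dfs("A"))  # ['A', 'B', 'D', 'C']
```

Note how BFS reaches D last (it sits one level deeper), while DFS reaches D before C by following the A→B→D branch to the end first.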
How to Answer
Summarize the main mechanics of Dijkstra’s (greedy selection + distance updates). Mention that it fails with negative weights and explain why.
Sample Answer
“Dijkstra’s algorithm tracks distances from a start node to all others by picking the node with the smallest known distance and examining its neighbors to see if a better path exists. It uses a priority queue (or min-heap) to efficiently choose the next closest node.
The algorithm fails on graphs with negative edge weights because it assumes that once a node’s minimum distance is finalized, it will never be improved, which doesn’t hold when negative values can reduce path costs later.”
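A minimal sketch of the algorithm using Python's heapq as the priority queue; the sample graph and weights are invented for illustration:

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source.
    graph: node -> list of (neighbor, weight); weights must be non-negative."""
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    pq = [(0, source)]  # (known distance, node) min-heap
    while pq:
        d, node = heapq.heappop(pq)
        if d > dist[node]:
            continue  # stale entry: a shorter path was already found
        for nxt, w in graph[node]:
            if d + w < dist[nxt]:
                dist[nxt] = d + w
                heapq.heappush(pq, (dist[nxt], nxt))
    return dist

g = {  # illustrative weighted graph
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 5)],
    "D": [],
}
print(dijkstra(g, "A"))  # {'A': 0, 'B': 3, 'C': 1, 'D': 4}
```

Each pop finalizes the closest unfinalized node; with a negative edge, that finalization step would be premature, which is exactly why the algorithm breaks.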
How to Answer
Define MST and relate it to cost-saving real-world problems. Explain the step-by-step difference between Prim’s and Kruskal’s algorithms, and when each is preferable.
Sample Answer
“A Minimum Spanning Tree is a subset of edges that connects all vertices in a weighted graph with no cycles and the smallest total edge cost. It applies to tasks like planning road systems or reducing wiring costs in a network.
Prim’s algorithm grows the MST one edge at a time, always picking the cheapest edge from the existing tree to a new vertex. Kruskal’s algorithm sorts all edges first, then picks them in ascending order while avoiding cycles.
Both arrive at a minimal spanning tree, but Prim’s often suits dense graphs, whereas Kruskal’s can be simpler for sparser ones, especially if edges are already sorted.”
Also Read: Time Complexity of Kruskal Algorithm: Data Structure, Example
How to Answer
Describe the role of Union-Find (Disjoint Set Union - DSU) in detecting cycles during MST construction (especially Kruskal’s). Explain how it works with merging and checks.
Sample Answer
“When adding edges in Kruskal’s method, a cycle can form if two vertices are already in the same connected component. A union-find (or disjoint set) structure keeps track of the component to which each vertex belongs.
Each time an edge is considered, the algorithm checks if both endpoints belong to the same set. If so, adding that edge would form a cycle. If not, it unites the sets of those vertices.”
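A minimal union-find sketch with path compression and union by rank; the element indices are arbitrary:

```python
class DisjointSet:
    """Union-find: tracks which connected component each element is in."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, a, b):
        """Merge the sets of a and b; return False if already joined."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # same component: edge (a, b) would form a cycle
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True

dsu = DisjointSet(4)
print(dsu.union(0, 1))  # True  - edge accepted
print(dsu.union(1, 2))  # True
print(dsu.union(0, 2))  # False - 0 and 2 already connected: cycle
```

In Kruskal's algorithm, an edge is kept only when `union` returns True, which is precisely the cycle check described above.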
Searching and sorting lie at the heart of performance tuning. Fast searches reduce the time it takes to locate information, while efficient sorting keeps data organized for follow-up tasks. Various algorithms tackle these problems in diverse ways, whether by dividing input data into halves for quick lookups or by reordering elements to lower comparisons.
The questions examine classical methods like binary search, their worst-case scenarios, and widely used sorting algorithms.
How to Answer:
Sample Answer
“Binary search works on sorted data by comparing the target value with the midpoint of the current range. If the target is smaller, the search narrows to the lower half; if larger, it focuses on the upper half. This process continues recursively or iteratively until the item is found or the range is empty.
Because each step halves the search space, the time complexity is O(log n), which is much faster than the O(n) of linear search when dealing with large datasets. In a dataset of one million elements, linear search might check each element, whereas binary search needs at most about 20 comparisons, since log₂(1,000,000) ≈ 20.”
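The halving process can be sketched in a few lines of Python; the sample data is invented for illustration:

```python
def binary_search(items, target):
    """Return the index of target in sorted items, or -1 if absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return -1

data = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
print(binary_search(data, 23))  # 5
print(binary_search(data, 40))  # -1
```

Every loop iteration discards half of the remaining range, which is where the O(log n) bound comes from.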
Also Read: Searching in Data Structure: Different Search Algorithms and Their Applications
How to Answer:
Sample Answer
“Binary search’s worst-case time complexity remains O(log n), but the comparison steps can still add up if the target is missing or sits at the far end of the range. And if the dataset is not sorted, binary search is not applicable at all.
Linear search might be preferable when the data is unsorted or subject to constant updates, as continually sorting a structure just to use binary search could outweigh any benefit. Also, if the dataset is very small, the overhead of setting up binary search or sorting might not pay off.”
Also Read: Time and Space Complexity of Binary Search Explained
How to Answer:
Sample Answer
“Merge Sort follows a divide-and-conquer path: it splits the array into halves, sorts each half, then merges the results. Splitting continues until subarrays contain only one element each, which are trivially sorted. Merging combines sorted arrays efficiently.
Quick Sort also divides the array but chooses a pivot element. Items smaller than the pivot go to one side, and items larger go to the other. Each side is then sorted recursively. Though its average time is O(n log n), a poor pivot selection can degrade performance to O(n²).”
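Both algorithms can be sketched as follows; these are simplified, not-in-place versions chosen for clarity rather than production use:

```python
def merge_sort(arr):
    """Divide-and-conquer sort: split, sort halves, merge."""
    if len(arr) <= 1:
        return arr  # a single element is trivially sorted
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    # Merge two sorted halves into one sorted list.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

def quick_sort(arr):
    """Pivot-based sort: partition around a pivot, recurse on each side."""
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    smaller = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    larger = [x for x in arr if x > pivot]
    return quick_sort(smaller) + equal + quick_sort(larger)

print(merge_sort([5, 2, 9, 1, 7]))  # [1, 2, 5, 7, 9]
print(quick_sort([5, 2, 9, 1, 7]))  # [1, 2, 5, 7, 9]
```

An in-place Quick Sort would partition within the original array instead of building new lists, which is what gives it its low memory footprint in practice.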
How to Answer:
Sample Answer
“Merge Sort consistently runs in O(n log n) time, using extra space to hold the merged output. Quick Sort also averages O(n log n), but a badly chosen pivot can cause O(n²) in the worst case. Merge Sort needs O(n) auxiliary space because it often relies on a second array for merging steps, whereas Quick Sort can work in place, using minimal extra memory.
Merge Sort excels in scenarios where consistent O(n log n) performance is needed, while Quick Sort often runs faster in practice, provided pivot selection is managed correctly.”
How to Answer:
Sample Answer
“For data that exceeds main memory, External Merge Sort is commonly chosen. It sorts chunks of data that fit in memory, writes each chunk out to disk, then merges those chunks in passes. This method is ideal for large datasets because it carefully uses limited memory and processes slices sequentially.
Quick Sort is less common for on-disk sorting due to its partition-based swapping, which involves more random access and can increase I/O overhead.”
Also Read: Sorting in Data Structure: Categories & Types [With Examples]
Heaps are specialized tree-based structures that focus on quick retrieval of the smallest or largest element. They maintain a complete binary shape, keeping operations predictable regarding time complexity.
A min-heap places the smallest element at the root, while a max-heap keeps the largest at the root. Priority queues often build on heaps to manage tasks or data with different urgency levels.
The questions below explore how heaps function, handle insertion, and power sorting and scheduling.
How to Answer:
Sample Answer
“A heap is a complete binary tree in which each parent node satisfies an order property relative to its children. In a min-heap, every parent holds a value smaller than or equal to its children, so the smallest element is always at the root.
In a max-heap, the situation is reversed: the parent's value is greater than or equal to that of its children, and the largest element resides at the root.
Both heaps allow quick access to the root element in O(1) time, yet maintain overall operations (like insertion and removal) in O(log n).”
Min-heap vs Max-heap:
| Aspect | Min-Heap | Max-Heap |
| --- | --- | --- |
| Order Rule | Parent ≤ children | Parent ≥ children |
| Root Element | Smallest item at the root | Largest item at the root |
| Removal Priority | Removes or finds the smallest element first | Removes or finds the largest element first |
| Common Uses | Situations where the minimum priority item is needed quickly (e.g., shortest task) | Situations where the maximum priority item is needed first (e.g., highest-priority process) |
| Time Complexity | O(1) root access; insertions/removals in O(log n) | Same as min-heap: O(1) for root access; O(log n) for insertions/removals |
How to Answer:
Sample Answer
“Min-heap insertion follows two major steps:
1. Place the new element at the next open position (the leftmost empty slot in the bottom level), keeping the tree complete.
2. Bubble up (percolate up): compare the new element with its parent and swap them whenever the parent is larger.
This process continues until the new element’s parent is no longer greater or the element reaches the root, preserving the min-heap property where each node is smaller than or equal to its children.”
Here’s a practical demonstration of the same:
Assume the current min-heap (stored in an array) is [1, 3, 2, 8, 5, 7]. The first element, 1, is the root, and the array represents this tree:
(1)
/ \
(3) (2)
/ \ /
(8) (5) (7)
Suppose the new element to insert is 4.
Step 1: Insert at the Next Open Position
The new element 4 goes into the first empty slot in the bottom level (as the right child of 2), keeping the tree complete. Visually:
(1)
/ \
(3) (2)
/ \ / \
(8) (5) (7) (4)
Step 2: Bubble Up (Percolate Up)
The heap already satisfies the min-heap property with 4 as a child of 2, because 2 is smaller than 4.
Final Min-Heap Array: [1, 3, 2, 8, 5, 7, 4]
Visually:
(1)
/ \
(3) (2)
/ \ / \
(8) (5) (7) (4)
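The same insertion can be reproduced with Python's built-in heapq module, which stores a min-heap in a plain list:

```python
import heapq

heap = [1, 3, 2, 8, 5, 7]   # already a valid min-heap
heapq.heappush(heap, 4)     # append 4, then bubble up as needed
print(heap)  # [1, 3, 2, 8, 5, 7, 4] - 4 stays below its parent 2
```

Because 4 is larger than its parent 2, no swaps occur and the array matches the worked example above.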
How to Answer:
Sample Answer
“A priority queue always removes the item with the highest or lowest priority first. A heap suits this need because it organizes elements so that the highest or lowest priority item is at the root.
In an operating system’s process scheduler, a max-heap can store processes where higher priority means larger key. The scheduler pops the root for the next process to run, then re-heapifies. This method ensures quick identification of which process or task should execute next.”
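As a sketch of the scheduler idea: Python's heapq provides only a min-heap, so one common workaround (an assumption here, not the only possible design) is to negate priorities so the highest-priority task pops first. The task names are invented for illustration:

```python
import heapq

# (negated priority, process name): heapq is a min-heap, so
# negating priorities makes the highest-priority process pop first.
tasks = []
heapq.heappush(tasks, (-2, "compile"))
heapq.heappush(tasks, (-5, "interrupt handler"))
heapq.heappush(tasks, (-1, "cleanup"))

order = []
while tasks:
    priority, name = heapq.heappop(tasks)
    order.append(name)
print(order)  # ['interrupt handler', 'compile', 'cleanup']
```

Each pop and the re-heapify that follows cost O(log n), matching the complexities given above.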
How to Answer:
Sample Answer
“Removing the top node (the min in a min-heap or max in a max-heap) takes O(log n) time overall. After removing the root element, the last node in the bottom layer is moved to the root.
A down-heap step follows, also known as bubble-down. It compares a node with its children and swaps if the heap order is violated. This reordering travels at most the height of the tree, which is O(log n) for a complete binary tree.”
How to Answer:
Sample Answer
“Heap sort treats the input array as a heap. It first builds a max-heap, then repeatedly swaps the root (the largest remaining element) with the last element of the heap, shrinks the heap by one, and restores the heap property with a sift-down.
This process repeats until the entire array is sorted in ascending order. Heap sort runs in O(n log n) time and sorts in place, so it does not require additional large memory allocations.”
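The steps above can be sketched as an in-place heap sort; a minimal illustrative version, not tuned for production:

```python
def heap_sort(arr):
    """In-place heap sort: build a max-heap, then repeatedly move
    the root (largest element) to the end and restore the heap."""
    def sift_down(end, i):
        # Push arr[i] down until both children are smaller (within arr[:end]).
        while True:
            largest, left, right = i, 2 * i + 1, 2 * i + 2
            if left < end and arr[left] > arr[largest]:
                largest = left
            if right < end and arr[right] > arr[largest]:
                largest = right
            if largest == i:
                return
            arr[i], arr[largest] = arr[largest], arr[i]
            i = largest

    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):  # build max-heap bottom-up
        sift_down(n, i)
    for end in range(n - 1, 0, -1):      # extract the max one at a time
        arr[0], arr[end] = arr[end], arr[0]
        sift_down(end, 0)
    return arr

print(heap_sort([5, 2, 9, 1, 7]))  # [1, 2, 5, 7, 9]
```

The only extra memory used is a handful of loop variables, which is why heap sort counts as an in-place algorithm.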
Also Read: Sorting in Data Structure: Categories & Types Explained
Many advanced data structures focus on specialized tasks. A trie can speed up prefix lookups for strings, while union-find groups items into sets for quick cycle checks. Dynamic programming stores intermediate results to avoid wasted recursion. The Floyd-Warshall algorithm targets multi-source shortest paths in graphs.
The questions below cover these and related techniques, showing how they solve problems efficiently.
How to Answer:
Sample Answer
“A trie is a tree-like data structure where each node represents a single character in a sequence. Words or keys branch out from the root, storing shared prefixes in common paths. This setup helps when finding items that begin with a given prefix because the traversal follows a single path for that sequence of characters.
Once the end of the prefix is reached, every branch below corresponds to a matching entry. Tries often appear in autocomplete features or spelling checkers, as they handle prefix queries faster than scanning every word in a list.”
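A minimal trie sketch supporting insertion and prefix checks; the class names and sample words are invented for illustration:

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # maps a character to the next TrieNode
        self.is_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Walk the path for word, creating nodes as needed."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def starts_with(self, prefix):
        """Return True if any stored word begins with prefix."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True

t = Trie()
for w in ["car", "card", "care", "dog"]:
    t.insert(w)
print(t.starts_with("car"))  # True
print(t.starts_with("cat"))  # False
```

Note how "car", "card", and "care" share the c→a→r path, which is the space saving and the fast prefix lookup in one structure.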
Also Read: Trie Data Structure: A Comprehensive Guide
How to Answer:
Sample Answer
“A hash table maps each key to a hash code, then stores it at a certain bucket index. It can fetch entire strings quickly if collisions remain low. A trie stores one character per node along a path. It can be more space-intensive if many strings share only partial overlaps because each unique character sequence may add nodes.
On the other hand, a hash table may need large arrays or rehashing under growth. A trie might also outperform a hash table for large numbers of shared prefixes, but it usually requires more pointers and can grow large in memory if many distinct branches exist.”
Trie vs Traditional Hash Table:
| Aspect | Trie (Prefix Tree) | Hash Table |
| --- | --- | --- |
| Data Structure | Tree-like, with each node representing one character in a sequence. | Array or buckets indexed by hashing a string key. |
| Lookup Approach | Follows the path of characters from root to leaf (or end marker). | Computes a hash code, then checks the corresponding bucket (may involve collision handling). |
| Strengths | Excellent for prefix-based search and retrieval; can exploit shared prefixes. | Fast average-case lookups; simpler to code for basic key retrieval. |
| Weaknesses | May become large if many strings share only partial overlaps; node pointers add overhead. | Collisions can degrade performance; must resize or rehash when the load factor rises. |
| Memory Usage | Typically higher when storing diverse strings; each character node includes references. | Depends on table size and collision strategy; can also be high if keys bunch up under collisions. |
| Use Cases | Autocomplete, spell checkers, IP routing (prefix matching). | Dictionary or symbol table lookups, caching mechanisms. |
How to Answer:
Sample Answer
“Dynamic programming breaks a problem into overlapping subproblems, storing answers for each one so those results can be reused instead of recalculated.
For example, computing the nth Fibonacci number involves summing the (n-1)th and (n-2)th numbers. A naive recursion recalculates the same sub-Fibonacci values many times. Dynamic programming fixes this by caching every Fibonacci result in a table or array, turning an exponential-time approach into a linear-time one; the exact improvement depends on the specific problem.”
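The Fibonacci example can be sketched with memoization via functools.lru_cache, which does the result caching automatically:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Memoized Fibonacci: each value is computed once, then cached."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(10))  # 55
print(fib(50))  # 12586269025 - instant; naive recursion would take ages
```

Without the cache, fib(50) would make billions of redundant calls; with it, each of the 51 distinct subproblems is solved exactly once.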
Also Read: A Deep Dive into Fibonacci Heap: Unlocking the Secrets of Its Efficiency
How to Answer:
Sample Answer
“A disjoint set, also called a union-find, maintains a collection of non-overlapping subsets. Each element belongs to exactly one subset, and union-find supports two main operations: finding which subset an element belongs to (find) and merging two subsets (union).
A classic use is in Kruskal’s algorithm for building a minimum spanning tree. Before adding an edge, the algorithm checks if it unites two different subsets. If so, it merges them, ensuring no cycles form.”
How to Answer:
Sample Answer
“The Floyd-Warshall algorithm calculates shortest paths between every pair of nodes in a graph. It iterates through each node as a possible intermediate step, updating a distance matrix if the path through that node improves the distance between two others. This produces all-pairs results in O(V³) time, where V is the number of vertices.
By contrast, Dijkstra’s algorithm concentrates on one starting node and typically runs in O(E + V log V) with a suitable priority queue. For a single source, Floyd-Warshall is slower because it solves a more general, all-pairs problem that might not be needed.”
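A minimal sketch of the triple-loop structure; the sample distance matrix is invented for illustration:

```python
INF = float("inf")

def floyd_warshall(dist):
    """All-pairs shortest paths. dist is an n x n matrix where
    dist[i][j] is the direct edge weight (INF if no edge, 0 on the diagonal)."""
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy
    for k in range(n):            # k = highest intermediate node allowed so far
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

graph = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
result = floyd_warshall(graph)
print(result[0][2])  # 5: the path 0 -> 1 (3) -> 2 (2) beats no direct edge
```

The three nested loops over V vertices are the source of the O(V³) bound quoted above.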
Also Read: Dijkstra's Algorithm: Finding the Shortest Route Made Easy
Strong preparation for data structure and algorithm interviews hinges on consistent problem-solving and an organized study plan. Written outlines, coding challenges, and practice with time limits all help build confidence.
Here are some focused suggestions to consider:
- Keep written outlines of each structure’s core operations and their time complexities.
- Work through coding challenges regularly, moving from fundamentals to advanced topics.
- Practice under time limits to simulate exam and interview conditions.
These steps build a firm base for tackling a wide variety of DSA challenges and refining the ability to communicate clearly during interviews.
Also Read: Explore the Top 30+ DSA projects with source code in 2025
Understanding data structures and algorithms is essential for excelling in exams and interviews. The concepts discussed, including arrays, linked lists, dynamic programming, and other advanced techniques, provide a strong foundation for tackling viva questions. They also help enhance problem-solving skills, ensuring you're well-prepared for both exams and real-world challenges.
If you're looking to take your skills further and advance your career, upGrad’s Software Development and Data Science courses provide a structured learning experience with expert guidance.
Not sure how to start a rewarding career in software development? Contact upGrad’s expert career counselors, who will guide you based on your goals. You can also visit a nearby upGrad offline center to explore course options, get hands-on experience, and speak directly with mentors!