Selenium WebDriver Architecture: Components, Functions & Limitations

Testing a system is a challenging task, so a tool that automates the work is valuable. One tool that comes to mind for automation testers is Selenium. If you are eager to learn automation testing with Selenium WebDriver, you have come to the right place. Let’s get started.

What is Selenium?

Selenium is an open-source automation testing tool. It tests only web-based applications and is compatible with multiple browsers and operating systems.

Besides WebDriver, the Selenium suite has included three main components:

  • Selenium RC
  • Selenium IDE
  • Selenium Grid

These components were not all released at once; they were introduced over several years in the mid-to-late 2000s.

Selenium WebDriver

Until 2011, Selenium RC was widely used. In mid-2011, the project released Selenium 2.0, which introduced WebDriver. It was not an upgrade to RC but a completely different tool with its own commands. When this article was written, the latest Selenium WebDriver version was 3.14; Selenium 4 has since been released.

Features of Selenium WebDriver

  • Capable of making dynamic scripts.
  • Compatible with multiple browsers. 
  • Produces logs; test reports need a test framework or custom code, as noted under Limitations.
  • Fast, as it communicates directly with the browser rather than through an intermediary server, as Selenium RC did.
  • Realistic interaction with page elements.
  • Selenium WebDriver’s API is much simpler and does not contain confusing and redundant commands.
  • Selenium WebDriver can support the headless HtmlUnit browser.

The Selenium WebDriver architecture has five components:

  1. Language Binding or Selenium Client Library: Client libraries (language bindings) let you write Selenium scripts in your preferred language; the Java bindings are distributed as JAR files. Scripts can be written in Java, C#, Ruby, Python and Perl, among others.
  2. Selenium Application Programming Interface (API): An API is a set of rules and specifications that software programs follow to communicate with each other. In short, the API acts as the interface between software programs and facilitates their communication.
  3. Remote WebDriver: This is an implementation class of the WebDriver interface. A test script developer uses this class to execute test scripts on a remote machine through a WebDriver server.
  4. JavaScript Object Notation (JSON) Wire Protocol: JSON is a lightweight data-interchange format used to transfer data between a client and a server on the web. A JSON file has a .json extension. The JSON Wire Protocol sends commands in JSON format over HTTP. The server parses each command and executes it, then sends its response back to the client in JSON format. (Selenium 4 uses the W3C WebDriver protocol, which follows the same request/response pattern.)
  5. WebDriver: WebDriver is the tool that automates web applications and verifies they work as expected.
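
To make the JSON Wire Protocol item more concrete, here is a minimal sketch in plain Java (no Selenium dependency) of the kind of request a client library sends when a script asks the browser to open a page. The endpoint shape, POST /session/{sessionId}/url with a "url" field in the body, comes from the protocol. The class name WireCommand, the session id, and the hand-built JSON string are illustrative assumptions and do not reflect Selenium's actual internals.

```java
// Illustrative only: models one protocol request as method + path + JSON body.
final class WireCommand {
    final String method;
    final String path;
    final String body;

    private WireCommand(String method, String path, String body) {
        this.method = method;
        this.path = path;
        this.body = body;
    }

    // "Navigate to URL" is sent as POST /session/{sessionId}/url with a JSON body.
    static WireCommand navigateTo(String sessionId, String url) {
        String json = "{\"url\": \"" + escape(url) + "\"}";
        return new WireCommand("POST", "/session/" + sessionId + "/url", json);
    }

    // Minimal JSON string escaping (backslash and double quote only).
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    @Override
    public String toString() {
        return method + " " + path + " " + body;
    }

    public static void main(String[] args) {
        // The session id below is a made-up placeholder.
        System.out.println(navigateTo("abc123", "https://www.selenium.dev/"));
    }
}
```

In a real setup the client library builds and sends such requests for you; the sketch only shows what travels over the wire.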

Selenium WebDriver Architecture

We will now focus on the Selenium WebDriver architecture. The Selenium WebDriver API facilitates communication between test scripts and browsers through browser drivers. The architecture comprises the following four layers:

  • Selenium Client Library
  • JSON Wire Protocol
  • Browser Drivers
  • Browsers

How Does Selenium WebDriver Work Internally?

Selenium WebDriver code is typically written in an integrated development environment (IDE) such as Eclipse, using one of the Selenium client libraries, such as the Java bindings.

Once the script is ready, click Run to execute the program. For example, a script that creates a ChromeDriver and calls its get() method with the SeleniumHQ URL launches the Chrome browser and navigates to the SeleniumHQ website.

The following generic steps describe what happens internally:

1. Click Run.

The Selenium client library communicates with the Selenium API.

2. The Selenium API sends the command from the language binding to the browser driver.

The communication takes place over the JSON Wire Protocol.

3. The browser driver receives the request.

The browser driver uses an HTTP server to receive the HTTP request.

4. The HTTP server determines which commands need to be executed.

The commands from the Selenium script are then executed on the browser.

5. The HTTP server sends the response back to the automation test script.
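
The request/response flow in these steps can be simulated with only the JDK. In the sketch below, an embedded com.sun.net.httpserver.HttpServer stands in for the browser driver's HTTP server, and java.net.http.HttpClient plays the role of the client library. Everything specific here is an assumption for illustration: the FakeDriverRoundTrip class, the session id abc123, and the canned JSON reply are hypothetical, and no real browser is involved.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicReference;

// Simulation only: a stub HTTP server stands in for a browser driver.
final class FakeDriverRoundTrip {

    // Hypothetical reply; a real driver would report the outcome of the command.
    static final String CANNED_REPLY = "{\"status\": 0, \"value\": null}";

    // Starts the stub driver on a free local port; it records the command it receives.
    static HttpServer startFakeDriver(AtomicReference<String> received) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/session", exchange -> {
            String body = new String(exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
            received.set(exchange.getRequestMethod() + " "
                    + exchange.getRequestURI().getPath() + " " + body);
            // A real driver would execute the command on the browser at this point.
            byte[] reply = CANNED_REPLY.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, reply.length);
            exchange.getResponseBody().write(reply);
            exchange.close();
        });
        server.start();
        return server;
    }

    // Plays the client-library role: sends one "navigate" command, returns the reply body.
    static String sendNavigate(int port, String sessionId, String url) {
        try {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://127.0.0.1:" + port + "/session/" + sessionId + "/url"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString("{\"url\": \"" + url + "\"}"))
                    .build();
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .build();
            return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException("request to fake driver failed", e);
        }
    }

    // Runs one round trip and returns "what the driver saw -> what the client got back".
    static String roundTrip(String sessionId, String url) {
        AtomicReference<String> received = new AtomicReference<>("");
        try {
            HttpServer server = startFakeDriver(received);
            try {
                String reply = sendNavigate(server.getAddress().getPort(), sessionId, url);
                return received.get() + " -> " + reply;
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new IllegalStateException("could not start fake driver", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("abc123", "https://www.selenium.dev/"));
    }
}
```

The stub records the HTTP method, path, and JSON body it received, which mirrors the point in the steps where the driver's HTTP server receives the command before it is executed on the browser.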

Technical Specifications of Selenium WebDriver

  • Operating System (OS) – Windows, Solaris, Linux and Mac OS
  • Supported Browsers – Internet Explorer, Google Chrome 12.0.712.0 and above, Safari, Opera 11.5 and above, Mozilla Firefox, HtmlUnit 2.9, Android and iOS

Best Features of Selenium WebDriver

  • Multiple Browser Support – Supports almost all browsers.
  • Multiple Languages Support – Supports most of the commonly used programming languages.
  • Speed – Selenium WebDriver is faster than the other tools in the Selenium suite.
  • Simple Commands – Common commands are easy to use in Selenium WebDriver. For example, to launch a browser, execute one of the following commands:
    • WebDriver driver = new FirefoxDriver(); (Firefox browser)
    • WebDriver driver = new ChromeDriver(); (Chrome browser)
    • WebDriver driver = new InternetExplorerDriver(); (Internet Explorer browser)
  • Methods and Classes – Selenium WebDriver provides a range of methods and classes for handling common challenges in automation testing.

Limitations of Selenium WebDriver

  • Selenium WebDriver does not automatically support new browsers 

WebDriver operates at the OS level, and every browser communicates with the OS differently. For a new browser, that communication may differ from what WebDriver expects, resulting in a compatibility issue. You will have to give the Selenium WebDriver team some time to make the new browser compatible with Selenium WebDriver.

  • Selenium WebDriver does not have a built-in command to automatically generate a ‘Test Results’ file

You have to rely on the integrated development environment’s (IDE) output window. Alternatively, you can build your own report in your preferred language and store it as an HTML or text file.
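
As one way to build such a report yourself, the sketch below collects pass/fail outcomes and writes them to a small HTML file using only the JDK. The class name SimpleHtmlReport, the test names, and the output file name test-results.html are hypothetical; treat this as a starting point under those assumptions, not a standard Selenium facility.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: records test outcomes and renders them as HTML.
final class SimpleHtmlReport {
    // Insertion-ordered map of test name -> passed?
    private final Map<String, Boolean> results = new LinkedHashMap<>();

    void record(String testName, boolean passed) {
        results.put(testName, passed);
    }

    // Renders the recorded results as a minimal HTML table.
    String toHtml() {
        StringBuilder html = new StringBuilder("<html><body><h1>Test Results</h1><table>\n");
        for (Map.Entry<String, Boolean> e : results.entrySet()) {
            html.append("<tr><td>").append(escape(e.getKey())).append("</td><td>")
                .append(e.getValue() ? "PASS" : "FAIL").append("</td></tr>\n");
        }
        return html.append("</table></body></html>\n").toString();
    }

    // Writes the report to disk; wraps IOException so callers need no checked handling.
    void writeTo(Path file) {
        try {
            Files.write(file, toHtml().getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Escapes the characters that would otherwise break the HTML markup.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        SimpleHtmlReport report = new SimpleHtmlReport();
        report.record("homePageLoads", true);   // hypothetical test names
        report.record("loginWorks", false);
        report.writeTo(Path.of("test-results.html"));
        System.out.print(report.toHtml());
    }
}
```

Each WebDriver test would call record() with its outcome, and writeTo() would run once at the end of the suite.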

Final Thoughts

  • Selenium WebDriver is a tool that tests web applications on different browsers. 
  • It supports multiple programming languages.
  • Selenium WebDriver succeeded Selenium RC and has a simpler architecture.
  • Selenium WebDriver has a concise API.
