Selenium WebDriver Architecture

Selenium is a browser automation tool, commonly used for writing end-to-end tests of web applications. A browser automation tool does exactly what you would expect: it controls a browser programmatically so that repetitive tasks can be automated. It sounds like a simple problem to solve, but as we will see, a lot has to happen behind the scenes to make it work.
 
Before describing the architecture of Selenium, it helps to understand how the various related pieces of the project fit together. At a very high level, Selenium is a suite of three tools. The first of these tools, Selenium IDE, is an extension for Firefox that allows users to record and play back tests. The record/playback paradigm can be limiting and isn't suitable for many users, so the second tool in the suite, Selenium WebDriver, provides APIs in a variety of languages to allow for more control and the application of standard software development practices. The final tool, Selenium Grid, makes it possible to use the Selenium APIs to control browser instances distributed over a grid of machines, allowing more tests to run in parallel. Within the project, they are referred to as "IDE", "WebDriver" and "Grid". This chapter explores the architecture of Selenium WebDriver.
 
This chapter was written during the betas of Selenium 2.0 in late 2010. If you're reading the book after that date, things will have moved forward, and you'll be able to see how the architectural choices described here have unfolded. If you're reading before that date: Congratulations! You have a time machine. Can I have some winning lottery numbers?
 