Robot Framework and the keyword-driven approach to test automation – Part 1 of 3
Hans Buwalda is generally credited with the introduction of the keyword-driven paradigm of functional test automation, initially calling it the ‘action word’ approach.
This approach tackled certain fundamental problems pertaining to the efficiency of the process of creating test code (mainly the lack of reuse) and the maintainability, readability and robustness of that code. Problems surrounding these aspects frequently led to failed automation efforts. The keyword-driven framework therefore was (and is) a quantum leap forward, providing a solution to these problems by facilitating the application of modularity, abstraction and other design patterns to the automation code.
Robot Framework (RF) can be regarded as the epitome of this type of automation framework. Our first post on the RF concentrated on the high-level design of the platform. In this second of our three-part series of introductory-level posts, we will take a closer look at what the keyword-driven approach to test automation is all about.
This second post will itself be divided into three parts. In part 1, we will look at the position of the keyword-driven approach within the history of test automation frameworks. In part 2 we will delve into the concept of a ‘keyword’. Finally, in part 3, we will look at how the keyword-driven approach is implemented by the RF.
A short history of test automation frameworks
In order to get a first, overall impression of the nature and advantages of the keyword-driven approach to test automation, let’s have a look at the different framework generations that make up the history of test automation platforms. In doing so, we’ll come across some of the differences, similarities and interdependencies between these various types of frameworks.
Please note that this history is partially a recap of similar (and sometimes more elaborate) genealogies in existing books, articles and blog posts. Nevertheless, I want to present it here as part of my comprehensive RF introduction. Moreover, I intend to give my own spin to the categorization of frameworks, by arguing that hybrid, MBT (Model-based testing) and Agile frameworks have no place in this lineage and that Scriptless Frameworks are not really scriptless.
Having said that, let’s take a closer look at the various types of automation frameworks. The methodological and technological evolution of automated functional testing platforms is often divided into the following stages.
Linear frameworks (such as record & playback frameworks)
The code that is written or generated is procedural in nature, lacking both control structures (for run-time decision making and the reuse of code sequences) and calling structures (for the reuse of code modules). To wit, code consists of long sequences of statements that are executed one after the other.
Test cases are thus implemented as typically large, monolithic blocks of static code, mixing what Gojko Adzic calls the technical activity level, user interface workflow level and business rule level. That is, the procedural test case consists of a series of lowest-level statements that implements and ‘represents’ all three levels. Up until the advance of the keyword-driven approach, this mixing was a staple of test automation.
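To make this concrete, here is a sketch (in Python, against an invented, simulated application stub; a real linear script would drive GUI controls through a record & playback tool) of what such a monolithic test case looks like: one flat sequence of lowest-level statements, mixing technical actions, UI workflow and business rule.

```python
# A hypothetical, simulated application; stand-ins for low-level GUI actions.
app = {"username": "", "password": "", "logged_in": False}

def type_into(field, value):
    # Stand-in for typing into a GUI control.
    app[field] = value

def click_login():
    # Stand-in for clicking the login button; the 'business rule' is buried here.
    app["logged_in"] = app["username"] == "jdoe" and app["password"] == "s3cret"

# The test case itself: a flat, linear run of statements with no control
# structures and no calls to reusable modules.
type_into("username", "jdoe")
type_into("password", "s3cret")
click_login()
assert app["logged_in"]
print("login test passed")
```

Any variation on this scenario (a different user, an invalid password) would require copying the whole sequence into a new script.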
It will be clear that such code lacks reusability, maintainability, readability and several other critical code qualities. The linear framework therefore inevitably introduced major drawbacks and limitations into any automation solution.
One of many examples of such drawbacks and limitations: adding even the slightest variation on an existing test case was labour-intensive and led to more code to maintain. Due to the lack of control structures, any alternative functional flow had to be implemented through an additional test case, since the test code had no way of deciding dynamically (i.e. at run-time) which alternative set of test actions to execute. This was worsened by the fact that test cases could not be made generic, due to the lack of data-driven capabilities. Consequently, each test case had to be implemented through a dedicated, separate script, even in those cases where the functional flow (and thus the required test actions) was completely identical and only the data input had to be varied.
The lack of control structures also meant less robustness, stability and reliability of the code, since custom error detection and recovery logic could not be implemented and, therefore, run-time error handling was completely lacking.
It was also hard to understand the purpose of test cases. Business stakeholders in particular were dependent on documentation provided with each test case, and this documentation needed to be maintained as well.
With the advent of each subsequent framework generation, this situation would gradually be improved upon.
Structured frameworks
The automation code now featured control flow structures (logic), such as for loops and conditional statements. Arguably, this made the code even harder to read, while not improving much upon maintainability and reusability.
Of course, this approach did provide flexibility and power to the automation engineer. It also prevented code duplication to some extent: on the one hand through the reuse of blocks of statements via looping constructs, on the other hand because alternative functional flows could now be handled by the same test case through decisions/branching. Additionally, robustness and stability were greatly improved, since the control structures made it possible to implement routines for error detection and handling.
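A sketch of what this second generation added, reusing the hypothetical login example: a loop reuses a block of statements, a branch handles an alternative functional flow at run-time, and a try/except implements rudimentary error detection and recovery.

```python
def try_login(username, password):
    """Stand-in for a sequence of GUI actions; raises on a simulated outage."""
    if username is None:
        raise RuntimeError("application not responding")
    return username == "jdoe" and password == "s3cret"

results = []
# A looping construct reuses the same block of statements ...
for username, password in [("jdoe", "s3cret"), ("jdoe", "wrong")]:
    try:
        # ... and a branch handles alternative functional flows dynamically.
        if try_login(username, password):
            results.append("logged in")
        else:
            results.append("rejected")
    except RuntimeError as error:
        # Custom error handling, impossible in a purely linear script.
        results.append(f"recovered from: {error}")

print(results)  # ['logged in', 'rejected']
```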
Data-driven frameworks
Data is now extracted from the code, tremendously improving automation efficiency through increased levels of both reusability and maintainability.
The code is made generic by having the data passed into the code by way of arguments. The data itself persists either within the framework (e.g. in lists or tables) or outside of it (e.g. in databases, spreadsheets or text files).
The automation platform is thereby capable of having the same script iterate through sets of data items, allowing for a truly data-driven test design. Variations on test cases can now be defined by simply specifying the various sets of parameters that a certain script is to be executed with, for example a login routine that is repeatedly executed with different sets of credentials to test all relevant login scenarios.
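Sketched below (data and names invented, the login routine again simulated): a single generic script iterating over externalized sets of credentials, where each row represents one logical test case.

```python
# Data persists outside the script proper, e.g. parsed from a spreadsheet
# or CSV file; inlined here for brevity.
login_data = [
    ("jdoe", "s3cret", True),    # valid credentials
    ("jdoe", "wrong",  False),   # invalid password
    ("",     "s3cret", False),   # empty username (boundary case)
]

def login(username, password):
    """Generic, parameterized test routine (simulated)."""
    return username == "jdoe" and password == "s3cret"

# One script, many test cases: iterate through the data sets.
for username, password, expected in login_data:
    assert login(username, password) == expected
print(f"{len(login_data)} login scenarios passed")
```

Extending coverage with boundary values or invalid-data cases now means adding rows, not scripts.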
Through this approach, the 1-1 relation between test case and test script could be abandoned for the first time. Multiple test cases that require identical sequences of test actions but different input conditions could be implemented through one test script. This increased the level of data coverage immensely, while simultaneously reducing the number of scripts to maintain.
Of course, the data-driven approach to test design comes with a trade-off between efficiency and readability. Nevertheless, it is now possible to extend test coverage very quickly and efficiently, for instance by applying boundary-value analysis and equivalence partitioning to the test designs.
Where the structured frameworks added efficiency to the testing of variations in functional flows, the data-driven framework added efficiency to the testing of variations in data-flows.
Keyword-driven frameworks (sometimes called modularity frameworks)
Reusable blocks of code statements are now extracted into one or more layers of lower-level test functions. These functions are called ‘keywords’. Consequently, there are now at least two layers in the automation code: the higher-level test cases and the lower-level, reusable test functions (keywords).
Test cases now can call keywords, thereby reusing the involved keyword logic and abstracting from the technical complexities of the automation solution.
The keywords can live in code (e.g. Python, Java, .Net) or can be created through a (proprietary) scripting language. A combination of coding and scripting is also possible. In that case the scripted functions typically reuse (and are implemented through) the lower-level, coded functions.
By facilitating modularization and abstraction, the keyword-driven framework dramatically improves the reusability and maintainability of the test code as well as the readability of both test code and test cases.
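The layering can be sketched as follows (function and element names are invented, and plain Python functions stand in for keywords): technical keywords wrap the (simulated) GUI actions, a business-level keyword composes them, and the test case reads as a sequence of keyword calls.

```python
# Lowest layer: technical keywords wrapping (simulated) GUI actions.
ui = {}

def input_text(locator, text):
    ui[locator] = text

def click_button(locator):
    if locator == "login":
        ok = ui.get("username") == "jdoe" and ui.get("password") == "s3cret"
        ui["message"] = "Welcome" if ok else "Denied"

# Middle layer: a business-level keyword composed of technical keywords.
def login_as(username, password):
    input_text("username", username)
    input_text("password", password)
    click_button("login")

# Top layer: the test case, readable as a sequence of keyword calls.
def test_valid_user_can_log_in():
    login_as("jdoe", "s3cret")
    assert ui["message"] == "Welcome"

test_valid_user_can_log_in()
print("keyword-driven test passed")
```

The test case no longer knows (or cares) how `login_as` is implemented; a change to the login dialog is absorbed in one keyword instead of in every script.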
More on the nature and advantages of the keyword-driven framework, in part 2 of this second post.
Scriptless frameworks
There is a substantial level of controversy surrounding this type of framework. That is, there is a lot of polarized discussion on the validity of the claims that tool vendors make with regard to the capabilities and benefits of their tools.
The principal claim seems to be that these frameworks automatically generate the reusable, structured code modules as featured in the keyword-driven approach. No scripting required.
I must admit to not having a lot of knowledge of (let alone hands-on experience with) these frameworks. But based on what I have seen and read (and heard from people who actually use them), it appears to me that the 'scriptless' aspect amounts to nothing more than the automated generation of an (admittedly advanced) object/UI map through some sort of code scanning process. For example, the controls (i.e. GUI elements such as edit fields, buttons, etc.) of some part of an application's GUI may be inventoried (in terms of all kinds of properties) and then be made available for the application of some sort of operation (e.g. click or input) or assertion (e.g. isEnabled, hasValue). The available types of operations and assertions per GUI element are thus part of the object map. But all of this hardly constitutes a truly scriptless approach, since the logic that these actions must be embedded in (to model the workflow of the SUT), still needs to be scripted and/or coded.
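Such a generated object map could be pictured as a simple data structure (element names, locators and properties invented for illustration): each GUI element with the operations and assertions it supports. The workflow logic that strings these elements together is precisely what still has to be scripted.

```python
# A hypothetical object/UI map, as a 'scriptless' tool might generate it.
object_map = {
    "username_field": {
        "type": "edit",
        "locator": "id=txtUser",
        "operations": ["input", "clear"],
        "assertions": ["isEnabled", "hasValue"],
    },
    "login_button": {
        "type": "button",
        "locator": "id=btnLogin",
        "operations": ["click"],
        "assertions": ["isEnabled"],
    },
}

# The map tells us WHAT can be done to each element ...
assert "click" in object_map["login_button"]["operations"]

# ... but not WHEN or WHY: the workflow below is still hand-written.
workflow = ["input username_field", "click login_button"]
print(f"{len(workflow)} scripted steps remain")
```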
Hybrid frameworks
Sometimes hybrid frameworks are mentioned as the follow-up generation to the keyword-driven approach. However, since keyword-driven frameworks are always data-driven as well, this is a superfluous category. In general, each generation subsumes the essential features of its predecessors while adding new ones. Keyword-driven, for example, implies structured and data-driven as well.
Note though that certain modern, keyword-driven frameworks are not inherently structured, since they do not feature a (proprietary) scripting language in which to create (mid-level) structured functions, but only support coding such functions at the lowest level in e.g. Java.
For instance, the ‘scenario’ tables of FitNesse, although often used as a middle layer between test cases (specifications) and the fixtures implemented in code, cannot hold structured functions, since FitNesse does not provide control structures of any form. Therefore, the scenario table contains linear ‘code’. For a long time, these tables weren’t even able to provide return values.
As a side note: of all the keyword-driven frameworks, RF features the most mature, complete and powerful scripting engine. We will go into details of the RF scripting facilities in a later post. Of course, to what extent such scripting should be applied to test code development is subject to discussion. This discussion centers mainly around the impact of scripting on the risks that apply to the involved automation efforts. Again, a later RF post will focus on that discussion.
Model-based testing frameworks
Similarly, Model-based testing (MBT) frameworks are sometimes mentioned as preceding the scriptless framework generation. But the specific capabilities of this type of tool pertain to the automated generation of highly formalized (logical) test designs with a specific form of coverage. Therefore, in my opinion, model-based frameworks do not belong in this genealogy at all.
With MBT frameworks, test cases are derived from a model, such as a finite state machine, that represents (the behavior of) the SUT. The model, which serves as input to the framework, is sometimes created dynamically as well. However, the model-based approach to test design is, in itself, agnostic towards the question of how to execute the generated set of test cases: execution might be automated or manual. Accordingly, there are three 'deployment models' for the execution of a model-based test design.
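As a minimal illustration (a toy model, not the mechanism of any specific MBT tool): a finite state machine of a login dialog, from which logical test cases can be derived by walking its transitions up to a chosen depth.

```python
# A toy finite state machine modelling a login dialog:
# (source state, action) -> target state.
transitions = {
    ("start", "enter valid credentials"): "logged_in",
    ("start", "enter invalid credentials"): "error",
    ("error", "enter valid credentials"): "logged_in",
}

def derive_test_cases(start, max_depth=2):
    """Enumerate action sequences (logical test cases) from the model."""
    cases = []

    def walk(state, path, depth):
        if depth == 0:
            return
        for (src, action), dst in transitions.items():
            if src == state:
                cases.append(path + [action])
                walk(dst, path + [action], depth - 1)

    walk(start, [], max_depth)
    return cases

test_cases = derive_test_cases("start")
print(len(test_cases), "derived test cases")  # 3 derived test cases
```

Whether these sequences are then executed by an engine, exported as code, or handed to a manual tester is exactly the deployment question discussed next.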
Only seldom can the generated tests be executed automatically, namely in the case of the so-called 'online testing' deployment scheme. The logical test cases are first dynamically made concrete (physical) and are then executed. In that case, the execution is typically performed by an execution engine that is part of the same framework (as is the case with e.g. Tricentis Tosca). However, depending on the product technology and interfaces, customizations (extensions) to the execution engine could be required.
At times the created test cases are candidates for automated execution. This is the case with the so-called 'offline generation of executable tests' deployment scheme, which generates the test cases as machine-readable code modules, e.g. a set of Java or Python classes. These can then be integrated into the test automation code created on platforms such as Cucumber or RF and subsequently be executed by them.
Most often though the generated test designs adhere to the so-called ‘offline generation of manually deployable tests’ deployment scheme. In that case, the framework output is a set of human-readable test cases that are to be executed manually.
The fact that occasionally model-based test designs can be executed directly by an MBT framework component (namely in one of the three deployment schemes), is the reason that these frameworks are sometimes (and erroneously) listed in the genealogy of test automation frameworks.
Agile, ATDD and BDD frameworks
Finally, Agile, ATDD (Acceptance Test Driven Development) or BDD (Behavior Driven Development) frameworks are sometimes mentioned as well. However, this is not so much a category unto itself as, at most, an added feature to (or on top of) modern, keyword-driven frameworks. Frameworks such as Robot Framework or FitNesse allow for high-level, business-readable, BDD-style test designs or specifications, for instance a test design applying the Gherkin syntax of Given-When-Then.
These BDD-specific features thus add to the frameworks the capability of facilitating specification and collaboration. That is, they enable collaborative specification or specification by example. Some frameworks, such as Cucumber or JBehave, actually define, understand and promote themselves in terms of this specific usage. In other words, although they can be used in a 'purely' keyword- and data-driven manner as well, they stress the fact that the main use case for these frameworks is 'specification by example'. They position themselves as advocates of the BDD philosophy and, as such, want to stimulate and evangelize BDD. From the JBehave web site: “JBehave is a framework for Behaviour-Driven Development (BDD). BDD is an evolution of test-driven development (TDD) and acceptance-test driven design. It shifts the vocabulary from being test-based to behaviour-based, and positions itself as a design philosophy.” [italics MH]
It is precisely through their ability to create modularized and data-driven test automation code that keyword-driven frameworks facilitate and enable a layered and generic test design, and thus the ATDD/BDD practices of creating executable specifications and of specifying by example. These specifications or examples need to be business-readable and, consequently, require all of the implementation details and complexities to be hidden in lower layers of the automation solution. They must also be able to allow for variations on test cases (i.e. examples), for instance in terms of boundary cases or invalid data.
The keyword-driven mechanisms, mainly those of code extraction and data extraction, thus form the technical preconditions to the methodological and conceptual shift from the 'test-based vocabulary' to the 'behavior-based vocabulary'. ATDD/BDD practices, methods and techniques such as collaborative design, Gherkin and living documentation can be applied by putting those keyword-driven features that form the key advantages of this type of framework to their best use. Creating, for instance, high-level scenarios in Gherkin is nothing more than a (somewhat) new form of usage of a keyword-driven framework and a (slightly) different spin on the keyword-driven approach. Although originally not devised and designed with these usages in mind, the keyword-driven frameworks proved to be a perfect match for such practices.
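A sketch of that match (step texts and glue code invented, in the style of, but not identical to, tools such as Cucumber or JBehave): Gherkin phrases simply bind, via pattern matching, to the same keywords a keyword-driven test would call directly.

```python
import re

# The same reusable keywords a keyword-driven test case would call.
state = {}

def open_login_page():
    state["page"] = "login"

def login_as(username):
    state["user"] = username

def assert_greeting(expected):
    assert f"Welcome {state['user']}" == expected

# Hypothetical glue layer: Gherkin step patterns bound to keywords.
steps = [
    (r"Given the login page is open", lambda m: open_login_page()),
    (r'When "(\w+)" logs in', lambda m: login_as(m.group(1))),
    (r'Then the greeting reads "(.+)"', lambda m: assert_greeting(m.group(1))),
]

scenario = [
    'Given the login page is open',
    'When "jdoe" logs in',
    'Then the greeting reads "Welcome jdoe"',
]

# Execute the business-readable scenario by dispatching each line to a keyword.
for line in scenario:
    for pattern, keyword in steps:
        match = re.fullmatch(pattern, line)
        if match:
            keyword(match)
            break
print("scenario passed")
```

The behavior-based vocabulary lives entirely in the step texts; the mechanics underneath remain plain keyword-driven automation.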
In conclusion, although Agile, ATDD and BDD as methodological paradigms or conceptual frameworks are new, the phrase ‘Agile/ATDD/BDD test automation frameworks’ merely refers to a variation on the application of the keyword-driven approach and on the usage of the involved frameworks. This usage can, however, be regarded as applying the keyword-driven approach to the fullest extent and as bringing it to its logical conclusion.
At the end of the day, applying ATDD/BDD-style test designs comes down to layering these designs along the lines of the (already mentioned) distinction made by Gojko Adzic. This layering is made possible by the keyword-driven nature of the utilized frameworks and, consequently, always implies a keyword-driven approach.
We now have a better understanding of the position that the keyword-driven approach holds in relation to other types of test automation frameworks and of some of its unique advantages.
As a side-effect, we have also gained some knowledge concerning the evolution of test automation frameworks as well as concerning the problem space that they were developed in.
And finally, we have seen that certain framework 'types' do not belong in the pantheon of functional test automation frameworks.
Building upon this context, we will now give a brief, condensed analysis of the notion of a ‘keyword’ in part 2 of this three-part post.