Test Run
UI Automation with Windows PowerShell
Dr. James McCaffrey
Code download available at: TestRun2007_12.exe (164 KB)
Contents
Using Windows PowerShell
Overview of My Test Script
The Application Under Test
Creating a Custom Cmdlet Library
Using the Custom Cmdlet Library
Using, Modifying, and Extending the UI Cmdlet Library
Though it has only been around for a relatively short amount of time, Windows PowerShell™ is already one of my favorite tools. I recently discovered that Windows PowerShell has all the features you need for creating a tiny library that will enable you to write ultralightweight UI automation.
In this month's column, I will show you how to create a small collection of custom Windows PowerShell cmdlets that perform Windows® UI automation tasks. These include obtaining handles to applications and controls, manipulating controls, and checking application state. In this discussion, I will assume you already have a basic familiarity with Windows PowerShell and with calling Win32® API functions using the Microsoft® .NET Framework P/Invoke mechanism with the C# language. But even if these are new to you, you should still be able to follow this column with a little bit of effort.
Using Windows PowerShell
The easiest way for me to illustrate the key points of using Windows PowerShell for ultralightweight UI automation is with a screenshot, as shown in Figure 1. The first couple of lines of output in the shell simply indicate that I'm using Windows PowerShell. The next few lines then tell me that my custom cmdlets are being registered; this is done by a custom startup script that runs every time a new instance of the Windows PowerShell shell is launched. (My startup script also sets the current working directory to C:\UIautomationWithPowerShell.) In this example, I've written and registered eight custom cmdlets for ultralightweight UI automation: get-window, get-control, get-controlByIndex, send-chars, send-click, get-listBox, send-menu, and get-textBox.
Figure 1 UI Automation with Windows PowerShell
Next, I issue the following command to display the names of all items in the current directory that begin with the letter "t":
PS C:\UIautomationWithPowerShell> get-childitem t* | select-object Name | format-table -auto
In Figure 1, the output tells me that I have a directory named TheAppToTest and a file named testScenario.ps1. This file is my Windows PowerShell test script.
The get-childitem command is one of approximately 130 built-in Windows PowerShell cmdlets. Many of these cmdlets also have aliases. The get-childitem cmdlet, for instance, has a few aliases: it can be given as "dir" (for those most familiar with the old cmd.exe command shell and .bat files), as "ls" (for those engineers used to a UNIX environment), and as "gci" (a simple abbreviation for convenience at the command line).
My command pipes the output of get-childitem to the select-object cmdlet, which I use to filter the results down to just file and directory name properties. Then I pipe that result to the format-table cmdlet with an -auto switch to produce a condensed display. Of course, I could have typed just "get-childitem" (or one of its aliases) without any piping to accept all the default arguments for the command.
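As a small sketch of the alias mechanism described above, the following lines list every alias that resolves to get-childitem and then repeat my pipeline using aliases only. The $aliases variable name is just for illustration; the exact set of aliases reported can differ between Windows PowerShell versions and platforms.

```powershell
# Collect the names of all aliases whose target is the get-childitem cmdlet.
$aliases = get-alias |
    where-object { $_.Definition -eq 'Get-ChildItem' } |
    foreach-object { $_.Name }
$aliases

# The same pipeline as before, written with aliases
# (gci = get-childitem, select = select-object, ft = format-table).
gci t* | select Name | ft -auto
```

Aliases are convenient at the interactive prompt, but full cmdlet names tend to be easier to read in scripts that other people must maintain.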
I can invoke my test script by entering the command .\testScenario.ps1 on the command line. Note that, unlike most scripting environments, Windows PowerShell requires you to specify the path to a script even if the script is in the current working directory, using either a relative path (as I've done) or a full path. This is for security purposes. Additionally, Windows PowerShell does not allow script execution by default, so you must explicitly enable execution if you want to run scripts. You can check the current script execution policy by entering the command "get-executionpolicy" and modify the policy by entering "set-executionpolicy" with an argument of "remoteSigned" or "unrestricted".
Once set, the execution policy on a particular system will remain in effect for all new instances of the Windows PowerShell shell and user sessions. The fact that my startup script, which is just an ordinary Windows PowerShell script, was actually able to execute when I launched a new shell indicates that my current execution policy allows script execution.
Overview of My Test Script
My UI test automation script has quite a few logging messages, so you should be able to follow its progress without too much trouble. The script first launches the application under test. As you can see in Figure 1, I'm testing a simple Windows form-based application. While this particular application is built on the .NET Framework, these Windows PowerShell UI automation techniques will also work with Win32 applications. The techniques I'm showing you, however, are not designed to work with Web applications.
After launching the application under test, my script obtains a handle to the application itself and the child controls on the application. It then simulates a user-click on the application's Search button. The application responds to the button click with an error message box. My script locates the message box and simulates a click on the OK button in the message box.
The next part of the automation sends text "Product Name" to the ComboBox control (to specify how the application should search) and "ad" to the TextBox control (to specify a search target). The script clicks the search button again and this time the application searches through a store of dummy data, displaying matches in a ListBox control.
My script finishes by checking the state of the application (in this case, looking at the contents of the ListBox and TextBox controls). It determines whether the test scenario passed or failed, and then performs a user-click on the application File | Exit menu item.
The Application Under Test
As I mentioned, the dummy application under test is .NET-based. It has basic user controls, including a TextBox control, a ComboBox control, a Button control, two Label controls, and a ListBox control. I often use this dummy application to explore different UI automation techniques. Note that the complete code for this application, along with all the other code referenced in this column, is available in the download that accompanies this column.
When performing UI automation, you must know the window name or the control index number of any control you want to manipulate or examine programmatically. Bear with me for a moment because you may not be familiar with this concept. All user controls are windows. To manipulate a control you need a handle to it. Some user controls have window names, but many don't. In my dummy application, the window name of the application itself is Form1, while the TextBox control has no window name. If a control has a window name, you can use that name to programmatically get a handle to the control. However, when a control does not have a window name, you can use the control index number.
Control index numbers are implied rather than explicit, and they are determined by the order in which the controls are added to the main Form in the InitializeComponent method. If you look at the source code of my dummy application, you'll see:
    this.Controls.Add(this.listBox1);
    this.Controls.Add(this.button1);
    this.Controls.Add(this.textBox1);
    this.Controls.Add(this.comboBox1);
    this.Controls.Add(this.label2);
    this.Controls.Add(this.label1);
The control index for the parent Form window ("this") is 0, so the ListBox control has control index 1, the Button has index 2, the TextBox has index 3, the ComboBox has index 4, Label2 has index 5, and Label1 has index 6. It's important to note that the implied control index value of a control is not the same as the tab order property of the control.
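The numbering rule can be written out as a few lines of Windows PowerShell. This is only an illustrative sketch: the $addOrder and $controlIndex names are mine, and it assumes, as described above, that the index simply follows the order of the Controls.Add calls with the parent Form at index 0.

```powershell
# Order of the Controls.Add calls in InitializeComponent (from the listing above).
$addOrder = 'listBox1', 'button1', 'textBox1', 'comboBox1', 'label2', 'label1'

# The parent Form occupies index 0, so the first control added gets index 1.
$controlIndex = @{}
for ($i = 0; $i -lt $addOrder.Length; $i++) {
    $controlIndex[$addOrder[$i]] = $i + 1
}

$controlIndex['textBox1']    # 3
$controlIndex['comboBox1']   # 4
```

A lookup table like this can make a test script easier to read than bare index numbers, at the cost of having to update it whenever the order of the Controls.Add calls changes.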
In this scenario, I can determine the index values for all the controls since I have access to the application source code. If the source code is not available, you can still determine control index values using the Spy++ tool that ships with Visual Studio®.
Realistically, a search application would most likely retrieve search results from a database server. For simplicity, however, I use a local data store in the form of a simple Product class:
    public class Product
    {
        public readonly string ID, Name, Price;
        public Product(string id, string name, string price)
        {
            ID = id;
            Name = name;
            Price = price;
        }
    }
Data for Product instances is stored in a collection and supplied when the main Form object loads:
    private List<Product> list = new List<Product>();
    private void Form1_Load(object sender, System.EventArgs e)
    {
        list.Add(new Product("111", "Widget", "$11.11"));
        list.Add(new Product("222", "Gadget", "$22.22"));
        list.Add(new Product("333", "Thingy", "$33.33"));
    }
Since I'm performing UI automation, the data source is to some extent irrelevant. Regardless of where the underlying result data comes from, the results are reflected in the application's UI and the UI state can be examined in order to determine a test scenario pass/fail result.
All the application logic occurs in the search button control's click method. The application begins by performing a rudimentary check to see if the user has specified a search criterion and a search target. If either item is missing, the application displays a MessageBox window with an error message:
    if (comboBox1.Text == "" || textBox1.Text == "")
    {
        MessageBox.Show("Please enter search criteria and term", "Error");
    }
    listBox1.Items.Clear();
MessageBox windows are not part of an application's Form object and dealing with them is a basic UI test automation technique. After the application clears the ListBox control (which holds the search results), the application fetches the search target and search criterion strings from the TextBox and ComboBox controls and then traverses the List data store looking for matches:
    string target = textBox1.Text, criterion = comboBox1.Text;
    foreach (Product p in list)
    {
        if (criterion == "Product ID")
        {
            if (p.ID.IndexOf(target) >= 0)
            {
                listBox1.Items.Add(p.ID + " " + p.Name + " " + p.Price);
            }
        }
        else if (criterion == "Product Name")
        {
            // search by product name similarly
        }
    }
In the interest of keeping my code short and understandable for this column, I am not following every best coding practice. Of course, crude code such as this approximates the unrefined nature of an application in its early stages of development, which is often the situation when you should be starting to test the application.
Creating a Custom Cmdlet Library
Now I will step you through the creation of a small set of custom cmdlets. In particular, I will implement the eight UI-related cmdlets listed in Figure 1. The approach I am using here is just one of several possible design options you can take. (I will discuss a few of the alternatives in a moment.)
First, I launch Visual Studio and create a new C# Class Library project named CustomUICmdletsLib, which creates by default a namespace with the same name. The choice of namespace is arbitrary, but the Windows PowerShell documentation suggests naming custom cmdlet libraries as XXX.Commands, where the XXX represents your organization or the library functionality. The overall structure of my custom UI cmdlet library is shown in Figure 2.
Figure 2 Custom UI Cmdlet Library Structure
    using System;
    using System.Management.Automation;
    using System.ComponentModel;
    using System.Configuration.Install;
    using System.Runtime.InteropServices;
    namespace CustomUICmdletsLib
    {
        [Cmdlet(VerbsCommon.Get, "Window")]
        public class GetWindowCommand : Cmdlet { . . . }
        [Cmdlet(VerbsCommon.Get, "Control")]
        public class GetControlCommand : Cmdlet { . . . }
        [Cmdlet(VerbsCommon.Get, "ControlByIndex")]
        public class GetControlByIndexCommand : Cmdlet { . . . }
        [Cmdlet(VerbsCommunications.Send, "Chars")]
        public class SetCharsCommand : Cmdlet { . . . }
        [Cmdlet(VerbsCommunications.Send, "Click")]
        public class SetClickCommand : Cmdlet { . . . }
        [Cmdlet(VerbsCommon.Get, "ListBox")]
        public class GetListBoxCommand : Cmdlet { . . . }
        [Cmdlet(VerbsCommon.Get, "TextBox")]
        public class GetTextBoxCommand : Cmdlet { . . . }
        [Cmdlet(VerbsCommunications.Send, "Menu")]
        public class SetMenuCommand : Cmdlet { . . . }
        [RunInstaller(true)]
        public class LibPSSnapIn : PSSnapIn { . . . }
    }
My library has a project reference to the assembly contained in the System.Management.Automation.dll file. This DLL is not one of the normal .NET Framework DLLs that ship with Visual Studio; instead, it's typically available as part of the Windows Software Development Kit (available at microsoft.com/downloads), or in some cases the System.Management.Automation.dll is included with the Windows PowerShell download itself.
The System.ComponentModel namespace and the System.Runtime.InteropServices namespace are available by default to a C# Class Library project so all you have to do is add the corresponding using statements. These namespaces are required for the LibPSSnapIn class and P/Invoke code, respectively.
You must add a project reference to the System.Configuration.Install.dll assembly; this is part of the standard .NET Framework assembly set so you will find it listed in the Add | Project Reference | .NET GUI dialog box. You may have noticed that there are nine classes in Figure 2—one for each custom cmdlet and one special snap-in class that is used to register the custom cmdlets.
My first custom cmdlet, get-window, accepts the name of a top-level window (such as an application Form) and returns a handle to the window/application. The code is shown in Figure 3.
Figure 3 Custom get-window Cmdlet
    [Cmdlet(VerbsCommon.Get, "Window")]
    public class GetWindowCommand : Cmdlet
    {
        [DllImport("user32.dll", CharSet=CharSet.Auto)]
        static extern IntPtr FindWindow(
            string lpClassName, string lpWindowName);
        private string windowName;
        [Parameter(Position = 0)]
        public string WindowName
        {
            get { return windowName; }
            set { windowName = value; }
        }
        protected override void ProcessRecord()
        {
            IntPtr wh = FindWindow(null, windowName);
            WriteObject(wh);
        }
    }
The first line of code uses the Windows PowerShell Cmdlet attribute to implicitly name my custom cmdlet. Windows PowerShell cmdlets follow a verb-noun naming convention and use verbs from a standard list. So I name my cmdlet get-window. The class itself inherits from a base Cmdlet class that does most of the plumbing for you, making it very easy to write custom cmdlets. (The name of the implementation class does not need to be related to the name of the cmdlet that is being implemented, but it's a good idea to use a consistent class-naming scheme.)
Next, I specify a single parameter named windowName for my custom cmdlet. I do not use the Class Name parameter of the FindWindow function. The Class Name of a window object is an internal Windows category—it's not related to the C# "class" language feature—and therefore it is not useful for identifying the window.
I provide get and set properties for my custom cmdlet input parameter, decorating the properties with a required Parameter attribute and specifying the zero-based index position of the parameter. (I could also supply a Mandatory argument to add some error-checking.) All the actual work in a custom cmdlet is performed in a special ProcessRecord method. Here I simply call my C# FindWindow alias, which in turn calls the underlying Win32 FindWindow function and fetches the handle of the window with the specified name. Since Windows PowerShell uses the pipeline architecture, I use the special WriteObject method to return the handle value (rather than returning the window handle value with a "return" keyword).
The get-ControlByIndex cmdlet is slightly tricky—the code is shown in Figure 4. Here I use the FindWindowEx Win32 function to get a handle to a control based on the control's implied index value. Note that FindWindowEx accepts four arguments—the second argument is a handle that tells FindWindowEx at which window/control to begin looking. By repeatedly passing in the return value from the previous call to FindWindowEx, I effectively advance one window handle on each iteration through the do...while loop in Figure 4. I stop iterating when the local ct variable reaches the value of index, which is passed in as an argument.
Figure 4 Custom get-controlByIndex Cmdlet
    [Cmdlet(VerbsCommon.Get, "ControlByIndex")]
    public class GetControlByIndexCommand : Cmdlet
    {
        [DllImport("user32.dll", CharSet = CharSet.Auto)]
        static extern IntPtr FindWindowEx(IntPtr hwndParent,
            IntPtr hwndChildAfter, string lpszClass, string lpszWindow);
        private IntPtr handleParent;
        private int index;
        [Parameter(Position = 0)]
        public IntPtr HandleParent
        {
            get { return handleParent; }
            set { handleParent = value; }
        }
        [Parameter(Position = 1)]
        public int Index
        {
            get { return index; }
            set { index = value; }
        }
        protected override void ProcessRecord()
        {
            if (index == 0)
            {
                WriteObject(handleParent);
            }
            else
            {
                int ct = 0;
                IntPtr result = IntPtr.Zero;
                do
                {
                    result = FindWindowEx(handleParent, result, null, null);
                    if (result != IntPtr.Zero)
                        ++ct;
                } while (ct < index && result != IntPtr.Zero);
                WriteObject(result);
            }
        }
    } // class
My remaining six custom cmdlets follow the same general design pattern as the get-window and get-controlByIndex cmdlets. All eight custom cmdlets are summarized in Figure 5 (the code download includes complete source code for all of these).
Figure 5 Summary of Custom Cmdlets for UI Automation
Cmdlet Name | Input Parameters | Return Value / Effect |
---|---|---|
get-window | windowName | Handle to a top-level window. |
get-control | handleParent, controlName | Handle to a named control. |
get-controlByIndex | handleParent, index | Handle to a non-named control. |
send-chars | handleControl, s | Sends string s to a control. |
send-click | handleControl | Clicks a control. |
get-listBox | handleControl, target | Zero-based location of target or -1. |
get-textBox | handleControl | Contents of TextBox as a string. |
send-menu | mainIndex, subIndex | Fires application menu command. |
Admittedly, my choice of cmdlet names is rather terse; most of my colleagues prefer more descriptive cmdlet names. For example, you might want to rename get-window to something like get-automationWindow or get-uiAutoWindowHandle. (Fortunately, you can provide aliases for custom cmdlets with long names.)
Every custom cmdlet library in Windows PowerShell must implement a special snap-in class that enables the custom library to be registered with Windows PowerShell. My snap-in class is as follows:
    [RunInstaller(true)]
    public class LibPSSnapIn : PSSnapIn
    {
        public LibPSSnapIn() : base() { }
        public override string Name
        {
            get { return "LibPSSnapIn"; }
        }
        public override string Vendor
        {
            get { return "James McCaffrey"; }
        }
        public override string Description
        {
            get { return "Provides cmdlets for lightweight UI automation"; }
        }
    }
A detailed discussion of Windows PowerShell snap-ins is outside the scope of this column. For now, you can just take this code and replace the string values as appropriate.
After writing my custom cmdlet library, I build the Visual Studio project, which creates a single CustomUICmdletsLib.dll file in my output directory. Then I must register and enable the library. There are several ways to do this. My preferred method is to create a small Windows PowerShell registration function and add it to a startup script that automatically executes every time a new instance of Windows PowerShell is launched. (Actually there are four different Windows PowerShell startup scripts.)
You can edit or create the startup script by entering "notepad $profile" on the Windows PowerShell command line. Figure 6 shows the startup script I used when capturing the screen displayed in Figure 1.
Figure 6 Registering the Custom Cmdlet Library
    # file: Microsoft.PowerShell_profile.ps1
    function RegisterUILib
    {
        write-host "registering custom cmdlets for UI automation`n"
        $env:path += ";C:\Windows\Microsoft.NET\Framework\v2.0.50727"
        sl 'C:\UIautomationWithPowerShell\CustomUICmdletsLib\bin\Debug'
        installutil.exe CustomUICmdletsLib.dll | out-null
        add-pssnapin LibPSSnapin
        write-host "get-window, get-control, get-controlByIndex, send-chars, send-click"
        write-host "get-listBox, send-menu, and get-textBox custom cmdlets are enabled `n"
    }
    RegisterUILib # invoke function
    set-location C:\UIautomationWithPowerShell
    # end startup script
In the script, I create a function to encapsulate my custom Windows PowerShell cmdlet registration code (I could have just as easily used individual Windows PowerShell statements). I like to place descriptive write-host statements in my startup scripts because I often change my startup profile and the messages let me know exactly what special functionality my current shell has available.
Next, I use the built-in Windows PowerShell $env:path variable to add the location of a special utility file named installutil.exe to my shell path variable. I then set the current directory location to the location of my custom cmdlet DLL file. I invoke the installutil.exe program and pipe its output to out-null to eliminate noisy progress messages.
The last key command uses the built-in add-pssnapin cmdlet to call my LibPSSnapIn class defined inside my custom library. Note that this code merely defines a function. After defining the function, I invoke it in the startup script by its name.
Using the Custom Cmdlet Library
With my custom Windows PowerShell cmdlets in place, using the library is very easy. Now I will walk you through the script that generated the output shown in Figure 1. The first few lines of my test automation script are as follows:
    # testScenario.ps1
    write-host "`nBegin UI automation with PowerShell test"
    $pass = $true
    write-host "`nLaunching application to automate"
    invoke-item '.\TheAppToTest\bin\Debug\TheAppToTest.exe'
    [System.Threading.Thread]::Sleep(2000)
After I begin with a Windows PowerShell comment, I use the write-host cmdlet to print a message to my shell. The `n is a Windows PowerShell escape sequence for an embedded newline character.
Next, I set a variable named $pass to true. My logic here is that I assume the test scenario will pass, and I will set $pass to false if some application state does not meet an expected value. Windows PowerShell variable names begin with the "$" character, making them easy to distinguish inside scripts. The $true variable is a special built-in Windows PowerShell variable.
After displaying a progress message using write-host, I use the built-in invoke-item cmdlet to launch the application under test. Note that Windows PowerShell uses both single quotes and double quotes (single-quote strings are literals while double-quoted strings allow evaluation of embedded escape sequences and variables).
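The difference between the two quoting styles is easiest to see side by side. The following is a minimal sketch; the $name, $literal, and $expanded variables are used only for illustration.

```powershell
$name = 'Form1'

# Single quotes: the text is taken literally, nothing is evaluated.
$literal = 'Window is $name`n'

# Double quotes: $name is replaced by its value and `n becomes a newline.
$expanded = "Window is $name`n"

write-host $literal
write-host $expanded
```

This is why the path passed to invoke-item can safely sit in single quotes, while the logging messages that contain `n need double quotes.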
Next, I directly call the static Sleep method of the .NET System.Threading.Thread class. (Windows PowerShell provides full access to the .NET Framework. Very cool!) Here I pause my script for two seconds (2000 milliseconds) to give the application time to launch.
Now I obtain handles to the main application Form and three of the Form's child controls, like so:
    $app = get-window "Form1"
    $cb = get-controlByIndex $app 4
    $btn = get-control $app "Search"
    $tb = get-controlByIndex $app 3
In the first line, I use the custom get-window cmdlet to get a handle to a top-level window. And I use the custom get-controlByIndex cmdlet to get handles to controls that have no Window Name property ($cb for the ComboBox and $tb for the TextBox). To get a handle to a named control ($btn for the search button), I use the custom get-control cmdlet.
Next, I display my window handles to make sure I don't have any null or duplicate values, and then use my custom send-click cmdlet to simulate a click on the search button:
    write-host "`nHandle to Application is " $app
    write-host "Handle to ComboBox is" $cb
    write-host "Handle to Button is " $btn
    write-host "Handle to TextBox is " $tb
    write-host "`nClicking on search button"
    send-click $btn
(In a production scenario, you should probably check for bad handle values programmatically.) The initial click on the search button will generate an error message, as shown in Figure 7, because both the search criterion on the ComboBox control and the search target in the TextBox control are empty.
Figure 7 Application Error MessageBox
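The programmatic check for bad handle values suggested earlier could look something like the sketch below. The test-handles function is a hypothetical helper of my own and is not part of the code download; it assumes that a bad handle is one that is zero or that duplicates another handle.

```powershell
# Hypothetical helper (not in the article's download): returns $true only
# when every handle is nonzero and no two handles have the same value.
function test-handles([hashtable] $handles) {
    $ok = $true
    $seen = @{}
    foreach ($name in $handles.Keys) {
        # Compare handles by their numeric value.
        $value = ([IntPtr] $handles[$name]).ToInt64()
        if ($value -eq 0) {
            write-host "Bad handle: $name is zero"
            $ok = $false
        }
        elseif ($seen.ContainsKey($value)) {
            write-host "Duplicate handle: $name matches $($seen[$value])"
            $ok = $false
        }
        else {
            $seen[$value] = $name
        }
    }
    return $ok
}

# Possible use in the test script, once the handles have been obtained:
# if (-not (test-handles @{ app = $app; cb = $cb; btn = $btn; tb = $tb })) { $pass = $false }
```

Whether a bad handle should fail the scenario or abort the script outright is a design choice; aborting is usually safer because every later step depends on the handles.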
The next few commands in my script deal with the modal MessageBox window:
    write-host "Finding Error Message box"
    [System.Threading.Thread]::Sleep(2000)
    $mbox = get-window "Error"
    write-host "Clicking Message Box away"
    $ok = get-control $mbox "OK"
    send-click $ok
I put a two-second delay in my script to make sure the MessageBox is fully available, and then I get a handle to the window using my custom get-window cmdlet. (Remember that MessageBox windows are separate top-level windows and not child windows of the application that generated them.) I use get-control to get a handle to the OK button control on the MessageBox and then click the MessageBox away.
My next few statements send strings to the ComboBox and TextBox controls:
    write-host "`nSending 'Product Name' to ComboBox"
    send-chars $cb "Product Name"
    write-host "Sending 'ad' to TextBox"
    send-chars $tb "ad"
    write-host "Re-clicking search button"
    send-click $btn
After supplying search criterion and target values, I re-click the search button to cause the application under test to search for all products where the product name contains "ad" and to display results in the ListBox control.
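To work out what the ListBox should contain after this search, the application's matching logic can be mirrored against the dummy data in a few lines of Windows PowerShell. This is only a sketch for deriving the expected state: the $products, $target, and $expected names are mine, and it assumes that the elided "Product Name" branch of the application works the same way as the "Product ID" branch.

```powershell
# Dummy data copied from the application's Form1_Load method.
# Single quotes keep the $ in the prices from being treated as a variable.
$products = @(
    @{ ID = '111'; Name = 'Widget'; Price = '$11.11' },
    @{ ID = '222'; Name = 'Gadget'; Price = '$22.22' },
    @{ ID = '333'; Name = 'Thingy'; Price = '$33.33' }
)

$target = 'ad'

# Assumption: the product-name search mirrors the product-ID search,
# using IndexOf to test whether the name contains the target.
$expected = @($products |
    where-object { $_.Name.IndexOf($target) -ge 0 } |
    foreach-object { $_.ID + ' ' + $_.Name + ' ' + $_.Price })

$expected   # 222 Gadget $22.22
```

Only "Gadget" contains "ad", so the single expected ListBox entry starts with the product ID 222.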
I pause two seconds and then examine the ListBox control:
    write-host "`nChecking contents of ListBox for '222'"
    [System.Threading.Thread]::Sleep(2000)
    $lb = get-controlByIndex $app 1
    write-host "Handle to ListBox is " $lb
    $result = get-listBox $lb "222"
My custom get-listBox cmdlet returns -1 if a target string is not found, or the zero-based index of the location of the target if it is found. So, I can check the value in the $result variable like this:
    if ($result -ge 0) {
        write-host "Found '222' in ListBox!"
    }
    else {
        write-host "Did NOT find '222' in ListBox"
        $pass = $false
    }
A slightly unusual quirk of Windows PowerShell is that its comparison operators are written as dash-prefixed abbreviations, such as -ge ("greater than or equal to") and -eq ("equal to"), rather than as symbols, such as ">=" and "==".
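A few one-line examples show how these operators behave. This is a small sketch in which the $result value is hard-coded purely for illustration.

```powershell
$result = 1

$result -ge 0      # True:  "greater than or equal to"
$result -eq -1     # False: "equal to"
'ad' -eq 'AD'      # True:  string comparison ignores case by default
'ad' -ceq 'AD'     # False: the c prefix makes the comparison case-sensitive

# The ">" symbol is the output redirection operator, so the following
# would write to a file named 0 instead of comparing:
#   $result > 0
```

The case-insensitive default matters for UI checks: if the exact casing of a control's text is part of what you are verifying, use -ceq rather than -eq.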
Now I use the custom get-textBox cmdlet to fetch the contents of the search target:
    write-host "Checking contents of TextBox for 'ad'"
    [System.Threading.Thread]::Sleep(2000)
    $text = get-textBox $tb
Then I check to see if the application correctly left the TextBox control alone:
    if ($text -eq "ad") {
        write-host "Found 'ad' in TextBox!"
    }
    else {
        write-host "Did NOT find 'ad' in TextBox"
        $pass = $false
    }
Now I can examine my $pass variable to see if it is still set to true or if it has been set to false due to an incorrect application state:
    if ($pass) {
        write-host "`nTest scenario result = Pass" -foregroundcolor green
    }
    else {
        write-host "`nTest scenario result = * FAIL *" -foregroundcolor red
    }
I finish by calling my custom send-menu cmdlet to simulate a File | Exit to close the application:
    write-host "`nClicking File -> Exit in 5 seconds . . ."
    [System.Threading.Thread]::Sleep(5000)
    send-menu $app 0 0
    write-host "`nEnd UI automation with PowerShell test`n"
    # end script
The first argument to the send-menu cmdlet is the handle to the application window. The two 0 arguments that follow mean to use the zero-indexed main menu item (File) and then the zero-indexed sub-item (Exit).
Using, Modifying, and Extending the UI Cmdlet Library
The techniques I've presented here are most appropriate for use in very lightweight automation situations. When you want some quick and easy UI automation, using Windows PowerShell with a small custom UI cmdlet library is a great approach. These techniques are also appropriate for when you want to perform interactive style UI automation from the Windows PowerShell command line. But when writing relatively complex UI automation, you are generally better off using Visual Studio to create C# programs that use the P/Invoke mechanism.
The custom cmdlet library I've presented in this article was specifically designed to allow you to easily modify and extend my code. In particular, I've removed all error-checking to keep my code short and easy to understand. As it stands, my library has custom cmdlets that will allow you to perform quite a bit of common UI automation, but there's no reason why you can't add cmdlets that work with other user controls, such as CheckBox and RadioButton controls. Just don't get too carried away here: complete UI automation libraries that take into account the majority of UI scenarios are very large. Lightweight UI automation is for performing mundane, repetitive tasks, freeing you up to perform manual testing on more interesting and complex UI scenarios.
You may also want to extend my UI automation by creating Windows PowerShell functions that call the core custom cmdlets so that you can provide error checking and additional functionality. For example, my sample uses a very crude approach to deal with waiting for the application under test to launch: I just use a Thread.Sleep call before calling the get-window cmdlet. A much more robust approach is to write a short Windows PowerShell function containing a loop that calls get-window, with a short delay between attempts, until get-window returns a nonzero handle (the underlying FindWindow function returns a zero handle when no matching window exists) or until some maximum number of attempts has been reached.
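One possible shape for such a retry function is sketched below. The wait-handle name, its parameters, and the default values are my own assumptions rather than part of the code download. The function takes the probing command as a script block so that the same loop can wait for any window or control.

```powershell
# Hypothetical helper (not in the article's download). Runs $probe repeatedly
# until it yields a nonzero handle or $maxTries attempts have been made.
function wait-handle([scriptblock] $probe, [int] $maxTries = 20, [int] $delayMs = 100) {
    for ($attempt = 1; $attempt -le $maxTries; $attempt++) {
        $h = & $probe
        if ($h -ne $null -and ([IntPtr] $h).ToInt64() -ne 0) {
            return $h    # found a usable handle
        }
        [System.Threading.Thread]::Sleep($delayMs)
    }
    return [IntPtr]::Zero    # caller should treat zero as "not found"
}

# Possible use in the test script, assuming the custom cmdlets are registered:
# $app = wait-handle { get-window "Form1" } 50 100
```

Compared with a fixed two-second sleep, a loop like this returns as soon as the window appears and fails in a predictable way when it never does.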
Send your questions and comments for James to [email protected].
Dr. James McCaffrey works for Volt Information Sciences, Inc., where he manages technical training for software engineers who work for Microsoft. He has worked on several Microsoft products, including Internet Explorer and MSN Search. James is the author of .NET Test Automation Recipes (Apress, 2006). He can be reached at [email protected] or [email protected].