The WordPress Jetpack Plug-In
I recently installed the WordPress Jetpack plug-in, as part of a suite of plug-ins evolved by Vanessa Chau. Jetpack leverages WordPress.com infrastructure to provide much WordPress.com functionality to WordPress blog engine instances hosted elsewhere, like this one, which happens to be hosted by ICDSoft. Installation was so easy and fast that I don’t remember it! But when I tried to slip the Ring of Power on my finger, by clicking Connect to WordPress.com, I got the dreaded “site_inaccessible” error. Soon I learned how to modify the “.htaccess” file to fix this. Your mileage will almost certainly vary, but your experience may resemble what I describe here.
Another software QA video recommended by Gilberto Castañeda is “Your Path to Data-Driven Quality,” presented by Microsoft’s Seth Eliot. (This video is actually from an earlier presentation.) Eliot delivered this presentation at the April 1-3, 2014 Seattle ALM Forum. The slides Eliot presented are available as PDF and PowerPoint files.
I confess that I had never heard of Eliot before Gilberto brought him up; but Eliot’s title commands respect! He’s Microsoft’s “Principal Knowledge Engineer, Test Excellence.” I’ll bet that looks nice on a business card. He works with services, cloud delivery, and data-driven products. Before his Microsoft gig, Eliot spent some time at Amazon.com. I can’t help but wonder if his stay there overlapped that of my late friend Paul S. Davis; but that was really some time ago.
Eliot proposes to give us a road map, but aphoristically points out that this will not be enough: we also need roads. He’s going to propose “a way to get there” which listeners can apply to their particular environments.
A general impression: This presentation owes much to Alan Turing. To Eliot, everything is data; code is data. And so are management dictates….
The lowest form of data-driven quality (soon to be abbreviated as “DDQ”) is “HiPPO-driven,” based on the Highest Paid Person’s Opinion. By the way, Eliot claims to have tested HiPPO-driven decisions using data gathered online, finding that one-third were wrong, one-third were right, and one-third had no noticeable impact.
Let’s go to a higher level of DDQ. You can use scoring engines to apply Bayesian analysis to historical data. Frankly, this is not Eliot’s forte. He’s more interested in a real-time approach: “testing in production” (TiP), using production data.
“It’s not as difficult as you might think.”
That’s good to know! Because guess what everyone in Eliot’s (small) audience, and I, were tensing up over!
Why would we want to test in production? Because real users are surprising: they do weird stuff. Environments can be chaotic and uncontrolled. Fine, but I’m waiting for the other shoe to drop, and it’s got to be this: What’s implicit here is that a more structured, anticipatory testing approach will not encompass this scope of weirdness; and that this is the swamp from which things will crawl and bite you in the ass.
Like any good road map, Eliot’s is very compact. The bare outline on his slide (at 7:20) needs the context of his explanation and the following slides, so I won’t bother to reproduce its bullet items here. For novices, lack of familiarity means that the hard parts—the magic—will be in designing for production-quality data, selecting data sources, and using the right tools. There’s also a circular or cyclical quality to this, which the linear list can only enumerate as answering your questions and learning new questions. (Eliot adds a seventh bullet and graphical overlay to make the loop explicit at about 7:26.)
Next, Eliot shows a slide full of widgets, illustrating the kind of real-time production data available at Microsoft. Old school Mission Control fans will not be disappointed.
Eliot’s road map begins with something that seems easy: Defining your questions. Why not just plunge in? Why not just get data and try to draw correlations? Because they would be unhelpful, like the statistical correlation between using sunblock and being more likely to drown. Kids, don’t try this at home! Plunging into the data is for advanced users.
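To see how a correlation like the sunblock one can appear out of nowhere, here is a minimal sketch in Python. The counts are invented for illustration (they are not Eliot's data), and the hidden factor, whether people spent the day at the beach, is my assumption about what drives both sunblock use and trouble in the water.

```python
# Hypothetical counts, invented for illustration: (people, water incidents),
# keyed by where people spent the day and whether they used sunblock.
counts = {
    ("beach", "sunblock"): (8000, 80),
    ("beach", "no sunblock"): (2000, 20),
    ("elsewhere", "sunblock"): (1000, 1),
    ("elsewhere", "no sunblock"): (9000, 9),
}

def rate(rows):
    """Incident rate across a list of (people, incidents) pairs."""
    people = sum(p for p, _ in rows)
    incidents = sum(i for _, i in rows)
    return incidents / people

def pooled_rate(sunblock):
    """Rate for one sunblock group, ignoring where people were."""
    return rate([v for (_, s), v in counts.items() if s == sunblock])

def stratum_rate(place, sunblock):
    """Rate for one sunblock group within a single location."""
    return rate([counts[(place, sunblock)]])

# Pooled view: sunblock users look far more likely to get into trouble.
naive_ratio = pooled_rate("sunblock") / pooled_rate("no sunblock")
print(f"pooled: sunblock users look {naive_ratio:.1f}x more at risk")

# Split by location: the apparent effect of sunblock disappears.
for place in ("beach", "elsewhere"):
    ratio = stratum_rate(place, "sunblock") / stratum_rate(place, "no sunblock")
    print(f"{place}: ratio {ratio:.1f}")
```

In this made-up table, sunblock makes no difference once you know where someone was. The pooled numbers only look alarming because beachgoers both use more sunblock and spend more time in the water. Defining the question first ("does sunblock change risk for people in the same place?") is what tells you to split the data this way.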
How about an example, Seth? What questions does Microsoft ask about Exchange Online? “Is the application available?” (Some of us may yet think of this as “dial tone.”) This is important, because when the application’s not available, it can stop the user’s work.
Note that the user’s perception of availability can be more subtle than the provider’s.
Eliot’s first example of this is an occasion when the Japanese version of Exchange Online silently failed and loaded labels in English. From the server point of view, the application worked; but most Japanese users were cast adrift.
Slow response may also make an application as good as dead to users, in a way that is not so clearly visible from the server end. Eliot shows a graph of user abandonment vs. time waiting to start a video stream (13:37). About five seconds of delay is enough to get rid of almost everyone.
Except mobile users, who have been conditioned to be much more patient.
Perhaps these users remember the ground crew members who did not immediately let go when a gust of wind caught the flying aircraft carrier USS Akron…and who can blame them? Let go while you still can!
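For readers who want to build an abandonment curve like the one described above from their own logs, here is a rough Python sketch. The session records are made up, and the one-second bucket size and field layout are my assumptions, not anything Eliot showed.

```python
# Hypothetical real-user samples: seconds each viewer waited for a video to
# start, and whether they gave up before it did. Invented for illustration.
sessions = [
    (0.5, False), (0.8, False), (1.2, False), (1.9, False), (2.4, True),
    (2.8, False), (3.1, True), (3.7, True), (4.2, True), (4.9, True),
    (5.5, True), (6.3, True),
]

def abandonment_by_wait(samples, bucket_seconds=1.0):
    """Return {bucket_start: fraction abandoned} for each wait-time bucket."""
    totals, abandoned = {}, {}
    for wait, gave_up in samples:
        # Round each wait down to the start of its bucket.
        bucket = int(wait // bucket_seconds) * bucket_seconds
        totals[bucket] = totals.get(bucket, 0) + 1
        abandoned[bucket] = abandoned.get(bucket, 0) + int(gave_up)
    return {b: abandoned[b] / totals[b] for b in sorted(totals)}

curve = abandonment_by_wait(sessions)
for start, fraction in curve.items():
    print(f"waited {start:g}s or more: {fraction:.0%} of that bucket abandoned")
```

A server-side health check would report every one of these sessions as a success. Only the per-user wait times show where people start walking away.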
The advantage of production testing in situations like these “is manifest.” And here’s where the paradigm-shifting light bulb really began to turn on for me, as I began to understand how dramatically different Eliot’s approach is from what I’ve spent most of my testing time doing. Eliot wants to watch a dashboard showing distributed, real-time application behavior; whereas my peers and I explore the territory bounded by anticipating user behavior, and evaluating the consistency of database tables. (These intersect Eliot’s worldview as means of acquiring “active” data, as opposed to “passive,” real user data: a distinction he introduces at 18:33, discussed below.) When you click a box offering to participate in the Windows Customer Experience Improvement Program, you’re volunteering data for Eliot’s production testing pipeline, or perhaps for another one much like it.
A thumbnail sketch adapted from one of Eliot’s graphs sums up how active data—all the stuff you and your co-workers gather using your made-up test cases and jury-rigged fake data—yields to passive data as the application goes into production and, with any luck, as the users pile on like the Clampett family on Jed’s old truck.
Eliot’s “active”/”passive” semantics confused me at first. When you work at Microsoft scale, surely the application is hopping up and down in production in a way it never does in sterile testing environments. You can beat on it with load tests, but to get the noise level and surprises of real users and deployment environments that time forgot? But Eliot’s choice of adjectives is drawn from the tester’s perspective, as it really should be. Active data is what you go out and hunt. Passive data…you can catch that in drift nets.
By the way, Eliot’s schematic curves of active and passive tests are not meant to reflect the natural order of things: what happens when you sit back and watch nature take its course. “Staged data acquisition mitigates risk.” Of course! You’d stage a deployment of any large, outward-facing system, right? First the internal users, then maybe some friendly beta testers, and so on. Data acquisition goes hand in hand.
We knew that in our guts, right? Sometimes it’s important to make these things explicit.
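In that spirit, here is one way the staging might be made explicit in code. This is a sketch under my own assumptions, not Microsoft's mechanism: the stage names, the percentages, and the idea of hashing a user id into a stable bucket are all hypothetical.

```python
import hashlib

# Hypothetical rollout stages: the percentage of outside users exposed at each.
# Names and numbers are assumptions made for this example.
STAGES = {
    "internal": 0,    # employees only
    "beta": 5,        # employees plus 5% of outside users
    "general": 100,   # everyone
}

def bucket(user_id: str) -> int:
    """Map a user id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100

def is_exposed(user_id: str, is_employee: bool, stage: str) -> bool:
    """Decide whether this user sees the new build at the given stage."""
    if is_employee:
        return True
    return bucket(user_id) < STAGES[stage]

# The same outside user is treated consistently as the stages widen.
for stage in STAGES:
    print(stage, is_exposed("user-1234", is_employee=False, stage=stage))
```

Because the bucket depends only on the user id, anyone exposed at an early stage stays exposed at every later one. That keeps the passive data from each stage comparable as the audience grows.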
Here Eliot begins to focus on the questions we might bring to testing in production; or rather, the kind of answer we’re looking for. How would we rank availability, performance, and usage? Or: Is there any situation in which availability does not come first?
Yes, Eliot suggests: Twitter and social platforms might value usage and feature adoption over availability, for example.
What scenarios are most important? Is it more important that a user should be able to send email immediately? Or that the product logo display properly? (The Marketing department may have a hard time with this, but I’m going to go with sending email. But maybe there’s some context in which I’d feel differently.)
In any case, think about these things, because they affect the priority of your test scenarios.
I enjoy collecting buzz phrases, and wouldn’t consider ending this overview without recognizing Eliot’s linguistic contributions.
Most of us have heard of “eating your own dog food.” (If you haven’t, it means using your own products. Wikipedia’s got more depth.) Eliot takes this to the next level: “It worked in dog food,” “We dogfooded it.”
In Eliot’s world, these concepts are important enough, and thrown around frequently enough, to rate acronyms:
- DDQ: data-driven quality.
- HiPPO: highest-paid person’s opinion.
- RUM: real user measurement (i.e. acquiring passive data).
- TiP: testing in production.
While writing the post about Aaron Rudger’s wearables presentation, I summarized it for a friend, in an instant messaging encounter. That was enough to provoke my friend into a riff on the dark side. I’ve adapted it here.
Since my friend works in telecommunications, and some of his opinions could be interpreted as expressing less than total enthusiasm for any business venture his employer might conceivably undertake, we’ll call him “Mark Twain.”
[4/20/2015 10:27:45 AM] Mark Twain: Most of that experience is advertising of one form or another. Or surrounded by it. Looking for a word here…suspended in it?
[4/20/2015 10:29:14 AM] Mark Twain: Like a decent movie is suspended in a bath of advertising, like the nutrient bath (or whatever it is) Neo wakes up to find himself immersed in, in the Matrix….
[4/20/2015 10:30:23 AM] Mark Twain: OK, back to making the world more capable of delivering it to all of us on more devices in real time….
[4/20/2015 10:32:16 AM] Mark Twain: But we could start dreaming up a good horror movie (too bad I don’t like those either) about what happens to the nascent amoral machine intelligence of the cloud once we’re all covered in “wearables” and all our appliances talk to each other on the web….
[4/20/2015 10:32:58 AM] Mark Twain: Think of the havoc that might be wrought by connected chainsaws and lawnmowers. I already know guys in Denmark working on the lawnmowers….
I was picturing Dalí-esque gelled (jelloed?) smartwatches devouring debutante wrists like pythons….
[5/9/2015 10:01:18 AM] Mark Twain: Now that would be interesting to test.
I agree! But I wouldn’t undertake such a project without competent legal advice.
This is how it happened, more or less: My colleague Gilberto Castañeda asked what I thought of the SQE/Techwell STAREAST Virtual Conference.
“Oops! I missed it! What was good?”
While I parsed his response to this question, I groped for an algorithm to choose which archived session or sessions to view myself. Since Gilberto hadn’t had much time for the conference, this suddenly got easy.
“Of the sessions you didn’t see, which interested you most?”
I thought I could feed the network by viewing that presentation, and providing a summary; hence, this post. Turns out Gilberto was most curious about Aaron Rudger’s presentation, “Three Ways to Captivate Customers on Wearables.” The synopsis:
Competing for growth is a game of seconds. Superior customer experience is the #1 differentiator for digital success today. Wearable devices and the apps that power them promise to change consumer life and employee productivity. Developers and Quality teams must meet the demanding needs of users whose expectations for a flawless experience are changing.
Aaron Rudger will discuss the three keys to success for delighting customers on smartwatches, smartphones, tablets or digital form-factors yet to be imagined. Aaron will share testing best practices to prepare your apps for a smooth and reliable experience now and in the future.
Buzzword alert! But what else would you expect from a presentation on wearables? Anticipate people who are racing to establish themselves at the forefront of an emerging technology, in a field composed of both agile and massive corporations racing to establish themselves…. A conference speaker on this topic will be surfing on someone else’s bow wave.
Rudger set the scene with rhetorical questions: How do you get high quality results from testing wearables? How do you ensure a high quality user experience?
Rudger equated three models of Apple watches with “Three Horsemen of the Testing Apocalypse.” Smart watches are not a new idea, but change the testing game. These will create a new paradigm for engaging with customers, Rudger said. (Really? There it is: the market for testing software on smartwatches boils down to those who want to use the watches to sell stuff. A friend shared his dystopian reaction while I wrote this.) There will be great pressure to deliver these new experiences.
But Rudger really isn’t here to talk about the scope of wearables. He’s not interested in sensor networks woven in fabric, for example. In fact, he’s drilling right down to the Apple watch. Why?
“It’s all about the money.”
Oh, right: that! Well, then! Let’s get to it.
720,000 Android watches (or “wear devices”) were sold in all of 2014. In contrast, an estimated 1.2 million Apple watches sold during the first weekend of their availability, with 2015 sales forecast at 20 million. Rudger forecasts linear growth, quickly leading to a multi-billion dollar market with concomitant pressures. He also cited a prediction that 2.5 billion smartphones will be active by end of 2016.
Building and running apps for the Apple Watch depends on Apple’s WatchKit. Even if you don’t develop functionality specifically for the Watch, developers will take advantage of its functionality, and of WatchKit. WatchKit affects mobile developers “whether or not they’re in the…game.”
Rudger asserted that the Apple Watch has driven the doubling of iOS release velocity.
By the way, don’t expect Google to sit still and watch while this happens. Android will continue quick evolution.
Power management is critical to apps for wearables; functionality and performance blur in this domain. Quality is tightly coupled with performance.
Rudger said that conventions in this domain are not well-defined. Really? Is this an opportunity for someone with a big mouth? Can’t one work from first principles and by analogy to establish a good first draft? What does usability god Jakob Nielsen have to say? Or Don Norman? Bruce Tognazzini? Has the smartwatch caught people like these flat-footed?
With a perfect sense of timing, while I was writing this post, Nielsen Norman Group Senior Researcher Raluca Budiu weighed in with “The Apple Watch: User-Experience Appraisal.” Her article cites an earlier look at the Samsung Galaxy Gear. Budiu describes Apple Watch GUI successes and failures with reference to broader principles, easily translated to design guidelines. I think we can say that, no, the usability crowd has not been caught flat-footed. But they can’t tell us what the smartwatch killer app is, either. Budiu’s conclusion is to offer some guidelines for those with the nerve to believe they can actually create value for smartwatch users. She can’t tell them where this value lies; only what its delivery vehicle should look and feel like.
Not long after, I came across Nielsen Norman Group principal and usability god Bruce “Tog” Tognazzini’s current draft “First Principles of Interaction Design.” This document also addresses smartwatches.
There may be a cultural or psychological factor at work in not acknowledging these or similar resources, e.g.: “We’re a startup. We don’t have time for that.”
And that brings us back to Rudger’s presentation: How hard will it be to develop good smartwatch apps? What must designers and developers consider? “…The application and experience is going to be contextually based; it’s going to require an extreme amount of curation….”
Does that help you? Me, neither.
In fact, the thrust of this section of the presentation is to reiterate what people have probably been telling each other since at least the invention of written language: Everything is faster and more complex than it’s ever been before. Only the fleetest of foot, Mercury-winged, have a prayer of even standing still (in Jeremy Scott’s styling footwear, no doubt).
Or is it completely true, and I’m just bored with the sound of being swept into the crowded dustbin of history?
But wait: Here’s the plug, and the reason Keynote LLC was happy to have Rudger, its Director of Product Marketing, spend time on this presentation: All of this means we should test in the cloud.
Cloud? What cloud? What does this mean? That we should use email and FTP to work together? Or that we have emulators available online? Oh, good: Here comes a well-timed demo.
Yes, they’re emulators. Not just emulators: emulators developed by Keynote! Like the ones I collected a dozen or two of when developing wireless applications 15 years ago, except that these run on someone else’s hardware, somewhere out there. They’re controllable through a web client.
At least that’s what the naive viewer might believe. But wait! He would be wrong! These are not emulators; Rudger’s showing us an interface to real hardware! (A free trial is available at mobiletesting.keynote.com.) Having multiple devices available at the same time enables easy exploratory testing. One device can send a message to another.
Rudger suggests that a great virtue of this approach is that it does not burden developers with the need to learn new languages. However, it’s amenable to both manual and automated testing (presumably through approaches like Selenium). The approach immediately supports new operating systems.
After about 25 minutes of introduction and presentation, we segue to audience questions and Rudger’s responses. Participants can’t help but feel rewarded for their efforts: Rudger thinks every question posed is “really great.” One that stood out:
Q [paraphrased]: Are wearables a fad?
A: No, because Apple’s involved.
Is that the deepest Rudger can go, when he could assemble a position using components like Moore’s Law?
Q: How many users are on a particular hardware device?
A: Capacity is adequate, and continuously monitored. Keynote adds devices to accommodate contention. There’s a careful process at work here. Of course there is. With any luck, Keynote won’t be as investment-averse as some of its clients are. Don’t make that process too careful!
If you want to see for yourself, you can find Rudger’s presentation online until 8/7/2015. (You’ll have to register.)
This YouTube video was my recent introduction to James Bach, the enfant terrible of the “context-driven school of software testing.” (I thank my colleague Gilberto Castañeda, “El Águila de Tenochtitlan,” for suggesting I watch this video.)
When I write enfant terrible—which I guess means “terrible infant”—I don’t mean an upset child, fists clenched, face red, who stops screaming only to draw breath. We’re talking about an adult here, who on the strength of his reputation, has been invited to speak to students in Estonia. (Does anyone in Estonia even know your name? Not mine.) In this context, we’re talking about someone who upsets some established order by expressing himself enthusiastically and articulately.
I typically find these people quite compelling. Their presence tells us that a discipline has room for finding success without first enduring a Druid-like, decades-long apprenticeship. (Not that Bach advocates ignorance, some kind of Noble Savage approach, or barefoot and wide-eyed laying on of hands. He urges learning all the tools and technologies you can.)
I suspect that when Bach planted himself behind the podium in this college lecture hall, he had a very general outline in mind. In fact, he brought a few slides, which he seems to go through in order. However, he’s not afraid to range far and wide to make his points. The broad outline is defined, but the sentences are JIT. Something he does a lot of is telling stories. (This, too, is an approach that reels me in. The “story” is a way of encapsulating ideas which I can consume like M&Ms. Stories are brain candy.) Much of the presentation consists of Bach’s anecdotes about approaching a software testing situation and turning over rocks to reveal previously invisible ugly stuff.
How does he do it?
This is part of what Bach tries to express with the “context-driven testing” label. He does it by recognizing that the testing context, the context of software or system success or failure, is an inclusive one. It generally covers a lot more territory than the problem description makes clear—where “generally” means something like “while pigs do not fly.” Bach zooms out, or looks at the system from other perspectives, to see potential vulnerabilities. It’s creative. It’s insightful. It provides the stuff of good stories.
One story which stands out as I listen to the presentation for the second time, in the background, is Bach’s summary of Miyamoto Musashi’s Book of Five Rings. Bach summarizes the book as an account of how Musashi survived many duels. How? By using a particular type of sword? A technique? By training with a particular school?
In Bach’s telling, Musashi owed his survival to mastering many weapons (learn all the tools and technologies you can), and then doing whatever was necessary to win.
This will do for now as a metaphor for context-driven testing.
When you hear one of Bach’s stories, and feel that you laugh or thrill with him, this encourages you to believe that you, too, are capable of joining the context-driven brotherhood (a belief Bach explicitly encourages elsewhere), and sharing in these penetrating insights. Bach encourages this perception. He begins with simple, clearly defined situations, giving his audience a chance to warm up and stretch before they hear about more complex situations. In his first example, he shows the students some simple code and then discusses all the situations not captured by the code’s logic. After a little stumbling, Bach has probably closely aligned his audience’s understanding of the example with his own. After that, each additional story reinforces their complicity.
There are many more trivial, discrete, and personal reasons I find Bach really compelling. There’s a whole body of shared memes surfacing in his discussion: The Book of Five Rings. Simulated annealing. Artificial intelligence and knowledge representation, circa 1980s. To name a few.
On the other hand, I’ve bought some of my groceries as an advocate for various processes: the SEI CMMI, XP, and a few things in between. Bach is quick to express his disdain for “best practices” and methodologies which seek to eliminate human “squishiness.”
I’m very interested in how I can use the Selenium framework for some automated testing. Bach acknowledges that this approach has its uses, but belongs in the back seat.
As I write, I’m infatuated with Bach and what I understand of context-driven testing. Will this survive more exposure?