A rant on equating testing with automation

I’m no stranger to coding and automation on one hand, and I have a lot of experience with testing on the other. I understand the upsides of automation and its limitations. I’m fine with automation, I coded / fixed a lot of it myself. But automation is not a substitute for testing. Automation scripts, however sophisticated they are, do not do what humans do and do not perceive products as humans perceive them. Knowing what I do, as a user and as a tester, and what automation does, it makes me angry to see sales pitches claiming that some automated product “does testing”. Again, automation has its upsides and it’s good to use them. But…

Automation does what it is programmed to do. It does not do what it is not programmed to do. Imagine a search field. A programmed check may check that it is on a page. It may send some input to it. That’s fine.

But automation may not be programmed to check that the field:

  • is in the right place
  • is of the right size
  • uses a good font size
  • uses a good font type
  • has the correct color
  • appears on time, not 3 seconds later
  • does not produce any “Easter eggs” when clicked
  • does not move on screen because of some JavaScript animation, slipping away from the mouse click
  • preserves the text I pasted

Quite a few things can go wrong with something as simple as an input field.
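To make the point concrete, here is a minimal sketch in Python. It does not drive a real browser: `SearchField` is a hypothetical stand-in for a rendered field, and `scripted_check` is an invented example of a check that looks only at presence and input. All names and values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class SearchField:
    """Hypothetical stand-in for a rendered search field (not a real UI API)."""
    present: bool = True
    x: int = 40
    y: int = 20
    font_size_px: int = 14
    appears_after_ms: int = 100
    value: str = ""

    def send_keys(self, text: str) -> None:
        # Simulates typing into the field.
        self.value += text


def scripted_check(field: SearchField) -> bool:
    """Checks only what it was programmed to check: presence and input."""
    if not field.present:
        return False
    field.send_keys("kettle")
    return field.value == "kettle"


# A field a human would complain about at once:
# far off-screen, tiny font, showing up 3 seconds late.
broken = SearchField(x=-500, y=9000, font_size_px=4, appears_after_ms=3000)
print(scripted_check(broken))  # True: the check never looked at any of that
```

Every item from the list above would need its own explicit assertion before the script noticed it, and someone has to think of each one first.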

I know a thing or two about automation, but I also know how many things can go wrong for me as a user, and automation does not check for them unless it is specifically programmed to.

As a user, I perceive a complicated reality, in which many unpleasant things can happen with the application. As a guy who can code automation, I know that simple automation does not check everything I perceive.

That is the kind of difference between testing and checking that many people don’t realize, or don’t _want_ to realize.

And, of course, automation can’t fix the problems of well-rounded test coverage and of proper testing in general. It does not have a brain to think for you, your stakeholders and users.

It’s okay to use automation, but one needs to know and bear in mind its limitations.

That’s how I found myself in the humanities

A former online friend of mine ironically used the term “industrial archaeology” to describe dealing with systems like old plants, still working, but poorly documented or not documented at all. He drew a parallel between archaeologists, trying to guess the meaning or usage of old things, and engineers, trying to guess “What does this do? Or that?” “Why was it built this way?” “What were their reasons to construct it like that?”

Currently I’m dealing a lot with legacy things designed about 10 years ago, and I find myself trying to answer the same or similar questions.

First, it’s sad for me, as I thought of myself as a technical person, but this work looks more like the humanities. But it also reminds me of the “anthropologist” parallel of Jerry Weinberg… So I can live with that.

Another thought is that if you need to deal with a system designed by humans, you had better inquire into their intentions. Otherwise you may test / report the wrong things, which is bad, considering that we have limited time for any testing.

A good article on testing and AI

I came across a good article, 10 Things to Consider When Testing Artificial Intelligence, outlining the current possibilities and limitations of what is called AI and the challenges of testing it. It is written by Adam Smith, who has held senior technology roles at Barclays and Deutsche Bank delivering large complex transformation projects, and is CTO of Piccadilly Group, the UK’s leading test and intelligence consultancy for financial services.

Some quotations I’d like to draw special attention to:


General AI or ‘Strong AI’ of the type that can talk to you about anything is decades away. Most of the AI solutions that are going to be available in the near future fit into the category of “Narrow AI”, i.e. they are applied to a small enough problem domain in which the AI problem can be solved

So when faced with a natural language interface, you don’t need to start trying to ask it every question from your general knowledge book, you need to focus on the functionality that it is actually trying to provide.

Machine Learning is a ‘Markov Process’*, that is, an algorithm where the last calculation impacts the next.

Modern machine learning algorithms identify patterns in data, and use those patterns as rules – as heuristics – so rather than being explicitly programmed the algorithm has experiences, learns, and this changes its behaviour in the future.

This topic is too large to cover in this article, but machine learning (a crucial element of AI technologies) is only as good as the data it is trained and tested with.

After all, machine learning essentially relies on averages to come up with the ‘best-guess’ response.

As software becomes more human, and the effort involved in testing becomes increasingly automated, we need to look at our profession carefully. In my opinion, testers are the only people who look at new software critically, before it reaches public use, and with an influence on product development. I think it is key that we evolve our profession to consider the wider societal, legal and ethical influences, before we give our stamp of quality to any technical product. If not us, who exactly?

Notes:

*  A “Markov process” is a type of random process:
https://en.wikipedia.org/wiki/Markov_decision_process
https://en.wikipedia.org/wiki/Markov_chain
Markov chains were used to create things like “human-like” random text generators.
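As an illustration of that last note, here is a minimal sketch of a word-level Markov chain text generator in Python. The function names `build_chain` and `generate` and the sample sentence are my own assumptions, not taken from the article. The next word depends only on the current word, which is the Markov property.

```python
import random
from collections import defaultdict


def build_chain(text: str) -> dict[str, list[str]]:
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain: dict[str, list[str]], start: str, length: int, seed: int = 0) -> str:
    """Walk the chain: each next word is picked using only the current word."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: this word was never followed by anything
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


sample = "the tester found a bug and the developer fixed a bug and the tester smiled"
print(generate(build_chain(sample), start="the", length=10))
```

The output looks vaguely sentence-like, but the generator has no idea what a tester or a bug is. It only replays which word followed which.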

There are also a few talks which described to me the current state of AI and its limitations:

When in between tasks

I was asked what to do if a tester has finished a task and has some spare time.

First, there are some mundane things, like:

  • installing software updates
  • “housekeeping”. I used to create a folder for each task and put the various documents related to it there: notes on the task, notes on tests, data, texts of longer letters, pieces of code. I might want to clean up some and reorganize others. Keeping it like that helped a lot: if I got a similar task later, I had data and some noted thoughts to reuse.

Second, I often have “my personal backlog”:

  • notes on issues which I had briefly noted in a special file/tab to investigate later (I used Todoist as a to-do service and an outliner for text and notes that needed to be in one place)
  • a note or a plan for writing some text (an idea for the team, an explanation of something, an introduction to something I wanted to share)
  • notes on some planned changes like “add a piece of data on interruptions by abcd to the set 1.2.34”
  • I created an additional set of my own test data and shared it with others
  • I could check out some tool that another team member used and that I had had no time for yet. For instance, I came to the company knowing only a bit of SoapUI and left it knowing a bit of Postman

I had ideas on writing my own tools, sometimes “general”, sometimes specific to a task or a class of tasks:

  • it could be a tool for test data generation
  • a tool for fixing a particular problem (delete part of export data which could not be imported)
  • the code for my implementation of a “monkey” to run through our app with random inputs
  • After I presented a tool, I was also asked to create detailed documentation for it, and commit the code to a repository and the documentation to Wiki (Confluence)
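To give an idea of what such a “monkey” can look like, here is a minimal sketch in Python. It is not the tool I wrote back then: `parse_quantity` is a hypothetical function standing in for the app, and the alphabet, run count and seed are arbitrary assumptions.

```python
import random
import string


def parse_quantity(text: str) -> int:
    """Hypothetical function under test: parses a quantity field."""
    return int(text.strip())


def monkey(target, runs=200, seed=1, expected=(ValueError,)):
    """Feed random strings to target and collect inputs that raise unexpected errors."""
    rng = random.Random(seed)
    alphabet = string.digits + string.ascii_letters + " -+.\t"
    surprises = []
    for _ in range(runs):
        text = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 8)))
        try:
            target(text)
        except expected:
            pass  # rejecting bad input is the designed behaviour
        except Exception as error:
            surprises.append((text, repr(error)))
    return surprises


# Each entry is an (input, error) pair worth a human look.
print(monkey(parse_quantity))
```

The monkey only tells you that something unexpected was raised. Deciding whether it is a problem, and for whom, is still up to the tester.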

I was sometimes asked to write documentation or to give a presentation, and my commitment was expected.

  • I could open a PDF reader and read a digital copy of a book or a presentation.
  • I could “hunt” for documents, like presentations or texts about heuristics, and add them to my library
  • I could watch a webinar; I did that a few times in the office intentionally.

The company also suggested that I take a course on the JMeter load/automation tool, and I took it, but mostly in my spare time.

And, of course, I asked whether I could help a fellow tester, or had a nice conversation with the office manager or one of the developers if I had a topic to talk about.

As you may see from the list, there is always something for me to do.

Switching two screens feature and 13 bugs

The feature was supposed to switch between two screens, let’s call them “SCI” and “SCH”. I found 13 bugs not by looking just at SCI / SCH, but by going through what I imagined as user flows, taking spontaneous paths and applying some of my experience.

  1. Icon for SCI screen was not updated
  2. Instruction text was not updated
  3. Invisible control: Interactions with the input at the bottom of the SCI screen were not possible, or only partially possible, because of an overlapping invisible control. To catch this bug one had to add special settings for SCI.
  4. It took too long to get from SCI to the next screen for clients (because of a merge error, the app was requesting all the type S data from the organization instance)
  5. An empty SCI screen appeared after sending the app to the background and returning to it
  6. The lock screen appeared if the app was put to sleep by pressing the power button (the app should not be locked in SCI mode)
  7. After a timeout exit on passage, the app returns to screen SL, not SCI
  8. (minor) After updating SCH data order of data with the same time may change in SCH.
  9. (crash) SCI with multi-source data -> SCH -> choosing the same client again -> cancelling -> crash
  10. Multi-source SCI does not result in successful passage if first source chosen has no SCI option set
  11. (minor) The timeout action that cleans data from the SCI screen does not close the “Whoops!” dialog, which may create confusion for the next client.
  12. After an attempt to fix issue 10, the app started to crash on the multi-source SCI screen for a type of client
  13. Old issue for SCH screen display with specific test data

And there was a 14th, not found by me (found later by another tester):
14. If only one source is selected in Choose multiple sources, the next screen does not start in the chosen mode

Nothing “cool” is meant in this story. The story is about how non-standard, non-“factory” methods reveal a lot of things “factory” methods don’t.

Was it manual testing? (PHA)

“It isn’t what we don’t know that gives us trouble, it’s what we know that ain’t so.” – Will Rogers. Citation taken from “An Introduction to General Systems Thinking” by Gerald M. Weinberg

This time our brave tester gets a bunch of new stories, codename PHA. And…

  1. Terms for the stories were not agreed
  2. Data types the stories cover were not categorized and documented
  3. Developers didn’t get the scope of data types right
  4. Requirements were incomplete (like preservation of old functionality, new default states)
  5. Story included incorrect generalizing statements (not all data types were and could be covered, not all the same way)
  6. There were miscommunications between stakeholders
  7. For some unknown reason initial implementation ignored graphics created by designers

 

So our protagonist…

  1. Suggested new terms and argued for their usage
  2. Documented categorization, talked about it, created a document
  3. Doubted story description (too bad the miscommunication was found too late)
  4. Pointed out what was missing, so that it went into later, better descriptions
  5. Pointed out incorrect statements
  6. Talked to every stakeholder
  7. Insisted that the initial implementation was wrong and brought developers to talk with designers and others
  8. And reported 9 more problems with the implementation, which were partially fixed.

Was it “manual testing”?

On Rapid Software Testing class

Rapid Software Testing (RST), even in the scope of the class, is a huge topic. My account is a glance at it, not even an outline. If you want to know more, you definitely want to read the open course description and materials.

The RST class delivers on what it promises. That’s important. Again, you want to check the course materials on what is promised. To show a bit of what is promised, I’m using two slides from the public course slides.

RST slide 01

RST slide 02

That all gets demonstrated in the class.

RST lives up to its principles. An important part of the class is asking questions, not only about the testing exercises, but also about what you want to know about the class topics and other testing topics. Don’t hesitate to ask about testing problems you may need help with. The important point is that you have a high chance of getting the explanations and examples you ask for in the class.

RST does not have a “finish point”. Learning should not stop on the final class day. RST brings in many topics and it does not have to end. For unprepared people the course may be a starting point; for people who followed RST before taking the class it is like a waypoint. Both types get a lot to continue learning and developing.

As promised, every category of participants can benefit from the RST class. Unprepared participants get something new to think about and see more ways to do testing. Prepared participants see how this works in action and practice it. Even if you have listened to several RST talks / podcasts, “it’s one thing to read about riding a bicycle and a completely different thing to ride a bicycle yourself”.

Unprepared participants can see that testing is way more than “test cases” and get an opportunity to break free from factory and test case thinking. They also test in exercises for 3 days without writing a single test case. Prepared participants get tricky exercises and “food for thought”, and practice testing in sessions and in cooperation with other participants in the class.

Some topics we learned and discussed: what’s wrong with test cases, how to start testing something instantly without following lengthy procedures, where to get information for your testing, what your oracles (entities determining what is a problem and what is not) could be, when to stop testing, how to express coverage without counting test cases, what heuristics may aid your test ideas, how to think about product-associated risks, how to explain to others why testing takes so long…

Although you get a certificate at the end of the class, IMO it is important to realize that RST is not about certification. You learn, you practice, you get leads to learn more, you get an “entry ticket” to a community where you can ask more. The thoughts, heuristics and connections you get with the course are much more important than any certificate.

Keep in mind that RST is a lot about time: “How do I make better use of the time I get?” While it may not be emphasized every time, the questions brought up in the Kiev class, like “Why not test cases?”, “How to teach newcomers?” or “Can we get rid of the testing role?”, are about the use of time too.

Back in the 1990s guys like me were fascinated with action movies, and since then we have used the meme “Show me your kung-fu!”. In some ways RST is like the kung-fu we saw in movies: there are no “kung-fu cases”, you learn from other people, you practice a lot, and you always have to be prepared to show what you are capable of. Also, just as martial arts have a connection to thinking and philosophy, RST does too.

If you wonder what the course content is about and you are interested in learning more about RST, once more, please read the open course materials and documents (section “How Students Should Prepare”).

Reasons why linking automated checks to a test management system may not be a good idea

My answer to the question “Do you know test case management tool which can create & execute both manual and automated tests?”

I have seen such, let’s call them, integrations, and I strongly advise against connecting automated checks to test case / test run / test management systems.

First, I’ll name the integrations I saw, by the test management software involved.

Quality Center (at the time, HP Quality Center)

In my experience it

  • was laggy (slow)
  • was Internet Explorer-dependent
  • was cumbersome
  • had issues with Unicode support

No one on the team liked it, and other teams disliked it too. Everyone wanted to avoid it, because composing test runs in it was a pain.

TestRail

TestRail is much better software than QC in terms of browser support, ease of use, responsiveness and speed, but it has its limitations (like any other software). If you have or are going to have a lot of test cases (for whatever reason), you are going to need a good structure for them, searching big databases is slow, and editing / changing test runs takes time and effort.


Both of the integrations I saw were eventually abandoned.

Before you start any integration, you had better consider what you want it to deliver and what disadvantages it may bring. The main sources of disadvantages are that

  • there are no perfect APIs with instant response and delivery

    Even if you perceive the system response as instant, it takes hundreds of milliseconds, and over the course of a run these delays add up. That may be good enough for tests you do / mark / report yourself, but if your automated run contains hundreds of checks, this takes a toll.

  • web interfaces for working with “test runs” leave a lot to be desired in terms of ease and responsiveness.
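To put a rough number on how API response times add up, here is a back-of-the-envelope sketch in Python. The figures are assumptions chosen only for illustration (800 checks, 2 API calls per check, 300 ms per call, all made sequentially). They are not measurements from the integrations I saw.

```python
def added_minutes(checks: int, calls_per_check: int, latency_ms: float) -> float:
    """Extra wall-clock minutes a sequential run spends waiting on API calls."""
    return checks * calls_per_check * latency_ms / 1000 / 60


# Assumed: 800 checks, 2 calls each (fetch the case, post the result), 300 ms per call.
print(added_minutes(800, 2, 300))  # 8.0
```

Eight extra minutes per run may not sound like much, until the run is repeated after every commit and people start waiting for it.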

The integrations I saw had issues:

  • Composing “test runs” in the system may be slow unless it is trivial
  • Editing “test runs” may be slow and inconvenient
  • If your automated run has to obtain information from the test management system, every run will be slower, just because of what I said above: no instant response, every request adds time to the run.
  • For the same reason, getting a report from the test management system may slow down getting the results of your runs (run results have to be sent back via APIs from whatever runs the automated checks, and processed by the test management system software)
  • The test management system has a limited number of reports in limited formats, which may not suit your automated checking needs
  • Integration connectors (the software between the software running automated checks and the test management software) have issues too. Software which is not supposed to look for problems does not solve problems in other software; it may add problems.

Please ask yourself “What do I want to achieve? What do I want my team to have and to do?” and think it over for a while.

  • Do you want the people who automate checks to run and debug them faster and in a more convenient way? They need independent means of managing runs for that, unlike anything an existing test management system offers.
  • Do you want checks to run faster on some CI system? Those runs had better be independent of any test management system.
  • What reports do you need? What reports does your team need? Are you sure that the system you want will deliver what you and they need? Maybe some reporting tool will do that faster and in a more convenient way, without overloading a test management system database?

Look for ways to gain possibilities and avoid issues. In my experience that path led away from integrations with test management systems.

And I ask you to consider whether test cases are the right tool for the jobs you want done.

 

“Stack of books” heuristic

When facing a problem, such as a mathematical one, I take a stack of books on the subject / problem. I open the top one and look at whether it applies to my problem, whether it is written clearly enough for me, and whether it maybe contains solved problems like mine. If it does not, I put that book away and take the next one from the stack. For more or less standard problems the answer should be somewhere in the stack. That worked for my mathematical problems in school, later for my mathematical problems at university, and it works for my IT / programming learning tasks.

For IT / programming tasks it may be not a stack of books, but a number of e-guides, tutorials, articles and posts. For instance, it worked when I needed to learn something about the SOAP protocol and services. I looked into one article, then one more, a third, a fourth, and finally found a good enough explanation to build some concept of it in my head.

I often recall it when I see someone asking for advice on the “best” course or book on some topic. There may be no single book that is best for all. You had better use a stack to find the one which is best for you.

Why grumpy?

Again and again people ask me why I always say that there is no such thing as good morning and why I’m grumpy in general.

“You know”, I say, “with my job I (am) …

  • Look for problems
  • Find problems
  • Talk about problems
  • Supposed to find as many problems as I can
  • Supposed to find most serious problems (like crashes) and talk about them
  • Afraid I miss important problem
  • Afraid that problem will be found not by me, but by someone else
  • Expected to win a “race” and find more problems with feature or product than everyone else
  • Nervous if I find too few problems
  • Think where more problems can be found
  • Motivate / instruct / learn how to find even more problems
  • Better remember every problem I found in years of my experience, as some problem somewhere can be the same or found with the same means / ways
  • I should not believe anyone about even a minor change unless I see it with my own eyes. Even if one of the programmers says that he talked to the designer and checked with her, I should not believe it; I should check with the designer myself (and find some minor problems, but problems nonetheless).”

“If there were no problems here, you would not see me now”, I add, “Because where there are no problems, people don’t feel they need to hire me”.