Games cannot do without updates today. It doesn’t matter whether we are talking about a premium project or a free-to-play one: either way, players expect additions. But not every update is beneficial. If players do not accept a new feature, you will lose both money and audience. That is why you need to test it before launch. This article explains how it is done.
It is necessary to test a new feature in order not to lose money and users
Releasing an update to a game is common practice. When it comes to a content update, there are usually no questions. It all boils down to the mantra “the same soup, only thicker.”
New functionality is another matter. You cannot predict how its arrival will affect the game.
Alina Brazdeikene, a producer at the Estonian studio Creative Mobile, sees three possible scenarios after an update. The first is success: metrics grow. The alternatives are that metrics stay the same or get worse.
“Both of the latter options are negative. Keeping the same metrics is bad because resources were clearly invested in the feature. Losing them is unacceptable unless you are testing some fundamental theory and getting the answer matters more than the resources you invest in the feature. The second option is bad because you may lose your target audience,” Alina notes.
Alina Brazdeikene is currently working at Creative Mobile on the hidden-object game The X-Files: Deep State
For this reason, pre-testing is common practice today. The question is how to do it correctly, and whether there are methods that show most accurately how users will react to the innovation.
The first step to adding new functionality is analysis
Testing begins with an analysis of the existing version of the project.
“We always try to wait a little after the launch of the project and collect data (analytics, reviews) before we decide to implement something new,” says Maxim Kochurin, CEO of DevGame.
There are two reasons for this approach. First, the collected data serves as a control baseline against which the effectiveness of future updates is evaluated. Second, it gives the team a starting point when thinking through new features.
“If you already know in advance what affects your users and how, it will be easier to make decisions about upcoming changes: some hypotheses will gain more weight, and some will fall away on their own,” says Vasily Sabirov, lead analyst at devtodev, about the need for analysis.
Alexandra Radchenko, head of the research department at Playtestix, is also sure that it is necessary to start with an analysis. But she starts from a different premise: she treats analysis as one of the types of testing.
“It is important to analyze the statistics, the players’ opinions and their gaming experience. Do everything that helps to find the critical points of the game. Next, we look at our upcoming changes. Do the problems and the updates intersect? If they do, the risk of falling indicators is reduced. If they do not, old and new problems may pile up, and the audience will start to leave.”
Simply put, she calls for checking whether the feature list matches the players’ wish list.
When there is no way to test the update properly, you can make do with this approach alone. However, as Radchenko notes, it is better to combine it with other methods of analysis and verification.
The third approach to the analysis is formulated by Sam Strizhak, producer at Alternativa Games. He advises forecasting how the planned feature will affect the project itself.
“As a result of the analysis, you should get a clear picture of how the project’s mechanics are connected and a forecast of the consequences. Using this information, you can make a much more meaningful decision on the functionality, based on the specific needs of the project,” Strizhak notes.
Alternativa Games is now focused on its new online action game Tanki X
The analysis itself often comes down to formulating questions and answering them.
“First of all, we answer the following questions: which indicators, in our opinion, have the greatest impact on in-app revenue, which of them can we improve and how, and how much effort and money will solving this take?” explains Kochurin.
Radchenko from Playtestix also suggests approaching testing with a list of questions. “First, we need to work out the goals of our changes and relate them to the audience,” she writes.
At the same time, Radchenko goes further. She suggests conducting different analyses depending on the type of planned changes (global or partial) and on the target audience. Based on this, she builds a “matrix of intersection of goals and focuses.” The matrix should tell the development team which type of analysis of the existing version is optimal.
The main problem is that the analysis delays the actual development and testing of new functionality, since it requires numerous studies.
That is, you first have to study what is going on with conversion, with bounces, with where players drop off, and so on. Only then can you move on to the most interesting part: integrating the new functionality.
Types of testing
Suppose the analysis is done and the cool new feature has been developed. Then, if no one is in a hurry, the testing stage begins.
Finished functionality confronts developers with the main problem: how to check it. There are many tools, but not all of them suit a specific innovation.
For features that are easy and cheap to implement but still effective, Alina Brazdeikene suggests two possible scenarios.
“The first is to add tools for the A/B test to the build (this is a good investment that may come in handy many times in the future), carefully test the feature, roll out the update and measure the results.”
“The second is to build the desired feature and create a new application with it. After that, we release it in a small number of countries, buy users, drive them into the game and look at the result.”
However, Alina notes that the second scenario is suitable “only if there is no possibility to resort to A/B testing [which we will talk about below], and services like PlaytestCloud* are not suitable for the experiment (especially if the experiment should give a plus to monetization / retention in the long term).”
*PlaytestCloud is a service that conducts live tests of the game on the target audience.
The pitfall of duplicating an application is that platforms usually restrict duplicates, so the team will have to make the new build as different from the original as possible. At the same time, those differences must not compromise the purity of the experiment.
“In comparison, the implementation of the A/B test feature in the product already seems simple and cheap,” Alina notes.
Alexandra Radchenko also considers this path, which can loosely be called a soft launch of the new version, inconvenient. In her opinion, it is not only more expensive than tests on a sample of users but also carries a great risk, because the update may be received differently in the real market than in the market where the test is run.
She speaks more positively about the practice of closed beta testing. “This is a well-known method,” says Alexandra. “We recruit a group of enthusiasts, seekers of new sensations, from the target audience. We invite them on an exciting journey through their favorite game, but with a twist. We compare statistics before and after and collect opinions on the changes.”
DevGame has a slightly different practice for testing functionality. Its key difference from a beta test is the absence of invitations. Instead, the update is deployed straight to a limited audience.
DevGame is known for its children’s projects based on popular franchises
“We publish a new version for 5-10% of users (depending on the size of the current audience),” says Maxim Kochurin. “After that, we collect data and analyze how the new functionality affected the main indicators and the audience’s feedback about the product. If, for those 10% of users, the changes raised the indicators we wanted to improve with the new functionality, we roll the build out to 100%. And so, step by step, we introduce new features.”
The main indicator for DevGame is revenue. For this reason, the main “vector of decision-making on the implementation of the functionality is simple: if the hypothesis increases revenue, it is implemented.”
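A limited rollout of this kind is often implemented by hashing user IDs into stable buckets, so that the same user always lands on the same side of the cut. The sketch below is one minimal way to do that, not DevGame's actual mechanism; the function names and the salt value are hypothetical.

```python
import hashlib


def rollout_bucket(user_id: str, salt: str = "feature-x") -> float:
    """Map a user ID to a stable value in the range [0, 100).

    The salt (a hypothetical feature name) keeps buckets independent
    between different experiments.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # Use the first 32 bits of the hash as a uniform integer.
    return int(digest[:8], 16) / 0x100000000 * 100


def in_rollout(user_id: str, percent: float, salt: str = "feature-x") -> bool:
    """Return True if the user falls into the first `percent` of buckets."""
    return rollout_bucket(user_id, salt) < percent


# Start with roughly 10% of users, then widen to 100% if the metrics improve.
print(in_rollout("user-12345", 10))
print(in_rollout("user-12345", 100))
```

Because the bucket depends only on the ID and the salt, raising the percentage from 10 to 100 keeps the original 10% of users inside the rollout.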
The testing tools and methods vary from company to company, but the A/B approach is the mainstay for most: four of the five teams I spoke to use it.
How to conduct testing (tips and cases)
Sabirov is an unequivocal supporter of A/B testing. “If there is an opportunity to run an A/B test, it is better to run it,” says Vasily.
According to him, it should be done like this:
- create a set of users (they must not have interacted with the new functionality before, so that they first see it as part of the experiment);
- randomly split them into two or more groups;
- designate one of the groups as the control and apply the change or changes to the others;
- conduct an experiment and summarize its results.
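The first three steps of this list can be sketched in a few lines. This is an illustration under assumptions, not devtodev's tooling: the function name, its parameters and the group names are hypothetical.

```python
import random


def split_into_groups(user_ids, group_names=("control", "variant"),
                      already_exposed=(), seed=None):
    """Randomly split eligible users into groups of near-equal size.

    Users listed in `already_exposed` have seen the feature before and
    are excluded, so everyone first meets it inside the experiment.
    """
    exposed = set(already_exposed)
    eligible = [uid for uid in user_ids if uid not in exposed]

    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    rng.shuffle(eligible)

    groups = {name: [] for name in group_names}
    for index, uid in enumerate(eligible):
        # Deal shuffled users out to the groups in turn.
        groups[group_names[index % len(group_names)]].append(uid)
    return groups


users = [f"u{i}" for i in range(10)]
groups = split_into_groups(users, already_exposed={"u0"}, seed=1)
print({name: len(members) for name, members in groups.items()})
```

The first group name is treated as the control by convention only; passing more names gives a test with several variants.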
Before the experiment is run, it is necessary to “correctly set the final metrics”. Vasily advises using monetary metrics for evaluation, for example ARPDAU and cumulative ARPU by the seventh day.
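For reference, both metrics can be computed as follows, using their usual definitions: ARPDAU is a day's revenue divided by that day's active users, and cumulative ARPU is a cohort's total revenue up to a given day divided by the cohort size. The function names and the convention that the install day counts as day 0 are assumptions of this sketch, and the figures are made up.

```python
def arpdau(daily_revenue: float, daily_active_users: int) -> float:
    """Average revenue per daily active user for a single day."""
    if daily_active_users == 0:
        return 0.0
    return daily_revenue / daily_active_users


def cumulative_arpu(payments, cohort_size: int, by_day: int = 7) -> float:
    """Cumulative revenue per cohort user up to and including `by_day`.

    `payments` is an iterable of (days_since_install, amount) pairs for
    users of one install cohort; the install day is day 0 (an assumption).
    """
    if cohort_size == 0:
        return 0.0
    total = sum(amount for day, amount in payments if 0 <= day <= by_day)
    return total / cohort_size


print(arpdau(500.0, 1000))  # 0.5
payments = [(0, 1.0), (3, 2.0), (7, 3.0), (8, 10.0)]
print(cumulative_arpu(payments, cohort_size=4))  # 1.5, the day-8 payment is ignored
```

In an A/B test the same calculation is run separately for the control group and for each variant, and the resulting values are compared.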
However, the results cannot always be relied on. For example, testing The X-Files: Deep State produced an inconclusive case: when three dialog styles (sarcastic, role-playing and neutral) were A/B tested, none won by a significant margin.
As a result, the Creative Mobile team relied on videos from playtests. “We saw a very positive reaction to the sarcastic texts and decided to keep them,” says Alina Brazdeikene.
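Whether a variant "won by a significant margin" is usually decided with a statistical test. The sketch below uses a two-proportion z-test on a share-type metric such as retention; it is a generic illustration, not the method Creative Mobile used, and the numbers are invented. With three variants, as in the case above, pairwise comparisons would also need a correction for multiple testing.

```python
import math


def two_proportion_z_test(success_a: int, n_a: int, success_b: int, n_b: int):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value). A small p-value (commonly below 0.05) suggests
    the difference between the groups is unlikely to be chance alone.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_error == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical retention counts: 100 of 1000 users vs 105 of 1000 users.
z, p = two_proportion_z_test(100, 1000, 105, 1000)
print(p > 0.05)  # True: no significant winner
```

When the test does not separate the variants, qualitative evidence such as playtest videos is a reasonable tiebreaker.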
Working with a hypothesis
DevGame has long been actively promoting its own approach to working with data. The approach has no name. The gist is that the company’s developers always start by forming a hypothesis.
That is what they did with the game “Three Cats: Shop”. It was the studio’s second project based on the “Three Cats” brand.
“The problem was that, compared with the first project, it had very low values for a number of basic indicators,” says Maxim Kochurin. “The revenue was also not very high.”
How did they proceed?
Posed the problem: a very large number of short sessions.
Formulated a hypothesis: on the character selection screen, the child taps the “Play” button rather than a character (which leads to the main gameplay), and so ends up in a side activity (decorating the Christmas tree). The player gets bored and leaves the game altogether without ever seeing the main gameplay.
Proposed a solution: on first launch, skip the character selection screen entirely and go straight from the main screen to the main gameplay when the Play button is pressed.
Tested it: on 10% of users.
Then DevGame compared the data from the new version with the data from the base version.
This confirmed the hypothesis: day-one retention in the new version was 10% higher, and session length doubled.
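A comparison like this comes down to computing the same metric for both versions and looking at the change. The sketch below uses invented numbers, since the article does not give the underlying figures or say whether the 10% is a relative change or a difference in percentage points; the function names are hypothetical.

```python
def day1_retention(installed: int, returned_next_day: int) -> float:
    """Share of new users who came back the day after installing."""
    return returned_next_day / installed if installed else 0.0


def relative_change(base: float, new: float) -> float:
    """Change of `new` relative to `base`, e.g. 0.1 means 10% higher."""
    if base == 0:
        raise ValueError("base value must be non-zero")
    return (new - base) / base


# Invented figures for illustration only.
base = day1_retention(installed=1000, returned_next_day=300)  # 0.30
new = day1_retention(installed=1000, returned_next_day=330)   # 0.33

print(round(relative_change(base, new), 3))  # 0.1, a 10% relative uplift
print(round(new - base, 3))                  # 0.03, i.e. 3 percentage points
```

Stating which of the two readings is meant avoids confusion when results are reported to the team.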
Good luck with testing the new functionality!
Be sure to write if you have any additional questions.