ML with a human face: can interfaces make Machine Learning accessible?

ICT.Moscow discussed this issue with Swati Mishra of Cornell University, the lead author of a scientific article on transfer learning, and with Russian ML developers from Yandex, Sberbank, and the consulting company GlowByte, among others

The adoption of artificial intelligence (AI) and machine learning (ML) in business processes has been growing steadily. Increasing media coverage is one sign of the technology's rising relevance. A growing number of real-world applications confirms it: in 2021 alone, the ICT.Moscow database collected more than a hundred of them.

However, the use of AI in business processes comes with a significant stop factor: to use ML algorithms effectively, you need to be a specialist in ML and AI. This problem can be addressed in various ways. For example, Cornell University in the United States is developing a platform based on a “transfer learning” approach that allows people without special skills to use ML algorithms. Philip Vollet, a data scientist at KPMG Germany, in turn points to a new, noticeable trend: the development of the machine learning graphical user interface (MLGUI).

Is the field of AI on the verge of a tipping point, when, thanks to such specialized interfaces, ML will in fact become a publicly available tool that does not require deep specialized knowledge?

ICT.Moscow discussed this issue with Swati Mishra of Cornell University, the lead author of a scientific article on transfer learning, and with Russian ML developers from Yandex, Sberbank, and the consulting company GlowByte, among others. Developers whose solutions are certified for use in the Russian healthcare system (the field where AI is most in demand), in turn, talked about the specifics of using and working with ML interfaces.

Why do we need MLGUI?

The peculiarities of how an MLGUI functions follow from the tasks that ML algorithms solve. Philip Vollet, a data scientist in the consulting industry, sees MLGUI as an interface for analytics applications. Pavel Snurnitsyn, head of the Advanced Analytics practice at GlowByte Consulting, clarified this stance. In Data Science consulting practice, the term "analytical applications" has been established for a relatively long time. Essentially, this is what Philip Vollet calls MLGUI: an interface application for the business end user that works with the results of ML models and advanced analytics.

The need for this kind of application arises in a business problem where ML or advanced analytics are used as an assistant to a human expert making the final decision. "The user, in this case, is not a data science and machine learning specialist, but a business analyst, expert or engineer who makes decisions based on an assistant in the form of ML/AI", said Pavel Snurnitsyn, Practice Leader of Advanced Analytics at GlowByte Consulting.

Denis Zhikharev, head of the Department of Integration Projects and Authentication Systems of the Moscow Department of Information Technologies (DIT), explains what effects can be expected due to this approach.

"Mr. Vollet in a way suggests moving from a low level of programming in machine learning to a higher one. The idea is to create a kind of Scratch (simplified programming environment – ICT.Moscow) for machine learning. Undoubtedly, this approach will be extremely useful in the training of specialists (educational component) and the popularization of ML. It will allow even beginners to master the basic principles of ML, make machine learning more understandable for a wide range of people.

"It is worth noting that we cannot talk about how much this approach will be in demand in the professional environment in the near future since the modification of already well-established machine learning practices will most likely be perceived by the professional community as a complication rather than a simplification. What’s important is the context of the question. In this case, we are not talking about GUI for interpreting machine learning results and process control (this is a separate ML category with extensive tools and proven practices), but about a GUI for the ML process itself in order to simplify it for developers and researchers", said Denis Zhikharev, Head of the Department of Integration Projects and Authentication Systems at DIT.

What is stopping a layman from starting to use machine learning to solve their problems? According to Swati Mishra from Cornell University, there is a significant barrier that can be removed through the implementation of MLGUI.

"Graphical interfaces for machine learning systems certainly are an important link in making AI accessible to non-experts. The reason behind this is that modern, performant AI systems are mostly data-driven, and visualizing the data flow within the AI greatly helps people understand how AI makes decisions for them, whether to rely on it, and how to tweak it. A major requirement for building AI systems is to learn how to code. Let's face it. Computer languages are not easy to learn. It takes a lot of motivation and effort to master a programming language well enough that one can integrate an AI system into their workflows. A GUI can help remove some of these barriers by providing affordances to understand and build certain AI models that might be useful for the task", said Swati Mishra, Ph.D. student at the Department of Computer Science and Informatics, Cornell University.

In the machine learning segment, it is important to distinguish between two directions, or stacks: development (training) and operation. Igor Kuralenok, head of ML services at Yandex.Cloud, showed in the diagram which levels of work with machine learning exist in each direction; these levels determine whether a particular MLGUI is present.

To understand where the end-user appears in this logic, you also need to understand the entire life cycle of a machine-learning model as described by Pavel Snurnitsyn, explaining who the main “users” are at each stage.

Thus, the end-user connects to the work with machine learning at the last, fourth stage, working with MLOps and tools for monitoring machine learning algorithms in production. However, here Igor Kuralenok sees a significant problem.

"The categories of MLOps and control tools contain a large field that is currently under study. Each developer and each company solves the problems that arise at this stage in its own way: model degradation, monitoring settings, versioning, the process of launching into production, etc. The problem is that there is no standardization", said Igor Kuralenok, Head of ML Services at Yandex.Cloud.

Challenging tasks for complex technology

Before discussing the prospects for standardizing ML, including the MLGUI segment, it is necessary to understand exactly what tasks an ML interface solves and what makes it different. Stanislav Kirillov, the head of the ML Systems group at Yandex, warns about the tasks arising at the first stages of the ML life cycle that cannot be solved using the GUI.

"There are tools for different levels of immersion in the details of the training. Almost anyone can assemble their first machine learning process using instructions and examples from the documentation if they can prepare the data. But there are two very challenging tasks in machine learning. The first is to understand whether a machine-trained model is really needed in a particular case, how it will solve the problem, and how you can make sure that in practice everything will work correctly. The second task is to find, clean and prepare data suitable for training ML models. These two tricky problems are not addressed by interfaces. Model training itself requires a basic understanding of exactly how machine learning can help you in your task: for example, you need to understand what quality metrics are and where to put the data, and this is already enough to solve problems in AutoML style", said Stanislav Kirillov, Head of the ML Systems group at Yandex.

In this case, what tasks can MLGUI solve?

"The interface has two main functions: the first is viewing, the second is editing and creating. The first task is relatively simple, we can have an interface for editing the original markup language or the "viewer". There really are such GUIs, we can visualize any neural network architecture. A neural network can be represented as a computational graph, and there is such a visualization now – for example, a graph visualizer in TensorBoard (see the screenshot below – ICT.Moscow). The second point is editing or creating, and here, everything is more complicated. There is no simple editor like Word for text now. For instance, if we edit the site code, then the output is what the visitor sees, but not what the programmer sees. The situation is approximately the same with the creation and editing of neural networks", said Alexey Klimov, Technical Leader at SberCloud.

The expert clarifies that graph visualization of a complex neural network may be incomprehensible even for ML specialists, not to mention end users. For example, it will be unclear what tasks some of the ML algorithms perform. However, usually simpler neural network architectures are visualized using the GUI, and this tool is already available to non-AI specialists.

"If we take typical machine learning tasks, such as viewing a metric, this interface is clear. We see the metric grow while the neural network is running, as the quality of the model increases. The line rises like a regular line chart. In simple neural network architectures, you can see how the process proceeds. Generally speaking, graph visualization in ML is needed in order to understand how internal methods work. When we see a large table, this is not very clear, but when we see its visual representation and distribution, it is much more convenient. For example, a computer vision network detects certain objects, and we can see what exactly it pays attention to", said Alexey Klimov, Technical Leader at SberCloud.

The specifics of internal machine learning methods precisely determine the key differences between MLGUI and interfaces without ML. Philip Vollet from KPMG Germany refers to such differences as the need to take into account more variables, as well as the variance of datasets and algorithms over time. The interlocutors of ICT.Moscow agree with this premise.

"These are important tasks, and there are no completely ready-made tools yet, especially for monitoring the quality of models in real time so that decisions can be made promptly when noticeable degradation occurs. It seems to me that this will become one of the hot topics in the near future. Another important task, in my view, is the creation of systems that increase the connectivity of various processes and machine learning tasks, meaning systems that make it easy to understand on what data and with what parameters the model was trained, how it behaved in experiments in the product, at what moment it was retired, and what replaced it", said Stanislav Kirillov, Head of the ML Systems group at Yandex.
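The "connectivity" described in the quote above can be pictured as a small registry that links each model to its training data, its parameters, its experiment results and its successor. The following is only a minimal sketch under stated assumptions: `ModelRecord`, `Registry` and all field names are hypothetical and do not come from any tool mentioned by the experts.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelRecord:
    """One entry in a hypothetical model registry (all names are illustrative)."""
    name: str
    dataset_id: str                     # which data the model was trained on
    params: dict                        # which parameters it was trained with
    experiment_metrics: dict = field(default_factory=dict)  # how it behaved in experiments
    retired: bool = False               # whether it has been taken out of service
    replaced_by: Optional[str] = None   # which model replaced it


class Registry:
    def __init__(self):
        self._records = {}

    def register(self, record):
        self._records[record.name] = record

    def retire(self, name, replaced_by):
        """Mark a model as retired and remember its successor."""
        if replaced_by not in self._records:
            raise KeyError(f"unknown successor: {replaced_by}")
        old = self._records[name]
        old.retired = True
        old.replaced_by = replaced_by

    def lineage(self, name):
        """Follow replaced_by links from a model to its current successor."""
        chain = [name]
        while self._records[chain[-1]].replaced_by is not None:
            chain.append(self._records[chain[-1]].replaced_by)
        return chain
```

With two registered models, calling `retire("v1", replaced_by="v2")` and then `lineage("v1")` answers the question of what a retired model was replaced with.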

In other words, the expert is sure that MLGUI should take into account, in one way or another, all stages of the ML development life cycle presented in the diagram above. This is logical considering that the degradation of an ML model returns the user to the first stage: preparing a new dataset and updating the functionality. Moreover, a layman needs to know at what point ML algorithms cease to produce a relevant result, which means that the interface must notify the user of this in time.

"An important feature of operationalizing an MLGUI is that the trigger for rebuilding the application is not only the developer's initiative, as in the classic case, but also the application itself. For example, the model inside it can understand that the environment and conditions for using the model have changed and that a more complex retraining process needs to be launched, for which new data from outside the application will be required. Of course, you need a lot of visualizations and graphs to give the user the opportunity to analyze and understand what the ML model suggests he do and why it suggests it. But charts and visualizations alone are not enough; otherwise, a BI tool would suffice. The MLGUI application should also have built-in capabilities for starting feedback loops, so that the user looks at the result, makes his own expert adjustments, changes parameters, starts the recalculation and gets a new result, and so on until he is satisfied with the quality of the proposed solutions and passes these decisions further into the business process", said Pavel Snurnitsyn, Practice Leader of Advanced Analytics at GlowByte Consulting.
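One way an application could notice by itself that "the environment and conditions have changed" is to compare recent input data with the data the model was trained on. The sketch below is an illustration only, not the method used by any of the interviewed companies: the function names, the choice of a mean-shift score and the default threshold of 2.0 are all assumptions made for this example.

```python
from statistics import mean, stdev


def drift_score(reference, recent):
    """Shift of the recent mean, measured in reference standard deviations."""
    spread = stdev(reference)
    if spread == 0:
        return 0.0 if mean(recent) == mean(reference) else float("inf")
    return abs(mean(recent) - mean(reference)) / spread


def should_retrain(reference, recent, threshold=2.0):
    """Hypothetical trigger: ask for retraining when inputs drift too far."""
    return drift_score(reference, recent) > threshold
```

In an MLGUI, a positive result from `should_retrain` would not retrain silently; it would be surfaced to the user, who then decides whether to supply new data and start the recalculation.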

It cannot be argued that the designated task is now being effectively solved with the help of MLGUI. This means that there are still stop factors that restrain the introduction of machine learning systems into business processes.

"When we talk about implementation, the blocking factor is, first of all, that the systems into which machine learning is to be embedded are not ready to work with such modules. A model trained "by the button" must be programmatically integrated into the system. The next stop factor is the absence or insufficient formulation of intelligible product quality metrics for the service or process into which the model is embedded. Then the problems specific to ML begin: monitoring models that are already working in business processes for quality degradation caused by, for example, seasonal effects, as well as visualizing such processes. Then there are machine learning problems proper: for example, monitoring the quality of trained models and comparing their metrics to easily decide whether a new model is good enough", said Stanislav Kirillov, Head of the ML Systems group at Yandex.
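The two ML-specific checks mentioned in the quote above, watching a working model for quality degradation and deciding whether a new model is good enough, can be sketched in a few lines. This is a simplified illustration under stated assumptions: accuracy over a sliding window stands in for whatever product metric is actually used, and `QualityMonitor`, `good_enough`, `tolerance` and `min_gain` are hypothetical names and parameters.

```python
from collections import deque


class QualityMonitor:
    """Tracks accuracy over the most recent predictions (illustrative only)."""

    def __init__(self, baseline, window=100, tolerance=0.05):
        self.baseline = baseline        # accuracy measured when the model was accepted
        self.tolerance = tolerance      # how far below baseline is still acceptable
        self._hits = deque(maxlen=window)

    def record(self, prediction, actual):
        self._hits.append(prediction == actual)

    def accuracy(self):
        if not self._hits:
            return None
        return sum(self._hits) / len(self._hits)

    def degraded(self):
        acc = self.accuracy()
        return acc is not None and acc < self.baseline - self.tolerance


def good_enough(candidate_metric, current_metric, min_gain=0.01):
    """Promote a new model only if it beats the current one by a margin."""
    return candidate_metric >= current_metric + min_gain
```

A seasonal effect would show up here as `degraded()` turning true once enough recent predictions have missed, which is exactly the kind of signal an interface would need to visualize.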

From fragmentation to uniform standards

Igor Kuralenok names another significant problem in the field of MLGUI and ML in general, that is, the lack of standardization. This argument is confirmed by Swati Mishra of Cornell University.

"Just like any other tech solution, there is no one-size-fits-all AI solution. If a task can be accomplished by merely providing a few labeled instances, then why bother going inside the AI system and making any significant changes, and hence interactive tools will focus more on providing labeled input data efficiently. On the other hand, when the task is critical, one cannot expect to only rely on a few data instances, and hence the GUI should focus on providing more control to the user", said Swati Mishra, Ph.D. student at the Department of Computer Science and Informatics, Cornell University.

In other words, MLGUI, from Swati Mishra’s perspective, is determined by the type of problems that are solved by the algorithm. Alexey Klimov from SberCloud looks at this problem from a different angle and notes that the interface depends on the neural network model and the methods embedded in it.

"We can say that a "curse of dimensionality" often arises here, when there is so much data that you cannot show it all at once. Or there are so many parameters in the model that it is also unclear how to visualize them. In general, there is a loose division of methods into Explainable AI and Black Box AI. Various regulators, by the way, limit the use of Black Box AI when solving critical tasks (for example, EU regulators - ICT.Moscow). Neural networks are not fully Explainable AI. We can evaluate the performance of a neural network as a whole on a sample, but it is often difficult to say why in a particular case the decision was made in this way. The interface itself is highly model-dependent. For simple models, it is understandable, but for new, less studied models, on the contrary, there are no interfaces yet", said Alexey Klimov, Technical Leader at SberCloud.

Of course, it is also important to take into account the factor of the team, that is, the end-users. Igor Kuralenok mentioned the difference in approaches: today each company can solve the same problem using ML in its own way. At the same time, Pavel Snurnitsyn says that the size of the team working on ML projects is also important.

"While the team is small and faces few ML tasks and projects, the participants and roles can work with their disparate tools as is convenient for each of them: someone writes code and scripts himself, someone uses specialized tools, one of the business analysts looks at data through Excel, and someone asks for an MLGUI. But as the team and the number of projects grow, it becomes necessary to manage all of this and create a single interface and entry point for all roles, which will stitch together the whole variety of tools and platforms", said Pavel Snurnitsyn, Practice Leader of Advanced Analytics at GlowByte Consulting.

Thus, it was possible to identify at least four criteria that should be taken into account when standardizing MLGUI:

The type of tasks that machine learning solves.
Features of machine learning methods and algorithms.
Team approaches to problem-solving.
The number of specialists involved in the process.

At the same time, Igor Kuralenok notes that the process of standardization in the field of ML (and therefore MLGUI) has already begun. The prospect lies in the unification of ML platforms through cloud solutions, suggests the expert, who heads the cloud ML services of a large technology company.

"Now that standardization is beginning to crystallize, many use some informal standards in both processes and tools. Unfortunately, we are still at the point of fragmentation. A search for MLOps turns up a million libraries with a billion different functions that do not overlap: one MLOps tool solves some problems, another solves others, and so on. The main problem is the operational issue, which is platform-dependent. All platforms are now heterogeneous. Roughly speaking, some work in the cloud, others work on their own devices or on some platform that already provides certain services, and so on. However, some consolidation of tools in the clouds is happening, especially in terms of operation. Training and using machine learning requires very diverse devices, which also quickly become obsolete. It is far from always clear which devices are needed. Thus, we are moving towards the MLaaS (ML-as-a-Service) model, although there is no single standard yet. Moving from the current point of fragmentation to unification will take at least the next five years", said Igor Kuralenok, Head of ML Services at Yandex.Cloud.

No barriers in medicine

One of the main areas not directly related to IT, but actively introducing artificial intelligence technologies, is medicine. Swati Mishra from Cornell University talks about the importance of ML in medicine from a scientist's perspective.

"Healthcare is also a critical area. Conducting clinical trials at large scales for new drugs and vaccines can definitely be accelerated using human-in-the-loop machine learning. We need to make AI accessible to professionals in these fields", said Swati Mishra, Ph.D. student at the Department of Computer Science and Informatics, Cornell University.

However, it is necessary to take into account the fact that critical decisions are made in medicine, on which life can depend. Accordingly, the clarity and transparency of machine learning algorithms play an important role.

"Our service is used by doctors, so you can't do without a clear graphical interface. Its absence is definitely a stop factor, without which the introduction of models would be impossible. We do not use ready-made dashboards, we have our own web interface. It is more difficult to maintain, but it is more flexible if you need to add some non-standard functionality. The most important thing is that the doctor can easily understand what is happening and what values mean what", said Vladimir Borisov, Head of Forecast Model Development at Webiomed.

Evgeny Zhukov from Care Mentor AI notes that for their company there is no problem with the lack of accessible MLGUI. At the same time, he also emphasizes the fact that there is no standardization in the ML field, but he does not consider this as a significant stop factor for implementation.

"In fact, we do not have such a problem. In general, in our case, it is difficult to make a single MLGUI for all products at once because each product is unique in its own way. And development sometimes requires a huge amount of experimentation with different models and data. It will be extremely difficult to take into account all the parameters when designing a GUI. It is much more important to support the solution with quality documentation and make it reproducible for other data scientists. Among the interfaces that can also be attributed to machine learning is the interface of our tool for data markup. Overall, it is quite convenient for doctors and is similar to the tools they regularly use to view medical images. The markup results are exported in a format suitable for further processing and training of the neural network", said Evgeny Zhukov, Data Scientist at Care Mentor AI.

The company Third Opinion, in turn, observes the problem but does not consider it critical.

"The problem is relevant since the presence of a GUI makes it possible in some cases to simplify the interaction of the ML department with other departments of the company that need to use ready-made neural networks, as well as to simplify the implementation of algorithms. It is not a significant stop factor, since the basic processes for using neural networks without a GUI have already been built", said Alexander Gromov, Computer Vision Team Lead at Third Opinion.

Will voice make things easier?

ICT.Moscow also discussed with experts the prospects of using voice interfaces for working with machine learning. Most experts agreed that the prospects, if any, are small. As for medicine, Evgeny Zhukov from Care Mentor AI called voice interfaces inapplicable as a whole for solving the company's problems. Alexander Gromov from Third Opinion takes the same position.

"They are not applicable, since they do not have high recognition accuracy and can lead to errors in processing commands from the user, for example, related to data processing and testing of trained models. It is also highly doubtful that the use of voice interfaces will speed up the development process compared to the graphical one", said Alexander Gromov, Computer Vision Team Lead at Third Opinion.

However, the head of the Webiomed predictive model development department, Vladimir Borisov, nevertheless believes that voice interfaces can complement MLGUI: for example, in order to fill in information about a patient. The position of developers of medical companies correlates with what experts from other companies say. Igor Kuralenok, head of ML services at Yandex.Cloud, claims that it hasn't come to that yet. Stanislav Kirillov from Yandex clarifies that the scenarios are very limited.

"As a secondary tool for small automation, yes, but as a primary tool for starting processes and monitoring, I don't see any real applicability here", said Stanislav Kirillov, Head of the ML Systems group at Yandex.

But there are other points of view. For example, Denis Zhikharev from DIT is confident that voice will very quickly become a daily routine in the field of work with ML.

"This is the next step. Looking at the speed with which voice interfaces are developing now, I think that they are a matter of the near future", said Denis Zhikharev, Head of the Department of Integration Projects and Authentication Systems at DIT.

Pavel Snurnitsyn from GlowByte Consulting makes a similar point. He reminds us that voice assistants are already widely used for building and working with reports and dashboards.

"In MLGUI cases, voice interfaces, I think, will find even greater application, precisely due to the interactivity and feedback during the work of a business user. In some cases, such as, for example, when an engineer interacts with a digital advisor to manage a complex production process, voice commands for an analytical application will be more convenient than classical interaction through the GUI", said Pavel Snurnitsyn, Practice Leader of Advanced Analytics at GlowByte Consulting.

Finally, Swati Mishra of Cornell University believes in the promise of voice.

"Voice interfaces are certainly better than GUIs for some tasks that require speed and precision, and there is some research that focuses on building voice interactions for advanced data analytics. Language is certainly a more intuitive medium to communicate with AI, and voice interactions find a lot of applications in AI that work on natural language data. However, there is more to be done in understanding how we can formulate instructions for complicated tasks like building an AI, or teaching an AI to do something", said Swati Mishra, Ph.D. student at the Department of Computer Science and Informatics, Cornell University.