Scientists from the University of Warsaw: Artificial intelligence helps discover new drugs
AMP peptides have an effect similar to antibiotics. We used artificial intelligence to find peptides with the greatest antimicrobial potential, says Marcin Możejko from the University of Warsaw.
You are a member of a team that is focused on discovering peptides that are active against various microorganisms. The team consists of scientists from different disciplines. You do this using artificial intelligence – using the deep generative model HydrAMP. Let’s start from the beginning: what are the AMP family peptides that you are working on?
Antimicrobial peptides, to put it briefly, are a family of very short proteins. The story of their discovery is interesting: they come from frogs. Frogs, as we know, live in an environment rich in bacteria and therefore need a natural defense against them. This defense is provided by antimicrobial peptides. The class of peptides present in frogs has a very interesting mechanism of action. Most often, one part of the peptide tends to attach itself to the cell membrane of a microorganism, while another part tends to move away from it. The peptides can therefore be compared to Velcro that is being pulled apart: one side wants to penetrate the bacterium and tear into its cell membrane, while the other side wants to break away from it. In this way, the cell membrane is torn, which leads to the death of the bacterium.
Peptides, if such drugs are created, may therefore support or complement antibiotics. Resistance to peptides will probably appear over time, as happened with antibiotics…
The chance that bacteria will become resistant to peptides is small, precisely because of the peptides' mechanism of action. Peptides destroy the cell membrane of bacteria. To become resistant, bacteria would have to modify their cell membrane, and this is evolutionarily very difficult. Adapting to this mechanism of action will be harder for bacteria than adapting to antibiotics.
Antibiotic resistance is becoming an increasing problem…
Indeed, it is estimated that by 2050 antibiotic resistance will cause more deaths than cancer.
Why might this happen?
We use too many antibiotics, including in hospitals. Hospitals become breeding grounds for antibiotic-resistant bacteria, because they give bacteria a natural environment in which resistance mechanisms can emerge. Microbes naturally evolve and genetically modify themselves. It may happen that one of these modifications makes a given bacterium resistant to a given antibiotic. That bacterium then spreads, infecting more patients. The more antibiotics are administered, the greater the chance that a given bacterium will become resistant. And if we have no drugs for a given class of bacteria, these microorganisms begin to wreak havoc among hospital patients. As we know, hospitals house people whose immune systems are weakened, so these bacteria drastically reduce patients' chances of recovery.
This is confirmed by studies, such as those conducted at Northwestern University in Chicago during the COVID-19 pandemic. They showed that many COVID patients in hospitals were infected with antibiotic-resistant bacteria and often died not so much from COVID but from infection with these bacteria.
Currently, for many reasons, the development of new classes of antibiotics has stopped. Most pharmaceutical companies do not take up this challenge, not only because the scientific research is difficult, but also because developing this type of antibiotic is costly and the return on investment is relatively low. This has opened up a field for the academic community and … peptides.
What is the generative model you used to search for peptides?
The simplest way to describe it is this: there is a large number of examples available and we want to use AI to generate new ones – for example, images in a specific style or texts on a given topic. The same is true when we have access to a huge number of peptides and only some of them are interesting for us to solve a given problem. This is where AI algorithms come in, which aim to generate new objects.
Recently, these methods have developed enormously, especially for generating images and texts. In biological and chemical settings, however, they do not perform as well. This is primarily because the natural sciences use completely different data and follow a different work paradigm.
What is your role in Prof. Ewa Szczurek’s team that discovers peptides? What do you do in it?
I am a mathematician; I deal with neural networks and data analysis. For me, a peptide is a sequence or, figuratively speaking, a string. We have letters, i.e. amino acids, and from these letters we compose strings, i.e. peptides. We assumed that in our work we would generate strings of amino acids no longer than 26 letters.
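The "peptide as a string" view described above can be illustrated with a few lines of Python. This is only a minimal sketch, not the team's code: the alphabet is the 20 standard amino acids in one-letter notation, the limit of 26 letters is the figure quoted in the interview, and the function name `is_valid_peptide` is a hypothetical helper introduced for illustration.

```python
# The 20 standard amino acids in one-letter notation (the "letters").
AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")

# Length limit quoted in the interview; treated here as an assumption.
MAX_LENGTH = 26


def is_valid_peptide(sequence: str, max_length: int = MAX_LENGTH) -> bool:
    """Return True if `sequence` is a non-empty string made only of
    standard amino-acid letters and no longer than `max_length`."""
    if not sequence or len(sequence) > max_length:
        return False
    return all(letter in AMINO_ACIDS for letter in sequence)


if __name__ == "__main__":
    # A plausible string, a string with invalid characters, and one too long.
    for candidate in ["KLAKLAKKLAKLAK", "HELLO WORLD", "A" * 27]:
        print(candidate, is_valid_peptide(candidate))
```

A check like this only says that a string is well formed; whether it is a "good" peptide is a separate question, which is what the model discussed below addresses.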
Are there a lot of short peptides, those letters?
There are many, many short peptides (at least hundreds of thousands). After appropriate data preparation, it turned out that there are at least 11 thousand peptides that we know are active, meaning they “kill” bacteria. We managed to isolate a smaller subset: several thousand very active peptides, that is, peptides that do not need to be present in large amounts near bacteria to “kill” them.
How did you go about finding these peptides that could potentially be used as antimicrobial drugs?
As I said, technically each peptide is a sequence of amino acids, a string of letters; but not every such string is a good peptide. Just like in a language – every string is a sequence of letters, but not every sequence of letters makes any sense.
We have constructed a model that takes these strings and builds a map of them. What is unique about our solution is that the map covers all peptides, both active and inactive. Most existing models build a map only from those several thousand active peptides. As a result, huge candidate areas are never explored, because in order to search them, someone would have to find them first.
We were able to model the language of these peptides in a very efficient way. All the peptides that we modeled that “made sense” were synthesized. That is, our peptide map actually did not show the sequences that did not make sense.
The peptide map was created. What happened next?
In the second step, we used the map to identify these new peptides. We were interested in peptides that are active in themselves or have certain properties. First, we found the location of a given peptide on our map, we located it. Then we searched the environment of this peptide, that is, we used the map to search around our peptide for “treasures”, i.e. very active peptides that could be candidates for new drugs.
In this way we achieved what was, from our perspective, the most significant success: we took an inactive peptide and turned it into an active one. We found six modifications of it, all of which were very active.
This opened the way for us to search for peptides with properties that we had not had the opportunity to find before, because we were limited to searching only for peptides similar to those known and effective.
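The idea of locating a peptide and then searching its surroundings can be sketched in code. Note the substitution: the interview describes searching a learned map built by the HydrAMP model, whereas the sketch below uses a much simpler stand-in, namely all sequences that differ from the starting peptide by one letter. The scoring function `toy_activity_score` is an arbitrary placeholder, not a real activity model, and all names here are hypothetical.

```python
from typing import Callable, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitution_neighbours(peptide: str) -> List[str]:
    """All sequences that differ from `peptide` at exactly one position."""
    neighbours = []
    for position, original in enumerate(peptide):
        for letter in AMINO_ACIDS:
            if letter != original:
                neighbours.append(
                    peptide[:position] + letter + peptide[position + 1:]
                )
    return neighbours


def toy_activity_score(peptide: str) -> float:
    """Arbitrary placeholder score (fraction of K and R letters).
    A real pipeline would call a trained activity predictor here."""
    return sum(letter in "KR" for letter in peptide) / len(peptide)


def best_neighbours(
    peptide: str, score: Callable[[str], float], top_k: int = 5
) -> List[str]:
    """Return the `top_k` highest-scoring neighbours of `peptide`."""
    ranked = sorted(single_substitution_neighbours(peptide), key=score, reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    print(best_neighbours("AAAA", toy_activity_score, top_k=3))
```

A learned map has an advantage this stand-in lacks: nearby points on the map need not be one-letter variants, so the search can reach candidates that look quite different from the starting sequence.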
What happened next with the candidates? Did you continue to weed them out?
This mathematical machine, this map, generated a huge number of candidates. Peptide synthesis is expensive, so from all these candidates we had to choose the most promising ones. Again, we did it in a few steps, using the vast biological knowledge in our team, especially that of Paulina Szymczak, M.Sc.
First, we threw out the candidates that might cause a problem. Then, we filtered those peptides, taking into account other scientists’ models for predicting their activity. And then, third, we created a huge ranking system to pick the absolute best candidates, because by mining this map, we could generate as many as 60,000 of them. So, just like Google sorts its results, we created a system that sorted the peptides from the most likely to be very active to the least likely.
In the last step, we used results from another laboratory at the University of Warsaw, that of Piotr Setny, which develops molecular simulations used to assess whether a given peptide will enter the cell membrane and whether it will perforate it or not. Here I would like to thank NVIDIA. One simulation takes several days, so assessing several hundred of our candidates would have taken months; a group of programmers devoted their time to speeding up these simulations.
As a result, we selected a group of about 30 peptides, which we sent for evaluation to the laboratory of Prof. Wojciech Kamysz in Gdańsk. There, these peptides were synthesized and tested on real bacteria. These results gave us final confirmation that the peptides are active and able to kill bacteria. Seven of the peptides are very, very promising in this respect.
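The first three screening steps described in this answer (discard problematic candidates, filter with other predictors, rank the rest) follow a common pattern that can be sketched as below. This is an illustration under stated assumptions, not the team's pipeline: the function `screen_candidates`, its parameters, and the toy rules in the usage example are all hypothetical, and the simulation and laboratory steps are not modelled.

```python
from typing import Callable, Iterable, List


def screen_candidates(
    candidates: Iterable[str],
    is_problematic: Callable[[str], bool],
    external_predictors: List[Callable[[str], bool]],
    rank_score: Callable[[str], float],
    keep: int,
) -> List[str]:
    """Three-step screening: discard, filter, then rank and truncate."""
    # Step 1: throw out candidates flagged as potentially problematic.
    remaining = [c for c in candidates if not is_problematic(c)]
    # Step 2: keep only candidates accepted by every external predictor.
    remaining = [c for c in remaining if all(p(c) for p in external_predictors)]
    # Step 3: sort from most to least promising and keep the top `keep`.
    return sorted(remaining, key=rank_score, reverse=True)[:keep]


if __name__ == "__main__":
    # Toy rules, chosen only to make the example run; they carry no
    # biological meaning.
    shortlist = screen_candidates(
        candidates=["KKLAKK", "AAAAAA", "KRKRKR", "CCKKLL"],
        is_problematic=lambda s: "C" in s,
        external_predictors=[lambda s: len(s) >= 6],
        rank_score=lambda s: sum(ch in "KR" for ch in s) / len(s),
        keep=2,
    )
    print(shortlist)
```

Ordering the cheap checks first means the more expensive scoring runs only on candidates that survive them, which matters when there are tens of thousands of candidates.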
You even gave them names…
Yes, I named one after my beloved aunt, Zofia, who made me a scientist. She bought me encyclopedias and Lego blocks, which I put together. Today I also “put together blocks”, maybe slightly more complicated ones.
So it is no longer laborious laboratory research by trial and error; now mathematics helps in finding new drugs. What is next for this idea? What are the chances for clinical trials and commercialization?
Our method is just a demonstration of what can be done. We propose to use it to generate a much larger number of candidates. We would like these peptides to be so highly active that only relatively small amounts need to be administered. To treat patients with them, it is also necessary to check their toxicity, which we have also tested: we checked whether the peptides kill only bacteria and not the patient's cells.
It should be emphasized that there are no drugs in this category yet. Giving our drug to bacteria is very simple: we can add our drug to the dish and check whether the bacteria die or not, but giving these drugs to patients is a completely different matter. It is much more complicated. We are dealing with the first step – selecting candidates. Taking further steps is impossible without the help of partners from pharmaceutical companies. If we manage to provide suitable candidates, we will try to interest pharmaceutical companies and we will be able to create new drugs.
Do you see any chance that these types of models will be used to design other new drugs?
Yes. I am someone who looks at this problem from a mathematical perspective, mathematical modeling. What we needed for this model to construct active peptides was first of all information about whether they are active or not. Our model was constructed so that the peptides are active against one strain of bacteria, and we are currently working on extending this model to multiple strains of bacteria.
I can easily imagine using this model to find out if a given peptide will kill cancer cells or treat other diseases. From the perspective of a mathematical model, it looks exactly the same. The mathematics gives us such an abstract description, and the subsequent applications are easy to absorb into the model.
I was the author of the mathematical methods of artificial intelligence, but the success of our algorithm had many fathers, and I would like to thank the entire team.