AlphaFold

AlphaFold – AI breakthrough unfolding a new path for Protein Research

‘Structure is Function’ is a principle on which biology is based.

Introduction

Proteins are essentially the crux of life, whether inside a single cell (enzymes, hormones, etc.) or outside our bodies in our surroundings (food, medicines, etc.). The function of each protein depends on its 3D structure, and any change in this structure alters its performance.

Nobel Laureate Christian Anfinsen proposed that ‘the folding pattern and 3D structure of any protein can be determined from its amino acid sequence’. Building on his experiments and theories, numerous research groups have worked relentlessly on the ‘Protein-Folding Problem’ to figure out how and why a protein attains only one specific conformation out of the multitude of possible folding patterns.

Predicting the 3D Structure of a Protein from Its Primary Structure

Conventionally, scientists have used techniques such as X-ray crystallography, nuclear magnetic resonance (NMR) and cryo-electron microscopy to elucidate the 3D structure of proteins. These experiments can involve years of laborious work and require investments worth millions of dollars.

CASP Challenge 2020

On the 30th of November, 2020, a new milestone in protein research was reached when the results of the 14th biennial Critical Assessment of Protein Structure Prediction (CASP) challenge were announced. AlphaFold, a program developed by DeepMind (an artificial intelligence research lab owned by Alphabet, Google’s parent company), predicted protein structures from their amino acid sequences with accuracy comparable to experimental results, achieving a median GDT score of 92.4 across all targets.

The CASP challenge, run by the Protein Structure Prediction Centre at the University of California, was started by Prof. John Moult and his co-founders in 1994 with the aim of boosting computational research in protein biology. Predictions are assessed using the Global Distance Test (GDT), which scores the similarity of the predicted structure to the experimental result on a 0-100 scale.
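To make the 0-100 scale concrete, here is a minimal Python sketch of a simplified GDT calculation. It assumes the commonly used GDT_TS form, which averages the percentage of residues (C-alpha atoms) lying within 1, 2, 4 and 8 Å of their experimental positions, and it assumes the two structures are already superimposed. The actual assessment searches over many superpositions, so this is an illustration rather than the official CASP procedure. The function name and the toy coordinates are hypothetical.

```python
import math


def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two already-superimposed structures.

    Each structure is a list of (x, y, z) C-alpha coordinates in angstroms,
    with residue i of `predicted` corresponding to residue i of `experimental`.
    """
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("structures must be non-empty and of equal length")

    # Distance between each predicted residue and its experimental position.
    distances = [math.dist(p, q) for p, q in zip(predicted, experimental)]

    # Fraction of residues falling within each distance cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]

    # Average over the cutoffs and rescale to 0-100.
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example (made-up coordinates): residue errors of 0.5, 1.5, 3.0 and 10.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (1.0, 1.5, 0.0), (2.0, 0.0, 3.0), (3.0, 10.0, 0.0)]
print(gdt_ts(predicted, experimental))  # 56.25
```

A perfect prediction scores 100, and a prediction in which no residue lands within 8 Å of its experimental position scores 0.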

Figure (source: deepmind.com/blog): Improvements in the median accuracy of predictions in the free modelling category for the best team in each CASP, measured as best-of-5 GDT.

As can be seen from the graph above, GDT scores in earlier years were very low, implying poor resemblance between the experimental and computational outcomes.

AlphaFold

DeepMind entered the competition for the first time in 2018 with its AlphaFold program and outperformed all other participants with a GDT score of around 60. Though not yet accurate enough, it gave real hope for a successful model. In 2020, AlphaFold achieved a median GDT score of 87 even on the most challenging targets, 25 GDT points higher than its nearest competitor.

AlphaFold: The key to the protein folding problem

In 2018, AlphaFold used deep learning, a subset of AI, to predict the distances between pairs of amino acids in a protein based on structural and genetic data. However, this approach could not take the team, led by John Jumper, the project lead for AlphaFold, any further, so they rethought their strategy. Jumper mentions that they started developing the program based on the principles of biology, physics and machine learning, together with the experience and work that experts in protein folding have accumulated over the past five decades.

How did they develop it?

This time, they also included additional information about the physical and geometrical constraints that play a role in determining the 3D conformation of a protein. The program was developed for the tough task of predicting the final structure of the target protein.

Jumper explained that if we consider a folded protein to be a ‘spatial graph’, then the amino acid residues can be said to be nodes and the residues in close proximity can be considered to be connected by edges.
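The ‘spatial graph’ picture can be sketched in a few lines of Python. This is only an illustration of the idea of residues as nodes and proximity as edges, not AlphaFold’s actual internal representation. The 8 Å cutoff, the function name and the toy coordinates are assumptions chosen for the example.

```python
import math
from itertools import combinations


def contact_graph(coords, cutoff=8.0):
    """Build an undirected graph from residue coordinates.

    `coords` is a list of (x, y, z) positions, one per residue (in angstroms).
    Two residues are joined by an edge when they lie within `cutoff` of each
    other. The cutoff value is an assumption made for this illustration.
    """
    graph = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Toy chain of four residues (made-up coordinates); the last one is far away.
residues = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
for node, neighbours in contact_graph(residues).items():
    print(node, sorted(neighbours))
```

In this toy chain, the first three residues are all connected to one another, while the fourth has no edges because it lies more than 8 Å from every other residue.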

The latest version of AlphaFold, used at CASP14, is based on a neural network system trained on the publicly available Protein Data Bank, which contains around 170,000 protein structures. The program was also trained on other large databases of protein sequences whose structures have not yet been determined. AlphaFold predicts a protein structure by repeatedly applying a system that uses evolutionarily related sequences, multiple sequence alignment (MSA), and a representation of amino acid residue pairs to refine this graph. The whole prediction process takes only a few days, as opposed to years of experimental research.

The impact

Professor Andrei Lupas (Director of the Max Planck Institute for Developmental Biology, Germany, and a judge for the CASP challenge) said that his lab had been working on a bacterial protein for almost a decade but could not make sense of the information obtained from the X-ray diffraction data. With the insights into the protein’s shape from AlphaFold, they could deduce the structure in half an hour. Professor Lupas complimented the DeepMind team on the accuracy of the model and said that this new milestone will help his group in their quest to understand how signals are transmitted across cell membranes.

So what role does this new landmark play in the common world?

AlphaFold cannot and will not replace experimental research, but as Professor Lupas commented, “It’s going to require more thinking and less pipetting.”

This quicker method of predicting protein structure can help scientists run faster experiments with less investment in terms of time and infrastructure. The time and money saved can be better used to study the function of proteins, the effect of structure on function, the causes of changes in protein structure, and the implications of such alterations in diseases such as Parkinson’s and Alzheimer’s. AlphaFold can serve as a boon to researchers and aid in drug discovery, the design of protein drugs, the development of enzymes for various applications, and much more. It can revolutionize research, solving a number of unsolved problems while opening up new avenues that are unthinkable at the moment.

A very recent application of AlphaFold is the prediction of the structures of the COVID-19 virus proteins ORF3a and ORF8, knowledge of which should prove significant in understanding various aspects of the viral infection as well as in the development of vaccines.

Conclusion

To sum it all up, a statement by Professor Lupas can be quoted: “It’s a game changer. This will change medicine. It will change research. It will change bioengineering. It will change everything.”

At Let’s Excel Analytics Solutions we solve life’s greatest problems with the help of Artificial Intelligence and Machine Learning. If you are stuck with such a problem and want to know how data science can solve it, then contact us.

Data Science

Data Science in Healthcare Industry

Introduction


Data is everywhere. From small businesses to large multinational organizations, and from the simple mathematical problems solved by a child to the complex functions executed in large organizations, data is used in almost every area of study and work.

Data is one of the most important assets of any organization because it helps leaders make decisions grounded in facts, statistical results and trends. Conclusions based on correct and concise data tend to be correct. Data can reveal a lot about an organization, and organizations rely heavily on it.

Due to the growing relevance and importance of data, data science came into the picture. Data science is a multidisciplinary field. It uses algorithms, scientific procedures and approaches to derive conclusions from massive amounts of data. This data can be either structured or unstructured. In this article, we shall be looking at data science in the healthcare industry.


Data Analytics in Healthcare


Medicine and healthcare are two of the most important components of our lives. Traditionally, medicine and medical advice were given solely by doctors based on the patient’s symptoms. However, this was not always accurate and was prone to errors. With the advancements in the field of data science, it is now possible to obtain a more accurate diagnosis. AI is being used not only as a tool for diagnosis but also for breakthrough discoveries. In a recent advance, Google achieved a major success in predicting protein structures, the very problem that biochemists had been trying to solve for decades.

Scientists have also developed the ‘DNA Nanopore Sequencer’, a tool that helps patients before they suffer from septic shock. It maps genetic sequences, which shortens the time needed for data processing. The tool also retrieves genomic information, handles BAM files, and performs computations.

Health data science applies analytics to data collected from various fields in order to strengthen the healthcare sector. Several areas in healthcare, such as drug discovery, medical imaging, genetics and predictive diagnosis, make full use of the results derived through data science techniques. With EMRs (electronic medical records), clinical trials and internet research, a great deal of data is accumulated every day. With the majority of people seeking healthcare advice online, gathering data has become increasingly convenient.

How can it work?

Let us now try to derive an insight into how data science and healthcare can become mutually beneficial.

  1. Data management and data governance: The opportunities derived from managing data efficiently are extensive. When data is managed effectively, information becomes easily accessible to everyone in the healthcare industry. When data is analysed and shared effectively among doctors and healthcare providers, it enables them to be more personal and humane in their approach to treatment. Since the healthcare sector has its fair share of risks, data analytics should always be at the top of its game: up-to-date and accurate. Data from medical records, ongoing patient condition charts, medical databases, genetic research and medical image diagnoses can be leveraged to uncover valuable information.
  2. Each patient’s medical records can be combined into one dataset, and then analysed and utilised when needed to derive the required conclusions.
  3. Data management also involves data sharing. Data can be shared across several datasets, eliminating the need for excessive paperwork.
  4. Analysing data repeatedly helps bring errors in clinical data to light.
  5. Cloud-based clinical software enables faster processing of data, leading to time saved when deciding on treatment or obtaining test results.
  6. Machine learning assists in shortening the process of drug discovery.
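The idea of combining each patient’s records into one dataset can be illustrated with a small Python sketch. It assumes that every source (for example a lab system and an imaging system) identifies patients by a shared ID. The field names, the function name and the sample values are hypothetical and are not taken from any real system.

```python
from collections import defaultdict


def combine_records(*sources):
    """Merge records from several sources into one entry per patient.

    Each source is a list of dicts that carry a "patient_id" key
    (a hypothetical field name). If two sources report the same field,
    the value from the later source wins.
    """
    combined = defaultdict(dict)
    for source in sources:
        for record in source:
            patient_id = record["patient_id"]
            for field, value in record.items():
                if field != "patient_id":
                    combined[patient_id][field] = value
    return dict(combined)


# Made-up sample data from two hypothetical systems.
lab_results = [{"patient_id": "P1", "hba1c": 6.1}]
imaging = [
    {"patient_id": "P1", "mri": "normal"},
    {"patient_id": "P2", "mri": "pending"},
]

dataset = combine_records(lab_results, imaging)
print(dataset["P1"])  # {'hba1c': 6.1, 'mri': 'normal'}
```

A real deployment would also need consent handling, access control and checks for conflicting values, all of which fall under the data governance described above.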

Challenges ahead


While data governance is recognized as crucial to healthcare, it needs to be prioritized more quickly so that data is accurate, complete, structured, precise and available. Data governance plays a pivotal role in patient engagement, care coordination, and looking after the overall health of the community. If data is not governed properly, different healthcare companies will release inconsistent data, which will prove to be a major hindrance. Healthcare data science apps exist to avoid such inconveniences.

Workflow Optimization and Process Improvements: Big data analytics is not yet deeply established in healthcare. Hence, certain decisions are taken based on ‘gut instinct’. Apart from this, the lack of coherent healthcare information exchange between systems and the shortage of skilled workers to fill knowledge gaps are two other challenges involved in the process.


Opportunities in Genetics/Genomics

  • Treatment personalization: New technologies, including new forms of genomic profiling and sequencing, offer a fresh look at the world of genomics. Genetic data is now produced faster than ever, and the techniques for structuring this data lag behind the ability to actually collect it. Healthcare produces copious amounts of data, but that data needs to be made sense of. Some of the challenges in the field of genomics are:
  • Studying human genetic variation and its impact on patients
  • Identifying genetic risk factors for drug response

Opportunities in Medical Imaging

  • Medical Imaging: Medical imaging refers to the process of creating a visual representation of the body for medical analysis and treatment. It is a non-invasive method for doctors to look inside the human body and decide on the required treatment plan. With the swift growth of healthcare and artificial intelligence, this process of medical imaging becomes easier. Types of medical imaging include tomography, longitudinal tomography, etc. The primary methods of medical imaging are X-ray computed tomography (CT), PET, and MRI. Medical imaging needs the images to be absolutely accurate. Even minor discrepancies can have catastrophic consequences for patients. The images need to be precisely viewed and interpreted. Data analysis refines these images by enhancing their characteristics.

Opportunities in Predictive Analytics

Predictive analytics refers to technology that learns from experience, i.e. data, to predict a patient’s behaviour. It builds a connection between the data and the actions that need to be taken based on that data. It lets healthcare providers apply predictive models, including models developed specifically in health data science, to identify risks even before they occur. However, there are some drawbacks to predictive analytics.
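As a rough illustration of ‘learning from data to predict’, the Python sketch below fits a one-feature logistic regression by gradient descent and then estimates the risk for a new patient. Logistic regression is used here only as a simple stand-in for a predictive model. The feature (a generic marker score), the toy data, the function names and the learning settings are all assumptions for the example and have no clinical meaning.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """Fit a one-feature logistic regression with plain gradient descent.

    The feature is centred on its mean so that training behaves well.
    Returns (weight, bias, feature_mean).
    """
    mean_x = sum(xs) / len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * (x - mean_x) + b) - y
            grad_w += error * (x - mean_x)
            grad_b += error
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
    return w, b, mean_x


def predict_risk(model, x):
    """Estimated probability (0 to 1) that the outcome occurs for score x."""
    w, b, mean_x = model
    return sigmoid(w * (x - mean_x) + b)


# Toy data (made up): a marker score and whether the outcome occurred (1) or not (0).
scores = [1, 2, 3, 4, 6, 7, 8, 9]
outcomes = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_logistic(scores, outcomes)
print(round(predict_risk(model, 8), 2), round(predict_risk(model, 2), 2))
```

On this toy data, a high marker score yields a risk above 0.5 and a low score a risk below 0.5. Real clinical models need many more features, careful validation and regulatory review.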

Predictive analytics is already being used in healthcare manufacturing to meet safety and efficacy requirements of drug products and medical devices.

Opportunities in Drug Research

If we look back to the time of another major pandemic, the Spanish Flu, we see that developing drugs and vaccines took a considerable amount of time. Now, with the help of data science, data from millions of test cases can be processed within weeks. Development of vaccines and other drugs has become easier and less time-consuming.

How can Let’s Excel Analytics Solutions help here?

We at Let’s Excel develop easy-to-use software interfaces using Artificial Intelligence and Machine Learning algorithms to take healthcare research to the next level with data science. Below is an example of the diagnosis of a tumor as benign or malignant using DataPandit’s MagicPCA solution.

Advantages

  • Less time taken and more precise outcomes lead to more effective work processes.
  • Healthcare providers and other staff get the chance to perform more tasks in limited time.
  • More effective work processes lead to higher recovery rates, faster reactions to crises and, in turn, fewer fatalities.
  • Patients get more personalized treatments.

Conclusion

Healthcare generates a vast amount of data every day. This data needs to be structured and organized so that meaningful conclusions can be derived from it. The healthcare industry needs to make heavy use of this data so that patients’ lifestyles can improve and diseases can be predicted before their onset. Moreover, with medical imaging analysis, it is now possible for doctors to find even the most microscopic tumours. Doctors can also monitor the conditions of their patients from remote locations.

Data science is already doing wonders for the healthcare industry. It is only a matter of time before it proves itself to be invaluable.

Data Science Journey

Data Science Journey: Guidance for the Newbie


Bill Gates once said, “A breakthrough in machine learning would be worth ten Microsofts.”

Considering the fast-paced development in the world of data science, his words are likely to come true. We live in the age of information, and it is quite usual to get overwhelmed by the amount of data we process each day, both in our professional and personal lives. The Internet these days is full of buzzwords related to machine learning, artificial intelligence, deep learning and the Internet of Things. Have you been wondering whether you can really make use of all these techniques in real life? Do you wish to begin your data science journey too? Then read this article to find out where you can begin as a newbie!

Data science is built on a foundation of mathematical and statistical concepts that are universally applicable to all the sciences. That is why data science is not limited to any specific field of study. It finds applications in numerous fields such as Healthcare, Food and Beverages, Petrochemicals, Agriculture, and Defence and Space. To back these claims, let’s take a look at some common applications of artificial intelligence and machine learning in the above-mentioned fields:

Field Name: Common Applications

Healthcare
  • Classification and quantification of raw materials: Non-destructive testing of raw materials using spectroscopic sensors like IR, NIR, Raman etc.
  • Distinguishing between materials: Innovator vs. generic product
  • Drug discovery: Quantitative Structure Activity Relationship, molecular modelling
  • Genomics: Personalised medicines or diet
  • Medical diagnosis: Cancer prediction
  • Material selection: Composition of materials that results in desired quality

Food and Beverages
  • Automating sensory evaluation of products
  • Classification and quantification of raw material: Identifying the source of raw materials and the nutritional profile of the material (% of carbohydrate, fat and protein)
  • Similarity between materials: Identifying a substitute for an ingredient
  • Material selection: Composition of materials that results in desired quality
  • Shelf life: When is the product likely to degrade?

Petrochemicals
  • Classification and quantification of raw materials: Non-destructive testing of raw materials using spectroscopic sensors like IR, NIR, Raman etc.

Agriculture
  • Better crop yield: Identifying seeds with superior quality
  • Crop quality/harvesting: Is it the best time to harvest the crop?
  • Shelf life: Predicting the shelf life of the harvested crop
  • Soil texture using sensors

Defence and Space
  • Material selection: Composition of materials that results in desired quality
  • Space exploration: Is there water on Mars?

Table: Data Science Applications in various fields

By now you may well be interested in this new-age mantra and be wondering whether, and how, it applies to you.

To find out, let’s begin by answering the questions below:

  • Are you dealing with large sets of data that do not make real sense to the human eye?
  • Are you currently using some tools to sort and analyze your data but still struggling and thus looking for a viable alternative?
  • Have you been told that the buzzwords of machine learning, artificial intelligence or the Internet of Things could solve a problem that you are faced with today?
  • Are you fascinated by this new avenue seen all over the internet, but find that taking the first steps seems too daunting to make any real progress?
  • Do you believe that trust is good but evidence is better?

If you answered yes to any of the above questions, then yes, the Data Science Journey is for you! Peter Sondergaard once famously said, “Information is the oil of the 21st century, and analytics is the combustion engine”.

The best part is that anyone can use data science techniques and benefit from them. You need not be a coder or an expert mathematician. Various software tools have been developed by experts in the field and can be purchased as per your requirements.

Our cloud-based DataPandit software is one such simple and user-friendly solution, developed by Let’s Excel Analytics Solutions. It enables you to get appropriate insights out of your data and leads you in the right direction.

Data science can be learnt not just with theory but with hands-on experience. It can be said that Data Science is a habit, not a skill. The more you practice it, the stronger you get.
