Introduction to DeepSomatic
Google has announced DeepSomatic, an AI tool that identifies cancer-related mutations in tumour genetic sequences more accurately than established methods. Cancer starts when the controls governing cell division malfunction. Finding the specific genetic mutations driving a tumour’s growth is essential for creating effective treatment plans. Doctors now regularly sequence tumour cell genomes from biopsies to inform treatments that target how a particular cancer grows and spreads.
The Challenge of Somatic Variants
Cancer genetics is complex. Genome sequencing can reveal the genetic variations in a cancer, but distinguishing real variants from sequencing errors is difficult, and this is where an AI tool can help. Most cancers are driven by ‘somatic’ variants acquired after birth rather than ‘germline’ variants inherited from parents. Somatic mutations happen when environmental factors like UV light damage DNA, or when random errors occur during DNA replication. When these variants alter normal cell behaviour, they can cause uncontrolled replication, driving cancer development and progression.
How DeepSomatic Works
In clinical settings, scientists sequence both tumour cells from a biopsy and normal cells from the patient. DeepSomatic spots the differences, identifying variations in tumour cells that aren’t inherited. These variations reveal what’s fuelling the tumour’s growth. The model converts raw genetic sequencing data from both tumour and normal samples into images representing various data points, including the sequencing data and its alignment along the chromosome. A convolutional neural network analyses these images to differentiate between the standard reference genome, the individual’s normal inherited variants, and cancer-causing somatic variants while filtering out sequencing errors. The output is a list of cancer-related mutations.
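To make the image-conversion step more concrete, here is a minimal Python sketch of how aligned reads from a normal and a tumour sample might be laid out as a pixel grid. It is an illustration under stated assumptions, not DeepSomatic's actual encoding: the `Read` class, the `encode_pileup` function, the choice of four channels per pixel, and the toy reads are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical numeric codes for bases; 0 means "no read covers this position".
BASE_CODES = {"A": 1, "C": 2, "G": 3, "T": 4}


@dataclass
class Read:
    start: int        # 0-based offset of the read's first base in the window
    bases: str        # aligned bases (no indels, to keep the sketch small)
    quals: list[int]  # per-base quality scores
    is_tumour: bool   # True for tumour reads, False for normal reads


def encode_pileup(reference: str, reads: list[Read]) -> list[list[tuple[int, int, int, int]]]:
    """Build a rows-by-columns grid of (base, quality, mismatch, is_tumour) pixels.

    Each row is one read and each column is one reference position.
    Normal reads are placed above tumour reads.
    """
    rows = []
    for read in sorted(reads, key=lambda r: r.is_tumour):
        row = [(0, 0, 0, 0)] * len(reference)
        for offset, (base, qual) in enumerate(zip(read.bases, read.quals)):
            col = read.start + offset
            if 0 <= col < len(reference):
                mismatch = int(base != reference[col])
                row[col] = (BASE_CODES.get(base, 0), qual, mismatch, int(read.is_tumour))
        rows.append(row)
    return rows


# Toy data: one normal read matching the reference, one tumour read with a G>T change.
reference = "ACGTA"
reads = [
    Read(start=1, bases="CTT", quals=[30, 20, 30], is_tumour=True),
    Read(start=0, bases="ACGTA", quals=[30] * 5, is_tumour=False),
]
image = encode_pileup(reference, reads)
print(image[1][2])  # (4, 20, 1, 1): a T, quality 20, mismatching, from the tumour read
```

A grid like this is the kind of input a convolutional neural network can scan for patterns: a mismatch column that appears only in the tumour rows suggests a somatic variant, while one that appears in both samples suggests an inherited variant.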
Training a More Precise AI Cancer Research Tool
Training an accurate AI model requires high-quality data. To train DeepSomatic, Google and its partners at the UC Santa Cruz Genomics Institute and the National Cancer Institute created a benchmark dataset called CASTLE. They sequenced tumour and normal cells from four breast cancer samples and two lung cancer samples. Each sample was analysed on three leading sequencing platforms, and the outputs were combined into a single, accurate reference dataset with platform-specific errors removed. The data shows how even the same cancer type can have vastly different mutational signatures, information that can help predict patient response to specific treatments.
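One simple way to picture combining outputs from several platforms is a majority vote: keep a variant only if enough platforms report it, so an error unique to one platform drops out. The Python sketch below illustrates that idea only. The article does not describe how CASTLE was actually assembled, so the voting rule, the `consensus_calls` function, the platform keys, and the variants are all assumptions made for illustration.

```python
from collections import Counter

# A variant is identified by (chromosome, position, reference allele, alternate allele).
Variant = tuple[str, int, str, str]


def consensus_calls(calls_by_platform: dict[str, set[Variant]], min_support: int = 2) -> list[Variant]:
    """Return variants reported by at least `min_support` platforms, in sorted order."""
    support = Counter()
    for calls in calls_by_platform.values():
        support.update(calls)  # each platform contributes one vote per variant
    return sorted(v for v, votes in support.items() if votes >= min_support)


# Toy call sets with placeholder platform names.
calls = {
    "platform_a": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "GA")},
    "platform_b": {("chr1", 100, "A", "T"), ("chr3", 300, "C", "G")},
    "platform_c": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "GA")},
}
print(consensus_calls(calls))
# [('chr1', 100, 'A', 'T'), ('chr2', 200, 'G', 'GA')]
```

Here the chr3 call is seen by only one platform and is dropped as a likely platform-specific error. Raising `min_support` to 3 would make the reference set stricter at the cost of discarding real variants that one platform missed.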
The Effectiveness of DeepSomatic
DeepSomatic models performed better than other established methods across all three major sequencing platforms. The tool excelled at identifying complex mutations called insertions and deletions, or ‘indels’. For these variants, DeepSomatic achieved a 90% F1-score on Illumina sequencing data, compared to 80% for the next-best method. The improvement was more dramatic on Pacific Biosciences data, where DeepSomatic scored over 80% while the next-best tool scored less than 50%. The AI also performed well on challenging samples, including a breast cancer sample preserved by formalin fixation and paraffin embedding (FFPE), a common method that can introduce DNA damage and complicate analysis.
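For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision (the share of reported variants that are real) and recall (the share of real variants that were reported). The short Python sketch below computes it from raw counts. The function name and the counts are illustrative and are not taken from the DeepSomatic evaluation.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when nothing was called correctly."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Toy counts: 90 real variants found, 10 false calls, 10 real variants missed.
print(round(f1_score(90, 10, 10), 3))  # 0.9

# A caller that misses many real variants scores low even when most of its calls are right.
print(round(f1_score(40, 10, 70), 3))  # 0.5
```

Because the harmonic mean is pulled towards the lower of the two values, a tool cannot reach a high F1-score by being cautious (high precision, low recall) or by over-calling (high recall, low precision).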
An AI Tool for All Cancers
The AI tool has shown it can apply its learning to cancer types it wasn’t trained on. When used to analyse a sample of glioblastoma, an aggressive brain cancer, it successfully pinpointed the few variants known to drive the disease. In a partnership with Children’s Mercy in Kansas City, it analysed eight samples of paediatric leukaemia and found the previously known variants while identifying 10 new ones, despite working with tumour-only samples. Google hopes research labs and clinicians will adopt this tool to better understand individual tumours. By detecting known cancer variants, it could help guide choices for existing treatments. By identifying new ones, it could lead to new therapies. The goal is to advance precision medicine and deliver more effective treatments to patients.
Conclusion
DeepSomatic is an AI tool that identifies cancer-related mutations in tumour genetic sequences more accurately than established methods. Its ability to detect complex mutations such as indels and to perform well on challenging samples makes it a valuable asset for researchers and clinicians. By adopting this tool, they can better understand individual tumours and develop more effective treatment plans. DeepSomatic has significant potential to advance precision medicine and improve patient outcomes, and its development is an exciting step forward in the fight against cancer.
FAQs
- What is DeepSomatic?
DeepSomatic is an AI tool that can identify cancer-related mutations in tumour genetic sequences more accurately.
- How does DeepSomatic work?
DeepSomatic spots the differences between tumour cells and normal cells, identifying variations in tumour cells that aren’t inherited. It converts raw genetic sequencing data into images and uses a convolutional neural network to differentiate between the standard reference genome, the individual’s normal inherited variants, and cancer-causing somatic variants.
- What kind of data is used to train DeepSomatic?
DeepSomatic is trained on a benchmark dataset called CASTLE, which includes sequenced tumour and normal cells from four breast cancer samples and two lung cancer samples.
- What are the benefits of using DeepSomatic?
DeepSomatic can help researchers and clinicians better understand individual tumours, develop more effective treatment plans, and advance precision medicine.
- Can DeepSomatic be used for all types of cancer?
Yes, DeepSomatic has shown it can apply its learning to new cancer types it wasn’t trained on, making it a valuable tool for researchers and clinicians working with various types of cancer.









