Can search engine data save lives from pancreatic cancer?

Gerd Gigerenzer discusses how search engines use big data analytics to “diagnose” your state of health | BMJ Opinion

Image source: NIH Image Gallery – Flickr // CC BY-NC 2.0

Image shows pancreatic desmoplasia. Pancreatic cancer is associated with a vast desmoplastic reaction in which the connective tissue around the tumor thickens and scars. 

Imagine this warning popping up on your search engine page: “Attention! There are signs that you might have pancreatic cancer. Please visit your doctor immediately.” Just as search engines use big data analytics to detect your book and music preferences, they may also “diagnose” your state of health.

Microsoft researchers have claimed that web search queries could predict pancreatic adenocarcinoma. A retrospective study of 6.4 million users of Microsoft’s search engine Bing identified first-person queries suggestive of a recent diagnosis, such as “I was told I have pancreatic cancer, what to expect.” Then the researchers went back months before these queries were made and looked for earlier ones indicating symptoms or risk factors, such as blood clots and unexplained weight loss. They concluded that their statistical classifiers “can identify 5% to 15% of cases, while preserving extremely low false-positive rates (0.00001 to 0.0001)”, and that “this screening capability could increase 5-year survival.” The New York Times reported: “The study suggests that early screening can increase the five-year survival rate of pancreatic patients to 5 to 7 percent, from just 3 percent.”
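To get a feel for what those figures would mean in practice, here is a rough back-of-the-envelope sketch. It takes the quoted detection rates (5% to 15%) and false-positive rates (0.00001 to 0.0001) and applies them to the 6.4 million users in the study. The prevalence of 1 in 10,000 is a purely hypothetical assumption chosen for illustration; it is not a figure from the study, and the function and variable names are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class ScreeningResult:
    true_positives: float
    false_positives: float

    @property
    def ppv(self) -> float:
        """Share of flagged users who actually have the disease."""
        flagged = self.true_positives + self.false_positives
        return self.true_positives / flagged if flagged else 0.0


def screen(population, prevalence, sensitivity, false_positive_rate):
    """Expected true and false alarms for a screened population."""
    cases = population * prevalence
    non_cases = population - cases
    return ScreeningResult(
        true_positives=cases * sensitivity,
        false_positives=non_cases * false_positive_rate,
    )


POPULATION = 6_400_000           # users in the retrospective study
ASSUMED_PREVALENCE = 1 / 10_000  # hypothetical, for illustration only

if __name__ == "__main__":
    # Combine the low and high ends of the quoted ranges.
    for sensitivity in (0.05, 0.15):
        for fpr in (0.00001, 0.0001):
            result = screen(POPULATION, ASSUMED_PREVALENCE, sensitivity, fpr)
            print(
                f"sensitivity={sensitivity:.2f} fpr={fpr:.5f} "
                f"true alarms={result.true_positives:.0f} "
                f"false alarms={result.false_positives:.0f} "
                f"PPV={result.ppv:.1%}"
            )
```

Under this assumed prevalence, the share of warnings that are correct swings widely depending on which end of each quoted range applies, which is why a low false-positive rate alone says little about how trustworthy an individual warning would be.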

Read the full blog post here

How big data is being mobilised in the fight against leukaemia

In a project funded by Bloodwise and the Scottish Cancer Foundation, we have created LEUKomics. This online data portal brings together a wealth of CML gene expression data from specialised laboratories across the globe | Lorna Jackson & Lisa Hopcroft for The Conversation

Chronic myeloid leukaemia (CML)

Image source: Paulo Henrique Orlandi Mourao – Wikimedia // CC BY-SA 3.0

Our intention is to eliminate the bottleneck surrounding big data analysis in CML. Each dataset undergoes manual quality checks and all the computational processing needed to extract information on gene expression. This gives immediate access to, and interpretation of, data that previously would not have been easily accessible to academics or clinicians without training in specialised computational approaches.

Consolidating these data into a single resource also allows large-scale, computationally-intensive research efforts by bioinformaticians (specialists in the analysis of big data in biology). From a computational perspective, the fact that CML is caused by a single mutation makes it an attractive disease model for cancer stem cells. However, existing datasets tend to have small sample numbers, which can limit their potential.

Read the full blog post here

Can big data help cancer patients avoid ER visits?

For this project, doctors and data miners are specifically focusing on lung cancer patients | ScienceDaily

By flagging things like recent lab tests, radiology visits, or patient-reported symptoms, Penn’s team is hoping to come up with a formula that will predict when a patient is likely to end up visiting the emergency room. Right now, the formula can predict an estimated one out of every three ER visits, giving doctors the chance to take action before a patient gets to that point.

Read the full overview here