The power of technology during this outbreak

The new coronavirus outbreak is going global. Emerging technologies such as big data are seen as having played a notable part in preventing and containing the spread of the novel coronavirus, which has raged across China.

How is technology helping fight the impact of the coronavirus? What are the tools and methods behind these efforts?

Today let’s dive into five examples that are being used right now.

AutoNavi
As China’s workforce resumes work and production in the coming weeks, a top concern for many people is how crowded public transportation will be.

According to AutoNavi (known in Chinese as Gaode Ditu), in addition to showing the real-time passenger flow on subway lines, it plans to launch more real-time traffic information in the near future to help users make better decisions about their travel arrangements.

Functionality:
Real-time passenger density tracking on subway and train lines.

Dataset:
Data is provided by the Beijing Municipal Transportation Commission and covers all subway lines and stations in the city. It is integrated with a map API for GIS location mapping.

Method:

Technology tool used:
Real-time streaming (most likely; the actual choice depends on AutoNavi's internal tech stack).

Data pipeline:
A message queue connecting producers and consumers (for example Kafka), Spark Streaming for real-time analysis such as aggregating traffic counts, and a datastore such as HBase (note that HBase is a NoSQL store, so SQL access needs an additional query layer), along with other in-house tools.
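The aggregation step in such a pipeline can be sketched in plain Python. This is only an illustration of a tumbling-window count, the kind of operation a streaming engine like Spark Streaming would run on events read from a queue. It is not AutoNavi's implementation: the event format, the 60-second window, and the sample events are all assumptions made up for the example.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 60  # assumed window size, chosen only for illustration


def window_start(timestamp, size=WINDOW_SECONDS):
    """Return the start of the tumbling window that contains the timestamp."""
    return timestamp - timestamp % size


def aggregate(events, size=WINDOW_SECONDS):
    """Count events per station within each tumbling window.

    events: iterable of (timestamp_in_seconds, station_name) tuples,
    a hypothetical message format standing in for what a queue would carry.
    """
    counts = defaultdict(Counter)
    for timestamp, station in events:
        counts[window_start(timestamp, size)][station] += 1
    return counts


# Made-up sample events: (seconds since start, station where a passenger was counted)
events = [
    (0, "Xizhimen"), (15, "Xizhimen"), (42, "Guomao"),
    (61, "Xizhimen"), (75, "Guomao"), (118, "Guomao"),
]

result = aggregate(events)
for start in sorted(result):
    print(start, dict(result[start]))
```

In a real deployment the events would arrive continuously and the counts would be written to the datastore as each window closes, rather than being computed over a finished list.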


MIT researchers used a machine-learning algorithm to identify a drug called halicin that kills many strains of bacteria. Halicin (top row) prevented the development of antibiotic resistance in E. coli, while ciprofloxacin (bottom row) did not. (Image courtesy of the Collins Lab at MIT)
Functionality:
The researchers developed machine-learning models that can be trained to analyze the molecular structures of compounds and correlate them with particular traits, such as the ability to kill bacteria.

Method:
Predicting from data whether a compound can kill bacteria is a classification problem. A random forest is a good baseline, and an SVM is another candidate algorithm. A random forest is an ensemble of decision trees, each fit to a bootstrapped sample of the training data.

A random forest classifier typically predicts by majority vote across all trees in the forest, which corresponds to a classification threshold of 0.5 on the fraction of trees voting for the positive class. A stricter threshold, such as 0.8, can be chosen instead. To evaluate the model, plot the receiver operating characteristic (ROC) curve, which shows the true positive rate against the false positive rate as the classification threshold is varied from 0 to 1. The closer the ROC curve is to the top-left corner, the better the prediction quality. A common metric for overall prediction quality is therefore the area under the ROC curve, the AUC.
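To make the ROC and AUC definitions concrete, here is a minimal sketch that computes both from scratch. The labels and scores are made-up toy values, not results from the antibiotic study, and the function names are chosen only for this example.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs.

    The threshold sweeps from the highest score to the lowest; a sample is
    predicted positive when its score is at or above the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve using the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy data: 1 = compound kills bacteria, 0 = it does not (made up for illustration)
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
area = auc(points)
print(points)
print(round(area, 4))
```

An AUC of 1.0 means every positive example is scored above every negative one, while 0.5 is what random scoring would give on average.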

This application uses a library of about 6,000 compounds from the Broad Institute’s Drug Repurposing Hub as a test dataset, along with 100 million molecules selected from the ZINC15 database. In general, if the dataset is high-dimensional, feature-importance analysis can be used to see which variables are the most important predictors. For tree-based models, the scikit-learn library exposes feature importance scores for all attributes through the feature_importances_ attribute; these scores reflect how much each feature reduces impurity across the trees.
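As a rough illustration of reading feature importances from a random forest, the sketch below trains scikit-learn's RandomForestClassifier on synthetic data. The data is randomly generated and has nothing to do with real molecular features; the label is deliberately made to depend on the first feature only, so that feature should come out as the most important. The feature names f0 to f3 are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for illustration): 500 samples, 4 features
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] > 0).astype(int)  # the label depends on feature 0 only

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importance scores, one per feature, normalized to sum to 1
importances = model.feature_importances_
for name, score in zip(["f0", "f1", "f2", "f3"], importances):
    print(f"{name}: {score:.3f}")
```

Impurity-based scores can be biased toward features with many distinct values, so it is worth cross-checking them against another method, such as permutation importance, before drawing conclusions.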

Security staff members check passengers’ temperature at Jinggangshan Airport in Ji’an, East China’s Jiangxi province, Feb 10, 2020. The airport has taken measures such as increasing disinfection frequency and testing passengers’ temperature to curb the spread of the novel coronavirus. Photo from Xinhua.
Functionality:
A screening, identity recognition, and body temperature warning system. It is a typical computer vision application combined with thermography to monitor human body temperature.

Dataset: 
FLIR dataset

Tool:
A thermal camera from FLIR (the name stands for forward-looking infrared).

Algorithm: 
YOLO can be used for object detection; Darknet, the framework YOLO was originally implemented in, is a good candidate.

Transfer learning is highly recommended as a first step in training. It is the more common approach in industry; only large labs are likely to train their own models from scratch. Starting from convolutional weights pre-trained on ImageNet saves time and cost when training on a customized dataset. TensorFlow and its model zoo provide pre-trained models, which are one option for the training and inference stages. The underlying model is a basic CNN built from neurons, convolutional layers, and dense layers.

After training, the test dataset is used to check the mAP (mean average precision) and IoU (intersection over union) scores. The train and test steps are repeated until the model meets its performance target. That target comes from the business problem defined before the project starts, normally during the PoC phase. The model is then typically deployed to the application using a cloud platform's deployment tooling, such as AWS.
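The IoU score mentioned above measures how well a predicted bounding box overlaps the ground-truth box. A minimal sketch follows, assuming boxes are given as (x_min, y_min, x_max, y_max) tuples in pixel coordinates; the box format and the sample boxes are assumptions for the example, not taken from any particular detection library.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max).
    """
    # Corners of the overlapping rectangle, if the boxes overlap at all
    inter_x_min = max(box_a[0], box_b[0])
    inter_y_min = max(box_a[1], box_b[1])
    inter_x_max = min(box_a[2], box_b[2])
    inter_y_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero for boxes that do not overlap
    inter_area = max(0.0, inter_x_max - inter_x_min) * max(0.0, inter_y_max - inter_y_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union_area = area_a + area_b - inter_area

    return inter_area / union_area if union_area > 0 else 0.0


# Made-up boxes: a predicted face region and its ground-truth annotation
predicted = (50, 50, 150, 150)
ground_truth = (100, 100, 200, 200)
score = iou(predicted, ground_truth)
print(round(score, 4))
```

A detection is commonly counted as correct when its IoU with the ground truth exceeds a chosen cutoff such as 0.5, and mAP is then computed from the resulting precision and recall values.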


Author

Chloe Ji
A self-taught programmer who codes mainly in Python, also in JavaScript, and is new to Scala. She currently works as a data scientist in the blockchain industry and previously worked on computer vision. She is interested in open source projects and big data, and is a keen cyclist around town.
Solomon Soh

Solomon Soh is a Data Science Consultant for UpLevel Singapore. He specializes in operational, marketing and financial analysis, with a flair for applying ML and RL models to understand consumer behavior. An ex-management consultant, he appreciates the importance of applying data science correctly to solve business or societal problems.


At this moment, hundreds of data scientists from around the world are working on data science projects about the coronavirus. 

Stay healthy and positive, spring is coming soon!