Machine Learning
Big Data Exposure
Programming
Analytical Skills
Deadlifting
Singing
'bout Daryl
I'm Daryl, a final-year student at the National University of Singapore, majoring in Business Analytics.
​
I'm a student by day and an analyst by night (and a weightlifter by noon). I'm passionate about data science and believe that data is all around us, provided we know where to look.
Improving yield (the percentage of manufactured dies that are functional) is critical to the business results of semiconductor operations. Micron collects a vast amount of data from its process equipment during manufacturing.
​
As an intern, I was part of a development team that used this data to improve yield by identifying the root causes of failures. The data was stored on an internal HDFS and extracted with Spark using Scala. The analysis was then run in R, using a gradient boosting machine (GBM) to identify the root causes.
My role on the team was to develop production-level R code for the analysis. Besides testing, improving and implementing the model, I also proposed and implemented the parallelisation of the code on Spark using SparkR, which resulted in substantial time savings.
​
Besides improving speed, I also improved the model's accuracy by implementing several data pre-processing techniques, such as the handling of sparse factors.
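The exact pre-processing used in the project is not described here. As a rough illustration only, one common way to deal with sparse categorical data is to merge rarely seen levels into a single catch-all level before fitting a model. The sketch below is written in Python rather than the R used in the project, and the function name `lump_rare_levels`, the `min_count` threshold, the `OTHER` label and the equipment IDs are all hypothetical.

```python
from collections import Counter


def lump_rare_levels(values, min_count=5, other_label="OTHER"):
    """Replace levels seen fewer than min_count times with other_label.

    Hypothetical helper for illustration; not the project's actual code.
    """
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


# Made-up equipment IDs: "T3" and "T4" appear too rarely to learn from.
tools = ["T1"] * 6 + ["T2"] * 5 + ["T3"] * 2 + ["T4"]
lumped = lump_rare_levels(tools, min_count=5)

# Count how many observations each remaining level has.
print(Counter(lumped))
```

Merging rare levels this way trades some detail for stability: a tree-based model such as a GBM no longer splits on levels backed by only a handful of observations.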