Hi, I’m Annie! 👋

Storyteller. Visual creator. Data enthusiast. I turn complex insights into simple, visually compelling stories that make people say, 'Now I get it.'

MEng in Data Science @ UCLA
A recent UCLA Data Science graduate with 5 years of experience in data, specializing in predictive modeling, data visualization, and statistical analysis across multiple domains, including retail and manufacturing. I'm dedicated to helping organizations succeed by making data-driven strategies accessible and effective.

Right now, I’m seeking my next move. Please check out my resume for more details, and feel free to reach me via LinkedIn or Email. Let’s connect and explore data together!

Something About Me

🍰 Food and travel enthusiast
🤝 Enjoy meeting new people and exploring new cultures
💻 Speak fluent Python, SQL, Power BI, and Tableau

Professional Experience

Sales Data Analyst, Retail and Ecommerce @ GOOSH INC (06/2023 – Present)

Analyzed sales data to identify performance improvements, contributing to a 20% sales increase.
Built Power BI dashboards for data-driven decision-making.

Data Scientist, Manufacturing @ Nan Ya Plastics Corporation (08/2019 – 05/2022)

Applied machine learning models to optimize production processes, reducing resource waste by 50%.
Led digital transformation initiatives with data visualization tools and robotic process automation techniques.

Projects

Anticipating Tomorrow: Predicting Radiologist Case Volumes

Date: Dec 2023

In collaboration with Massachusetts General Hospital, this project developed a forecasting system using LightGBM and ensemble models to predict radiologist workloads 1 to 7 days ahead. With a SMAPE (symmetric mean absolute percentage error) below 10%, the system improved healthcare resource allocation by supporting better staff planning, reducing delays, and enhancing patient care efficiency.
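For readers unfamiliar with the metric, here is a minimal sketch of how SMAPE can be computed. It assumes the common definition that divides by the mean of the absolute actual and forecast values; the project may have used a different variant. The function name and the sample numbers are hypothetical and are not project data.

```python
def smape(actual, forecast):
    """Symmetric MAPE in percent, using (|a| + |f|) / 2 as the denominator.

    Assumption: this is one common SMAPE definition, not necessarily
    the exact variant used in the project.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2
        if denom == 0:
            continue  # both values are zero, so the error term is zero
        total += abs(f - a) / denom
    return 100 * total / len(actual)


# Hypothetical daily case volumes, for illustration only
example_score = smape([100, 200], [110, 180])
```

A lower score means the forecast tracks the actual volumes more closely, and a perfect forecast scores 0.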

AI Resource Optimization and Waste Reduction

Date: May 2022

This project focused on building AI-driven predictive models to optimize production planning in plastic manufacturing. By identifying key factors influencing wastewater pollutants, we reduced chemical oxygen demand (COD) levels from 45,000 ppm to 30,000 ppm, cutting treatment costs and minimizing environmental impact. This approach improved resource management, reduced waste, and increased production efficiency by 20%.

Vectorized Similarity Search in Multi-modal Databases

Date: Dec 2023

In this project, we developed a multimodal database using CLIP embeddings to enable text-to-image and image-to-text searches. With k-nearest-neighbor (kNN) and approximate nearest-neighbor (ANN) algorithms, it achieved 96% precision on the MS-COCO dataset. The project also included a user-friendly interface built with Streamlit, which makes it quick and easy to run dynamic searches between text and images.
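The core idea behind the search can be sketched as an exact (brute-force) kNN lookup by cosine similarity. This is an illustration only: the three-dimensional vectors and file names below are hypothetical stand-ins for real CLIP embeddings, and the project's actual index and ANN method are not shown here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def knn_search(query, index, k=3):
    """Return the k most similar items as (key, score), best first."""
    scored = [(cosine(query, vec), key) for key, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(key, score) for score, key in scored[:k]]


# Hypothetical toy embeddings; real CLIP vectors have hundreds of dimensions
toy_index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.8, 0.3, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}
results = knn_search([1.0, 0.0, 0.0], toy_index, k=2)
```

Because text and images share one embedding space, the same lookup works in both directions. ANN methods trade a little accuracy for much faster search on large collections.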

Enhancing EEG Data Classification with CNN, CRNN, and Ensemble Models

Date: May 2023

This project aimed to improve the accuracy of EEG data classification for better brain signal interpretation. It applied CNN, CRNN, and CRNN with Attention models to classify EEG data, and the results showed that CNNs excel at capturing short-term patterns, while CRNNs handle longer sequences effectively. By ensembling 64 models, the system achieved 73.6% accuracy, showcasing the benefits of combining diverse model architectures for EEG signal analysis.
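One common way to combine several classifiers is soft voting, which averages each model's predicted class probabilities and picks the highest average. The sketch below assumes that approach purely for illustration; the project's actual ensembling strategy may differ, and the function name and probabilities are hypothetical.

```python
def soft_vote(prob_sets):
    """Average per-class probabilities across models for one sample.

    prob_sets: one probability list per model, all of the same length.
    Returns (predicted_class_index, averaged_probabilities).
    """
    if not prob_sets:
        raise ValueError("need at least one model's probabilities")
    n_models = len(prob_sets)
    n_classes = len(prob_sets[0])
    averaged = [
        sum(probs[c] for probs in prob_sets) / n_models
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: averaged[c])
    return predicted, averaged


# Hypothetical outputs from three models on a single two-class sample
label, averaged = soft_vote([[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]])
```

Averaging lets models that are confident and correct outweigh a single model that is wrong, which is one reason diverse architectures tend to help.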

Building Word Embeddings with GloVe for NLP Applications

Date: Mar 2023

In this project, we built and trained the GloVe model to create word embeddings by analyzing word co-occurrence patterns. Our work involved constructing and optimizing a word-word co-occurrence matrix, training the embeddings using weighted least squares, and fine-tuning hyperparameters for improved performance. We evaluated the model on word analogy and similarity tasks, achieving 75% accuracy in analogy prediction. These embeddings improved performance in downstream tasks such as information retrieval.
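The weighted least squares objective can be sketched for a single word pair as follows. The weighting function follows the published GloVe formulation, and the defaults x_max = 100 and alpha = 0.75 are the values suggested in the original GloVe paper, not necessarily the hyperparameters tuned in this project. The function names are hypothetical.

```python
import math


def glove_weight(x, x_max=100.0, alpha=0.75):
    """GloVe weighting: damp rare pairs and cap frequent ones at 1."""
    return (x / x_max) ** alpha if x < x_max else 1.0


def pair_loss(w_i, w_j, b_i, b_j, x_ij):
    """Weighted squared error for one co-occurring word pair.

    w_i, w_j: word and context vectors; b_i, b_j: their bias terms;
    x_ij: co-occurrence count (must be > 0, since log(0) is undefined).
    """
    dot = sum(a * b for a, b in zip(w_i, w_j))
    error = dot + b_i + b_j - math.log(x_ij)
    return glove_weight(x_ij) * error ** 2


# Hypothetical vectors, for illustration only
example_loss = pair_loss([1.0, 0.0], [0.0, 1.0], 0.0, 0.0, 1.0)
```

The full training objective sums this loss over every pair with a nonzero co-occurrence count, so the vectors learn to reproduce the log counts.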