
MIKE ANDERSON
MACHINE LEARNING SPECIALIST
Mike is a Machine Learning Engineer and Data Scientist with over five years of experience who was promoted three times at National Guardian Life on the strength of his technical expertise and business acumen. He has a proven track record in machine learning, data analysis, and AI integration, and is currently pursuing an M.S. in Artificial Intelligence at Johns Hopkins University. A top-10% performer in machine learning competitions and a founding member of NGL's AI Committee, he is proficient in Python, SQL, and AWS tooling, with a strong background in quantitative analysis, data mining, and data visualization, and has led cross-functional teams to deliver impactful, data-driven insights.

SKILLSET
MACHINE LEARNING
Supervised learning: regression (linear, logistic), classification, and model tuning
Unsupervised learning: clustering (KMeans, GMM, DBSCAN), dimensionality reduction (PCA, t-SNE, UMAP)
Ensemble methods: Random Forest, Gradient Boosting, XGBoost, CatBoost, LightGBM
Time series modeling: ARIMA, Prophet, rolling stats, and seasonality decomposition
Evaluation metrics: ROC-AUC, F1 score, confusion matrices, precision/recall tradeoffs
Model selection: cross-validation, grid/randomized search, hyperparameter tuning (see the sketch after this list)
Feature engineering: transformation pipelines, interaction terms, encoding, scaling
Explainability: SHAP, permutation importance, partial dependence plots
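A minimal sketch of the model-selection workflow above, using scikit-learn's GridSearchCV with 5-fold cross-validation and ROC-AUC scoring; the dataset and parameter grid are illustrative placeholders, not taken from a client project:

```python
# Cross-validated hyperparameter tuning with scikit-learn (illustrative data/grid).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 5-fold cross-validated grid search over a small Random Forest grid
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10]},
    scoring="roc_auc",
    cv=5,
)
search.fit(X_train, y_train)

# Evaluate the best model on held-out data with ROC-AUC
probs = search.best_estimator_.predict_proba(X_test)[:, 1]
print(search.best_params_, roc_auc_score(y_test, probs))
```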
DEEP LEARNING
Fully connected (dense) networks: feedforward architectures, dropout, batch normalization (see the sketch after this list)
Convolutional Neural Networks (CNNs): for image recognition, object detection, edge cases
Recurrent Neural Networks (RNNs): LSTM, GRU for sequential tasks like time series and text
Embeddings: learned representations via Word2Vec, Doc2Vec, and TensorFlow/Keras Embedding layers
Optimization: backpropagation, gradient descent variants (Adam, RMSProp, SGD)
Overfitting strategies: regularization, dropout, data augmentation
Frameworks: PyTorch, TensorFlow, Keras — including building from scratch for deeper understanding
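A minimal sketch of a dense feedforward network with dropout and batch normalization in Keras; the layer sizes and synthetic training data are illustrative assumptions:

```python
# Small dense network with dropout and batch normalization (tf.keras).
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),
    layers.BatchNormalization(),
    layers.Dropout(0.3),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=keras.optimizers.Adam(1e-3),
              loss="binary_crossentropy", metrics=["AUC"])

# Synthetic data used purely to show the training API shape
X = np.random.rand(1000, 20).astype("float32")
y = (X.sum(axis=1) > 10).astype("float32")
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)
```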
AI & NLP
Generative AI: prompt engineering, fine-tuning large language models (LLMs), text summarization
NLP: transformers (BERT, RoBERTa, T5), sentiment analysis, named entity recognition, embeddings
Search + retrieval: vector search with FAISS, embedding-based similarity, RAG (retrieval augmented generation); see the sketch after this list
Reinforcement learning: policy gradients, Q-learning, basic agent-environment frameworks
Ethics and alignment: fairness, bias detection, interpretability, responsible deployment
AI-driven systems: end-to-end ML/AI architectures that solve real business problems
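A minimal sketch of embedding-based similarity search with FAISS, the kind of index that backs a RAG retrieval step; random vectors stand in for real transformer embeddings:

```python
# Exact nearest-neighbour search over embeddings with FAISS (illustrative data).
import numpy as np
import faiss

dim = 384                                   # assumed embedding dimensionality
corpus = np.random.rand(10_000, dim).astype("float32")

index = faiss.IndexFlatL2(dim)              # exact L2 index; swap for IVF/HNSW at scale
index.add(corpus)

query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, 5)     # top-5 nearest neighbours
print(ids[0], distances[0])
```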
MLOPS
Pipeline orchestration: training, validation, deployment, monitoring
Drift detection: custom solutions for feature and prediction distribution shift (see the sketch after this list)
Model versioning: MLflow, DVC, model registries, reproducibility
Automated retraining: scheduled jobs, model decay triggers, batch/online updates
Monitoring: prediction confidence tracking, input schema validation, latency thresholds
Integration with business systems: embedding models into APIs, web tools, or BI layers
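A minimal sketch of feature-drift detection using a two-sample Kolmogorov-Smirnov test from SciPy; the significance threshold and synthetic data are illustrative assumptions, not a production configuration:

```python
# Flag features whose live distribution has shifted from a reference window.
import numpy as np
from scipy.stats import ks_2samp

def drifted_features(reference: np.ndarray, live: np.ndarray, alpha: float = 0.01):
    """Return column indices whose live distribution differs from the reference."""
    flagged = []
    for col in range(reference.shape[1]):
        _, p_value = ks_2samp(reference[:, col], live[:, col])
        if p_value < alpha:
            flagged.append(col)
    return flagged

reference = np.random.normal(0, 1, size=(5_000, 3))
live = reference.copy()
live[:, 2] += 0.5                            # simulate a shifted feature
print(drifted_features(reference, live))     # expect [2]
```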
CLOUD PLATFORMS
AWS: S3, EC2, SageMaker, Lambda, Glue, Athena, Redshift, CloudWatch
Azure: Azure ML, Blob Storage, Synapse, Azure DevOps, Key Vault
GCP (familiarity): BigQuery, Vertex AI, GCS
Infrastructure-as-Code: Terraform for provisioning repeatable, secure cloud environments
Cost optimization: data lifecycle policies, autoscaling, spot instances
CI/CD & DEVOPS
GitHub Actions: secure workflows for testing, packaging, deployment
DevSecOps: gated merges, environment secrets, PR labeling, unit tests
Package automation: Poetry-based versioning, wheel builds, wheelhouse packaging
Multi-environment deployment: staging → UAT → prod rollouts, rollback plans
Notebook → production transitions: converting notebooks to robust Python modules and services (see the sketch below)
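A minimal sketch of a notebook-to-module transition: logic moved into a plain Python function plus a pytest test that a CI workflow (for example, a GitHub Actions job) can run on every pull request; the module and column names are hypothetical:

```python
# churn_features.py (hypothetical module extracted from a notebook)
import pandas as pd

def add_tenure_bucket(df: pd.DataFrame) -> pd.DataFrame:
    """Bucket customer tenure into coarse bands for downstream models."""
    out = df.copy()
    out["tenure_bucket"] = pd.cut(
        out["tenure_months"],
        bins=[0, 12, 36, 120],
        labels=["new", "established", "long_term"],
    )
    return out

# test_churn_features.py — run with pytest in CI
def test_add_tenure_bucket():
    df = pd.DataFrame({"tenure_months": [3, 24, 60]})
    result = add_tenure_bucket(df)
    assert list(result["tenure_bucket"]) == ["new", "established", "long_term"]
```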
DATA ENGINEERING
ETL pipelines: ingesting from APIs, databases, flat files, streaming sources
Data wrangling: pandas, PySpark, SQL joins, window functions, cleaning pipelines
Storage design: normalized + denormalized schemas, Delta Lake, data lakes vs. warehouses
Job scheduling: Airflow, AWS Step Functions, cron-based tasks
Graph databases: Neo4j design for knowledge graphs and entity relationship mapping
Data quality checks: validation, profiling, anomaly detection models for pipeline health (see the sketch below)
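A minimal sketch of a small pandas ETL step combining a per-group rolling (window) aggregate with a basic data-quality assertion; the table and column names are hypothetical:

```python
# Clean a small raw extract, enforce a quality rule, and add a window aggregate.
import pandas as pd

raw = pd.DataFrame({
    "policy_id": ["A", "A", "A", "B", "B"],
    "month": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-03-01",
                             "2024-01-01", "2024-02-01"]),
    "premium": [100.0, 102.0, None, 95.0, 96.0],
})

# Data-quality check: keys must never be null before loading downstream
null_counts = raw.isna().sum()
assert null_counts["policy_id"] == 0, "policy_id must never be null"

# Drop incomplete rows, then compute a 3-month rolling mean per policy
clean = raw.dropna(subset=["premium"]).sort_values(["policy_id", "month"])
clean["premium_3mo_avg"] = (
    clean.groupby("policy_id")["premium"]
         .transform(lambda s: s.rolling(3, min_periods=1).mean())
)
print(clean)
```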
DATA VISUALIZATION & BI
Business Intelligence: Tableau, Power BI, AWS QuickSight
Python viz: Matplotlib, Seaborn, Plotly, Dash, Streamlit (custom tools + prototypes; see the sketch after this list)
Storytelling: crafting narratives through data with audience-specific framing
Interactive dashboards: filters, drilldowns, parameterized views for stakeholder control
Model explainability UI: SHAP plots, feature importance summaries, prediction explanations
Real-time insights: dashboards connected to streaming/near-real-time data pipelines
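A minimal sketch of an interactive Streamlit prototype of the kind listed above; the synthetic confidence data and widget choices are illustrative assumptions:

```python
# Tiny Streamlit dashboard: run with `streamlit run app.py`.
import numpy as np
import pandas as pd
import streamlit as st

st.title("Prediction confidence monitor (demo)")

# Interactive control: how many days of history to display
days = st.slider("Days to display", min_value=7, max_value=90, value=30)

# Synthetic daily confidence scores standing in for a real pipeline feed
data = pd.DataFrame(
    {"mean_confidence": np.clip(np.random.normal(0.8, 0.05, days), 0, 1)},
    index=pd.date_range(end=pd.Timestamp.today(), periods=days),
)

st.line_chart(data)
```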