MIKE ANDERSON

MACHINE LEARNING SPECIALIST

Mike is a dynamic, highly skilled Machine Learning Engineer and Data Scientist with over 5 years of experience who was promoted three times at National Guardian Life for exceptional technical expertise and business acumen. He has a proven track record in machine learning, data analysis, and AI integration and is currently pursuing an M.S. in Artificial Intelligence at Johns Hopkins University. He is a top-10% performer in machine learning competitions and a founding member of NGL's AI Committee. Mike is proficient in Python, SQL, and AWS tools, with a strong background in quantitative analysis, data mining, and data visualization, and has demonstrated leadership on cross-functional teams, delivering impactful, data-driven insights.

SKILLSET

    • Supervised learning: regression (linear, logistic), classification, and model tuning

    • Unsupervised learning: clustering (KMeans, GMM, DBSCAN), dimensionality reduction (PCA, t-SNE, UMAP)

    • Ensemble methods: Random Forest, Gradient Boosting, XGBoost, CatBoost, LightGBM

    • Time series modeling: ARIMA, Prophet, rolling statistics, and seasonality decomposition (sketch after this list)

    • Evaluation metrics: ROC-AUC, F1 score, confusion matrices, precision/recall tradeoffs

    • Model selection: cross-validation, grid/randomized search, hyperparameter tuning (sketch after this list)

    • Feature engineering: transformation pipelines, interaction terms, encoding, scaling

    • Explainability: SHAP, permutation importance, partial dependence plots (sketch after this list)

    • Fully connected (dense) networks: feedforward architectures, dropout, batch normalization

    • Convolutional Neural Networks (CNNs): for image recognition, object detection, edge cases

    • Recurrent Neural Networks (RNNs): LSTM, GRU for sequential tasks like time series and text

    • Embeddings: learned representations via Word2Vec, Doc2Vec, and TensorFlow/Keras Embedding layers

    • Optimization: backpropagation, gradient descent variants (Adam, RMSProp, SGD)

    • Overfitting strategies: regularization, dropout, data augmentation

    • Frameworks: PyTorch, TensorFlow, Keras — including building from scratch for deeper understanding

    • Generative AI: prompt engineering, fine-tuning large language models (LLMs), text summarization

    • NLP: transformers (BERT, RoBERTa, T5), sentiment analysis, named entity recognition, embeddings

    • Search + retrieval: vector search with FAISS, embedding-based similarity, RAG (retrieval-augmented generation) (sketch after this list)

    • Reinforcement learning: policy gradients, Q-learning, basic agent-environment frameworks

    • Ethics and alignment: fairness, bias detection, interpretability, responsible deployment

    • AI-driven systems: end-to-end ML/AI architectures that solve real business problems

    • Pipeline orchestration: training, validation, deployment, monitoring

    • Drift detection: custom solutions for feature and prediction distribution shift (sketch after this list)

    • Model versioning: MLflow, DVC, model registries, reproducibility

    • Automated retraining: scheduled jobs, model decay triggers, batch/online updates

    • Monitoring: prediction confidence tracking, input schema validation, latency thresholds (sketch after this list)

    • Integration with business systems: embedding models into APIs, web tools, or BI layers

    • AWS: S3, EC2, SageMaker, Lambda, Glue, Athena, Redshift, CloudWatch

    • Azure: Azure ML, Blob Storage, Synapse, Azure DevOps, Key Vault

    • GCP (familiarity): BigQuery, Vertex AI, GCS

    • Infrastructure-as-Code: Terraform for provisioning repeatable, secure cloud environments

    • Cost optimization: data lifecycle policies, autoscaling, spot instances

    • GitHub Actions: secure workflows for testing, packaging, deployment

    • DevSecOps: gated merges, environment secrets, PR labeling, unit tests

    • Package automation: Poetry-based versioning, wheel builds, wheelhouse packaging

    • Multi-environment deployment: staging → UAT → prod rollouts, rollback plans

    • Notebook → production transitions: converting notebooks to robust Python modules and services

    • ETL pipelines: ingesting from APIs, databases, flat files, streaming sources

    • Data wrangling: pandas, PySpark, SQL joins, window functions, cleaning pipelines

    • Storage design: normalized + denormalized schemas, Delta Lake, data lakes vs. warehouses

    • Job scheduling: Airflow, AWS Step Functions, cron-based tasks

    • Graph databases: Neo4j design for knowledge graphs and entity relationship mapping

    • Data quality checks: validation, profiling, anomaly detection models for pipeline health

    • Business Intelligence: Tableau, Power BI, AWS QuickSight

    • Python viz: Matplotlib, Seaborn, Plotly, Dash, Streamlit (custom tools + prototypes)

    • Storytelling: crafting narratives through data with audience-specific framing

    • Interactive dashboards: filters, drilldowns, parameterized views for stakeholder control

    • Model explainability UI: SHAP plots, feature importance summaries, prediction explanations

    • Real-time insights: dashboards connected to streaming/near-real-time data pipelines
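
Model selection (from the skillset above): a minimal sketch of cross-validated hyperparameter tuning with scikit-learn; the synthetic data, estimator, and parameter grid are illustrative assumptions, not details of any specific project.

    # Cross-validated grid search over a small hyperparameter grid.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    # Synthetic data stands in for a real feature matrix and labels.
    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    # Candidates are scored with 5-fold cross-validation on ROC-AUC.
    param_grid = {"n_estimators": [100, 300], "max_depth": [None, 10]}
    search = GridSearchCV(
        RandomForestClassifier(random_state=0),
        param_grid,
        cv=5,
        scoring="roc_auc",
    )
    search.fit(X, y)
    print(search.best_params_, round(search.best_score_, 3))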
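
Explainability (from the skillset above): a minimal sketch of permutation importance with scikit-learn on a public dataset; the gradient-boosting model is an illustrative choice, and SHAP values could be substituted where per-prediction explanations are needed.

    # Permutation importance: shuffle each feature on held-out data and
    # measure the drop in score; larger drops mean heavier reliance.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.model_selection import train_test_split

    data = load_breast_cancer()
    X_train, X_test, y_train, y_test = train_test_split(
        data.data, data.target, random_state=0
    )
    model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

    result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
    ranked = sorted(zip(data.feature_names, result.importances_mean), key=lambda t: -t[1])
    for name, score in ranked[:5]:
        print(f"{name}: {score:.3f}")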
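
Time series modeling (from the skillset above): a minimal sketch of seasonality decomposition with statsmodels; the monthly series is synthetic, with an assumed trend, yearly cycle, and noise.

    # Decompose a monthly series into trend, seasonal, and residual parts.
    import numpy as np
    import pandas as pd
    from statsmodels.tsa.seasonal import seasonal_decompose

    idx = pd.date_range("2019-01-01", periods=48, freq="MS")
    rng = np.random.default_rng(0)
    months = idx.month.to_numpy()
    values = (
        np.linspace(100.0, 160.0, len(idx))        # upward trend
        + 10.0 * np.sin(2 * np.pi * months / 12)   # yearly seasonality
        + rng.normal(scale=2.0, size=len(idx))     # noise
    )
    series = pd.Series(values, index=idx)

    result = seasonal_decompose(series, model="additive", period=12)
    print(result.seasonal.head(12))   # repeating 12-month seasonal pattern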
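
Drift detection (from the skillset above): a minimal sketch of a custom feature-drift check using a two-sample Kolmogorov-Smirnov test from scipy; the significance threshold and the synthetic reference/production samples are assumptions.

    # Flag drift when a production feature's distribution differs
    # significantly from the training-time (reference) distribution.
    import numpy as np
    from scipy.stats import ks_2samp

    def detect_drift(reference, current, alpha=0.01):
        statistic, p_value = ks_2samp(reference, current)
        return p_value < alpha

    rng = np.random.default_rng(0)
    reference = rng.normal(loc=0.0, scale=1.0, size=5_000)  # training-time sample
    current = rng.normal(loc=0.3, scale=1.0, size=5_000)    # shifted production sample
    print(detect_drift(reference, current))                 # True: the mean has shifted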
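
Search + retrieval (from the skillset above): a minimal sketch of embedding-based similarity search with FAISS using an exact L2 index; the random vectors stand in for real document and query embeddings produced by an upstream model.

    # Exact nearest-neighbour search over dense embeddings.
    import faiss
    import numpy as np

    dim = 128
    rng = np.random.default_rng(0)
    corpus = rng.random((10_000, dim), dtype=np.float32)  # document embeddings
    query = rng.random((1, dim), dtype=np.float32)        # query embedding

    index = faiss.IndexFlatL2(dim)     # exact search; swap for IVF/HNSW at scale
    index.add(corpus)
    distances, ids = index.search(query, 5)
    print(ids[0], distances[0])        # ids and distances of the 5 nearest documents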
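
Monitoring (from the skillset above): a minimal sketch of input schema validation with pydantic, rejecting malformed requests before they reach a model; the field names and bounds are hypothetical.

    # Validate incoming prediction requests against a declared schema.
    from pydantic import BaseModel, Field, ValidationError

    class PredictionRequest(BaseModel):
        customer_age: int = Field(ge=18, le=120)   # hypothetical fields and bounds
        annual_premium: float = Field(gt=0)
        policy_type: str

    try:
        PredictionRequest(customer_age=17, annual_premium=1200.0, policy_type="term")
    except ValidationError as exc:
        print(exc.errors()[0]["msg"])   # out-of-schema input is rejected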