Data Science Analyst
Moscow, Russia
Middle • Senior • Team Lead/Group Manager
Remote • Part-time • On-site
More than 5 years of experience
Short link: gkjb.ru/g14e1
About me
Currently working as a Data Infrastructure Engineer.
My competencies and experience
Data Scientist / ML Engineer
Specialization: Machine Learning • MLOps • NLP • LLM • Predictive Analytics • Cloud Infrastructure
Professional Summary
Data Scientist with 6 years of experience building end-to-end ML solutions and MLOps infrastructure. Specializes in applying advanced machine learning methods, including large language models, computer vision, and predictive analytics. Experienced in production deployment of ML models, building intelligent systems, and optimizing business processes with AI.
Work Experience
Data Scientist / ML Engineer
AI Education Technologies, Moscow | 2025
Development of an intelligent educational platform based on LLM:
- Designed and implemented an end-to-end adaptive programming learning system using Llama 3.1
- Developed an NLP pipeline for dynamic generation of learning tasks and automatic evaluation of student solutions
- Implemented a collaborative filtering-based recommendation system for curriculum personalization
- Created prompt engineering and fine-tuning mechanisms to ensure learning content relevance
- Result: MVP AI trainer with 92% solution evaluation accuracy and personalized learning trajectories
MLOps and production deployment:
- Built a complete MLOps cycle with automated training, validation, and monitoring of ML models
- Implemented a system for collecting and analyzing educational metrics for continuous algorithm improvement
- Optimized inference time by 40% through embedding caching and request batching
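The caching and batching idea in the last bullet can be illustrated with a minimal sketch. Everything in it is hypothetical: `embed_batch` is a toy stand-in for a real embedding model call, and the cache is a plain in-memory dictionary. The résumé does not describe the actual implementation.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def embed_batch(texts: Sequence[str]) -> List[Vector]:
    """Hypothetical stand-in for a model call that embeds many texts at once."""
    return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


class CachedEmbedder:
    """Serve repeated texts from a cache and send the rest in fixed-size batches."""

    def __init__(
        self,
        backend: Callable[[Sequence[str]], List[Vector]],
        batch_size: int = 32,
    ) -> None:
        self.backend = backend
        self.batch_size = batch_size
        self.cache: Dict[str, Vector] = {}
        self.backend_calls = 0  # counts round trips to the (hypothetical) model

    def embed(self, texts: Sequence[str]) -> List[Vector]:
        # Deduplicate while keeping order, and skip anything already cached.
        missing = [t for t in dict.fromkeys(texts) if t not in self.cache]
        for start in range(0, len(missing), self.batch_size):
            chunk = missing[start:start + self.batch_size]
            self.backend_calls += 1
            for text, vector in zip(chunk, self.backend(chunk)):
                self.cache[text] = vector
        return [self.cache[t] for t in texts]
```

Repeated requests for the same text cost no extra backend calls, and new texts are grouped so that one call covers up to `batch_size` of them.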
Lead Data Infrastructure Engineer / DataOps
Data Processing Center, Moscow | 2020 – present
Implementation of proactive analysis and Predictive Maintenance:
- Initiated and implemented a predictive analytics project for equipment failures
- Organized collection and parsing of [resume access required] data from all servers into centralized storage
- Conducted feature engineering and trained a classification model (Random Forest / XGBoost)
- Predicted HDD/SSD failures 7-14 days before they occurred
- Result: 40% reduction in incidents, transition to proactive maintenance
Performance and cost optimization:
- Automated scaling of computing resources for Data Science training environments
- Used Kubernetes HPA and Python scripts for load adaptation
- Result: 25% reduction in cloud costs
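For context, the scaling rule that the Kubernetes Horizontal Pod Autoscaler documents is `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`. The sketch below restates that rule in Python. The function name, the replica bounds, and the example numbers are illustrative assumptions, and the 0.1 tolerance mirrors the documented default, not a setting taken from this project.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Restate the documented HPA scaling rule (illustrative helper, not a Kubernetes API)."""
    ratio = current_metric / target_metric
    # Within the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (assumed values here).
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 replicas running at 90% average CPU against a 60% target would scale to 6, while 62% against the same target stays at 4.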
Key Projects
AI Trainer for Programming Education Based on LLM
Task: Create an intelligent system capable of generating personalized learning tasks and evaluating solutions in real-time.
Solution:
- Development of full-stack solution architecture: Python/Flask backend, CodeMirror frontend
- Integration and customization of Llama 3.1 via Ollama API for content generation
- Implementation of complex business logic for progression and adaptive learning
- Creation of a recommendation system based on analysis of student errors and successes
- Implementation of MLOps practices for monitoring generation quality and solution evaluation
Result: Working MVP of an educational platform with an AI coach. Solution evaluation accuracy: 92%, with curriculum personalization for each student.
Equipment Failure Prediction System
Task: Reduce the number of incidents caused by sudden server equipment failures.
Solution:
- Collection and analysis of historical [resume access required] data from 1000+ HDD/SSD
- Feature engineering: creation of disk degradation features
- Building and testing binary classification models (Random Forest, Gradient Boosting)
- Best model (XGBoost) showed precision [resume access required] for the "failure" class
- Development of Python script for daily scoring and report generation
Result: Identified 85% of potential failures [resume access required] days in advance. 40% reduction in incidents.
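A minimal sketch of what "disk degradation features" and a binary "failure" label could look like. This is an assumption-laden illustration: the daily cumulative error counter, the feature names, the 7-day window, and the 14-day label horizon are all hypothetical and are not taken from the project itself.

```python
from typing import Dict, List, Optional


def degradation_features(daily_counts: List[int], window: int = 7) -> Dict[str, float]:
    """Summarize how a cumulative error counter grew over the last `window` days."""
    if len(daily_counts) < window + 1:
        raise ValueError("need at least window + 1 daily readings")
    recent = daily_counts[-(window + 1):]
    # Day-over-day changes inside the window.
    deltas = [b - a for a, b in zip(recent, recent[1:])]
    return {
        "last_value": float(recent[-1]),
        "growth_in_window": float(recent[-1] - recent[0]),
        "max_daily_jump": float(max(deltas)),
        "days_with_growth": float(sum(d > 0 for d in deltas)),
    }


def failure_label(day_index: int, failure_day: Optional[int], horizon: int = 14) -> int:
    """Return 1 if the disk fails within `horizon` days after `day_index`, else 0."""
    if failure_day is None:
        return 0
    return int(0 < failure_day - day_index <= horizon)
```

Rows built this way (one per disk per day) could then be fed to any binary classifier, such as the Random Forest or XGBoost models named above.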
Time Series Analysis for IT System Load Forecasting
Task: Optimize allocation of computing resources and plan equipment upgrades.
Solution:
- Analysis of CPU, memory, and disk subsystem load time series over 2 years
- Identification of seasonal patterns and growth trends
- Building forecast models (ARIMA, Prophet) for 6 months
- Visualization of results in interactive dashboards (Plotly Dash)
Result: Justified equipment procurement plan. 15% savings on urgent purchases.
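To make the idea of "seasonal patterns and growth trends" concrete, here is a deliberately simple baseline: a seasonal-naive forecast with a drift term. It is not ARIMA or Prophet, which the project used. It is only a standard-library sketch of the kind of reference forecast such models are usually compared against, and the function name and sample numbers are illustrative.

```python
from typing import List, Sequence


def seasonal_naive_with_drift(
    history: Sequence[float], season_length: int, horizon: int
) -> List[float]:
    """Repeat the last season and add the average season-over-season change."""
    if len(history) < 2 * season_length:
        raise ValueError("need at least two full seasons of history")
    last = history[-season_length:]
    prev = history[-2 * season_length:-season_length]
    # Mean change per point between the two most recent seasons.
    drift = (sum(last) - sum(prev)) / season_length
    forecast = []
    for h in range(horizon):
        seasons_ahead = h // season_length + 1
        forecast.append(last[h % season_length] + drift * seasons_ahead)
    return forecast


# Two "seasons" of three readings each; the level grows by 2 per season.
print(seasonal_naive_with_drift([10, 20, 30, 12, 22, 32], season_length=3, horizon=4))
# → [14.0, 24.0, 34.0, 16.0]
```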
Additional Activity
Participation in Kaggle Competitions
- Active participation in 10+ competitions with focus on tabular data and NLP
- Implementation of full ML pipeline: from EDA to model ensembling
- Application of advanced feature engineering and validation methods
- Team collaboration and analysis of top participants' solutions
Result: 5 years of competitive experience, ranked in the top 10% of Kaggle competition rankings.
Technical Skills
Machine Learning & AI:
- Advanced ML: LLM, NLP, Prompt Engineering, Fine-tuning, DQN, SARSA
- Frameworks: PyTorch, TensorFlow, Transformers, Scikit-learn, XGBoost
- Methods: Classification, Regression, Clustering, Recommendation Systems
Data Science & Analytics:
- Analysis: EDA, Feature Engineering, Statistical Testing, A/B testing
- Libraries: Pandas, NumPy, SciPy, Matplotlib, Seaborn, Plotly
- Time Series: ARIMA, Prophet, LSTMs, Anomaly Detection
Programming & Tools:
- Languages: Python (professional), SQL, Bash
- Cloud: Yandex Cloud, Docker, Kubernetes
- Tools: Git, Jupyter, VS Code, Linux, Flask/FastAPI
Education and Qualifications
Higher Technical Education
[resume access required] Bichevsky Academy, Saint Petersburg | 2008
Department of Automated Control Systems
Information Systems and Technologies (qualification: engineer)
- Key disciplines: Computer Systems Architecture, Databases, System Programming, Control Theory
- Graduation project: Development of an Automated Control System for an enterprise under conditions of uncertainty
- Connection to current specialization: Fundamental engineering training in system design laid the foundation for working with distributed data systems
Professional Development in Data Science
2026 | Machine Learning. Advanced
OTUS
- Time Series: Fourier and wavelet transforms, automatic feature generation
- Recommendation Systems: cold start problem, SVD and ALS algorithms, two-level model
- Bayesian Learning: PyMC, Markov Chain Monte Carlo (MCMC), Metropolis–Hastings, Generalized Linear Models (GLM)
- Reinforcement Learning: Markov Decision Process, Value Iteration, Policy Iteration, Temporal Difference, SARSA and Q-learning, Actor-Critic
- Production Code: REST architecture, Flask API, Docker, Yandex Cloud, DVC, MLFlow
- Production: Advanced Data Preprocessing, AutoML, H2O and TPOT, Featuretools
2022 | Data Science Specialist
Yandex Practicum
- Machine learning, deep learning, feature engineering
- Production development of ML models, MLOps basics
- Final project: Customer churn prediction for a telecommunications operator
2021 | Systems and Business Analytics
GeekBrains
- Requirements analysis, solution architecture design
- Business processes, KPIs, system performance metrics
- Integration of analytical systems into IT infrastructure
- Final project: Automation of procurement department processes (assignment from SportMasterLab)
Languages
- Russian (native)
- English (B2 – Upper-Intermediate, technical documentation)
