Data Science Analyst

Moscow, Russia
Middle • Senior • Team Lead/Group Manager
Analytics, Data Science, Big Data
Remote work • Part-time • On-site work
More than 5 years of work experience
About Me

Currently working as a Data Infrastructure Engineer.

My Competencies and Experience

Data Scientist / ML Engineer

Specialization: Machine Learning • MLOps • NLP • LLM • Predictive Analytics • Cloud Infrastructure

Professional Summary

Data Scientist with 6 years of experience building end-to-end ML solutions and MLOps infrastructure. Specializes in applying advanced machine learning methods, including large language models, computer vision, and predictive analytics. Expert in production deployment of ML models, building intelligent systems, and optimizing business processes with AI.

Work Experience

Data Scientist / ML Engineer
AI Education Technologies, Moscow | 2025

Development of an intelligent educational platform based on LLM:

  • Designed and implemented an end-to-end adaptive system for learning programming, built on Llama 3.1
  • Developed an NLP pipeline for dynamic generation of learning tasks and automatic evaluation of student solutions
  • Implemented a collaborative filtering-based recommendation system for curriculum personalization
  • Created prompt engineering and fine-tuning mechanisms to ensure learning content relevance
  • Result: MVP AI trainer with 92% solution evaluation accuracy and personalized learning trajectories

MLOps and production deployment:

  • Built a complete MLOps cycle with automated training, validation, and monitoring of ML models
  • Implemented a system for collecting and analyzing educational metrics for continuous algorithm improvement
  • Reduced inference time by 40% through embedding caching and request batching
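The embedding caching and request batching mentioned above can be illustrated with a minimal sketch. It is not the platform's actual code: the class name `CachedBatchEmbedder`, the `fake_embed_batch` stand-in backend, and the batch size are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


class CachedBatchEmbedder:
    """Hypothetical helper: caches embeddings and sends only uncached texts in batches."""

    def __init__(
        self,
        embed_batch: Callable[[Sequence[str]], List[Vector]],
        batch_size: int = 32,
    ) -> None:
        self._embed_batch = embed_batch  # assumed backend: list of texts -> list of vectors
        self._batch_size = batch_size
        self._cache: Dict[str, Vector] = {}
        self.backend_calls = 0  # counts round trips to the backend

    def embed(self, texts: Sequence[str]) -> List[Vector]:
        # Collect unique texts that are not cached yet, preserving order.
        missing = list(dict.fromkeys(t for t in texts if t not in self._cache))
        # Send the uncached texts to the backend in fixed-size batches.
        for start in range(0, len(missing), self._batch_size):
            chunk = missing[start:start + self._batch_size]
            vectors = self._embed_batch(chunk)
            self.backend_calls += 1
            for text, vector in zip(chunk, vectors):
                self._cache[text] = vector
        # Serve every request from the cache, including repeated texts.
        return [self._cache[t] for t in texts]


def fake_embed_batch(texts: Sequence[str]) -> List[Vector]:
    """Stand-in for a real embedding model; returns toy two-dimensional vectors."""
    return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


embedder = CachedBatchEmbedder(fake_embed_batch, batch_size=2)
first = embedder.embed(["a", "bb", "a", "ccc"])  # three unique texts -> two batches
second = embedder.embed(["a", "bb"])             # served entirely from the cache
```

Repeated texts cost no extra backend calls, and new texts share a round trip, which is where a latency reduction of this kind would typically come from.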

Lead Data Infrastructure Engineer / DataOps
Data Processing Center, Moscow | 2020 – present

Implementation of proactive analysis and Predictive Maintenance:

  • Initiated and implemented a predictive analytics project for equipment failures
  • Organized collection and parsing of [hidden; resume access required] data from all servers into centralized storage
  • Conducted feature engineering and trained a classification model (Random Forest / XGBoost)
  • Predicted HDD/SSD failures 7-14 days before they occurred
  • Result: 40% reduction in incidents, transition to proactive maintenance

Performance and cost optimization:

  • Automated scaling of computing resources for Data Science training environments
  • Used Kubernetes HPA and Python scripts for load adaptation
  • Result: 25% reduction in cloud costs
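For context, the Kubernetes Horizontal Pod Autoscaler derives its target replica count from the ratio of the current metric value to the desired one. The sketch below reproduces that documented formula in Python; the function name, the replica bounds, and the example numbers are illustrative assumptions, not the scripts used in this project.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Approximate the HPA rule: ceil(currentReplicas * currentMetric / targetMetric)."""
    ratio = current_metric / target_metric
    # Skip scaling when the ratio is close enough to 1.0 (HPA's default tolerance is 0.1).
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


# Illustrative values: 4 pods at 90% average CPU against a 60% target.
scaled_up = desired_replicas(4, current_metric=90, target_metric=60)
# Low load: 4 pods at 20% average CPU, with a floor of 2 replicas.
scaled_down = desired_replicas(4, current_metric=20, target_metric=60, min_replicas=2)
```

Scaling down during idle periods is the part of this rule that would drive cost savings for training environments with uneven load.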

Key Projects

AI Trainer for Programming Education Based on LLM

Task: Create an intelligent system capable of generating personalized learning tasks and evaluating solutions in real-time.

Solution:

  • Development of full-stack solution architecture: Python/Flask backend, CodeMirror frontend
  • Integration and customization of Llama 3.1 via Ollama API for content generation
  • Implementation of complex business logic for progression and adaptive learning
  • Creation of a recommendation system based on analysis of student errors and successes
  • Implementation of MLOps practices for monitoring generation quality and solution evaluation

Result: Working MVP of an educational platform with an AI trainer. Solution evaluation accuracy: 92%, with curriculum personalization for each student.

Equipment Failure Prediction System

Task: Reduce the number of incidents caused by sudden server equipment failures.

Solution:

  • Collection and analysis of historical [hidden; resume access required] data from 1,000+ HDDs/SSDs
  • Feature engineering: creation of disk degradation features
  • Building and testing binary classification models (Random Forest, Gradient Boosting)
  • Best model (XGBoost) showed precision of [hidden; resume access required] for the "failure" class
  • Development of Python script for daily scoring and report generation

Result: Identification of 85% of potential failures [hidden; resume access required] days in advance. 40% reduction in incidents.
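The two quality figures quoted for this project map onto standard metrics, assuming that "identification of 85% of potential failures" refers to recall on the "failure" class. The sketch below shows how precision and recall for that class are computed; the function name and the toy labels are made up for illustration and are not project data.

```python
from typing import Sequence, Tuple


def failure_precision_recall(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Precision and recall for the positive ("failure" = 1) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # failures correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy disks flagged by mistake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # failures that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels only: 1 = disk failed within the horizon, 0 = disk stayed healthy.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
precision, recall = failure_precision_recall(y_true, y_pred)
```

Precision limits the number of unnecessary disk replacements, while recall determines how many failures are caught before they cause an incident.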

Time Series Analysis for IT System Load Forecasting

Task: Optimize allocation of computing resources and plan equipment upgrades.

Solution:

  • Analysis of CPU, memory, and disk subsystem load time series over 2 years
  • Identification of seasonal patterns and growth trends
  • Building forecast models (ARIMA, Prophet) for 6 months
  • Visualization of results in interactive dashboards (Plotly Dash)

Result: Justified equipment procurement plan. 15% savings on urgent purchases.

Additional Activity

Participation in Kaggle Competitions

  • Active participation in 10+ competitions with focus on tabular data and NLP
  • Implementation of full ML pipeline: from EDA to model ensembling
  • Application of advanced feature engineering and validation methods
  • Team collaboration and analysis of top participants' solutions

Result: 5 years of competitive experience; ranked in the top 10% of Kaggle competition rankings.

Technical Skills

Machine Learning & AI:

  • Advanced ML: LLM, NLP, Prompt Engineering, Fine-tuning, DQN, SARSA
  • Frameworks: PyTorch, TensorFlow, Transformers, Scikit-learn, XGBoost
  • Methods: Classification, Regression, Clustering, Recommendation Systems

Data Science & Analytics:

  • Analysis: EDA, Feature Engineering, Statistical Testing, A/B testing
  • Libraries: Pandas, NumPy, SciPy, Matplotlib, Seaborn, Plotly
  • Time Series: ARIMA, Prophet, LSTM, Anomaly Detection

Programming & Tools:

  • Languages: Python (professional), SQL, Bash
  • Cloud: Yandex Cloud, Docker, Kubernetes
  • Tools: Git, Jupyter, VS Code, Linux, Flask/FastAPI

Education and Qualifications

Higher Technical Education
[hidden; resume access required] Bichevsky Academy, Saint Petersburg | 2008
Department of Automated Control Systems
Information Systems and Technologies (qualification: engineer)

  • Key disciplines: Computer Systems Architecture, Databases, System Programming, Control Theory
  • Graduation project: Development of an Automated Control System for an enterprise under conditions of uncertainty
  • Connection to current specialization: Fundamental engineering training in system design laid the foundation for working with distributed data systems

Professional Development in Data Science

2026 | Machine Learning. Advanced
OTUS

  • Time Series: Fourier and wavelet transforms, automatic feature generation
  • Recommendation Systems: cold start problem, SVD and ALS algorithms, two-level model
  • Bayesian Learning: PyMC, Markov chain Monte Carlo (MCMC), Metropolis–Hastings, generalized linear models (GLM)
  • Reinforcement Learning: Markov Decision Process, Value Iteration, Policy Iteration, Temporal Difference, SARSA and Q-learning, Actor-Critic
  • Production Code: REST architecture, Flask API, Docker, Yandex Cloud, DVC, MLflow
  • Production: Advanced Data Preprocessing, AutoML, H2O and TPOT, Featuretools

2022 | Data Science Specialist
Yandex Practicum

  • Machine learning, deep learning, feature engineering
  • Production development of ML models, MLOps basics
  • Final project: Customer churn prediction for a telecommunications operator

2021 | Systems and Business Analytics
GeekBrains

  • Requirements analysis, solution architecture design
  • Business processes, KPIs, system performance metrics
  • Integration of analytical systems into IT infrastructure
  • Final project: Automation of procurement department processes (assignment from SportMasterLab)

Languages

  • Russian (native)
  • English (B2 – Upper-Intermediate, technical documentation)
