Data Lab. Hard Skills
Data Engineering
Pipeline Orchestration
Tools
Cron
Airflow
Luigi
Prefect
Rundeck
Dagster
Metaflow
Cloud
AWS Step Functions
Google Cloud Composer
Azure Data Factory
Data Storage
RDBMS
Classic
MS SQL Server
PostgreSQL
MySQL
Oracle
Distributed
Cloud
AWS Redshift
MS Synapse
Google BigQuery
Snowflake
YDB
ArenaData
On premise
Citus
Apache Hive
ClickHouse
Non-RDBMS
Apache Hadoop
MongoDB
MS Cosmos DB
Apache Ignite
Apache Cassandra
AWS DynamoDB
Google Firebase
MariaDB
Neo4j
ArangoDB
Storage
Cloud
AWS S3
Google Drive
MS Blob Storage
MinIO
MS SharePoint
MS OneDrive
Yandex Disk
Sber Disk
Protocols
FTP
SFTP
SCP
S3
WebDAV
HTTP
ETL
Languages
Python
requests
sqlalchemy
pandas
psycopg2 / 3
bonobo
BeautifulSoup
lxml
dask
Scrapy
ConnectorX
Scala
Cats
Java
R
Go
Tools
Pentaho DI
Talend
MS Data Factory
MS SSIS
Databricks
Informatica
AWS Glue
AWS DMS
MQ
RabbitMQ
Kafka
AWS SQS
AWS MQ
IBM MQ
Software Engineering
Algorithms & Data Structures
Deploy & Code Maintenance
CI / CD
GitLab CI/CD
GitHub Actions
Jenkins
TeamCity
Git
GitHub
Bitbucket
GitLab
Code quality
Linters
Python
Flake8
wemake-python-styleguide
mypy
pycodestyle
Testing
Python
pytest
pytest-cov
hypothesis
unittest
Formatters
Python
black
pre-commit
MLOps
Data tracking & Quality
DVC
CML
pandera
pydantic
Experiment tracking
ClearML
MLflow
Serving
bentoml
flask
FastAPI
Languages
Scala
Python
Data Science
Statistics
Python
SciPy
Pingouin
statsmodels
EDA
pandas-profiling
sweetviz
Machine Learning
Classic ML
Scikit-Learn
Clustering
KMeans
DBSCAN
Agglomerative
Linear models
Logistic Regression
Ridge
Lasso
Vowpal Wabbit
Gradient Boosting
LightGBM
CatBoost
XGBoost
Deep learning
TensorFlow
PyTorch
PyTorch Lightning
Keras
JAX
NLP
NLTK
Natasha
Razdel
Textblob
spaCy
pymorphy2
Embeddings
HuggingFace
Faiss
Qdrant
Gensim
fastText
Computer Vision
Image manipulation
OpenCV
Pillow
Scikit-Image
Detection / Segmentation
detectron2
segmentation-models
pytorch-toolbelt
Mahotas
SimpleITK
Pytesseract
PyTorchCV
timm
Recommendation Systems
Collaborative Filtering
Reinforcement learning
Bandit
Monte Carlo
Dynamic programming
Temporal difference
Data Mining
Languages
Python
NumPy
numba
Pandas
polars
Matplotlib
Seaborn
Plotly
R
Scala
Java
SQL
Tools
MS Excel
BI
MS Power BI
Qlik Sense
QlikView
Streamlit
Tableau
Spotfire
Redash
Metabase
Yandex DataLens
Apache Superset
Visiology
IT Infrastructure
Hosting & Serverless Computing
Cloud
Linux
Windows
Azure Windows Server
AWS EC2
AWS Lightsail
Yandex Cloud
SberCloud
Selectel
Containers
Docker
docker-compose
Container hosting
AWS ECS
Azure Databricks
Kubernetes
AWS Lightsail
On premise
Linux
Windows
Authorization
MS Active Directory
LDAP
AWS IAM
Load Balancer
Azure Traffic Manager
Nginx
AWS Elastic Load Balancing
Citrix ADC
HAProxy
Kubernetes
DNS
LAN
WAN
Server
AWS Lightsail
AWS Route 53
Google Public DNS
Yandex DNS
Domain
Registration
Routing delegation
Structure
Zones
DNS Records
A
AAAA
CNAME
MX
NS
PTR
SSL Certificates
DV
EV
OV
Wildcard
Multi domain
Monitoring
Zabbix
Grafana
Kibana
Prometheus
Mail
Google Gmail
Yandex Mail
AWS WorkMail
IaC
Ansible
Terraform
AWS CloudFormation
Agile & Communication
Task trackers
Jira
Trello
Knowledge sharing
Confluence
Communication
MS Teams
MS Outlook
Corporate Social Network
Yammer
Diagrams
Draw.io
Gliffy
MS Visio
Miro
PlantUML
Mindomo
Figma
Presentations
MS PowerPoint
Google Slides
LaTeX Beamer