User challenges

1.

Communication & documentation

Unable to share files from DAW

Compliance issue

Unable to share notebooks from DAW

Compliance issue: notebooks can contain data

Lack of alerting

Limited communication of issues to users

Difficult onboarding for analysts

Missing/Unclear documentation

Difficult to problem solve / debug

Missing/Unclear documentation

Slow to set up new projects

Missing/Unclear documentation

Slow setup of ingestion process

Unclear documentation

Difficult to test ingestion process

Missing/Unclear documentation

Difficult to configure frameworks

Missing/Unclear documentation

2.

Workflows

Technical knowledge required to use DAW

Unfamiliar with Python/Git

Unable to share notebooks from DAW

Unfamiliar with Git

Difficult to navigate files in DAW

Missing capability

Missing non-file-based ingestion capabilities

Missing capability

Missing Airflow features

Missing capability

Difficult to analyze a pipeline

Missing capability

Interruption due to re-login

No user-based login / portal

Difficult to visualize data quality and lineage

No tool to analyze lineage and quality JSON files

3.

Frameworks

Complex to configure

Inconsistencies between frameworks

Getting data in and out of Edison is difficult

Framework complexity does not align with users' needs

4.

Infra

Airflow stability

Fast deployment of many pods

Kubernetes stability

Lots of small issues adding up

DAW stability

Releases

Protegrity stability

Vortex stability

8.

CI/CD

Slow to set up new pipelines

Lack of automation

Difficult to test pipelines

No data on DEV & ACC

Release process is slow and unstable

7.

Data Governance

Difficult to visualize data quality and lineage

No data catalog

Analysts find it difficult to find the right files

No data catalog

Slow setup of ingestion process

Unclear who to contact

Getting data in and out of Edison is difficult

Lack of ownership structure

6.

Monitoring

Lack of Airflow job monitoring

Missing capability

Lack of alerting

Limited monitoring on the platform

Lack of observability

No dashboards on pipeline statuses, data quality, dataset creation, ...

Difficult to problem solve / debug

Unclear or no error messages, no access to logging

5.

Access

Analysts have limited access to necessary data

Strict access rules on Edison

Manual data access for non-BE projects

Missing capability

Data(set) access is slow

SSP request + deployment is slow

Interruption due to re-login

Too many different roles