by Paweł Badeński 15 years ago

Agile Infrastructure

The text discusses the importance of agile methodologies in managing IT infrastructure, emphasizing the need for speed, flexibility, and collaboration between developers and operations teams.

You are not as special as you think you are

Culture

Manage flow
The best way to fight fires is never let them get started
Planning for fires is hard
Work together
Learning and respect
There is only us

Fail happens

Fire drills
Be confident
Try different failure cases
Go and unplug your system :D
"Out of the window" test
Fail safe
Practice makes perfect
Try not to cause it
Questions
How fast can you be back up?
How long?
Can you afford to be down?

Infrastructure is code

Infrastructure is application
API driven abstraction (cloud computing, etc.)

The mystery machine

The machine in the corner that everyone is afraid to turn off, but no one knows why it's on

Operations are stakeholders!

Your site cannot be down - you are losing money!
Non-functional requirements

Agile for development; Waterfall for deployment?!

Communication between dev team and ops team is facilitated by ticket system

Configuration drift

changes are painful
inconsistencies between machines (2, 4, 8, 100?)
confusion
mistakes
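
One way to make drift visible is to fingerprint the same config file on every machine and group hosts by the result. The sketch below is only an illustration: the host names, config contents, and the `find_drift` helper are hypothetical and not part of the original notes.

```python
import hashlib
from collections import defaultdict


def fingerprint(config_text):
    """Return a short, stable digest of a config file's contents."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()[:12]


def find_drift(configs_by_host):
    """Group hosts by config fingerprint; more than one group means drift."""
    groups = defaultdict(list)
    for host, text in sorted(configs_by_host.items()):
        groups[fingerprint(text)].append(host)
    return dict(groups)


# Hypothetical hosts and config contents, for illustration only.
configs = {
    "web1": "max_connections=100\n",
    "web2": "max_connections=100\n",
    "web3": "max_connections=250\n",  # edited by hand on one machine
}
groups = find_drift(configs)
drifted = len(groups) > 1  # True when the machines disagree
```

With 2 machines this is easy to eyeball; with 100 it has to be automated.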

Techniques

Always ship trunk
Share the repository
boundary objects
minimize surprise

get rid of ceremony, pagers, antagonism, etc.

conversations early in the process

ops can see the work devs are doing

everyone sees everyone else working
everyone knows where to look
keep configs in sync with application code
Information radiators
helps two groups (dev and ops) to have a conversation
dev and ops see the same thing, in the same place
share metrics
Tag everything
Correlate

have the same power as with failed tests in TDD - know exactly why something is wrong

when?

synchronization - get all machines sync'd

what?
who?
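
A minimal sketch of tagging and correlating, assuming every change is recorded with the three tags above (when, what, who). The `Event` type, the `changes_before` helper, and the sample events are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    when: datetime
    what: str
    who: str


def changes_before(events, incident_time, window=timedelta(minutes=30)):
    """Return events inside `window` before the incident, newest first."""
    recent = [
        e for e in events
        if incident_time - window <= e.when <= incident_time
    ]
    return sorted(recent, key=lambda e: e.when, reverse=True)


# Hypothetical change log shared by dev and ops.
events = [
    Event(datetime(2010, 1, 1, 9, 0), "deploy app r101", "alice"),
    Event(datetime(2010, 1, 1, 11, 50), "change db pool size", "bob"),
    Event(datetime(2010, 1, 1, 11, 58), "deploy app r102", "alice"),
]
# Something went wrong at 12:00: which changes are the likely suspects?
suspects = changes_before(events, datetime(2010, 1, 1, 12, 0))
```

Because every change carries the same tags, a spike in a shared metric can be lined up against the changes that preceded it.
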
Deploy early and often
the ceremony is a waterfall process
there should be no ritual
Continuous Integration
run functional tests
assert services are running
test new builds
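
"Assert services are running" can be as simple as a CI step that tries to connect to each service. The following is a rough sketch, not a full health check: the service names and addresses passed in are hypothetical, and a successful TCP connect only shows that something is listening.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def failing_services(services):
    """Return the names of services that do not accept connections.

    `services` maps a name to a (host, port) pair.
    """
    return sorted(
        name
        for name, (host, port) in services.items()
        if not port_is_open(host, port)
    )
```

A build would fail when `failing_services(...)` returns a non-empty list, for example after deploying a new build to a test environment.
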
Monitoring
test driven?
need baseline chart, trends
don't just look at the data when things are bad
what does 'normal' look like?

You have to know what 'green' looks like to know what 'red' looks like
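
A small sketch of the baseline idea: record what "green" looks like, then flag values that fall far outside it. The three-standard-deviation tolerance and the sample latencies are assumptions chosen for illustration, not values from the talk.

```python
from statistics import mean, stdev


def baseline(samples):
    """Summarise 'normal' as the mean and standard deviation of past samples."""
    return mean(samples), stdev(samples)


def is_abnormal(value, samples, tolerance=3.0):
    """Flag a value more than `tolerance` standard deviations from the mean."""
    average, spread = baseline(samples)
    return abs(value - average) > tolerance * spread


# Hypothetical response times (ms) collected while everything was healthy.
normal_latency_ms = [102, 98, 105, 99, 101, 97, 103, 100]

slightly_slow = is_abnormal(104, normal_latency_ms)  # within normal variation
very_slow = is_abnormal(180, normal_latency_ms)      # far outside the baseline
```

The baseline only exists if data is collected while things are going well, not just during incidents.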

Feedback

One step deploy
lower the fixed cost of deploy
computers are really good at running the same commands over and over

manual deployment is error prone

having people deploy manually is immoral

you DO NOT want to have manual scripts

one automated process from version control to live services

one process for devs, testers and IT operations; across all environments
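
The "one process, all environments" idea can be sketched as a single entry point that runs the same ordered steps everywhere and stops at the first failure. The step functions below are placeholders (assumptions); real ones would call whatever build and release tooling is in use.

```python
def deploy(revision, environment, steps):
    """Run every step in order; stop at the first one that fails.

    Returns (succeeded, names_of_completed_steps).
    """
    completed = []
    for name, action in steps:
        if not action(revision, environment):
            return False, completed
        completed.append(name)
    return True, completed


# Placeholder steps, for illustration only.
def checkout(revision, environment):
    return True  # e.g. fetch `revision` from version control


def run_tests(revision, environment):
    return True  # e.g. run the functional test suite


def push_config(revision, environment):
    return True  # e.g. install the configs stored with the code


def restart_services(revision, environment):
    return True  # e.g. restart and verify the services


STEPS = [
    ("checkout", checkout),
    ("test", run_tests),
    ("configure", push_config),
    ("restart", restart_services),
]

# The same call is used for dev, test and prod; only the arguments differ.
ok, completed = deploy("r1234", "test", STEPS)
```

Keeping the step list identical across environments is what lowers the fixed cost: a production deploy is just another run of a process that has already been exercised many times.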

Build from source
disaster recovery
scaling

speed of thought

building infrastructure is not a big manual process

test from a known state

setup process

dev, test and prod do not get out of sync

no one is editing config files; they are automatically pulled from svn

roll config changes forward (dev-test-prod)
automated provisioning and deployment of services
Configuration Management
apply dev-test-prod cycle to infrastructure
reason about services, instead of systems
manage server lifecycle
audit and enforce consistency
put systems into a known state
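
A toy illustration of "put systems into a known state": describe the desired state as data, compare it with the actual state, and apply only the difference. The state keys and the `plan` and `converge` helpers are hypothetical; real configuration management tools work on the same principle with far richer resource models.

```python
def plan(desired, actual):
    """Return the changes needed to move `actual` to `desired`."""
    changes = []
    for key in sorted(desired):
        if actual.get(key) != desired[key]:
            changes.append(("set", key, desired[key]))
    for key in sorted(set(actual) - set(desired)):
        changes.append(("remove", key, None))
    return changes


def converge(desired, actual):
    """Apply the plan to a copy of `actual` and return the new state."""
    state = dict(actual)
    for operation, key, value in plan(desired, actual):
        if operation == "set":
            state[key] = value
        else:
            del state[key]
    return state


# Hypothetical service definitions for one server role.
desired = {"nginx": "1.0", "ntp": "running"}
actual = {"nginx": "0.9", "telnet": "running"}

new_state = converge(desired, actual)
remaining = plan(desired, new_state)  # empty once the system has converged
```

Running `converge` a second time changes nothing, which is what makes it safe to audit and enforce consistency continuously.
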
Version Control
Anything that matters
Documentation

preferably executable

Database schema
Application code
Application configurations
System configurations
Network configurations

Done is deployed

Different environments

the dev environment should not be very different from the production environment
you cannot keep the testing and production environments in sync
it works in the test environment but not in production

Hero culture

BECAUSE everyone takes for granted that you'll keep the machines running 24/7
NO! Heroism is not good for operations
Bad mistakes
Patching a live production system at 5 am
Heroism is a virtue

Infrastructure renaissance

developers and operations can work together
you can change more easily
you can change faster

IT operations

more flexible
faster feedback
enabler
differentiator