You are not as special as you think you are
Agile Infrastructure
Culture
Manage flow
The best way to fight fires is to never let them start
Planning for fires is hard
Work together
Learning and respect
There is only us
Fail happens
Fire drills
Be confident
Try different failure cases
Go and unplug your system :D
"Out of the window" test
Fail safe
Practice makes perfect
Try not to cause it
Questions
How fast can you be back up?
How long?
Can you afford to be down?
Infrastructure is code
Infrastructure is application
API driven abstraction (cloud computing, etc.)
The mystery machine
The machine in the corner that everyone is afraid to turn off, but no one knows why it's on
Operations are stakeholders!
Your site cannot be down - you are losing money!
Non-functional requirements
Agile for development; Waterfall for deployment ?!
Communication between dev team and ops team is facilitated by ticket system
Configuration drift
changes are painful
inconsistencies between machines (2, 4, 8, 100?)
confusion
mistakes
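A minimal sketch of spotting configuration drift, assuming each machine's config file can be read as text. The host names, config contents, and the helpers `fingerprint` and `find_drift` are hypothetical, for illustration only.

```python
import hashlib
from collections import defaultdict


def fingerprint(config_text: str) -> str:
    """Return a short, stable hash of a config file's contents."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()[:12]


def find_drift(configs_by_host: dict) -> dict:
    """Group hosts by config fingerprint; more than one group means drift."""
    groups = defaultdict(list)
    for host, text in sorted(configs_by_host.items()):
        groups[fingerprint(text)].append(host)
    return dict(groups)


# Hypothetical hosts and config contents.
configs = {
    "web1": "max_clients=100\n",
    "web2": "max_clients=100\n",
    "web3": "max_clients=150\n",  # someone edited this machine by hand
}
groups = find_drift(configs)
drifted = len(groups) > 1  # True here: web3 no longer matches web1 and web2
```

With 2 machines this can be checked by eye; with 100 it has to be automated.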
Techniques
Always ship trunk
Share the repository
boundary objects
minimize surprise
get rid of ceremony, pagers, antagonism, etc.
conversations early in the process
ops can see the work devs are doing
everyone sees everyone else working
everyone knows where to look
keep configs in sync with application code
Information radiators
helps two groups (dev and ops) to have a conversation
dev and ops see the same thing, in the same place
share metrics
Tag everything
Correlate
gives the same power as a failed test in TDD - you know exactly why something is wrong
when?
synchronization - get all machines sync'd
what?
who?
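A minimal sketch of tagging and correlating, assuming every change is recorded with when, what and who, and that machine clocks are synchronized so timestamps are comparable. The `Change` record, the helper `last_change_before`, and the sample data are hypothetical.

```python
from datetime import datetime, timezone
from typing import NamedTuple, Optional


class Change(NamedTuple):
    when: datetime
    what: str
    who: str


def last_change_before(changes, moment) -> Optional[Change]:
    """Return the most recent change at or before the given moment, if any."""
    earlier = [c for c in changes if c.when <= moment]
    return max(earlier, key=lambda c: c.when) if earlier else None


utc = timezone.utc
# Hypothetical change log entries.
changes = [
    Change(datetime(2024, 1, 10, 9, 0, tzinfo=utc), "deploy app r41", "alice"),
    Change(datetime(2024, 1, 10, 14, 30, tzinfo=utc), "edit db pool size", "bob"),
    Change(datetime(2024, 1, 10, 17, 0, tzinfo=utc), "deploy app r42", "alice"),
]

# An alert fires at 15:05; the first suspect is the last change before it.
alert_time = datetime(2024, 1, 10, 15, 5, tzinfo=utc)
suspect = last_change_before(changes, alert_time)
```

Here `suspect` is the 14:30 pool-size edit, which gives dev and ops a shared starting point for the conversation.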
Deploy early and often
the ceremony is a waterfall process
there should be no ritual
Continuous Integration
run functional tests
assert services are running
test new builds
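A minimal sketch of "assert services are running" as a CI check, assuming a service counts as running when its TCP port accepts connections. The helper name `service_is_up` and the example host and port are hypothetical; a real check would likely also exercise the service itself.

```python
import socket


def service_is_up(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out, or name resolution failed.
        return False
```

A CI job could then run, for example, `assert service_is_up("app.example.internal", 8080)` after each deploy of a new build.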
Monitoring
test driven?
need baseline chart, trends
don't just look at the data when things are bad
what does 'normal' look like?
You have to know what 'green' looks like to know what 'red' looks like
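A minimal sketch of knowing what 'normal' looks like, assuming a metric is collected regularly and that "abnormal" means more than a few standard deviations from the baseline mean. The threshold, the sample values, and the helper names are assumptions, not a recommended alerting rule.

```python
from statistics import mean, stdev


def baseline(samples):
    """Summarize normal behaviour as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def is_abnormal(value, samples, tolerance=3.0):
    """Flag a value further than `tolerance` standard deviations from the mean."""
    mu, sigma = baseline(samples)
    return abs(value - mu) > tolerance * sigma


# Hypothetical response times in ms, collected while things were 'green'.
normal_response_ms = [100, 102, 98, 101, 99]

ok = is_abnormal(103, normal_response_ms)   # within the usual spread
bad = is_abnormal(140, normal_response_ms)  # far outside the baseline
```

The baseline only exists if the data is collected while things are good, not just when they are bad.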
Feedback
One step deploy
lower the fixed cost of deploy
computers are really good at running the same commands over and over
manual deployment is error prone
having people deploy manually is immoral
you DO NOT want to have manual scripts
one automated process from version control to live services
one process for devs, testers and IT operations; across all environments
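A minimal sketch of one process across all environments, written as a dry run that only returns the steps it would perform. The environment names, host names, and step wording are hypothetical; the point is that only the host list differs between dev, test and prod.

```python
# Hypothetical environments and hosts.
ENVIRONMENTS = {
    "dev": ["dev1"],
    "test": ["test1"],
    "prod": ["web1", "web2"],
}


def deploy_plan(revision: str, environment: str) -> list:
    """Return the ordered steps for deploying one revision to one environment."""
    hosts = ENVIRONMENTS[environment]  # raises KeyError for an unknown environment
    steps = [f"export {revision} from version control", "build", "run tests"]
    for host in hosts:
        steps.append(f"install build on {host}")
        steps.append(f"restart service on {host}")
    return steps
```

`deploy_plan("r42", "dev")` and `deploy_plan("r42", "prod")` start with the same three steps and then repeat the same two per host, so whatever was rehearsed in dev is what runs in production.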
Build from source
disaster recovery
scaling
speed of thought
building infrastructure is not a big manual process
test from a known state
setup process
dev, test and prod not out of sync
no one is editing config files; they are automatically pulled from svn
roll config changes forward (dev-test-prod)
automated provisioning and deployment of services
Configuration Management
apply dev-test-prod cycle to infrastructure
reason about services, instead of systems
manage server lifecycle
audit and enforce consistency
put systems into a known state
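A minimal sketch of putting a system into a known state, assuming the state can be modelled as key/value settings. The `converge` helper and the settings are hypothetical; real configuration management tools work on packages, files and services, but the idea of comparing actual against desired and changing only what differs is the same.

```python
def converge(actual: dict, desired: dict) -> list:
    """Bring `actual` in line with `desired`; return the changes that were made."""
    changes = []
    for key, value in sorted(desired.items()):
        if actual.get(key) != value:
            changes.append(f"set {key}={value}")
            actual[key] = value  # enforce the desired value
    return changes


# Hypothetical desired state and a machine that has drifted from it.
desired = {"ntp": "enabled", "max_clients": 100}
machine = {"ntp": "disabled", "max_clients": 100, "motd": "hello"}

first_run = converge(machine, desired)   # fixes the drifted setting
second_run = converge(machine, desired)  # nothing left to do
```

The first run doubles as an audit (it reports what was inconsistent), and the second run making no changes is what makes it safe to apply repeatedly.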
Version Control
Anything that matters
Documentation
preferably executable
Database schema
Application code
Applications configurations
System configurations
Network configurations
Done is deployed
Different environments
dev environment should not be very different from production environment
you cannot keep testing and production environments in sync
it works in test environment and not in production
Hero culture
BECAUSE everyone takes it for granted that you'll keep the machines up 24/7
NO! Heroism is not good for operations
Bad mistakes
Patching a live production system at 5 am
Heroism is a virtue
Infrastructure renaissance
developers and operations can work together
you can change more easily
you can change faster
IT operations
more flexible
faster feedback
enabler
differentiator