QCon New York 2015 key takeaways - Part 1
This is the first part of my notes and key takeaways from QCon New York 2015, a three-day conference whose main draws (for me) were microservices, containers and culture.
Automating operational decisions in real-time @ Netflix
tl;dr Really interesting talk to see how “the big boys” keep track of things (services/bugs/customers) at scale. Feels like we’re a long way from this; we have visibility over what we’re doing, but it would take a fair amount of work, understanding, patience and maintenance to put something useful into production.
Netflix deal with big numbers when it comes to metrics, not just throughput but also what “keeping things up” looks like. In human terms this simply doesn’t scale. Decisions need to be repeatable and incremental, not “squishy”.
Also, metrics naturally “drift” over time due to things like the internet, the market, load. Detecting tiny incremental changes manually is impossible but machines can “learn” what the boundaries of normal are and tweak themselves based on existing use cases. Netflix feed “tagged” data to their algorithms that “know” what they are looking for and can adjust themselves automatically to help keep servers up or detect anomalies.
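As a toy illustration of the drift idea (this is not Netflix’s actual algorithm; the smoothing factor and tolerance are made up), an exponentially weighted moving average can follow slow metric drift while a sudden deviation still stands out:

```haskell
-- Toy drift tracker: an exponentially weighted moving average (EWMA)
-- slowly follows the metric, so the "boundary of normal" adjusts itself.
ewma :: Double -> [Double] -> [Double]
ewma alpha = scanl1 (\avg x -> alpha * x + (1 - alpha) * avg)

-- A point is anomalous if it sits too far from the current average.
isAnomaly :: Double -> Double -> Double -> Bool
isAnomaly tolerance avg x = abs (x - avg) > tolerance

main :: IO ()
main = print (ewma 0.5 [10, 10, 12, 30])
```

With a small alpha the average barely moves for gradual drift, so only genuinely abnormal points exceed the tolerance.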
For example, when an EC2 instance boots it may simply be faulty; this needs to be detected in real time and the instance destroyed before it’s put in front of customers. Each time a server is tested, a “score” is generated by the automated tests, and if it’s less than 95% of the control server’s score it will be binned.
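A minimal sketch of that check (the 95% threshold is from the talk; the function name and score representation are my own):

```haskell
-- Hypothetical canary check: keep a freshly booted instance only if its
-- automated-test score reaches 95% of the control server's score.
passesCanary :: Double -> Double -> Bool
passesCanary controlScore candidateScore =
  candidateScore >= 0.95 * controlScore

main :: IO ()
main = mapM_ print [passesCanary 100 96, passesCanary 100 90]
```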
Netflix’s Viewing Data Microservices: How We Know Where You Are in House of Cards
tl;dr Great talk and speaker, really interesting guy. I definitely feel this is something we should be thinking about a lot more in our business. Even if we don’t implement it fully, or do so in a stripped down/different way, we should be thinking differently about the systems we control. Systems shouldn’t represent/dictate implementation; microservices shouldn’t work like that.
Existing architecture - stateful tier hosting their monolithic viewing service. Quote, “This is great”, it all works fine.
Why did they move to a new “GEN4” architecture?
- Ahead of imminent need
- Growth is trending up and potential growth is also growing
- Vicious cycle of viewing, learning and improving the experience
- Too many stateful instances in constant use, wanted better scaling and less waste
Principles for the move to the new system
- Architectures are “throwawayable” or “limited time” and based on a collect, process, provide pattern.
- Continuity is paramount
- Small iterations
- Externalise state
- Technology has moved on (Cassandra is now 2.0, AWS is mature), GEN4 is in a lot better position
- Queues handle anything that needs processing and can be deferred
- Shadow testing as a way of migration, duplication into new architecture until they reached parity on the numbers
Once happy, the stateful tier was then scaled down
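The shadow-testing idea can be sketched as duplicating each request into both stacks and measuring parity (a toy model; the real comparison runs over live traffic and aggregate metrics, and the function names here are my own):

```haskell
-- Toy shadow test: replay the same requests through the old and new
-- systems and compute the fraction of responses that agree.
-- Assumes a non-empty request list.
parity :: Eq b => (a -> b) -> (a -> b) -> [a] -> Double
parity oldSystem newSystem requests =
  fromIntegral agreeing / fromIntegral (length requests)
  where
    agreeing = length (filter (\r -> oldSystem r == newSystem r) requests)

main :: IO ()
main = print (parity (* 2) (\x -> x + x) [1 .. 100 :: Int])
```

Migration proceeds once parity stays at (or very near) 1.0 for long enough.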
GEN5 hunch: ever smaller systems, with another iteration on the collect, process, provide theme.
Microservices and the art of taming the Dependency Hell Monsters
tl;dr We’re doing a lot of this stuff in parts of the business, we need to be sure this continues to proliferate and permeate all facets of the web team. It’s “general practice” in open source for a reason.
Works for Gilt, an “American ASOS”, with a big dev culture similar to Etsy.
Internally they have many small RPC style services, product-catalog, inventory-service, cart-service, users. Problem is, as they become mature, the builds get larger and slower, each service has a different “play book”, increased boilerplate, quality drops which slows down development.
The biggest thing they appreciate is API consistency, something we need to think a lot more about here at HX.
Minimizing the pain… API design must be first class; Gilt operate a “schema first design”. You need to identify data, structure your resources and plan your relationships up front. Proven to add value, but the hardest thing to do.
“contract > implementation”
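In a schema-first style the data and relationships are declared before any implementation exists. A hypothetical sketch (Product/CartItem are invented names, not Gilt’s actual schema):

```haskell
-- Hypothetical schema declared up front: resources and their
-- relationships come first, implementation later.
data Product = Product
  { productId   :: Int
  , productName :: String
  } deriving (Show, Eq)

data CartItem = CartItem
  { item     :: Product
  , quantity :: Int
  } deriving (Show, Eq)

-- The contract any cart implementation must honour.
cartSize :: [CartItem] -> Int
cartSize = sum . map quantity

main :: IO ()
main = print (cartSize [CartItem (Product 1 "sock") 2, CartItem (Product 2 "hat") 1])
```

The types are the contract; how `cartSize` (or a real service behind it) is implemented can change freely underneath.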
Went on to talk about general practice:
- use open source
- use semver
- backwards compatibility
- minimal external libraries
- know when things change
- consistent naming (REST)
- smaller/flatter services so less dependencies
Managing Complexity - Functionality
tl;dr Best speaker of the conference, sold his idea well and will be looking in to functional programming more to see if we can adapt any patterns to what we’re doing with JS.
Move fast and break things is fine for some but in practice, we want to move fast without breaking. And software is complex so this is inherently difficult in two major ways:
- Essential complexity - the problem you are trying to solve
- Accidental complexity - the problems you or I introduce and can fix, for example when writing or optimising code
Haskell (and, more to the point, functional programming) makes software easier through the following principles:
- No dependencies.
- Pure functions have no side effects; they simply translate inputs to outputs
- “State” is only local to a function
- Functions can be parallelised easily
- In Haskell values are passed “by value”, so mutating data structures becomes irrelevant
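A small sketch of the purity point: because a pure function’s output depends only on its input, mapping it over a collection can be evaluated in any order (or in parallel) without changing the result:

```haskell
-- A pure function: no side effects, same input always gives same output.
double :: Int -> Int
double x = 2 * x

-- Safe to evaluate in any order; the input list is never mutated.
doubledAll :: [Int] -> [Int]
doubledAll = map double

main :: IO ()
main = print (doubledAll [1, 2, 3])
```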
Reactive (event-based) programming addresses asynchronous code. You don’t write callbacks; instead you are given an event stream. Event propagation happens automatically, so the code is a lot simpler and callback hell is avoided. Essentially you point a variable “x” at the result of “y”; when “y” finishes executing it emits an event, “x” updates, and the program continues without a callback-style approach.
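A toy model of that idea, treating a stream as a list of events with a derived value defined in terms of it (real FRP libraries are more involved; this is only an illustration and the names are mine):

```haskell
-- Toy event stream: each element is one event, in arrival order.
type Stream a = [a]

-- xs is defined in terms of ys: every event on ys yields a derived
-- value on xs, with no callback written anywhere.
ys :: Stream Int
ys = [1, 2, 3]

xs :: Stream Int
xs = map (* 10) ys

main :: IO ()
main = print xs
```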
In this example, getChar reads a character (note this could be any IO operation) and binds it to “c”, then putChar prints it out.
-- Haskell snippet
main :: IO ()
main = do
  c <- getChar
  putChar c
It’s a lot simpler to read and maintain.