LeadDev London 2019

This marks my second time at LeadDev London, and as you’ll see from my tweet stream, I get somewhat enthusiastic. The conference was (if possible) even more inclusive, diverse, and welcoming this year – made even more so by its positioning during Pride month (Meri highlighted it was a fluke, though she also said that in previous years about bigger political upheavals that overlapped with the conference – I feel there are too many coincidences here! :))

I have to call out the badge adornments. Upgrading to a pride lanyard, and giving both my communication preferences and pronouns on my badge was an awesome touch – I truly hope all conferences follow this model. This was only a drop in the ocean on the inclusivity though – a creche, an alcohol free room, a quiet room, a prayer room – so much to absolutely adore about the approach and care that is taken to looking after humans.

My leadership journey over the past 12 months between LeadDev 2018 and this year has been an interesting one, and has seen me move from being an engineering lead in a relatively small company to one of the biggest global pharma companies. Lots of personal growth, lots of exposure, loads of challenges – it’s been a blast.

This year’s LeadDev for me was underpinned by a few core themes, and each resonated:

  • Inclusion and Diversity – Let’s get it out in the open – I’m from the demographic that is part of the problem. I’m a cisgender, middle class, middle aged white man. I’m an ally to the LGBTQ+ community as actively as I am able (my trans son helps a lot with that), and I thought I was an ally to the rest of the diversity spectrum, though this conference has really highlighted that in my position of privilege there is so much more I can do. I was lucky enough after the conference to attend and be welcomed at a talk on feedback at the London Tech Ladies user group, and it was tragic to hear just how bad some of the ladies there had it in the workplace. I&D is, thankfully, handled very well at GSK, but LeadDev coupled with the meetup has reminded me that I need to be more proactive and engaged in my support (with permission) of more diverse groups.
  • Psychological Safety and Org Culture – it was heartening to see these referenced time and again in talks. Anyone who’s had a conversation with me knows I’ll talk till the cows come home around these subjects. Vulnerability, courage, and helping people to bring their whole selves to work is waaaay more important than the actual tech. It ran as a golden thread through the whole conference and was hugely comforting to hear. It’s essential to any effective organisation.
  • Feedback and High Performing Teams – it felt great to reflect on a lot of the practices and approaches highlighted by speakers. Things that seemed to work well for them with regards to engaging and investing in the people within the org, and a lot of them were aspects we’d already adopted (or adapted) to some success. It feels like there’s perhaps a tale from the trenches from our own experiences here that may well be valid to listen to.
  • Artificial Intelligence – a subject I have little awareness of, but one that I’ve seen a few times since joining GSK. I have a healthy tech scepticism about interacting with a bot that tells me I’m “looking peaky”; cold chills abound at this. Happy to talk to AI when interacting with my bank or tracking a missing order on Amazon, but I draw the line at my healthcare. S’the future though innit 🙂

Some of the best talks for me are worth highlighting. I’ll include links to the videos as soon as they’re available.

Navigating Team Friction

Lara Hogan (@Lara_Hogan) (video)

Some great lessons based around Tuckman’s stages of group development when navigating team friction, and some great references to emotional intelligence as a basis of communication (the amygdala hijack has had us all as its slave for far too long!). Rarely is there a tiger in the room when we react badly to something – the classic conundrum demonstrated being the emotive subject of ‘desk moves’! A great example followed: surgeons who perform the same procedure multiple times across multiple hospitals, yet whose performance only really ever seemed to rise significantly in the surgeon’s home hospital. The outcomes highlight the importance of team interaction, trust, workflows etc. – performance is so much more than just hiring a group of “rockstars” and hoping for the best.

I loved the BICEPS model highlighted when looking at our core needs at work:

  • Belonging – community, connection, a group, a tribe
  • Improvement/Progress – towards purpose, improving the lives of others, meeting KPIs/OKRs
  • Choice – flexibility, autonomy, decision making
  • Equality/Fairness – access to resources/info, equal reciprocity. When something isn’t fair we will as a species literally riot
  • Predictability – of resources, time, direction, and future challenges. Be careful though, too much and we get bored. Too little, and our amygdala comes out to play!
  • Significance – status, visibility, recognition

“Knowing and addressing people’s core needs is a shortcut to making others feel understood and valued”

I’ve followed Lara’s 1:1 guidance for some time, and found the question around ‘how do you like to receive feedback’ really useful in all 1:1s with new staff. The model that Lara proposed for feedback feels similar to others I’ve seen, but immediately applicable.

  • Observation – of a behaviour or action. Make it purely factual
  • Impact – of that behaviour. Share how you feel, but share it in a way they will care about.
  • Open Question or Request – in talking to adults, they’re already thinking about next steps, so you can ask a genuinely open/curious question “What do you hope will be the result of your actions in the meeting?” Generally, asking a question like this reveals so much more to feed into the shared pool of meaning in that conversation. You can make a request at this stage, but Lara highlighted an open question often worked best.

Self Care
Take care of yourself – distance yourself from an unhealthy work environment. Everything covered will help navigate team friction, but it won’t help solve a bad environment. If you’re in that kind of environment, you should not have to put yourself at interpersonal risk.

Bottom up with OKRs

Whitney O’Banner (@WooBanner) (video)

The best t-shirt of the day (and I was wearing my ‘Software is Made by Humans’ t-shirt!) – Whitney was wearing the amazing “Make tech more black” t-shirt! Any talk on Objectives and Key Results (OKRs) that starts with “Google got it wrong!” is going to grab anyone’s attention! This was an amazing talk, and I’d urge you to watch it. It challenged the status quo on a number of commonly held conceptions around OKRs and really made me think (and will absolutely generate discussion and change in our OKR process at work!)

Key takeaways

  1. Skip Individual OKRs
    Thankfully we already do this – OKRs down to the individual level just felt like bureaucracy and overly heavy process. Individual OKRs were likened to corporate meetings – they slow us down and don’t give us the bigger picture. How can we improve them? We can’t, just ditch them. If you need something at that level, go to ‘tasks’ instead.
  2. Ignore the metrics
    When first introducing OKRs, it’s hard enough to get the structure, the approach, and the alignment – don’t sweat the measurement detail too much. You want to improve X by 20%? Why 20%? Is that a stretch goal? Why not 30%? Whitney used the term SWAGS (Sophisticated Wild Ass Guesses) – awesome!
  3. Avoid Cascading Goals
    Heresy! This was Whitney’s key point – if you drop nothing else, drop this! Shouldn’t our goals align with the company? No… (just kidding, yes). You should absolutely want to achieve your company goals. Laszlo Bock (Google’s former VP of people operations) said: “Having goals improves performance. Spending hours cascading goals up and down the company, however, does not. It takes way too much time and it’s too hard to make sure all the goals line up.” Managers tend to overvalue their ideas by 42%. Frontline employees tend to undervalue theirs by 11%. Let the frontline employees decide on ‘how’ they’ll align and add value. Bottoms up!

Inclusion starts with I

Dora Militaru (@DoraMilitaru) (video)

I wish this had been a longer talk (it was one of the 10 minute slots) – it was, as a middle aged, able bodied, cisgender white man, a revelation. I’ve felt lucky to have more chance to support and ally the LGBTQ+ community, but this talk really highlighted to me how much more I need to do across all of the inclusion and diversity spectrum. Tech can still be a hostile place. We have male dominated job descriptions, male dominated interview processes. Some people have to work 10 times as hard in tech to do well. The FT has released an I&D report that I shall be reading, and I’ll be trying to action my own improvements.

If you read this and feel that things like diversity quotas are anti-meritocratic, double check your privilege – you may well be part of the problem. There were, however, two problems highlighted with quotas:

  • Fixing diversity is not the HR team’s responsibility, and quotas tend to lie with them
  • They can trick us into thinking we’ve fixed the problem when we’ve hit the target

Nine out of ten women still work in jobs where the men are paid more for the same job. I loved the quote: “Quotas may be the only way of achieving, eventually, a world where quotas are obsolete”. Paying lip service to diversity has to end – implementation is key.

Tech seems to be particularly bad for this – in a recent Stack Overflow survey it was highlighted that 34% said that diversity was their lowest priority when looking for a new job.
So many references flooded out at the end:

  • If minorities aren’t applying, fix it – you can do so
  • Advocate actively
  • Become an ally!
  • pronoun.is and tampon.club were examples given
  • If you worry that you don’t know something, ask – the worst thing you can do is remain silent
  • If you’re in a room making a decision and everyone in the room looks like you, that’s a warning sign
  • Celebrate diversity!

Ola Sitarska (@OlaSitarska)’s talk on an inclusive hiring process resonated a lot with the above and added some further context for me too and shouldn’t go without mention.

Facilitation Techniques 202

Neha Batra (@NerdNeha) (video)

I’m not even going to try to write this one up, but will update as soon as the video is out. I made so many scrambled notes, but it was awesome. If you facilitate anything in your working life, this had so much practical, tangible, actionable advice. How can a talk that mentions ‘Post-It Super Sticky Portable Tabletop Easel Pad with Dry Erase Panel’ be wrong? Honestly, watch it 🙂

Special Mention

To the end of day one storytelling of Nickolas Means (@Nmeans), who gave us a thorough history of Eiffel’s Tower, and tied it back to organisational politics and relationship building.

The end of day two saw the amazing Clare Sudbery (@ClareSudbery) and Dr Sal Freudenberg (@SalFreudenberg) talk about an incredible array of topics pivoting around a hackathon where they invented time travel… no, really! Watch it – both amazing speakers, and they delivered their content superbly. I learned of the spoon theory during this, and have already used it in defining my own energy levels in a conversation after 3 back to back days in London – loved it. Oh, and they came onto the stage to the amazing Poly Styrene and X-Ray Spex ‘Oh Bondage, Up Yours!’ – what’s not to love!

Daring to Dare Greatly

I’ve laughed, cried, and felt great affinity while reading Brené Brown’s ‘Daring Greatly’. There are so many takeaways, though it is important for us to remember that we are enough, and sometimes the bravest thing we can do is show up.

Her leadership manifesto should be at the heart of every organisation, not only for its humanity, but as an indicator of just how emotionally intelligent and empathic that organisation is.

I couldn’t recommend the book more highly.

Organisational growth, a leadership journey, and the tombola book club


tombola is going through a fascinating period of growth.  We’re breaking into new markets, we’re expanding our workforce to help us build richer experiences for our customers, and we’re going through some organisational change to help support that evolution.  Our values and goals have always existed, and in a smaller organisation, it was always easy to get to the source of ‘why’ with our work.  As we grow, it’s a challenge for all of us to ensure that everyone, from the most junior members of staff to our senior leaders, shares those goals, values, and culture.  Our effectiveness as an organisation pivots around this culture, these values, permeating through the organisation and becoming part of everyone’s DNA.  Elon Musk put it best when he said “Every person in your company is a vector. Your progress is determined by the sum of all vectors.”  If you’re lucky enough (as we are) to have incredibly effective people, but they’re all pulling in different directions, you are far less efficient as an organisation.

I’d been doing my best to be an effective part of this growth as a software developer/architect who’d transitioned into leadership some time ago now, initially reluctantly – I’m a geek, and I love to play with code.  I realised that my own effectiveness ceased to be about how many great systems I could build or how many features I could write into our software, and instead became absolutely about how best I can support and help grow a team that understands “what” they’re building, and critically, “why” they’re building it.  My effectiveness would be at its peak when my team was building the right things, right, without me needing to steer any of that.  I realised I was woefully under-skilled in leadership, communication, and effective support of those people who were solely responsible for building value for our customers.  An ex-tombola colleague who’s gone on to great things in leading said it best – you can be the best software developer in the world, but in transition to leadership you become the most junior of leaders – you’re back to “Day #0”.  You have to actively want to progress those skills that would help make you a better leader in the same way that we all as developers have actively progressed the skills that took us from junior through to the top of our game.

I realised that my own knowledge of “leadership” was still evolving (as it always will be) – I use the quotes as it’s such a broad gamut – it’s communication, it’s cultural, it’s delivery, it’s support, it’s helping to translate vision, it’s delivery of the ‘why’, it’s serving, it’s growth, and it’s so many other things.  There are so many theories to learn, so many practices to master, that it really is like being a junior developer again.  I decided I needed help, and as with my development career, that help would take the form of reading, watching, and talking – all of those things that I felt could help me become more effective for my team, and more effective for the organisation.

It was at this point, after reading a number of books on various elements of leadership that I felt I wanted to share this with others.  tombola has some really clever people, and in talking to some of them, I realised that others were on this same pathway, and wanted to share this with each other too, so we formed the tombola book club.  The books would be those books that we felt would help us grow as people, to help us, to help our teams, and to help the organisation.

The Book Club

My own hopes for this were to develop as a leader personally, but also to help others within the organisation share that learning and openly and honestly discuss how best to apply those things that we were learning and contribute to our own organisational culture.  Learning organisations are effective agents for delivery and their staff feel ownership, empowerment, and purpose, and I wished the book club to be a small part of the driver towards that.

After initial engagement, there were four of us keen to help co-ordinate this, and a number of others across the organisation who clearly wanted to share in that pathway of learning.  We shortlisted (longlisted!) those books that we all felt would be beneficial or that had influenced us, and we asked the group to vote.  The plan being to read one book per month.  The books currently on the list are below.

The membership of the book club is broad – it covers a diverse range of business areas and representations, and has attendance from the full range of staff from the most junior up to the senior leadership team across all disciplines.  It’s about a 50/50 split between technical and non-technical – again, something I’d hoped to achieve but feel lucky that we have.

Running the book club

We wanted this to be more than just ‘did you enjoy the book’ – we wanted richer communication, and we wanted to understand how what we’d read could be relayed back into tombola.  Having seen this format used before, we went with the ‘lean coffee’ approach (structured, but agenda-less) for our monthly meeting – this helped us focus the discussion around the book, and generated a flow based on what people wanted to talk about.  People organically raised and removed questions during the month, and we voted on those that were most important to us.

The meeting was then approximately an hour, and the topics were discussed in priority order, though this wasn’t strict, and discussion of some of them widened into the impact of things within tombola, which was precisely what I hoped to get out of the book club.


The first month – ‘Drive: The Surprising Truth About What Motivates Us’

The club voted to read Daniel Pink’s ‘Drive’.  I’d already read this as part of my own improvement pathway, but I was glad to read it again.  We supported people in their choice of physical, ebook, or audio book, and we set up a chat room for any points/discussion during the month, and an email list for broader topics if needed.  As a personal aside, for anyone with a commute, I can massively recommend Audible – I find reading physical books a little slow, but consuming them via audio book really has been a revelation and I’ve devoured books this year.

Throughout the month, questions came in to our lean coffee board about elements of the book that people wished to discuss, and very often, they had a tombola slant on them – ‘how could X work in tombola’, ‘do you relate to Y? Is that really how we operate?’ etc. – this was exactly what we hoped to get out of the club: aside from having a group of people learning, we were having those honest conversations about what we had learned that could help us on the journey the company is on.

I think we voted in a perfect book to start the book club – a lot of it was about self, and about your own views on the world, but it also helped us ask some crucial questions around the application of that in our local context.  The book club members agreed that the conversation around the book was almost as effective as the book for them, so we’ll chalk that one up as a good start!

The future – the good, and the challenges

Month two, and The Phoenix Project was voted in, which I’m massively excited by having read it previously and really feeling that although it’s heavy on the IT (and 50% of the book club are non-IT staff), it’s a book about organisational change, about flow of work, and about how an organisation that struggled with communication, collaboration, and visibility ultimately succeeds by going through change.  Perfect!

There is a challenge now that we hadn’t anticipated though – and that is one of scale – the first month went so well that the word is out, and we have almost another 50% of our current membership in new applicants.  We’ll be talking through this as a group to understand how best to support this, as having people on this pathway with us can only be good for the company, but that small/intimate feel of the group while discussing a book really adds value.  I’m sure there’ll be a solution there, we have clever people who want good outcomes.

In closing

I’d strongly recommend that any company going through growth or change, any company where you have new leaders, you have a culture of personal development, or you have a desire for people to talk more, set up a book club like this.  Your leaders and your staff are likely already reading these books, they’re learning from them, and they will have views on how each of them could benefit your organisation.  At the very worst, you’ve helped to bring together like minded people who want to help you grow.  At its best, tapping into that by supporting your staff and their personal growth while sharing that journey with them could just be the thing that helps all of those vectors within your organisation line up and start pushing in the same direction.

If you have any recommendations for books that ought to be on our list, please let us know.

Failed releasing, diagnosing collaboratively, and culture – tales from the trench

This is a quick write up of an incident yesterday that bit us and caused 2 rollbacks of our live UK website. I thought I’d write it up as I think it highlights a number of things around the organisation and its culture, and it was ultimately a ‘good day’.

Our bingo website release pipeline is pretty streamlined, and we generally release “relatively regularly” (2-3 times per week). I use the air quotes, as 2-3 times a week is nowhere near short enough if we’re attempting to amplify the feedback loop, but we are improving that all the time (the pipeline would happily push all changes to live multiple times per day, but there are more considerations there – a topic for another post).

The key failing up front: It had been 8 days since we released (for no good reason), and our release yesterday morning seemed to go great – all unit tests passed, integration – tick, e2e – tick, smoke tests – tick.  We’d manually checked our inactive stack (we do blue-green deployment), and all was good to go.  We switched the customers over to the new release, and all was great.

Then it wasn’t…  About 10 minutes after the release we started to get some exceptions on one of our dashboards, all of which were loudly looking like database connectivity issues.

This was quickly followed in HipChat by notifications from our customer service staff.

Cultural Win #1

We’re not what you would call a DevOps setup – we do have separate development and operations teams, and in the past, we have struggled in the same way that a lot of organisations that separate these functions do – we’re on the same journey, but looking out of different windows. It’s the usual issues that have been blogged about by others at length – infrastructure blaming developers for setting the world on fire, developers blaming infrastructure for moving slower than they want – both wanting to achieve exactly the same outcome for the business, neither one of them realising it.  You’ve seen it play out in many organisations I’m sure.

We’ve spent a lot of time as an organisation and as teams trying to improve upon our communication, respect, and overall culture, and it played out brilliantly in this – we all gathered together, talked about the issues, investigated quickly and came up with some cracking approaches on diagnosis that helped us rule out both any infrastructure changes and database changes really quickly.  There was no blame, there was no ‘us and them’, there was just ‘let’s get this fixed, together’ with a focus entirely on the customer.

For those organisations that have evolved their practices into either a more DevOps approach, or have gone through the cultural change and have evolved a shared why and a shared journey, this may seem simple, but even 6 months ago this would have been a situation that generated far more friction and silo’d mentality.

Although we were still having the live issues, we knew that rolling back the site and investigating together was then the right thing to do, and there was shared ownership.

Root Cause, and Cultural Win #2

The feedback loop wasn’t great: 8 days of regular code commits across two teams, a number of active projects in development, and a number of new pathways (albeit feature toggled off), so getting to root cause was a daunting task.

We peer review every single merge to our releasable branch, so if it was code, it was likely to be something very subtle.  It turned out there had been 193 commits and 231 files changed, so it felt like it was going to be a needle in a haystack.

We got lucky – while I was reviewing a piece of code, I saw a Unity registration set up like this:

container.RegisterType<ICoreThingyService, CoreThingyService>(new ContainerControlledLifetimeManager());

Oh, snap. Our services almost all register as PerResolveLifetimeManager, and this one had been new’d up as a singleton.  This had dependencies against data connections, and although those data connection dependencies were per resolve, it would never matter as we had one copy of the above for the lifetime of the application, so the connections were getting held onto (and worse, during connection pooling saturation, disposed of without getting re-setup).  Suddenly it all became clear on why the core issues were all database related.
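
The failure mode described above (a long-lived singleton holding onto a dependency that was meant to be short-lived) can be sketched outside of Unity. What follows is a minimal Python illustration, not our actual code and not the Unity API: `Connection`, `resolve_per_resolve`, and `resolve_singleton` are hypothetical stand-ins for a pooled connection and the two lifetime styles.

```python
import itertools


class Connection:
    """Hypothetical stand-in for a pooled database connection."""

    _ids = itertools.count(1)

    def __init__(self):
        self.id = next(Connection._ids)
        self.open = True

    def dispose(self):
        self.open = False


class CoreThingyService:
    """Holds whichever connection it was constructed with."""

    def __init__(self, connection):
        self.connection = connection

    def query(self):
        if not self.connection.open:
            raise RuntimeError("connection %d was disposed" % self.connection.id)
        return "ok from connection %d" % self.connection.id


def resolve_per_resolve():
    # Per-resolve style: every resolution builds a fresh service and connection.
    return CoreThingyService(Connection())


_singleton = None


def resolve_singleton():
    # Container-controlled style: the first instance (and its connection)
    # is reused for the lifetime of the application.
    global _singleton
    if _singleton is None:
        _singleton = CoreThingyService(Connection())
    return _singleton


# The singleton keeps the same connection across resolutions...
first = resolve_singleton()
second = resolve_singleton()
same_connection = first.connection is second.connection

# ...so once that connection is disposed of, every caller is broken.
first.connection.dispose()
try:
    second.query()
    singleton_failed = False
except RuntimeError:
    singleton_failed = True

# Per-resolve callers are unaffected: each one gets its own connection.
per_resolve_result = resolve_per_resolve().query()
```

The point of the sketch is only that the lifetime of the outermost registration wins: it doesn’t matter how short-lived the inner dependency is meant to be if the thing holding it never goes away.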

We have two teams working on this codebase, and again, it’s an indicator of the strength of our interactions and culture that it became then a blameless discussion, an agreement on this as the root cause, and, as soon as the fix was put in place, a shared review of the fix to quickly get it available for testing.

It’s a loooong time since I’ve been bitten by a singleton in a codebase, but I would guess it won’t be the last in my career.  They’re awesome, except in the many, many situations where they absolutely are not!

Continual Improvement – Next Steps

I’ve worked at tombola now for a little over 8 years, and it brings great comfort that the organisation is where it is now.  It’s always been a generally good place to work culturally, but there’s definitely a strong shift over the past year or so into something that is far more collaborative, more respectful, and with a greater understanding of the ‘why’.

Some steps I’ve identified from this one issue:

  • Shorten and amplify the feedback loop – although our feedback loop in terms of dashboarding, alerting, etc. is pretty good, if we had been releasing more regularly as a matter of course, we would have seen this within a very short space of time after it was introduced, and then diagnosis would have been so much easier and quicker.
  • Tie in simple load testing to releases – once I’d integrated the fix into our release branch (which then went through all of our testing listed above, and automatically deployed through our dev and UAT environments), I load tested it with ApacheBench to validate that I could not replicate the problem.

    Although I could hammer the response times (the server I load tested isn’t load balanced and is far smaller than our production boxes), I was hitting it with enough load to saturate it and replicate the problem we’d seen in live, and we had zero exceptions raised from the testing.  I will ensure that we tie this into our release cycle at some point so that we routinely perform automated load testing on our deploying code, and alert the pipeline if there are any issues.
  • Cultural – I think tombola are on a great path here already, but it feels great to be part of that cultural shift towards more respectful, shared collaboration.  I’ll endeavour to live that ‘why’.
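
The load-testing step in the list above could be tied into a pipeline along these lines. This is a minimal sketch in Python rather than the ApacheBench run I actually did: `run_load_test`, `fake_endpoint`, and `broken_endpoint` are hypothetical names, and a real version would make HTTP requests to the deployed site instead of calling local functions.

```python
from concurrent.futures import ThreadPoolExecutor


def run_load_test(call, total_requests, concurrency):
    """Invoke `call` total_requests times across worker threads.

    Returns the number of calls that raised an exception.
    """

    def attempt(_):
        try:
            call()
            return 0
        except Exception:
            return 1

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return sum(pool.map(attempt, range(total_requests)))


def fake_endpoint():
    # Hypothetical stand-in for an HTTP request to the site under test.
    return "200 OK"


def broken_endpoint():
    # Hypothetical stand-in for a request that fails under load.
    raise ConnectionError("database connection unavailable")


healthy_failures = run_load_test(fake_endpoint, total_requests=200, concurrency=20)
broken_failures = run_load_test(broken_endpoint, total_requests=50, concurrency=10)

# A pipeline step would fail the release when any request raised an exception.
release_is_safe = healthy_failures == 0
```

Counting exceptions rather than measuring response times mirrors what mattered in this incident: the undersized test box was never going to be fast, but it should not throw.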

re:Invent 2017

I feel like I only ever write up conference attendance on this blog, apologies!

I was fortunate enough to be one of the 4 developers tombola sent to re:Invent this year.

There were some seriously big announcements in both of the keynotes – I won’t go over them here, but you can read about them straight from the mouth of AWS.

The keynotes are both available to watch too:
– Andy Jassy Keynote
– Werner Vogels Keynote

The conference had many different strands and covered some of the bigger hotels in Vegas, but my own particular drive was around architecture, devops, and performance.

Below are some of the talks I attended that I feel will have a direct impact on the future direction of the business, and will see us leverage performance and cost saving elements while broadening our approach across the organisation.

Cache me if you can

Markus Ostertag @osterjour watch now

“There are only two hard things in computer science: cache invalidation and naming things” – Phil Karlton

A really powerful talk for us – we feel we cache well, though it’s clear from this talk that there is so much more that we can do.  It focused on becoming faster, having your app do less, so that you can do more (in aggregate).  Memory, we all know, is faster and cheaper than CPU, so this was one of the primary wins, and it demonstrated key caching strategies and techniques at each layer.

One thing we haven’t yet entertained is caching our main web application via CloudFront.  Even if the TTL is 0, CloudFront could help as it is optimised for that ‘last-mile’ delivery of your app.  Couple that with Lambda@Edge, and you have a potentially nice means of getting your site out more easily to customers that is cacheable yet still has dynamic elements.  Lambda@Edge could also give us a nice pathway into our A/B testing work.

He talked significantly about what to look for in your caching – what the big hitters were, where you were getting the hits/misses and where optimisations could be made around this, and highlighted how even small changes from monitoring these could have massive impact.

Some key guidelines:

  • Choosing your TTLs on caching wisely is another obvious gotcha – even small TTLs help, but obviously aggressive caching to avoid any ‘walk of shame’ to the data layer is rarely a bad thing.
  • Cache everything! (Sessions, Results, Aggregations, Templates, Environments, Configurations, etc.)
  • Log and monitor everything – you can’t optimise your caching strategy if you don’t know the simple outputs like hits/misses
  • If you’re not using close to 100% of your ram, you have an optimisation opportunity there
  • Don’t worry about data duplication in caching too much – if it achieves speed
  • Don’t be afraid of negative caching – if a ‘no results’ is valid, consider caching it to avoid the next call having to do the database call
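
A few of the guidelines above (TTLs, counting hits and misses, and negative caching) fit into one small sketch. This is an illustrative in-memory cache in Python, not anything from the talk or from our codebase: `TTLCache`, `load_customer`, and the sample data are all hypothetical.

```python
import time


class TTLCache:
    """Tiny in-memory cache with per-entry expiry and hit/miss counters."""

    _MISSING = object()

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key, loader):
        entry = self.entries.get(key, self._MISSING)
        if entry is not self._MISSING and entry[0] > self.clock():
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = loader(key)  # may legitimately return None ("no results")
        # Negative caching: store None too, so the next call skips the database.
        self.entries[key] = (self.clock() + self.ttl_seconds, value)
        return value


database_calls = []


def load_customer(key):
    # Hypothetical database lookup; only customer 1 exists.
    database_calls.append(key)
    return {"id": 1, "name": "Ada"} if key == 1 else None


cache = TTLCache(ttl_seconds=30)
cache.get_or_load(1, load_customer)   # miss, hits the database
cache.get_or_load(1, load_customer)   # hit
cache.get_or_load(99, load_customer)  # miss, caches the "no results" answer
cache.get_or_load(99, load_customer)  # hit, no second database call
```

The `hits` and `misses` counters are the ‘simple outputs’ worth logging: without them there is no way to tell whether a given TTL is helping.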

The adding up of these minor performance gains was put into sums far better than I could here, though even making a 1ms improvement when you are seeing 10,000 requests per second still saves 10 seconds of processing for every second of traffic, which added up to 7,000+ instance hours saved per month.
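
The sums behind that figure can be checked in a few lines. This is my own back-of-the-envelope working rather than the speaker’s, and it assumes constant traffic over a 30 day month.

```python
saving_per_request_ms = 1        # 1 ms saved per request
requests_per_second = 10_000

# Processing time saved for every second of wall-clock traffic.
saved_seconds_per_second = saving_per_request_ms * requests_per_second / 1000

# Scale up to a month (assumption: 30 days of constant traffic).
seconds_per_month = 30 * 24 * 60 * 60
saved_hours_per_month = saved_seconds_per_second * seconds_per_month / 3600

print(saved_seconds_per_second, "seconds saved per second of traffic")
print(saved_hours_per_month, "hours saved per month")
```

Under those assumptions the saving comes to 7,200 hours per month, which is in line with the 7,000+ quoted in the talk.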

There was a very good comparison of the Memcached and Redis variants of ElastiCache.  I can’t immediately find a situation where we’d use Memcached over Redis, but if you have one, please comment.

He talked heavily on caching in front of the DB, though it’s one of the bigger changes that you can make as there’s a significant need to ensure the architecture is aware of it all.  You can either cache using TTL invalidation, or keep the cache in sync at all times with synchronous writes – it works, but it’s a huge change to the application.  With RDS, you can have uncoupled writes and invalidations via Lambda (are we seeing a pattern here yet on Lambda?), which effectively gives you DB triggers à la SQL Server.  If using DynamoDB, you have DAX (DynamoDB Accelerator) which seems like a no brainer if you want to eke out performance.
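
To make the ‘synchronous writes’ option above a little more concrete, here is a minimal write-through sketch in Python. It is an illustration under simplifying assumptions, not the approach from the talk: `WriteThroughStore` is a hypothetical name, and a plain dict stands in for the database.

```python
class WriteThroughStore:
    """Keeps a cache in step with a backing store by writing to both."""

    def __init__(self, database):
        self.database = database  # hypothetical dict standing in for the DB
        self.cache = {}
        self.database_reads = 0

    def write(self, key, value):
        # Synchronous write: database first, then cache, so a failed
        # database write never leaves a newer value in the cache.
        self.database[key] = value
        self.cache[key] = value

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        self.database_reads += 1
        value = self.database.get(key)
        self.cache[key] = value
        return value


store = WriteThroughStore(database={"balance:42": 10})
first_read = store.read("balance:42")   # falls through to the database
store.write("balance:42", 25)           # updates database and cache together
second_read = store.read("balance:42")  # served from the cache, already fresh
```

The catch is the one mentioned above: every write path in the application has to go through something like `write`, which is why it is such a big change compared with simply letting entries expire on a TTL.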

Consider your 80/20s when looking at caching:

  • Find your heavy hitters – the bigger operations either in amount of data processed, or amount of requests
  • Have them in memory as much as possible
  • It pays off to do special things for them and handle them as a special case

Scaling to your first 10 million users

Ben Thurgood watch now

“Many decisions are reversible, two-way doors” – Jeff Bezos

This was a really interesting talk about the journey from small scale, single user sites all the way up to millions of users – things to consider, AWS services to perhaps look at, and there were certainly takeaways for us to assist with our current journey towards improving scalability. Rather than re-iterate the details of the talk, it’s best to just watch through it and decide where you are in the evolution of your architecture, and what you can take away as wins.

A day in the life of a Netflix engineer III

Dave Hahn @relix42 watch now

The numbers from Netflix are just astounding:

  • 100,000,000,000s events through their data pipeline every day
  • 10,000,000,000s requests handled by edge systems every day
  • 1,000,000,000s metric time series, aggregated, collated and stored every day
  • 100,000,000s hours entertainment streamed to customers every day
  • 10,000,000s devices talking into our service every day
  • 1,000,000s requests through the front door every second
  • 100,000s EC2 instances answer those requests
  • 10,000s auto-scaling instances every day
  • 1,000s production changes every day (code pushes, feature flags – daily average about 4,000)
  • 100s micro services to create that experience
  • 10s terabits of video over the internet every second
  • 1 goal – winning moments of truth

I found it interesting that there has been a bit of noise on Twitter about cargo culting of Netflix, and doing things ‘just because Netflix does it’.  I’d agree that you should never just follow blindly (after all, you’re not Netflix – what works for them…), but equally, learning from some of the practices they have in place, both technically and culturally, is aspirational and really does give a focus to that passion and drive to improve.  The numbers are bigger, but the challenges are often the same.

Some really good discussion on the tooling they use to facilitate the above in this talk too, and there are certainly tools we’re using already that were OSS’d by Netflix, and I’m sure there will be more in future.

Performing chaos at Netflix scale

Nora Jones @nora_js watch now

“Chaos doesn’t cause problems, it reveals them” – Nora Jones

A contender for ‘best talk of the conference’ for me, and one that immediately inspired me both to build in resilience and to introduce chaos to see just where that resilience is and, more importantly, where it is not.

Chaos Engineering allows us to expose weaknesses in a way that testing in all forms doesn’t.  Testing allows us to address ‘knowns’, and knowns are generally so much easier to plan in and predict.  Chaos deals with the unknowns – part of the reason they are termed ‘experiments’ rather than ‘tests’.  Chaos engineering gives us a new way to increase confidence – how will our payments service handle increased latency with our third party supplier?  How would we handle half of our load balanced web pool falling over?
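
To give a flavour of what a small latency experiment might look like, here is a sketch in Python. It is an assumption-laden toy rather than anything Netflix showed: `with_injected_latency` and `run_experiment` are hypothetical helpers, and the ‘third party supplier’ is just a callable.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_injected_latency(call: Callable[[], T], delay_s: float,
                          probability: float, rng: random.Random,
                          sleep: Callable[[float], None] = time.sleep
                          ) -> Callable[[], T]:
    """Wrap `call` so that a fraction of invocations are delayed first."""
    def wrapped() -> T:
        if rng.random() < probability:
            sleep(delay_s)  # the experiment: simulate a slow dependency
        return call()
    return wrapped


def run_experiment(call: Callable[[], object], runs: int, budget_s: float,
                   clock: Callable[[], float] = time.monotonic) -> int:
    """Count how many of `runs` calls blew the latency budget."""
    over_budget = 0
    for _ in range(runs):
        started = clock()
        call()
        if clock() - started > budget_s:
            over_budget += 1
    return over_budget
```

The interesting part is the hypothesis you state beforehand (for example ‘payments still complete within budget when the supplier is 500ms slower’) and whether the count that comes back agrees with it.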

Chaos is inevitable – there are companies out there who make a living based on the fact it exists.  Chaos engineering attempts to bring that knowledge earlier into the flow and allow us to understand the problems before they become a PagerDuty alert at 2am.

Nora gave an effective pathway to introducing chaos at an organisation – do not start in production, start with non-critical services, and only include services that *want* to be chaos’d.

The forces of chaos were highlighted as:

Obviously, safety and the monitoring of key business metrics are essential.  Really worth a watch.


An incredibly motivating conference, and something with direct takeaways for the business.  Seeing others talk about and be open about operating at scale, and how they have solved the same problems that we’re also solving, and being open with their knowledge (both in talks and out of them) was really empowering, and I feel sure that I and my team will be applying what I learned for the foreseeable as we work towards the future.

DDDNorth – a day of free learning in Bradford

It was a 5am alarm that woke me, and likely my colleagues, on a Saturday morning when most people would be comfortably in the land of nod, or contemplating how best to laze away their Saturday. For these tombola developers though, it was a drive down to Bradford to attend DDDNorth – a day-long free conference set up and run by the community and supported by some brilliant sponsors. The drive down was uneventful, and we were presented with caffeine and brekky before the talks commenced.

Michael Tomaras, Luke Hill, and I were in attendance – I’ll relay the talks that were most inspiring to me.

There were two key talks for me – one, which I’ve heard and read a bit about anyway, was around the Spotify model for scaling agile by Stephen Haunts, and the second was a war story from Nathan Gloyn after 18 months of working on a number of projects where microservices played a part.

Microservices – Nathan Gloyn

We’re on a journey of growth at tombola that is seeing us diversify our software products in order to facilitate growth more readily – and although I’ve studied significantly around architecting, building, and supporting microservices, I thought a talk dedicated to ‘what I’ve learned after a year of building a system’ would be right up my street.

There was a bit of background about microservice patterns (and anti-patterns), and discussions around identification of bounded contexts, fat vs thin microservices and just some key gotchas – security, service discovery, logging (and logging, and logging, and logging some more).

Some key takeaways:

  • Deployment (deploy small, avoid single repo for multiple services),
  • Identity and Authorisation (get these right up front – don’t attempt to retro fit it, it’ll get inordinately harder),
  • Build based upon need (not because it’s cool),
  • Configuration (strongly consider configuration management – consul/zookeeper/et al),
  • Logging (you can never log too much),
  • Monitoring (ensure you understand the baseline and health of each component, but ensure you are monitoring the system as a whole too),
  • System flow (correlation / session tokens in order to track journeys and requests through various systems is crucial)
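
The correlation token point is the easiest of these to show in code. Below is a minimal sketch using Python’s standard `logging` and `contextvars` modules. It is my own illustration rather than anything from Nathan’s talk: the `X-Correlation-ID` header name, the `orders` logger and `handle_request` are all assumptions.

```python
import contextvars
import logging
import uuid

# Holds the id for whichever request is currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    """Stamp every log record with the current request's correlation id."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True


handler = logging.StreamHandler()
handler.addFilter(CorrelationFilter())
handler.setFormatter(logging.Formatter("%(correlation_id)s %(name)s %(message)s"))
logger = logging.getLogger("orders")
logger.addHandler(handler)
logger.setLevel(logging.INFO)


def handle_request(incoming_headers: dict) -> dict:
    """Reuse the caller's id if present, otherwise start a new journey."""
    cid = incoming_headers.get("X-Correlation-ID") or str(uuid.uuid4())
    token = correlation_id.set(cid)
    try:
        logger.info("handling request")
        # Headers to send on any downstream call, so the id travels with it.
        return {"X-Correlation-ID": cid}
    finally:
        correlation_id.reset(token)
```

Every log line written while the request is in flight carries the same id, and passing the returned header on to downstream services lets you stitch a journey together across systems.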

None of these are new, though Nathan distilled them well and delivered an effective talk. The only thing missing from this for me was around the organisational change required to support microservices – a move we’re currently undertaking in terms of a shift away from a more monolithic single deploy application into many more smaller, co-ordinated, API driven services. Conway’s Law and team structure vs architecture design within an organisation is of key interest to me, and I think it’d have been nice to see a little more around this in the talk.

Scaling Agile with the Spotify Model – Stephen Haunts

Another useful war story about how Stephen and the team at his previous employer had managed the growth of the organisation via the Spotify model, which they modified in a rather comic ‘lord of the flies’ motif, with islands (multiple companies) and lookouts (marketing/sales type roles that protected the developers from the external landscape that was very much waterfall / deadline driven).

Some really refreshing pointers during this for me on just how best to empower and inspire the workforce while adapting to the growth and change of the organisation.

Key slide of the day for me though was one presented from a Harvard Business Review article.

This is such an incredible visual metaphor for just how satisfied, engaged, and inspired employees would be within an organisation, and I think this will be the one image that goes up on the wall in the office – definitely something to aspire to.


Post is also available at ops.tombola.co.uk (or will be, soon!)


A brilliant day of learning, some really useful talks, and a day to get some discussion with peers from the industry – all for the bargain price of £0.00. Further discussions with peers in other sectors highlighted that recruitment was as difficult for them as it is for us, no matter how cool or interesting the work you are doing (by the way, we’re hiring! see our careers site).

Free learning, free food, free chat, free inspiration – what’s not to like? Thanks DDDNorth.

DDD Scotland 2016 – A sunny day in Edinburgh

Saturday saw an early start to travel from the north east up to Edinburgh for the first DDD Scotland in a number of years and the hashtag was already seeing activity, so it was clearly going to be a good day.

First, and only, disappointment of the day – no coffee at the venue for breakfast!  They made up for it later in the day, though a cup of rocket fuel would have helped after starting at 5am.


Some very good talks, and some real takeaways from the day – as with any event of this type, a re-firing of the engines is definitely part of the aim – new ideas, new directions, new approaches – and the day most certainly delivered on this.

I was particularly impressed with the guys from Skyscanner – both culturally and technically they have identified their problems well, and effectively delivered solutions and organisational change in order to minimise pain – a success story in the making.

A Squad Leaders Tale – the Skyscanner squads model

Keith Kirkhope (@kkirkhope)

The squads model from Spotify has been widely discussed, and it is a model that Skyscanner has adopted (and adapted) during a period of organisational and architectural change; it would appear that they have done so to good effect.

They had the typical growing pains of any organisation that has grown significantly after early success:

  • They’d hit product gridlock
  • Their source control/release process was unwieldy
  • They were broken into functional teams (front end, etc.)

They were able to apply a lot of thinking around the theory of constraints, and indeed highlighted that they realised they were only as fast as their slowest unit. 

They adopted the squads model, though included a few modifications to better fit their organisational structure (they included squad leads at the top of each squad, but also brought in a tribe lead, a tribe engineering lead, and a tribe product lead to give better oversight across each squad).

Each squad is essentially self managing and self directing – they come up with their own goals, metrics, and success criteria (an example given was ‘to help users find their ideal flight(s) and book it as easily as possible’).

Some really positive side effects from this empowerment – for example, project leads become redundant, the product owner becomes one of the key foci.

I managed to ask a number of questions in this talk to better grasp the model, though unfortunately the one I didn’t ask was around Conway’s Law.  This organisational change seemed to be fundamental to their move from monolith to micro services, and I suspect without it they could not have so effectively broken down that monolith.  This change was a top down led change, and it’d be fascinating to learn more about the drivers behind it.  It’s the first time I’ve seen the communication structures of an organisation so directly impacting the design of the systems.

Breaking the monolith

Raymond Davies (@radyrad88)

Another talk from Skyscanner, this one a very detailed history of Skyscanner’s journey from inception through to the current day, covering a great deal of the technical decisions made along the way – some of which I shall investigate as part of our own journey at tombola as we face some similar issues/growing pains.

They moved from classic asp, through webforms/mvc, to mvc, and eventually arrived at their current architecture which is evolving towards a micro service model.

Some key takeaways from this one:

  • Aggressively decommission older/legacy systems – ‘kill it with fire’
  • Theory of constraints played a big part in their evolution
  • They weren’t afraid to look at alternative team structures and architectures

Some technologies to look at:

  • varnish and esi, riverbed traffic manager

This and the squads talk were the high point of the day for me.

Other talks

Windows brings docker goodness – what does it mean for .net developers (Naeem Sarfraz)

A great talk and the speaker was very knowledgeable – the current state of the nation for docker looks good.  Certainly not yet at the point where you’d want to deploy (far from it), but the technology is maturing nicely.

Versions are evil – how to do without in your APIs (Sebastian Lambla)

This one waded into the holy wars on RESTful endpoints, and his points were very well argued.  Worth seeing the video below.

Slides From Talks

ASP.NET Core 1.0 Deep Dive – Christos Matskas

You Keep Using the word agile… – Nathan Gloyn

Versions are evil – how to do without in your API – Sebastian Lambla

“Advanced” Functional Programming for Absolute Beginners – Richard Dalton

CQRS and how it can make your architecture better – Max Vasilyev

Ladies and Gentlemen the plane is no longer the problem – Chris McDermott

Breaking the Monolith (video) – Raymond Davies

BuildStuff Lithuania 2015

Just returned from a fantastic trip to Lithuania to attend BuildStuff 2015 and thought I’d get my notes down into a blog post to help distill and to build a brown bag session for the team at work.

The focus this year seems to have been heavily around a few key topics:

  • Functional programming played a big part and it was clear from even those talks that weren’t functional that there is a shift across to this paradigm in a lot of people’s work.
  • Agile process and approaches featured heavily as an underpinning, and indeed one of the best talks of the conference for me was Liz Keogh’s talk on ‘Why building the right thing means building the thing right’
  • Micro services (micro services, micro services, micro services) is still the hipster buzzword, though at least there were hints that the golden goose’s egg has cracks in it (they’re still seen as a very positive thing, though they’re not without their own costs and limitations)
  • APIs naturally featured heavily in a few talks as people move more towards service orientation/micro services, and there are now a healthy set of talks on the ‘how to do this part right’
  • Continuous Integration/Continuous Delivery seems to have become less popular/less cool as a topic, but I was able to get some very useful insights on the conference that helped a lot.

You can see the full list of talks I attended here.

My tweet stream for the conference is here, and the full tweet stream for the #BuildStuffLT hashtag is here.

I attended some talks based upon the calibre of the speaker, and in some cases that was a disappointment – I of course won’t mention names, though there were a few of the bigger personalities that disappointed in presentation.

A couple of the talks I took more notes at (in chronological order):

5 Anti-Patterns in Designing APIs – Ali Kheyrollahi (@aliostad)

I loved the visual metaphor presented early in this talk of the public API as an iceberg where the vast majority of the activity is under the surface in either private APIs or business logic, and the public facing element is a small part of it.

The anti-patterns were listed as follows:

  • The transparent server – Exposing far too much information about the internals or the implementation. Having to request resources with your userId in the URL (/get-details/12345 instead of /get-details/me) for example.
  • The chauvinist server – Designing the API from the server’s perspective and needs and pushing that thinking and process to any clients if they wish to consume it. Interestingly, Ali came off the fence and suggested HATEOAS as an anti-pattern in this regard – I’m not convinced, but it was refreshing to see a strong opinion on this.
  • The demanding client – where certain limitations are enforced from a client perspective (e.g. forcing versioning into the URL as opposed to the headers)
  • The assuming server – where the server assumes knowledge on issues that are inherently client concerns. Good example here was pagination – /get-winners/page=1 versus /get-winners?take=20&skip=0 – we don’t know anything about the form factor on the server, so a ‘page’ has no context.
  • The presumptuous client – a client taking on responsibilities that it cannot fulfil (e.g. client implementing an algorithm that the server should handle, client acting as an authority for caching/authorisation etc.)
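
The pagination example from ‘the assuming server’ can be sketched like this. It is my own illustration in Python, not code from Ali’s talk, and `get_winners` and `page_to_window` are hypothetical names: the server only understands `take` and `skip`, while the notion of a ‘page’ lives with the client that knows its own form factor.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def get_winners(all_winners: Sequence[T], take: int = 20, skip: int = 0) -> List[T]:
    """Server side: return a window of results, with no idea what a 'page' is."""
    if take < 0 or skip < 0:
        raise ValueError("take and skip must be non-negative")
    return list(all_winners[skip:skip + take])


def page_to_window(page: int, page_size: int) -> dict:
    """Client side: translate this device's idea of a page into take/skip."""
    return {"take": page_size, "skip": (page - 1) * page_size}
```

A phone showing 10 items per screen and a desktop showing 50 can then share the same endpoint, each calling something like `get_winners(data, **page_to_window(2, 10))` with its own page size.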

Another analogy I liked was in thinking of the API like a restaurant. The front of house is pristine, controlled, serene, structured. How the food arrives at the table is unimportant, and the kitchen (the server side of the API) could be a bed of chaos and activity, so long as the delivery to the front of house is pristine.

Service Discovery and Clustering for .net developers – Ian Cooper (@icooper)

This was listed as being for .net developers, though in reality the concepts apply equally across other technology stacks; it was nice to see code examples in .net for some of these.

He covered some of the fallacies of distributed computing:

  • The network is reliable
  • Latency is zero
  • Bandwidth is infinite
  • The network is secure
  • Topology doesn’t change
  • There is one administrator
  • Transport cost is zero
  • The network is homogeneous

And also covered a number of things around Fault Recovery:

  • Assume a timeout will happen at some point
  • Retry pattern (HTTP status code 429 – ‘Retry-After’)
  • Circuit breaker pattern (another mention for Polly here, which is an awesome library)
  • Introduce redundancy (be careful where state is stored)

Discovery was discussed at length (naturally), and he covered both Server and Client side discovery, as well as the general tooling available to help manage this (Consul, Zookeeper, Airbnb SmartStack, Netflix Eureka, etcd, SkyDNS) and covered the importance of self registration/de-registration of services.

A lot of practical/good content in here and a cracking speaker. Really liked the way he delivered demos via screencast so that he could talk rather than type – I think a lot of speakers could benefit from this approach.

Why Building the Right Thing means Building the Thing Right – Liz Keogh (@lunivore)

A lot of this talk focussed around Cynefin, a framework that originated with Dave Snowden and describes a way of understanding and evaluating complex systems as they evolve. This talk covered a number of concepts known to me, but in a new way, so it very much hit upon my ‘must learn more about this’ list. It covered massively more than I could do justice to (though the link to the talk above from Liz is very similar to the one she presented), and she covered a whole pathway through an organisation’s agile fluency.

One of two talks at the conference that really gave me ‘take aways’ to go and learn and get better at – so massively happy I attended.

ASP.NET 5 on Docker – Mark Rendle (@markrendle)

This is the first time I’ve seen Mark present and I hope it shan’t be the last. Brilliantly clever bloke, fantastic presentation style, and clearly knows his topic areas well (he gave a closing keynote too which was equally good).

I played with vNext of asp.net in early beta, so it was incredible to see how far it’s come since then. He had brought it all the way up to date (RC1 of the framework had been launched the day before, and he included it in the talk), and the flow and interaction has become really polished.

I have to admit to being behind the curve with regard to Docker – I understand it conceptually and have kicked up a few docker images, but nothing anywhere near production or usable at any scale. I don’t really have any solid need for it right now, though the talk did demo how easy it was to fire up and deploy the code to a docker container and it’s possibly something to look at once the container/unikernel platform settles down.

All of the demos were given on linux/mono, though that evening (tragic I know) I re-worked through the talk on OSX and it all worked a treat so it does indeed seem like Microsoft has the open source/multi-platform delivery message correct here. I’ll do a follow up post on this as it’s now the topic that will take up most of my play time in the evenings.

Continuous Delivery – The Missing Parts – Paul Stack (@stack72)

I talk with Paul at most conferences and have been to his talks in the past, so I hadn’t really thought I’d attend this talk (I’ve heard all he has to say!) – so glad I did. It started after a twitter conversation pre-talk with him and Ryan Tomlinson around where the complexity in micro-services exists (away from the code, and more towards the wiring/infrastructure of it all). Thankfully, Paul’s talk focussed around exactly those topics and it was almost a rant towards the micro-services fandom that is exhibited heavily at conferences currently.

He covered the key tenets of Continuous Delivery:

  • Build only once (never ever build that ‘same’ binary again once you’ve shipped it)
  • Use precisely the same mechanism to deploy to every environment – that doesn’t mean you can use right click, publish to push up to production 😉
  • Smoke test your deployment – this is key – how do you know it works?
  • If anything fails, stop the line! It’s imperative at any stage that you can interject on a deploy that fails
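
Those four tenets fit into a surprisingly small amount of orchestration logic. The sketch below is my own, in Python, with hypothetical `build`, `deploy` and `smoke_test` callables standing in for whatever your build server actually runs; it is not taken from Paul’s slides.

```python
from typing import Callable, List, Sequence


def run_pipeline(build: Callable[[], str],
                 deploy: Callable[[str, str], None],
                 smoke_test: Callable[[str], bool],
                 environments: Sequence[str]) -> List[str]:
    """Promote one artifact through each environment, stopping on failure."""
    artifact = build()  # build only once; the same artifact goes everywhere
    promoted: List[str] = []
    for env in environments:
        deploy(artifact, env)  # same mechanism for every environment
        if not smoke_test(env):  # how do you know it works?
            # Stop the line: nothing further down gets this artifact.
            raise RuntimeError(f"smoke test failed in {env}, stopping the line")
        promoted.append(env)
    return promoted
```

Called with something like `["test", "staging", "production"]`, a failed smoke test in staging means production never sees the artifact.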

Covered some common misconceptions about continuous delivery:

  • It’s something only startups can do – it’s true that starting in greenfield makes it easier to build upon, but anyone can move towards continuous delivery
  • It’s something that only works for nodeJS, Ruby, Go developers – any ecosystem can be pushed through a continuous delivery pipeline
  • We can hire a consultant to help us implement it – domain knowledge is crucial here, and someone without it cannot come in and help you solve the pain points
  • Continuous delivery is as simple as hooking up github to your TC account – all parts of the pipeline really need to be orchestrated and analysed

There was a really good example of successful continuous delivery, and it was a quote from Facebook. They deploy new functionality to 17% of the female population of New Zealand. Basically, by the time the major metropolitan cities come online, they already know if that feature is working or not.
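
I have no idea how Facebook actually implement this, but a common way to get a stable ‘N% of a cohort’ rollout is to hash the user id into a bucket. The Python sketch below is purely illustrative: `rollout_bucket`, `is_enabled` and the shape of the `user` dict are all my own assumptions.

```python
import hashlib
from typing import Callable


def rollout_bucket(user_id: str, feature: str) -> int:
    """Map a user to a stable bucket in 0..99 for this feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(user: dict, feature: str, percent: int,
               cohort: Callable[[dict], bool]) -> bool:
    """Enable the feature for roughly `percent`% of users matching `cohort`."""
    return bool(cohort(user)) and rollout_bucket(user["id"], feature) < percent


def in_new_zealand(user: dict) -> bool:
    # Hypothetical cohort rule; the real targeting criteria are not known to me.
    return user.get("country") == "NZ"
```

Because the bucket is derived from the id rather than chosen at random on each request, a given user consistently sees (or doesn’t see) the feature, and raising `percent` only ever adds users.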

Some other key takeaways from this talk – you have to ensure you deliver upon the 4 building blocks of DevOps (Culture, Automation, Measurement, and Sharing) in order to ensure you have a strong underpinning. Again, this harks to the micro-services talks – just moving your auth system into a separate service doesn’t give you a micro-service. You need solid infrastructure underpinning it, you need orchestration, you need instrumentation and logging, you need some way of that service being discovered, etc. etc.

Continuous Delivery (to me) feels like a solid building block that needs to be in place and working well in order to act as a feeder for things that micro-services would hinge upon.

He mentioned the Continuous Delivery Maturity Model, and it’s worth everyone reviewing that to see where they sit in each category. One of the key things for my organisation is to review our cycle time and see just what our flow looks like, and if there are any key areas that we can improve upon.

CraftConf 2015 – did someone say microservices?

I’ve just returned (well, I’m sitting on a balcony overlooking the Danube enjoying the sunshine) from two days at CraftConf 2015 and thought I’d share my thoughts on my first attendance to this conference (it’s in its second year).  Firstly, let’s get the cost aspect out of the way – this conference is incredibly good value for money.  Flights, conference ticket and hotel came to less than the cost of most 2 day conference tickets in London, yet the speaker line up is incredible and the content not diminished because of this economy – if you have difficulty getting business sign off on conferences in the UK, you could do worse than look at this.  That said, a conference is all about the content so let’s talk about the talks.

Themes – Microservices, microservices, microservices

Thankfully, more than one person did say that microservices for them was ‘SOA done properly’ – the talks I gravitated toward tended to be around scaling, performance, cloud, automation and telemetry, and each of these naturally seemed to incorporate elements of microservice discussion.  Difficult to escape, though I guess based on the number of people in the room who haven’t yet adopted ‘small, single purpose services within defined bounded contexts’ (ourselves included), it was a topic ripe for the picking.


I won’t cover all of the talks as there was a lot of ground covered over the two days – thankfully they were all recorded so will be available to stream from the ustream site (via the CraftConf link above) once they’re all put together.

That said, there were some that stood out for me:

Building Reliable Distributed Data Systems

Jeremy Edberg (Netflix, @jedberg)

I’m a long time fan of Netflix’s technology blog, so seeing them give a talk was awesome. I think this one sat up there as one of the best of the conference for me. A number of key points from the talk:

  • Risk in distributed systems – when releasing, teams often look at the risk to their own systems and risks in terms of time of day, but often overlook the risk to the overall ecosystem – our dependencies are often not insignificant, and awareness of these is key to effective releasing
  • A lot of patterns were discussed – bulkheading, backpressure, circuit breakers, and caching strategies that I really must read more around.
  • Queuing – the approach of queuing anything you’re writing to a datastore was discussed – you can monitor queue length and gain far better insight into your systems activity.
  • Automate ‘all the things’ – from configuration and application startup to code deployment and system deployment – making it easy and quick to get a repeatable system up and running is key.
  • ‘Build for 3’ – when building and thinking about scale, always build with 3 in mind – a lot of the problems that come from having 3 systems co-ordinate and interact well continue on and are applicable once you scale up.  Building for 2 doesn’t pose the same problems and so bypasses a number of the difficult points you’ll cover when trying to co-ordinate between 3 (or more).
  • Monitoring – an interesting sound bite: alert on failure, not the absence of success.  I think in our current systems at work we’re mostly good at this and follow the pattern, though we can, as always, do better.
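
The queuing point is one I wanted to play with, so here is a minimal sketch in Python of a bounded queue sitting in front of datastore writes. It is my own interpretation, not Netflix’s implementation: `WriteQueue`, `submit`, `depth` and `drain` are hypothetical names, and a production system would use a proper message broker rather than the standard library’s `queue`.

```python
import queue
from typing import Any, Callable


class WriteQueue:
    """Bounded buffer in front of a datastore; its depth is a health signal."""

    def __init__(self, max_length: int):
        self.pending: "queue.Queue[Any]" = queue.Queue(maxsize=max_length)

    def submit(self, record: Any) -> bool:
        """Queue a write; False means 'full', i.e. backpressure on the caller."""
        try:
            self.pending.put_nowait(record)
            return True
        except queue.Full:
            return False

    def depth(self) -> int:
        """The number to graph and alert on: a growing queue means trouble."""
        return self.pending.qsize()

    def drain(self, write: Callable[[Any], None], batch_size: int) -> int:
        """Write up to `batch_size` queued records to the datastore."""
        written = 0
        while written < batch_size:
            try:
                record = self.pending.get_nowait()
            except queue.Empty:
                break
            write(record)
            written += 1
        return written
```

A steadily rising `depth()` tells you the datastore can’t keep up long before users start seeing errors, and the `False` from `submit` is a crude form of the backpressure pattern mentioned above.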

Everything will break!

Deserving of its own section, as you really have to hand it to Netflix for an incredible way of validating their systems in live.  They have a suite of tools called the Simian Army which are purposely designed to introduce problems into their live systems.  The mantra is ‘You don’t know you’re ready unless you break it yourself, intentionally and repeatedly’ – they have a number of different monkeys within this suite, and some of them are run more regularly than others, but this is an astonishing way of ensuring that none of your services in a distributed architecture is a single point of failure, and that each handles things like transient faults well.

It is seen as an acceptable operational risk (and indeed he confirmed they had) to take out customer affecting live services if the end goal is to improve those services and add more resilience and tolerance to them.  Amazing!

Incident Reviews

Their approach to these fitted well with what I’d hope to achieve so thought I’d cover them:

It was all about asking the key questions (of humans):

  • What went wrong?
  • How could we have detected it sooner?
  • How could we have prevented it?
  • How can we prevent this class of problem in the future?
  • How can we improve our behaviour?

Really does fit in with the ‘blameless postmortem’ well.

The New Software Development Game: Containers, microservices, and contract tests

Mary Poppendieck (poppendieck llc, @mpoppendieck)

A lot of interesting discussion in this keynote on day two, but some key points were around the interactions between dev and ops and the differing personality types between them.  The personality types were broadly broken down into two: safety focussed and promotion focussed.  The best approach is to harness both personalities within a team, and ensure that they interact.

Safety focussed

These people are about failure prevention – asking ‘is it safe?’ and if not, what is the safest way that we can deliver this?  Motivated by duty and obligation.  They find that setbacks cause them to redouble their efforts whereas praise causes a ‘leave it all alone’ approach.

Promotion focussed

‘All the things!’ – all about creating gains in the ‘let’s do it’ mindset. They will likely explore more options (including those new and untested).  Setbacks cause them to become disheartened whereas praise focuses them and drives them.

As a ‘promotion focussed’ person primarily, I’ve oft looked over the fence at the safety focussed and lamented – though really I think understanding that our goals are the same but our approaches different is something I could learn from here.

From monolith to microservices – lessons from google and ebay

Randy Shoup (consulting cto, @randyshoup)

Some interesting content in this one – his discussion around the various large providers and their approaches:


  • 5th complete rewrite
  • monolith perl -> monolith c++ -> java –> microservices


  • 3rd generation today
  • monolithic rails -> js / rails / scala –> microservices


  • Nth generation today
  • monolithic c++ -> java / scala –> microservices

All of these have moved from the monolithic application over to smaller, bounded context services that are independently deployable and managed.

He was one of the first (though not the last) to clarify that the ‘microservices’ buzzword was, for him, ‘SOA done properly’.  I get that microservices has its own set of connotations and implications, though I think it’s heartening to hear this as it’s a view I’ve held for a while now and it seems others see it the same way.

Some anti-patterns were covered as well.

  • The ‘mega service’
    • overall area of responsibility is difficult to reason about and change
    • leads to more upstream/downstream dependencies
  • Shared persistence
    • breaks encapsulation, encourages backdoor interface violations
    • unhealthy and near invisible coupling of services
    • this was the initial eBay SOA effort (bad)
  • “Leaky abstraction” service
    • Interface reflects the provider’s model of the interaction, not the consumer’s model
    • the consumer’s model is more aligned with the domain.  Simpler, more abstract
    • leaking the provider’s model in the interface constrains evolution of the implementation

Consensus is everything

Camille Fournier (Rent the runway, @skamille)

Not a lot to say about this one as we’re still in the process of looking at our service breakout and on the first steps of that journey, though I’ve spoken to people in the past around consensus systems and it’s clearly an area I need to look into.

Some key comparisons between zookeeper and etcd, though as Camille highlighted, she hadn’t had enough time with Consul to really do an effective comparison with that too.  Certainly something for our radar.

Key takeaway (and I guess a natural one based on consensus algorithms and quorum) was odd numbers rule – go from 3 to 5, not to 4 or you risk locking in your consensus.
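
The reasoning behind ‘odd numbers rule’ falls out of the majority arithmetic that quorum-based systems rely on. The little Python sketch below is my own working rather than anything from Camille’s slides.

```python
def quorum(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be lost while a majority can still be formed."""
    return nodes - quorum(nodes)


for n in (3, 4, 5):
    print(n, quorum(n), tolerated_failures(n))
# prints:
# 3 2 1
# 4 3 1
# 5 3 2
```

Going from 3 nodes to 4 raises the quorum from 2 to 3 but still only tolerates one failure, and an even-sized cluster can split 2/2 so that neither side has a majority. Going to 5 is what actually buys you a second tolerated failure.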


A great and very valuable conference – discussion with peers added a whole host of value to the proceedings and to see someone using terraform tear down and bring up a whole region of machines (albeit small) in seconds was astounding and certainly something I’ll take away with me as we start our journey at work into the cloud.

A lot of the content for me was a repetition of things I was already looking at or already aware of, though it certainly helped solidify in me that our approach and goals were the correct ones.  I shall definitely be recommending that one of my colleagues attend next year.

Switching the client side build library in the Visual Studio 2013 MVC template to gulp and bower


A lot of people use Mads Kristensen’s absolutely awesome Web Essentials plugin for Visual Studio – we use it for less compilation, and bundling of our less/js.  It does however fall down when you need to use it in a continuous integration context, so we find that we keep the compiled/bundled output in our repository.

On top of that, in the next release of Visual Studio, gulp/grunt/bower are becoming first class citizens, supported out of the box.

Scott Hanselman’s point in that post is a valid one – nuget is a superb addition to the .net ecosystem, and compared to the dark days of ‘download a DLL from somewhere and hope’, it has revolutionised .net development.  But there are other, arguably far better, and certainly far richer ecosystems out there for client side build, which on the one hand is absolutely awesome (npm is easy to build for and publish modules to), and on the other hand, daunting (I counted at least 15 modules that would simply minify my css for me).  Thankfully, the community talks/blogs a lot about this, so finding commonly used packages is as easy as reading from a number of sources and seeing which one comes out on top.

Microsoft are to be applauded for taking this approach and opening up the pipeline in this way.  Add their recent open sourcing of the .net CLR and the potential promise of a reliable .net on linux via vNext, and it’s a great time to be a .net dev.

All code for this example post is available at https://github.com/terrybrown/node-npm-gulp-bower-visual-studio

What is Gulp?

I won’t go into detail, as many other posts cover it well.  Essentially, it is a streaming build system written in node that allows people to create tasks and build up a pipeline of activities such as transforming less, copying files, validating javascript, testing, etc.  It is a more recent addition to the market (grunt, a tool with similar aims though a different approach, is another in the same arena).
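To make the ‘streaming’ part a little more concrete, here is a minimal sketch using nothing but node’s built-in stream module (no gulp involved).  The squash and step helpers and the site.css contents are made up for illustration, and it assumes a more recent node than 0.12 for the simplified Transform constructor.  Real gulp plugins pass richer file objects around, but piping one step into the next is the same idea.

```javascript
var stream = require('stream');

// Hypothetical stand-in for a plugin's work: strip CSS comments and collapse whitespace.
function squash(css) {
  return css.replace(/\/\*[\s\S]*?\*\//g, '').replace(/\s+/g, ' ').trim();
}

// Wrap a plain function as an object-mode Transform, so it can sit in a pipe chain.
function step(fn) {
  return new stream.Transform({
    objectMode: true,
    transform: function (file, encoding, done) {
      done(null, { path: file.path, contents: fn(file.contents) });
    }
  });
}

// The source of 'files' for the pipeline (gulp.src plays this role in a gulpfile).
var source = new stream.PassThrough({ objectMode: true });

source
  .pipe(step(squash))
  .pipe(step(function (css) { return css + '\n'; }))
  .on('data', function (file) {
    console.log(file.path + ': ' + JSON.stringify(file.contents));
  });

// Push a single made-up file through the pipeline and close it.
source.end({ path: 'site.css', contents: 'body {  /* spacing */  padding-top: 50px;  }' });
```

Each step only knows about the file handed to it, which is what makes it easy to add, remove or reorder steps without touching the others.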

What is Bower?

Essentially, a package manager for front end libraries (be they javascript, css, etc.) – think of it, at a rudimentary level, as nuget for client libraries.  There is a very good short video on egghead.io.

Holy wars solved early – Gulp vs Grunt

Clever people have written about this.  I personally prefer the streams approach and the code over configuration driven nature of gulp over the ‘temp file all the things’ and config based approach of grunt.

Getting set up – local dev machine + Visual Studio

The machine needs to be running node and gulp (gulp needs to be installed globally).

Node has just hit v0.12, which has a number of updates (not least the move to streams3 and away from the somewhat interesting streams2).

node --version

Will confirm which version of node you’re running.  You don’t need the latest version, though the update in 0.12 has been a long time coming.

Setting up gulp/bower

npm install gulp -g
gulp --version
npm install bower -g
bower --version

A few Visual Studio extensions make this workflow smoother:

TRX – Task Runner Explorer: This will give you a custom task runner for gulp within visual studio.

NPM/Bower Package Intellisense: Who doesn’t like intellisense right?

Grunt Launcher: Not ideally named, but a great little add on to give you right click support for gulp/bower and grunt.

You may also want to follow the steps in http://madskristensen.net/post/grunt-and-gulp-intellisense-in-visual-studio-2013 to get full intellisense.

Note: Switch off build in web essentials (it’s being used purely for intellisense)

File > New Project – and a tidy up

We want to hand over all JS and CSS handling to gulp.  This includes bundling and minification, as well as hinting/linting. We’ll start with the default MVC template from Visual Studio as the basis of our work.

Remove asp.net bundling/optimization

In the current template for MVC sites, Microsoft provide a handy bundling mechanism that although fine for smaller sites, still maintains the same problems as above and doesn’t give you separate control over your ‘distribution’ JS/CSS.  We’ll remove:

Microsoft.AspNet.Web.Optimization (and dependencies WebGrease, Antlr, Newtonsoft.Json)

This will also involve a few changes to web.config and the codebase (see https://github.com/terrybrown/node-npm-gulp-bower-visual-studio/commit/5cfb58b8e57faa4c518a067fa473d740e43725a3)

Remove client side libraries (we’ll replace these later)

  • bootstrap 3 (bower: bootstrap)
  • jquery (bower: jquery)
  • jquery validation (bower: jquery-validation)
  • jquery unobtrusive validation (bower: jquery-validation-unobtrusive)
  • modernizr (bower: modernizr)
  • RespondJS (bower: respond)

Setting up Bower

bower init

This will lead you through a number of questions (accept defaults throughout for now, though you can read up on the options here)

You will end up with a bower.json file that will look something like:


Re-installing javascript and css dependencies

Take all of the package references above that we removed (the bower versions) and run the following on the command line:

bower install bootstrap jquery jquery-validation jquery-validation-unobtrusive modernizr respond --save

Do NOT forget the ‘--save’ flag at the end – this will ensure that your bower.json is updated with the local dependencies.

This will start the download and install, and you will end up with a new folder in your solution called ‘bower_components’, which contains all of the local dependencies.  Ensure you add this folder to your .gitignore (or source control ignore file of choice).

As a temporary step, switch to visual studio – add the ‘bower_components’ folder to your solution, and re-map all of your js/css files from the default template to the newly downloaded versions.


Setting up the build with Gulp

Firstly, we need to get this local solution ready to receive npm packages as dependencies (gulp + the other supplemental libraries we’ll be using are available via npm).

npm init

Again, accept all of the defaults really, or whatever you fancy in each field.

The examples from here down will be somewhat contrived – your own use case can dictate what you do at each step here, but for the purposes of example, what we want to achieve is:

  • Deliver all jquery and jquery validation libraries into a single request
  • Deliver bootstrap and respond as a single request
  • Create a basic more modularised structure for our CSS using less and then concatting/minifying as part of the build

In our real use cases at work, our needs are far more complex, but the above will serve as an example for this post.

Setting up a default ‘gulpfile.js’.

var gulp = require('gulp');

// define tasks here
gulp.task('default', function(){
  // run tasks here
  // set up watch handlers here
});

You can name and chain tasks in gulp really easily – each one can act independently or as part of an overall build process, and TIMTOWTDI (always) – what I’ll put forward here is the version that felt easiest to read/maintain/understand.

Deliver multiple vendor libraries into a single request

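var gulp = require('gulp');
var del = require('del');
var concat = require('gulp-concat');

var outputLocation = 'dist';

// empty the output folder before each build
gulp.task('clean', function () {
	del.sync([outputLocation + '/**']);
});

// join the vendor scripts into a single file
gulp.task('vendor-scripts', function () {
	var vendorSources = {
		// the entries after jquery and the bundle name below are assumed; the listing was cut short here
		jquery: ['bower_components/jquery/dist/jquery.min.js',
			'bower_components/jquery-validation/dist/jquery.validate.min.js',
			'bower_components/jquery-validation-unobtrusive/jquery.validate.unobtrusive.min.js']
	};

	return gulp.src(vendorSources.jquery)
		.pipe(concat('jquery.bundle.min.js'))
		.pipe(gulp.dest(outputLocation + '/scripts/'));
});

gulp.task('default', ['clean', 'vendor-scripts'], function(){});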

Ok, there are a number of things in here – key points:

  1. Read from the bottom up – if you issue a straight ‘gulp’ command on the command line, you will always run the ‘default’ task.  In this case, it doesn’t do anything itself (the empty function as the third param), but instead has chained dependencies – it’ll run ‘clean’ first, then (upon completion) the ‘vendor-scripts’ task.
  2. ‘clean’ task uses the ‘del’ npm module to clean out the output folder we will be pushing the built scripts/css to.
  3. ‘vendor-scripts’ uses the ‘gulp-concat’ npm module to simply join an array of files together (in this case, the jquery + jquery validation files)
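
The chained dependency in point 1 can be pictured with a toy model.  This is only a sketch of the idea in plain node, not how gulp works internally; the tasks registry and the runOrder function are invented for the example.

```javascript
// Toy task registry mirroring the gulpfile above: task name -> names it depends on.
var tasks = {
  'clean': [],
  'vendor-scripts': [],
  'default': ['clean', 'vendor-scripts']
};

// Depth-first walk: every dependency lands in the list before the task that needs it.
// Assumes there are no circular dependencies between tasks.
function runOrder(name, registry, order) {
  order = order || [];
  if (order.indexOf(name) !== -1) { return order; }
  registry[name].forEach(function (dep) {
    runOrder(dep, registry, order);
  });
  order.push(name);
  return order;
}

console.log(runOrder('default', tasks).join(' -> '));
```

Asking for ‘default’ here yields ‘clean’, then ‘vendor-scripts’, then ‘default’ itself, which is the order described above.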

If you switch to a command prompt window and run ‘gulp’ on its own, you will see output similar to:


And in visual studio, you will now see a hidden ‘dist’ folder there with the output of what you have just generated (remember to update your .gitignore – you do not want to commit these)

Disabling Web Essentials

Less has been our tool of choice for our CSS for some time now, and web essentials really did/does rock as a VS plugin to aid your workflow on those (nice inbuilt bundling, compilation, etc.).  That said, now that we’re moving to a more customised build process, we need to switch the compilation side of it off.

Tools > Options > Web Essentials

Switch everything in ‘Javascript’ and ‘LESS’ to false.

Deliver minified and concatenated CSS from LESS

We contrived a number of .less files in order to create the proof of concept:


// _mixins.less
@brand_light_grey_color: #EFEFEF;

.border-radius(@radius: 4px) {
	-moz-border-radius: @radius;
	-webkit-border-radius: @radius;
	border-radius: @radius;
}


@import "_mixins.less";

body {
    padding-top: 50px;
    padding-bottom: 20px;
}

/* Set padding to keep content from hitting the edges */
.body-content {
    padding-left: 15px;
    padding-right: 15px;
}

/* Override the default bootstrap behavior where horizontal description lists
   will truncate terms that are too long to fit in the left column */
.dl-horizontal dt {
    white-space: normal;
}

div.rounded {
    // assumed: the original declarations were lost; this applies the mixin above
    .border-radius();
}


@import "_mixins.less";

/* Set width on the form input elements since they're 100% wide by default */
textarea {
    max-width: 280px;
}

Nothing complex, though it’ll let us at least build a workflow around them.

There are a couple of key tasks we want to perform here:

  1. Grab all less files and compile them over to css
  2. Compress that css
  3. Push them all into a single file in our dist folder

Thankfully, the ‘gulp-less’ plugin performs the first two tasks, and we have already achieved the other for our JS so it’s just a repeat of those steps.

Integration into Visual Studio and tying it all together

We now have a basic working build that we can add to as and when our process demands – node and the node package manager (npm) have a massive ecosystem of libraries to support all sorts of tasks (generally, gulp- prefixed for gulp related build tasks), so you can start to build from this point forward.

Key thing now is tying this workflow into Visual Studio, and this is where the cool happens.  The Task Runner Explorer gives us a lot of extensibility points.


Each of these tasks/sub-tasks can be right clicked and run just as you would from the command line, but you also have a nice option to ‘bind’ certain actions in Visual Studio to steps within your gulp build.



In this instance, we have bound our ‘clean’ gulp task to a ‘clean solution’ within visual studio.

Tying it all together – watching the solution

Web essentials was awesome at monitoring your work real time and updating bundled files (both less and js) into their respective outputs, but thankfully, gulp comes to the rescue in the guise of ‘gulp-watch’ – this is a highly configurable module that allows you to perform actions on changes to files.

Thankfully, now that we have all of the other tasks, the watch workflow is simply a matter of matching up targets to watch, and tasks to run when things happen to those targets.

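var watch = require('gulp-watch'); // not used below: gulp.watch is gulp's own built-in watcher

gulp.task('watch', function () {
	// rebuild the vendor bundles when a bower package changes
	gulp.watch('bower_components/**/*', ['vendor-scripts', 'vendor-css']);
	// recompile the css when any less file changes
	gulp.watch('Content/**/*.less', ['css']);
});

gulp.task('default', ['clean', 'vendor-scripts', 'vendor-css', 'css', 'watch'], function(){});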

Once we have that, we can go back to the task runner explorer, right click the ‘watch’ task, and set it to run on solution open.

We now have our solution in watch mode permanently, and any changes to our less or the vendor scripts will trigger the appropriate tasks.

What’s next?

We’ve solved the problem (compiled css/js needing to be in our repo with web essentials), so the next steps really are incorporating this gulp build task into our CI server (TeamCity), though we’ll leave that for a follow up post.

Now that we have a whole set of automation going, we may as well re-introduce linting/hinting of our less and javascript too – some configuration will be needed here to ensure we’re happy with the outcomes, but it is fundamentally the ‘right thing to do’.

Testing our JS workflow is the next natural step, and there are plenty of gulp+other task runners to sit within this workflow that will let you validate your scripts either at build time or at save.