Explaining Modern Data Stack (just a few notes)

The aim of this page📝 is to help answer the question “What is the Modern Data Stack?” (MDS) by summarizing and enriching what Benn Stancil had to say on the Changelog Podcast: https://changelog.com/podcast/600

Pavol Kutaj
Jul 31, 2024
  • First of all, MDS does not have a clear definition and does not stem from a single author
  • This is in contrast to the CDP (customer data platform), another important category, which was defined by David Raab in 2013: https://customerexperiencematrix.blogspot.com/2013/04/ive-discovered-new-class-of-system.html
  • As for MDS, the most accepted definition could be that it is a consensus on the philosophy and principles of working with data that emerged in the 2010s
  • The CDP mentioned above is itself a part of MDS, next to BI tools (Mode), event tracking (Snowplow), data warehousing (Snowflake), data lakes (S3) and workflow orchestration (Airflow); mParticle and Segment are examples of CDPs. There are around 30 subcategories listed at https://www.moderndatastack.xyz/categories
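
To make the "many small categories" idea concrete, here is a minimal sketch in Python. It only restates the categories and example tools named above as a lookup table; the names `MDS_CATEGORIES`, `categories_for` and `swap_tool` are made up for illustration and do not come from the podcast or from moderndatastack.xyz.

```python
# Category -> example tools, copied from the notes above (not exhaustive;
# the full catalogue lists around 30 subcategories).
MDS_CATEGORIES = {
    "BI tools": ["Mode"],
    "event tracking": ["Snowplow"],
    "data warehousing": ["Snowflake"],
    "CDP": ["mParticle", "Segment"],
    "data lake": ["S3"],
    "workflow orchestration": ["Airflow"],
}


def categories_for(tool: str) -> list[str]:
    """Return every category in which the given tool is listed."""
    return [cat for cat, tools in MDS_CATEGORIES.items() if tool in tools]


def swap_tool(stack: dict[str, str], category: str, new_tool: str) -> dict[str, str]:
    """Return a copy of a stack (category -> chosen tool) with one tool replaced.

    Hypothetical helper: it only illustrates that, in a modular stack,
    changing one category leaves the other choices untouched.
    """
    if new_tool not in MDS_CATEGORIES.get(category, []):
        raise ValueError(f"{new_tool!r} is not listed under {category!r}")
    return {**stack, category: new_tool}


# One possible stack: pick a single tool per category.
my_stack = {cat: tools[0] for cat, tools in MDS_CATEGORIES.items()}
my_stack_v2 = swap_tool(my_stack, "CDP", "Segment")
```

The point of the sketch is only the shape of the data: each category is filled independently, which is the modularity described under Principles.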

Principles

  • Cloud-first
  • Modular, “micro-servicy”, not behemoth-type suites. MDS is anti-Oracle.
  • The space has lots of products for narrow verticals: moving data between tools, data visualization, BI for one specific purpose
  • Historical period: there was a time when data was a hot VC space, so there was a pull to create companies in it. Mode predated this fad (2017–2021).
  • Internal data tools could be financed (the story of Preset is highly illustrative)
  • The companies from that period are all in the same cohort; “modern data stack” refers to that cohort.

Challenges

  • But the future may not be as bright, because:
  1. AI is the latest hot thing, so MDS is not as attractive as it was in the pre-ChatGPT era
  2. “Rent-seeking behaviour” (cloud) may become less popular as data volumes grow, storage and compute costs increase, and reliance on 3rd parties becomes more of an issue. That makes MDS harder to adopt for true enterprise-sized businesses, as you’re “paying rent” to 3rd-party providers for things that can be achieved massively cheaper in-house. Basecamp is the most prominent example: it moved off AWS and GCP entirely, saving millions of dollars (https://basecamp.com/cloud-exit)

Pavol Kutaj

Today I Learnt | Infrastructure Support Engineer at snowplow.io with a passion for cloud infrastructure/terraform/python/docs. More at https://pavol.kutaj.com