<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Yak Shaving]]></title><description><![CDATA[Thoughts at the intersection of software architecture, strategy and working in Tech.]]></description><link>https://www.siddharthsarda.com</link><image><url>https://www.siddharthsarda.com/img/substack.png</url><title>Yak Shaving</title><link>https://www.siddharthsarda.com</link></image><generator>Substack</generator><lastBuildDate>Fri, 17 Apr 2026 23:40:35 GMT</lastBuildDate><atom:link href="https://www.siddharthsarda.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Siddharth Sarda]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[yakshaving@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[yakshaving@substack.com]]></itunes:email><itunes:name><![CDATA[Siddharth Sarda]]></itunes:name></itunes:owner><itunes:author><![CDATA[Siddharth Sarda]]></itunes:author><googleplay:owner><![CDATA[yakshaving@substack.com]]></googleplay:owner><googleplay:email><![CDATA[yakshaving@substack.com]]></googleplay:email><googleplay:author><![CDATA[Siddharth Sarda]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Why do we need a bias parameter in neural networks?]]></title><description><![CDATA[Changing the weight of the function changes the steepness of the sigmoid, but sometimes we need to be able to shift the function, rightwards or leftwards.]]></description><link>https://www.siddharthsarda.com/p/why-do-we-need-a-bias-parameter-in</link><guid isPermaLink="false">https://www.siddharthsarda.com/p/why-do-we-need-a-bias-parameter-in</guid><dc:creator><![CDATA[Siddharth Sarda]]></dc:creator><pubDate>Tue, 26 Dec 
2023 15:34:00 GMT</pubDate><content:encoded><![CDATA[<p>Changing the weight of the function changes the steepness of the sigmoid, but sometimes we need to be able to shift the function, rightwards or leftwards. For this we need the bias parameter.</p><p>The problem without a bias parameter, visualised in a Stack Overflow <a href="https://stackoverflow.com/questions/26399233/clarification-on-bias-of-a-perceptron/26607180#26607180">answer</a>:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DiUK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b7a15e-f8c8-49d9-9d0c-a976516b214c_657x280.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DiUK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b7a15e-f8c8-49d9-9d0c-a976516b214c_657x280.png 424w, https://substackcdn.com/image/fetch/$s_!DiUK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b7a15e-f8c8-49d9-9d0c-a976516b214c_657x280.png 848w, https://substackcdn.com/image/fetch/$s_!DiUK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b7a15e-f8c8-49d9-9d0c-a976516b214c_657x280.png 1272w, https://substackcdn.com/image/fetch/$s_!DiUK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b7a15e-f8c8-49d9-9d0c-a976516b214c_657x280.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!DiUK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b7a15e-f8c8-49d9-9d0c-a976516b214c_657x280.png" width="657" height="280" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e2b7a15e-f8c8-49d9-9d0c-a976516b214c_657x280.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:280,&quot;width&quot;:657,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24805,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DiUK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b7a15e-f8c8-49d9-9d0c-a976516b214c_657x280.png 424w, https://substackcdn.com/image/fetch/$s_!DiUK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b7a15e-f8c8-49d9-9d0c-a976516b214c_657x280.png 848w, https://substackcdn.com/image/fetch/$s_!DiUK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b7a15e-f8c8-49d9-9d0c-a976516b214c_657x280.png 1272w, https://substackcdn.com/image/fetch/$s_!DiUK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2b7a15e-f8c8-49d9-9d0c-a976516b214c_657x280.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Another <a href="https://stackoverflow.com/questions/2480650/what-is-the-role-of-the-bias-in-neural-networks">answer</a>.</p><p>Note: Instead of intending to write long posts and not writing anything, I will now intend to write more &#8220;Bangladeshi train station&#8221;-like posts, à la Tyler Cowen.</p>]]></content:encoded></item><item><title><![CDATA[Service Health and Health Checks]]></title><description><![CDATA[We were talking about health check endpoints (/health) for a service in our team standup yesterday. These checks are usually used by some kind of work producer (a load balancer, for example), and servers are taken out of service if the checks fail. As always, there were roughly two camps in thinking about health checks. 
At one end, people were fine with a simple health check that returns true. At the other end was the thinking that we should do more complex checks, for example checking dependencies or running more comprehensive integration tests.]]></description><link>https://www.siddharthsarda.com/p/service-health-and-health-checks</link><guid isPermaLink="false">https://www.siddharthsarda.com/p/service-health-and-health-checks</guid><dc:creator><![CDATA[Siddharth Sarda]]></dc:creator><pubDate>Thu, 13 May 2021 20:23:16 GMT</pubDate><enclosure url="https://cdn.substack.com/image/fetch/h_600,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9a692200-bf83-40dd-96e0-8cdfc42e6b39_2204x864.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We were talking about health check endpoints (/health) for a service in our team standup yesterday. These checks are usually used by some kind of work producer (a load balancer, for example), and servers are taken out of service if the checks fail. As always, there were roughly two camps in thinking about health checks. At one end, people were fine with a simple health check that returns true. At the other end was the thinking that we should do more complex checks, for example checking dependencies or running more comprehensive integration tests. </p><p>I keep oscillating between these two camps. Simple health checks that return true tell you very little about the actual health of your service and are mainly there to keep the load balancer happy. More complex checks might result in a situation where a temporary blip in a dependency ends with all your nodes being taken out of the load balancer. 
</p><p>Or something like this happening.</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://twitter.com/cba/status/1026163079107686401&quot;,&quot;full_text&quot;:&quot;<span class=\&quot;tweet-fake-link\&quot;>@copyconstruct</span> <span class=\&quot;tweet-fake-link\&quot;>@lizthegrey</span> One of the larger outages at <span class=\&quot;tweet-fake-link\&quot;>@TwitterEng</span> I ever saw was kinda the opposite. The scheduler took down a whole cluster because the health check endpoint broke even though the services were otherwise fine &#128565;&quot;,&quot;username&quot;:&quot;cba&quot;,&quot;name&quot;:&quot;chandler&quot;,&quot;profile_image_url&quot;:&quot;&quot;,&quot;date&quot;:&quot;Sun Aug 05 17:48:37 +0000 2018&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:0,&quot;like_count&quot;:8,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:{},&quot;video_url&quot;:null,&quot;belowTheFold&quot;:false}" data-component-name="Twitter2ToDOM"></div><p>These are the kinds of outages that stay with you. Thankfully, some load balancers implement a special strategy called <em>fail-open</em> to deal with the case when all servers start failing health checks. This is from the <a href="https://aws.amazon.com/builders-library/implementing-health-checks/">Amazon Builders' Library</a>:</p><blockquote><p><em>For example, the AWS Network Load Balancer fails open if no servers are reporting as healthy. It also fails out of unhealthy Availability Zones if all servers in an Availability Zone reports unhealthy.  Our Application Load Balancer also supports fail open, as does Amazon Route 53. </em></p></blockquote><p>Anyway, there are two issues with thinking about health checks in this binary way.</p><ul><li><p>The health of a service lies on a spectrum. </p></li></ul><ul><li><p>The perception of health needs to include the client-side perspective. 
</p></li></ul><h3>Health of a service is not binary</h3><p>This insight is taken almost verbatim from Cindy Sridharan's <a href="https://copyconstruct.medium.com/health-checks-in-distributed-systems-aa8a0e8c1672">lovely essay</a>. From the essay:</p><blockquote><p><em>The &#8220;health&#8221; of a process is a spectrum. What we&#8217;re really interested in is the quality-of-service &#8212; such as how long it takes for a process to return the result of a given unit of work and the accuracy of the result.</em></p></blockquote><p>The essay goes a little beyond health checks and makes some very interesting points. The key takeaways for me:</p><ul><li><p>For the layer that decides whether to send traffic to a particular node, it's more interesting to understand the node's ability to handle the work being sent its way than simply whether it's up.</p></li></ul><ul><li><p>A large percentage of outages can be avoided by using various graceful degradation techniques. This involves creating some mechanism for applying back pressure to signal that the service is overloaded. One such example is <a href="https://eng.uber.com/qalm-qos-load-management-framework/">Qalm from Uber</a>. </p></li></ul><ul><li><p>Back pressure needs to be propagated all the way up the call chain; if it's not, there will be some degree of queuing at some component of the ecosystem.</p></li></ul><p>So yes, building some back pressure into your normal workflow is one way to indicate distress without making a bad situation worse.</p><h3>Health lies in the eye of the client</h3><p>From Steve Yegge's famous <a href="https://gist.github.com/chitchcock/1281611">platform</a> rant:</p><blockquote><p><em>monitoring and QA are the same thing. You'd never think so until you try doing a big SOA. 
But when your service says "oh yes, I'm fine", it may well be the case that the only thing still functioning in the server is the little component that knows how to say "I'm fine, roger roger, over and out" in a cheery droid voice. In order to tell whether the service is actually responding, you have to make individual calls. The problem continues recursively until your monitoring is doing comprehensive semantics checking of your entire range of services and data, at which point it's indistinguishable from automated QA. So they're a continuum.</em></p></blockquote><p>If the perception of a service's health is completely internal it will miss a lot of outages. One of the most common things I advise developers to do when setting up SLOs for their service is to set it up from the client perspective.  A very similar point is also made in the <a href="https://www.youtube.com/watch?v=iF9NoqYBb4U">Metrics that Matter </a>talk from Google.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4h68!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9642db24-edde-4a2e-8cbc-9f05516c96c0_3571x2012.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4h68!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9642db24-edde-4a2e-8cbc-9f05516c96c0_3571x2012.png 424w, https://substackcdn.com/image/fetch/$s_!4h68!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9642db24-edde-4a2e-8cbc-9f05516c96c0_3571x2012.png 848w, 
https://substackcdn.com/image/fetch/$s_!4h68!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9642db24-edde-4a2e-8cbc-9f05516c96c0_3571x2012.png 1272w, https://substackcdn.com/image/fetch/$s_!4h68!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9642db24-edde-4a2e-8cbc-9f05516c96c0_3571x2012.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4h68!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9642db24-edde-4a2e-8cbc-9f05516c96c0_3571x2012.png" width="1456" height="820" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/9642db24-edde-4a2e-8cbc-9f05516c96c0_3571x2012.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:820,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1176316,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4h68!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9642db24-edde-4a2e-8cbc-9f05516c96c0_3571x2012.png 424w, https://substackcdn.com/image/fetch/$s_!4h68!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9642db24-edde-4a2e-8cbc-9f05516c96c0_3571x2012.png 
848w, https://substackcdn.com/image/fetch/$s_!4h68!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9642db24-edde-4a2e-8cbc-9f05516c96c0_3571x2012.png 1272w, https://substackcdn.com/image/fetch/$s_!4h68!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9642db24-edde-4a2e-8cbc-9f05516c96c0_3571x2012.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you are to take away anything from this, remember to instrument your 
metrics from the client side; those are the ones that really matter.</p><h3>Measuring Service Health</h3><p>So, how do we think about service health and health checks? The <a href="https://aws.amazon.com/builders-library/implementing-health-checks/">Amazon Builders' Library</a> again gives us a great framework for this. They define four kinds of health checks:</p><p><strong>Liveness Checks</strong>: These test for basic connectivity and the presence of a server process.</p><p><strong>Local Health Checks</strong>: These check resources that are not shared with the server's peers (so, no dependency checks). Examples include checking the ability to do disk I/O and checking that supporting daemon processes are running, since their failure puts functionality at risk in subtle, difficult-to-detect ways.</p><p><strong>Dependency Health Checks</strong>: These check the server's ability to communicate with its dependencies, and look for issues such as stale metadata consumed by the server. If any automation is built around dependency health checks, the right amount of thresholding needs to be built in to prevent the automation from taking drastic action unexpectedly. </p><p><strong>Anomaly Detection</strong>: These check whether a server is misbehaving compared to its peers. Examples include clock skew, old code running, extra reported latency, etc. </p><p>The first two are good candidates for a traditional /health endpoint. The latter two are health checks performed from a point of view external to the service. These are good candidates to monitor and to alert a human operator on. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xlIo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9a692200-bf83-40dd-96e0-8cdfc42e6b39_2204x864.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xlIo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9a692200-bf83-40dd-96e0-8cdfc42e6b39_2204x864.png 424w, https://substackcdn.com/image/fetch/$s_!xlIo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9a692200-bf83-40dd-96e0-8cdfc42e6b39_2204x864.png 848w, https://substackcdn.com/image/fetch/$s_!xlIo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9a692200-bf83-40dd-96e0-8cdfc42e6b39_2204x864.png 1272w, https://substackcdn.com/image/fetch/$s_!xlIo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9a692200-bf83-40dd-96e0-8cdfc42e6b39_2204x864.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xlIo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9a692200-bf83-40dd-96e0-8cdfc42e6b39_2204x864.png" width="1456" height="571" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/9a692200-bf83-40dd-96e0-8cdfc42e6b39_2204x864.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:571,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:213090,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xlIo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9a692200-bf83-40dd-96e0-8cdfc42e6b39_2204x864.png 424w, https://substackcdn.com/image/fetch/$s_!xlIo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9a692200-bf83-40dd-96e0-8cdfc42e6b39_2204x864.png 848w, https://substackcdn.com/image/fetch/$s_!xlIo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9a692200-bf83-40dd-96e0-8cdfc42e6b39_2204x864.png 1272w, https://substackcdn.com/image/fetch/$s_!xlIo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9a692200-bf83-40dd-96e0-8cdfc42e6b39_2204x864.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" 
stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I think that settles the debate, at least in my head, about service health checks.</p>]]></content:encoded></item><item><title><![CDATA[Time in Distributed Systems]]></title><description><![CDATA[Confusion that never stops, Closing walls and ticking clocks]]></description><link>https://www.siddharthsarda.com/p/time-in-distributed-systems</link><guid isPermaLink="false">https://www.siddharthsarda.com/p/time-in-distributed-systems</guid><dc:creator><![CDATA[Siddharth Sarda]]></dc:creator><pubDate>Sun, 11 Apr 2021 19:24:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!A4yN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F018a00c7-d3d2-4d42-8562-24b78e993686_500x546.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div 
class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!A4yN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F018a00c7-d3d2-4d42-8562-24b78e993686_500x546.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A4yN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F018a00c7-d3d2-4d42-8562-24b78e993686_500x546.jpeg 424w, https://substackcdn.com/image/fetch/$s_!A4yN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F018a00c7-d3d2-4d42-8562-24b78e993686_500x546.jpeg 848w, https://substackcdn.com/image/fetch/$s_!A4yN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F018a00c7-d3d2-4d42-8562-24b78e993686_500x546.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!A4yN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F018a00c7-d3d2-4d42-8562-24b78e993686_500x546.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!A4yN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F018a00c7-d3d2-4d42-8562-24b78e993686_500x546.jpeg" width="500" height="546" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/018a00c7-d3d2-4d42-8562-24b78e993686_500x546.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:546,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!A4yN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F018a00c7-d3d2-4d42-8562-24b78e993686_500x546.jpeg 424w, https://substackcdn.com/image/fetch/$s_!A4yN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F018a00c7-d3d2-4d42-8562-24b78e993686_500x546.jpeg 848w, https://substackcdn.com/image/fetch/$s_!A4yN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F018a00c7-d3d2-4d42-8562-24b78e993686_500x546.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!A4yN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F018a00c7-d3d2-4d42-8562-24b78e993686_500x546.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Just like Rob&#8217;s dilemma in High Fidelity, what came first in distributed systems is a hard question to answer. This is because agreeing on what time it is now across a network of computers is difficult.&nbsp;</p><p>In this post, I want to discuss the two kinds of time that are relevant when we talk about distributed systems, the characteristics of both, the different kinds of problems that they solve and their usage in real-world distributed systems.</p><p>We will discuss:</p><ul><li><p>Physical time</p></li><li><p>Logical time</p></li><li><p>How Dynamo and other Dynamo-like systems use vector clocks</p></li><li><p>How Spanner provides the highest level of consistency using the TrueTime API</p></li></ul><h2>Physical Time or Wall Clock</h2><p>The first kind of time is time as we know it, the time we get by looking at a clock. 
In most cases, a process on a computer finds the physical time using the gettimeofday() system call. Physical time on a machine is usually calculated with the help of quartz crystals that vibrate at known frequencies when electricity is applied to them. As with everything, you get what you pay for.</p><p>In an ideal world, all the machines would be perfectly synchronized since that would give us a mechanism to order events no matter what node they originated in. For a more detailed treatment of how synchronized clocks make a lot of distributed algorithms easier, <a href="https://dsf.berkeley.edu/cs286/papers/clocks-podc1991.pdf">look at this paper from Barbara Liskov</a>. Our world is far from perfect though, and clocks in different machines can gloriously go out of sync. Hence the need for clock synchronization algorithms.</p><p>Most clock synchronization algorithms work on probabilistic assumptions about clock rate and message delay. So they synchronize clocks with some skew <em>epsilon</em>, i.e. they guarantee that if c1 and c2 are clocks on two nodes of the network, the times at c1 and c2 differ by no more than epsilon with some very high probability.</p><p>The most popular synchronization algorithm is NTP, which is used to bring a computer&#8217;s clock to within a few milliseconds of UTC. 
(Sidenote: whenever you need to choose a timezone, always use <a href="http://yellerapp.com/posts/2015-01-12-the-worst-server-setup-you-can-make.html">UTC</a>.)&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!de9t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F66344bf6-05d0-4ff1-b1c5-c11b77cb7aa5_850x474.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!de9t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F66344bf6-05d0-4ff1-b1c5-c11b77cb7aa5_850x474.png 424w, https://substackcdn.com/image/fetch/$s_!de9t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F66344bf6-05d0-4ff1-b1c5-c11b77cb7aa5_850x474.png 848w, https://substackcdn.com/image/fetch/$s_!de9t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F66344bf6-05d0-4ff1-b1c5-c11b77cb7aa5_850x474.png 1272w, https://substackcdn.com/image/fetch/$s_!de9t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F66344bf6-05d0-4ff1-b1c5-c11b77cb7aa5_850x474.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!de9t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F66344bf6-05d0-4ff1-b1c5-c11b77cb7aa5_850x474.png" width="850" height="474" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/66344bf6-05d0-4ff1-b1c5-c11b77cb7aa5_850x474.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:474,&quot;width&quot;:850,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!de9t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F66344bf6-05d0-4ff1-b1c5-c11b77cb7aa5_850x474.png 424w, https://substackcdn.com/image/fetch/$s_!de9t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F66344bf6-05d0-4ff1-b1c5-c11b77cb7aa5_850x474.png 848w, https://substackcdn.com/image/fetch/$s_!de9t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F66344bf6-05d0-4ff1-b1c5-c11b77cb7aa5_850x474.png 1272w, https://substackcdn.com/image/fetch/$s_!de9t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F66344bf6-05d0-4ff1-b1c5-c11b77cb7aa5_850x474.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>NTP uses a hierarchy of clock strata. Stratum 0 consists of high-quality reference clocks such as atomic clocks or GPS receivers, and the stratum 1 time servers synchronize directly with them. NTP also has to deal with leap seconds. Leap seconds are one-second adjustments that need to be applied to UTC to make sure it doesn&#8217;t fall out of sync with observed solar time. The default is to pause the clock for a while, but smearing the leap second over 24 hours is becoming more popular and is used by both <a href="https://developers.google.com/time/smear">Google</a> and <a href="https://aws.amazon.com/blogs/aws/keeping-time-with-amazon-time-sync-service/">Amazon</a>.</p><p>Even with a functioning NTP running on your servers, the skew between clocks can range anywhere from 100 to 250 ms. 
Some protocols, such as PTP (used in the high-frequency trading world), provide tighter skews at the cost of not being internet scale. There is also Google&#8217;s TrueTime service, which we will discuss a bit later when we look at the use of physical clocks in real-world systems.&nbsp;</p><p>Getting a tighter skew costs a lot more though, and sometimes all we are concerned about is what happens within a distributed system. Enter logical time.</p><h2>Logical time and Vector Clocks</h2><p>Let&#8217;s say we want to determine the order of events within a distributed system. We can use physical clocks, but as we just discovered, physical clocks are not a panacea. So is it even possible to define what happens before what without using physical clocks?&nbsp;</p><p>This is exactly what Leslie Lamport covered in his seminal paper <a href="https://lamport.azurewebsites.net/pubs/time-clocks.pdf">Time, Clocks, and the Ordering of Events in a Distributed System.</a> Lamport defines his world of distributed systems as n nodes running and exchanging messages with each other. He then defines the happened-before relationship <em><strong>&#8216;&#8594;&#8217;</strong></em> as follows.</p><blockquote><p><em>(1) If a and b are events in the same node, and a comes before b, then a &#8594; b.&nbsp;</em></p><p><em>(2) If a is the sending of a message by one node and b is the receipt of the same message by another node, then a &#8594; b.&nbsp;</em></p><p><em>(3) If a &#8594; b and b &#8594; c then a &#8594; c.&nbsp;</em></p></blockquote><p>Two distinct events a and b are said to be concurrent if neither a &#8594; b nor b &#8594; a holds, i.e. we can&#8217;t say which of them happened first.</p><p>He then introduced the concept of a clock, where the clock is an arbitrary number assigned to an event. 
The Lamport clock condition L is as follows.</p><blockquote><p><em>If a &#8594; b then L(a) &lt; L(b).</em></p></blockquote><p>These clocks, also known as Lamport clocks, can therefore be implemented as monotonically increasing integers: a node increments its counter on every local operation and, on receiving a message, advances it past the timestamp carried by that message.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!O5Sm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4045ab2-730e-4e67-9e2a-ede11fe652df_504x206.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!O5Sm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4045ab2-730e-4e67-9e2a-ede11fe652df_504x206.png 424w, https://substackcdn.com/image/fetch/$s_!O5Sm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4045ab2-730e-4e67-9e2a-ede11fe652df_504x206.png 848w, https://substackcdn.com/image/fetch/$s_!O5Sm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4045ab2-730e-4e67-9e2a-ede11fe652df_504x206.png 1272w, https://substackcdn.com/image/fetch/$s_!O5Sm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4045ab2-730e-4e67-9e2a-ede11fe652df_504x206.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!O5Sm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4045ab2-730e-4e67-9e2a-ede11fe652df_504x206.png" width="504" height="206" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/e4045ab2-730e-4e67-9e2a-ede11fe652df_504x206.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:206,&quot;width&quot;:504,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!O5Sm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4045ab2-730e-4e67-9e2a-ede11fe652df_504x206.png 424w, https://substackcdn.com/image/fetch/$s_!O5Sm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4045ab2-730e-4e67-9e2a-ede11fe652df_504x206.png 848w, https://substackcdn.com/image/fetch/$s_!O5Sm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4045ab2-730e-4e67-9e2a-ede11fe652df_504x206.png 1272w, https://substackcdn.com/image/fetch/$s_!O5Sm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4045ab2-730e-4e67-9e2a-ede11fe652df_504x206.png 1456w" sizes="100vw" 
loading="lazy"></picture><div></div></div></a></figure></div><p>While Lamport clocks give us a way to capture potential causality, they are very limited. If L(A) &lt; L(B) then the only thing we can say for sure is that B did not cause A. They do not help us tell whether A happened before B or whether A and B were concurrent.&nbsp;</p><p>To solve this, vector clocks were introduced. They are a generalisation of Lamport clocks in which each node tracks the highest Lamport clock value it knows of for every node in the system.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AGAg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc260382-af54-4abb-a438-4e1ed0a12ac5_504x211.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AGAg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc260382-af54-4abb-a438-4e1ed0a12ac5_504x211.png 424w, https://substackcdn.com/image/fetch/$s_!AGAg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc260382-af54-4abb-a438-4e1ed0a12ac5_504x211.png 848w, https://substackcdn.com/image/fetch/$s_!AGAg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc260382-af54-4abb-a438-4e1ed0a12ac5_504x211.png 1272w, https://substackcdn.com/image/fetch/$s_!AGAg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc260382-af54-4abb-a438-4e1ed0a12ac5_504x211.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!AGAg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc260382-af54-4abb-a438-4e1ed0a12ac5_504x211.png" width="504" height="211" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/fc260382-af54-4abb-a438-4e1ed0a12ac5_504x211.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:211,&quot;width&quot;:504,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AGAg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc260382-af54-4abb-a438-4e1ed0a12ac5_504x211.png 424w, https://substackcdn.com/image/fetch/$s_!AGAg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc260382-af54-4abb-a438-4e1ed0a12ac5_504x211.png 848w, https://substackcdn.com/image/fetch/$s_!AGAg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc260382-af54-4abb-a438-4e1ed0a12ac5_504x211.png 1272w, https://substackcdn.com/image/fetch/$s_!AGAg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc260382-af54-4abb-a438-4e1ed0a12ac5_504x211.png 1456w" sizes="100vw" 
loading="lazy"></picture><div></div></div></a></figure></div><p>Checking the happened-before relationship between two events x and y translates to checking whether each entry in the vector of x is smaller than or equal to the corresponding entry in the vector of y, with at least one entry strictly smaller. If that holds in neither direction, the events are concurrent.&nbsp;</p><p>The authors of this <a href="https://queue.acm.org/detail.cfm?id=2917756">article</a> provide another way of thinking about vector clocks in terms of tracking causal histories.&nbsp;</p><blockquote><p><em>Causality can be tracked in a very simple way by using causal histories. The system can locally assign unique names to each event (e.g., node name and local increasing counter) and collect and transmit sets of events to capture the known past.</em></p><p><em>For a new event, the system creates a new unique name, and the causal history consists of the union of this name and the causal history of the previous event in the node. For example, the second event in node C is assigned the name c2, and its causal history is Hc = {c1, c2}(shown in figure 2). When a node sends a message, the causal history of the send event is sent with the message. 
When the message is received, the remote causal history is merged (by set union) with the local history.</em></p></blockquote><p>In our example, here&#8217;s what the tracking of causal histories would look like.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!maza!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5710bff1-c81e-469f-9563-6d0de4123e44_551x231.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!maza!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5710bff1-c81e-469f-9563-6d0de4123e44_551x231.png 424w, https://substackcdn.com/image/fetch/$s_!maza!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5710bff1-c81e-469f-9563-6d0de4123e44_551x231.png 848w, https://substackcdn.com/image/fetch/$s_!maza!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5710bff1-c81e-469f-9563-6d0de4123e44_551x231.png 1272w, https://substackcdn.com/image/fetch/$s_!maza!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5710bff1-c81e-469f-9563-6d0de4123e44_551x231.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!maza!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5710bff1-c81e-469f-9563-6d0de4123e44_551x231.png" width="551" height="231" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/5710bff1-c81e-469f-9563-6d0de4123e44_551x231.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:231,&quot;width&quot;:551,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!maza!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5710bff1-c81e-469f-9563-6d0de4123e44_551x231.png 424w, https://substackcdn.com/image/fetch/$s_!maza!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5710bff1-c81e-469f-9563-6d0de4123e44_551x231.png 848w, https://substackcdn.com/image/fetch/$s_!maza!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5710bff1-c81e-469f-9563-6d0de4123e44_551x231.png 1272w, https://substackcdn.com/image/fetch/$s_!maza!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5710bff1-c81e-469f-9563-6d0de4123e44_551x231.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The astute reader might notice that Vector Clocks are essentially a compact encoding of causal histories.</p><p>This <a href="https://queue.acm.org/detail.cfm?id=2917756">article</a> provides a thorough treatment of vector clocks if you are 
interested in looking into them more.</p><p>I should mention there is a huge class of algorithms called consensus algorithms, such as Paxos, Viewstamped Replication, Zab/Zookeeper, and Raft. They also provide ways of defining an ordering of events across a distributed system even though physical time cannot safely be used for that purpose. I intend to discuss these in a future blog post.</p><p>Vector clocks are great if all we want to do is track the order of events within a system, but the moment we want to introduce an observer outside the confines of the distributed system (for example a client), the limitations of vector clocks become quite apparent. Katy Perry captures the limitations of vector clocks very well in the following lyrics.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T-0N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8ab343d-f23f-437e-a5d2-9fcc7c552c4b_1600x887.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T-0N!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8ab343d-f23f-437e-a5d2-9fcc7c552c4b_1600x887.png 424w, https://substackcdn.com/image/fetch/$s_!T-0N!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8ab343d-f23f-437e-a5d2-9fcc7c552c4b_1600x887.png 848w, https://substackcdn.com/image/fetch/$s_!T-0N!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8ab343d-f23f-437e-a5d2-9fcc7c552c4b_1600x887.png 1272w, 
https://substackcdn.com/image/fetch/$s_!T-0N!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8ab343d-f23f-437e-a5d2-9fcc7c552c4b_1600x887.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T-0N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8ab343d-f23f-437e-a5d2-9fcc7c552c4b_1600x887.png" width="1456" height="807" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/a8ab343d-f23f-437e-a5d2-9fcc7c552c4b_1600x887.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:807,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T-0N!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8ab343d-f23f-437e-a5d2-9fcc7c552c4b_1600x887.png 424w, https://substackcdn.com/image/fetch/$s_!T-0N!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8ab343d-f23f-437e-a5d2-9fcc7c552c4b_1600x887.png 848w, https://substackcdn.com/image/fetch/$s_!T-0N!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8ab343d-f23f-437e-a5d2-9fcc7c552c4b_1600x887.png 1272w, 
https://substackcdn.com/image/fetch/$s_!T-0N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8ab343d-f23f-437e-a5d2-9fcc7c552c4b_1600x887.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There are things which only make sense in the realm of physical time. One such example is the concept of failure of a node. Typically we say a node has &#8216;crashed&#8217; when the client has been waiting too long for a response. Too long is a concept that only exists in physical time. 
Without physical time it&#8217;s impossible to say whether a node is dead or merely pausing before doing its next thing.&nbsp;</p><p>Another concept where physical time is necessary is <strong>external consistency</strong>, which comes up when talking about transactions in a database.</p><blockquote><p><em>&#8220;A violation of external consistency occurs when the ordering of operations inside a system does not agree with the order a user expects.&#8221;</em></p></blockquote><p>In other words, external consistency says that for two transactions T1 and T2</p><blockquote><p><em>if T2 starts to commit after T1 finishes committing, then the timestamp for T2 is greater than the timestamp for T1.</em></p></blockquote><p>External consistency is a tighter guarantee than the C promised in CAP because it&#8217;s defined for transactions, whereas the C in CAP is equivalent to linearizability, which is defined only for a simple read-write register and not for multiple objects.&nbsp;</p><p>So, we understand physical time and the problems associated with it. We understand vector clocks and their limited applicability to certain problems. Next I want to look at how real-world systems use both these concepts and the implications that they bring.</p><h4>Use of Vector Clocks in Dynamo</h4><p><a href="https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf">This paper </a>was published by Amazon in 2007. It described an unapologetically eventually consistent key-value store used internally within Amazon. Amazon later released DynamoDB, a managed data service that builds on ideas from Dynamo. The paper inspired a lot of Dynamo-style systems such as Riak and Cassandra and is a seminal work in the NoSQL space. 
I strongly recommend you read it.&nbsp;</p><p>For concurrent updates, Dynamo, as described in the paper, offers two reconciliation mechanisms.&nbsp;</p><ul><li><p>Client-specific reconciliation, where the responsibility for resolving conflicting versions is passed to the client application when they are detected.</p></li><li><p>Last-write-wins reconciliation, where the object with the largest physical timestamp is retained as the latest version.</p></li></ul><p>Client-specific reconciliation uses vector clocks to detect when divergent versions of an object exist.&nbsp;</p><blockquote><p><em>&#8220;Dynamo uses vector clocks in order to capture causality between different versions of the same object. A vector clock is effectively a list of (node, counter) pairs. One vector clock is associated with every version of every object. One can determine whether two versions of an object are on parallel branches or have a causal ordering, by examine their vector clocks. If the counters on the first object&#8217;s clock are less-than-or-equal to all of the nodes in the second clock, then the first is an ancestor of the second and can be forgotten. Otherwise, the two changes are considered to be in conflict and require reconciliation. In Dynamo, when a client wishes to update an object, it must specify which version it is updating. This is done by passing the context it obtained from an earlier read operation, which contains the vector clock information. Upon processing a read request, if Dynamo has access to multiple branches that cannot be syntactically reconciled, it will return all the objects at the leaves, with the corresponding version information in the context. 
An update using this context is considered to have reconciled the divergent versions and the branches are collapsed into a single new version&#8221;</em></p></blockquote><p>The diagram below shows an example of how this might work.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BKvF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd789f2d3-32b1-4e32-a70c-a7ad2627f208_1600x1500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BKvF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd789f2d3-32b1-4e32-a70c-a7ad2627f208_1600x1500.png 424w, https://substackcdn.com/image/fetch/$s_!BKvF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd789f2d3-32b1-4e32-a70c-a7ad2627f208_1600x1500.png 848w, https://substackcdn.com/image/fetch/$s_!BKvF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd789f2d3-32b1-4e32-a70c-a7ad2627f208_1600x1500.png 1272w, https://substackcdn.com/image/fetch/$s_!BKvF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd789f2d3-32b1-4e32-a70c-a7ad2627f208_1600x1500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BKvF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd789f2d3-32b1-4e32-a70c-a7ad2627f208_1600x1500.png" width="1456" 
height="1365" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/d789f2d3-32b1-4e32-a70c-a7ad2627f208_1600x1500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1365,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BKvF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd789f2d3-32b1-4e32-a70c-a7ad2627f208_1600x1500.png 424w, https://substackcdn.com/image/fetch/$s_!BKvF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd789f2d3-32b1-4e32-a70c-a7ad2627f208_1600x1500.png 848w, https://substackcdn.com/image/fetch/$s_!BKvF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd789f2d3-32b1-4e32-a70c-a7ad2627f208_1600x1500.png 1272w, https://substackcdn.com/image/fetch/$s_!BKvF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd789f2d3-32b1-4e32-a70c-a7ad2627f208_1600x1500.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This concept was also borrowed in other dynamo style systems, most notably <a href="https://riak.com/why-vector-clocks-are-easy/">Riak</a>. 
The problem with using vector clocks is that you might end up with a very large number of concurrent versions (or siblings, as Riak calls them) that you then need to trim. Critics argue that this is unnecessarily complicated.</p><p>The publicly available version of DynamoDB only seems to have the last-writer-wins reconciliation mechanism, so it seems like the critics (<a href="https://www.datastax.com/blog/why-cassandra-doesnt-need-vector-clocks">most notably Cassandra</a>) can feel <a href="https://news.ycombinator.com/item?id=3529476">justified in their opinions</a>.&nbsp;</p><h4>Spanner and TrueTime</h4><p><a href="https://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf">Spanner</a> is Google&#8217;s highly scalable, globally replicated database. It&#8217;s unique in providing external consistency guarantees at a global scale. One of the key aspects of its design is the TrueTime API, which explicitly exposes the uncertainty around time.&nbsp;</p><p>The TrueTime API has 3 methods:</p><blockquote><p><em><strong>TT.now</strong>: This returns an interval [min, max] in which the absolute time at which this function was invoked truly lies.</em></p><p><em><strong>TT.after(t)</strong>: Returns true if the time is definitely after t.</em></p><p><em><strong>TT.before(t)</strong>: Returns true if time is definitely before t.</em></p></blockquote><p>The TrueTime API provides much tighter guarantees around the uncertainty of time than NTP does. The uncertainty in TrueTime is around 7 ms, whereas the equivalent for NTP has an upper bound of 250 ms. TrueTime achieves its tight time bounds by using GPS and atomic clocks as its underlying time sources, both of which provide highly accurate time.</p><p>With the uncertainty of time being so tight, Spanner achieves external consistency with a very simple trick which it calls commit wait. 
Let&#8217;s revisit the definition of external consistency:</p><blockquote><p><em>if T2 starts to commit after T1 finishes committing, then the timestamp for T2 is greater than the timestamp for T1.</em></p></blockquote><p>In Spanner, for every transaction, a node assigns a commit timestamp to the transaction, which is the upper bound of the interval returned by the TT.now() call. Before a node is allowed to communicate that a transaction has been committed, it must wait out the uncertainty, typically around 7 ms, which guarantees that we are now definitely past the commit timestamp assigned to the transaction.&nbsp;</p><p>Because of this wait, the &#8220;true&#8221; start time of any transaction T2 that starts to commit after the original transaction T1 has finished committing will be after the commit timestamp assigned to T1. The timestamp assigned to T2 is the upper bound of its own TT.now() interval, so it will be greater than or equal to this &#8220;true&#8221; time, and therefore greater than the timestamp of T1. Voila! We have external consistency. (It took me a while to wrap my head around this).&nbsp;</p><p>There is another system, CockroachDB, which uses <a href="https://www.cockroachlabs.com/blog/living-without-atomic-clocks/">this concept of uncertainty of time</a>, but on commodity hardware, to provide <a href="https://www.cockroachlabs.com/blog/consistency-model/">consistency guarantees</a> which are second only to Spanner. From the blog:</p><blockquote><p><em>When CockroachDB starts a transaction, it chooses a provisional commit timestamp based on the current node's wall time. It also establishes an upper bound on the selected wall time by adding the maximum clock offset for the cluster [commit timestamp, commit timestamp + maximum clock offset]. This time interval represents the window of uncertainty.</em></p><p><em>As the transaction reads data from various nodes, it proceeds without difficulty so long as it doesn't encounter a key written within this interval. 
If the transaction encounters a value at a timestamp below its provisional commit timestamp, it trivially observes the value during reads and overwrites the value at the higher timestamp during writes. It's only when a value is observed to be within the uncertainty interval that CockroachDB-specific machinery kicks in. The central issue here is that given the clock offsets, we can't say for certain whether the encountered value was committed before our transaction started. In such cases, we simply make it so by performing an uncertainty restart, bumping the provisional commit timestamp just above the timestamp encountered. Crucially, the upper bound of the uncertainty interval does not change on restart, so the window of uncertainty shrinks. Transactions reading constantly updated data from many nodes may be forced to restart multiple times, though never for longer than the uncertainty interval, nor more than once per node.</em></p></blockquote><p>Since CockroachDB runs on commodity hardware running NTP, this uncertainty interval can go up to 250 ms, which sucks, but this should not happen for most transactions. </p><p>That concludes our journey of time in distributed systems. As a follow-up, an interesting area of research in distributed systems is CRDTs (Conflict-free Replicated Data Types), which attempt to avoid the need to worry about time and order. They provide a kind of safety that is sometimes called strong eventual consistency: all hosts in a system that have received the same set of updates, regardless of order, will have the same state, provided that the updates to the system are commutative, associative, and idempotent. The research tries to explore how expressive we can be within these constraints. I want to cover CRDTs in <a href="https://www.siddharthsarda.com/p/learning-distributed-systems">my journey</a> eventually.</p><p>A huge number of resources helped me understand all the nuances around time. 
I am sure I missed a nuance or two. If you want to dig further, here are the ones I recommend:</p><ul><li><p><a href="https://lamport.azurewebsites.net/pubs/time-clocks.pdf">Time, Clocks, and the Ordering of Events in a Distributed System</a> - The original paper describing Lamport clocks, by Leslie Lamport.</p></li><li><p><a href="https://dsf.berkeley.edu/cs286/papers/clocks-podc1991.pdf">Practical uses of synchronized clocks in distributed systems</a> - A paper by Barbara Liskov describing how one can take advantage of synchronized clocks in certain distributed algorithms.</p></li><li><p><a href="https://queue.acm.org/detail.cfm?id=2917756">Logical Clocks are easy</a> - A handy way to understand and reason about logical clocks.</p></li><li><p><a href="https://queue.acm.org/detail.cfm?id=2745385">There is no now</a> - Another interesting article exploring the problem with time in distributed systems.</p></li><li><p><a href="https://queue.acm.org/detail.cfm?ref=rss&amp;id=2878574">Time is an Illusion</a> - A paper exploring physical time and the synchronization problem in more detail.</p></li><li><p>These <a href="https://news.ycombinator.com/item?id=6329619">flamewars</a> and <a href="https://aphyr.com/posts/299-the-trouble-with-timestamps">discussions</a> around the use, or lack thereof, of vector clocks in <a href="https://riak.com/why-vector-clocks-are-easy/">Riak</a> and <a href="https://www.datastax.com/blog/why-cassandra-doesnt-need-vector-clocks">Cassandra</a>. 
Bonus: <a href="http://www.datastax.com/dev/blog/amazon-dynamodb">Cassandra&#8217;s comparison with DynamoDB</a>.</p></li><li><p>The <a href="https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf">Dynamo</a> and <a href="https://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf">Spanner</a> papers.</p></li><li><p><a href="https://gist.github.com/timvisee/fcda9bbdff88d45cc9061606b4b923ca">Falsehoods programmers believe about time</a>.</p></li></ul><p>If you have read this far, please consider following me on <a href="https://twitter.com/sidsarda">Twitter</a>.</p>]]></content:encoded></item><item><title><![CDATA[Learning Distributed Systems ]]></title><description><![CDATA[A self directed learning path and foundational material]]></description><link>https://www.siddharthsarda.com/p/learning-distributed-systems</link><guid isPermaLink="false">https://www.siddharthsarda.com/p/learning-distributed-systems</guid><dc:creator><![CDATA[Siddharth Sarda]]></dc:creator><pubDate>Wed, 24 Mar 2021 09:09:10 GMT</pubDate><content:encoded><![CDATA[<p>I am a generalist by nature. I have spent the better part of the last 5-6 years writing code on, using and deploying fairly large-scale distributed systems as part of my work. 
It annoys me a little bit that my understanding of them remains equivalent to that of a dilettante.</p><p>Combining that annoyance with some free time thanks to the lull between jobs, I decided to make an attempt to <a href="https://en.wikipedia.org/wiki/Grok">grok</a> distributed systems.&nbsp;In this post I describe my approach and recommend some foundational material which I have gone through myself.</p><p>As a practitioner, it&#8217;s my intention to reach the real-world distributed systems (DynamoDB, Spanner or more recently FlightTracker, for example) as soon as possible.&nbsp;I wasn't necessarily starting from scratch: I had read <a href="https://amzn.to/3sccUwH">Designing Data Intensive Applications</a> and even done <a href="https://martin.kleppmann.com/2020/11/18/distributed-systems-and-elliptic-curves.html">the course</a> that <a href="https://martin.kleppmann.com/2021/02/23/patreon.html">Martin</a> released last year. So, I tried reading the papers describing these systems right away. I felt like a 5-year-old reading Shakespeare. The words made sense but I am not sure I understood them at a fundamental level. Clearly I needed to do more reading.</p><p>I found a few good recommendations from <a href="http://muratbuffalo.blogspot.com/2021/02/foundational-distributed-systems-papers.html">Murat Demirbas</a> and <a href="https://www.the-paper-trail.org/post/2014-08-09-distributed-systems-theory-for-the-distributed-systems-engineer/">Henry Robinson</a>, covering both primary source material and a pragmatic approach to tackling these topics as a practitioner.</p><p>My approach is to group the topics discussed by both Murat and Henry, and to find and read approachable material which contextualises the results and ideas - bridge material. 
If you choose to follow along on this journey, you can expect links to these resources as well as a summary of my own understanding of these topics.</p><p>I have roughly arrived at the following grouping of topics.&nbsp;</p><ul><li><p><em><strong>Foundations</strong></em></p></li><li><p><em><strong>Failure and Time</strong></em></p></li><li><p><em><strong>System Models</strong></em></p></li><li><p><em><strong>Impossibility Results (CAP, FLP, Two Generals, Byzantine Generals)</strong></em></p></li><li><p><em><strong>Broadcasting, Replication and Consensus </strong></em></p></li><li><p><em><strong>Consistency Models</strong></em></p></li><li><p><em><strong>Real World Systems</strong></em></p></li></ul><p>This may not be perfect, but I have also made peace with the fact that the most logical way to group these topics may appear to me only after I have a sufficient understanding of them. Please send <a href="https://twitter.com/sidsarda">me</a> a message if you think there can be a better grouping or there are topics I have missed.</p><p>Right, so let&#8217;s get to foundations. For the most part this is copied right from Henry&#8217;s list. I have made a few additions, including Martin&#8217;s course.</p><h2>Foundations</h2><p><a href="https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing">Eight Fallacies&nbsp;</a></p><p>The distributed systems equivalent of the commandments. A lot of the system models that we will encounter are based on these fallacies.&nbsp;</p><h3>Books</h3><p><a href="https://www.amazon.com/gp/product/1449373321/ref=as_li_tl?ie=UTF8&amp;tag=siddharthsard-20&amp;camp=1789&amp;creative=9325&amp;linkCode=as2&amp;creativeASIN=1449373321&amp;linkId=1c297d627da5e1520958a242dd914c1a">Designing Data Intensive Applications</a></p><p>The Distributed Data section (Chapters 5-9) in this book is a very solid introduction to distributed systems. 
If you read these chapters thoroughly and internalize them, I think you are already ahead of the curve. Personally I find that the way I learn best is revisiting the same content repeatedly from different angles.&nbsp;</p><p><a href="http://book.mixu.net/distsys/">Distributed Systems for Fun and Profit</a></p><p>Recommended in the paper trail blog post. I quickly skimmed over this and found some things that did not sit comfortably with me, such as two-phase commit being given as an example of a CA (CAP theorem) system. My preference would be to stick to Martin&#8217;s book or his video course. However, in the spirit of &#8220;most models are wrong, some models are useful&#8221;, one could also go through this as an introduction.</p><h3>Online Course</h3><p><a href="https://www.youtube.com/playlist?list=PLeKd45zvjcDFUEv_ohr_HdUFe97RItdiB">Distributed Systems Lecture Series from Martin Kleppmann:</a></p><p>My preferred way would be to go over this set of videos from Martin Kleppmann. They are part of a course he gave to second-year students at the University of Cambridge. I have personally done this course. It is fairly approachable and just around 8 hours of videos.&nbsp;</p><p>I prefer the video course over the book as I think it covers a few more topics and is more approachable. Just like the book, you could choose to stop here and you will have more of an understanding than most. Personally, I feel that while it&#8217;s a start, the course wasn&#8217;t enough for me to internalize all that I learnt.&nbsp;</p><h3>Distributed Systems in Practice</h3><p><a href="http://www.somethingsimilar.com/2013/01/14/notes-on-distributed-systems-for-young-bloods/">Notes on distributed systems for young bloods</a></p><p>While this was posted more than 5 years ago, most of this holds up. 
I found myself vigorously nodding to almost every point.&nbsp;</p><h3>Papers</h3><p><a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.7628">A Note on Distributed Computing</a></p><p>This paper from 1994 argues that distributed computing is fundamentally different from &#8216;local&#8217; computing because of differences in latency expectations, partial failures and concurrency. It&#8217;s an approachable read. You can also see how even some of the current developments (for example service meshes) are an attempt to tackle the same difficulties in distributed systems that are raised in the paper. It also illustrates the problems in distributed computing with an example of NFS. It&#8217;s interesting that you can see the tug of war between safety and liveness even then.&nbsp; I have written a summary <a href="https://github.com/siddharthsarda/distsyspapers/blob/main/a_note_on_distributed_computing.md">here</a>.</p><p><a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/acrobat-17.pdf">Hints for Computer System Design</a></p><p>I must confess I have only skimmed over this article and it&#8217;s only tangentially related to distributed systems. However, a lot of the advice there is timeless (the paper is from 1983).&nbsp;</p><p>Right, so that's that. If you find the list overwhelming, just do <a href="https://www.youtube.com/playlist?list=PLeKd45zvjcDFUEv_ohr_HdUFe97RItdiB">Martin&#8217;s course</a>. You will be fine. </p><p>If you liked this, you should follow me on this journey by subscribing to this newsletter or <a href="https://twitter.com/sidsarda">twitter</a> or both. 
On to Failure and Time next.&nbsp;</p>]]></content:encoded></item><item><title><![CDATA[Something important for someone important]]></title><description><![CDATA[What to work on?]]></description><link>https://www.siddharthsarda.com/p/something-important-for-someone-important</link><guid isPermaLink="false">https://www.siddharthsarda.com/p/something-important-for-someone-important</guid><dc:creator><![CDATA[Siddharth Sarda]]></dc:creator><pubDate>Sat, 13 Mar 2021 07:48:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Mv51!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc01f8bf4-caf8-4608-9ca6-d45c92b7344a_1600x895.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There was a phase in my career where I was working towards a promotion. I was involved in a lot of activities, but I felt I was not making a meaningful impact on company objectives while simultaneously being extremely overwhelmed. I was part of a very successful team but it was hard for me to point out my contribution clearly. I also felt like I wasn&#8217;t making progress towards my personal goals. I never got that promotion, and very rightly so. </p><p>Similar scenarios have been discussed by Charity Majors in <a href="https://charity.wtf/2021/03/07/know-your-one-job-and-do-it-first/">Know your One Job and Do it First</a> and by Tanya Reilly in her talk about <a href="https://noidea.dog/glue">glue work</a>. I think there is a distinction between the two protagonists as described.&nbsp;</p><p>The protagonist in Charity&#8217;s post was involved in a variety of tasks without having clarity on what the organisation cares about, whereas the protagonist in Tanya&#8217;s post is doing the invisible work required for the organisation to be successful. The latter is glue work. 
It&#8217;s important to distinguish it from <a href="https://en.wikipedia.org/wiki/Filler_(media)">filler work</a>. Filler work is low-quality work which fills up the day but whose value is questionable.&nbsp;</p><p>When I reflect upon what I was doing, it was a lot of filler work. In the aftermath of me not getting promoted, I tried to develop a framework of what I should spend time on.</p><p><strong>Who is the work important for?&nbsp;</strong></p><p>A mentor gave me some advice, which is to always have 3 sets of objectives:</p><ul><li><p>One set of objectives that are important to the organisation</p></li><li><p>Another set of objectives that are important to your manager</p></li><li><p>A final set of objectives that are important to you.</p></li></ul><p>I think this is a useful framework to manage expectations, mostly with yourself, as to how you can expect to be rewarded. The first two you can reasonably expect to be rewarded for by the organisation; the third you are doing for yourself, and any reward is a nice bonus. The ideal state is when there is significant overlap between the 3 sets, but reality is seldom that clean.</p><p>I would also like to add that it&#8217;s important to distinguish between important and urgent tasks. Spending time on urgent things might feel like you are doing important work, but in reality you are not. If you find yourself spending time on urgent things too much of the time, take a step back and evaluate.</p><p>Figuring out what&#8217;s important to you sounds like probably the easiest of those problems, but it is harder than you think. The best framework I have found so far is this post from Will Larson on structuring a <a href="https://lethain.com/forty-year-career/">40 year career</a>. A lot of things which fall under glue work, such as stakeholder management and writing great documents, can also reasonably fall under this heading. 
While you might not be rewarded immediately, this is an immensely useful muscle to develop for leadership roles.</p><p>The first step in figuring out what&#8217;s important for your manager is to ask them. It&#8217;s also important to determine which of these objectives are must-haves for them and which are good-to-haves. If you choose to spend time on the good-to-have objectives, be prepared to not be supported by your manager. It&#8217;s good to sync regularly with your manager in case any of these realities have changed.&nbsp;</p><p>When it comes to what's important to the organisation, we have the things that are stated explicitly in terms of business goals. The real signal here, though, is what the organisation rewards and, more importantly, what it does not reward.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Mv51!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc01f8bf4-caf8-4608-9ca6-d45c92b7344a_1600x895.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Mv51!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc01f8bf4-caf8-4608-9ca6-d45c92b7344a_1600x895.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Mv51!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc01f8bf4-caf8-4608-9ca6-d45c92b7344a_1600x895.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!Mv51!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc01f8bf4-caf8-4608-9ca6-d45c92b7344a_1600x895.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Mv51!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc01f8bf4-caf8-4608-9ca6-d45c92b7344a_1600x895.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Mv51!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc01f8bf4-caf8-4608-9ca6-d45c92b7344a_1600x895.jpeg" width="1456" height="814" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/c01f8bf4-caf8-4608-9ca6-d45c92b7344a_1600x895.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:814,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Mv51!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc01f8bf4-caf8-4608-9ca6-d45c92b7344a_1600x895.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Mv51!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc01f8bf4-caf8-4608-9ca6-d45c92b7344a_1600x895.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!Mv51!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc01f8bf4-caf8-4608-9ca6-d45c92b7344a_1600x895.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Mv51!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc01f8bf4-caf8-4608-9ca6-d45c92b7344a_1600x895.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>For example while a lot of organisations say hiring might be important for 
them, few are structured in a way that incentivises engineers to spend significant time on it. Mentoring people is another activity that the organisation might not structurally reward, but mentoring has a lot of long-term benefits, so my suggestion to most people would be to stick to it.</p><p>I sincerely wish managers and organisations would have their actions match their words. People ultimately figure this out anyway, and a gap between the two causes a lot of erosion of trust. One particularly insidious failure mode which I have seen is where leaders pile their personal projects onto teams whilst not having alignment amongst themselves.&nbsp;</p><p><strong>What about changing what's important for the organisation</strong>?</p><p>Most overachievers have a strong desire to bring about change where they work. If you are successful in bringing about this change it can have significant upside, but conversely the way there is very attritional.&nbsp;</p><p>Let&#8217;s be clear about something though: the fact that you want it to be important doesn't necessarily mean it actually is. Managers often make this problem worse by not giving their reports feedback about what's important, often because it&#8217;s unclear to them as well.&nbsp;</p><p>If I distil my advice, it&#8217;s basically to do something important for someone important. 
It is very simple but it&#8217;s hard to get right, especially amid the noise of busyness in big organisations.</p><p>If you are here, maybe you should follow me on&nbsp;<a href="https://twitter.com/sidsarda">twitter</a>!</p>]]></content:encoded></item><item><title><![CDATA[Ashby's Law and Managing Software Entropy ]]></title><description><![CDATA[Why one size fits all doesn't work]]></description><link>https://www.siddharthsarda.com/p/ashbys-law-and-managing-software</link><guid isPermaLink="false">https://www.siddharthsarda.com/p/ashbys-law-and-managing-software</guid><dc:creator><![CDATA[Siddharth Sarda]]></dc:creator><pubDate>Sun, 07 Feb 2021 19:09:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!2tYU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7573d6fc-8a51-4f21-ac2f-bc7b36fdfadd_336x530.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Once you reach a certain size as a company, you will typically have a fairly varied software portfolio. Some systems will be legacy systems, others more recent ones. Some systems form the bedrock of your business, while others mostly support back office work.</p><p>As we make changes in these systems, we increase their <a href="https://en.wikipedia.org/wiki/Software_entropy">Software Entropy</a>. As stated in the <a href="http://www.laputan.org/mud/">seminal paper</a>, even against the best efforts of the most architecturally conscious organisations, most systems tend to end up as a <strong>Big Ball of Mud</strong>. A lot of software strategy is essentially about managing this entropy.</p><p>I daresay software strategy comes a little bit later in the organisation&#8217;s life cycle, as initially it needs to succeed as a viable business. This makes total sense.</p><p>So let&#8217;s assume we are a successful business and own a set of systems which power this business. 
We are ready to form a strategy to deal with the entropy of these systems. Typically we start with the system that is of most importance to the business. We choose an approach to deal with the entropy of this system and it seems to work.&nbsp;</p><p>Often, having seen some success with the approach, we declare victory and roll out the same approach across all our systems. These attempts often end in failure and at huge cost. I personally am guilty of going down this route.</p><p>Turns out, this is so common that there is a law about this. It&#8217;s called Ashby&#8217;s Law, also known as The Law of Requisite Variety.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2tYU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7573d6fc-8a51-4f21-ac2f-bc7b36fdfadd_336x530.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2tYU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7573d6fc-8a51-4f21-ac2f-bc7b36fdfadd_336x530.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2tYU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7573d6fc-8a51-4f21-ac2f-bc7b36fdfadd_336x530.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2tYU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7573d6fc-8a51-4f21-ac2f-bc7b36fdfadd_336x530.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!2tYU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7573d6fc-8a51-4f21-ac2f-bc7b36fdfadd_336x530.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2tYU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7573d6fc-8a51-4f21-ac2f-bc7b36fdfadd_336x530.jpeg" width="336" height="530" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/7573d6fc-8a51-4f21-ac2f-bc7b36fdfadd_336x530.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:530,&quot;width&quot;:336,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2tYU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7573d6fc-8a51-4f21-ac2f-bc7b36fdfadd_336x530.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2tYU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7573d6fc-8a51-4f21-ac2f-bc7b36fdfadd_336x530.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2tYU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7573d6fc-8a51-4f21-ac2f-bc7b36fdfadd_336x530.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!2tYU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7573d6fc-8a51-4f21-ac2f-bc7b36fdfadd_336x530.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Or put in a more approachable way</p><blockquote><p><strong>&#8220;</strong><em><strong>In order to deal properly with the diversity of problems the world throws at you, you need to have a repertoire of responses which are (at least) as nuanced as the problems you 
face.</strong></em><strong>&#8221;</strong></p></blockquote><p>Essentially, we need a variety of strategies to manage the different systems in our software portfolio. Depending on the attributes of each system, we can then choose a strategy to manage its entropy.</p><p>Let&#8217;s look at some common strategies that you can apply to each system in your software portfolio.</p><p><strong>Do nothing</strong></p><p>This is a valid strategy when the domain you are exploring is new to you. It applies to new systems being built in new domains that your business is navigating. It&#8217;s suited to exploration.</p><p>This is also a commonly adopted strategy when the software has not been tended to for a long time. Then it becomes more of a by-product of inertia.</p><p><strong>Refactoring</strong></p><p>Once you understand the domain, and before the system gets out of hand, you need to make sure that the software represents the domain and the problem you are solving. This needs to be done continuously and with discipline.</p><p>In practice, I have often seen refactoring fail. This happens for a variety of reasons. 
It can fail if you let your system grow too much, if the people doing the refactoring are not using the right abstractions, if the abstractions are not communicated well in the team, or if you lack the tools to protect your refactoring from getting smudged. All of these play a part.</p><p><strong>Rewrites</strong></p><p>Rewrites, while much desired by engineers, are an extremely risky choice. As Joel Spolsky wrote in <a href="https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/">his essay</a>, engineers love them because it&#8217;s harder to read code than it is to write it.</p><p>A non-exhaustive list of cases where they make sense:</p><ul><li><p>When the abstractions that you originally designed the system around no longer hold. While not strictly a rewrite, the <a href="https://stripe.com/blog/payment-api-design">evolution of Stripe&#8217;s API</a> is a great example of this. The Stripe API started out with the idea that payments are finalised instantly, as its primary use case was credit cards. Finalisation means that the user has sufficient confidence that the payment is guaranteed. However, with most payment methods, including crypto, finalisation takes a while. Stripe&#8217;s API was then redesigned with this abstraction in mind.</p></li><li><p>The other case in which a rewrite makes sense is if you are going for a paradigm shift. <a href="https://twitter.com/StanTwinB/status/1336890442768547845">Uber rewrote their app</a> to fundamentally utilise functional/reactive patterns and redesign their UI to allow multiple product teams to work together. Personally, I worked on a project where, to keep a handle on the latency of our user funnel, we rewrote the backend so that it made no database queries. All the requisite data was stored locally on each box and any updates were propagated through Kafka streams. 
The astute reader will notice this is just CQRS in practice.</p></li></ul><p>However, rewrites are extremely risky. At best they slow you down massively during the rewrite; at worst they can threaten your very existence. The current system has acquired its calluses the hard way and those calluses serve a purpose. Translating them into a new system always takes longer than you think. Maintaining backwards compatibility is hard. You might end up replacing your system with an over-engineered monstrosity, which Fred Brooks in <a href="https://amzn.to/2OjaihL">The Mythical Man-Month</a> called the second-system effect.</p><p>You have been warned.</p><p><strong>Reclaim</strong></p><p>For the longest time, I thought that you only had the three choices mentioned above. However, I recently came upon <a href="https://lethain.com/reclaim-unreasonable-software/">this article</a> by Will Larson. Will suggests translating your beliefs about your system into desired properties and behaviours. You then implement validations and assertions of these properties and behaviours, and eventually you are able to reason about your software.</p><p>Another approach to reclaiming systems which have regressed into incomprehensible monstrosities is to break them apart into logical services. Then we implement SLOs and SLIs, clean up the APIs and iteratively try to reduce the entropy in each individual system. Sounds a lot like Will&#8217;s idea, doesn&#8217;t it?</p><p>This strategy makes the most sense for systems which, while chock-full of important business logic, have fallen into a state of disrepair due to lack of attention. It can also be a very valid strategy before embarking on a full rewrite. Once you can reason about the system slightly better, you can choose how best to rewrite it and which parts.</p><p><strong>Replace</strong></p><p>I think the best software to have in your portfolio is one whose maintenance you are not responsible for. 
You could probably use X as a service, especially when X is not one of your core capabilities, e.g. payments, communications platforms, etc. If it doesn&#8217;t bring you competitive advantage, you probably should not be building it.</p><p>If you have already built it, chances are this system is more likely to decay, for the simple reason that it&#8217;s not your priority. Replace it with a commoditised service. Increasingly, there are also offerings like <a href="https://retool.com/">Retool</a> which let you quickly build internal tools. This is a topic I want to explore more in the future.</p><p>These are the strategies I have commonly seen applied. While the first three are quite common, I think Reclaim and Replace are incredibly useful lenses through which to look at your systems.</p><p>Humans have a tendency to crave one-size-fits-all strategies. Perhaps we deal with complexity by pretending that it does not exist. We end up fighting against Ashby&#8217;s Law and losing.</p><p>Let&#8217;s remember, the world is complex.</p><p>And Variety beats Variety.</p>]]></content:encoded></item><item><title><![CDATA[Developer Progression as a function of navigating complexity]]></title><description><![CDATA[How to become a (more) senior developer]]></description><link>https://www.siddharthsarda.com/p/developer-progression-as-a-function</link><guid isPermaLink="false">https://www.siddharthsarda.com/p/developer-progression-as-a-function</guid><dc:creator><![CDATA[Siddharth Sarda]]></dc:creator><pubDate>Wed, 30 Dec 2020 21:33:40 GMT</pubDate><enclosure url="https://cdn.substack.com/image/fetch/h_600,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F28bc0788-6e0d-4a91-ba8c-59684f53ff47_1600x486.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I came across this <a href="https://files.eric.ed.gov/fulltext/EJ1204031.pdf">paper</a> which looks at ways to teach students 
complex systems courtesy of Jessica Kerr&#8217;s <a href="https://jessitron.com/2020/10/11/clockwork-to-complexity-scale-in-time-and-software/">blog</a>.</p><p>Based on how students reason about complex systems, they were placed in 4 categories in order of increasing sophistication: <strong>Completely Clockwork</strong><em>, </em><strong>Somewhat Clockwork, Somewhat Complex and Completely Complex.</strong></p><blockquote><p><em>&#8220;Clockwork responses are those that show deterministic, linear, single-cause, non-networked, centralized, or static system interactions or states, whereas complex responses are those that demonstrate nondeterministic, nonlinear, multiple causes, networked, decentralized, or dynamic system interactions or states&#8221;.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!O2h_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F325111ba-e7ab-4fb2-8bb1-743e63fe634d_1600x632.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!O2h_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F325111ba-e7ab-4fb2-8bb1-743e63fe634d_1600x632.png 424w, https://substackcdn.com/image/fetch/$s_!O2h_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F325111ba-e7ab-4fb2-8bb1-743e63fe634d_1600x632.png 848w, https://substackcdn.com/image/fetch/$s_!O2h_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F325111ba-e7ab-4fb2-8bb1-743e63fe634d_1600x632.png 1272w, 
https://substackcdn.com/image/fetch/$s_!O2h_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F325111ba-e7ab-4fb2-8bb1-743e63fe634d_1600x632.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!O2h_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F325111ba-e7ab-4fb2-8bb1-743e63fe634d_1600x632.png" width="1456" height="575" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/325111ba-e7ab-4fb2-8bb1-743e63fe634d_1600x632.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:575,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!O2h_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F325111ba-e7ab-4fb2-8bb1-743e63fe634d_1600x632.png 424w, https://substackcdn.com/image/fetch/$s_!O2h_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F325111ba-e7ab-4fb2-8bb1-743e63fe634d_1600x632.png 848w, https://substackcdn.com/image/fetch/$s_!O2h_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F325111ba-e7ab-4fb2-8bb1-743e63fe634d_1600x632.png 1272w, 
https://substackcdn.com/image/fetch/$s_!O2h_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F325111ba-e7ab-4fb2-8bb1-743e63fe634d_1600x632.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>In my experience this translates extremely well to the progression of a developer and their understanding of the complexity of the ecosystem they operate in. So, I wanted to attempt to derive a learning progression for developers.</p><p>At each level of development, I recommend a few books. 
Each book is hopefully a thread you can pull on if you want to go deeper into a particular topic. The book recommendations are subjective, but I believe that you can chart your own curriculum following the basic model. My recommendations are based on my own experience and are meant merely as guidance.</p><p>When it comes to progressing to the next level, my general advice would be to master the level you are operating at and then aim to understand what&#8217;s needed to operate at the next one.</p><p><strong>Completely Clockwork</strong></p><p>At this level most of the work involves making changes in a single service or component. This is where most of the <em>&#8216;coding&#8217;</em> happens. The time spent reading your code will be a multiple of the time you took to write it. You will work more with existing code than write new code. A corollary is that you need to know how to make your code easy to change. So your focus should be on writing code that is readable and easy to change.</p><p>Your scope of work will probably be within your team, and it&#8217;s very likely that your circle of influence, impact and visibility will mostly be your team as well. The abstraction at which you operate is your team.</p><p>The advice I would give at this level is to optimise for seeing problems solved through different <a href="https://charity.wtf/2020/11/01/questionable-advice-the-trap-of-the-premature-senior/">reference points</a>.</p><p>My recommendations for this level would be:</p><ul><li><p><a href="https://amzn.to/3rDymed">Refactoring</a></p></li><li><p><a href="https://amzn.to/2L4q3HJ">Clean Architecture</a></p></li></ul><p><strong>Somewhat Clockwork</strong></p><p>As you build familiarity and gain experience, you might see that you increasingly begin to look outward from the immediate scope of your team. 
You begin to try to understand how the surroundings affect the goals of your team. You start going on <a href="https://noidea.dog/blog/surviving-the-organisational-side-quest">organisational side quests</a> to achieve your team&#8217;s goals. The abstraction at which you operate is still typically your team, but the way you look at things becomes more outward-looking.</p><p>In modern software development, distributed systems are ubiquitous. Even when you are making a change to a single service, that service most likely lives in an ecosystem of other services and a plethora of other distributed tools. As you grow in seniority you need to understand how best to interact with other services and also the tradeoffs involved with each tool.</p><p>Books for this level</p><ul><li><p><a href="https://amzn.to/3rFMkfH">Designing Data-Intensive Applications</a></p></li><li><p><a href="https://amzn.to/3o1QDjf">Building Microservices</a></p></li></ul><p><strong>Somewhat Complex</strong></p><p>As you gain more experience, you will ideally have developed some <a href="https://blog.koehntopp.info/2020/08/31/on-touching-candles.html">calluses</a> and had a ringside view of some <a href="https://twitter.com/StanTwinB/status/1336890442768547845">heart-stopping close shaves</a>. This will prepare you to operate at the next abstraction: teams of teams. You will be advising on how best to organise software and teams to make the teams successful in aggregate. You could be either leading <a href="https://lethain.com/migrations/">migrations to address technical debt</a> or <a href="https://lethain.com/reclaim-unreasonable-software/">reclaiming unreasonable software</a>. It will also probably be your call as to when to do what.</p><p>A practitioner operating mostly at this level needs to be able to use models to explain the overall system or domain they are operating in and its interactions with its neighbouring domains. 
These models explain not just the software system but also the teams that surround it. You need to understand the best models for organising teams to drive maximum impact. Additionally, as the operator of <a href="https://how.complexsystems.fail/">complex systems</a>, the practitioner wears two hats: that of the producer and that of the protector against failure. They need to be able to handle this duality well.</p><p>Books for this level</p><ul><li><p><a href="https://amzn.to/381eAl7">Domain Driven Design</a></p></li><li><p><a href="https://amzn.to/3o2KenM">Team Topologies</a></p></li><li><p><a href="https://amzn.to/3mYTWpZ">SRE Book</a></p></li><li><p><a href="https://amzn.to/381eWrX">Accelerate</a></p></li></ul><p><strong>Completely Complex</strong></p><p>The abstraction at which you now operate is the entire organisation or, in bigger companies, an entire division or department. Your focus is primarily on what to build rather than how. You will be doing a lot of writing, <a href="https://www.linkedin.com/feed/update/urn:li:activity:6530173230102122496/">since you need to communicate at scale</a>.</p><p>According to the paper, a completely complex system is a dynamic process which is constantly in a state of flux. Any patterns here are emergent. Changes have both short-term and long-term repercussions. You also realise that the system is self-organised for the most part. This sounds a lot like the modern organisation. At this scope you are typically operating as the technical leadership. 
You need to be able to form technical strategies grounded in the needs of the business.&nbsp;</p><p>Books for this level</p><ul><li><p><a href="https://medium.com/wardleymaps/on-being-lost-2ef5f05eb1ec">Wardley Maps</a></p></li><li><p><a href="https://amzn.to/38Msirg">Enterprise Architecture as Strategy</a></p></li></ul><p>So to summarise, this is what we end up with as a progression plan for developers over varying degrees of complexity.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NM7M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3051da8e-aaca-477b-93d1-96e243d91b22_3846x1168.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NM7M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3051da8e-aaca-477b-93d1-96e243d91b22_3846x1168.png 424w, https://substackcdn.com/image/fetch/$s_!NM7M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3051da8e-aaca-477b-93d1-96e243d91b22_3846x1168.png 848w, https://substackcdn.com/image/fetch/$s_!NM7M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3051da8e-aaca-477b-93d1-96e243d91b22_3846x1168.png 1272w, https://substackcdn.com/image/fetch/$s_!NM7M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3051da8e-aaca-477b-93d1-96e243d91b22_3846x1168.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!NM7M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3051da8e-aaca-477b-93d1-96e243d91b22_3846x1168.png" width="1456" height="442" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/3051da8e-aaca-477b-93d1-96e243d91b22_3846x1168.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:442,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:172810,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NM7M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3051da8e-aaca-477b-93d1-96e243d91b22_3846x1168.png 424w, https://substackcdn.com/image/fetch/$s_!NM7M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3051da8e-aaca-477b-93d1-96e243d91b22_3846x1168.png 848w, https://substackcdn.com/image/fetch/$s_!NM7M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3051da8e-aaca-477b-93d1-96e243d91b22_3846x1168.png 1272w, https://substackcdn.com/image/fetch/$s_!NM7M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3051da8e-aaca-477b-93d1-96e243d91b22_3846x1168.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p>There is a pithy saying: &#8220;<em><strong>All models are wrong, some are useful</strong></em>.&#8221; The model I have described here is not perfect. The skills and responsibilities needed at different levels lie on a continuum rather than in discrete steps. I merely aim to expose you to the different abstractions that engineers operate at, depending on where they are in their career. 
I hope you found this useful.</p><p>If you are here, follow me on <a href="https://twitter.com/sidsarda">twitter</a>.</p>]]></content:encoded></item><item><title><![CDATA[Pioneers, Settlers and Town Planners]]></title><description><![CDATA[How to organise product teams]]></description><link>https://www.siddharthsarda.com/p/pioneers-settlers-and-town-planners</link><guid isPermaLink="false">https://www.siddharthsarda.com/p/pioneers-settlers-and-town-planners</guid><dc:creator><![CDATA[Siddharth Sarda]]></dc:creator><pubDate>Wed, 16 Dec 2020 19:27:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!5Cax!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F13a7ce01-d2be-4987-b611-43e277b3ff00_768x576.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Back in the dark ages, technology teams were grouped by aptitude (DBAs, sysadmins, developers, testers, etc.). Code was thrown over walls and somehow shipped to the customer.</p><p>Over time, it was established that the best way to organise teams is to set them up to be self-sufficient. This meant that people with different aptitudes form a team and ideally own the end-to-end delivery of a product.</p><h4>Yin and Yang</h4><p><a href="https://twitter.com/swardley">Simon Wardley</a> of Wardley Maps fame also settled into this conventional wisdom but found that there was a lot of conflict between his teams. He hypothesized that the conflict arose because there were fundamentally two types of work: <em><strong>new development</strong></em> and <em><strong>core operations</strong></em>. 
A similar theory is behind Gartner&#8217;s <a href="https://www.gartner.com/en/information-technology/glossary/bimodal">Bimodal IT</a> model, which asserts that there are two streams of work: one focused on predictability, the other on exploration.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5Cax!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F13a7ce01-d2be-4987-b611-43e277b3ff00_768x576.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5Cax!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F13a7ce01-d2be-4987-b611-43e277b3ff00_768x576.jpeg 424w, https://substackcdn.com/image/fetch/$s_!5Cax!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F13a7ce01-d2be-4987-b611-43e277b3ff00_768x576.jpeg 848w, https://substackcdn.com/image/fetch/$s_!5Cax!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F13a7ce01-d2be-4987-b611-43e277b3ff00_768x576.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!5Cax!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F13a7ce01-d2be-4987-b611-43e277b3ff00_768x576.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!5Cax!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F13a7ce01-d2be-4987-b611-43e277b3ff00_768x576.jpeg" width="768" height="576" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/13a7ce01-d2be-4987-b611-43e277b3ff00_768x576.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:576,&quot;width&quot;:768,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:73827,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5Cax!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F13a7ce01-d2be-4987-b611-43e277b3ff00_768x576.jpeg 424w, https://substackcdn.com/image/fetch/$s_!5Cax!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F13a7ce01-d2be-4987-b611-43e277b3ff00_768x576.jpeg 848w, https://substackcdn.com/image/fetch/$s_!5Cax!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F13a7ce01-d2be-4987-b611-43e277b3ff00_768x576.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!5Cax!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F13a7ce01-d2be-4987-b611-43e277b3ff00_768x576.jpeg 1456w" sizes="100vw" 
fetchpriority="high"></picture></div></a></figure></div><p></p><p>This setup is extremely popular with the rise of platform teams supporting product teams. It was even formalised in an AWS service (<a href="https://aws.amazon.com/proton/">AWS Proton</a>) announced at re:Invent 2020. </p><p>What Simon found was that grouping his multi-aptitude teams into two groups somehow increased the amount of infighting and, more importantly, the evolution of services/components was not happening. 
</p><h4>Evolutionary flow of a Service</h4><p>One of the central theses of Wardley maps is </p><blockquote><p>Standardization of components enables creations of better complexity.</p></blockquote><p>This means services evolve from <strong>Genesis</strong> -&gt; <strong>Custom Built</strong> -&gt; <strong>Product</strong> -&gt; <strong>Commodity</strong>. The services that you choose to evolve basically define your strategy. </p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gGf4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F55dec2d0-a0e5-4811-968a-dd1773296bb4_2382x1604.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gGf4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F55dec2d0-a0e5-4811-968a-dd1773296bb4_2382x1604.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gGf4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F55dec2d0-a0e5-4811-968a-dd1773296bb4_2382x1604.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gGf4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F55dec2d0-a0e5-4811-968a-dd1773296bb4_2382x1604.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gGf4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F55dec2d0-a0e5-4811-968a-dd1773296bb4_2382x1604.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gGf4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F55dec2d0-a0e5-4811-968a-dd1773296bb4_2382x1604.jpeg" width="1456" height="980" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/55dec2d0-a0e5-4811-968a-dd1773296bb4_2382x1604.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:980,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:210937,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gGf4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F55dec2d0-a0e5-4811-968a-dd1773296bb4_2382x1604.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gGf4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F55dec2d0-a0e5-4811-968a-dd1773296bb4_2382x1604.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gGf4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F55dec2d0-a0e5-4811-968a-dd1773296bb4_2382x1604.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gGf4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F55dec2d0-a0e5-4811-968a-dd1773296bb4_2382x1604.jpeg 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>Anyway, back to the story. Simon Wardley found that in his bimodal world, the evolution of services wasn&#8217;t happening. Instead, the operations people kept complaining about how laissez-faire the new development people were, and the new development people kept complaining about how much the operations people had a stick up their a**.</p><h4>Pioneers, Settlers and Town Planners</h4><p>The hero of the new development people is the &#8216;<strong>Pioneer</strong>&#8217;. The Pioneer is great at breaking through the organisational ennui and bringing change about. 
He does get bored quite easily and can&#8217;t be trusted to write proper documentation.</p><p>The hero of the operations people is the &#8216;<strong>Town Planner</strong>&#8217;. The town planner obsesses about operational health and minimising risk. </p><p>You need both of them in your organisation but if you stick them together, they fight. This is something I have seen from personal experience and it benefits neither of them. </p><p>The missing ingredient to make them work is the &#8216;<strong>Settlers</strong>&#8217;. The settlers get along with both pioneers and town planners. They are the Salieri to the Pioneer&#8217;s Mozart. They recognise the brilliance of what the pioneers build and make it functional. They give the pioneers&#8217; creations enough structure that Town Planners can take services from them and commoditise them.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GOn9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf4ff4c-1c4a-4e42-8e2d-d4fa16fbab05_1920x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GOn9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf4ff4c-1c4a-4e42-8e2d-d4fa16fbab05_1920x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GOn9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf4ff4c-1c4a-4e42-8e2d-d4fa16fbab05_1920x1080.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!GOn9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf4ff4c-1c4a-4e42-8e2d-d4fa16fbab05_1920x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GOn9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf4ff4c-1c4a-4e42-8e2d-d4fa16fbab05_1920x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GOn9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf4ff4c-1c4a-4e42-8e2d-d4fa16fbab05_1920x1080.jpeg" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/dbf4ff4c-1c4a-4e42-8e2d-d4fa16fbab05_1920x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:190076,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GOn9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf4ff4c-1c4a-4e42-8e2d-d4fa16fbab05_1920x1080.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!GOn9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf4ff4c-1c4a-4e42-8e2d-d4fa16fbab05_1920x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GOn9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf4ff4c-1c4a-4e42-8e2d-d4fa16fbab05_1920x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GOn9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbf4ff4c-1c4a-4e42-8e2d-d4fa16fbab05_1920x1080.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The trio also set up a nice cycle for the evolutionary flow of the services. Settlers take services from pioneers and make success happen. Town Planners steal from Settlers and commoditise services. Upon these commoditised services pioneers build their newest innovation. A healthy food chain for the enterprise.</p><p>So, Simon realised that the best way to organise teams was to group people who had the same attitude (pioneer/settler/town planner) and different aptitudes (PO/dev/designer). This both kept the peace and allowed for the evolutionary flow of services. </p><p>I will be honest, I haven&#8217;t seen this in practice anywhere, but I have seen all of these personas. The most successful team I can think of was a combination of the three, but with immense respect between each of them, which is probably what made it work.</p><p>What I have instead seen is this happening more organically, because pioneers get bored and move on. Pioneers usually have a settler that hangs around them who then takes over. Occasionally some of these services are commoditised, though being a town planner is truly an art form. Many town planners end up designing ghost towns. On the other hand, the pioneer&#8217;s laissez-faire attitude and the zombie systems they leave in their wake rub enough people the wrong way for them to become a political hot potato. Being a settler is a tough balancing act.</p><p>I hope that this writeup makes more managers realise the need for all three attitudes and to manage the evolutionary flow of a service. 
You could do worse than following Simon Wardley on Twitter or reading through <a href="https://medium.com/wardleymaps/on-being-lost-2ef5f05eb1ec">Wardley Maps</a>.</p><p>For what it&#8217;s worth, very similar ideas were first expressed by Robert X. Cringely in <a href="https://amzn.to/3ajWZGC">Accidental Empires</a>, first published in 1993, about the type of personalities that drive innovation in Silicon Valley. The equivalent personality types were <a href="https://blog.codinghorror.com/commandos-infantry-and-police/">Commandos, Infantry and Police.</a></p><p>Also, you should probably follow <a href="https://twitter.com/sidsarda">me</a> on Twitter.</p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Reliability Theatre]]></title><description><![CDATA[How over enthusiasm can backfire around SLOs and SLIs]]></description><link>https://www.siddharthsarda.com/p/reliability-theatre</link><guid isPermaLink="false">https://www.siddharthsarda.com/p/reliability-theatre</guid><dc:creator><![CDATA[Siddharth Sarda]]></dc:creator><pubDate>Sun, 29 Nov 2020 18:05:34 GMT</pubDate><content:encoded><![CDATA[<p>Since Google published its <a href="https://amzn.to/3lf80L3">SRE book</a> in 2016, a lot of self-respecting organisations have rightly recognised SLIs and SLOs as extremely effective tools to manage reliability for their important services. A quick recap of the definitions: </p><blockquote><p>An SLI is a service level&nbsp;<em><strong>indicator</strong></em>: a carefully defined quantitative measure of some aspect of the level of service that is provided. Latency and availability are good examples of SLIs. </p><p>An SLO is a&nbsp;<em><strong>service level objective</strong></em>: a target value or range of values for a service level that is measured by an SLI.&nbsp; For example, an SLO for a service could be that 99% of its requests are responded to within 400ms. 
</p></blockquote><p>Google explicitly positioned SLIs and SLOs as a way to balance programmer/developer time spent on innovation and reliability, but I have seen this lesson lost on many that I have interacted with. This is fundamentally because most of us don&#8217;t register that we are hired to create business value, not to do those other things. Patrick McKenzie has a lovely <a href="https://www.kalzumeus.com/2011/10/28/dont-call-yourself-a-programmer/">essay</a> about this which I think should be required reading for every programmer. </p><p>The lesson that I hope everybody took away from the reliability book is that SLIs and SLOs should establish the threshold at which developers stop their primary task of creating value and spend time improving the reliability of their systems. Instead, what I have seen happen is that well-intentioned managers take this amazing tool and turn it into a referendum on the engineering excellence of their organization. I call this <em><strong>&#8216;Reliability Theatre&#8217;</strong></em>. </p><p>Some of the most common examples of this that I have seen are the following, though I am sure this list is not exhaustive.</p><p><em><strong>Setting unrealistic SLOs disconnected from your customer</strong></em></p><p>It&#8217;s very likely that as a team you might end up with a wide assortment of services. Not all services that you might own are equally critical to your business. There are ones that bring the big money in, and then there are ones for which your users might give you huge latitude. Instead of taking that latitude, managers often set unrealistic SLOs for their teams. The reliability version of the <a href="https://en.wikipedia.org/wiki/Death_march_(project_management)">death march</a> is to set a target of reducing the latency SLO of a legacy service to a respectable level when not a single customer of that service has that expectation. 
</p><p>You should instead focus on creating a feedback channel with your customers, wherever they might be, and understanding their expectations. </p><p><em><strong>Setting way too many SLIs</strong></em></p><p>A lot of teams turn every metric that they possibly can into an SLI, preferring quantity over quality. Instead of focusing on what your users need you to measure, teams try to set everything that can go wrong as an SLI. In my opinion this eventually just leads to alert fatigue and harms responder well-being. </p><p>I think the focus should instead be to distil what your users care about into 2 or 3 meaningful SLIs. The chapter &#8220;<em>Implementing Service Level Objectives</em>&#8221; in the <a href="https://amzn.to/3o5sisd">2nd Google book in the SRE series</a> has some extremely useful pointers to help you do that. When it comes to SLIs, like many things, <em>less is more</em>.</p><p><em><strong>Overdoing Operational Reviews</strong></em></p><p>Now don&#8217;t get me wrong, I don&#8217;t think SLIs and SLOs are a set-and-forget activity. Nor do I think that management taking an interest in the operational health of teams is a bad idea. Where I do draw the line is when overenthusiastic managers, in an attempt to display their operational prowess, convince their teams that they have to use every operational review feature that PagerDuty offers. </p><p>Just like everything else, I think operational reviews are an important part of the reliability culture, and the frequency and number of operational reviews that you do should be proportional to the criticality of your service. </p><p>These are the ways that I have seen well-meaning teams mess up managing SLIs and SLOs. The solution is simple. Go back to the basics and strip the theatre away from reliability. Focus on your customer and your business. Figure out the minimum expectation they have from you. Use that as your SLIs and SLOs. Free your developers to innovate. </p><p>Rinse. 
Repeat.</p>]]></content:encoded></item><item><title><![CDATA[Janna Bastow and Lean Roadmapping]]></title><description><![CDATA[Product Roadmaps and Agility can go hand in hand]]></description><link>https://www.siddharthsarda.com/p/janna-bastow-and-lean-roadmapping</link><guid isPermaLink="false">https://www.siddharthsarda.com/p/janna-bastow-and-lean-roadmapping</guid><dc:creator><![CDATA[Siddharth Sarda]]></dc:creator><pubDate>Sun, 22 Nov 2020 20:21:49 GMT</pubDate><content:encoded><![CDATA[<p>As someone who &#8216;grew&#8217; up in a hardcore experimentation culture in Booking.com circa 2015, the idea that someone can speak with certainty about what a product would look like in the future does not sit very easily with me. I take with some skepticism the grandiose statements often made by product leaders at the launch of new products.</p><p>The fact of the matter is that even at relatively established companies such as the one I work at, most product initiatives inevitably fail. Also, if anything, this year has made John Lennon&#8217;s quote &#8216;<em>Life is what happens when you are busy making other plans</em>&#8217; particularly poignant. </p><p>That said, while individual product initiatives and directions often change, in most cases the product vision and the general objectives, especially when they are outcome driven, stay surprisingly robust.</p><p>That&#8217;s why I think this <a href="https://twitter.com/simplybastow/status/1168531672335343616">twitter thread</a> sat very well with me. <a href="https://twitter.com/simplybastow">Janna Bastow</a> bemoans the limitations of standard roadmaps and instead exhorts people to flesh out their vision, outcome-driven objectives and time horizons, and then come up with a lean roadmap, which should serve as the prototype of the product strategy.</p><p>While thinking about this, I also began to wonder how easily this concept lends itself to technical strategy as well. 
But that&#8217;s for another day.</p>]]></content:encoded></item><item><title><![CDATA[In defence of legacy systems]]></title><description><![CDATA[Working on legacy systems is probably the dirtiest and most reviled work in tech.]]></description><link>https://www.siddharthsarda.com/p/in-defence-of-legacy-systems</link><guid isPermaLink="false">https://www.siddharthsarda.com/p/in-defence-of-legacy-systems</guid><dc:creator><![CDATA[Siddharth Sarda]]></dc:creator><pubDate>Sat, 21 Nov 2020 20:54:20 GMT</pubDate><content:encoded><![CDATA[<p>Working on legacy systems is probably the dirtiest and most reviled work in tech. After all, at <a href="https://www.linkedin.com/in/siddharthsarda/">my job</a> at Booking.com for the last 3 years, I have mostly worked on projects which fall under that ever-so-fashionable term &#8216;digital transformation&#8217;.</p><p>However, while working with other developers, I often find a desire to start afresh to avoid the complexity of legacy systems, without any attempt to understand why that complexity exists in the first place.</p><p>It&#8217;s perhaps expressed most eloquently by <a href="https://queue.acm.org/detail.cfm?id=3390746">Pat Helland</a> through this quote: </p><blockquote><p>The best place to build a subway is in the open cornfields of Nebraska.</p></blockquote><p>Of course, it makes no sense to build in Nebraska because there is no demand for it. While it&#8217;s harder to build a subway in a metropolitan city, that&#8217;s where there is the most demand for it.</p><p>Similarly, legacy systems evolve to their current level of complexity because they solve important business problems. Trying to understand and then simplify that complexity can be incredibly rewarding for both your company and your career.</p><p>I believe Paul Graham alluded to this in a very different context in his essay <a href="http://paulgraham.com/schlep.html">Schlep Blindness</a>. 
</p><p></p>]]></content:encoded></item><item><title><![CDATA[Yak shaving]]></title><description><![CDATA[At the intersection of software architecture, technical strategy and working in Tech]]></description><link>https://www.siddharthsarda.com/p/coming-soon</link><guid isPermaLink="false">https://www.siddharthsarda.com/p/coming-soon</guid><dc:creator><![CDATA[Siddharth Sarda]]></dc:creator><pubDate>Sat, 21 Nov 2020 10:11:17 GMT</pubDate><content:encoded><![CDATA[<p>Welcome to Yak Shaving by me, Siddharth Sarda. </p><p>I work as a Principal Developer at Booking.com and write about my experiences and my reflections on those experiences. </p><p>I intend to write regularly, aiming for once a week, though reality is more nuanced. </p><p>While you wait for my next post, some of my posts that people have liked are:</p><ul><li><p><a href="https://www.siddharthsarda.com/p/developer-progression-as-a-function">Developer Progression as a function of navigating complexity</a></p></li><li><p><a href="https://www.siddharthsarda.com/p/pioneers-settlers-and-town-planners">How to organise product teams</a></p></li></ul><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.siddharthsarda.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.siddharthsarda.com/subscribe?"><span>Subscribe now</span></a></p><p>In the meantime, <a href="https://www.siddharthsarda.com/p/coming-soon?utm_source=substack&utm_medium=email&utm_content=share&action=share">tell your friends</a>!</p>]]></content:encoded></item></channel></rss>