About Anish Dhar
Anish is a serial entrepreneur and Co-Founder & CEO of Cortex, the Reliability-As-Code platform for engineering teams to effectively manage and scale their microservices, funded by Sequoia Capital and Y Combinator. Before co-founding Cortex, Anish served in a senior engineering role at Uber, where he developed and maintained apps for products such as Uber Eats and Jump, the scooter and bike-sharing platform. His entrepreneurial endeavors include co-founding Divtera Capital, a marketplace for home equity, and Homeroom, a collaborative software platform for companies and employees to manage their workflow.
In this interview, Anish Dhar discusses Cortex, the future of microservices, time-saving automation techniques, the company’s customer data protections and how new services can help hold teams accountable and create a healthier business culture. These forward-looking ideas can help you navigate computing complexity with greater ease and simplicity.
Tell us about the Cortex story. How did you get started and what was your inspiration?
So we’ve been building Cortex for about a year and a half and it was largely based on our own experiences as engineers. The company got off the ground with the help of two of my closest friends.
I was previously an engineer at Uber. At Uber, it was a classic case of microservices going wrong. On my team, we had 250 to 300 services to manage and we used a combination of Excel sheets and different internal tools to track basics, things like service ownership or who’s on call for a service.
But the concept of service quality became tribal knowledge, which made it difficult for new engineers who joined the team, and it also made it very hard to work with services, especially because services are often named after Game of Thrones characters. It felt very hard to understand not only ‘What does the service do?’ or ‘Who owns it?’, but ‘Where does the documentation for this service live?’ Or ‘Is this a good quality service?’
And so Uber is obviously on one end of the service complexity scale, but we realized that every company that introduces microservices has to deal with this problem in different ways. Oftentimes, that means creating an Excel spreadsheet. Or companies are forced to build an internal tool.
And so we started Cortex to productize robust tools to track services, giving organizations a solution without having to build internally.
Give us a brief overview of your product/services:
Cortex helps engineering teams understand and improve their service-oriented architecture. It automatically catalogs all of an organization’s internal services, and then maps them to all of the data about the services through integrated third-party tooling. Cortex aims to organize all the information about an organization’s service-oriented architecture, and then helps grade the quality of those services across teams.
What automation does Cortex provide that will be the most time saving for engineers? What painstaking manual processes are you removing?
I think the first layer of automation is to explore all of a given organization’s tooling to automatically surface all of the services within the company. For example, if you’re running Kubernetes, we have a Kubernetes integration that will automatically discover services that are being built and that will put them into the Cortex platform. It will then automatically map that service to all the data about it.
More specifically, we can automatically map services to their Git repos, their on-call rotations, and their documentation, and Cortex keeps this information up to date for you, which is really useful.
In terms of the second layer, I think the most painstaking process we automate is around production readiness checklists and migrations. Oftentimes, SREs will need to create these checklists that say, ‘Does this service have an on-call rotation?’ ‘Is it production ready?’ ‘Is it meeting its SLAs?’ That involves digging through six or seven different tools to consolidate the data and really, the most common way they do this is using an Excel spreadsheet. And so with Cortex, because we integrate with all this tooling and because you can use scorecards to define these best practices and rules, you can just create a scorecard and we’ll automatically start grading these services. We’ll tell you if something changes about the quality of the service, which is really useful in larger organizational initiatives.
How will the scorecards or complex rule builders help teams to ensure security best practices across all services?
The scorecards let you define best practices for your services. What that means is, for example, a lot of companies we work with will have a security scorecard. They’ll have rules like ‘there should be fewer than 15 Snyk vulnerabilities across this set of services’, or ‘do these services have these config files in each of their repositories?’, or ‘are all of these services running this package version?’ In all of these scenarios, you can use scorecards to ensure, from a security standpoint, that services are meeting best practices.
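To make the idea of scorecard rules concrete, here is a minimal sketch in Python. It is not Cortex’s implementation or API: the `Service`, `Rule`, and `grade` names, the `SECURITY.md` file, and the `openssl` version are all hypothetical, and the vulnerability threshold is applied per service as a simplification. It only illustrates how rules like the ones described above can be expressed as checks and turned into a score.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Service:
    """Hypothetical snapshot of what a catalog might know about one service."""
    name: str
    snyk_vulnerabilities: int
    repo_files: set[str] = field(default_factory=set)
    package_versions: dict[str, str] = field(default_factory=dict)


@dataclass
class Rule:
    """A named check that a service either passes or fails."""
    description: str
    check: Callable[[Service], bool]


# Illustrative security scorecard; file name and version are assumptions.
SECURITY_SCORECARD = [
    Rule("Fewer than 15 Snyk vulnerabilities",
         lambda s: s.snyk_vulnerabilities < 15),
    Rule("Repository contains SECURITY.md",
         lambda s: "SECURITY.md" in s.repo_files),
    Rule("Runs openssl 3.0.13",
         lambda s: s.package_versions.get("openssl") == "3.0.13"),
]


def grade(service: Service, rules: list[Rule]) -> dict:
    """Return the fraction of rules passed and the descriptions of failed rules."""
    failed = [r.description for r in rules if not r.check(service)]
    score = (len(rules) - len(failed)) / len(rules)
    return {"service": service.name, "score": score, "failed": failed}


if __name__ == "__main__":
    services = [
        Service("payments", 3, {"SECURITY.md"}, {"openssl": "3.0.13"}),
        Service("legacy-billing", 20),
    ]
    for svc in services:
        print(grade(svc, SECURITY_SCORECARD))
```

Keeping each rule as a small predicate means a new best practice becomes one more entry in the list, and the list of failed rules tells a service owner exactly what to fix.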
Are teams responsible for coming up with all the rules/guidelines or will you provide anything out of the box?
We have a couple of different scorecard templates that help you automate scorecards; things like 12-factor. We even have basic on-call health scorecards that’ll show you, ‘Are the MTTR and MTTA for services less than some threshold that we determine?’ So there are some starting templates, but then we also provide a really easy scorecard builder that enables admins to select an integration, and Cortex provides all of the rules possible so that organizations can pick and choose different integrations to build a scorecard. And if there’s something that we don’t support, organizations can push it into Cortex, through our API, and then write rules on top of it. A lot of companies will use a combination of the default templates we provide and additional custom rules layered on top of them.
With the recent software supply chain attacks, how does Cortex assist customers in ensuring the integrity of code?
There are a few different pieces here. Based on the rules that an organization has, if there’s, for example, some attack that reduces your uptime, Cortex can detect those changes based on the set rules. Then, Cortex will email and Slack service owners, alerting them to the fact that something has changed and that there may be something to look into. And the more specific the rules are, the more you can dig into what changed in relation to the code.
One really cool integration we have with GitHub involves breaking API change detection. So, for example, we integrate with schema-based definitions like OpenAPI, GraphQL, or gRPC, and if you make a breaking API change, we can actually comment on the GitHub PR telling you that ‘hey, this change is going to affect these three other services’. It will actually tag the owners of those services.
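The breaking-change workflow described above can be sketched roughly as follows. This is an illustrative simplification, not how Cortex or GitHub actually do it: schemas are reduced to plain dictionaries mapping an operation to its response fields and types, and the `find_breaking_changes` and `build_pr_comment` functions, the `consumers` map, and the owner handles are all hypothetical. Real OpenAPI, GraphQL, or gRPC schemas would need proper parsers.

```python
def find_breaking_changes(old: dict, new: dict) -> list[tuple[str, str]]:
    """Compare two schema versions and return (operation, message) pairs.

    A change counts as breaking here if an operation is removed, a field
    is removed, or a field changes type. Added fields are not breaking.
    """
    changes = []
    for op, old_fields in old.items():
        if op not in new:
            changes.append((op, f"operation {op} was removed"))
            continue
        for name, old_type in old_fields.items():
            if name not in new[op]:
                changes.append((op, f"{op}: field '{name}' was removed"))
            elif new[op][name] != old_type:
                changes.append(
                    (op, f"{op}: field '{name}' changed type "
                         f"from {old_type} to {new[op][name]}"))
    return changes


def build_pr_comment(changes, consumers: dict, owners: dict):
    """Build the text of a PR comment that tags owners of affected services.

    Returns None when there is nothing breaking to report.
    """
    if not changes:
        return None
    affected = sorted({svc for op, _ in changes for svc in consumers.get(op, [])})
    lines = ["Breaking API change detected:"]
    lines += [f"- {message}" for _, message in changes]
    lines.append(f"This change affects {len(affected)} other service(s):")
    lines += [f"- {svc} (owner: {owners.get(svc, 'unknown')})" for svc in affected]
    return "\n".join(lines)


if __name__ == "__main__":
    old = {"getUser": {"id": "string", "email": "string"},
           "listOrders": {"orders": "array"}}
    new = {"getUser": {"id": "string"}}
    consumers = {"getUser": ["billing", "notifications"],
                 "listOrders": ["reporting"]}
    owners = {"billing": "@alice", "notifications": "@bob", "reporting": "@carol"}
    print(build_pr_comment(find_breaking_changes(old, new), consumers, owners))
```

The key design point is the `consumers` map: detecting the schema difference is only half of the job, and the catalog of which services depend on which operations is what lets the comment name the affected teams.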
From an external standpoint, based on the rules you have, we can tell you if something is ‘off’ with some part of your software stack. And internally, we try to help you keep the quality of your code high by reducing the service complexity.
You are retaining creds for many SaaS services. How do you protect your customer data?
Yeah, security is extremely important to us. The vast majority of our integrations are read-only: we only need read access to get the data, so we can’t modify anything. Beyond that, we encrypt all our API keys at rest, we use TCP to store our keys, and for a lot of our security-minded customers, we actually have a self-hosted version of Cortex, so you can deploy everything internally in your own network. Nothing will ever leave that environment, so that’s an option for organizations that are very security sensitive.
Cortex is a catalog of services that construct the basis of (container) apps, and who owns them. How is this information not simply tracked in Atlassian or other existing applications like internal corporate wiki or SharePoint by developers?
I think the issue here is that companies try to track this information and it ends up being tracked in several different tools. The most common workflow I’ve seen involves someone with a Confluence page, and then they’ll also have an Excel sheet, and someone else will have something in their Google Docs, and so on. It just becomes very difficult to keep everything up to date because things move so rapidly in high-growth companies.
A service that’s active today may not still be active in six months. No one knows who’s responsible for updating the spreadsheet. No one likes doing the work of maintaining such systems, if you talk to anyone who actually has maintained such things. It’s atrocious to keep up to date because services change or because engineers leave and join the company. Because these tools don’t integrate and provide that functionality, it’s almost like you’re applying a Band-Aid rather than a real solution. It’s just not very effective. So the problem is that information is tracked in these tools and yet these tools are just not built for tracking service data, which is where Cortex becomes useful.
Shift left, DevSecOps and CI/CD (continuous integration/continuous delivery) have radically changed software and service development. Where do you see security insertion in this process?
That’s a great question. In taking a step back, I think one of the difficult things from a security standpoint that we’ve seen with our clients is that there are a lot of security best practices and guidelines that engineers should be following, but it becomes difficult to enforce or track whether teams and service owners are actually complying with those guidelines.
And so I think Cortex really fits a unique use case there, where security teams use scorecards to track whether these best practices are being followed from a CI/CD perspective, but also from a service owner, service health perspective. It then becomes really easy to track whether teams are making progress against certain objectives. Cortex really sits in the middle between engineering and security to help facilitate a lot of those conversations.
What would you say about the cliché ‘software will save the world’?
I think there’s a lot of truth to that statement. I think software has permeated every industry and every aspect of our lives. There are so many industries where you’re starting to see complex software as a catalyst for positive change. At the same time, I think there’s a flip side to it. We have to be wary about the influence that software can have. The impact that it can have on mental health is very important to keep track of, for example. All in all, I think there are a lot of pros to software development and I do think the pros outweigh the cons, but it’s important to be aware of the downsides as well.
Anything else that you wish to share with Cyber Talk’s executive-level audience?
There’s this disconnect that I think happens in a lot of engineering organizations, especially enterprises, where so many different teams have different initiatives, but you know, some specialized set of engineers, from security to SRE, may have their own goals. In a lot of modern enterprises, the tools that these teams are given to actually enforce best practices and guidelines are not that great.
I view Cortex as a way to hold teams accountable, but more importantly, as a way to really instill that culture of ownership and reliability across the engineering team. I think that, by encouraging service owners to care about the reliability and security of their services, it creates a healthier culture overall, and it reduces the risk that things will become really difficult, or that you’ll lose that tribal knowledge, if a key engineer leaves the company. I think it’s important for executives to consider that equation early on, and to empower their teams to maintain a centralized location in order to continually improve the quality of services.
How can organizations connect with you?
We do a two-week free POC with all the companies we work with, where you can onboard as many services as you want and we’ll help you set up scorecards. If you go to our website, getcortexapp.com, and enter your email address, we’ll set you up with an account that you can try out.