When you reach mid-stage, you still have few resources and a relatively small team, but you’re now expected to deliver far more sophisticated products. This can be the most challenging phase. It’s critical that the progress you make now doesn’t come at the expense of the groundwork that future phases of your growth will require.
Make sure you implement a solid process for SQL-based data modeling. From BI to data science, your data models must be shared across all your analytics use cases, as they serve as the underlying business logic for your analytics. Ensure that your process is run transparently, allows all users to make changes to data modeling scripts, and is version controlled.
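As a rough illustration of what a shared data model means in practice, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a real warehouse. The table name raw_orders, the view name customer_revenue, and the sample rows are all hypothetical. The point is only that the business rule (which orders count as revenue) lives in one SQL definition that every consumer reads from.

```python
import sqlite3

# An in-memory database stands in for the data warehouse.
# Table, view, and column names are hypothetical, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_orders (
        order_id INTEGER,
        customer_id INTEGER,
        amount_cents INTEGER,
        status TEXT
    );
    INSERT INTO raw_orders VALUES
        (1, 10, 2500, 'completed'),
        (2, 10, 1500, 'refunded'),
        (3, 11, 4000, 'completed');

    -- The data model: one shared definition of revenue per customer.
    -- Only completed orders count; refunds are excluded here, once.
    CREATE VIEW customer_revenue AS
    SELECT customer_id, SUM(amount_cents) / 100.0 AS revenue
    FROM raw_orders
    WHERE status = 'completed'
    GROUP BY customer_id;
""")


def customer_revenue(connection):
    """Read from the shared model instead of re-deriving the logic."""
    rows = connection.execute(
        "SELECT customer_id, revenue FROM customer_revenue ORDER BY customer_id"
    ).fetchall()
    return dict(rows)


# A BI dashboard and a data science notebook would both call the same model.
revenue = customer_revenue(conn)
print(revenue)
```

In a real setup the view definition would sit in a version-controlled SQL file, so a change to the revenue rule is reviewed once and picked up everywhere.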
Migrate from your existing web analytics and event tracking to Snowplow Analytics. If you don’t make the change now, you’ll miss out on collecting more granular data and set yourself up for big bills from Mixpanel, Heap, Segment, and the like down the road. Snowplow is relatively easy to use, well supported, scalable, and free. Moreover, it’s likely to be compatible with the rest of the framework you’re using.
At this stage, significant infrastructure investments are still an expensive distraction, so don’t get carried away and start investing in heavy-duty data infrastructure. Stay agile, and push your data warehouse and SQL hard. All you need for now is the processing power of your data warehouse. It’s far cheaper to pay for servers than for humans, so rely on data warehouse horsepower.
At the growth stage, when you’re hitting 500 employees, focus on building analytics processes that scale. Until now, your two or three analysts were probably operating in an ad hoc manner, exchanging knowledge and code informally. However, this will break down quickly once you start adding analysts. In fact, your team will perform worse as it grows if you don’t manage this transition well. To combat this, you need to bring structure to your analytics team. You can go with a centralized or an embedded model. There is no right answer here, but this decision is going to be central to how you deliver analytics to your growing organization.
Implement data testing. At this point, data is flowing into your warehouse from at least a dozen sources. To make sure the data being loaded continues to conform to your rules, you need a process. If your automated process isn’t reliable enough, the quality of your analysis will start degrading, and you won’t know why.
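To make the idea of data tests concrete, here is a minimal sketch in plain Python. It assumes loaded rows are available as dictionaries, and the column names, the accepted status values, and the sample orders are hypothetical. A real setup would more likely run equivalent checks as SQL against the warehouse, but the rules are the same kind: uniqueness, not-null, and accepted values.

```python
from collections import Counter


def check_not_null(rows, column):
    """Return the rows where `column` is missing or None."""
    return [r for r in rows if r.get(column) is None]


def check_unique(rows, column):
    """Return the values of `column` that appear more than once."""
    counts = Counter(r[column] for r in rows if r.get(column) is not None)
    return sorted(value for value, n in counts.items() if n > 1)


def check_accepted_values(rows, column, allowed):
    """Return the rows whose `column` value is not in `allowed`."""
    return [r for r in rows if r.get(column) not in allowed]


def run_tests(rows):
    """Run every rule and return only the ones that failed."""
    results = {
        "order_id is unique": check_unique(rows, "order_id"),
        "customer_id is not null": check_not_null(rows, "customer_id"),
        "status has an accepted value": check_accepted_values(
            rows, "status", {"completed", "refunded"}
        ),
    }
    return {name: bad for name, bad in results.items() if bad}


# Hypothetical freshly loaded rows with three deliberate problems.
orders = [
    {"order_id": 1, "customer_id": 10, "status": "completed"},
    {"order_id": 2, "customer_id": 11, "status": "shipped??"},
    {"order_id": 2, "customer_id": None, "status": "refunded"},
]

failures = run_tests(orders)
for name, bad in failures.items():
    print(f"FAILED: {name}: {bad}")
```

Running checks like these after every load means a broken source shows up as a named failure rather than as a quietly wrong chart.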
You have to be serious about version control to produce high-quality code. Get all your employees on Git, disable force-pushes to master, and train them to use branches. Every piece of code deployed to production must have been merged through a pull request process that includes a review from another employee.
There’s no way around this. You have to get serious about documentation. This will add some overhead, but you have to invest the time and energy needed to document it all. If you don’t, you’ll find your analysts spending more time figuring out how to use your data, or where to get specific data from, than they spend conducting analytics.
Documentation is painstaking. Code reviews take time and energy. Analysts aren’t used to having to test their code. There will be resistance to doing things this way, especially among your long-term team members. These processes will make analytics faster, easier, and more reliable, but implementing them will prove difficult. However, if you’re serious about scaling analytics, you’ll push through.
Remind yourself that there's nothing to lose by analyzing too much at once. Most people are afraid of tracking too much because they believe they'll bog down the system, run out of space, or make queries take too long. However, in today’s market, storage is cheap, and if you don’t track enough, you won’t learn enough. It can end up being like a forest fire: tiny tradeoffs grow until they affect your whole operation. You can't just hand your team a snapshot of your aggregate sales numbers or top-line metrics. Without a closer look at how things have changed over time, you cannot spot the underlying forces that are actually making things happen, and you can't see how changes at the granular level of your product affect sales or engagement.
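A small sketch of why the aggregate snapshot hides what is going on. The event records, week labels, and feature names below are invented for illustration. Weekly totals stay flat while the granular breakdown shows one part of the product declining and another growing.

```python
from collections import defaultdict

# Hypothetical event-level data: (week, product feature, purchases).
events = [
    ("2024-W01", "search", 40), ("2024-W01", "recommendations", 10),
    ("2024-W02", "search", 30), ("2024-W02", "recommendations", 20),
    ("2024-W03", "search", 20), ("2024-W03", "recommendations", 30),
]

# Top-line snapshot: a single number with no history.
total = sum(purchases for _, _, purchases in events)

# Weekly totals: still looks like nothing is changing.
by_week = defaultdict(int)
# Granular view: purchases per feature, in week order.
trend = defaultdict(list)
for week, feature, purchases in sorted(events):
    by_week[week] += purchases
    trend[feature].append(purchases)

print("total:", total)
print("by week:", dict(by_week))
print("by feature over time:", dict(trend))
```

Only the last view shows that purchases through search are shrinking while recommendations are picking up the slack, which is the kind of underlying force a top-line number cannot reveal.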
Sharing data not only creates a healthy sense of transparency, but it also helps align different business units that should be working together. You don't want anyone struggling to find data from multiple sources. It's not unusual for two marketers, both worried about clickthrough rates, from the same company, to track down the data by different means. Maybe one asked an engineer for help while the other found it through a BI tool. In the end, they'll probably come up with two different answers to the same question. Your employees need a central, consistent way to access the data they need to do their jobs.
If you go with the wrong analytics framework, it'll disempower your team. It'll draw a line between those who understand and wield data and those who don't. This happens way too often, and it's toxic. When the least technical employee on your team can build a cohort chart intuitively and understand it immediately, you've picked the right system. That's the baseline of ease you must look for when picking analytics software or tools.
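For readers who have not met a cohort chart, this is roughly the table that sits behind one. It is a sketch under simple assumptions: months are plain integers, a user's cohort is their signup month, and the activity log below is made up. A good tool should produce the same thing without anyone writing code.

```python
from collections import defaultdict

# Hypothetical activity log: (user_id, signup_month, active_month).
activity = [
    ("a", 1, 1), ("a", 1, 2), ("a", 1, 3),
    ("b", 1, 1), ("b", 1, 2),
    ("c", 2, 2), ("c", 2, 3),
    ("d", 2, 2),
]


def cohort_retention(log):
    """Return {cohort: {months_since_signup: share of cohort still active}}."""
    # Collect the distinct users active in each (cohort, offset) cell.
    active = defaultdict(set)
    for user, signup, month in log:
        active[(signup, month - signup)].add(user)

    table = defaultdict(dict)
    for (cohort, offset), users in sorted(active.items()):
        # Cohort size is the number of users active in their signup month.
        cohort_size = len(active.get((cohort, 0), ()))
        if cohort_size:
            table[cohort][offset] = len(users) / cohort_size
    return dict(table)


retention = cohort_retention(activity)
for cohort, row in retention.items():
    print(cohort, row)
```

Each row is one signup cohort and each column is the number of months since signup, so reading across a row shows how quickly that group of users drops off.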
Also, cut yourself some slack. Sometimes startups are bound to operate with intangibles. So measure the tangibles and act on the analytics.