Finding the Right Level of Precision for Practical Analysts

All models are wrong, but some are useful; or, why "just get me this number" is so often such a painful and difficult request for analysts to fulfill

If you work with an analyst, you’ve probably heard them ask at some point: How rigorous do we need to be here? How precise and accurate an answer do you need? Knowing you need to ask these questions — and better yet, being able to answer them yourself without needing someone else to tell you — is a fundamental skill for an analyst. I’m tempted to call it a basic skill, because it’s something almost every analyst has to exercise daily, except that it’s neither simple nor easy to acquire!

Consider: if I ask you how much revenue our company made last year because I want to put it in a press release, an answer like “about $700 million” is fine. If I ask you how much revenue our company made last year because I need to put it in an official financial statement, you’d better get the answer right to the precise dollar. If you don’t know what kind of rigor the question-asker is looking for and make the wrong assumption, you can easily waste your time overengineering a simple task — or cut the wrong corner, and make a very costly and consequential mistake.

My example might seem contrived and overly simple — it ought to be obvious to even a new grad analyst that they need to just ask what a metric request is about! Well, for one thing, actually answering a lot of basic questions about revenue is surprisingly complicated (if you’ve never had to deal with questions about the timing of revenue recognition, be grateful).

But more importantly, I think it is far from obvious to new analysts that they need to be asking these kinds of questions before going off and pulling numbers. And even once they understand they need to determine the level of rigor required for the business problem at hand, it still takes a long time to learn how to choose the appropriate level of rigor. That’s what this post is about.

Wait, Why Is This a Problem?

I think what I’ve laid out so far should ring bells for pretty much any analyst, but I wouldn’t be surprised if non-analyst readers are a bit confused about what’s going on here. A lot of what non-analyst stakeholders ask of analysts boils down to “count this number for me,” and it seems like there really shouldn’t be much question about rigor here: isn’t the number just the number? There shouldn’t really be any ambiguity about how many customers made a purchase last month, or how much GMV we’ve booked from this one account. What level of rigor or precision is up for debate here?

As the revenue example I opened with shows, there is one very basic decision about precision we analysts have to make in giving you an answer: rounding. If you just need a general idea of how important a particular account is, giving you that account’s GMV rounded to the closest million is probably fine. On the other hand, if you need the answer because you have to cut a check to someone, we are going to want a very precise number, potentially down to the cent.

Now, while rounding in Excel is just a matter of clicking a button (or tapping the right keyboard shortcut), when pulling data we are often doing a lot more than kindergarten-level arithmetic to give you a rounded number. For example, how worried do we need to be about accounting for refunds? Or expected refunds? If we know that typically 1% of GMV will be lost to refunds in the following 30 days, do we need to account for this? If we do, that makes the data pull itself more complicated, but it may also require us to do some forecasting on top of the raw data — which means there is no longer just one simple correct answer to this question! We can avoid all this headache if we just establish right off the bat whether a number that’s about 99% accurate is sufficient.
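To make that trade-off concrete, here is a minimal sketch of the two levels of answer side by side. Everything in it is illustrative: the `orders` DataFrame, its `account_id` and `gmv_usd` columns, and the flat 1% refund rate are assumptions made up for this example, not anyone’s real schema or measured refund behavior.

```python
import pandas as pd

EXPECTED_REFUND_RATE = 0.01  # assumed flat rate, purely for illustration


def rough_gmv(orders: pd.DataFrame, account_id: str) -> str:
    """Quick answer: booked GMV for one account, rounded to the nearest million."""
    total = orders.loc[orders["account_id"] == account_id, "gmv_usd"].sum()
    return f"~${total / 1e6:.0f}M"


def refund_adjusted_gmv(orders: pd.DataFrame, account_id: str) -> float:
    """More rigorous answer: expected net GMV after forecast refunds.

    Note this is already a (crude) forecast, not just a data pull: it assumes
    the next 30 days of refunds will match the assumed historical 1% rate.
    """
    total = orders.loc[orders["account_id"] == account_id, "gmv_usd"].sum()
    return round(total * (1 - EXPECTED_REFUND_RATE), 2)
```

The first function is a five-minute answer; the second already commits you to maintaining an assumption about future refunds. Which one is “right” depends entirely on what the number is for.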

And the reality is, business questions for analysts are rarely as (seemingly) straightforward as a “Count this number for me” type of question. More typical questions are:

  • Traffic was down yesterday, should we be worried?

  • This new product feature we’re A/B testing is performing surprisingly poorly, can you look into what’s going on?

  • We need a forecast of how much your business area is going to increase revenue next year, can you make us a model?

In answering all of these questions, knowing how much rigor to apply is a critical skill! Strictly speaking, there is no right or wrong answer to any of them — “should we be worried?” is a question where the right answer partly depends on the facts, but also on the business’s risk tolerance. “What’s going on with this A/B test?” is a question where the right answer is something that will help the product team ship a successful product — not a metric rounded to the appropriate number of significant figures. And when it comes to forecast models, the statistician’s saying “All models are wrong” literally has its own Wikipedia article! As I have often told my teams, all our forecasts are going to be wrong, because nobody can ever predict the future with perfect precision. The only question that matters is whether our forecast is going to be useful.

What They Won’t Teach You in Business (or Data Science) School

There are, of course, non-analyst stakeholders who appreciate that the analyst’s job here is more complex than just “pulling the number.” But they may still not appreciate why this is so difficult for an analyst, especially a junior one — or even an experienced one who’s inexperienced in this particular domain. I don’t think I’ve ever heard a stakeholder say this to their analyst in so many words, but in plenty of conversations where I’ve heard (or been) the analyst asking for guidance on how accurate or precise the answer needs to be, the response has amounted to: “You tell me. You’re the expert, this is your job. I don’t know enough about the numbers myself to tell you!”

This is completely fair of the non-analyst who, mind you, is basically always a busy person with their own full-time job that doesn’t have much to do with spending the whole day thinking about numbers. It’s still frustrating for the analyst, who needs their input and context — and who is already showing more judgment and thoughtfulness than the analyst who plows ahead without asking any questions.

So if it’s a fair expectation of the analyst to assess for themselves the amount of rigor necessary to answer a question, why is this skill so often lacking? Why can’t analysts learn this in school, or why can’t their manager or mentor show them the ropes? I would love to be proven wrong, but I think the answer is simply: this isn’t something you can learn in school, or in any environment other than one with real, on-the-job stakes.

Yes, it’s a fair expectation that analysts be able to exercise their own judgment prudently here — but that’s also why experienced analysts and data scientists are so expensive to hire. This is a skill that can’t be learned through a bootcamp or classroom exposure. Analysts who know how to effectively scope and problem-solve analytical asks had to pay their tuition fees in the form of on-the-job learning and making very real mistakes. That’s why this skill is so rare, and so expensive to hire for.

This skill can’t be taught because the skill is fundamentally about knowing what to do when there is no right or wrong answer. This contravenes the premise of the vast majority of exercises you’ll encounter in any statistical or mathematical course at the undergraduate level: any time you’re given a numerical problem, it’s virtually certain that there’s going to be a “correct” answer. Of course, students have to do projects and can be given exercises like “Round this number to the appropriate number of significant digits.” But none of these will come close to approximating the actual feeling of having your manager say to you, “Hey traffic was down yesterday, any idea why?”

Of all the classes I took as a student, I think those that touched on public policy prepared me best for this. Even in a quantitative department like economics, where analytical problems always have a mathematically right answer, classes focused on public policy would often remind us: the “right” answer often depends on what the policymaker’s values and priorities are. While still not directly helpful or applicable to anything in the workplace (beyond serving as the most theoretical of guides), this at least is far more useful than teaching aspiring analysts that their job is just to arrive at the correct number.

So, What Can Analysts Do?

Obviously, I think there can be no good substitute for real-world problem-solving and real-world stakes when it comes to honing analysts’ judgment here. The best quantitative problem solver is still going to need to learn when and how to apply their talents — otherwise, they will far too often find themselves overcomplicating simple tasks and boiling the ocean in pursuit of an unnecessarily precise answer.

But of course, there are rules of thumb and principles that analysts can try to apply. While I and many others have made dumb mistakes like failing to ask the right questions before doing a bunch of ultimately pointless work, there’s no reason we can’t try to train analysts to ask more questions, or give them principles that will help them ask better ones. As a manager and mentor, whenever I worked with junior analysts, this was usually an area I spent a lot of time emphasizing.

Some basic rules of thumb I’ve found useful:

  1. Determine what this is going to be used for — the why behind the question. Knowing this usually sheds a lot of light on whether a rough 5-minute answer is all that’s needed, or whether this seemingly simple ask is going to be a project all its own.

  2. Get the simple answer first anyway — often even you, the analyst, won’t know how complicated getting to the final answer will actually be. Trying to get a simple answer first is a quick way to assess whether the question is trickier than it seems. For example, to answer the question “Should we be worried about traffic dropping yesterday?” you should probably try to pull the traffic data yourself to see how easy getting that data is, and whether simply plotting daily traffic over the last few weeks tells you anything obvious (see the sketch after this list).

  3. When the stakeholder has no opinion or input on how accurate this needs to be, give them the simple answer along with a summary of the caveats and the other things you could do to make it more precise. This both shows that you’ve worked on the problem and thought deeply about it, and keeps you from going off on potential wild goose chases before the stakeholder confirms that they really do need a particular level of detail and precision.
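As an illustration of rule 2, here is roughly what that first, simple pass at the traffic question might look like. This is a sketch under made-up assumptions: the `daily_traffic.csv` file and its `date` and `visits` columns are invented for the example, and in practice the data would come from your own warehouse or analytics tool.

```python
# A first, simple pass at "traffic was down yesterday, should we be worried?"
# Assumes a hypothetical daily_traffic.csv export with `date` and `visits` columns.
import pandas as pd
import matplotlib.pyplot as plt

traffic = pd.read_csv("daily_traffic.csv", parse_dates=["date"])
recent = traffic.sort_values("date").tail(28)  # last four weeks

# Eyeball the trend before reaching for anything fancier:
# is yesterday's dip outside the normal day-to-day wiggle?
ax = recent.plot(x="date", y="visits", marker="o", legend=False)
ax.set_title("Daily visits, last 4 weeks")
plt.tight_layout()
plt.show()

yesterday = recent["visits"].iloc[-1]
baseline = recent["visits"].iloc[:-1].mean()
print(f"Yesterday: {yesterday:,.0f} vs. prior 4-week average: {baseline:,.0f}")
```

If the plot shows an obvious, explainable dip, the five-minute answer may be all that’s needed; if not, you at least know the question deserves a more rigorous look.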

Obviously, I have other more complicated learnings that can’t be boiled down into simple rules or principles. I’ll share these in future posts, both because I need the room to go into more detail about them, and because I think they are less straightforwardly actionable.

For example, I’ve found it helpful to think about the type of analyst I am — the role I am playing in my organization — when reflecting on the amount of rigor needed to answer a question. I can tell you that the amount of precision and statistical rigor you need as a Product Data Scientist working on ranking search results for a marketplace is many orders of magnitude higher than the amount of rigor you need as a Business Analyst working on a stat pack to inform product strategy for annual planning. But I can’t tell you how to directly translate reflections about this into something you can actually apply as an analyst in your day-to-day work. Instead, this awareness can help you indirectly in thinking about where you want to take your career, and in how you work with and learn from other members of your team. And so, this topic probably deserves a whole post in and of itself.

For the time being, I hope you come away with a greater appreciation for why the analyst’s job is so hard beyond just the technical minutiae of pulling data, and why the best analysts are often so hard and costly to hire. And if you are an analyst, or have an analyst you care about, hopefully this helps you or them scope and prioritize work more efficiently. I remember that when I was starting my career, “scoping” and “prioritization” sounded like overused corporate buzzwords or project management jargon. But the longer I’ve worked, the more I appreciate how crucial it is to first figure out what’s most important, and to plan upfront how to deliver that top priority first, before starting any deeper analytical work in earnest. Hopefully that’s a lesson which resonates with you, too.

If this post resonates in any way — or even if it doesn’t — feel free to drop me a line. I’d love to hear if this post has jogged any memories, or sparked questions I can answer, whether personally or in future posts.
