Understanding Where You Need Big Data vs Small Data

Series 1: Episode 14 | 8 June 2022


When we are at a very early stage of the development process, it is very difficult to get big data.

In today’s episode of the Circuit Breaker Show, we take a look at Big Data to show you how to develop better products.

  • You’ll learn why Big Data is not the answer, but only part of the answer.
  • Bob will talk about design of experiments.
  • You will discover why it is very important to look at the market in context.
  • Bob will share the projects they have worked on to help with Big Data.

Join us for this riveting discussion.

Enjoy!

 

What You’ll Learn in this Show:

  • The impact of Big Data on innovation.
  • Where you can leverage Big Data.
  • Why over-reliance on Big Data is detrimental.
  • The importance of clustering data rather than segmenting it.
  • And so much more…

 

Resources:

Todd Rose | The End of Average

Thomas Kuhn | The Structure of Scientific Revolutions


Understanding Where You Need Big Data vs Small Data – transcript

 

Greg Engle
Hey Bob. Today we’re going to talk about something where I think you’re going to lead the conversation a lot and I’m going to follow, but that’s OK. The first question is: how do you think big data has affected the way people look at innovation?

 

Bob Moesta

What’s interesting to me is how big data has caused people to be more reliant on correlation than on causation. They’re trying to find “trends” in the big data where they can see the underlying thing, the one thing that causes something to happen. They’re looking for the one trigger, the common element across all these different things. What you find is that as people get big data, they feel, ‘As long as I have the common thread in the big data, that’s the thing I’ve got to do more of.’

 

Greg Engle

So would it be fair to say big data is showing us the effect, but not the cause?

 

Bob Moesta

That’s correct. But they assume that if they know the effect, they understand the cause, or that the way they’ve asked the question, or the data they’ve collected, implies what the cause is. But it’s not the cause. There isn’t one cause and one effect of anything; it’s sets of causes that produce an effect. Part of it is to understand what those sets or sequences of things are that make it happen. Often, we have data that is aggregated and then analyzed, as opposed to analyzed for how it works and then aggregated. There’s a big difference between the two.

 

People say if a million people used it, then it might be the answer. The reality is that the most useful information comes from being very narrow and very deep, connecting the causation, and then being able to go to the big data and see how it works. I know that I haven’t been able to extract causation from big data; I have to do it almost on an N of one basis. Clay and I used to talk about this notion of one: let’s do one interview and understand what caused somebody to do this, then we can connect those dots and look to another person.

 

Once we see that causation, we can go to the big data and look for those connections. People are often trying to make connections where there aren’t any, because of correlation. At the same time, they’re not actually measuring the right things, or they’re not separating the data by time delay: the cause happens, and the effect happens later. For example, they’ll look at everything on Tuesday and ask what happens on Tuesday, but I might have the cause on Tuesday and the effect doesn’t happen till Wednesday, so they don’t really see it. The way they look at data is very confounded that way, and they’re literally trying to compress everything.
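To make the Tuesday/Wednesday point concrete, here is a minimal sketch, not anything from the show. It assumes a made-up daily series in which the effect simply shows up one day after the cause, and compares the same-day correlation with the one-day-lagged correlation. The function names (`pearson`, `lagged_correlation`, `best_lag`) and all the numbers are illustrative assumptions.

```python
# Illustrative sketch only: the numbers below are made up, not data from the show.
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def lagged_correlation(cause, effect, lag):
    """Correlate the cause on day t with the effect on day t + lag."""
    if lag == 0:
        return pearson(cause, effect)
    return pearson(cause[:-lag], effect[lag:])


def best_lag(cause, effect, max_lag):
    """Return the lag (in days) with the highest correlation."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_correlation(cause, effect, lag))


# Hypothetical daily counts: the effect is the cause shifted by one day.
cause = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
effect = [0] + cause[:-1]

same_day = lagged_correlation(cause, effect, 0)
next_day = lagged_correlation(cause, effect, 1)
print(f"same-day correlation: {same_day:.2f}")
print(f"next-day correlation: {next_day:.2f}")
print(f"best lag in days: {best_lag(cause, effect, 3)}")
```

In this toy series, looking only at what happens on the same day hides a relationship that is obvious once the delay is accounted for. Real data would need a causal reason for choosing a lag, which is the part the interviews supply.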

 

Greg Engle

Correct me if I’m wrong, but it seems like they’re trying to push big data further and further up the pipeline of innovation.

 

Bob Moesta

Yes, and the belief is that you need more and more data to make decisions. The fact is, what you need is better data, not more data.

 

Greg Engle 

Yeah, or the size of data sometimes matters and sometimes does not, it depends on what you’re trying to accomplish.

 

Bob Moesta

In most cases, that’s when people want size: sizing of a market, or ‘I need several samples to be confident that this is working.’ And my thing is, we do need that near the end and as we launch, but when we’re very early in the development process, it’s hard to get big data. Asking people who are comfortable with big data to move upstream is like asking somebody who’s used to playing in a symphony to be a soloist. It’s this notion of how you play with small data to get to an understanding that then lets you look at the big data in a very different way.

 

Greg Engle 

I want to make sure we’re not saying big data is bad?

 

Bob Moesta

No. But it’s very hard to find the insights you need to figure out what the causation is, and in a lot of cases, the anomalies in big data tell me more than the big data or the average of anything. I’m a very big proponent of the book The End of Average by Todd Rose. He talks about how this very useful tool of the average has helped us through the years, but it’s also distorted the way we see the world in so many ways that it’s actually causing problems.

 

Greg Engle 

It’s one of those things where it’s still a matter of knowing what tool you need, and when to use it and how to use it. A hammer is a great tool.

 

Bob Moesta

That’s right. But a hammer doesn’t work in every situation.

Greg Engle 

But if I’m trying to drive a screw, a hammer is probably not the right tool.

Bob Moesta

The other part, to me, is that big data is being used as an excuse: ‘I don’t have enough data to know.’ People don’t know how to be scrappy enough to generate the data they need to figure things out. Something I learned very early in my career was design of experiments, from Dr. Taguchi and R. A. Fisher and people like that. The whole point of learning that was to be able to do very few things and learn a lot from them. Taguchi would always say the most important thing for an engineer is to figure out how to generate the least amount of information that helps him make the most confident discoveries and direction. We don’t have unlimited time and money to do something. So, how do I use it to generate very efficient technical information to make better decisions?

The notion is that there’s way more unknown than known, and big data almost assumes we have everything: it’s all in here, so what do we need to do? And we don’t know what we’re missing. Data comes in at different levels. They can talk about your heart rate, they can talk about your blood pressure, but at some point there’s a whole other level: what are your enzymes? What are your testosterone levels? And all these other things that have an impact on all of those. It’s as if they try to isolate them as individual things as opposed to looking at them as sets of things. I believe that most everything is set theory, but the fact is, I don’t need a big set to understand how things work.
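As a rough illustration of "do very few things and learn a lot," here is a small sketch of a four-run orthogonal array (the standard L4 layout for three two-level factors) used in place of all eight combinations. The response function and its coefficients are invented for the example, and the sketch assumes the factors do not interact; it is not a description of any experiment Bob ran.

```python
# Illustrative sketch only: the response function and its numbers are made up.
from itertools import combinations

# L4 orthogonal array: three two-level factors (A, B, C) covered in 4 runs
# instead of the 2**3 = 8 runs a full factorial would need.
L4 = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]


def is_balanced(array):
    """Check that every pair of columns contains each level combination exactly once.

    This exact check only applies to a 4-run, two-level array like L4.
    """
    n_factors = len(array[0])
    for i, j in combinations(range(n_factors), 2):
        pairs = sorted((run[i], run[j]) for run in array)
        if pairs != [(0, 0), (0, 1), (1, 0), (1, 1)]:
            return False
    return True


def hypothetical_response(a, b, c):
    """Made-up measurement: A helps, B does nothing, C hurts, no interactions."""
    return 10 + 4 * a + 0 * b - 2 * c


def main_effects(array, responses):
    """Average response at the high level minus the low level, per factor."""
    effects = []
    for f in range(len(array[0])):
        high = [y for run, y in zip(array, responses) if run[f] == 1]
        low = [y for run, y in zip(array, responses) if run[f] == 0]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects


responses = [hypothetical_response(*run) for run in L4]
effects = main_effects(L4, responses)
print("balanced:", is_balanced(L4))
print("estimated main effects (A, B, C):", effects)
```

Because the array is balanced, four runs are enough to separate the three main effects in this toy case. If the factors did interact, a design this small would confound those interactions with the main effects, so the no-interaction assumption matters.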

 

Greg Engle

What is set theory in your mind?  

 

Bob Moesta 

Sets. Again, people confuse root cause with root causes. They’ll say there’s one root cause: ‘The reason I bought that car is it had great gas mileage.’ In most cases, that has virtually nothing to do with why you bought the car. It has to do with the fact that your old car had 170,000 miles on it and had two major repairs in the last month. People try to narrow it down to one thing that caused you to do this. Then they’re like, ‘Oh, we’ve got to get gas mileage,’ and the reality is that’s not the case. We play with small data to help us scale to large data, but we don’t use large data to get to the insights. The insights come from understanding individuals and the causal sets of what happened, which then help us see it in the big data.

 

Greg Engle

When we’re coaching teams, when do we usually reach for big data?

 

Bob Moesta 

So, in a lot of cases, we’ll end up using it once we’ve found the jobs, and we can see that when people did this, and this, and that, that’s when they’re in this job, but when they’re in this other job, they do something different. For example, at Basecamp, one of the jobs was ‘help me think it through,’ which is for a project that’s in the early stages. As they started to think about the project, what the tasks were, who was going to be involved, and everything, they would build a project, create a bunch of tasks and never assign them, then invite a bunch of people to comment on it. There’d be a bunch of back and forth in the comments, then it would move into an execution part.

 

Primarily, the tasks were created to help them think it through. Somebody else in a different job would create all the tasks and might invite people, but those people only uploaded things. Once we understood the underlying causal mechanisms from the jobs side (this was like 10, 12, 13 interviews), we could go into the data, see the behavior, and extract, ‘Oh, these people are in this job, and those people are in that job.’ We use big data, in some cases, to help us confirm, or at least we investigate the big data to find these patterns. My belief is it’s very hard to find these patterns without doing the qualitative causal work in between. I go back to that N of one. Clay would always say, ‘New theories always come from anomalies.’ If we treat everything as an anomaly and understand how they cluster together and aggregate up, it’s easier than trying to segment things down.
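The Basecamp example can be sketched as a simple rule applied to usage data once the interviews have revealed the behavior patterns. Everything in this sketch is hypothetical: the `Project` fields, the `infer_job` function, the label for the second job, and the sample records are assumptions made for illustration, not Basecamp's actual data model or analysis.

```python
# Illustrative sketch only: field names, labels, and records are hypothetical.
from dataclasses import dataclass


@dataclass
class Project:
    tasks_created: int
    tasks_assigned: int
    comments_by_invitees: int
    uploads_by_invitees: int


def infer_job(project):
    """Tag a project with a job, using behavior patterns learned from interviews."""
    # Tasks exist but were never assigned, and invitees discuss in comments.
    if (project.tasks_created > 0
            and project.tasks_assigned == 0
            and project.comments_by_invitees > 0):
        return "help me think it through"
    # Tasks exist, and invitees only upload things without commenting.
    if (project.tasks_created > 0
            and project.uploads_by_invitees > 0
            and project.comments_by_invitees == 0):
        return "different job (invitees only upload)"
    # Anything else is a candidate anomaly worth another interview.
    return "unclassified"


projects = [
    Project(tasks_created=12, tasks_assigned=0, comments_by_invitees=30, uploads_by_invitees=1),
    Project(tasks_created=8, tasks_assigned=8, comments_by_invitees=0, uploads_by_invitees=15),
    Project(tasks_created=0, tasks_assigned=0, comments_by_invitees=2, uploads_by_invitees=0),
]

labels = [infer_job(p) for p in projects]
for label in labels:
    print(label)
```

The rules only exist because a handful of interviews explained why people behave this way. The projects that land in "unclassified" are the ones to go and ask about next.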

 

Greg Engle 

We try to tell people: know what you’re using and why you’re using it. Big data is no different. There are times you can use jobs to be done and it’s right, and there are times you can use jobs to be done and it might be wrong. With big data, some people will say, having heard what you just said, ‘I’ll never use it in the upfront work,’ which is not necessarily correct, because it depends on what I’m looking for. For example, when you were doing the milkshake stuff, you looked at data to find the anomaly. You found the milkshake machine being used in the morning, which wasn’t allowed, but people were doing it, and you asked why. It was to say, ‘Hey, how do we go find out why they’re doing this?’

 

Bob Moesta 

The interesting part to me is how most people look at anomalies. There’s a book by Thomas Kuhn, The Structure of Scientific Revolutions. He talks about how people have seen data, and they see the anomaly, and scientists will say, ‘That data point’s not good because of this, and this, and this,’ and they throw the data out. What you learn when you talk to people who are innovators is that they look at anomalies as the source of new knowledge they don’t have yet.

 

Greg Engle

Yeah, I always need to know what I’m looking for. If we’re in upfront ideation and we’re looking for a place to go investigate, I might use big data to find those things. If I’m in the middle, and I’ve done jobs to be done or I’ve done segmentation, and I want to figure out how people behave, I might use big data then. And at the very end, I might use big data to help size the market.

 

Bob Moesta 

Sizing is about prioritization. Where should I go? What should I do? 

 

Greg Engle 

I think a lot of times we think of sizing the market differently than other people do. This obsession with sizing the market is somewhat funny to me, because people size it in a vacuum and don’t size it in context. The important thing is sizing in context, not just ‘there’s a billion people in the world, 75% of them have to eat food, therefore this business is worth $15 billion.’

 

Bob Moesta 

That’s right. I’ve worked with people who say, ‘I need gen pop people to see this concept and tell me what they’re going to do with it. I need 40% to be a seven or eight for us to launch.’ To be honest, it always feels like an insurance policy from the Church of Finance, put through marketing, to say, ‘Is this a good bet or not?’ If you look at the history of tools like that, it’s almost always worse than you think. Sometimes you’ll launch and it says it’s going to be a $20 million business and it’s a $100 million business, and you’re still screwed, because you don’t have any of the infrastructure to do it. Other times you think it’s a $100 million business and it’s a $10 million business.

 

Greg Engle 

That’s where the overreliance on big data comes in: we ask it for an answer it can’t possibly give us.

It’s hard.

 

Bob Moesta 

The excuse we constantly get is, ‘We know it’s not accurate, but it’s better than anything else we have, so we’re going to use it.’ Given the amount of money they spend on it, if they understood how to get to the underlying causation and could see the patterns in the big data, they could be way more efficient and effective, because it’s not just who; it’s who, when, where, and why. People say, ‘Oh, we’ve got a lot of data.’ I’m like, ‘I want to see the anomalies.’ They’re like, ‘Why do you want to see that?’ I’m like, ‘Because anomalies are where it all starts.’ One of the anomalies we found was: who are the people who shouldn’t be using Basecamp? We have a doctor’s practice using it, we have architects using it, we have not-for-profits using it. They would say, ‘We know which agencies and which software companies use it, but we have no idea why these other people are using it.’ The ability to make it grow was coming from the anomalies, not from being better at the core. I think that’s the other thing: most people think, ‘If I make it better for the best customers, it’s going to make it better.’ The reality is, often it makes it worse, because you may alienate the people who aren’t the best customers, because they can’t do it.

 

Greg Engle

I want to just point out again, we’re not saying big data is bad. We’re saying know what you’re using it for. Understand its limitations. Know how to supplement it. Don’t over-rely on anything. I would say the same thing about anything we do: you can’t over-rely on anything. It’s a balance of a lot of different things that helps us find a way.

 

Bob Moesta 

It’s this notion that you need different perspectives, and big data is one of many perspectives you need to have on any kind of problem or thing you’re looking at. The reality is we overemphasize it because it’s a large sample, it’s statistically significant, and we can say that this has an impact. When you’re trying to innovate, there’s a lot more that goes into it than just having big data tell you that you have the answer.

 

Greg Engle

Are you working on anything to help with big data? Or to help with sizing of markets? Using some of the stuff we do?

 

Bob Moesta 

My happy place is when I have like a 20-gigabyte data file of everything about something and I’m like, ‘OK, I’m going to go, I’ll be back in a day.’ I usually call it playing with the data: looking at it from all these different perspectives. When I’ve been able to do that with people, it’s been very fun. What I’m realizing is that most people know how to segment things, which is: how do you tear things apart, or how do you reduce it into parts? The reality is, most people don’t understand clustering, which is how you take things and put them together. Taking a set of data and clustering it as opposed to segmenting it: most people think it’s the same thing. You have what we call emergent properties, where you put two things together that don’t necessarily always go together, and when you put them together, you have a new property that didn’t exist before. I’m working on some data analysis that starts from the qualitative side of jobs and then uses the variables we find, in terms of the causation that’s there, to help people understand, one, how to size, and two, what the underlying things are that we need to do in this market or in that market.

 

Greg Engle 

We’re prototyping that on a very small scale with people. We don’t know if other people want it or not. We’re just doing it internally to figure out what we can do with it, and maybe work with some people doing it. But that is one of the things we are working on: how do we help size? How do we help put people in context? That’s the big thing: how do you put people who are taking these surveys into context?

 

Bob Moesta 

You and I did this when we built houses. We would take the MLS and look at what houses sold in our area that were on the used market. Then we tried to understand what we could take from it: how long was it on the market? What was going on around it? Who bought it? The underlying premise when we went into the homebuilding market was that anybody who built a new home never looked at used. Within the first month, we dispelled all that. Then we said, ‘How do we compete more?’ We wanted to be better than a used home and almost at the same price point. So we used all that data to tell us what to build and where to build in a very systematic way. And we were able to follow the market as it went up and down and see what we needed to adjust.

 

Greg Engle

Yeah, the builders will often say, ‘Why would anybody buy a used toothbrush?’

 

Bob Moesta 

That’s right. The interesting part is that the one thing we did find over time that made a big difference was the plate height, the height of the ceiling. All the used homes in the area were eight-foot plate, and we made everything 10-foot plate. So when we went to compete, we were roughly at the same price point as a used house, but we had the 10-foot plate, the rooms looked so much bigger, and it cost virtually nothing to do. It’s one of those things where we could see what that was in the data, but we also understood it by interviewing people and asking, ‘Why did you buy this and not used?’ There were some underlying things that we ended up adding to what we looked at, and it was very useful. I mean, we went from, what, 150 or 300 homes to 1,000 homes in a little over three years. We did great.

 

Greg Engle 

Yeah. So again, I think we’re prototyping that kind of stuff. It’s not anything out in the market now. We don’t know if people want it or not yet. We’re just playing with what we can do: what can we do with jobs that’s more quantitative, if you will?

 

Bob Moesta 

Yep. I’m wrestling in this space of traits and how people describe traits. You’ve got this trait, or I’ve got that trait, but how does it connect to jobs? There are a lot of people who can tell you what your Myers-Briggs is, and if you have this Myers-Briggs, you’re more likely to do this versus that. What I’m finding is that context has more impact. What context you’re in is going to have more impact on how you make that decision of what to do than your Myers-Briggs does. I’m wrestling with trait theory and being able to understand: what are the effects? What are the causes? And, ultimately, how do people make decisions?

 

Greg Engle  

We just had this happen to us, right? I’m an S in DISC, which is about getting along, steady relationships, and stuff. But in the context of trying to do these podcasts, I can’t be that way with you, because then we’re going to go in 15 billion different directions at a time, so I have to be way more direct. The context of what we’re trying to do dictates what I’m going to do. The job I want done is to have a podcast that is understandable, a podcast that resonates with people, all those different things, and I have to be more direct with you than I would normally be. So the context changes the way I behave in those situations.

 

Bob Moesta 

That’s right. Big data would say, ‘No, Greg is an introvert and an S. He wants to make sure everybody gets along.’ But in some cases, you know you need to step out of that S view and actually be more D.

 

Greg Engle  

So it changes. That’s what we’re trying to figure out with the quantitative stuff: how do we get people in the right mindset? Because that’s what we want to know. We don’t want to know what they would do if the moon and the sun and the stars all aligned. We don’t care. We want to know: when the moon is here and the stars are here, what do you do?

 

Bob Moesta 

That’s exactly right. Most people have a lot of data on who or when or where, but they don’t put them together. It’s the who, when, where, and why together that helps us see the right patterns of what’s there. One of the other things is that most people assume symmetry, meaning it’s like, this is a push, this is a pull. Nothing is as symmetrical as we think, and we have to realize that there are a lot of assumptions baked in when we assume symmetry. That might have gone a little too far.

 

Greg Engle

I think it’s fine. We get to make the decision whether we publish or not.

 

Bob Moesta 

I know, but at the same time, this was a cold podcast. We’ll see how it turns out. But I feel like this is one of those things where you need big data, but you also need little data. I feel like most people don’t know how to deal with little data. And little data is as important as big data.

 

Greg Engle 

I think the real moral of this podcast is: know what tool you’re pulling out, know why you’re doing it, and get the right metrics for how it’s going to be measured. That leads us to our homework. Again, this is a difficult one. As you’re pulling out any tool in the next week, we need you to define it: What is it? What does it tell you? Why are you using it? That’s the big one: why are you using it? And how do you measure its success? Because that tells you whether you’re using the right thing or the wrong thing in the right situations. As always, we hope you enjoyed it, and that you’ll tune in next time.

 
