I've been a data analyst for 15 years. I am, by now, more than a data analyst, but it is still the heart of what I'm about.
Coincidentally, that puts the start of my analyst career right about at the launch date of one of the most significant analytical tools since the abacus: Excel 97. Excel 97 was an amazing product in its day. If you go back in time, pivot tables were just a shade below pure magic. And, with the introduction of Access as a part of Office, Normals actually had a desktop-based SQL engine.
And VBA! Oh my god. I learned to program in VBA, made functional in Excel 97. Business users could, for the first time, automate the things that they previously had to spend their own time doing. Considering that our economy is founded on increasing employee productivity, giving Normal knowledge workers the ability to automate their own tasks is pretty darn significant.
The flow would go something like this: collect data in Access >> export to Excel >> run pivot tables and more >> make charts >> paste into PowerPoint >> print >> impress boss. In 1997, this was groundbreaking stuff. And the fact that it all worked together (reasonably successfully if not pleasantly) was a triumph of product integration.
Office 97 unlocked the possibility, for the first time ever, of doing legitimate data analysis at price points that your boss would actually pay for. And price is a critical factor in innovation—I don't care that SPSS has always been more sophisticated than Excel, as almost no one actually has access to it (and even if they did, it's well beyond the capability of Normals).
But Excel hasn't really changed since 97. VBA has been extended to account for the "Internet". Row limits have been increased. But otherwise, every meaningful user-facing feature has been UI-focused. Take a look at the Excel Wikipedia page and go through the version updates: nothing significant.
The reason for this is pretty straightforward. In 1997, Excel hit the end of where it was going to go as a technology and a product. It provided all of the tools a business analyst needed to do the things that a business analyst needed to do.
But as Excel, my old friend, has aged. It is has developed smile lines. Crow's feet. While it solves its problem—discrete business data analysis—very well, it does not do a good job with the larger problem: continuous business data analysis. Excel is good at answering a single question at a single point in time, but it completely, totally fails when it comes to ongoing process and decision making.
- Every Excel worksheet is a standalone entity. It doesn't have any relationship to the core data systems that run an organization: accounting, sales, point-of-sale, etc. So every time you export data to be analyzed in Excel, it is outdated by the time you open up the file. And you have to start over (a.k.a. "Save As...") in a month when you have to run the report again.
- There is no single point of truth. I was a data analyst at General Electric for a summer, and the experience was terrible. You want some type of data? Ask 8 different people to find out who has "the spreadsheet" and have that person email it to you. Repeat for every additional type of data you need. If you ever want updates, email all of those people again. If you actually want the data from disparate sources to relate to each other, you had better have a god to pray to. Additionally, any given metric could be defined a number of different ways. Is my "revenue" the same thing as your "revenue"? Often not. This makes it impossible to tell the derivation of any given metric without traversing through an entire spreadsheet.
- There is no standardized "way to construct a spreadsheet". So trying to read someone else's workbook is basically impossible. This creates incredible amounts of job security for sub-par analysts who never get fired because if they did, no one could ever take over their spreadsheets.
I need to stop the list there or I'll never finish writing this post and you'll never finish reading it. Essentially, Excel was too powerful. Users, desperate for solutions, used it as a tool for operational—continuous—problems as opposed to discrete decision analyses. And it failed utterly, costing businesses billions and billions of dollars in lost employee productivity and lost insight. At Deloitte, where I was a technology implementation consultant for five years, our job was to replace overgrown spreadsheets. Businesses would use spreadsheets to operationalize process, and when the process ground to a halt they would call us in to custom develop software to replace the functionality. We were paid a lot of money to do this.
Fifteen years later, we finally have a real attempt to solve this problem. Not some customized Cognos installation that has a "web front end": a real, productized, way of solving operational data analysis. GoodData, Birst, Bime, and RJMetrics are multi-tenant (cloud/SaaS) solutions that all have a single goal in mind: kill operational spreadsheets.
Each of these tools provides extract-transform-load capabilities to get all your data into a single repository and keep it current without having to email 8 people every month. They all provide a single point of truth, including a complete explanation for how that truth is derived. And, they solve the "bad analyst" problem in a different, somewhat unexpected, way. If there is an innate separation of data and analysis (which is something that all data warehouses force on you), it's very easy to rip up any given analysis and start from scratch with the raw data. Beautiful.
On February 4th, I'm joining RJMetrics as their Director of Marketing. I was an early customer while at Squarespace in 2010 and have been a champion ever since. I'm excited about participating in this industry in its very early stages, working with an incredibly talented team, and building groundbreaking software.
I wish I had access to tools like this when I was 17 and first learning how to use VLOOKUP(). I want to build them for all of the 17-year-olds out there now.