So, from what body of water do you pull your data? A rushing river of constantly changing data? A vast ocean of data from every imaginable source? Or a local lake of specific data?
Before we dive in
Creating and maintaining a data strategy at an institution can leave leaders feeling like they are barely keeping their heads above water. What they need to know is that they can achieve accurate, consistent, time-sensitive, and actionable reporting by structuring the flow of data.
I am lucky enough to talk with IT and functional-area leaders from all over the country about their data reporting strategies. They frequently describe a tidal wave of issues, both negative and positive, that impacts their strategic initiatives.
Between complex budgets, staff turnover, technical limitations, and a simple fear of the unknown, these leaders are tasked with making data-driven decisions that will set the stage for years and decades to come. The pressure is there to guide the institution’s data strategy to the point where everyone can report accurately and be effective.
The great thing is that the data is already there. It may not be readily accessible or organized, but it is there – waiting to be harnessed and used!
Fully understanding each of your data intake points is, of course, critical. Centralized or decentralized reporting factors into who “owns” the data, but in the bigger picture, the data can be leveraged and aggregated in a way that makes the overall data strategy of the institution effective.
Let’s dive in!
With all the tools, applications, and other sources (internal and external) that support and run the institution, data can be found drifting among various rivers, oceans, and lakes. The question is which of these bodies of water is best for not only collecting and navigating your data, but which is the best fit for your future data strategies?
Streams and Rivers
In my former life as a registrar, juggling the flow of data felt like paddling upstream – especially during registration periods. From Admissions to Financial Aid, Registration to Graduation, and everything in between, the quantity of data points that make up a ‘student’ is enormous. Having multiple tools like a learning management system (LMS), a degree audit system, and a student information system are essential for any institution. But combining them to be in sync and useful is another challenge altogether.
The pace at which data arrives can fluctuate just as some rivers and streams flow at different rates. “What-if” degree audits (run by students and advisors) tend to peak during course registration periods, whereas an LMS is often steady throughout each semester. On-premises and cloud solutions have their own pace. Thus, timing is everything with these streams of data.
The rate at which data is refreshed is equally, if not more, important. A data point that enters the river at a given time could be downstream, miles away, by the time a report is run on it. It’s vital to know when and how the snapshots of data are taken.
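One way to make snapshot timing explicit is to stamp each extract with the moment it was captured, so every downstream report can state exactly which version of the data it reflects. A minimal sketch in Python (the `take_snapshot` function and field names are illustrative, not from any particular system):

```python
from datetime import datetime, timezone

def take_snapshot(records):
    """Freeze a copy of the data along with the exact moment it was captured."""
    return {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        # Copy the records so later upstream changes don't silently alter the snapshot.
        "records": list(records),
    }

# A report built from the snapshot can always cite its capture time.
enrollments = [{"student_id": "S001", "course": "BIO101"}]
snap = take_snapshot(enrollments)
print(f"Report reflects data as of {snap['captured_at']}")
```

The point is less the code than the discipline: a report that names its capture time can never be quietly contradicted by data that has since floated miles downstream.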
“There is so much data!”
Have you ever sat back and wondered how to even begin to make sense of it all? Student data is a vast ocean of information. In fact, by the time you have finished reading this blog, more data points have probably been created!
An ocean of data means there is information as far as the eye can see! This does not happen overnight. Not only do reports and dashboards pile up over time, but data sources may arrive unexpectedly (e.g., an application gets added in Advancement; Financial Aid adopts a scholarship tool; or the local government agency sends Excel spreadsheets to be disseminated). Where does it end?
A little oceanography
We have a children’s science museum where I live. One of the exhibits is devoted to the ocean and all the factors that make it a sustainable environment. In one of the hands-on stations, the goal is to create “equilibrium” in the ocean by adjusting levers that represent different environmental factors. Temperature, oxygen/carbon dioxide levels, and animal and plant life are some of the options one can adjust to see the positive or negative effect on the ocean. Too much of one factor, or not enough of another, will throw the entire ocean into disarray.
Equilibrium of your ocean of data – like that of the actual ocean – is a monumental task. One pull of the lever here or there can be costly and disruptive. However, you can start to shrink that ocean when you use those levers to reduce the amount of old and unusable data. Keep only the sources of data that mean something, that are accurate, and that add value to those data-driven decisions.
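Shrinking the ocean can start with something as simple as scoring each data source against a freshness threshold and retiring the ones that no longer earn their place. A hypothetical sketch (the source inventory and cutoff date are invented for illustration):

```python
from datetime import date

# Hypothetical inventory of data sources, with the date each was last refreshed
# and how many live reports actually depend on it.
sources = [
    {"name": "SIS nightly extract", "last_refreshed": date(2024, 5, 1), "used_by_reports": 12},
    {"name": "Legacy scholarship tool", "last_refreshed": date(2019, 8, 15), "used_by_reports": 0},
    {"name": "LMS activity feed", "last_refreshed": date(2024, 4, 28), "used_by_reports": 5},
]

def keep_source(src, cutoff, min_reports=1):
    """Keep only sources that are both fresh and actually feeding reports."""
    return src["last_refreshed"] >= cutoff and src["used_by_reports"] >= min_reports

active = [s for s in sources if keep_source(s, cutoff=date(2023, 1, 1))]
for s in active:
    print(s["name"])
```

Here the stale, unused scholarship tool drops out of the inventory, leaving only the sources that are accurate and adding value to data-driven decisions.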
Lakes
The last body of water we’ll examine, the lake, is likely the best model for where higher education data can live and flourish.
Lakes may be formed by nature or created by humans. Their characteristics (pH level, temperature, aquatic life) are often specific to their geographic region and/or climate. For manmade lakes formed by dams, water levels can also be adjusted as needed.
Like lakes of water – and unlike oceans – data lakes tend to be more specific. They don’t need to contain every data point imaginable. Just the ones you know you need. And, in the same way lakes can be regulated by dams, an institution can control the level at which data is taken in and structured. (All the sources and specific points therein can be coordinated to flow into the lake at certain times and coordinated to be used in a manageable way.)
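That dam-style control over when each source flows into the lake can be written down as an explicit ingestion schedule. A hypothetical sketch, with all source names and cadences invented for illustration:

```python
# Hypothetical ingestion schedule: each source "flows into the lake" on its own
# cadence, the way a dam releases water at controlled times.
ingestion_schedule = {
    "student_information_system": {"frequency": "nightly", "window": "02:00-04:00"},
    "degree_audit": {"frequency": "hourly", "note": "peaks during registration"},
    "lms": {"frequency": "daily", "window": "05:00"},
    "advancement_crm": {"frequency": "weekly"},
}

for source, plan in ingestion_schedule.items():
    print(f"{source}: {plan['frequency']}")
```

Even a simple table like this makes the lake's intake deliberate rather than accidental: everyone can see when each source arrives and plan reporting around it.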
Think about your data being held in a lake-style body of water this way:
It’s a controlled ecosystem that aligns similar data points across multiple data sources, making the data navigable and easier for IT, users, and institutional leaders to turn into actionable decisions with total confidence.
Let’s break it down
There are common data points among the various sources used by an institution. For example, student enrollment, student ID, and cohort identification can all exist in some form, but may be identified and defined differently. A major challenge for IT leaders has always been how to make sense of these common data points. They hear users from different departments refer to a ‘student,’ yet see those users relying on five different data points to define what they mean.
A data lake can put everybody in the same boat, so to speak. It can ensure everyone is using identical, consistent, and agreed upon data. The data lake approach offers clear boundaries (or shorelines, if you will) and environments that are more manageable. The depth of each data source can be identified and controlled. Knowledge sharing amongst data owners can promote deeper collaboration and faster problem solving.
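The “five definitions of a student” problem is typically solved by agreeing on one canonical record and mapping each source’s identifier onto it. A hedged sketch of that idea, using made-up field names for three systems:

```python
# Each source system names the student identifier differently
# (these field names are invented for illustration).
ID_FIELD_BY_SOURCE = {
    "sis": "student_id",
    "lms": "user_id",
    "degree_audit": "emplid",
}

def to_canonical(source, record):
    """Normalize a record from any source into a shared 'student' shape."""
    key = ID_FIELD_BY_SOURCE[source]
    canonical = {"canonical_student_id": record[key], "source": source}
    # Carry the remaining fields along unchanged.
    canonical.update({k: v for k, v in record.items() if k != key})
    return canonical

rows = [
    to_canonical("sis", {"student_id": "S001", "enrolled": True}),
    to_canonical("lms", {"user_id": "S001", "last_login": "2024-05-01"}),
]
# Both rows now share the same canonical_student_id, so they can be joined confidently.
```

Once every department’s records pass through the same mapping, “student” means one thing everywhere, which is exactly the agreed-upon consistency the lake is meant to provide.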
The question isn’t whether your college or university should possess a river, an ocean, or a lake of data. It’s unavoidable, and quite necessary, to have all of them. The question is, “Which one fits best into a data strategy and can be used for key decision making?” You want this data to live, breathe, and thrive in an ecosystem that is unique, ambitious, and in service of the mission and goals of the institution. It is a data lake, then, that best serves the institution in this manner.