Unlock 2025 Benchmark data → Access insights to stay ahead in the digital experience race.

Get the Report
skip to content
Loading...
    • Why Product Analytics And what can it do for you?
    • How Heap Works A video guide
    • How Heap Compares Heap vs. competitors
    • Product Analytics + Digital Experience Analytics A deeper dive
    • The Future of Insights A comic book guide
    Watch a Demo
  • Data Insights

    • Journeys Visual maps of all user flows
    • Sense AI Analytics for everyone
    • Web Analytics Integrate key web metrics
    • Session Replay Complete context with a single click
    • Heatmaps Visualize user behavior instantly
    • Heap Illuminate Data science that pinpoints unknown friction

    Data Analysis

    • Segments User cohorts for actionable insights
    • Dashboards Share insights on critical metrics
    • Charts Analyze everything about your users
    • Playbooks Plug-and-play templates and analyses

    Data Foundation

    • Capture Automatic event tracking and apis
    • Mobile Track and analyze your users across devices
    • Enrichment Add context to your data
    • Integrations Connect bi-directionally to other tools

    Data Management

    • Governance Keep data clean and trusted
    • Security & Privacy Security and compliance made simple
    • Infrastructure How we build for scale
    • Heap Connect Send Heap data directly to your warehouse
  • Solutions

    • Funnel Optimization Improve conversion in user flows
    • Product Adoption Maximize adoption across your site
    • User Behavior Understand what your users do
    • Product Led Growth Manage PLG with data

    Industries

    • SaaS Easily improve acquisition, retention, and expansion
    • Retail and eComm Increase purchases and order value
    • Healthcare Build better digital patient experiences
    • Financial Services Raise share of wallet and LTV

    Heap For Teams

    • Product Teams Optimize product activation, conversion and retention
    • Marketing Teams Optimize acquisition performance and costs
    • Data Teams Optimize behavioral data without code
  • Pricing
  • Support

    • Heap University Video Tutorials
    • Help Center How to use Heap
    • Heap Plays Tactical how-to guides
    • Professional Services

    Resources

    • Down the Funnel Our complete blog and content library
    • Webinars & Events Events and webinar recordings
    • Press News from and about Heap
    • Careers Join us

    Ecosystem

    • Customer Community Join the conversation
    • Partners Technology and Solutions Partners
    • Developers
    • Customers Stories from over 9,000 successful companies
  • Free TrialRequest Demo
  • Log In
  • Free Trial
  • Request Demo
  • Log In

All Blogs

Data-Informed

Eliminating Data Silos, Part 1: Creating a Data Dictionary that Actually Works

December 11, 20238 min read
  • Facebook
  • Twitter
  • LinkedIn

Our recent market survey showed us that data silos remain a huge problem. If you're one of the 86% of teams with data silos, this blog series is for you! We'll be giving actionable advice on technical and cultural issues before assembling all these posts into a single reference ebook.

As our recent survey indicated, data silos remain a widespread problem. One way to reduce data silos is to build a data dictionary—a document that defines all the data points your organization collects and uses. Data dictionaries will help you make sure that everyone at your company uses key business terms consistently and that there’s a single measure (or set of measures) for them. The goal? To reduce confusion and keep your data reliable. 

Below, we’ll give some advice on how to create a data dictionary that actually does what it’s supposed to do, which is to keep all teams on the same page when it comes to data. Many companies fail at this, because they:

  • Don’t want to put the time in to make a data dictionary

  • Don’t get input and buy-in from key stakeholders across departments

  • Don't keep their data references up to date

We don't want you to have these problems, which is why we're giving you this guide.

First, what is a data dictionary?

Put simply, a data dictionary is a structured repository that gives teams a comprehensive description of your company’s data and how it’s used. Data dictionaries usually take the form of a table that starts with data terms, then adds the terms’ definitions, calculation formulas, and source information.

At Heap, we decided to call our data dictionary a Business Metrics Dictionary, because we realized it was most helpful (for us) to focus on high-level business terms (like market-qualified lead (MQL), annual recurring revenue (ARR), churn rate, renewal rate, etc.) rather than getting in the weeds with technical information.

Ultimately, you’ll want to build your data dictionary into a backend database. That will help determine your data structure, data elements, and validation rules. Just remember that you’ll need to get alignment across teams on your terms and formulas before building it. (Again, lack of alignment is the easiest way to make sure your data dictionary won’t work as it’s supposed to!)

What’s actually important when creating a data dictionary

When you’re building a data dictionary, you should always start by outlining all data points that are part of your organization’s regular operations. This ensures that the relevance, meaning, and quality of the data are the same for everyone across your company. 

After identifying key terms, you’ll need to list, define, and provide the source of each data point and add in any other details that might be necessary for teams to understand how to use it. At Heap, we found that the following elements were the most important to explain in our data dictionary, because they captured key pieces of information teams would regularly need to reference:

  • Business term. This is the term that teams will agree to use when referencing a certain data element. For example, if you’re referencing a timestamp for when an opportunity goes into negotiation phase, you can name this business term “Negotiation.” 

  • Term category. This is the category your data point falls into. Examples include sales, revenue, growth, retention, product, etc.

  • Definition. Use simple language and common words to define and describe what the data point is, when it’s used, and what teams need to know when referencing it. A CEO or brand-new employee should be able to read this definition and understand what the term means.

  • Formula. If applicable, this is your chance to show teams exactly how each data point is calculated. As with definitions, you’ll want to use as much plain English as possible when writing out formulas so that anyone can understand them.

  • Source used. This is where your data point comes from. For example, data might come from your own database, or from a third-party tool like Salesforce. 

  • Source name. This is the actual name of the column in your source/database. An example could be “Negotiation_Salesforce.”

  • Source owner. This is typically the team that owns or manages the data (e.g. sales, product, finance, customer success, etc.).

Entries in your data dictionary might then look like this: 

Business Term
Term category
Definition
Formula
Source used
Source name
Source owner

CAC (Customer Acquisition Cost)

Marketing

This acronym stands for “customer acquisition cost.” It measures the average amount of money our company spends on user acquisition.

CAC = Total cost (marketing + sales) divided by the number of new customers

internal_database_name

customer_acquisition_cost

Marketing

Negotiation

Sales

Timestamp to indicate that an opportunity has received verbal signoff and entered negotiation phase. 

N/A

Salesforce

Negotiation_Salesforce

Sales

Daily Active Users (DAU)

Product

This is the number of unique users who engage with your product in a one-day window.

DAU = [unique users] + [unique returning users] over the course of one day

Heap

daily_active_users

Product

You may also want to add columns in your data dictionary for data types, domain values, date created, date last updated, relationships to other data points, validation rules, who approved the data point, and other elements that might be important for your organization. 

How to build a data dictionary that actually works: 5 steps

Now that you know what a data dictionary entails, it’s time to start building one! This is where preparation and collaboration—two things many companies fail to take into account when starting this process—come into play. To ensure that your data dictionary becomes a go-to tool for all teams, do the following:

Step 1: Assemble a list of key terms

Time to complete: 2 weeks

First, create a Google Sheet. This will serve as a list of all business terms to include in your data dictionary. Consider building separate tabs that separate terms by department.

A big part of this step is getting input from relevant teams about what terms should be included and how they should be defined. This step is one of the most important for making a data dictionary that works. But it’s one where many companies drop the ball. (Why? Because it takes both time and extra cross-functional communication.)

At Heap, we realized it was essential to consult the following teams: 

  • Data

  • Engineering

  • Sales

  • Product

  • Marketing

  • Finance

  • Operations

While your needs might differ, building a list of terms will allow you to:

  1. Identify key terms you’ve seen used inconsistently across teams to describe actions, situations, or entities that your data describes. Examples might include total revenue vs. ARR (annual recurring revenue), gross sales vs. net sales, registrations vs. installations, etc. 

  2. Circulate your list among as many stakeholders as possible so that no terms are missed.

  3. Ensure each term is unique so there won’t be confusion with others.

Once we finished our business term list at Heap, we followed steps 2-5 (see below) to build out data dictionary entries in blocks of 15-30 terms per quarter. We did this until the project was complete. 

Breaking the work up like this ensures that when you reach the feedback step, team members can spend minimal time for review. Building your data dictionary in manageable blocks also gives colleagues an opportunity to add more business terms that need to be defined. 

Step 2: Define the terms

Time to complete: 3 weeks

Once you have all your terms, it’s time to define them! Using plain language, clearly and succinctly describe each term so its meaning won’t confuse the teams who will eventually be referencing it. Keep your wording conversational. Avoid jargon.  A brand-new employee should be able to read this definition and understand what the term means it.

Here are a few examples:

Term
Definition

ARR

Short for “Annual Recurring Revenue,” which refers to all types of recurring revenue projected over one year. This can include subscriptions, membership fees, or licensing fees.

DAY_ID

Timestamp to indicate the date an action started. 

MAU

Short for “Monthly Active Users,” which counts the number of users who have logged in to our web or mobile app each month.

When defining terms, don’t forget to include columns in your Google Sheet for any necessary source information that shows where the term came from and what team manages it.

Once all terms have been identified and defined it’s time to…

3. Write up formulas

Time to complete: 2 weeks

Does your data take the form of a string or a raw number, or does it need to be calculated with a formula? If it’s the latter, you’ll need to note the formula clearly and accurately, calling on other named data points as needed. Again, it’s best to incorporate plain English into these formulas if possible so that anyone can understand them later on. Here’s an example of how you might write out the formula for calculating customer lifetime value (CLTV):

CLTV = AOV (Average Order Value) × purchase frequency × customer lifespan

When figuring out your formulas, you should always ask yourself:

  • Where does this data come from, and what team owns it? 

  • What tool gathers the data?

  • How is it calculated, and what should the formula look like?

  • What other terms or formulas might this formula impact?

It’s best to finish this step before getting feedback from teams. Not only do formulas add extra clarification to the definition, but defining them ahead of feedback means stakeholders only have to review the document once vs. multiple times.  

4. Get feedback

Time to complete: 1-2 weeks

Once you’ve identified business terms, defined them, written up their formulas, and added any necessary source information, it’s time to get feedback. Ask key stakeholders to leave comments in your Google Sheet as they review your terms and definitions. You’ll want to get answers to the following questions:

  • Do these terms properly reflect the company’s data?

  • Will the names chosen conflict with any other data points that need tracking?

  • Have any terms been forgotten?

  • Are the definitions accurate?

  • Are the formulas accurate?

  • Have all relationships between terms been listed out?

During this phase, ask each stakeholder for a detailed review of each term and formula that pertains to them—they may be able to surface issues or use cases that you and other stakeholders haven’t thought of. If they come back with notes for improvement, work with them to finalize the formulas. Generally, you can give these stakeholders a week or so to make corrections or return notes to you. 

Remember: Once you’ve incorporated all feedback from all teams, get final signoff from leadership. This is a key step many companies forget! Whether you’re getting signoff from your CEO, your CTO, or some other department head who has final authority, it will solidify your new dictionary as a go-to source of truth for everything related to your company’s data. The terms you decide on should be the terms all teams across your company use going forward—no exceptions!

5. Maintain your data dictionary

While some teams continue to keep their data dictionaries in a simple spreadsheet, Heap uses Tableau. While any business intelligence tool that supports this function will work, just remember that linking your data dictionary to your actual data can help you save time if you need to make updates down the line.

Also, if any of your business terms change over time, make sure you update the terminology in your data dictionary while also alerting different departments and key stakeholders—including leadership—about the update. Similarly, if your company adds new business terms, you’ll want to add them to your data dictionary by following the process outlined above. 

----------------------------

Once you’ve built your data dictionary and your leadership has signed off on it as a source of truth, congrats! You’ve just taken a big first step in reducing data silos at your organization. This will have real impact on teams while also improving efficiency and encouraging collaboration across departments.

Was this helpful?
PreviousNext

Getting started is easy

Start today and elevate your analytics from reactive reporting to proactive insights. What are you waiting for?

Free Trial

Related Stories

See All

  • Creative visualization of AI CoPilot capability
    article

    Heap announces new generative AI CoPilot

    Heap, the leader in product analytics, unveils AI CoPilot’s open beta today.

  • Heap.io
    article

    What’s Next in Experience Analytics?

    What does the future of analytics hold, and what does it mean for you?

  • Heap.io
    article

    Building a Retention Strategy, Part 2: Connecting Activities to Revenue with a Metrics Tree

    If you read one post from this series, it should be this one.

Better insights. Faster.

Request Demo
  • Platform
  • Capture
  • Enrichment
  • Integrations
  • Governance
  • Security & Privacy
  • Infrastructure
  • Heap Illuminate
  • Segments
  • Charts
  • Dashboards
  • Playbooks
  • Use Cases
  • Funnel Optimization
  • Product Adoption
  • User Behavior
  • Product Led Growth
  • Customer 360
  • SaaS
  • Retail and eComm
  • Financial Services
  • Why Heap
  • Why Product Analytics
  • How Heap Works
  • How Heap Compares
  • ROI Calculator
  • The Future of Insights
  • Resources
  • Blog
  • Content Library
  • Events
  • Topics
  • Heap University
  • Community
  • Professional Services
  • Company
  • About
  • Partners
  • Press
  • Careers
  • Customers
  • DEI
  • Support
  • Request Demo
  • Help Center
  • Contact Us
  • Pricing
  • Social
    • Twitter
    • Facebook
    • LinkedIn
    • YouTube

© 2025 Heap Inc. All Rights Reserved.

  • Legal
  • Privacy Policy
  • Status
  • Trust