What is a silo? – Definition from Techopedia

What does silo (data silo) mean?

An IT silo is an isolated point in a system where data is separated (intentionally or accidentally) from other parts of an organization’s information and communication technology (ICT) architecture.

A classic example of a silo is a relational database that stores customer addresses. If internal security policies prevent this information from being shared with the organization’s marketing team, for example, the database can be referred to as an information silo. When this happens, the organization may face the following obstacles:

  • Different business divisions will create multiple copies of the same data.
  • Employees will make decisions based on inconsistent or incomplete data.
  • C-level personnel will struggle to get an accurate overview of the organization’s data.
  • Mid-level managers will struggle to quickly locate and access data for specific business initiatives.

Data silos often arise in large organizations because departmental units have their own business priorities. Silos can be created on purpose (using air gaps to protect sensitive information, for example), but they can also be created by people who want to protect their own turf within an organization. This practice, sometimes called knowledge hoarding, can be especially damaging in organizations that don’t value transparency of information.

Techopedia explains the silo (data silo)

Many IT experts talk about the limitations and negative impact of information silos.

Importance of minimizing data silos

Suppose an organization wants to know whether a new product will fit its current marketing strategy and customer base, but its marketing statistics and customer information are stored in separate data silos. It will have to take a metaphorical hammer to these silos and break them down to bring the data together. Here’s how:

1. Decide what data is needed to solve a business problem.

Consolidating and cleaning data to produce business intelligence is no small feat, which is why the whole process should be guided by the questions that management wants to answer. Write down those specific questions, then decide what data will be needed to answer them.

2. Understand the location of the data to extract.

Perform a database audit to identify exactly what data the organization is already collecting. For each database, understand the following:

  • Where is this database?
  • What are the main characteristics, inputs and outputs of this database?
  • What data is captured?
  • Which of the data points recorded here can answer your business questions?
  • What is the best way to extract this information from the database?
  • How can this data be combined with other sources to create better context and analysis?
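
The audit checklist above can be recorded in a small inventory, so that each business question can be traced to the databases able to answer it. The following is a minimal sketch only: the database names, locations and field names are hypothetical, and a real audit would capture far more detail.

```python
from dataclasses import dataclass


@dataclass
class DatabaseAuditEntry:
    """Answers to the audit checklist for a single database."""
    name: str                  # hypothetical identifier for the database
    location: str              # where the database lives
    captured_fields: set[str]  # data points the database records
    extraction_method: str     # e.g. "SQL export", "CSV dump"


def sources_for_question(required_fields, inventory):
    """Return, for each required field, the databases that capture it."""
    return {
        name: sorted(e.name for e in inventory if name in e.captured_fields)
        for name in sorted(required_fields)
    }


# Hypothetical inventory produced by the audit.
inventory = [
    DatabaseAuditEntry("crm", "sales team server",
                       {"customer_id", "postcode", "segment"}, "SQL export"),
    DatabaseAuditEntry("marketing_stats", "vendor cloud",
                       {"customer_id", "campaign", "clicks"}, "CSV dump"),
]

# Fields assumed necessary to answer one business question.
coverage = sources_for_question(
    {"customer_id", "segment", "clicks", "churn_date"}, inventory
)
for field_name, databases in coverage.items():
    print(field_name, "->", databases or "not captured anywhere")
```

A field that appears in more than one database (here `customer_id`) is a candidate join key, while a field with no source shows a gap that consolidation alone will not fix.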

3. Consolidate data physically or virtually into a central repository.

Create dataflows to collect data from disparate databases and data warehouses. Consider using a data lakehouse architecture to support both structured and unstructured data.

The idea is to create a combined data set that contains all the key information in one place. This may involve mapping individual data fields together, understanding the context of each data field, and defining data elements that present the data in a logical and consistent way.
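
As a rough illustration of field mapping, the sketch below renames the fields of two sources to a shared schema and merges their records on a common key. The source field names, the shared schema and the `customer_id` key are all assumptions made for the example, not part of any particular product.

```python
# Hypothetical field maps: source field name -> shared field name.
CRM_FIELD_MAP = {"cust_no": "customer_id", "zip": "postcode", "tier": "segment"}
MARKETING_FIELD_MAP = {"CustomerID": "customer_id", "Campaign": "campaign",
                       "Clicks": "clicks"}


def rename_fields(record, field_map):
    """Translate one source record into the shared schema; unmapped fields are dropped."""
    return {field_map[k]: v for k, v in record.items() if k in field_map}


def consolidate(sources, key="customer_id"):
    """Merge records from several (records, field_map) pairs into one row per key."""
    combined = {}
    for records, field_map in sources:
        for record in records:
            row = rename_fields(record, field_map)
            combined.setdefault(row[key], {}).update(row)
    return combined


# Hypothetical sample records from two silos.
crm_rows = [{"cust_no": 1, "zip": "90210", "tier": "gold", "internal_note": "x"}]
marketing_rows = [{"CustomerID": 1, "Campaign": "spring", "Clicks": 12}]

combined = consolidate([(crm_rows, CRM_FIELD_MAP),
                        (marketing_rows, MARKETING_FIELD_MAP)])
print(combined[1])
```

This sketch keeps one row per key, so a later record for the same customer overwrites earlier values for the same field. A source with many rows per customer would need to be aggregated first or stored in a separate table.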

4. Preprocess the data to consolidate and clean it.

Once the silos are broken down, administrators will need to clean the data and check its quality. Initially, the data is likely to be “noisy”. It may have incorrect or missing information and other irregularities that need to be smoothed out. This part of the process is vital for data integrity because end users need to trust the data if they want to use it to make data-driven business decisions. Be careful when cleaning data, however. It is important not to discard outliers or smooth away meaningful trends, as these can often provide valuable information.
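
One way to clean data without losing outliers is to flag suspicious rows rather than delete them. The sketch below removes duplicates, marks missing values and marks outliers using the median absolute deviation. The field names and the threshold of five deviations are assumptions chosen for illustration.

```python
from statistics import median


def clean_rows(rows, key="customer_id", numeric_field="clicks"):
    """Deduplicate by key, flag missing values, and flag (not drop) outliers."""
    # Keep the first row seen for each key; rows without a key are dropped.
    deduped = {}
    for row in rows:
        if row.get(key) is not None:
            deduped.setdefault(row[key], dict(row))
    cleaned = list(deduped.values())

    values = [r[numeric_field] for r in cleaned if r.get(numeric_field) is not None]
    med = median(values) if values else 0
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(v - med) for v in values) if values else 0

    for row in cleaned:
        value = row.get(numeric_field)
        row["missing"] = value is None
        # Assumed threshold: more than 5 deviations from the median.
        row["outlier"] = value is not None and mad > 0 and abs(value - med) > 5 * mad
    return cleaned


# Hypothetical noisy input.
raw = [
    {"customer_id": 1, "clicks": 10},
    {"customer_id": 1, "clicks": 10},    # duplicate
    {"customer_id": 2, "clicks": 12},
    {"customer_id": 3, "clicks": None},  # missing value
    {"customer_id": 4, "clicks": 11},
    {"customer_id": 5, "clicks": 400},   # outlier, kept and flagged
]
cleaned = clean_rows(raw)
for row in cleaned:
    print(row)
```

Because flagged rows stay in the data set, an analyst can still decide whether the unusual value is an error or a genuine signal.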

5. Turn data into actionable business intelligence.

With data silos removed, end users can use the integrated and cleansed data to power reporting and business intelligence tools and be more confident in making business decisions. This translates to more efficiency, less waste, happier stakeholders and an improved bottom line.
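
As a small illustration of what a reporting tool might compute from the consolidated data, the sketch below averages one numeric field per group. The `segment` and `clicks` fields and the sample rows are hypothetical.

```python
from collections import defaultdict


def average_by(rows, group_field, value_field):
    """Average value_field per group_field, skipping rows where either is missing."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum, count]
    for row in rows:
        group, value = row.get(group_field), row.get(value_field)
        if group is None or value is None:
            continue
        totals[group][0] += value
        totals[group][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


# Hypothetical rows from the integrated, cleaned data set.
rows = [
    {"segment": "gold", "clicks": 12},
    {"segment": "gold", "clicks": 8},
    {"segment": "silver", "clicks": 3},
    {"segment": "silver", "clicks": None},
]
report = average_by(rows, "segment", "clicks")
print(report)
```

A question such as "which customer segment responds most to marketing?" can only be answered this way once customer and marketing data sit in the same data set.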