
Data catalog

From EverybodyWiki Bios & Wiki





A data catalog is a library or inventory of data assets in which data is indexed, organized and ready for use. It contains data along with its metadata, which makes the data easy to discover, understand, consume and govern. Data catalogs create an inventory of data assets across data lakes, databases and other sources.[1]

According to Gartner, “a data catalog creates and maintains an inventory of data assets through the discovery, description and organization of distributed datasets. The data catalog provides context to enable data stewards, data/business analysts, data engineers, data scientists and other line of business (LOB) data consumers to find and understand relevant datasets for the purpose of extracting business value.”[2]

A data catalog is an important part of modern data management practices. It combines the capabilities of a data dictionary and a metadata repository, and also makes the data itself easily searchable and visible to its users.[3] Modern data catalog tools also make it easier to manage data governance.[4]

Concept

The concept of data catalogs arose as an answer to the most pressing challenges in data management, described by Gartner as finding and identifying data that delivers value, and supporting data governance and data security.[5] The volume and veracity of data in companies are increasing, but this growth is outpacing organizations' ability to derive value from it.[6]

Comparison of data catalog tools

The following tools are listed alphabetically by company name.

Tool          Company name   License
Alation       Alation        Proprietary
Atlan         Atlan          Proprietary
AWS Glue      AWS            Proprietary
Collibra      Collibra       Proprietary
Informatica   Informatica    Proprietary
Azure         Microsoft      Proprietary
Talend        Talend         Proprietary
Kylo          Teradata       Apache license
Unifi         Unifi          Proprietary

References

  1. Henschen, Doug (2019-08-28). "Constellation ShortList™ Data Cataloging". Constellation Research Inc. Retrieved 2020-01-10.
  2. "Data Catalogs Are the New Black in Data Management and Analytics". Gartner. Retrieved 2020-01-09.
  3. Knight, Michelle (2017-12-28). "What is a Data Catalog?". DATAVERSITY. Retrieved 2020-01-09.
  4. "Data Cataloging Comes of Age". Transforming Data with Intelligence. 2018-09-17. Retrieved 2020-01-10.
  5. "Survey Analysis: Data Management Is Pressed Between Support for Analytics — and Data Governance, Risk and Compliance". Gartner. Retrieved 2020-01-09.
  6. "The Forrester Wave™: Machine Learning Data Catalogs, Q2 2018". reprints.forrester.com. Retrieved 2020-01-10.


This article "Data catalog" is from Wikipedia. The list of its authors can be seen in its historical and/or the page Edithistory:Data catalog. Articles copied from Draft Namespace on Wikipedia could be seen on the Draft Namespace of Wikipedia and not main one.