Very large database
A very large database (originally written very large data base), or VLDB,[1] is a database that contains so much data that it can require specialized architectural, management, processing and maintenance methodologies.[2][3][4][5]
Definition
The adjectives very and large are vague and allow for a broad and subjective interpretation, but attempts at defining a metric and threshold have been made. Early metrics were the size of the database in a canonical form via database normalization, or the time for a full database operation such as a backup. Technology improvements have continually changed what is considered very large.[6][7]
One definition has suggested that a database has become a VLDB when it is "too large to be maintained within the window of opportunity… the time when the database is quiet".[8]
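The window-based definition can be illustrated with a rough calculation. The following is a minimal sketch rather than a standard formula: the function name, the 200 MB/s throughput and the 6-hour quiet window are hypothetical values chosen only for illustration.

```python
def fits_maintenance_window(size_tb: float,
                            throughput_mb_per_s: float,
                            window_hours: float) -> bool:
    """Return True if a full copy of the database finishes inside the quiet window."""
    size_mb = size_tb * 1_000_000              # decimal units: 1 TB = 1,000,000 MB
    hours_needed = size_mb / throughput_mb_per_s / 3600
    return hours_needed <= window_hours


# Hypothetical figures: 200 MB/s sustained copy rate, 6-hour quiet window.
print(fits_maintenance_window(1, 200, 6))    # 1 TB needs about 1.4 hours
print(fits_maintenance_window(30, 200, 6))   # 30 TB needs about 41.7 hours
```

Under these assumed figures the 1 TB database fits the window and the 30 TB database does not, which is the sense in which the same size can be "very large" in one environment and unremarkable in another.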
Sizes of a VLDB
There is no absolute amount of data that can be cited. For example, one cannot say that any database with more than 1 TB of data is considered a VLDB. The threshold has varied over time as computer processing, storage and backup methods have become better able to handle larger amounts of data.[citation needed] That said, VLDB issues may start to appear as 1 TB is approached,[8][9] and are more than likely to have appeared once 30 TB or so is exceeded.[10]
VLDB Challenges
Key areas where a VLDB may present challenges include configuration, storage, performance, maintenance, administration, availability and server resources.[11]:11
Configuration
Careful configuration of databases that fall into the VLDB category is necessary to alleviate or reduce the issues they raise.[11]:36–53[12]
Administration
The complexities of managing a VLDB can increase exponentially for the database administrator as database size increases.[13]
Availability and maintenance
Maintenance and recovery operations such as database reorganizations and file copies, which were quite practical on a non-VLDB, can take very significant amounts of time and resources on a VLDB.[14]
Backup and Recovery

The traditional scenario was a server with a tape drive attached: the database was shut down overnight, copied to tape, and restarted when the backup completed.[citation needed] Maximum tape cartridge capacity increased from about 100 GB to 2.5 TB during this time.[citation needed] The time to write a full tape remained fairly constant at about 4 to 8 hours.[citation needed] So a database under 2.5 TB might not cause any VLDB issues from this standpoint, provided it was acceptable for the database to be offline for the backup period, and for a similar time if a recovery was needed, and provided operators could be found to manage the tapes.[citation needed]
In practice, requirements have evolved so that it is unacceptable for a database to be unavailable during a backup, and unacceptable for it to be unavailable for all but the shortest time should a problem occur. These requirements may be expressed as a recovery time objective (RTO) and a recovery point objective (RPO). If the size of the database is a key reason that RTO or RPO goals are judged unachievable, then a VLDB challenge has arisen.[citation needed] DBMSs have evolved, and techniques and strategies have been developed, to manage these requirements; however, if the size of the database means the native tools are insufficient, then alternative VLDB strategies may be needed.[citation needed]
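How database size feeds into RTO and RPO can be sketched as follows. This is a deliberately simplified model under stated assumptions: restore time is taken as size divided by a sustained restore rate (ignoring log replay and validation), and worst-case data loss is taken as the interval between recoverable points. The class, function names and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPlan:
    size_tb: float                # database size in decimal terabytes
    restore_mb_per_s: float       # assumed sustained restore throughput
    backup_interval_hours: float  # assumed time between recoverable points


def estimated_restore_hours(plan: RecoveryPlan) -> float:
    """Rough restore time: size divided by throughput, converted to hours."""
    return plan.size_tb * 1_000_000 / plan.restore_mb_per_s / 3600


def meets_objectives(plan: RecoveryPlan, rto_hours: float, rpo_hours: float) -> bool:
    """True if the estimated restore time fits the RTO and the interval fits the RPO."""
    within_rto = estimated_restore_hours(plan) <= rto_hours
    within_rpo = plan.backup_interval_hours <= rpo_hours
    return within_rto and within_rpo


small = RecoveryPlan(size_tb=0.5, restore_mb_per_s=500, backup_interval_hours=1)
large = RecoveryPlan(size_tb=20, restore_mb_per_s=500, backup_interval_hours=1)

print(meets_objectives(small, rto_hours=4, rpo_hours=1))  # restore takes about 0.3 hours
print(meets_objectives(large, rto_hours=4, rpo_hours=1))  # restore takes about 11.1 hours
```

With identical tooling and objectives, only the larger database misses the RTO, so in this model size alone is what turns an ordinary recovery plan into a VLDB challenge.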
Strategies that alleviate VLDB effects include online backups, redundant online copies of database files and backups, and standby databases, which may be geographically remote.[citation needed] Storage replication is another strategy.[citation needed]
Information Lifecycle
While some databases simply grow as data is retained, many are able to remove data, particularly time-dependent data, from the database, possibly to another database, during regular maintenance. Mass deletes of such information, such as removing three months of transactions, can require significant resources unless the data is partitioned so that a partition can be removed with minimal resource demand.[citation needed]
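The benefit of partitioning for this kind of purge can be shown with a toy in-memory model. It is only a sketch of the idea, not how any particular DBMS implements partitions; the class and method names are hypothetical, and real systems expose this through their own partition-management statements.

```python
from collections import defaultdict
from datetime import date


class PartitionedTable:
    """Toy table whose rows are grouped by the (year, month) of the transaction date."""

    def __init__(self) -> None:
        self.partitions = defaultdict(list)

    def insert(self, txn_date: date, payload: str) -> None:
        self.partitions[(txn_date.year, txn_date.month)].append((txn_date, payload))

    def drop_partition(self, year: int, month: int) -> int:
        """Discard a whole month in one step; returns the number of rows removed."""
        return len(self.partitions.pop((year, month), []))

    def row_count(self) -> int:
        return sum(len(rows) for rows in self.partitions.values())


table = PartitionedTable()
for month in range(1, 7):          # January to June 2018
    for day in range(1, 11):       # ten transactions per month
        table.insert(date(2018, month, day), f"txn-{month}-{day}")

# Remove three months of transactions without examining individual rows.
removed = sum(table.drop_partition(2018, m) for m in (1, 2, 3))
print(removed, table.row_count())  # prints: 30 30
```

Each drop discards a month's container as a unit, whereas an unpartitioned mass delete would have to locate and remove every qualifying row.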
Scalability
A good database design should normally mean there is little impact on performance as the size of the database increases. This will not be true if an access program needs to read all of a large, high-growth structure to find the information needed, rather than reaching that information through an index or another technique that avoids accessing the whole table. If that were unavoidable, then a VLDB challenge might be considered to occur. Otherwise, most performance issues may be related to the number of concurrent database accessors rather than the size of the database.[citation needed]
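The difference between reading a whole table and reaching rows through an index can be observed with Python's built-in sqlite3 module. SQLite is used here only because it is easy to run and is not a VLDB platform; the table, column and index names are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params=()) -> list:
    """Return the human-readable plan steps SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the plan detail


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10_000)],
)

sql = "SELECT amount FROM orders WHERE customer_id = ?"

# Without an index, SQLite has to scan every row of the table.
plan_without_index = query_plan(conn, sql, (42,))

# With an index on the filter column, it can search directly for matching rows.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_with_index = query_plan(conn, sql, (42,))

print(plan_without_index)
print(plan_with_index)
```

The first plan reports a SCAN of the table and the second a SEARCH using the index; the cost of the scan grows with table size, while the indexed search grows far more slowly.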
Relationship to big data
VLDB is not the same as big data; however, the storage aspect of big data may involve a VLDB.[2] That said, some of the storage solutions supporting big data were designed from the start to support large volumes of data, so database administrators do not encounter the VLDB issues that older versions of traditional RDBMSs might encounter.[citation needed]
References
- ↑ "Oracle Database Online Documentation 11g Release 1 (11.1) / Database Administration Database Concepts". oracle. 18 Very Large Databases (VLDB). Retrieved 3 October 2018.
- ↑ 2.0 2.1 "Very Large Database (VLDB)". Techopedia. Archived from the original on 4 July 2018. Retrieved 3 October 2018.
- ↑ Gaines, R. S. and R. Gammill. Very Large Data Bases: An Emerging Research Area, Informal working paper, RAND Corporation
- ↑ Data Processing Magazine. North American Publishing Company. 1964. pp. 18, 58.
- ↑ Reference named "Widlake": no citation text was provided.
- ↑ Sidley, Edgar H. (1 April 1980). Encyclopedia of Computer Science and Technology: Volume 14 - Very Large Data Base Systems to Zero-Memory and Markov Information Source. CRC Press. pp. 1–18. ISBN 9780824722142.
- ↑ Gerritsen, Rob; Morgan, Howard; Zisman, Michael (June 1977). "On some metrics for databases or what is a very large database?". ACM SIGMOD Record. 9 (1): 50–74. doi:10.1145/984382.984393. ISSN 0163-5808.
- ↑ 8.0 8.1 Rankins, Ray; Jensen, Paul; Bertucci, Paul (18 December 2002). "21". Microsoft SQL Server 2000 (2nd ed.). SAMS. ISBN 978-0672324673. Administering Very Large SQL Server Databases.
- ↑ "Oracle Database Release 18 - VLDB and Partitioning Guide". Oracle. 1 Introduction to Very Large Databases. Archived from the original on 3 October 2018. Retrieved 3 October 2018.
- ↑ "The Very Large Database Problem - How to Backup & Recover 30–100 TB Databases" (PDF). actifio. Archived (PDF) from the original on 19 February 2018.
- ↑ 11.0 11.1 Hussain, Syed Jaffer (2014). "Tuning & Applying Best Practices On Very Large Databases (VLDB)" (PDF). Sangam: AIOUG. Archived (PDF) from the original on 4 October 2018.
- ↑ Chaves, Warner (7 January 2015). "Top 10 Must-Do Items for your SQL Server Very Large Database". SQLTURBO. Archived from the original on 13 December 2015. Retrieved 5 October 2018.
- ↑ Furman, Dimitri (22 January 2018). Rajesh Setlem, Mike Weiner, Xiaochen Wu (SQL Server Customer Advisory Team), eds. "SQL Server VLDB in Azure: DBA Tasks Made Simple". MSDN. Archived from the original on 6 October 2018. Retrieved 6 October 2018.
- ↑ "Specialized Requirements for Relational Data Warehouse Servers". Red Brick Systems, Inc. 21 June 1996. Archived from the original on 10 October 1997.
This article "Very large database" is from Wikipedia. The list of its authors can be seen in its historical and/or the page Edithistory:Very large database. Articles copied from Draft Namespace on Wikipedia could be seen on the Draft Namespace of Wikipedia and not main one.
