Database

A computer database is a structured collection of records or data that is stored in a computer system so that a computer program or person using a query language can consult it to answer queries[1]. The records retrieved in answer to queries are information that can be used to make decisions. The computer program used to manage and query a database is known as a database management system (DBMS). The properties and design of database systems are included in the study of information science.

A typical query could be to answer questions such as, "How many hamburgers with two or more beef patties were sold in the month of March in New Jersey?". To answer such a question, the database would have to store information about hamburgers sold, including number of patties, sales date, and the region. The term "database" originated within the computing discipline. Although its meaning has been broadened by popular use, even to include non-electronic databases, this article is about computer databases. Database-like collections of information existed well before the Industrial Revolution in the form of ledgers, sales receipts and other business-related collections of data.
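The hamburger question above can be sketched as a query against a small in-memory database. This is a minimal illustration using Python's standard sqlite3 module; the table name hamburger_sales, its columns, and the sample rows are assumptions made up for the example and do not come from any real system.

```python
import sqlite3

# Hypothetical table and column names, chosen only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE hamburger_sales (
        sale_id   INTEGER PRIMARY KEY,
        patties   INTEGER NOT NULL,   -- number of beef patties
        sale_date TEXT    NOT NULL,   -- ISO 8601 date, e.g. '2007-03-14'
        region    TEXT    NOT NULL    -- e.g. 'New Jersey'
    )
    """
)
conn.executemany(
    "INSERT INTO hamburger_sales (patties, sale_date, region) VALUES (?, ?, ?)",
    [
        (1, "2007-03-02", "New Jersey"),  # only one patty
        (2, "2007-03-14", "New Jersey"),
        (3, "2007-03-30", "New Jersey"),
        (2, "2007-04-01", "New Jersey"),  # sold in April, not March
        (2, "2007-03-14", "New York"),    # sold in a different region
    ],
)


def count_multi_patty_sales(conn, region, year_month):
    """Count sales with two or more patties in one region and one month."""
    row = conn.execute(
        """
        SELECT COUNT(*) FROM hamburger_sales
        WHERE patties >= 2
          AND region = ?
          AND strftime('%Y-%m', sale_date) = ?
        """,
        (region, year_month),
    ).fetchone()
    return row[0]


march_nj = count_multi_patty_sales(conn, "New Jersey", "2007-03")
print(march_nj)
```

The query can only be answered because each stored record carries the patty count, the sale date, and the region, which is the point the paragraph makes.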

The central concept of a database is that of a collection of records, or pieces of information. Typically, for a given database, there is a structural description of the type of facts held in that database: this description is known as a schema. The schema describes the objects that are represented in the database, and the relationships among them. There are a number of different ways of organizing a schema, that is, of modeling the database structure: these are known as database models (or data models). The model in most common use today is the relational model, which in layman's terms represents all information in the form of multiple related tables each consisting of rows and columns (the formal definition uses mathematical terminology). This model represents relationships by the use of values common to more than one table. Other models such as the hierarchical model and the network model use a more explicit representation of relationships.
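As a rough sketch of the relational idea described above, the following example declares a two-table schema and relates the tables through a value they have in common. It uses Python's standard sqlite3 module; the tables region and sale and their columns are hypothetical names invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema: a structural description of the facts held in the database.
# Both tables are hypothetical and exist only for this example.
conn.executescript(
    """
    CREATE TABLE region (
        region_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE sale (
        sale_id   INTEGER PRIMARY KEY,
        region_id INTEGER NOT NULL REFERENCES region(region_id),
        patties   INTEGER NOT NULL
    );
    INSERT INTO region VALUES (1, 'New Jersey'), (2, 'New York');
    INSERT INTO sale   VALUES (10, 1, 2), (11, 1, 1), (12, 2, 3);
    """
)

# The relationship between a sale and its region is not stored as a pointer;
# it is expressed by the region_id value that appears in both tables.
sales_per_region = conn.execute(
    """
    SELECT region.name, COUNT(*)
    FROM sale
    JOIN region ON sale.region_id = region.region_id
    GROUP BY region.name
    ORDER BY region.name
    """
).fetchall()

# SQLite keeps the schema itself in a catalog table that can be queried.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

print(sales_per_region)
print(table_names)
```

In a hierarchical or network system the link between a sale and its region would instead be an explicit structural connection between the records.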

The term database refers to the collection of related records, and the software should be referred to as the database management system or DBMS. When the context is ambiguous, however, many database administrators and programmers use the term database to cover both meanings.

Many professionals consider a collection of data to constitute a database only if it has certain properties: for example, if the data is managed to ensure its integrity and quality, if it allows shared access by a community of users, if it has a schema, or if it supports a query language. However, there is no definition of these properties that is universally agreed upon.

Database management systems are usually categorized according to the data model that they support: relational, object-relational, network, and so on. The data model will tend to determine the query languages that are available to access the database. A great deal of the internal engineering of a DBMS, however, is independent of the data model, and is concerned with managing factors such as performance, concurrency, integrity, and recovery from hardware failures. In these areas there are large differences between products.

History

The earliest known use of the term data base was in November 1963, when the System Development Corporation sponsored a symposium under the title Development and Management of a Computer-centered Data Base[2]. Database as a single word became common in Europe in the early 1970s and by the end of the decade it was being used in major American newspapers.

The first database management systems were developed in the 1960s. A pioneer in the field was Charles Bachman. Bachman's early papers show that his aim was to make more effective use of the new direct access storage devices becoming available: until then, data processing had been based on punched cards and magnetic tape, so that serial processing was the dominant activity. Two key data models arose at this time: CODASYL developed the network model based on Bachman's ideas, and (apparently independently) the hierarchical model was used in a system developed by North American Rockwell, later adopted by IBM as the cornerstone of their IMS product. While IMS along with the CODASYL IDMS were the big, high visibility databases developed in the 1960s, several others were also born in that decade, some of which have a significant installed base today. Two worthy of mention are the PICK and MUMPS databases, with the former developed originally as an operating system with an embedded database and the latter as a programming language and database for the development of data-based software.

The relational model was proposed by E. F. Codd in 1970. He criticized existing models for confusing the abstract description of information structure with descriptions of physical access mechanisms. For a long while, however, the relational model remained of academic interest only. While network model products (such as the CODASYL-based IDMS) and hierarchical model products (such as IMS) were conceived as practical engineering solutions taking account of the technology as it existed at the time, the relational model took a much more theoretical perspective, arguing (correctly) that hardware and software technology would catch up in time. Among the first implementations were Michael Stonebraker's Ingres at Berkeley, and the System R project at IBM. Both of these were research prototypes, announced during 1976. The first commercial products, Oracle and DB2, did not appear until around 1980. The first successful database product for microcomputers was dBASE for the CP/M and PC-DOS/MS-DOS operating systems.

During the 1980s, research activity focused on distributed database systems and database machines, but these developments had little effect on the market. Another important theoretical idea was the , but apart from some specialized applications in genetics, molecular biology, and fraud investigation, the world took little notice.

In the 1990s, attention shifted to object-oriented databases. These had some success in fields where it was necessary to handle more complex data than relational systems could easily cope with, such as spatial databases, engineering data (including software engineering repositories), and multimedia data. Some of these ideas were adopted by the relational vendors, who integrated new features into their products as a result. The 1990s also saw the spread of Open Source databases, such as PostgreSQL and MySQL.

In the 2000s, the fashionable area for innovation is the XML database. As with object databases, this has spawned a new collection of start-up companies, but at the same time the key ideas are being integrated into the established relational products. XML databases aim to remove the traditional divide between documents and data, allowing all of an organization's information resources to be held in one place, whether they are highly structured or not.

Database models

  Transactions in a DBMS are generally expected to satisfy the ACID properties:

  • Atomicity: Either all the tasks in a transaction must be done, or none of them. The transaction must be completed, or else it must be undone (rolled back).
  • Consistency: Every transaction must preserve the integrity constraints — the declared consistency rules — of the database. It cannot place the data in a contradictory state.
  • Isolation: Two simultaneous transactions cannot interfere with one another. Intermediate results within a transaction are not visible to other transactions.
  • Durability: Completed transactions cannot be aborted later or their results discarded. They must persist through (for instance) restarts of the DBMS after crashes.

  In practice, many DBMSs allow most of these rules to be selectively relaxed for better performance.
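Atomicity and consistency can be illustrated with a small transfer between two accounts. The sketch below uses Python's standard sqlite3 module; the account table, the names alice and bob, and the transfer helper are all assumptions made up for the example. It relies on the module's documented behavior that using a connection as a context manager commits on success and rolls back if an exception is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK clause is the declared consistency rule.
conn.execute(
    "CREATE TABLE account ("
    "  name TEXT PRIMARY KEY,"
    "  balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would violate the CHECK constraint, so the credit that
        # was already applied inside this transaction is undone as well.
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer(conn, "alice", "bob", 30)       # fits within alice's balance
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
print(ok, failed, balances(conn))
```

The second transfer credits bob before the debit fails, yet neither change survives, which is the all-or-nothing behavior the atomicity rule describes.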

    Concurrency control is a method used to ensure that transactions are executed in a safe manner and follow the ACID rules. The DBMS must be able to ensure that only serializable, recoverable schedules are allowed, and that no actions of committed transactions are lost while undoing aborted transactions.

    Replication

    Replication of databases is closely related to transactions. If a database can log its individual actions, it is possible to create a duplicate of the data in real time. The duplicate can be used to improve performance or availability of the whole database system. Common replication concepts include:

    • Master/slave replication: All write requests are performed on the master and then replicated to the slaves.
    • Quorum: The results of read and write requests are determined by querying a majority of replicas.
    • Multimaster: Two or more replicas synchronize with each other via a transaction identifier.
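The master/slave idea can be sketched in a few lines: the master records every write in a log, and each slave replays the entries it has not yet seen. This is a deliberately simplified, single-process illustration in Python; the Master and Replica classes are hypothetical, and a real DBMS would ship its log over a network and handle failures.

```python
class Replica:
    """A slave that rebuilds its data by replaying the master's log."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries applied so far

    def apply(self, log):
        # Replay only the entries this replica has not seen yet.
        for key, value in log[self.applied:]:
            self.data[key] = value
        self.applied = len(log)


class Master:
    """Accepts all writes and keeps an ordered log of them."""

    def __init__(self, slaves):
        self.data = {}
        self.log = []
        self.slaves = slaves

    def write(self, key, value):
        self.log.append((key, value))  # log the action first
        self.data[key] = value

    def replicate(self):
        for slave in self.slaves:
            slave.apply(self.log)


slaves = [Replica(), Replica()]
master = Master(slaves)

master.write("x", 1)
master.write("y", 2)
stale = dict(slaves[0].data)  # nothing has been replicated yet

master.replicate()
master.write("x", 3)
master.replicate()

print(stale, slaves[0].data, slaves[1].data)
```

Because the slaves apply the same log entries in the same order, they converge on the master's state, although a slave read between replication rounds may return stale data.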

    Parallel synchronous replication of databases enables transactions to be replicated on multiple servers simultaneously, which provides a method for backup and security as well as data availability.

    Security

    Database security denotes the system, processes, and procedures that protect a database from unintended activity.

    In the United Kingdom, legislation protecting the public from unauthorized disclosure of personal information held on databases falls under the Office of the Information Commissioner. United Kingdom-based organizations holding personal data in electronic format (databases, for example) are required to register with the Information Commissioner.

    Locking

    Locking is the act of placing a lock (an access restriction) on a part of a database that is being modified at a given instant. Such locks can be applied at the row level, or at other levels such as an entire table. This helps maintain the integrity of the data by ensuring that only one user at a time can modify the data. Databases can also be locked for other reasons, such as access restrictions for given levels of user.
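Row-level locking can be approximated in ordinary Python with one lock per row. The sketch below is only an analogy, assuming threads stand in for concurrent users; the RowLockTable class and the row names are hypothetical, and a real DBMS uses far more elaborate lock managers.

```python
import threading
from collections import defaultdict


class RowLockTable:
    """A toy table in which each row is protected by its own lock."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self._locks = defaultdict(threading.Lock)  # one lock per row key
        self._locks_guard = threading.Lock()       # protects the lock table

    def _lock_for(self, key):
        with self._locks_guard:
            return self._locks[key]

    def update(self, key, fn):
        # Only one thread at a time may modify a given row; updates to
        # other rows are not blocked by this lock.
        with self._lock_for(key):
            self.rows[key] = fn(self.rows[key])


table = RowLockTable({"stock": 0, "orders": 0})


def worker():
    for _ in range(1000):
        table.update("stock", lambda value: value + 1)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(table.rows)
```

A table-level lock would correspond to a single lock shared by every row, which is simpler but prevents unrelated rows from being modified at the same time.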

    Databases are also locked during routine maintenance, which prevents changes from being made while the maintenance is in progress.

    Applications of databases

    Databases are used in many applications, spanning virtually the entire range of computer software. Databases are the preferred method of storage for large multiuser applications, where coordination between many users is needed. Even individual users find them convenient, and many electronic mail programs and personal organizers are based on standard database technology. Software database drivers are available for most database platforms so that application software can use a common Application Programming Interface to retrieve the information stored in a database. Two commonly used database APIs are JDBC and ODBC.

    Database development platforms

    Notes

    1. ^ What is a Database?. The University of Queensland, Australia.
    2. ^ Swanson, Kenneth (1963-11-08). Development and Management of a Computer-Centered Database. dtic.mil. Retrieved on 2007-07-20.
    3. ^ S. Lightstone, T. Teorey, T. Nadeau, Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more, Morgan Kaufmann Press, 2007.

    References

    • S. Lightstone, T. Teorey, T. Nadeau, Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more, Morgan Kaufmann Press, 2007.
    • T. Teorey, S. Lightstone, T. Nadeau, Database Modeling & Design: Logical Design, 4th edition, Morgan Kaufmann Press, 2005.
    • C. J. Date, An Introduction to Database Systems, Eighth Edition, Addison Wesley, 2003.
    • J. Gray, A. Reuter, Transaction Processing: Concepts and Techniques, 1st edition, Morgan Kaufmann Publishers, 1992.
    • David M. Kroenke, Database Processing: Fundamentals, Design, and Implementation (1997), Prentice-Hall, Inc., pages 130-144
    • J. Shih, "Why Synchronous Parallel Transaction Replication is Hard, But Inevitable?", white paper, 2007.
    • Galindo, J., Urrutia, A., Piattini, M., Fuzzy Databases: Modeling, Design and Implementation ( guide). Idea Group Publishing Hershey, USA, 2006.
    • Galindo, J. (Editor). Handbook on Fuzzy Information Processing in Databases. Information Science Reference (an imprint of Idea Group Inc.), Hershey, USA, 2008.
