Saturday, March 7, 2009

SORT

What is Data Structure?

1. For me, data structure is the study in relation to computer science or world, on how computer store or keep data from hard copy to soft copy. Also, how to gather data from the computer and bring it in real world to make it useful. There are many ways on how to store data in a computer just like using stalks, queues, arrays, lists and trees. These are examples of data structure. Each of these data structure has there unique steps in storing data.


2. A means of storing a collection of data. Computer science is in part the study of methods for effectively using a computer to solve problems, or in other words, determining exactly the problem to be solved. This process entails (1) gaining an understanding of the problem; (2) translating vague descriptions, goals, and contradictory requests, and often-unstated desires, into a precisely formulated conceptual solution; and (3) implementing the solution with a computer program. This solution typically consists of two parts: algorithms and data structures.

Source: Sci-Tech Encyclopedia


3. The physical layout of data. Data fields, memo fields, fixed length fields, variable length fields, records, word processing documents, spreadsheets, data files, database files and indexes are all examples of data structures.

Source: Computer Desktop Encyclopedia


4. Way in which data are stored for efficient search and retrieval. The simplest data structure is the one-dimensional (linear) array, in which stored elements are numbered with consecutive integers and these numbers access contents. Data items stored nonconsecutively in memory may be linked by pointers (memory addresses stored with items to indicate where the "next" item or items in the structure are located). Many algorithms have been developed for sorting data efficiently; these apply to structures residing in main memory and also to structures that constitute information systems and databases.

Source: Britannica Concise Encyclopedia


5. A data structure in computer science is a way of storing data in a computer so that it can be used efficiently. It is an organization of mathematical and logical concepts of data. Often a carefully chosen data structure will allow the most efficient algorithm to be used. The choice of the data structure often begins from the choice of an abstract data type. A well-designed data structure allows a variety of critical operations to be performed, using as few resources, both execution time and memory space, as possible. Data structures are implemented by a programming language as data types and the references and operations they provide.

Source: Wikipedia


6. A data structure is a specialized format for organizing and storing data. General data structure types include the array, the file, the record, the table, the tree, and so on. Any data structure is designed to organize data to suit a specific purpose so that it can be accessed and worked with in appropriate ways. In computer programming, a data structure may be selected or designed to store data for the purpose of working on it with various algorithms.

Source: SQLServer.com


7. A data structure is a way of organizing data that considers not only the items stored, but also their relationship to each other. Advance knowledge about the relationship between data items allows designing of efficient algorithms for the manipulation of data.

Source: www.geocities.com/hemanthb2010/datastructure.doc


8. A data structure in computer science is a way of storing data in a computer so that it can be used efficiently. It is an organization of mathematical and logical concepts of data.

Source: commons.wikimedia.org


9. A data structure in computer science is a way of storing data in a computer so that it can be used efficiently. It is an organization of mathematical and logical concepts of data. Often a carefully chosen data structure will allow the most efficient algorithm to be used. The choice of the data structure often begins from the choice of an abstract data type. A well-designed data structure allows a variety of critical operations to be performed, using as few resources, both execution time and memory space, as possible. Data structures are implemented by a programming language as data types and the references and operations they provide.

Source: Wikipedia


10. Data structures are containers that contain objects of data types. There are several common data structures, and each has its own behaviour and application. Common data structures are: arrays of one or more dimensions, stacks, linked lists (singly and doubly linked), queues, and trees (balanced, binary, and so on). There are others. Understanding data structures helps the student understand most of them, how they behave, and when to use which of them.

Source: http://www.cdacmumbai.in/index.php/cdacmumbai/education/fpgdst/fpgdst_topics_concepts

TYPES OF DATA STRUCTURES:


a. HASH TREE - In cryptography and computer science Hash trees or Merkle trees are a type of data structure which contains a tree of summary information about a larger piece of data – for instance a file – used to verify its contents. Hash trees are an extension of hash lists, which in turn is an extension of hashing. Hash trees where the underlying hash function is Tiger are often called Tiger trees or Tiger tree hashes.

How does it work? :

* A hash tree is a tree of hashes in which the leaves are hashes of data blocks in, for instance, a file or set of files. Nodes further up in the tree are the hashes of their respective children. Most hash tree implementations are binary (two child nodes under each node) but they can just as well use many more child nodes under each node.
Usually, a cryptographic hash function such as SHA-1, Whirlpool, or Tiger is used for the hashing. If the hash tree only needs to protect against unintentional damage, the much less secure checksums such as CRCs can be used.
In the top of a hash tree there is a top hash (or root hash or master hash). Before downloading a file on a p2p network, in most cases the top hash is acquired from a trusted source, for instance a friend or a web site that is known to have good recommendations of files to download. When the top hash is available, the hash tree can be received from any non-trusted source, like any peer in the p2p network. Then, the received hash tree is checked against the trusted top hash, and if the hash tree is damaged or fake, another hash tree from another source will be tried until the program finds one that matches the top hash.
The main difference from a hash list is that one branch of the hash tree can be downloaded at a time and the integrity of each branch can be checked immediately, even though the whole tree is not available yet. This can be an advantage since it is efficient to split files up in very small data blocks so that only small blocks have to be redownloaded if they get damaged. If the hashed file is very big, such a hash tree or hash list becomes fairly big. But if it is a tree, one small branch can be downloaded quickly, the integrity of the branch can be checked, and then the downloading of data blocks can start.


b. An AVL tree is a self-balancing binary search tree, and it is the first such data structure to be invented. In an AVL tree, the heights of the two child subtrees of any node differ by at most one; therefore, it is also said to be height-balanced. Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases, where n is the number of nodes in the tree prior to the operation. Insertions and deletions may require the tree to be rebalanced by one or more tree rotations.

How does it work? :

The basic operations of an AVL tree generally involve carrying out the same actions as would be carried out on an unbalanced binary search tree, but preceded or followed by one or more operations called tree rotations, which help to restore the height balance of the subtrees.
Pictorial description of how rotations cause rebalancing tree, and then retracing one's steps toward the root updating the balance factor of the nodes. If the balance factor becomes -1, 0, or 1 then the tree is still in AVL form, and no rotations are necessary. If the balance factor becomes 2 or -2 then the tree rooted at this node is unbalanced, and a tree rotation is needed. At most a single or double rotation will be needed to balance the tree.
There are basically four cases, which need to be accounted for, of which two are symmetric to the other two. For simplicity, the root of the unbalanced subtree will be called P, the right child of that node will be called R, and the left child will be called L. If the balance factor of P is 2, it means that the right subtree outweighs the left subtree of the given node, and the balance factor of the right child (R) must then be checked. If the balance factor of R is 1, it means the insertion occurred on the (external) right side of that node and a left rotation is needed (tree rotation) with P as the root. If the balance factor of R is -1, this means the insertion happened on the (internal) left side of that node. This requires a double rotation. The first rotation is a right rotation with R as the root. The second is a left rotation with P as the root.
To search for information in AVL tree is performed exactly as in an unbalanced binary search tree. Because of the height-balancing of the tree, a lookup takes O(log n) time. No special provisions need to be taken, and the tree's structure is not modified by lookups. If each node additionally records the size of its subtree (including itself and its descendants), then the nodes can be retrieved by index in O(log n) time as well.Once a node has been found in a balanced tree, the next or previous node can be obtained in amortized constant time.


c. A Radix tree, Patricia trie/tree, or crit bit tree is a specialized set data structure based on the trie that is used to store a set of strings. In contrast with a regular trie, the edges of a Patricia trie are labelled with sequences of characters rather than with single characters. These can be strings of characters, bit strings such as integers or IP addresses, or generally arbitrary sequences of objects in lexicographical order. Sometimes the names radix tree and crit bit tree are only applied to trees storing integers and Patricia trie is retained for more general inputs, but the structure works the same way in all cases.

How does it work? :

* The radix tree is easiest to understand as a space-optimized trie where each node with only one child is merged with its child. The result is that every internal node has at least two children. Unlike in regular tries, edges can be labeled with sequences of characters as well as single characters. This makes them much more efficient for small sets (especially if the strings are long) and for sets of strings that share long prefixes.
It supports the following main operations, all of which are O(k), where k is the maximum length of all strings in the set:
• Lookup: Determines if a string is in the set. This operation is identical to tries except that some edges consume multiple characters.
• Insert: Add a string to the tree. We search the tree until we can make no further progress. At this point we either add a new outgoing edge labeled with all remaining characters in the input string, or if there is already an outgoing edge sharing a prefix with the remaining input string, we split it into two edges (the first labeled with the common prefix) and proceed. This splitting step ensures that no node has more children than there are possible string characters.

A common extension of radix trees uses two colors of nodes, 'black' and 'white'. To check if a given string is stored in the tree, the search starts from the top and follows the edges of the input string until no further progress can be made. If the search-string is consumed and the final node is a black node, the search has failed; if it is white, the search has succeeded. This enables us to add a large range of strings with a common prefix to the tree, using white nodes, then remove a small set of "exceptions" in a space-efficient manner by inserting them using black nodes.