answersLogoWhite

0

Hashing methods related to data structure in C?

Updated: 8/16/2019
User Avatar

Wiki User

12y ago

Best Answer

While binary search trees provide logarithmic search times on average, we would like to do even better. For

integer keyed items, we can trivially get constant search time by using an array, with one position for each

possible key. However, with 232 possible keys, there's no way we could create an array large enough.

Instead, we could get almost constant search time if we create an array twice as large as our data set,

and store a value k at position kmodm, where m is the size of the array. To search for a value k, we again

compute k mod m and then check the corresponding array position. We would expect an even distribution

of values, so we expect one or two values per array position. To retain this expectation, we may need to

resize the array as we insert values. We want to keep the load factor, k

m, at about 0:5. Resizing potentially

requires us to move all values, but the total amortized cost of insertion is still constant. Searching is also

constant in this scheme.

There is the problem of multiple values mapping to the same array position, or bucket. This situation

is called a collision. There are multiple ways of resolving collisions. One way is to place a value that maps

to an occupied bucket to the next empty bucket. Another is to have linked lists at each bucket, and store

values in the lists. This is the solution we'll use.

This is ne for integer values, but what about arbitrary objects? We rst compute an integer corresponding

to the object, called a hash code, and use that to map the objects to positions. There are two

conditions on a hash code. Equal objects (as dened by the equals() method) must have equal hash codes,

and the hash code for an object cannot change while it is in a hash table. Then, assuming the hash code

can be computed quickly and distributes objects evenly in a table, this gives us constant-time searching for

arbitrary objects.

How should we compute a hash code on an object? Many objects can be represented as k-tuples of

integers. For example, an object corresponding to a point in two-dimensional space can be represented by a

double of integers, converting the

oating point x and y values bitwise to integers. A good hash code for a

k-tuple (x0; x1; :::; xk􀀀1) is

x0 an􀀀1 + x1 an􀀀2 + ::: + xn􀀀2 a + xn􀀀1

where a is a constant. Choosing a Prime number minimizes the chance of a collision.

Our hash code formula works even if k is unbounded, such as in strings. Java's String class uses the

following formula:

public int hashCode() {

int code = 0;

for (int i = 0; i < length(); i++) {

code = 31 * code + charAt(i);

}

}

This is our formula, with a = 31 and xi =charAt(i).

User Avatar

Wiki User

12y ago
This answer is:
User Avatar

Add your answer:

Earn +20 pts
Q: Hashing methods related to data structure in C?
Write your answer...
Submit
Still have questions?
magnify glass
imp
Related questions

What is hashing function in data structure?

If you read up on hashing, why hashing is done, what are its uses. Then you will be able to answer your own question. More to the point you will have studied the material that your homework question is intended to make you study. It is educational.


Define hashing and describe briefly including types of hashing and where it is used and advantages and disadvantages of hashing?

Hashing is performed on arbitrary data by a hash function. A hash function is any function that can convert data to either a number or an alphanumeric code. There are possibly as many types of hashing as there are data. How precisely the hash function works depends on what data it is meant to generate a hash code from. Hashing is used for a variety of things. For example, a hash table is a data structure used for storing data in memory. Instead of iterating through the structure to find a specific item, we associate a key (hash code) to a particular item (data). A hash code can be generated from a file or disk image. If the data does not match the code, then the data is assumed to be corrupted. Hashing has the advantage of taking a larger amount of data and representing it as a smaller amount of data (hash code). The code generated is unique to the data it came from. Generating a hash code can take time however, depending on the function and the data. Some hash functions include Bernstein hash, Fowler-Noll-Vo hash, Jenkins hash, MurmurHash, Pearson hashing and Zobrist hashing.


What is double hashing in data structure?

if collision is occurred in hash function then we can solve this problem by using double hash function


What is Homomorphic Hashing?

Homomorphic Hashing is a algorithm technique used for verifying data.


Why do you use hashing and not array?

Hashing provides a method to search for data.Hashing provides a method to search for data.Hashing provides a method to search for data.Hashing provides a method to search for data.


What is the difference between data structure and abstract data type?

An Abstract Data Type is an interface that interacts with a data structure. A Data Structure is an implementation of the ADT. for example. If you were going to create a linked list you would create an Interface listing all the methods required by the list. Then in the linked list class you would code how the list uses these methods. Hope this helps :)


Which hashing algorithm is used to verify the integrity of data that has been transmitted over a network?

MD5


What is database administartor?

dba is a person or group of persons who manages the data nd defining db structure,storage structure, accessing methods nd backup techniques


Which hashing function includes 128-bit hash value and is often used to verify the intergrity of data?

MD5


What are the subject-matters of data structure?

types of data structure types of data structure


How do you amend a data structure?

How do you amend a data structure?


What are the methods in collect data?

There are five different methods in collecting data. The methods in data collect are registration, questionnaires, interviews, direct observations, and reporting.