第七章Searching
7.1
Basic Search Techniques
A
table or a file is a group of elements, each of which is called a
record. Associated with each record is a key, which is used to
differentiate among different records.
In
the simplest form, the key is contained within the record at a
specific offset from the start of the record. Such a key is called
an internal key or an embedded key.
For
every file there is at least one set of keys that is unique. Such a
key is called a primary key.
It
will probably not be unique, since there may be two records with the
same state in the file, such a key is called a secondary key.
A
search algorithm is an algorithm that accepts an argument a and tries
to find a record whose key is a.
A
table of records in which a key is used for retrieval is often called
a search table or as dictionary.
Dictionary
as an abstract data type
typedef
KEYTYPE ... /* a type of key*/
typedef
RECTYPE … /* a type of record*/
RECTYPE
nullrec=... /*a “null” record*/
KEYTYPE
keyfunct(r)
RECTYPE
r;
{…
};
};
abstract
typedef [rectype] TABLE (RECTYPE);
abstract
member(tbl,k)
TABLE(RECTYPE)
tbl;
KEYTYPE
k;
postcondition
if(there exists an r in tbl such that keyfunct(r)==k)
then
member = TRUE
else
member = FALSE
abstract
RECTYPE search(tbl,k)
TABLE(RECTYPE)
tbl;
KEYTYPE
k;
postcondition
(not member(tbl,k)&& (search == nullrec)
||
(member(tbl,k) &&keyfunct(search)==k);
abstract
insert(tbl,r)
TABLE(RECTYPE)
tbl;
KEYTYPE
r;
precondition
member(tbl,keyfunct(r))==FALSE
postcondition
insert(tbl,r);
(tbl-[r])==tbl'
abstract
delete(tbl,k)
TABLE(RECTYPE)
tbl;
KEYTYPE
k;
postcondition
tbl==(tbl'-[search(tbl,k)]);
Binary
Search
The
most efficient method of searching a sequential table without the use
of auxiliary indices or table is the binary search.
O(logn)
low=0;
hi=n-1;
while(low<=hi)
{
mid=(low+hi)/2;
if(key==k(mid))
return
mid;
if(key<k(mid))
hi=mid-1;
else
low=mid+1
}
return
-1;
Interpolation
Search
Another
technique for searching an ordered array is called interpolation
search. If the keys are uniformly distributed between k(0) and k(n-1)
O(loglogn)
Tree
Searching
Inserting
into a binary tree
q=null;
p=tree;
while(p!=null)
{
if(key==k(p))
return
p;
q=p;
if(key<k(p))
p=left(p);
else
p=right(p);
}
}
v=maketree(rec,key);
if(q==null)
tree=v;
else
if(key<k(q))
left(q)=v;
else
right(q)=v;
return
v;
Deleting
from a binary search tree
q=null;
p=tree;
while(p!=null
&& k(p)!=key)
{
q=p;
p=(key<k(p)?left(p):right(p);
}
if(p==null)
return;
Balanced
Trees
If
the probability of searching for a key in a table is the same for all
keys, a balanced tree yields the most efficient search.
/*
Part I: search and inset into the binary tree*/
fp=null;
p=tree;
fya=null;
ya=p;
/*
ya points the youngest ancestor whick may become unbalanced*/
while(p!=null)
{
if(key==k(p))
return
p;
q=
(key <k(p))? left(p):right(p);
if(q!=null)
if(bal(q)!=0)
{
fya=p;
ya=q;
}
fp=p;
p=q;
}
q=maketree(rec,key);
bal(q)=0;
(key<k(fp))?left(fp)=q:right(fp)=q;
p=(key<k(ya))?left(ya):right(ya);
s=p;
while(p!=q)
{
if(key<k(p))
{
bal(p)=1;
p=left(p);
}
else
{
bal(p)=-1;
p=right(p);
}
}
/*Part
II:tree is unbalanced*/
imbal=(key<k(ya))?1:-1;
if(bal(ya)==0)
{
bal(ya)=imbal;
return
q;
}
/*
part III:rebalanced*/
if(bal(s)==imbal)
{
P=S;
if(imbal==-1)
rightrotation(ya);
else
leftrotation(ya);
bal(ya)=0;
bal(s)=0;
}
else
{
if(imbal==1)
{
p=right(s);
leftrotation(s);
left(ya)=p;
rightrotation(ya);
}
else
{
p=left(s);
p=left(s);
right(ya)=p;
rightrotation(s);
leftrotation(ya);
}
if(bal(p)==0)
{
bal(ya)=0';
bal(s)=0;
}
else
{
if(bal(p)==imbal)
{
bal(ya)=-imbal;
bal(s)=0;
}
else
{
bal(ya)=0;
bal(s)=imbal;
}
}
bal(p)=0
}
if(fya==null)
tree=p;
else
ya=right(fya))?right(fya)=p:left(fya)=p;
return
q;
A
multiway search tree of order n is a general tree in which each node
has n or fewer subtrees and contains one fewer key than it has
subtrees.
p=tree;
if(p==null)
{
position
= -1;
return
-1;
}
i=nodesearch(p,key);
if(i<numtrees(p)-1&&
key==k(p,i)
{
position=i;
return
p;
}
return
search(son(p,i));
====================================
p=tree;
while(p!=null)
{
i=nodesearch(p,key);
if(i<numtrees(p)-1&&
key==k(p,i)
{
position=i;
return
p;
}
p=son(p,i);
}
position
= -1;
return
-1;
traversing
a multiway tree
if(tree
!=null)
{
nt=numtrees(tree);
for(i=0;i<nt-1;i++)
{
traverse(son(tree,i);
printf(“%d”,k(tree,i));
}
traverse(son(tree,nt);
}
B+-Trees
One
of the major drawbacks of the B-tree is the difficulty of traversing
the key sequentially.
11.4
Hashing
A function that transforms a key into a table index is called a hash function. If h is a hash function and key is a key, h(key) is called the hash of key and is the index at which a record with the key should be placed.
#define
TABLESIZE …
typedef
KEYTYPE …
typedef
RECTYPE …
struct
record{
KEYTYPE
k;
RECTYPE
r;
}
table[TABLESIZE];
int
search(KEYTYPE key,RECTYPE rec)
{
int
i;
i=h(key);
while(table[i].k!=key
&& table[i].k!=nullkey)
{
i=rh(i);
//rehash
}
if(table[i].k==nullkey)
{
table[i].k=key;
table[i].r=rec;
}
return
i;
}
struct
nodetype *search(KEYTYPE key, RECTYPE rec)
{
struct
nodetype *p,*q,*s;
i=h(key);
q=NULL;
p=bucket[i];
while(p!=NULL&&p->k!=key)
{
q=p;
p=p->next;
}
if(p->k==key)
return
p;
s=getnode();
s->k=key;
s->r=rec;
s->next=NULL;
if(q==NULL)
bucket[i]=s;
else
q->next=s;
return
s;
}
Perfect
Hash Functions
Given
a set of keys K={k1,k2,...,kn} a perfect hash function is a hash
function h such that h(ki)!=h(kj) for all distinct i and j.
h(key)=((r+s*key)%x)d
remainder
reduction perfect hash function.
沒有留言:
張貼留言